There are two approaches to modeling earthquake risk: 1) try to predict the chances of an earthquake happening; and 2) assume one is going to happen and model the likely effects of the quake. Some models unify these approaches, but even then the two aspects are modeled and reported separately. In this post I’ll focus on the first approach; I’ll explore the second next week.

Trying to predict an earthquake is tough, if not impossible. Predictions are usually short term (not much use to insurers) and involve everything from radon gas emissions to animal behavior. In fact, anyone who claims to have predicted an earthquake is usually discredited within months, and the term “predicting” is frowned upon by the seismic community – forecasting is what they try to do.

This unpredictability is a problem for property insurers, who need to understand the likelihood of a damaging event. So they turn to proxies for the probability that an earthquake will happen. Some insight is better than no insight, and the inherent uncertainty needs to be understood and accommodated in business rules and pricing.

There are two primary indicators of the likelihood of an earthquake:

**Proximity to a fault.**

If a location is near a fault, where tectonic plates interface with each other, it is susceptible to earthquakes. If it’s not, it’s not. As the distance between a location and a fault decreases, the chances of the location being impacted by an earthquake increase. This elementary geospatial metric is used widely in modeling circles because it provides a reasonable view of likelihood, particularly for defining broad areas where earthquakes are and are not likely to happen. Reliable fault location data is available for most of the world, and calculating the distance is trivial. The trick is establishing the correlation between proximity to faults and likelihood, and tuning a model to give just the right weight to this measurement.
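As a concrete sketch of this metric, here is a minimal distance-to-fault calculation in Python. It uses the haversine great-circle formula against a fault trace sampled as (latitude, longitude) points; the function names and sampled-trace representation are illustrative assumptions, and real models use published fault databases and more precise geodesic math.

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometers."""
    R = 6371.0  # mean Earth radius, km
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * R * asin(sqrt(a))

def distance_to_fault_km(site, fault_trace):
    """Minimum distance from a site (lat, lon) to a fault trace
    represented as a list of sampled (lat, lon) points."""
    return min(haversine_km(site[0], site[1], lat, lon) for lat, lon in fault_trace)
```

A model would then feed this distance into a weighting function; as the post notes, choosing how much importance that weight carries is the hard part, not the distance itself.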

**Estimated return period.**

This is an estimate of how frequently a damage-causing earthquake is likely to hit a location. The historical record plays a part in this analytic, as a degree of extrapolation is possible from a long enough history. However, this is undermined by the irregularity of earthquakes throughout history – there is no tangible pattern related to time or location. At best, extending the historical record into the future is a very rough estimation. The size and volatility of nearby faults are then taken into account in an attempt to refine the estimate. Unfortunately, the amount of information needed to really hone the estimate is unavailable, because the seismic readings that describe the volatility of those faults are far too brief compared to the geologic time periods involved. Extrapolating return periods of 1,000 or 10,000 years from 50 years of data is weak. But it’s also the best we can do.

With the two metrics above, estimates are created that suggest the annual probability of an earthquake happening in a given location. These probabilities are highly subjective: earthquake models can differ markedly from one another in this regard, and their limitations need to be understood when they are used.
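One common way to turn an estimated return period into an annual probability is to assume earthquake occurrence is a Poisson (memoryless) process – a modeling assumption on my part, not something the metrics above dictate. A minimal sketch:

```python
from math import exp

def annual_probability(return_period_years):
    """Annual probability of at least one event, assuming a Poisson
    process with mean rate 1 / return_period_years."""
    return 1.0 - exp(-1.0 / return_period_years)

def probability_over(years, return_period_years):
    """Probability of at least one event over a multi-year horizon,
    under the same Poisson assumption."""
    return 1.0 - exp(-years / return_period_years)
```

Under this assumption, a 500-year return period works out to an annual probability of roughly 0.2%, and about a 9.5% chance of at least one event over a 50-year horizon. The memoryless assumption is itself debatable for faults that accumulate strain, which is one reason models diverge.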

Models that forecast the annual probability of an earthquake require a revision to Box’s famous quote: “All models are wrong, and it’s important to know how they are wrong, and how wrong they are.”