Earlier this month I published a post on LinkedIn about underwriting “challenging” flood in California’s Central Valley. It drew a decent readership and some likes but, most importantly, it drew comments. One of them was a particularly perceptive piece of commentary, and it deserves a blog post of its own to explore the topic it raised: the limitations of analytics.
The comment came from Mr. Tim Pappas (a VP at Gen Re), and I am grateful to him for raising this important topic. Here are the points he raises that this post will address:
- Without understanding the limitations of “superior analytics,” insurance companies can be putting their bottom line in great danger.
- The picture may not be quite as clear as the “high resolution” data provided indicates.
- The complexity of all the factors involved can lead to errors in estimation, making the computations imprecise and sometimes outright faulty.
- If the data is off on just a small percentage of risks, the impact on portfolio profitability can be quite large.
Assessing the risk of flood is a messy business – there is no debate or dispute about that. The real-world variables that determine where flood waters (churned by the incredible forces of unsettled nature) will ultimately go are impossible to predict, just as it is impossible to predict when floods will happen. But it is this complete unknowability that makes analytics necessary to work with flood risk (and all risk, really) – analytics are an imperfect substitute when measurements are not possible.
Imperfect? Absolutely – all models are wrong, but some are useful (thanks again to Dr. Box). A caveat must be understood by anyone underwriting flood: the models and analytics are not 100% accurate, ever…not even close. Below is a list (with links to supporting material) of some ways flood models and data are flawed:
- Flood zones don’t capture all the locations prone to flooding.
- Flood models can be overgeneralized.
- Flood models, like all models, have a bias.
- Flood models for large regions miss localized conditions.
Thus, to address the first point above squarely: it is true that an insurer’s results can be harmed by assuming its data and analytics will be entirely accurate. The remaining three points suggest ways to handle this fact.
To address those three points, it helps to look at what an analytic actually is. Wikipedia states:
Analytics is the discovery, interpretation, and communication of meaningful patterns in data.
In the context of flood-risk-assessment analytics for underwriting, I would define it thus:
The automated interpretation of multiple datasets to estimate the relative risk of flooding at a specified location.
From this definition, it is possible to outline the key points that address the limitations of analytics:
- They measure relative risk, not absolute risk. It is pretty much impossible to know the actual probability of flooding at a given location, but analytics do quite well at determining which locations are higher or lower risk than others. In other words, they segment risk.
- The better the data used in the analytic, the better the results. No data will be high-resolution enough to remove all inaccuracies or uncertainty, but results will improve with solid, accurate datasets included in the analytic.
- Analytics can (and should) use multiple datasets. Since no flood model or dataset is entirely accurate, analytics should use at least two (preferably three if they are available). This is why underwriters should use underwriting software, and not accumulation software, for underwriting.
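To make the three points above a bit more concrete, here is a minimal sketch of what a multi-dataset analytic might look like. Everything in it is an assumption for illustration: the source names (`zone_map`, `flood_model`, `terrain`), the 0-to-1 scores, the equal weighting, and the review threshold are all made up, and a real analytic would be considerably more involved.

```python
from statistics import mean

# Hypothetical 0-to-1 flood scores from three independent sources.
# The source names and every value here are invented for illustration.
locations = {
    "A": {"zone_map": 0.2, "flood_model": 0.3, "terrain": 0.1},
    "B": {"zone_map": 0.1, "flood_model": 0.8, "terrain": 0.7},
    "C": {"zone_map": 0.9, "flood_model": 0.8, "terrain": 0.9},
}

REVIEW_THRESHOLD = 0.5  # assumed cut-off for "the sources disagree too much"


def relative_score(scores):
    """Average the sources into one score that is used only for ranking."""
    return mean(list(scores.values()))


def disagreement(scores):
    """Gap between the most and least pessimistic source."""
    return max(scores.values()) - min(scores.values())


# Rank locations from lowest to highest relative risk.
ranked = sorted(locations, key=lambda name: relative_score(locations[name]))

# Flag locations where the sources disagree enough to warrant a closer look.
needs_review = [
    name for name in locations
    if disagreement(locations[name]) > REVIEW_THRESHOLD
]

print(ranked)
print(needs_review)
```

The output is a ranking, not a probability, which is the "relative risk" point. Location B is the interesting one: the zone map alone would call it low risk, while the other two sources disagree, and it is only with more than one dataset that the disagreement becomes visible at all. Equal weighting is just the simplest choice here; an insurer could weight the sources according to its own view of risk.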
With all of this in mind, here are the remaining three limitations addressed:
- The picture is never clear, but analytics that can leverage more and better data make it clearer.
- Expect errors and imprecision…and base the business on this fact.
- By using multiple independent datasets, and preferably multiple types of data, the impact of imprecise data on a portfolio is greatly lessened. Ideally, insurers should build a custom analytic to use those datasets in the way that fits their view of risk, their appetite, and their experience.
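Some rough arithmetic shows why a small share of bad data matters so much, and why multiple datasets help. The figures below are entirely hypothetical (1,000 policies, $1,000 premium, $300 of expenses and $600 of expected loss per policy, and a $6,000 expected loss on a misjudged risk), and the second function assumes the datasets fail independently of one another, which real datasets will only approximate.

```python
def portfolio_profit(n_policies, n_misjudged, premium, expenses,
                     expected_loss, misjudged_loss):
    """Expected profit when some policies are riskier than the data suggested.

    All amounts are per policy, in dollars. Every figure passed in below is
    a made-up illustration, not market data.
    """
    n_sound = n_policies - n_misjudged
    total_premium = n_policies * premium
    total_expenses = n_policies * expenses
    total_losses = n_sound * expected_loss + n_misjudged * misjudged_loss
    return total_premium - total_expenses - total_losses


def chance_all_miss(miss_rate, n_datasets):
    """Probability that every dataset misses the same high-risk location.

    Assumes the datasets err independently, which is a strong assumption.
    """
    return miss_rate ** n_datasets


# The portfolio as priced: every risk behaves the way the data said it would.
as_priced = portfolio_profit(1000, 0, 1000, 300, 600, 6000)

# The same portfolio with just 2% of risks (20 policies) misjudged.
with_errors = portfolio_profit(1000, 20, 1000, 300, 600, 6000)

print(as_priced)    # 100000
print(with_errors)  # -8000

# If one dataset misses a high-risk location 20% of the time (assumed),
# compare one source with three independent sources.
print(chance_all_miss(0.2, 1))
print(chance_all_miss(0.2, 3))
```

In this made-up portfolio, misjudging 2% of the risks turns a $100,000 expected profit into an $8,000 expected loss. Under the independence assumption, going from one dataset to three cuts the chance of missing a given high-risk location from 20% to under 1%.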
Finally, Mr. Pappas mentions the importance of the distinction between precision and accuracy. For underwriters, accuracy answers: “what’s the risk at this location?” Precision answers: “how dependable is that answer?”
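One way to see the difference is to compare two sets of estimates against a known outcome. The sketch below uses invented flood-depth estimates from two hypothetical models, `model_x` and `model_y`, and treats the average error as a stand-in for accuracy and the spread of the estimates as a stand-in for precision. That is a common statistical shorthand, and only one way of framing it.

```python
from statistics import mean, pstdev

# Hypothetical observed flood depth at one location, in meters.
observed_depth = 1.0

# Invented repeated estimates from two hypothetical models.
model_x = [1.6, 1.5, 1.6, 1.5]  # tightly clustered, but consistently too high
model_y = [0.4, 1.6, 0.7, 1.3]  # right on average, but widely scattered


def average_error(estimates, truth):
    """How far off the estimates are on average (a proxy for accuracy)."""
    return mean(estimates) - truth


def spread(estimates):
    """How much the estimates vary among themselves (a proxy for precision)."""
    return pstdev(estimates)


for name, estimates in (("model_x", model_x), ("model_y", model_y)):
    print(name,
          round(average_error(estimates, observed_depth), 2),
          round(spread(estimates), 2))
```

Here `model_x` is precise but not accurate, and `model_y` is accurate on average but not precise. An underwriter relying on either needs to know which kind of error they are living with.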
These are all important things for users of underwriting analytics to understand, and I agree when Mr. Pappas says it is important for providers of analytics (like me) to help.