Known unknowns

What premium should the owners of the Empire State Building pay for terrorism insurance in 2017? Insurers and reinsurers often need to quantify the risk of events occurring where there is no relevant data to analyse or where the risk changes continuously. Standard statistical methods are little help in these situations, and expert judgment becomes a crucial tool.

But experts often disagree. In this guest article, Dr Raveem Ismail and Christoph Werner discuss a new approach for evaluating expert judgment in an objective way.

There are no hard facts, just endless opinions. Every day, the news media deliver forecasts without reporting, or even asking, how good the forecasters really are. Every day, corporations and governments pay for forecasts that may be prescient or worthless or something in between. And every day, all of us - leaders of nations, corporate executives, investors, and voters - make critical decisions on the basis of forecasts whose quality is unknown.

Superforecasting: The Art & Science Of Prediction.

Expert Opinion

Reinsurers and ILS funds frequently have to act in data-poor environments, relying heavily on expert judgement. This occurs particularly in low frequency, high severity/loss practice areas: rare and catastrophic risk appraisal is almost entirely based on expert judgement.

Structured Expert Judgement is an auditable and objective way of combining multiple judgements, each weighted by its author's demonstrated skill in gauging uncertainty. This produces a better overall judgement within a plausible range of outcomes.

Consulting ten experts will yield ten different answers of differing quality. Each answer will be affected by factors including that expert's previous experience, grasp of the data, judgemental capability, biases and mood on the day. Because expertise differs between experts, a simple average of all the answers may be less accurate than some of the individual answers.

Structured Expert Judgement

Structured Expert Judgement differs from and extends previous opinion-pooling methods. Each expert is first rated on their performance by being asked a set of seed questions, for which the answers are already known by the facilitator, but not necessarily by the expert.

Each expert’s performance on these seed questions determines that expert’s weighting.

Each expert is then asked the target questions, where their actual judgements are being sought, and to which the answers are not known. The weightings drawn up from the seed question phase are then used to combine the experts’ judgements on the target questions, producing an outcome that truly combines the different judgements in a performance-based way. This should provide a better overall answer than each individual answer.
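The combination step can be sketched in a few lines of Python. This is a minimal illustration rather than the study's implementation: it assumes each expert supplies 5%, 50% and 95% quantiles, that each expert's distribution is piecewise-linear between those quantiles over a range (lo, hi) chosen by the facilitator, and that the weights have already been derived from the seed questions. The function names and all numbers are hypothetical.

```python
from bisect import bisect_right

PROBS = [0.0, 0.05, 0.5, 0.95, 1.0]


def make_cdf(q05, q50, q95, lo, hi):
    """Piecewise-linear CDF through (lo, 0), (q05, .05), (q50, .5), (q95, .95), (hi, 1).

    Assumes lo < q05 < q50 < q95 < hi.
    """
    xs = [lo, q05, q50, q95, hi]

    def cdf(x):
        if x <= lo:
            return 0.0
        if x >= hi:
            return 1.0
        i = bisect_right(xs, x) - 1          # segment that contains x
        frac = (x - xs[i]) / (xs[i + 1] - xs[i])
        return PROBS[i] + frac * (PROBS[i + 1] - PROBS[i])

    return cdf


def pooled_quantile(cdfs, weights, p, lo, hi, tol=1e-9):
    """Quantile p of the weighted mixture of the experts' CDFs, found by bisection."""
    total = sum(weights)

    def mixture(x):
        return sum(w * c(x) for w, c in zip(weights, cdfs)) / total

    a, b = lo, hi
    while b - a > tol:
        mid = (a + b) / 2
        if mixture(mid) < p:
            a = mid
        else:
            b = mid
    return (a + b) / 2


# Hypothetical judgements and weights, for illustration only (not from the study).
LO, HI = 0.0, 200.0
experts = [(10, 20, 35), (15, 25, 40), (5, 60, 180)]   # (5%, 50%, 95%) per expert
cdfs = [make_cdf(*q, LO, HI) for q in experts]
equal = [1, 1, 1]
performance = [0.6, 0.35, 0.05]

for name, w in (("equal", equal), ("performance", performance)):
    print(name, [round(pooled_quantile(cdfs, w, p, LO, HI), 1) for p in (0.05, 0.5, 0.95)])
```

With these made-up inputs, the wide third expert stretches the upper end of the equal-weighted range, while the performance-weighted range stays closer to the two experts carrying most of the weight.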

To ensure proper weighting of the experts, the design of the seed questions is critical. They must be carefully chosen for their tight alignment with the target questions, testing the same ability required to answer the target questions and thus maximising the utility of performance weighting. Poorly designed seed questions reduce the effectiveness of the weighting.

An insurance risk case study

[Figure: Seed Question 11]

Cooke’s Classical Model for Structured Expert Judgement involves asking each expert to provide two things for every question: a range within which they think the true value lies (from their 5% to their 95% quantile) and a central median value. These are then used to calculate how tightly the expert bounds their uncertainty (“information”), and how reliably their ranges capture the true values (statistical accuracy or “calibration”).
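The two scores can be sketched as follows. This is a simplified illustration of how the Classical Model is commonly described, not the code used in the study. Calibration compares the share of true values falling into each of the expert's four inter-quantile bins with the expected shares (5%, 45%, 45%, 5%) and converts the discrepancy into a p-value using a chi-square distribution with three degrees of freedom. Information measures how concentrated the expert's distribution is relative to a uniform background over a range (lo, hi). The assumptions here are that the range is supplied by the caller (the model usually derives it from all experts' quantiles and the realised value, plus an overshoot) and that the function names are hypothetical.

```python
import math

# Expected share of true values in each inter-quantile bin:
# below 5%, 5% to 50%, 50% to 95%, above 95%.
P = (0.05, 0.45, 0.45, 0.05)


def chi2_sf_df3(x):
    """Survival function of a chi-square variable with 3 degrees of freedom (closed form)."""
    x = max(x, 0.0)
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


def calibration(assessments, truths):
    """p-value that the expert's hit pattern on the seed questions matches P.

    assessments: list of (q05, q50, q95); truths: the known seed answers.
    """
    counts = [0, 0, 0, 0]
    for (q05, q50, q95), t in zip(assessments, truths):
        if t <= q05:
            counts[0] += 1
        elif t <= q50:
            counts[1] += 1
        elif t <= q95:
            counts[2] += 1
        else:
            counts[3] += 1
    n = len(truths)
    shares = [c / n for c in counts]
    # Relative entropy of the observed shares with respect to the expected shares.
    kl = sum(s * math.log(s / p) for s, p in zip(shares, P) if s > 0)
    return chi2_sf_df3(2 * n * kl)


def information(assessments, ranges):
    """Mean relative information against a uniform background on each (lo, hi).

    Assumes lo < q05 < q50 < q95 < hi for every question.
    """
    total = 0.0
    for (q05, q50, q95), (lo, hi) in zip(assessments, ranges):
        edges = [lo, q05, q50, q95, hi]
        total += math.log(hi - lo) + sum(
            p * math.log(p / (edges[i + 1] - edges[i])) for i, p in enumerate(P)
        )
    return total / len(assessments)
```

An expert whose ranges are narrow scores highly on information, but only earns a high calibration score if the true values land inside those ranges about as often as the stated quantiles imply. The two scores therefore pull against each other, which is why both are needed.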

This approach was used in a study in January, with 18 seed questions and 8 target questions. The study involved an inherently unknowable future metric: the 2016 frequency of strikes, riots and civil commotion (SR&CC) in blocs of countries (Central Asia, Maghreb, etc.), with participants drawn from across the (re)insurance profession. An example of their judgements on a single seed question (events in Southeast Asia) is shown in the figure above (“Seed Question 11”).

The experts produced a variety of median values and ranges, some with tightly bound ranges that captured the true value (the dotted red line). The table below shows the information and calibration scores across the full seed question set. Two experts (Experts 1 and 4) emerge with notably strong performance-based weightings. Had all the experts been weighted equally, this discovered capability would have been diluted away.

[Table: information and calibration scores for each expert across the seed questions, with the resulting weights in the last column]

However, if the experts’ judgements are combined using the weights from the calibration exercise (last column), a combination emerges that capitalises on these high-performing experts and produces better results than weighting all of them equally. This performance-weighted combination was then used for a target question, and the results are shown in the figure below (“Target Question 7”).
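As a rough sketch of how such weights can be derived: in the Classical Model as commonly described, an expert's unnormalised weight is the product of their calibration and information scores, optionally set to zero when calibration falls below a cutoff, and the weights are then normalised to sum to one. The function name, the cutoff value and the scores below are hypothetical and are not taken from the study's table.

```python
def performance_weights(calibration_scores, information_scores, cutoff=0.0):
    """Normalised weights from calibration x information, with an optional calibration cutoff."""
    raw = [
        c * i if c >= cutoff else 0.0
        for c, i in zip(calibration_scores, information_scores)
    ]
    total = sum(raw)
    if total == 0:
        raise ValueError("no expert passes the calibration cutoff")
    return [r / total for r in raw]


# Hypothetical scores for three experts, for illustration only.
weights = performance_weights([0.5, 0.001, 0.3], [1.0, 2.0, 1.5], cutoff=0.05)
print([round(w, 3) for w in weights])
```

In this made-up example the second expert is the most informative but is poorly calibrated, so the cutoff removes them from the combination entirely.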

[Figure: Target Question 7]

Note that for this forward-looking question there was no known answer. Even so, the performance-weighted process has allowed the influence of Experts 1 and 4 to provide a much tighter and more informative judgement than the answers given by most individual experts, or than the equal-weighted combination of all the experts’ answers (which is inflated by the outliers). In the performance-weighted combination, the influence of the outliers is reduced and the answers of the identified high-performing experts are given more weight. Such a final frequency, with its associated range, could now be fed into underwriting and pricing decisions or catastrophe models, providing greater assurance than data obtained from customary approaches.


Structured expert judgement still involves judgement, but it is not guesswork. It represents a transparent method for pooling multiple opinions, with these opinions weighted according to performance-based criteria aligned to the actual judgements being sought. Where data or models are lacking, it forms an objective and auditable method for producing or supporting decision-making judgements and inputs to models.

It should be noted that this approach is not a crystal ball, and where science-based models or suitable data exist, these should trump expert judgement (or at least be used in tandem). But in their absence, in classes of business such as political violence and in situations where tail risk is being gauged, structured expert judgement could significantly enhance current decision-making and risk appraisal.


Dr Raveem Ismail is a Specialty Treaty Underwriter at Ariel Re, Bermuda, chair of the Reinsurance Special Interest Group of EU COST Action IS1304 on Structured Expert Judgement, and a cofounder of The Journal Of Terrorism & Cyber Insurance.

Christoph Werner is a doctoral researcher at Strathclyde University in probabilistic risk analysis. His focus is on expert judgement, including the underwriting of geopolitical risk.

The full study will be a forthcoming publication in a scientific journal. A shorter version of this article, Ask The Experts, co-authored by Scott Reid, appeared in The Actuary, and a longer version in The Journal Of Terrorism & Cyber Insurance’s inaugural edition (October 2016).

Posted: Monday, October 31st, 2016