Estimating uncertainty in historical climate data sets is not easy. Occasionally the methods and results are criticised, often by people from an engineering background. In engineering, you have to get things right. If you don’t propagate your measurement uncertainties correctly things don’t work. You also get a lot of feedback from the real world which will let you know when you’ve screwed up. It’s not uncommon for someone with such a background to look at a global average temperature quoted with a 0.1K uncertainty and snort.

They point out that the basic measurements with which we work can’t possibly be measuring temperature with an uncertainty much less than 1K; therefore, the uncertainty in the global average sea-surface temperature must be around 1K at best, and probably worse. If pressed, they point out that, given these are measurements of uncertain provenance, we should treat them with deep suspicion. I agree. Deep suspicion is an excellent starting point and one I revisit on a regular basis.

However, deep suspicion is not an uncertainty analysis, and it’s when getting down to the nitty gritty of actually estimating and propagating uncertainties that it becomes clear how hard it is to use deep suspicion as a consistent and thoroughgoing philosophy.

For example, say we want to know the average of a set of temperatures, which is a common way of calculating a climatology. An average is easy to calculate: add the temperatures up and divide by the number of measurements. Propagating the uncertainty is likewise easy – the deep suspicion solution is simply to average the uncertainties. If you are in any doubt about this, go have a look at the OK Wikipedia entry on propagating uncertainty.
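As a rough illustration of the climatology step just described, here is a minimal Python sketch. The temperatures and the 1K per-measurement uncertainties are made up for the example, and the function names are hypothetical, not taken from any real analysis code.

```python
from statistics import fmean

def climatology(temps):
    """Climatological average: add the temperatures up, divide by the count."""
    return fmean(temps)

def ds_mean_uncertainty(sigmas):
    """'Deep suspicion' uncertainty of the average: the mean of the uncertainties."""
    return fmean(sigmas)

# Made-up measurements in kelvin, each with a hypothetical 1 K uncertainty.
temps_k = [288.2, 287.9, 288.6, 288.1]
sigmas_k = [1.0, 1.0, 1.0, 1.0]

clim = climatology(temps_k)
clim_sigma = ds_mean_uncertainty(sigmas_k)
print(f"climatology = {clim:.2f} K, uncertainty = {clim_sigma:.2f} K")
```

Under this rule the uncertainty of the average does not shrink as more measurements are added: with 1K on every measurement, the average also carries 1K.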

Next, we might want to calculate an anomaly by subtracting a climatological average from a single temperature measurement. Applying deep suspicion (DS) reasoning to get the uncertainty, we say that we ought to sum the squared uncertainties (of the single measurement and of the climatological average) and then take the square root. Easy enough, but something peculiar has happened here.
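The anomaly step can be sketched in the same way. Again, the numbers are invented and the function names are hypothetical; the only thing taken from the text is the rule itself, which is to square the two uncertainties, add them, and take the square root.

```python
import math

def anomaly(temp, clim):
    """Anomaly: a single measurement minus the climatological average."""
    return temp - clim

def anomaly_uncertainty(sigma_temp, sigma_clim):
    """Combine the two uncertainties in quadrature: sqrt(a**2 + b**2)."""
    return math.hypot(sigma_temp, sigma_clim)

# Hypothetical values in kelvin: one measurement and a climatology,
# each carrying a 1 K uncertainty.
temp_k, sigma_temp_k = 289.0, 1.0
clim_k, sigma_clim_k = 288.2, 1.0

anom = anomaly(temp_k, clim_k)
anom_sigma = anomaly_uncertainty(sigma_temp_k, sigma_clim_k)
print(f"anomaly = {anom:.2f} K, uncertainty = {anom_sigma:.2f} K")
```

With 1K on each input, the combined uncertainty is the square root of 2, roughly 1.41K, rather than the 2K that a straight sum would give.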

In calculating the climatology, the DS approach had to assume that the errors in all those measurements were perfectly correlated. However, in estimating the uncertainty in the anomaly, we have to assume that the errors are completely uncorrelated. The approach, applied in that way, is inconsistent.
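To make the two hidden assumptions explicit, here is the standard first-order propagation formula written out for both steps. The symbols are my own shorthand, not anything from a particular data set: N measurements T_i with uncertainties \sigma_i, a single common correlation r between the errors in any pair of measurements (a simplification), and a correlation \rho between the error in the single measurement T and the error in the climatology \bar{T}.

```latex
% Uncertainty of the average \bar{T} = \frac{1}{N}\sum_i T_i
\sigma_{\bar{T}}^2 = \frac{1}{N^2}\left[\sum_{i=1}^{N}\sigma_i^2
    + r \sum_{i \neq j} \sigma_i \sigma_j\right]

% r = 1 (perfectly correlated): the average of the uncertainties
\sigma_{\bar{T}} = \frac{1}{N}\sum_{i=1}^{N}\sigma_i

% r = 0 (uncorrelated): shrinks as N grows
\sigma_{\bar{T}} = \frac{1}{N}\sqrt{\sum_{i=1}^{N}\sigma_i^2}

% Uncertainty of the anomaly A = T - \bar{T}
\sigma_A^2 = \sigma_T^2 + \sigma_{\bar{T}}^2 - 2\rho\,\sigma_T\,\sigma_{\bar{T}}

% \rho = 0 (uncorrelated): the sum in quadrature
\sigma_A = \sqrt{\sigma_T^2 + \sigma_{\bar{T}}^2}
```

Averaging the uncertainties corresponds to r = 1, while summing in quadrature corresponds to \rho = 0, so the two steps pick opposite ends of the correlation range.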

The approach is also slightly absurd – and I mean that in the gentlest possible way. If one were averaging a handful of temperature measurements made in the same lab, then it’s easy to imagine ways in which the errors might end up being perfectly correlated. However, when calculating a real climatology, we have to imagine that the errors in hundreds or thousands of measurements are perfectly correlated over a period of decades. That is much harder to imagine and would, I contend, be damned near impossible to rig deliberately.

Applying the deep suspicion approach consistently through an analysis turns out to be surprisingly difficult, and applying it without careful consideration can lead to contradictory unstated assumptions. It is easier, on the whole, to start from reasonable and consistent assumptions and work – with constant reference to the actual data – from there. You might (OK, probably will) have to toss the assumptions out from time to time and start again with something more complex, but the result will be more coherent.