HadCRUT4 data download page

There are many ways to look at global temperatures. Here’s another, which wiggles.

There are already some rather more coherent thoughts on this plot by Michael de Podesta and Victor Venema.

The wiggles are intended to express our uncertainty about past climate. They show different ways that the past evolution of global temperature might have been, taking into account what we know (or think we know) about likely residual errors in the data.

There are always errors in data and measurements, and part of the job of the scientist (this scientist at least) is to try to adequately quantify and characterise the likely range of those errors.

The grey area in the diagram shows the envelope in which we think global temperatures are likely to have been contained over the past 166 years or so. The width of the grey area has been estimated such that we’d expect the true global temperature to have been outside that grey area only a handful of times through that period. The grey area represents the compounded effects of errors associated with the measurements, residual errors associated with instrumentation changes and limited coverage.

“Expect” is one of those magic words that scientists and mathematicians imbue with a deeper meaning than most people give it. The wiggling graph is partly my attempt to show what the expectations look like.

You see, the grey area is only part of the story. It’s not the case that the true global temperature could have taken just any path through that range (with occasional forays outside). What we know about the kinds of errors there are in the data constrains the possibilities. The wiggling blue line shows how we think a subset of those residual errors behaves.

The errors represented by the blue line are: measurement errors, residual errors associated with instrumentation changes and the effects of limited sampling in the individual map cells that make up the HadCRUT4 data set. The blue line does not take into account errors associated with limited sampling on a larger scale. We can estimate the magnitude of that (and it’s included in the grey shading), but we don’t yet have a very good idea of how those errors correlate in time: we know how far it wiggles, but not the tune to which it dances.

So, how was the graph put together?

First I took the HadCRUT4 ensemble (there’s a link at the top of the post). This collection of 100 data sets represents uncertainties associated with residual errors coming from instrumentation changes, station moves and so on. The errors associated with these tend to vary quite slowly in time, so they cause whole segments of the blue line to rise and fall together.

To each of the 100 ensemble members, I then added an extra contribution drawn from a normal distribution with a standard deviation equal to the estimated uncertainty arising from other measurement errors and limited sampling in the individual map cells, which is also provided in the files. I approximated this as having no dependency in time, which isn’t quite correct.
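A minimal sketch of that step, assuming the ensemble members and their 1-sigma uncertainties have already been read into NumPy arrays (the array names and stand-in values here are hypothetical, not the contents of the real HadCRUT4 files):

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical stand-ins for the real inputs: one row per ensemble member,
# one column per time step, plus matching 1-sigma uncertainties for the
# combined measurement and grid-box sampling error term.
n_members, n_years = 100, 166
ensemble = rng.normal(0.0, 0.1, size=(n_members, n_years))
sigma = np.full((n_members, n_years), 0.05)

# Add an independent normal draw to every value, i.e. approximate this
# error term as having no dependency in time (white noise).
perturbed = ensemble + rng.normal(size=ensemble.shape) * sigma
```

Because each draw is independent of its neighbours in time, this is exactly the "not quite correct" approximation described above: any real correlation in time between these errors is ignored.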

I then played the 100 graphs in a loop with a single interpolation step between each one to smooth out the transitions.
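One way to generate the frame sequence for such a loop (the helper name and structure here are my sketch, not the original code) is to blend each member into the next, wrapping from the last back to the first so the animation loops seamlessly:

```python
import numpy as np

def frames(members, n_interp=1):
    """Yield each member curve, followed by n_interp linearly interpolated
    curves blending it into the next member (wrapping at the end)."""
    n = len(members)
    for i in range(n):
        a, b = members[i], members[(i + 1) % n]
        yield a
        for k in range(1, n_interp + 1):
            t = k / (n_interp + 1)
            yield (1 - t) * a + t * b  # linear cross-fade between curves
```

With `n_interp=1`, as in the post, each transition gets a single halfway curve between consecutive ensemble members.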

Things to note:

  1. The most recent 15 or so years tend to move as a block. There’s a large contribution in this period from residual errors due to instrumentation changes, largely arising from residual errors in the ocean part of the analysis. This can seem counterintuitive because large numbers of observations in this period come from more reliable drifting buoys. The reason is that we express the global temperature as a temperature difference from the period 1961-1990, which is dominated by less-reliable ship data. The ensemble is constrained to have an average of about zero through the 1961-1990 period, so any uncertainty during that time gets squeezed out of that period and into the rest of the time series.
  2. You can see this squeezing at work: if the blue line jogs upwards at one end of the 1961-1990 period, it has to jog downwards at the other end.
  3. The 19th century is a lot more jagged and jumpy. This is because of the relatively larger contribution from the errors with no time dependency, reflecting the far smaller number of measurements we have access to during this period.
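The squeezing in points 1 and 2 can be reproduced with a toy example (entirely made-up numbers, not HadCRUT4 data): give each of 100 fake members a slowly varying error, anchor each member to its own 1961-1990 mean, and the ensemble spread collapses inside the baseline while growing away from it.

```python
import numpy as np

rng = np.random.default_rng(0)
years = np.arange(1850, 2016)
in_base = (years >= 1961) & (years <= 1990)

# Toy slowly-varying errors: one random walk per fake ensemble member.
walks = np.cumsum(rng.normal(0.0, 0.02, size=(100, years.size)), axis=1)

# Anchoring: express each member as anomalies from its own 1961-1990 mean,
# mimicking the constraint that the ensemble averages ~zero over the baseline.
anchored = walks - walks[:, in_base].mean(axis=1, keepdims=True)

# Ensemble spread is squeezed inside the baseline and pushed outside it:
# the spread across members is small within 1961-1990 and larger elsewhere.
spread_in_base = anchored[:, in_base].std(axis=0).mean()
spread_early = anchored[:, years < 1900].std(axis=0).mean()
```

The further a year is from the baseline, the less the anchoring constrains it, which is why the uncertainty re-emerges at both ends of the record.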

The graph is work in progress.

It’s intended to try out a new (to me) way of presenting global temperatures. Feedback has so far been positive, but I’m always keen to hear constructive criticism. Criticism of other types is welcomed too, but won’t so easily lead to improvements and there’s a small chance it will make me sad.

At the moment, there are some approximations that can be improved on with a moderate amount of work. These include things like the way that temporal dependence is modelled. It should also be possible with a less moderate amount of work to include all the known uncertainty terms including the large scale sampling.

But there’s also a more fundamental problem, which is this: the global time series has these errors in it already. To make the dancing line, I make an estimate of what the errors are and add them to the series. In effect, each of the blue lines has a double dose of error: the error that’s actually in it and the error I’ve added. A better way to do this would be to assimilate the data into a statistical model of global temperatures and then draw samples from that, but that, as they say, is another story…