Occasionally, I see it stated that averaging of repeated measurements only reduces the uncertainty if they are repeated measurements of the same thing.

This is simply not true.

It is, however, a good place to start thinking about the general problem. If we take three measurements of (say) the temperature of a water bath, *M1*, *M2* and *M3*, the average is *(M1 + M2 + M3)/3*.

We can think of each of those measurements as the sum of the true temperature of the water bath, *T*, and an error which may change from one measurement to the next, i.e.

*M1 = T + E1*, etc.

This makes the average, *A*,

*A = (T + E1 + T + E2 + T + E3)/3*

Or

*A = T + (E1 + E2 + E3)/3*

If we increase the number of measurements to the magic number *n* then *A* will end up as

*A = T + (E1 + E2 + … + En)/n*

Now the question is, under what circumstances does averaging reduce the uncertainty in the estimated temperature? To state the bleeding obvious, it’s when the average of the errors tends towards zero. Call this Condition X.
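To make this concrete, here is a minimal Python sketch of the constant-temperature case. It assumes one particular flavour of Condition X (independent, zero-mean, normally distributed errors), and the function names, the bath temperature of 20.0 and the error standard deviation of 0.5 are all made up for illustration. It repeats the whole experiment many times and looks at how much the average *A* scatters about *T*.

```python
import random
import statistics

def average_reading(true_temp, n, error_sd, rng):
    """Average n simulated readings M_i = T + E_i, with E_i ~ Normal(0, error_sd)."""
    readings = [true_temp + rng.gauss(0.0, error_sd) for _ in range(n)]
    return sum(readings) / n

def spread_of_average(true_temp, n, error_sd, trials=2000, seed=0):
    """Standard deviation of the average A over many repeated experiments."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    averages = [average_reading(true_temp, n, error_sd, rng) for _ in range(trials)]
    return statistics.pstdev(averages)

# Hypothetical numbers: a 20.0 degree bath read with an error s.d. of 0.5.
for n in (1, 10, 100):
    print(n, round(spread_of_average(20.0, n, 0.5), 3))
```

Under these assumptions the printed spread should shrink roughly in proportion to one over the square root of *n*, which is just the average of the errors tending towards zero.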

A slightly different problem is when the temperature of the water bath is changing over time and we want to know the average temperature of the water bath as sampled at regular points through time. Again we have a series of measurements *M1, M2 … Mn* but now the true temperature is different for each one.

*M1 = T1 + E1*

…

*Mn = Tn + En*

The average of the measurements is now

*A = (T1 + E1 + T2 + E2 + … + Tn + En)/n*

Which we can split up into two pieces

*A = (T1 + T2 + … + Tn)/n + (E1 + E2 + … + En)/n*

The first part of that is the true average temperature which we want to know and the second part is the average of the errors, exactly as in the first example. In this case averaging reduces the uncertainty of the estimated average temperature if condition X still holds. In other words, **if there is a Condition X in which averaging multiple measurements of the same thing together reduces the uncertainty then if Condition X continues to hold in situations where the thing being measured changes then the uncertainty in the average of the measured things will also reduce as the number of measurements increases.**

Admittedly, this is not the most pithy of aphorisms.

Note what this does NOT mean. It doesn’t mean that the uncertainty in any one of the measurements is lower. We don’t know *T1* or any of the other temperatures with any more certainty than before. It is only the average of all the temperatures that would have a lower uncertainty.
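The changing-temperature case can be sketched the same way. In the Python below the drifting bath (a warming trend plus a wobble) is entirely hypothetical, as are the function names and the error standard deviation of 0.5, and the errors are again assumed to be independent, zero-mean and normal. The quantity tracked is the difference between the average of the measurements and the true average temperature.

```python
import math
import random
import statistics

def true_temperatures(n):
    """Hypothetical drifting bath: a slow warming trend plus a gentle wobble."""
    return [20.0 + 5.0 * i / n + math.sin(i / 3.0) for i in range(n)]

def error_in_average(n, error_sd, rng):
    """Return (average of readings) - (true average temperature) for one run."""
    temps = true_temperatures(n)
    readings = [t + rng.gauss(0.0, error_sd) for t in temps]  # M_i = T_i + E_i
    return sum(readings) / n - sum(temps) / n

def spread_of_error(n, error_sd, trials=2000, seed=1):
    """Standard deviation of the error in the average over many simulated runs."""
    rng = random.Random(seed)
    errors = [error_in_average(n, error_sd, rng) for _ in range(trials)]
    return statistics.pstdev(errors)

for n in (1, 10, 100):
    print(n, round(spread_of_error(n, 0.5), 3))
```

The true temperatures drop out of the difference, so only the average of the errors is left and the spread should fall with *n* just as it does for a constant bath. Each individual reading still carries the full error of 0.5; nothing here makes *T1* any better known.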

**What might Condition X be?**

One possible Condition X is that the errors, *E1 … En*, are independent and all normally distributed with a mean of zero and some fixed standard deviation. In that case we know that the average of the errors in a large number of measurements will tend to the mean of zero. Actually, the errors need not be normal: for any independent errors drawn from a distribution with a mean of zero and finite variance, averaging a large number of observations together will reduce the error towards zero, with the standard deviation of the average error falling as the single-measurement standard deviation divided by the square root of *n*.
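The claim that normality is not needed is easy to check numerically. As a rough sketch, the Python below swaps in uniformly distributed errors on a hypothetical interval from -1 to 1 (mean zero, finite variance, independent draws) and compares the simulated spread of the average error with the standard result for independent errors, the single-measurement standard deviation divided by the square root of *n*. The function names and the interval are my own choices for illustration.

```python
import math
import random
import statistics

def spread_of_mean_error(n, half_width=1.0, trials=2000, seed=2):
    """Simulated s.d. of the average of n independent Uniform(-w, w) errors."""
    rng = random.Random(seed)
    means = [
        sum(rng.uniform(-half_width, half_width) for _ in range(n)) / n
        for _ in range(trials)
    ]
    return statistics.pstdev(means)

def predicted_spread(n, half_width=1.0):
    """s.d. of Uniform(-w, w) is w / sqrt(3); averaging n divides it by sqrt(n)."""
    return (half_width / math.sqrt(3.0)) / math.sqrt(n)

for n in (4, 25, 100):
    print(n, round(spread_of_mean_error(n), 4), round(predicted_spread(n), 4))
```

The two columns should agree to within the sampling noise of the simulation.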

**Does Condition X hold in general?**

No. One can imagine distributions where the mean is not zero or the errors are not independent or the distribution keeps changing. It is important always to bear this in mind when evaluating uncertainties in measurements of any kind. Almost all cases are non-ideal and some are manifestly horrid. The point, however, is that **if there is a Condition X in which averaging multiple measurements of the same thing together reduces the uncertainty then if Condition X continues to hold in situations where the thing being measured changes then the uncertainty in the average of the measured things will also reduce as the number of measurements increases.**
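One way Condition X can fail is sketched below, again in Python and again with made-up numbers. Here every reading in a run shares a common error (think of a thermometer that is miscalibrated by an unknown amount) on top of its own independent error, so the errors are no longer independent of one another. The split into a "shared" and an "independent" part is an assumption of this toy model, not a general description of real instruments.

```python
import random
import statistics

def spread_with_shared_error(n, shared_sd, independent_sd, trials=2000, seed=3):
    """s.d. of the average error when all n readings share one common error."""
    rng = random.Random(seed)
    errors_in_average = []
    for _ in range(trials):
        shared = rng.gauss(0.0, shared_sd)  # drawn once, same for every reading in the run
        errors = [shared + rng.gauss(0.0, independent_sd) for _ in range(n)]
        errors_in_average.append(sum(errors) / n)
    return statistics.pstdev(errors_in_average)

# Hypothetical numbers: shared error s.d. 0.3, independent error s.d. 0.5.
for n in (1, 10, 100):
    with_shared = spread_with_shared_error(n, 0.3, 0.5)
    without_shared = spread_with_shared_error(n, 0.0, 0.5)
    print(n, round(with_shared, 3), round(without_shared, 3))
```

In this toy model the independent part still averages down, but the spread levels off near the size of the shared error however many readings are taken. That is the kind of non-ideal behaviour to look out for before trusting an average.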

[Update 22/04/2016: Moyhu has a nice post on a very similar topic and goes into more mathematical detail about Condition X.]