A friend of mine on Facebook left this comment on one of my tweets:
Statistically? What does that mean? I mean, I think statistics like "1 in 5" or is that just NASA's way of saying '09 had an average temp equal to '98's average temp?
It's a pretty reasonable question, and it underlines the misperceptions about how temperature is measured. John's second point is pretty close to the mark, but there are some subtleties that get lost along the way.
So, first consider the way these temperatures are measured. Obviously, there is inherent instrument error, but there are much larger sources of error. Consider what would happen if you had 100 sensors separated by 1 m each in Death Valley, one sensor in the middle of the Pacific, and one in Antarctica. Obviously, a plain arithmetic mean of all 102 readings is grossly inadequate, since Death Valley would dominate it. So, the data has to be integrated in some weighted manner, which introduces statistical error. Then consider local effects. Say you have a thermometer at location X, but if you move it one meter to the left, a building casts a shadow over it. Which reading is "more true"? People, critters, etc. will experience both temperatures. Do you use both? What about locations that have such a situation, but only one sensor? You need a model to integrate the data.
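To make the Death Valley example concrete, here is a minimal Python sketch. The readings and the equal per-region weights are invented purely for illustration, and `region_weighted_mean` is a hypothetical helper; real temperature products use far more careful spatial weighting than this.

```python
# Hypothetical readings in degrees C; every value here is invented for illustration.
readings = [("Death Valley", 45.0)] * 100 + [("Pacific", 20.0), ("Antarctica", -40.0)]

def naive_mean(readings):
    """Plain arithmetic mean over every sensor, regardless of location."""
    return sum(temp for _, temp in readings) / len(readings)

def region_weighted_mean(readings, weights):
    """Average within each region first, then combine the regions by weight."""
    by_region = {}
    for region, temp in readings:
        by_region.setdefault(region, []).append(temp)
    total_weight = sum(weights[region] for region in by_region)
    weighted_sum = sum(
        weights[region] * sum(temps) / len(temps)
        for region, temps in by_region.items()
    )
    return weighted_sum / total_weight

# Assumption: each region counts equally. A real analysis might weight by area.
weights = {"Death Valley": 1.0, "Pacific": 1.0, "Antarctica": 1.0}

print(round(naive_mean(readings), 2))                     # 43.92
print(round(region_weighted_mean(readings, weights), 2))  # 8.33
```

The naive mean is basically just the Death Valley number, while the weighted one depends entirely on the weights you pick. That choice is part of the model, and part of the error.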
Then consider what it means for the temps to be equal. Which is the higher temp: 30 +/- 0.3, or 29.8 +0.5/-0.1? What about 30.1 +/- 0.7? There are good arguments on various sides, but the most honest thing to say is that they are statistically equivalent; that is, each temperature lies within the error bars of the others. However, the media would probably report the 29.8 temperature as the lowest and the 30.1 temperature as the highest, though there is a compelling argument to be made for the reverse.
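The "within each other's error bars" check for those three temperatures can be sketched in a few lines of Python. The labels A, B, and C and the helper names are mine, and this is the loose definition of equivalence used in this post, not a formal hypothesis test.

```python
from itertools import permutations

# (central value, lower error, upper error), taken from the three examples above
temps = {
    "A": (30.0, 0.3, 0.3),
    "B": (29.8, 0.1, 0.5),
    "C": (30.1, 0.7, 0.7),
}

def interval(measurement):
    """Return the (low, high) range covered by the error bars."""
    value, minus, plus = measurement
    return value - minus, value + plus

def within_error_bars(a, b):
    """True if the central value of a lies inside the error bars of b."""
    low, high = interval(b)
    return low <= a[0] <= high

# Check every ordered pair: is each central value inside every other range?
all_equivalent = all(
    within_error_bars(temps[x], temps[y]) for x, y in permutations(temps, 2)
)
print(all_equivalent)  # True
```

Note that A and B cover the same range, 29.7 to 30.3, even though their central values differ.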
Here are a couple of fairly extreme examples. Which is larger: 32 +/- 0.1, or 30 +/- 5? There is around a 30% chance that the second temperature is higher, and possibly by as much as 3 degrees, which is significant. Statistically, you can't tell the difference between the two. A second case would be a stream of years with steadily improving detectors. If you had a five-year span in which the error bars kept shrinking, and each subsequent year's range sat entirely within the error bars of the previous year, could you say anything about trends in that data? Of course not.
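Where a number like "around 30%" comes from depends on what you assume the error bars mean. Here is a rough Python sketch under two assumptions of my own: either the error bars are one standard deviation of independent normal distributions, or the second temperature is equally likely to be anywhere in its range and the first is treated as exact. Neither is necessarily how any real dataset defines its errors.

```python
import math
from statistics import NormalDist

# (central value, symmetric error) for the two extreme examples above
first = (32.0, 0.1)
second = (30.0, 5.0)

def prob_second_higher_normal(first, second):
    """Assumes the error bars are 1-sigma of independent normal distributions."""
    diff_mean = second[0] - first[0]
    diff_sigma = math.hypot(first[1], second[1])  # errors add in quadrature
    return 1.0 - NormalDist(mu=diff_mean, sigma=diff_sigma).cdf(0.0)

def prob_second_higher_uniform(first, second):
    """Assumes the second value is uniform over its range and the first is exact."""
    low, high = second[0] - second[1], second[0] + second[1]
    threshold = min(max(first[0], low), high)  # clamp into [low, high]
    return (high - threshold) / (high - low)

p_normal = prob_second_higher_normal(first, second)
p_uniform = prob_second_higher_uniform(first, second)
print(f"normal: {p_normal:.2f}, uniform: {p_uniform:.2f}")
# prints: normal: 0.34, uniform: 0.30
```

Either way you land at roughly a one-in-three chance, which is far too large to call the first temperature the clear winner.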
Scientific data without error bars isn't scientific data, and it's a travesty the media almost never mentions error. There is a reason, though, that all the caveats are required and provided by people like NASA.