Bryan Caplan  

Meteorological Impossibilities

The Weather Channel's daily and hourly forecasts often seem logically incompatible.  Consider Oakton, VA's forecast for today.  The current daily prediction says "60% chance of rain."  But several evening hours individually have the same probability of 60%.  Unless I'm missing something, this is only possible if those probabilities are perfectly dependent (if rain happens, it happens during every hour) or negatively dependent (if rain happens one hour, it doesn't happen during other hours). 

These extreme cases seem unlikely.  The ironclad puzzle, though, is that the current forecast for 7 PM is a 70% chance of rain.  How can an hour have 70% when the whole day only has 60%?  Nor is this a fluke case; in my experience, hourly rain probabilities slightly above the daily probabilities pop up every few days.

I'm tempted to dismiss my own puzzlement by quoting The Simpsons:
Comic Book Guy: Last night's Itchy & Scratchy was, without a doubt, the worst episode ever. Rest assured I was on the Internet within minutes registering my disgust throughout the world.

Bart: Hey, I know it wasn't great, but what right do you have to complain?

Comic Book Guy: As a loyal viewer, I feel they owe me.

Bart: For what? They're giving you thousands of hours of entertainment for free. What could they possibly owe you? If anything, you owe them.

Comic Book Guy: ...Worst episode ever.
Fair point, but is there anything I'm missing? 

Update: Minutes after writing this post, I realized that the problem is more severe than I thought.  The daily and multiple hourly forecasts can indeed be equal if the probabilities are perfectly dependent (or nearly perfectly dependent, with a slight rounding error).  But the "negative dependence" loophole I suggested is completely confused.  If there is a 60% chance at 6 PM and 7 PM, and rain doesn't happen at 6 PM, then any lingering positive probability of rain at 7 PM implies that the probability of rain for the day initially exceeded 60%.  This is true for partial dependence, independence, and negative dependence.
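Stated as arithmetic, the update's point can be sketched like this (the 25% conditional figure below is purely illustrative, not from any actual forecast):

```python
def day_prob(p_6pm, p_7pm_given_dry_6pm):
    """Lower bound on the whole-day rain probability from two hours."""
    # P(rain during day) >= P(rain at 6 PM)
    #                       + P(dry at 6 PM) * P(rain at 7 PM | dry at 6 PM)
    return p_6pm + (1 - p_6pm) * p_7pm_given_dry_6pm

print(round(day_prob(0.60, 0.00), 2))  # 0.6: perfect dependence keeps the day at 60%
print(round(day_prob(0.60, 0.25), 2))  # 0.7: any residual 7 PM chance pushes the day above 60%
```

Whatever the dependence structure, the second term is nonnegative, so the daily probability can equal 60% only when the conditional 7 PM chance is zero.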


COMMENTS (13 to date)
Stefan writes:

I've had the same issue with some weather data reported in weather apps. It appears that the probability reported is normalized to some concept of a 24-hour probability. Sort of like a Poisson process hazard. Seeing the actual formula would be nice -- they could obviously be doing something less coherent.

Adam writes:

I think the issue is that rain probability depends on both time and area. You can read it three ways: at any given time, 60% of the area will have rain; any given part of the area has a 60% chance of rain at some time; or any given part of the area will have rain 60% of the time.

The formula is: PoP = C x A

Probability of Precipitation equals Confidence of rain somewhere in the forecast area times the percent of the Area forecast to receive precipitation.

Smaller areas can have significantly larger probabilities than the whole (think a mountain top that almost always receives rain relative to the nearby valley that receives much less). The same is true with smaller time slices.
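Adam's formula can be sketched numerically; the confidence and coverage numbers below are invented for illustration:

```python
def pop(confidence, area_fraction):
    """PoP = C x A: confidence rain occurs somewhere, times areal coverage."""
    return confidence * area_fraction

# Whole forecast area: 80% confident rain occurs somewhere, covering half the area.
print(round(pop(0.80, 0.50), 2))  # 0.4 -> the region reports "40% chance of rain"

# The mountain-top subarea: same confidence, but if rain comes it almost
# always hits the peak, so the coverage fraction there is near 1.
print(round(pop(0.80, 0.95), 2))  # 0.76 -> the subarea's PoP exceeds the region's
```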

Steve writes:

I'm pretty sure Adam is correct on this one; this is how I've heard it explained before. That is to say, the probabilities are expressed relative to both the forecast area and the time interval itself.

Stefan writes:

I don't think Adam is correct. The formula and interpretation (via http://www.srh.noaa.gov/ffc/?n=pop) are

The "Probability of Precipitation" (PoP) describes the chance of precipitation occurring at any point you select in the area.

How do forecasters arrive at this value?

Mathematically, PoP is defined as follows:

PoP = C x A where "C" = the confidence that precipitation will occur somewhere in the forecast area, and where "A" = the percent of the area that will receive measurable precipitation, if it occurs at all.
So... in the case of the forecast above, if the forecaster knows precipitation is sure to occur ( confidence is 100% ), he/she is expressing how much of the area will receive measurable rain. ( PoP = "C" x "A" or "1" times ".4" which equals .4 or 40%.)

But, most of the time, the forecaster is expressing a combination of degree of confidence and areal coverage. If the forecaster is only 50% sure that precipitation will occur, and expects that, if it does occur, it will produce measurable rain over about 80 percent of the area, the PoP (chance of rain) is 40%. ( PoP = .5 x .8 which equals .4 or 40%. )

In either event, the correct way to interpret the forecast is: there is a 40 percent chance that rain will occur at any given point in the area.

The key point is that 'at any given point' is not the same as 'at any point'.

Samuel Hammond writes:

In Nate Silver's book Signal and Noise he has a whole chapter on weather forecasters. He explains in detail how major weather news sites regularly calibrate their PoP numbers to adjust for availability heuristic bias.

The idea is that if the PoP is genuinely 5% people read that as "I don't need my umbrella". But if 1 in 20 forecasts leads to people getting poured on, they have no recollection of the 19 times it was sunny and so cry bloody murder at the meteorologists. So if the true PoP is 5% many organizations report it as 20% and so on.

This is a reliable phenomenon, and Nate's book has an incredible graph showing the differential between real and reported PoP. In essence, weather providers optimize for viewer satisfaction, not accuracy. So it's not surprising that the numbers don't add up. They're not true probabilistic estimates; they're pseudo-estimates designed to match human behavior.

Samuel Hammond writes:

I found the excerpt from The Signal and the Noise, Nate Silver, 2012, Chapter 4:

For instance, the for-profit weather forecasters rarely predict exactly a 50 percent chance of rain, which might seem wishy-washy and indecisive to consumers. Instead, they'll flip a coin and round up to 60, or down to 40, even though this makes the forecasts both less accurate and less honest.

Floehr also uncovered a more flagrant example of fudging the numbers, something that may be the worst-kept secret in the weather industry. Most commercial weather forecasts are biased, and probably deliberately so. In particular, they are biased toward forecasting more precipitation than will actually occur, what meteorologists call a "wet bias." The further you get from the government's original data, and the more consumer-facing the forecasts, the worse this bias becomes. Forecasts "add value" by subtracting accuracy.

...The National Weather Service's forecasts are, it turns out, admirably well calibrated (figure 4-7). When they say there is a 20 percent chance of rain, it really does rain 20 percent of the time. They have been making good use of feedback, and their forecasts are honest and accurate. The meteorologists at the Weather Channel will fudge a little bit under certain conditions. Historically, for instance, when they say there is a 20 percent chance of rain, it has actually only rained about 5 percent of the time. In fact, this is deliberate and is something the Weather Channel is willing to admit to. It has to do with their economic incentives.

People notice one type of mistake, the failure to predict rain, more than another kind, false alarms. If it rains when it isn't supposed to, they curse the weatherman for ruining their picnic, whereas an unexpectedly sunny day is taken as a serendipitous bonus. It isn't good science, but as Dr. Rose at the Weather Channel acknowledged to me: "If the forecast was objective, if it has zero bias in precipitation, we'd probably be in trouble."

...forecasts were quite a bit worse than those issued by the National Weather Service, which they could have taken for free from the Internet and reported on the air. And they weren't remotely well calibrated. In Eggleston's study, when a Kansas City meteorologist said there was a 100 percent chance of rain, it failed to rain about one-third of the time.

Michael Crone writes:
The ironclad puzzle, though, is that the current forecast for 7 PM is a 70% chance of rain. How can an hour have 70% when the whole day only has 60%? Nor is this a fluke case; in my experience, hourly rain probabilities slightly above the daily probabilities pop up every few days.

It seems unlikely that the weather channel would fudge in a way that left an hourly percent higher than a daily percent for the same day, since that's too obviously a problem. My first theory for that inconsistency is that the hourly and daily forecasts are updated in different time intervals and Bryan is comparing a daily prediction that is several hours old with an hourly prediction that is much more recent.

Sieben writes:

"Bart: For what? They're giving you thousands of hours of entertainment for free. What could they possibly owe you? If anything, you owe them."

It's not free. I watched the advertisements. Worst episode ever.

Thomas Boyle writes:

The ironclad puzzle, though, is that the current forecast for 7 PM is a 70% chance of rain. How can an hour have 70% when the whole day only has 60%? Nor is this a fluke case; in my experience, hourly rain probabilities slightly above the daily probabilities pop up every few days.


I have always interpreted things like this to mean that "the probability that it will be raining at a randomly-chosen time through the whole day is 60%; but the probability that it will be raining at a randomly-chosen time during that hour is 70%".

These are entirely consistent.
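Under Thomas's reading, the daily number is the time-average of the hourly numbers, so individual hours can sit above it. A toy hourly schedule (all values invented) makes the consistency concrete:

```python
# 24 hourly "probability of rain at a random moment in this hour" values:
# a wetter 6-9 PM stretch at 70%, the other 21 hours at 58.5%.
hourly = [0.70 if h in (18, 19, 20) else 0.585 for h in range(24)]

daily = sum(hourly) / 24  # probability of rain at a random moment in the day
print(round(daily, 2))    # 0.6, even though three evening hours report 0.7
```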

Stefan writes:

Why doesn't anyone find my explanation plausible enough to engage with? The numbers here are hazards, but reported standardized to 24-hour probabilities.

the current forecast for 7 PM is a 70% chance of rain. How can an hour have 70% when the whole day only has 60%?

Here, the day has a 60% probability, which is generated by a hazard of '60% over 24 hours'. If you look at each hour of those 24 hours in the day, some hours have hazards above and some have hazards below the hazard that gets you 60% over a day. That's what's going on.

The weather people don't believe people can handle hazard time aggregation, so they do it for you.
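Stefan's conjecture can be made concrete with a Poisson-hazard sketch. The functional form and every rate below are assumptions for illustration, not anything the Weather Channel has documented:

```python
import math

def prob_from_hazard(total_hazard):
    """Probability of at least one rain event, given an integrated hazard."""
    return 1 - math.exp(-total_hazard)

# Pick a 7 PM hourly rate whose 24-hour-normalized probability is 70% ...
lam_7pm = -math.log(1 - 0.70) / 24
# ... and spread the rest of the day's hazard (daily prob 60%) over 23 hours.
lam_other = (-math.log(1 - 0.60) - lam_7pm) / 23

day = prob_from_hazard(23 * lam_other + lam_7pm)
hour_7pm_normalized = prob_from_hazard(24 * lam_7pm)

print(round(day, 2))                  # 0.6 -> the day reports 60%
print(round(hour_7pm_normalized, 2))  # 0.7 -> 7 PM reports 70%
```

On this reading the two numbers answer different questions, so no contradiction arises: the 70% is what the day would look like if the 7 PM rain intensity persisted for 24 hours.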

Flocccina writes:

Could it be that the daily forecast is not updated after the day begins but the hourly forecasts are?

Glenn writes:

"I have always interpreted things like this to mean that 'the probability that it will be raining at a randomly-chosen time through the whole day is 60%; but the probability that it will be raining at a randomly-chosen time during that hour is 70%'."

This strikes me as unlikely. A reasonable interpretation of the "probability of precipitation" is that it is the likelihood of measurable rainfall sometime in the next 24 hours.

Given that, if the weather services intended it to be interpreted otherwise, they would likely say so. The interpretation you offer - that it is a cumulative probability of rainfall at any random point in time - would qualify as an unexpected meaning demanding interpretation; when checking my AM forecast, it would not be telling me whether I need an umbrella TODAY, but whether I need it AT THIS MOMENT. The latter issue requires no forecast, but is readily resolved by looking outside.

The most probable explanation is that hourly forecasts are updated with greater frequency than dailies.

Granite26 writes:

Samuel has most of it, in that there is an overestimation bias.

Adam has a good point as well, but can be generalized.

If you cut the area into east and west, and the wind is blowing from the east, then there can be a 60% chance of rain in the east at 5pm, then a 60% chance of rain in the west at 6pm.

At best you're getting the granularity of a zip code, but I would be very surprised if the hourly forecasts were ACTUALLY that granular.
