## Wednesday, April 6, 2011

### Andrew Gelman buries the lede

As Joseph mentioned earlier, Andrew Gelman has a must-read post up at the Monkey Cage. The whole thing is worth checking out, but for me the essential point came at the end:

Internal (probabilistic) vs. external (statistical) forecasts

In statistics we talk about two methods of forecasting. An internal forecast is based on a logical model that starts with assumptions and progresses forward to conclusions. To put it in the language of applied statistics: you take x, and you take assumptions about theta, and you take a model g(x,theta) and use it to forecast y. You don't need any data y at all to make this forecast! You might use past y's to fit the model and estimate the thetas and test g, but you don't have to.

In contrast, an external forecast uses past values of x and y to forecast future y. Pure statistics, no substantive knowledge. That's too bad, but the plus side is that it's grounded in data.

A famous example is the space shuttle crash in 1986. Internal models predicted a very low probability of failure (of course! otherwise they wouldn't have sent that teacher along on the mission). Simple external models said that in about 100 previous launches, 2 had failed, yielding a simple estimate of 2%.

We have argued, in the context of election forecasting, that the best approach is to combine internal and external approaches.

Based on the plausibility analysis above, the Beach et al. forecast seems to me to be purely internal. It's great that they're using real economic knowledge, but as a statistician I can see what happens when your forecast is not grounded in the data. Short-term, I suggest they calibrate their forecasts by applying them to old data to forecast the past (this is the usual approach). Long-term, I suggest they study the problems with their forecasts and use these flaws to improve their model.

When a model makes bad predictions, that's an opportunity to do better.

All too often, we treat models the way the ancient Greeks might have treated the Oracle of Delphi: as an ultimate and unknowable authority. If we're going to use models in our debates, we also need to talk about where they come from, what assumptions go into them, and how range-of-data concerns might affect them.
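To make the internal/external distinction concrete, here is a minimal Python sketch using the launch figures from the quoted passage (about 100 launches, 2 failures). The internal probability of 0.001 and the weight of 50 pseudo-launches are made-up numbers for illustration, not anything from Gelman's post, and the Beta-prior update is just one simple way to blend the two kinds of forecast, not necessarily the method he has in mind.

```python
def external_forecast(failures: int, trials: int) -> float:
    """Empirical failure rate from the past record alone (no substantive model)."""
    if trials <= 0:
        raise ValueError("need at least one past trial")
    return failures / trials


def combined_forecast(internal_p: float, prior_weight: float,
                      failures: int, trials: int) -> float:
    """Blend an internal forecast with the observed record.

    The internal forecast is treated as a Beta prior worth `prior_weight`
    pseudo-trials; the result is the posterior mean after seeing the data.
    """
    alpha = internal_p * prior_weight + failures
    beta = (1 - internal_p) * prior_weight + (trials - failures)
    return alpha / (alpha + beta)


# Figures from the quoted example: about 100 launches, 2 failures.
external = external_forecast(failures=2, trials=100)

# Hypothetical internal (engineering-model) estimate and how much to trust it.
internal = 0.001
combined = combined_forecast(internal, prior_weight=50, failures=2, trials=100)

print(f"internal: {internal:.4f}  external: {external:.4f}  combined: {combined:.4f}")
```

The combined number lands between the two inputs, and the weight is exactly the kind of assumption that should be stated out loud: raise it and the model dominates, lower it and the data do.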