Tuesday, January 2, 2024

The very idea of claiming, a year before an election, that a candidate has an X% chance of winning is gross statistical malpractice.

[This has been sitting in the queue for a while but I think it still has another month or two on the sale-by date.]

A couple of issues make talking about predictive modeling difficult: 

Predictive range -- When we say someone accurately predicted an outcome, are we talking about an event that happened the next day or the next year? Most things are easier to predict in the short range. Some are easier in the long (we'll all be dead) range. This has been particularly relevant with poll-based electoral predictions, where the track record for short-term models has been great and the track record for long-term models has been disastrous. We have an extensive history of pundits bragging about successes in the first category while hoping you'll forget about their failures in the second.

So, Is Obama Toast? by Nate Silver

Then there's modelers' luck. The problem with checking any probabilistic claim is that being right (got the outcome predicted) doesn't mean you were right (used a sound approach to estimate reasonable odds). The person who told you not to try to fill an inside straight was right and the person who told you to go for it was wrong, even if you did end up getting the card you were looking for.  
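To put a number on the poker example, here is a minimal sketch of the arithmetic. It assumes five-card draw where you hold five cards, discard one, and draw one; the variable names are just for illustration.

```python
from fractions import Fraction

# Assumed setup: five-card draw, you have seen only your own 5 cards.
unseen_cards = 52 - 5
outs = 4  # the four cards of the single rank that fills an inside straight

p_hit = Fraction(outs, unseen_cards)
odds_against = (1 - p_hit) / p_hit  # misses per hit

print(f"P(fill the straight) = {p_hit} (about {float(p_hit):.1%})")
print(f"Odds against: {float(odds_against):.2f} to 1")
```

The draw comes in a little less than one time in eleven, so the advice to fold is sound every time, including the times the card shows up.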

Back in 2011, Nate Silver said that, unless there was a major uptick in the economy, Obama had very little chance (think Russian Roulette odds) of winning the election. Instead, the economy was basically flat and yet the incumbent not only won but won by a comfortable margin. It is safe to say the model was wrong, but was it bad or merely unlucky? Based on this article's long and admirably transparent explanation, I have to go with bad, and here are some of the reasons why.

The fundamental assumption of predictive modeling is that things still work like they used to. Correlations and causal relationships from the past still hold. Data are collected in roughly the same way and the statistics derived from them have the same definitions.

The first practical implication of the fundamental assumption is that you can't push the boundaries of your data back too far. If things were too different before a certain point, you can't reasonably assume that relationships from that period will generalize to today.

How far back you can reasonably go depends on what kinds of questions you are trying to answer and what types of data you're relying on. In terms of re-elections, 1931 is certainly too far back for any kind of meaningful comparison. This would have been 80 years before Nate Silver did his analysis which is a long time with respect to making political or social comparisons. More importantly, the way public opinion was formed and measured is enormously different. Add to that the huge outlier which was the beginning of the Great Depression.

We are even further into outlier territory with the entire presidency of FDR, especially if we're talking about the concept of re-election. (Silver goes back to 1944 in his analysis.) Truman is also problematic for a number of reasons, not the least of which being the fact he was not technically re-elected. The same concerns apply to LBJ and Gerald Ford.

This leaves us with Eisenhower, Nixon, Carter, Reagan, HW Bush, Clinton, and George W Bush. 

N equals 7.

Even if we ignore the distinction between election and reelection (which is a pretty big jump) and look at all elections going back to 1952, which is about the maximum I would be comfortable with, we're still looking at 15 elections to take us to Obama versus Romney. 

N equals 15.

(If we were just looking at win/loss, one of those 15 data points is missing since we will never know who actually won the 2000 election.)

That would be a small sample under the best of circumstances, but in this case we also have messy data, major one-time events like the Cuban Missile Crisis, the Vietnam War, the Watts riots and the Iranian hostage crisis, not to mention waaaaaaay more than 15 researcher degrees of freedom.

Case in point. Look at how Silver handles the 800 lb gorilla of the model.

A president’s approval rating at the beginning of his third year in office has historically had very little correlation to his eventual fate. In January 1983, Reagan had an approval rating of just 37 percent, but he won in a landslide. George H. W. Bush had a 79 percent approval rating in January 1991 and was soundly defeated. But voters start to think differently about a president over the course of his third year; they view him more on the basis of his performance and less on the hopes they had for him. These perceptions are sharpened by the beginning of the opposition party’s primary campaign, which, of course, accentuates the negatives.

A president’s approval rating toward the end of his third year, therefore, has been a decent (although imperfect [I love how Silver throws in these little qualifiers while getting further and further ahead of the data -- MP]) predictor of his chances of victory. Reagan saw his approval rating shoot up to 51 percent in November 1983 amid the V-shaped recovery from the recession of the previous year — the first sign that he was headed for a big win. Obama’s approval rating may have rebounded by a point or two from its lows after the debt-ceiling debacle — but not by much more than that. In late October, it ranged between 40 and 46 percent in different polls and averaged about 43 percent.

Look at the forks. Of the various factors we can put in the model, we pick approval rating but the fit to our fourteen data points is still crappy, so we limit ourselves to an arbitrary interval. Silver tells a good story to justify setting the cut-off at the end of the third year, but that's all it is, a story, and even if it's true, we have no way of knowing if that particular cut-off will be appropriate going forward.
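To see why the forks matter with so few elections, here is a rough simulation sketch. It is not Silver's model and uses no real election data: the outcome and every candidate predictor are pure noise, and `n_candidates=20` is an arbitrary stand-in for the number of variable and time-window combinations an analyst might try before settling on one.

```python
import random
import statistics


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def best_spurious_r(rng, n_elections=15, n_candidates=20):
    """Largest |r| between a noise outcome and any of several noise predictors."""
    outcome = [rng.gauss(0, 1) for _ in range(n_elections)]
    return max(
        abs(pearson([rng.gauss(0, 1) for _ in range(n_elections)], outcome))
        for _ in range(n_candidates)
    )


def simulate(n_candidates, trials=2000, seed=0):
    """Repeat the search many times and collect the winning |r| from each run."""
    rng = random.Random(seed)
    return [best_spurious_r(rng, n_candidates=n_candidates) for _ in range(trials)]


one_shot = simulate(n_candidates=1)     # a single pre-specified predictor
many_forks = simulate(n_candidates=20)  # keep whichever of 20 fits best

print(f"median |r|, one predictor:  {statistics.median(one_shot):.2f}")
print(f"median |r|, best of twenty: {statistics.median(many_forks):.2f}")
```

With fifteen data points, keeping the best of a handful of candidates reliably produces a respectable-looking correlation from noise alone, which is why a good story attached to the chosen cut-off is not evidence that the cut-off means anything.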

Silver also considered

The good news is that voters have short memories. If there are hopeful signs during an election year, they may be willing to forget earlier problems. Reagan, Nixon, Eisenhower and Truman all won despite recessions earlier in their terms. Moreover, voters’ evaluations of the economy are relatively forward-looking. Even if the economy is below its full productive capacity — as it was in November 1984 when the unemployment rate was 7.2 percent, and as it certainly was in 1936, when it was still around 17 percent — voters may be willing to overlook this, provided it seems headed in the right direction.


  1. I think there's also a problem with characterizing a recovery as "U-shaped" or "V-shaped." Life isn't that simple.


  2. It would be interesting to know how much these and other economic assumptions figured in his model. -- MP