Monday, October 8, 2018

The tyranny of the mean or why most of the analysis you're about to hear on Kavanaugh's impact on the election may be wrong.

[Corrected the title. Everything else is the same.]


This is not a prediction.

This is not a counter analysis.


I don't want to get sucked into predictions and subjective probabilities or whether my priors can beat up your priors. Instead, I want to make a point about the framework and the assumptions generally used in these conversations and the way they often overlook the obvious.

I perhaps should have called this "the tyranny of central tendency" or possibly even thrown in something about expected value. The argument here is not limited to statements about means but they are the most familiar and by far the most often abused example.

One of our long-standing concerns here at the blog is that while journalists and commentators are far more likely to talk about data these days, they have not gotten any more sophisticated in how they think about statistics. We've already discussed naïve reductionism, the implicit and often completely inappropriate assumptions of linearity, non-interaction, and stability.

This is a good time to add a couple more to the list. When people talk about the impact of some treatment or event, there is a strong tendency to assume that the mean (or in some cases the median) has changed but everything else has remained the same. You still have a nice, symmetric Gaussian with the same variance. You've just shifted it from here to there.

These assumptions are particularly likely to bite you in the ass when the point of interest involves a quantile other than the median. For example, when rent-controlled housing in an expensive neighborhood is torn down and replaced with high-density, market-priced apartment buildings, it is entirely possible for both the average price and the amount of low-cost housing available to go down at the same time.
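To make the housing example concrete, here is a small sketch with entirely made-up numbers. The unit counts, the rents, and the `LOW_COST_CEILING` threshold are all assumptions chosen for illustration, not data from any real neighborhood.

```python
from statistics import mean

# Hypothetical monthly rents, invented purely for illustration.
LOW_COST_CEILING = 1500  # assumed definition of "low-cost"

# Before: 50 rent-controlled units plus 150 existing market-rate units.
before = [1000] * 50 + [4000] * 150

# After: the rent-controlled units are gone, replaced by 300 new
# market-priced units that rent for less than the older market stock.
after = [4000] * 150 + [2800] * 300

def low_cost_units(rents):
    """Count units at or below the assumed low-cost ceiling."""
    return sum(1 for r in rents if r <= LOW_COST_CEILING)

avg_before, avg_after = mean(before), mean(after)
cheap_before, cheap_after = low_cost_units(before), low_cost_units(after)

print(f"average rent: {avg_before:.0f} -> {avg_after:.0f}")
print(f"low-cost units: {cheap_before} -> {cheap_after}")
```

With these invented figures the average falls while the bottom of the distribution disappears completely, which is exactly the kind of change a mean-only summary hides.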

This brings us to the impact of the Kavanaugh confirmation. Over the next couple of weeks, when you hear people discussing the chances of the Democrats retaking the House, you will notice that many, probably most, of the answers will be to an entirely different question: what are the projected totals for the election?

Once again at the risk of stating the obvious, if you keep the mean the same and increase the variance of a roughly symmetric distribution, the probability of passing some cutoff that is between the max and min possible values will approach 50%. Even if you shift the mean away from the cutoff, it is still possible to increase the probability of passing that point by increasing the variance.
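A rough numerical sketch of both claims, assuming the seat count can be approximated by a normal distribution (which ignores the hard bounds at 0 and 435). The means and standard deviations below are hypothetical, not forecasts. The only real number is the 218 seats needed for a House majority.

```python
from statistics import NormalDist

MAJORITY = 218  # seats needed to control the 435-seat House

def prob_at_least(cutoff, mean, sd):
    """P(X >= cutoff) for X ~ Normal(mean, sd); a normal approximation."""
    return 1.0 - NormalDist(mu=mean, sigma=sd).cdf(cutoff)

# Hypothetical scenario: the expected Republican seat count drops
# (205 -> 203) while the spread widens (sd 8 -> 14).
before = prob_at_least(MAJORITY, mean=205, sd=8)
after = prob_at_least(MAJORITY, mean=203, sd=14)
print(f"P(hold House) before: {before:.3f}, after: {after:.3f}")

# Same mean, growing variance: the probability creeps toward 50%.
spreads = (5, 10, 20, 40, 80, 1000)
fixed_mean = [prob_at_least(MAJORITY, mean=205, sd=sd) for sd in spreads]
for sd, p in zip(spreads, fixed_mean):
    print(f"sd={sd:>4}: P(hold House) = {p:.3f}")
```

In this made-up scenario the mean moves away from the cutoff, yet the chance of clearing it goes up, because the wider spread puts more mass in the tail.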

At the moment (and that's an important qualifier), it is entirely possible that we are seeing this situation with Kavanaugh and the midterms. While we can argue about what the polls are telling us, I think it is reasonable to claim that Kavanaugh has been, through most of this process, an increasingly unpopular choice and that most news outlets outside of conservative media have raised serious questions about his fitness. At the same time, his confirmation has unquestionably enraged and energized voters on both sides. This possibly unsustainable level of enthusiasm almost inevitably increases variability. As a result, it is entirely possible for the expected number of Republican House seats to drop while the chances of the Republicans holding the House increase.

This is likely to be a short-lived phenomenon. A week from now, I expect we will have both a clearer picture and a more stable situation. For now, though, this is a good time to remind ourselves to be more careful about how we frame analytic questions.

