Thursday, October 4, 2018

Repost -- some thoughts on poll volatility and self-selection

Tuesday, November 6, 2012

Life on 49-49

[Following up on this post, here are some more (barely) pre-election thoughts on how polls gang aft agley. I believe Jonathan Chait made some similar points. Some of Nate Silver's critics also wandered into some neighboring territory (with the important distinction that Chait understood the underlying concepts).]

Assume that there's an alternate world called Earth 49-49. This world is identical to ours in all but one respect: for almost all of the presidential campaign, 49% of the voters support Obama and 49% support Romney. There has been virtually no shift in who plans to vote for whom.

Despite this, all of the people on 49-49 believe that they're on our world, where large segments of the voters are shifting their support from Romney to Obama, then from Obama to Romney. They weren't misled into this belief through fraud -- all of the polls were administered fairly and answered honestly -- nor was it a case of stupidity or bad analysis -- the political scientists on 49-49 are highly intelligent and conscientious. Rather, it had to do with the nature of polling.

Pollsters had long tracked campaigns by calling random samples of potential voters. As campaigns became more drawn out and journalistic focus shifted to the horse race aspects of elections, these phone polls proliferated. At the same time, though, the response rates dropped sharply, going from more than one in three to less than one in ten.

A big drop in response rates always raises questions about selection bias since the change may not affect all segments of the population proportionally (more on that -- and this report -- later). It also increases the potential magnitude of these effects.

Consider these three scenarios. What would happen if you could do the following (in the first two cases, assume no polling bias):

A. Convince one percent of undecideds to support you. Your support goes to 50 while your opponent stays at 49 -- one percent poll advantage.

B. Convince one percent of opponent's supporters to support you. Your support goes to 50 while your opponent drops to 48 -- two percent poll advantage.

C. Convince an additional one percent of your supporters to answer the phone when a pollster calls. With response rates around one in ten, you go to over 51% while your opponent drops to under 47% -- around a five percent poll advantage.
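
The arithmetic behind the three scenarios can be checked with a short sketch. It assumes a baseline response rate of exactly one in ten and reads "one percent" as one percentage point in each case; the function name `poll_shares` and the 49/49/2 split between the two candidates and undecideds are illustrative choices, not anything taken from real polling data.

```python
def poll_shares(support, response_rates):
    """Return each group's share of poll respondents, in percent.

    support: each group's share of the electorate, in percent.
    response_rates: probability that a member of each group answers the poll.
    """
    respondents = {g: support[g] * response_rates[g] for g in support}
    total = sum(respondents.values())
    return {g: 100 * n / total for g, n in respondents.items()}


def margin(shares):
    """Poll advantage of 'you' over 'opponent', in percentage points."""
    return shares["you"] - shares["opponent"]


BASE_RATE = 0.10  # assumption: one in ten people answer the pollster

baseline = {"you": 49, "opponent": 49, "undecided": 2}
equal_rates = {g: BASE_RATE for g in baseline}

# A: one point of the electorate moves from undecided to you.
a = poll_shares({"you": 50, "opponent": 49, "undecided": 1}, equal_rates)

# B: one point of the electorate moves from your opponent to you.
b = poll_shares({"you": 50, "opponent": 48, "undecided": 2}, equal_rates)

# C: nobody changes sides, but your supporters' response rate
# rises by one percentage point (0.10 -> 0.11).
c_rates = dict(equal_rates, you=BASE_RATE + 0.01)
c = poll_shares(baseline, c_rates)

for name, shares in (("A", a), ("B", b), ("C", c)):
    print(name, round(shares["you"], 1), round(shares["opponent"], 1),
          round(margin(shares), 1))
```

Under these assumptions, scenario C moves the poll margin by more than A and B combined even though no one has changed their vote. The lower the baseline response rate, the larger the effect of the same one-point change.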

Of course, no one was secretly plotting to game the polls, but poll responses are basically just people agreeing to talk to you about politics, and lots of things can affect people's willingness to talk about their candidate, including things that would almost never affect their actual votes (at least not directly, but more on that later).

In 49-49, the Romney campaign hit a stretch of embarrassing news coverage while Obama was having, in general, a very good run. With a couple of exceptions, the stories were trivial, certainly not the sort of thing that would cause someone to jump the substantial ideological divide between the two candidates, so none of Romney's supporters shifted to Obama or to undecided. Many did, however, feel less and less like talking to pollsters. So Romney's numbers started to go down, which only made his supporters more depressed and reluctant to talk about their choice.

This reluctance was already just starting to fade when the first debate came along. As Josh Marshall has explained eloquently and at great length since early in the primaries, the idea of Obama, faced with a strong attack and deprived of his teleprompter, collapsing in a debate was tremendously important and resonant to the GOP base. That belief was a major driver of the support for Gingrich, despite all his baggage; no one ever accused Newt of being reluctant to go for the throat.

It's not surprising that, after weeks of bad news and declining polls, the effect on the Republican base of getting what looked very much like the debate they'd hoped for was cathartic. Romney supporters who had been avoiding pollsters suddenly couldn't wait to take the calls. By the same token, Obama supporters who got their news from Ed Schultz and Chris Matthews really didn't want to talk right now.

The polls shifted in Romney's favor even though, had the election been held the week after the debate, the result would have been the same as it would have been had the election been held two weeks before -- 49% to 49%. All of the changes in the polls had come from core voters on both sides. The voters who might have been persuaded weren't that interested in the emotional aspect of the conventions and the debates and were already familiar with the substantive issues both events raised.

So response bias was amplified by these factors:

1. the effect was positively correlated with the intensity of support

2. it was accompanied by matching but opposite effects on the other side

3. there were feedback loops -- supporters of candidates moving up in the polls were happier and more likely to respond while supporters of candidates moving down had the opposite reaction.
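
The second and third factors can be illustrated with a toy simulation. This is only a sketch under made-up assumptions: the function name `simulate_polls`, the parameters `feedback`, `decay`, and `base_rate`, and all of the numeric values are hypothetical and not estimated from any data. True support is held at 49-49 throughout; the only thing that moves is how willing each side is to answer the phone.

```python
def simulate_polls(shocks, feedback=0.0005, decay=0.5, base_rate=0.10):
    """Toy model of poll margins on Earth 49-49.

    shocks:   per-period news shocks to the gap between the two sides'
              response rates (positive values favor side A).
    feedback: how much one point of last period's poll margin widens
              the response-rate gap in the next period.
    decay:    fraction of the previous gap that persists on its own.
    Returns the poll margin (A minus B, in points) for each period.
    """
    gap = 0.0      # response rate of A's supporters minus B's
    margin = 0.0   # last published poll margin
    margins = []
    for shock in shocks:
        # Good polls make supporters more willing to talk (feedback loop).
        gap = decay * gap + shock + feedback * margin
        # Matching but opposite effects on the two sides.
        rate_a = base_rate + gap / 2
        rate_b = base_rate - gap / 2
        resp_a = 49 * rate_a
        resp_b = 49 * rate_b
        resp_u = 2 * base_rate  # undecideds respond at the base rate
        margin = 100 * (resp_a - resp_b) / (resp_a + resp_b + resp_u)
        margins.append(margin)
    return margins


# One good news cycle for side A, then nothing further happens.
shocks = [0.01, 0.0, 0.0, 0.0]
with_feedback = simulate_polls(shocks)
without_feedback = simulate_polls(shocks, feedback=0.0)

print([round(m, 2) for m in with_feedback])
print([round(m, 2) for m in without_feedback])
```

In this sketch a single one-point shock to the response-rate gap produces a poll lead of nearly five points, and the feedback term keeps that lead from fading as quickly as it otherwise would, all without a single voter changing sides.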

You might wonder how the pollsters and political scientists of this world missed this. The answer is that they didn't. They were concerned about selection effects and falling response rates, but the problems with the data were difficult to catch definitively thanks to some serious obscuring factors:

1. Researchers have to base their conclusions on the historical record, which comes from a time when the effect was not nearly so big.

2. Things are correlated in a way that's difficult to untangle. The things you would expect to make supporters less enthusiastic about talking about their candidate are often the same things you'd expect to lower support for that candidate.

3. As mentioned before, there are compensatory effects. Since response rates for the two parties are inversely related, the aggregate is fairly stable.

4. The effects of embarrassment and elation tend to fade over time so that most are gone by the actual election.

5. There's a tendency for the polls to converge as the election approaches, mainly because likely voter screens become more accurate.

6. Poll predictions can be partially self-fulfilling. If the polls indicate a sufficiently low chance of winning, supporters can become discouraged, allies can desert you and money can dry up. The result is, again, convergence.
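
The third obscuring factor, the compensatory effect, is easy to see with numbers. The following is a minimal sketch with hypothetical response rates (0.11 and 0.09 around an assumed base of 0.10); the helper names are illustrative only.

```python
def overall_response_rate(support, response_rates):
    """Electorate-wide response rate: support-weighted mean of group rates."""
    total = sum(support.values())
    return sum(support[g] * response_rates[g] for g in support) / total


def poll_margin(support, response_rates, a="obama", b="romney"):
    """Poll margin of a over b among respondents, in percentage points."""
    respondents = {g: support[g] * response_rates[g] for g in support}
    return 100 * (respondents[a] - respondents[b]) / sum(respondents.values())


support = {"obama": 49, "romney": 49, "undecided": 2}

# Assumed rates: everyone at one in ten, versus one side up a point
# and the other side down a point.
calm = {"obama": 0.10, "romney": 0.10, "undecided": 0.10}
swing = {"obama": 0.11, "romney": 0.09, "undecided": 0.10}

print(overall_response_rate(support, calm), poll_margin(support, calm))
print(overall_response_rate(support, swing), poll_margin(support, swing))
```

With these assumed numbers the overall response rate is the same in both cases, so a researcher watching only the aggregate would see nothing unusual, while the poll margin moves from a tie to a lead of almost ten points.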

For the record, I don't think we live on 49-49. I do, however, think that at least some of the variability we've seen in the polls can be traced back to selection effects similar to those described here, and I have to believe it's likely to get worse.

1 comment:

  1. Jesus Christ. Back in 2012, you completely anticipated the main result of our Mythical Swing Voter paper, which is based on data we collected in 2012, analyzed in 2013, wrote up in 2014, and published in 2016, and which other people picked up on in time for the 2016 campaign.

    I probably even read your post when it came out, but I didn't get the point.

    There's something wrong with the world that your blog doesn't have a million readers.
