Non-response has become a hot topic among political writers and data journalists. I'm not entirely happy with some of the analyses we've been seeing, so I need to get serious about the thread on electoral forecasting I've been putting off for years.
In the meantime, here is our first foray into the topic.
Life on 49-49
[Following up on this post,
here are some more (barely) pre-election thoughts on how polls gang aft
agley. I believe Jonathan Chait made some similar points. Some of Nate
Silver's critics also wandered into some neighboring territory (with the
important distinction that Chait understood the underlying concepts)]
Assume that there's an alternate world called Earth 49-49. This world is
identical to ours in all but one respect: for almost all of the
presidential campaign, 49% of the voters support Obama and 49% support
Romney. There has been virtually no shift in who plans to vote for whom.
Despite this, all of the people on 49-49 believe that they're on our
world, where large segments of the voters are shifting their support
from Romney to Obama, then from Obama to Romney. They weren't misled into
this belief through fraud -- all of the polls were administered fairly
and answered honestly -- nor was it a case of stupidity or bad analysis
-- the political scientists on 49-49 are highly intelligent and
conscientious -- rather it had to do with the nature of polling.
Pollsters had long tracked campaigns by calling random samples of
potential voters. As campaigns became more drawn out and journalistic
focus shifted to the horse-race aspects of elections, these phone polls
proliferated. At the same time, though, the response rates dropped
sharply, going from more than one in three to less than one in ten.
A big drop in response rates always raises questions about selection
bias since the change may not affect all segments of the population
proportionally (more on that -- and this
report -- later). It also increases the potential magnitude of these effects.
Consider these three scenarios. What would happen if you could do the
following (in the first two cases, assume no polling bias):
A. Convince one percent of undecideds to support you. Your support goes
to 50 while your opponent stays at 49 -- one percent poll advantage
B. Convince one percent of opponent's supporters to support you. Your
support goes to 50 while your opponent drops to 48 -- two percent poll
advantage
C. Convince an additional one percent of your supporters (one more in
every hundred) to answer the phone when a pollster calls. You go to over
51% while your opponent drops to under 47% -- around a five percent poll
advantage.
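For anyone who wants to check the arithmetic, here is a minimal sketch in Python. The function names and the 9% baseline response rate are my assumptions for illustration (the post only says "less than one in ten"); scenario C is read as one additional supporter in every hundred picking up the phone.

```python
BASE_RATE = 0.09  # assumed baseline: a bit under one in ten people respond


def poll_shares(support, response_rate):
    """Return each group's share of poll respondents, in percent.

    support: share of the electorate in each group (percent)
    response_rate: fraction of each group that answers the pollster
    """
    responders = {g: support[g] * response_rate[g] for g in support}
    total = sum(responders.values())
    return {g: 100 * n / total for g, n in responders.items()}


def margin(support, response_rate):
    """Poll margin (you minus opponent) in percentage points."""
    shares = poll_shares(support, response_rate)
    return shares["you"] - shares["opponent"]


uniform = {"you": BASE_RATE, "opponent": BASE_RATE, "undecided": BASE_RATE}

# A: one point of the electorate moves from undecided to you.
scenario_a = margin({"you": 50, "opponent": 49, "undecided": 1}, uniform)

# B: one point of the electorate moves from your opponent to you.
scenario_b = margin({"you": 50, "opponent": 48, "undecided": 2}, uniform)

# C: nobody changes sides, but your supporters' response rate rises
# by one percentage point (9% -> 10%).
boosted = {**uniform, "you": BASE_RATE + 0.01}
scenario_c = margin({"you": 49, "opponent": 49, "undecided": 2}, boosted)

print(round(scenario_a, 1), round(scenario_b, 1), round(scenario_c, 1))
```

Under these assumptions the three margins come out to roughly one, two and five points. The lower the baseline response rate, the larger the swing in scenario C, because the extra responders are a bigger fraction of everyone who answers.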
Of course, no one was secretly plotting to game the polls, but poll
responses are basically just people agreeing to talk to you about
politics, and lots of things can affect people's willingness to talk
about their candidate, including things that would almost never affect
their actual votes (at least not directly, but more on that later).
In 49-49, the Romney campaign hit a stretch of embarrassing news
coverage while Obama was having, in general, a very good run. With a
couple of exceptions, the stories were trivial, certainly not the sort
of thing that would cause someone to jump the substantial ideological
divide between the two candidates, so none of Romney's supporters
shifted to Obama or to undecided. Many did, however, feel less and less
like talking to pollsters. So Romney's numbers started to go down, which
only made his supporters more depressed and reluctant to talk about
their choice.
This reluctance was just starting to fade when the first debate
came along. As Josh Marshall has explained eloquently and at great
length since early in the primaries, the idea of Obama, faced with a
strong attack and deprived of his teleprompter, collapsing in a debate
was tremendously important and resonant to the GOP base. That belief was
a major driver of the support for Gingrich, despite all his baggage; no
one ever accused Newt of being reluctant to go for the throat.
It's not surprising that, after weeks of bad news and declining polls,
the effect on the Republican base of getting what looked very much like
the debate they'd hoped for was cathartic. Romney supporters who had
been avoiding pollsters suddenly couldn't wait to take the calls. By the
same token, Obama supporters who got their news from Ed Schultz and
Chris Matthews really didn't want to talk right now.
The polls shifted in Romney's favor even though, had the election been
held the week after the debate, the result would have been the same as
it would have been had the election been held two weeks before -- 49% to
49%. All of the changes in the polls had come from core voters on both
sides. The voters who might have been persuaded weren't that interested
in the emotional aspect of the conventions and the debates and were
already familiar with the substantive issues both events raised.
So response bias was amplified by these factors:
1. the effect was positively correlated with the intensity of support
2. it was accompanied by matching but opposite effects on the other side
3. there were feedback loops -- supporters of candidates moving up in
the polls were happier and more likely to respond while supporters of
candidates moving down had the opposite reaction.
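The feedback loop in the third point can be sketched with a toy simulation. Everything numeric here is made up for illustration: the 9% baseline response rate, the size of the initial shock to Romney supporters' willingness to respond, and the `feedback` parameter that ties response rates to the last published margin. True support never moves from 49-49.

```python
def simulate_polls(steps=6, base_rate=0.09, shock=0.005, feedback=0.0001):
    """Return poll margins (Obama minus Romney, in points) over time.

    True support is fixed at 49-49-2; only response rates change.
    All parameter values are hypothetical.
    """
    support = {"obama": 49.0, "romney": 49.0, "undecided": 2.0}
    # Bad news makes Romney supporters slightly less willing to respond.
    rates = {
        "obama": base_rate,
        "romney": base_rate - shock,
        "undecided": base_rate,
    }
    margins = []
    for _ in range(steps):
        responders = {g: support[g] * rates[g] for g in support}
        total = sum(responders.values())
        poll_margin = 100 * (responders["obama"] - responders["romney"]) / total
        margins.append(poll_margin)
        # Feedback: the leader's supporters get more talkative and the
        # trailing side's supporters get less so, in proportion to the margin.
        rates["obama"] = min(1.0, rates["obama"] + feedback * poll_margin)
        rates["romney"] = max(0.0, rates["romney"] - feedback * poll_margin)
    return margins


for week, m in enumerate(simulate_polls(), start=1):
    print(f"poll {week}: Obama +{m:.1f}")
```

With these invented parameters the reported lead widens from poll to poll even though not a single voter has changed sides. Setting `feedback=0` leaves the margin flat at whatever the initial shock produced.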
You might wonder how the pollsters and political scientists of this
world missed this. The answer is that they didn't. They were concerned
about selection effects and falling response rates, but the problems
with the data were difficult to catch definitively thanks to some
serious obscuring factors:
1. Researchers have to base their conclusions on the historical record, which comes from a time when the effect was not nearly so big.
2. Things are correlated in a way that's difficult to untangle. The
things you would expect to make supporters less enthusiastic about
talking about their candidate are often the same things you'd expect to
lower support for that candidate.
3. As mentioned before, there are compensatory effects. Since response
rates for the two parties are inversely related, the aggregate is fairly
stable.
4. The effects of embarrassment and elation tend to fade over time, so most are gone by the actual election.
5. Polls tend to converge as the election approaches, mainly because likely-voter screens become more accurate.
6. Poll predictions can be partially self-fulfilling. If the polls
indicate a sufficiently low chance of winning, supporters can become
discouraged, allies can desert you and money can dry up. The result is,
again, convergence.
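The third obscuring factor, compensatory effects, is easy to check numerically. The sketch below assumes a 9% baseline response rate and symmetric shifts (one side's rate goes up by exactly as much as the other's goes down); both the function name and those numbers are my assumptions, not anything measured.

```python
def poll_summary(rate_a, rate_b, rate_undecided=0.09):
    """Return (overall response rate, poll margin in points) for a 49-49-2 electorate.

    rate_a, rate_b: response rates for each candidate's supporters (hypothetical)
    """
    support = {"a": 49.0, "b": 49.0, "undecided": 2.0}
    rates = {"a": rate_a, "b": rate_b, "undecided": rate_undecided}
    responders = {g: support[g] * rates[g] for g in support}
    total = sum(responders.values())
    overall_rate = total / sum(support.values())
    poll_margin = 100 * (responders["a"] - responders["b"]) / total
    return overall_rate, poll_margin


for shift in (0.0, 0.0025, 0.005):
    overall, poll_margin = poll_summary(0.09 + shift, 0.09 - shift)
    print(f"shift={shift:.4f}  overall response={overall:.3f}  margin={poll_margin:.1f}")
```

In this setup the overall response rate stays at 9% no matter how large the shift, so a researcher watching only the aggregate would see nothing unusual, while the margin moves by several points.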
For the record, I don't think we live on 49-49. I do, however, think
that at least some of the variability we've seen in the polls can be
traced back to selection effects similar to those described here, and I
have to believe it's likely to get worse.