Guest Post: Why the polls were wrong
A guest post by Stephen Russell:
Wrong again! To the huge embarrassment of pollsters in the US, once again the polls underestimated support for Donald Trump.
Although the error did not produce a surprise outcome this time, its magnitude was actually much greater than in 2016. Back then, national polls were off by 1-1.5% (and state polls by more). In 2020 the national error was about 4%, and state polls were just as bad.
Some of the errors were real whoppers: the Florida poll average was off by about 6%, Iowa and Wisconsin by 7%, and polls in Maine’s 2nd district by an extraordinary 11%. Only in Colorado and Georgia did the polling average get within about 1% of the result, and only in Nebraska’s 2nd district did the polls err significantly in the other direction.
Individual polls produced even larger errors: a late-October ABC News poll gave Wisconsin to Biden by a gobsmacking 17%. On November 3 he won the state by 0.63%.
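To make the arithmetic of these misses explicit: a poll’s error here is simply the gap between the margin it predicted and the margin that eventuated. A minimal Python sketch using the Wisconsin figures above (the helper function is purely illustrative, not how any pollster reports error):

# Margins expressed as (Biden % minus Trump %), in percentage points.
def margin_error(predicted_margin, actual_margin):
    # Positive result = the poll overstated Biden's lead.
    return predicted_margin - actual_margin

# Late-October ABC News poll of Wisconsin: Biden +17.
# Actual result on November 3: Biden +0.63.
print(margin_error(17.0, 0.63))  # 16.37 -- a roughly 16-point miss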
But – to be fair – some polls did a lot better. Pollsters such as Trafalgar Group and Susquehanna insisted all along that the election would be close (at least in the key states – as it was). Iowa pollster J. Ann Selzer (who famously picked Obama’s Iowa caucus win in 2008) produced a late-October surprise with a poll showing Trump ahead by 7% in Iowa, when other polls said the race was neck-and-neck. He won the state by 8.2%.
So what went wrong? It should be noted that there could be multiple sources of error – some in opposite directions – which both compounded and cancelled in different places. A poll that got it right might simply have been lucky; a poll that got it wrong, unlucky.
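To see how offsetting errors can make a poll look right by luck, consider a toy illustration in Python (the component figures below are invented for the example, not estimates of anything):

# Hypothetical error components, in percentage points on the margin.
# Positive = bias toward the Democrat.
state_a = {"non-response bias": +3.0, "late shift": -3.0}     # cancels out
state_b = {"non-response bias": +3.0, "turnout model": +2.0}  # compounds

for name, components in [("State A", state_a), ("State B", state_b)]:
    net = sum(components.values())
    print(f"{name}: net error {net:+.1f} points")

State A’s poll looks spot-on despite two real errors; State B’s looks badly wrong for the same underlying reasons.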
One theory is that most of the pollsters were lying. They were part of a vast Left-wing conspiracy to depress Republican turnout by convincing people that Joe Biden had the election in the bag. Or possibly to provide cover for the Democrats’ vote-fraud operation. In which case comparing polls to results is pointless, since both are false.
For those who seek other explanations, there are several possibilities. The usual suspects are a late shift in support, undecided voters breaking more for one candidate, and failure by pollsters to reach some key group.
One example of the latter certainly occurred with Latino voters. They are notoriously hard to poll, and pollsters treated them as a single bloc. Pollsters knew that Trump was doing a bit better with Latinos than in 2016, but failed to pick up that the improvement was concentrated in particular subgroups (Venezuelan and Cuban Latinos). Hence the big error in Florida, where those groups are clustered.
Selzer has suggested that postal voting created an asymmetric turnout surge. Four weeks before polling day Democrats were fired up to vote (by mail) – and this showed up in more people telling pollsters they definitely would vote (sometimes because they already had). But some marginal Republicans were still saying “maybe” (and thus being undercounted) until the final week, when Republican “get out the vote” efforts peaked. This explains why two earlier Selzer polls had more “normal” results than the October surprise one. She also suggests mail voting meant surprisingly short queues on polling day – which prompted more people to vote.
However, while this makes sense, there is no empirical evidence available to support it or to quantify the magnitude of the effect.
The two theories that have gained most currency since the vote are the Covid-19 twist and the hidden (but not shy) Trumper theories.
Put simply, the Covid-19 theory says that Democrats, concerned for their own health and for public health during the pandemic, stayed home and answered the phone when the pollster called. Republicans did not. There is empirical data to back this: response rates surged following lockdowns, as did responses from registered Democrats and from respondents who showed high social integration. In any event, this source of error was (hopefully) a one-off.
The second theory is that pollsters simply failed to reach a significant group of voters: not so much shy Trumpers as low-engagement ones. These people did not lie or prevaricate to pollsters. They simply never answered the phone, or declined to participate in the poll.
Pollsters have long known about these people but have never worried about them before, because they either did not vote or, if they did, voted much the same as everyone else. The polls’ failure to capture their intentions made no difference.
But these people just love Trump, because Trump is the quintessential anti-politician. He denounces and smashes every part of the dysfunctional and elitist “system” they despise. (Of course, smashing your car because it does not go may not be a helpful response, but it is emotionally very satisfying.)
This would explain why polls for the 2018 mid-term elections were accurate. Democrats did much better that year – winning House elections by a popular-vote margin of 8.6% (as opposed to about 3% in 2020) – because the low-engagement voters simply did not turn out. They probably won’t again without Trump on the ballot.
It may also explain the pattern in the state poll errors. In 2016, the biggest errors were in the rust-belt states where low-education voters switched from Obama to Trump. In 2020, the biggest errors were in the reddest of the swing states, and perhaps the most rural ones. Pollsters such as Monmouth’s Patrick Murray have said that polling in high-income, high-education suburban areas was quite accurate but missed a lot of Trump voters in rural areas.
If this is correct, it presages difficulty for Republicans again in the 2022 mid-terms, but more accurate polling along with it. And maybe a return of the same errors in 2024, if Trump is again on the ballot. The big problem is that it is very difficult for pollsters to do anything about this. But if there is one useful lesson to come out of this, it is surely that we place too much faith in polls. They can still tell us useful things. After all, in most cases a 4% polling error doesn’t really matter much. In close US election races, however, we need to be much more wary.