The wrongness of the polls: 2016
A guest post by Stephen Russell:
This is the first of a series of posts on polling and the United States’ 2020 presidential election.
In the days before the 2016 presidential election, almost all polls showed Hillary Clinton comfortably ahead of Donald Trump. Some pundits proclaimed that Clinton’s victory was a 99% surety. It is claimed that even Trump believed she would win. She didn’t. Ever since, supporters of Trump have brushed off poor poll results with the retort that the polls were wrong in 2016 and will be proved wrong again in 2020.
So just how badly did the pollsters do? And how likely is it that there will be a repeat?
First of all, it needs to be understood that almost all polls have been, and will be, wrong. Even if a poll is perfectly carried out, without bias, the mathematics of random sampling mean that it is probably going to be at least slightly inaccurate.
A poll consists of consulting a sample of a few hundred people out of 200-million-odd potential voters, and hoping that said sample is representative of the whole. It probably won’t be. But the laws of chance say that it will – probably – come close, and only occasionally land on the wrong planet. That is what the “margin of error” is all about.
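To put rough numbers on it: for a simple random sample, the 95% margin of error on a single candidate's share works out to about 1 divided by the square root of the sample size, so a 1,000-person poll carries roughly a ±3-point margin. Here is a minimal sketch of that calculation (the function name and sample sizes are just for illustration; real polls use weighting, which usually widens the effective margin somewhat):

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """95% margin of error on a sampled proportion, assuming simple
    random sampling (real polls are weighted, so the effective
    margin is usually somewhat wider)."""
    return z * math.sqrt(p * (1 - p) / n)

for n in (400, 1000, 2500):
    print(f"n={n}: +/-{margin_of_error(n):.1%}")
# n=400: +/-4.9%
# n=1000: +/-3.1%
# n=2500: +/-2.0%
```

Note too that the margin of error on the lead (one candidate's share minus the other's) is roughly double the margin on either share alone, a detail often lost when poll results are reported.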
Secondly, polls are an out-of-focus snapshot of the situation at a single point in time – some way ahead of the actual vote. The situation can change faster than polls can detect and report it. This is part of what happened in 2016: those who decided only in the final week broke strongly for Trump, an on-the-ground shift that polls could not entirely catch.
But on top of the limits to accuracy imposed by timing and the laws of statistics, there are also the problems of polls being poorly conducted (to be fair, it is actually very hard – and expensive – to do a poll right), of bias (which may be conscious or unconscious), and of the deliberate production of outright rubbish, usually for political effect.
However, all those sources of error are double-edged. They can result in polls that overly favour Republicans as much as Democrats. Fivethirtyeight.com reports that over the last 11 election cycles (including mid-terms) this happened almost as often as the reverse. Presidential polls significantly over-predicted for the Republican side in both 2000 and 2012. US Senate elections over this period saw over-prediction for Republicans more often than for Democrats, including in 2018.
In 2016, however, there is no dispute that the polls over-predicted for Hillary Clinton. There is some dispute over how badly – but that is mainly because the truth is actually quite complex.
The American Association for Public Opinion Research analysed the 2016 polls, and came to a very surprising conclusion: “National polls were generally correct and accurate by historical standards”.
Their report explains that “Collectively, [the polls] indicated that Clinton had about a 3 percentage point lead, and they were basically correct; she ultimately won the popular vote by 2 percentage points.” The New York Times’ average of polls put Clinton’s lead at 3.1%. Fivethirtyeight.com put it at 3.6% (see here) and RealClearPolitics (see here) had Clinton ahead in the popular vote by 3.3% (though the median and mode were both 4%). RCP’s table shows that of the twelve public polls conducted entirely in the few November days prior to the vote, nine gave Clinton a margin higher than she achieved. But only one put that margin greater than 4%. Ten out of twelve were within 2% of the actual result: a 2.1% margin.
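For concreteness, here is the arithmetic behind an “average of polls”, run over a hypothetical set of twelve final-week Clinton-minus-Trump margins (illustrative numbers only, not the actual RCP table):

```python
from statistics import mean, median, mode

# Hypothetical final-week Clinton-minus-Trump margins, in points.
# Illustrative only -- NOT the actual RCP table.
margins = [4, 4, 4, 4, 5, 2, 1, 3, 4, 2, 2, 5]

print(f"mean:   {mean(margins):.1f}")  # the 'average of polls' -> 3.3
print(f"median: {median(margins)}")    # middle value           -> 4.0
print(f"mode:   {mode(margins)}")      # most common reading    -> 4
```

The same twelve numbers can honestly yield a 3.3% mean alongside a 4% median and mode, which is why different summaries of the same table can read quite differently.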
The discrepancy between the polls and the outcome came from the fact that US presidential elections are decided not by the national popular vote, but by individual state results. And here, there were some major errors.
Of 17 swing states polled, seven had results within 1.5% of the polling averages published on Fivethirtyeight. In three of those (Colorado, New Mexico and Nevada) Clinton actually did better than the polls projected.
But ten states produced results more than 1.5% better for Trump than the polls had projected, and eight of those by more than 4%. Three of those eight big errors came in Clinton’s top five “must win” swing states: Pennsylvania (with a 4.2% error), Michigan (4.4%) and Wisconsin (6.1%).
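The error figures quoted above are simply the polled Clinton lead minus the actual one. A small sketch for the three named states (the actual margins are rounded, and the “polled” figures are back-calculated from the stated errors, so treat both as illustrative rather than the published data):

```python
# Polling error = polled Clinton lead minus actual Clinton lead.
# Actual margins rounded; polled figures back-calculated from the
# errors quoted above -- illustrative, not the published averages.
states = {
    "Pennsylvania": (3.5, -0.7),
    "Michigan":     (4.2, -0.2),
    "Wisconsin":    (5.3, -0.8),
}

for name, (polled, actual) in states.items():
    error = polled - actual
    print(f"{name}: polled {polled:+.1f}, actual {actual:+.1f}, "
          f"error {error:.1f} points")
```

A negative “actual” figure means Trump won the state: in all three cases a polled Clinton lead of several points turned into a narrow Trump victory.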
State polls were thus (mostly) less accurate than national ones, but even then the AAPOR explains that “Eight states with more than a third of the electoral votes needed to win the presidency had polls showing a lead of three points or less… The polls on average indicated that Trump was one state away from winning the election.” Fivethirtyeight warned at the time (“Trump Is Just A Normal Polling Error Behind Clinton”) that the outcome was finely poised, and that there was a substantial chance that Clinton would win the popular vote but lose the election.
Most (though not all) mainstream media pundits, being enthusiastic supporters of Clinton, simply failed to look closely enough at the data, got carried away on a tide of excitement, dismissed the evidence that the race was tightening in the final stretch, and predicted the result that they wanted to see. The polls did justify predictions that Clinton would win – but only marginally so – not the 99% level of certainty that some foolishly claimed. In reality, the contest was close enough that it took only a small error to change the outcome.
Despite the caveats, experience has taught us that polls are still a better means of gauging the likely outcome of an election than consulting chicken entrails, taxi drivers, or what we think people ought to think. So the next post in this series will look at the 2020 situation.