August 19, 2020 12:00pm by David Farrar

Guest Post: Are the polls wrong again?

A guest post by Stephen Russell:

In my previous post I described how the polls were wrong in 2016. What can we now say about the possibility they will be wrong in 2020?

Many polls in the last few months have reported Trump trailing Biden, often by large margins. But Trump himself rejects these: “I’m not losing, because those are fake polls. They were fake in 2016 and now they’re even more fake.”

Fivethirtyeight.com reports that the average difference between a presidential election poll conducted in the final 21 days of the campaign and the actual result is 4.8%. The previous four elections produced error rates of 4.4%, 3.2%, 3.2% and 3.6%.

However, if you average the results of multiple polls (which will include errors in both directions) you do get some increase in accuracy. The difference between the polling average and the actual result in 2016 was 3.1%. It was a smaller difference than in many previous elections such as 2012 (3.3%) and 2000 (3.9%) and less than all of the elections from 1980 to 1996 (1980 saw a whopping 8.9% error).
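
The cancelling effect can be seen in a few lines of arithmetic. This is only a sketch: the five poll margins and the "actual" margin below are invented for illustration and are not real polling data.

```python
from statistics import mean

# Hypothetical figures for illustration only; not real polling data.
actual_margin = 2.0                         # invented result: candidate A wins by 2 points
poll_margins = [6.0, -1.5, 4.5, 0.0, 5.0]   # margins reported by five invented polls

# Average absolute error of the individual polls.
individual_errors = [abs(m - actual_margin) for m in poll_margins]
avg_individual_error = mean(individual_errors)

# Error of the polling average: misses in opposite directions partly cancel.
polling_average = mean(poll_margins)
error_of_average = abs(polling_average - actual_margin)

print(f"average error of individual polls: {avg_individual_error:.1f}")
print(f"error of the polling average:      {error_of_average:.1f}")
```

With these made-up numbers the individual polls miss by 3.0 points on average, while the average of the five misses by only 0.8. Averaging does nothing, however, for an error that every poll shares in the same direction.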

Note that this is looking at polls up to three weeks before the vote, and the gap between polls and results can be ascribed mainly to real shifts in voting intentions over that period. This was a big factor in 2016, as the polls did indeed tighten up in the last week. This is the main (but not only) reason why the error in the average of the polls over the last three weeks (3.1% different from the result) was much larger than the error in the average of the final polls (all within the last seven days), which was in the 1.0-1.5% range.

But while national polls actually proved reasonably accurate in 2016, state-level polls had (on average) larger errors. This is not surprising in general – there are fewer of them in any individual state than for the whole US, so averages are more likely to be distorted by a single rogue result. Also, state polls are more often conducted by small local outfits with fewer resources and less expertise. But that does not explain enough of the difference.

The AAPOR analysis also points at a failure of most polls to weight their samples to avoid over-representation of (mostly Clinton-voting) college-educated voters. That proved to be an especially critical error, because such voters made up a notably smaller portion of the electorate in several knife-edge states (eg Pennsylvania and Michigan) – creating larger errors.
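
A rough sketch of why education weighting matters, using invented numbers throughout (the group shares and support levels are assumptions, not figures from AAPOR or from any real poll):

```python
# All numbers are invented for illustration; none come from a real poll.
sample_share = {"college": 0.55, "non_college": 0.45}      # who answered the poll
electorate_share = {"college": 0.40, "non_college": 0.60}  # assumed true electorate
clinton_support = {"college": 0.55, "non_college": 0.42}   # support within each group


def topline(shares, support):
    """Overall support as the share-weighted average of group support."""
    return sum(shares[group] * support[group] for group in shares)


unweighted = topline(sample_share, clinton_support)    # what the raw sample says
weighted = topline(electorate_share, clinton_support)  # re-weighted to the electorate
overstatement = unweighted - weighted

print(f"unweighted: {unweighted:.1%}  weighted: {weighted:.1%}  "
      f"overstatement: {overstatement * 100:.2f} points")
```

In this hypothetical the unweighted sample puts Clinton at about 49.2%, against 47.2% once the groups are weighted to their assumed share of the electorate. The smaller the college-educated share of a state's real electorate, the bigger that gap becomes.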

Data from Fivethirtyeight shows that of polling averages in seventeen swing states, only two correctly anticipated the state’s actual deviation from the national popular vote. Five of the averages underestimated Clinton’s position. Ten (including all of the Midwest and New England averages) underestimated Trump. (Note that this is a different measure from the raw deviation of polls from results: it measures relative position on the partisan scale, rather than absolute.)
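
One way to read the relative measure, as opposed to the absolute one, is sketched here. The function name `lean` and all of the margins are hypothetical, chosen only to show the difference between the two kinds of error; this is not Fivethirtyeight's own calculation.

```python
def lean(state_margin, national_margin):
    """A state's position relative to the nation (positive = more Democratic)."""
    return state_margin - national_margin


# Hypothetical Democratic margins in points; not real polls or results.
polled = {"state": 4.0, "national": 3.0}
actual = {"state": -1.0, "national": 2.0}

# Absolute error: how far the state polling average was from the state result.
absolute_error = polled["state"] - actual["state"]

# Relative error: how far the polled lean was from the actual lean.
polled_lean = lean(polled["state"], polled["national"])
actual_lean = lean(actual["state"], actual["national"])
relative_error = polled_lean - actual_lean

print(f"absolute error: {absolute_error:+.1f}")
print(f"relative error: {relative_error:+.1f}")
```

Here the state polls were off by 5.0 points in absolute terms, but 1.0 of that was a miss shared with the national polls, leaving a relative error of 4.0. A positive value in this sketch means the polls underestimated the Republican's relative position in that state.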

That education-weighting error, at least, is unlikely to be repeated. Pollsters have made changes to compensate.

In May 2020 Alan Abramowitz, a political science professor at Emory University, noted: “The recent 2020 polling results correlate much more strongly with the 2016 election results than with the final 2016 polling results.” He believes this “suggests that pollsters have adjusted their sampling and weighting procedures to correct for some of the problems that occurred in 2016 in light of the 2016 results.”

He is correct about the correlation. Analysis of a set of polling numbers from June shows that of 17 swing states, only two (Colorado and Ohio) were deviating by more than 2.5% from the relative pattern in the 2016 results. Looking at the 2016 polls, that level of deviation from results showed up in nine states. Furthermore, the 2020 deviations go in both directions – seven favouring Trump and nine favouring Biden.

Of course, the 2020 results will not match those of 2016 either in the absolute or relative sense. Every election sees at least some small shuffling of which states lean by how much in which direction. That is driven by demographic change, by purely local conditions and by having a different matchup of candidates who will have slightly different appeal to different groups. Biden is not Clinton: for example, he appeals slightly more to older voters, slightly less to Black voters, and that will show up in states where those groups are larger or smaller.

In 2016 there was an unusually large reshuffle. That seems unlikely to be repeated. Trump is still Trump. It was his atypical pattern of appeal (relative to his Republican predecessors) that drove the realignment. It is not unreasonable to think that there might be some life in that pattern-shift yet, as Trump doubles down on his appeal to the same groups that swung to him in 2016 – and increases his negatives with other groups. But revisions to polling methodology to account for the shift in 2016 should also capture any further shift on the same axis.

Pollsters may have corrected for their known 2016 errors, but perhaps they have found exciting new ways to err! Probably the best candidate for that is “shy Trumpers” – addressed in the next post in this series.