US polls: third time lucky?
Revealed: the big mistake they made in 2020 - and why it won't be repeated.
With five weeks to go to election day, America’s pollsters are as anxious as the politicians. Eight years ago they said Hillary Clinton would defeat Donald Trump. Four years ago they predicted a landslide win for Joe Biden. Today they tell us that the contest between Kamala Harris and Trump is on a knife-edge. If the polls are wrong again in the same way, Trump is heading for a clear victory, and the pollsters for humiliation.
In fact, I think this year’s polls, taken together, are about right. America’s electorate is evenly divided. But as we shall see, there are more specific reasons to trust the polls this time.
In 2016, the polls questioned too many Clinton-supporting graduates, especially in the swing states in the rust belt, and too few non-graduates backing Trump. This mistake was bad but easy to correct, and by 2020 it had been. So, why did the polls overstate Biden’s vote four years ago? What new mistake did they make then, and can we be sure the polls won’t repeat it this time?
The reason we can is rooted in data published in 2022 by AAPOR, the American Association for Public Opinion Research. Their detailed post-mortem contains the numbers we need. But the crucial evidence doesn’t appear until halfway through their 106-page report, and its authors muffled its significance.
The starting point for making sense of what happened is the profound polarisation of US elections.
American polls tend to be far less transparent than British polls. However, we have compelling data about polarisation from two large-sample surveys: a national poll of more than 9,000 by Pew Research, and one of more than 11,000 by Morning Consult. According to Pew, just three per cent of people who voted for Trump last time would now vote for Kamala Harris, while four per cent of Biden’s supporters would back Trump. Morning Consult has similar, very small, figures. Groups that Trump has won over in past elections, such as some Hispanic and older white working-class voters, have stayed with him. In terms of overall numbers, the loyalties of those who voted in 2020 have barely changed at all. For that reason alone, another close race beckons.
The reason for having more faith in the polls this time starts with this basic truth, but goes further. Something odd, and unlikely to be repeated, happened in 2020. Turnout jumped 6.5 percentage points to 66.6 per cent of eligible voters, by far the highest in modern times. The total number casting their votes rose by almost 22 million. That is the net figure. Detailed post-election analysis by Pew Research, based on matching their polling data to official voting records, found that around 40 million Americans who voted in 2020 had not done so in 2016. This is consistent with eight million dying between the two elections and another ten million (a modest eight per cent) voting in 2016 but not in 2020. That makes 40 million “new” voters, offset by 18 million “lost” voters.
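The turnout arithmetic in the paragraph above can be checked in a few lines. This is a minimal sketch in Python that uses only the rounded figures quoted in the text; the variable and function names are illustrative, not Pew's.

```python
# Rounded figures quoted in the text, in millions of voters.
new_voters = 40        # voted in 2020 but not in 2016 (Pew estimate)
died_since_2016 = 8    # 2016 voters who died before the 2020 election
dropped_out = 10       # voted in 2016 but not in 2020


def net_turnout_change(new, died, dropped):
    """Return (lost voters, net change in votes cast), both in millions."""
    lost = died + dropped
    return lost, new - lost


lost_voters, net_change = net_turnout_change(new_voters, died_since_2016, dropped_out)
print(f"Lost voters: {lost_voters} million")
print(f"Net rise in votes cast: {net_change} million")
```

On these rounded inputs the sketch gives 18 million lost voters and a net rise of 22 million, in line with the "almost 22 million" increase in votes cast.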
It is these 40 million who seem to have caught the pollsters out. AAPOR’s report spotted this. It contains a remarkable table (on page 52 for those who want to check for themselves). Or rather, the table is remarkable only on inspection: it looks innocuous at first sight, and the discussion of its implications was notably cautious. This may be why it caused no fuss at the time.
However, it takes just a few minutes with a pocket calculator to show how the pollsters got it wrong in 2020. Much, and possibly all, of the pollsters’ overall error can be explained by their massive overstatement of Biden’s lead among those 40 million. AAPOR’s table covers 11 states, including all the swing states. The pollsters’ combined figures showed Biden ahead by an average of 25 percentage points among new voters across those states. In hindsight, AAPOR estimated that his true lead averaged just nine points. The polls were an extraordinary sixteen points adrift.
Nationally, this implies that the polls collectively exaggerated Biden’s victory among the 40 million by more than six million. That is almost exactly the size of the pollsters’ total error among ALL voters across the country. This takes us to the heart of the matter. The polls accurately reported the three-quarters of people who voted in both 2016 and 2020. They were let down by their failure to get right the one-quarter who had not voted in 2016. They found Biden’s new voters but missed many of Trump’s.
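The pocket-calculator step can be written out as follows. This is a rough sketch, not AAPOR's method: it takes the average leads quoted for the 11 states and, as the text does, assumes the same error applied to all 40 million new voters nationally. The names are illustrative.

```python
NEW_VOTERS = 40_000_000    # voted in 2020 but not in 2016 (rounded figure from the text)
POLLED_LEAD_PTS = 25       # Biden's average lead among new voters, according to the polls
ESTIMATED_LEAD_PTS = 9     # AAPOR's after-the-fact estimate of his true lead


def margin_overstatement(voters, polled_lead_pts, true_lead_pts):
    """Return (error in points, overstated margin in votes).

    Assumption: the average error in the 11 states holds for every new voter.
    """
    error_pts = polled_lead_pts - true_lead_pts
    return error_pts, voters * error_pts / 100


error_pts, overstated_votes = margin_overstatement(
    NEW_VOTERS, POLLED_LEAD_PTS, ESTIMATED_LEAD_PTS
)
print(f"Polling error among new voters: {error_pts} points")
print(f"Overstated Biden margin: {overstated_votes:,.0f} votes")
```

Sixteen points of 40 million is 6.4 million votes, which is where the "more than six million" figure comes from.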
How come? The answer is to be found not just in the number of new voters but who they were. Comparing data from the US Census Bureau for 2016 and 2020, we find that, in net terms, half the increase in turnout was among people over 55. Turnout rose more among white and Hispanic than black voters. All this suggests a demographic skew in Trump’s favour that offset the (mainly Democratic) young adults who voted for the first time.
The likeliest reason why a number of Trump voters remained undetected by the pollsters – discussed on page 66 of AAPOR’s report, though again rather cautiously – is that many of them simply declined to respond to polls. Some have called these “shy Trump” voters, similar to Britain’s “shy Tories”. I’m not sure “shy” is right this time. There was nothing demure about them. “Trump refuseniks” is better: people who distrust the world of mainstream politics and media, and refuse to answer pollsters’ questions. Their absence from polling samples fatally distorted the polls’ published results. Raw data can be readily weighted to get many things right: age, gender, race, education, past vote and so on. Correcting for Trump refuseniks with no voting history? That’s tougher.
I have gone into this in some detail, because if this analysis is right, then this year should be different. The big jump in turnout last time was an exceptional response to Trump’s years in the White House, not least from his refuseniks. 2020 set a modern-day record for both the level of turnout and the amount by which it grew.
Trump will find it hard to match that poll-defying achievement this year. Pew’s recent survey identified 932 new voters, compared with 6,501 last-time and this-time voters. (Pew’s figures are worth taking seriously: they checked who voted in 2020 and who didn’t against official records collected by each state.) The Harris-Trump contest is close in both groups. So the proportion of new likely voters is well down on 2020, and there is no sign that, overall, they will vote very differently from the rest of the electorate. This fits with the proposition that the refusenik surge that fooled the pollsters in 2020 was a one-off, and that there will not be a further such surge this year.
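To see how far the proportion of new voters is down, the two Pew counts can be set against the one-quarter share for 2020 mentioned earlier. This is a rough sketch: it assumes the two groups together make up Pew's likely-voter sample, and it compares a survey share with a share of actual voters, so the result is indicative only. The names are illustrative.

```python
# Counts from Pew's recent survey, as quoted in the text.
NEW_VOTERS_SAMPLE = 932        # likely voters this time who did not vote in 2020
REPEAT_VOTERS_SAMPLE = 6_501   # voted in 2020 and likely to vote again

# The text puts new voters at about one-quarter of those who voted in 2020.
NEW_VOTER_SHARE_2020 = 0.25


def new_voter_share(new, repeat):
    """Share of new voters among all likely voters in the sample."""
    return new / (new + repeat)


share_now = new_voter_share(NEW_VOTERS_SAMPLE, REPEAT_VOTERS_SAMPLE)
print(f"New voters in Pew's sample: {share_now:.1%}")
print(f"Relative to the 2020 share: {share_now / NEW_VOTER_SHARE_2020:.2f}")
```

On these figures new voters are roughly one in eight of Pew's likely voters, about half the proportion seen in 2020.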
(As for the 2020 refuseniks, they may well still be hiding from the pollsters. But this doesn’t matter. We know how many people voted for Trump, and that almost all of those who do answer polling questions this time will back him again. As long as the polls contact the right total number of 2020 Republicans, it makes little difference whether they include all, some or none of the 2020 refuseniks.)
The real danger facing the polls is not that they will repeat old errors but that they will make new ones. As a retired pollster, I know this only too well. However, the American pollsters I know are going to great lengths to improve their samples. They are all keen to avoid another big error.
The main thing is that there is no clear reason to suppose that the US polls this time are systematically skewed either way. America’s political geography is another matter. In two of the last three Republican victories, the winner received fewer votes than his Democratic rival: George W Bush in 2000 and Trump in 2016. This year Harris will probably need a lead of at least two per cent in the national popular vote, and possibly four per cent, to defeat Trump.
Here's a rule of thumb for following the polls between now and November 5. As always, individual polls, even the best, are at risk of random error. But there are enough of them nationally to smooth these wrinkles. An average polling lead of three per cent for Harris is a coin-toss: victory could go either way. The more Harris’s lead exceeds three per cent, the better her chances of victory – but the more she falls short, the more likely it is that Trump will return.
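For readers who like to keep a tally, the rule of thumb can be sketched as below. It is only an illustration: the function name is invented, the simple unweighted average is my assumption (real polling averages weight by sample size, date and pollster), and the example leads are made-up numbers, not real polls.

```python
from statistics import mean

TIPPING_LEAD = 3.0  # Harris's national lead, in points, that the text treats as a coin-toss


def read_polling_average(harris_leads, tipping_lead=TIPPING_LEAD):
    """Average Harris's lead across polls and compare it with the tipping point.

    harris_leads: Harris's lead in each national poll, in points
    (negative when Trump is ahead).
    """
    average = mean(harris_leads)
    if average > tipping_lead:
        verdict = "leans Harris"
    elif average < tipping_lead:
        verdict = "leans Trump"
    else:
        verdict = "coin-toss"
    return average, verdict


# Made-up leads for illustration only; these are not real poll results.
example_average, example_verdict = read_polling_average([2.0, 4.0, 3.0])
print(f"Average Harris lead: {example_average:.1f} points ({example_verdict})")
```

The distance of the average from three points matters more than the label: a lead of 3.2 is still very close to a coin-toss.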