Nate Silver’s FiveThirtyEight has had excellent coverage of the US presidential election, with some great analytical pieces and very interesting insights into their models. Virtually every poll predicted Hillary Clinton to win the election, and FiveThirtyEight was no exception. Consequently, there has been a lot of discussion about pollsters, their methods, and how they – again, after “Brexit” – failed to predict the outcome of the election. There are many parallels between the election in the US and the Brexit vote in the UK. At least for the US, however, the predictions weren’t that far off. And FiveThirtyEight, in particular, gave Trump better chances than anyone else:
For most of the presidential campaign, FiveThirtyEight’s forecast gave Trump much better odds than other polling-based models. Our final forecast, issued early Tuesday evening, had Trump with a 29 percent chance of winning the Electoral College. By comparison, other models tracked by The New York Times put Trump’s odds at: 15 percent, 8 percent, 2 percent and less than 1 percent. And betting markets put Trump’s chances at just 18 percent at midnight on Tuesday, when Dixville Notch, New Hampshire, cast its votes.
Pollsters were also discussed rather critically in the German-speaking press (e.g. in ZEIT Online). Personally, I find it strange that journalists are so quick to point the finger at pollsters and their methodology when they themselves often fail to deliver the message of a well-conducted poll. In my opinion, this is part of the problem. Conveying that a “30% probability to win the election” is a real measure of uncertainty – far from the certainty of “100%” – is very difficult, but many articles I have read did not even try. Representatives of market research institutes were also quick to question pollsters and saw a “crisis of polling” (see e.g. a statement from the CEO of Kantar TNS [in German] or from the chairman of the DGOF [in German]).
Looking at the statistics, I don’t think polling faces any fundamentally new problem in the 2010s. The need for large, heterogeneous samples without abnormally large weights for individual participants is common knowledge. Phone and online surveys are today seen as equally good, with the latter becoming more and more relevant. It is rather the details of the modelling choices that need to be tuned. Thus, I’d rather agree with Nate Silver:
We strongly disagree with the idea that there was a massive polling error. Instead, there was a modest polling error, well in line with historical polling errors, but even a modest error was enough to provide for plenty of paths to victory for Trump. We think people should have been better prepared for it. There was widespread complacency about Clinton’s chances in a way that wasn’t justified by a careful analysis of the data and the uncertainties surrounding it.
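The remark above about abnormally large weights for individual participants can be made concrete with the Kish effective sample size, n_eff = (Σw)² / Σw², a standard diagnostic for how much weighting shrinks a sample’s statistical power. A minimal sketch (the 30-fold weight is modelled on a single heavily up-weighted panelist reportedly behind much of the USC/LA Times 2016 tracking poll’s house effect):

```python
def kish_ess(weights):
    """Kish effective sample size: (sum w)^2 / sum(w^2).
    Equal weights give back n; a few huge weights shrink it drastically."""
    s = sum(weights)
    s2 = sum(w * w for w in weights)
    return s * s / s2

# 1000 respondents with equal weights: the full effective sample size
print(kish_ess([1.0] * 1000))  # → 1000.0

# The same 1000 respondents, but one up-weighted roughly 30-fold
print(round(kish_ess([1.0] * 999 + [30.0])))  # → 558
```

A single extreme weight here costs almost half the effective sample, which is exactly why the “no abnormally large weights” rule is common knowledge among pollsters.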
Further, looking at the popular vote, the polls were actually quite accurate in predicting that Hillary Clinton would win it by a few percentage points:
In fact, the error in national polls wasn’t any worse than usual. Clinton was ahead by 3 to 4 percentage points in the final national polls. She already leads in the popular vote, and that lead will expand as mail ballots are counted from California and Washington, probably until she leads in the popular vote by 1 to 2 percentage points overall. That will mean only about a 2-point miss for the national polls. They may easily wind up being more accurate than in 2012, when they missed by 2.7 percentage points.
In contrast to the German electoral system, for example, the American system is rather difficult to model in a way that reduces uncertainty: a difference of only 2–3 percentage points in one or two states has a big influence on the outcome of the Electoral College.
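This sensitivity to small, correlated shifts can be illustrated with a toy Monte Carlo simulation. All numbers below – the state leads, the safe electoral vote split, the error magnitudes – are rough assumptions for illustration, not actual 2016 polling averages:

```python
import random

# Toy model: six swing states with (electoral votes, assumed Clinton lead
# in percentage points). Illustrative assumptions, not real poll averages.
SWING_STATES = {
    "Florida":        (29, 0.5),
    "Pennsylvania":   (20, 2.0),
    "Ohio":           (18, -1.0),
    "Michigan":       (16, 3.0),
    "North Carolina": (15, -0.5),
    "Wisconsin":      (10, 5.0),
}
SAFE_CLINTON = 217  # assumed electoral votes from non-swing states
SAFE_TRUMP = 213    # (hypothetical split; 270 of 538 needed to win)

def simulate(n_runs=50_000, seed=1):
    """Share of simulations in which Clinton falls short of 270."""
    rng = random.Random(seed)
    trump_wins = 0
    for _ in range(n_runs):
        # One shared national polling error applied to every state, plus
        # small state-level noise. The shared component is the crucial
        # part: correlated errors open up many paths to an upset.
        national_error = rng.gauss(0, 2.5)
        clinton_ev = SAFE_CLINTON
        for ev, lead in SWING_STATES.values():
            if lead + national_error + rng.gauss(0, 1.5) > 0:
                clinton_ev += ev
        if clinton_ev < 270:
            trump_wins += 1
    return trump_wins / n_runs

print(f"Trump win probability: {simulate():.0%}")
```

Even though Clinton leads in most of these hypothetical states, a modest shared error of 2–3 points flips several of them at once, so the upset probability comes out far from zero. Dropping the shared `national_error` term and keeping only independent state noise makes the upset much rarer, which is essentially Silver’s point about modest but correlated polling errors.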
If you are interested in poll-based predictions, I highly recommend reading Nate Silver’s article in full.
Update (12:21): Andrew Gelman has also covered the 2-point deviation in the outcome and shared his thoughts – as always, an insightful piece to read.