First analysis of the PollyVote’s 2016 popular vote forecast

Since its launch last January, the combined PollyVote forecast has consistently – and correctly – predicted Hillary Clinton to win the popular vote. And so she did, albeit barely.

In this preliminary post-mortem of how the PollyVote performed in predicting the 2016 popular vote, we compare how each component did relative to the others and to its record over the previous six elections. The analysis is based on current projections, according to which Clinton will end up winning 50.9% of the two-party popular vote.

Polls

Across the last 100 days before the election, which is the time frame we usually look at in our publications, the mean absolute error (MAE) of combined vote-intention polls was 1.8 percentage points. Polls were thus considerably more accurate than in previous elections.
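For readers who want to follow the arithmetic, here is a minimal sketch of how such an MAE is computed. The daily forecasts below are invented for illustration; they are not actual PollyVote data.

```python
# Mean absolute error (MAE) of daily popular-vote forecasts.
# The numbers below are invented for illustration; they are not PollyVote data.

actual = 50.9  # Clinton's projected share of the two-party vote, in %

# Hypothetical daily forecasts of Clinton's two-party vote share, in %
daily_forecasts = [52.4, 52.9, 52.1, 53.0, 52.6]

mae = sum(abs(f - actual) for f in daily_forecasts) / len(daily_forecasts)
print(f"MAE: {mae:.1f} percentage points")
```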

[Figure: polly_pm1]

The problem is that there were large polling errors in key states such as Michigan, Wisconsin, and Pennsylvania, which ended up deciding the election in the Electoral College and which no forecaster got right. In due time, we will publish a separate analysis of whether the PollyVote reduced the error in the Electoral College forecast.

Prediction markets

In prediction markets, traders bet on the outcome of an election, and the market prices provide a forecast of what is going to happen. Depending on the accuracy of their individual predictions, participants can either win or lose money, and thus have an incentive to be right. They should only participate if they think they have information that improves the current market forecast.

As in previous years, the PollyVote relied on the Iowa Electronic Markets (IEM) for predicting the popular vote. The IEM is operated by the business school of the University of Iowa for teaching and research purposes. Trading volume is relatively low, and participants are not allowed to invest more than $500. The IEM is the only prediction market for forecasting vote shares, so one cannot combine across markets. We did, however, combine across time by calculating one-week averages to protect against short-term spikes.
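As a rough sketch of that smoothing step, here is one way to compute trailing one-week averages. The daily prices are invented, and pandas is just one convenient tool for the job.

```python
import pandas as pd

# Hypothetical daily IEM prices for Clinton's two-party vote share, in %;
# the numbers are invented for illustration.
prices = pd.Series(
    [55.1, 54.8, 56.2, 55.5, 60.3, 55.0, 54.9, 55.4],
    index=pd.date_range("2016-10-01", periods=8, freq="D"),
)

# A trailing one-week mean damps short-term spikes such as the 60.3 outlier.
weekly_avg = prices.rolling(window=7, min_periods=1).mean()
print(weekly_avg.round(2))
```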

Across the last six elections from 1992 to 2012, the IEM vote-share market was the second most accurate among the PollyVote’s components, after citizen forecasts. In 2016, however, the IEM performed poorly and provided the worst forecasts of all components throughout the campaign, wildly overestimating Clinton’s vote share.

Prior research shows that participants in these markets tend to be well educated and to belong to middle- and upper-income groups. It may be that this demographic allowed their wishes to get the better of their judgment.

Expert judgment

Asking experts to predict what is going to happen is one of the oldest forecasting methods available. When it comes to forecasting elections, experts know how to read and interpret polls. In particular, they know that polls have errors. They could thus be expected to figure out in which direction the error lies and to revise the forecast accordingly.

In 2016, the combined expert forecast overestimated Clinton’s popular vote share relative to polls in five of the six surveys conducted since the parties’ conventions. On average, the expert forecast was 0.6 percentage points higher than the combined polls. In other words, the expert consensus was that the polls underestimated Clinton’s support.

Citizen forecasts

As in previous elections, a forecast derived from survey respondents’ answers to the simple question “Who will win?” was among the most accurate methods for predicting the 2016 popular vote. Across the final 100 days before the election, citizen forecasts missed on average by only 1.4 percentage points, an error only slightly higher than the average across the six previous elections.

Models

The PollyVote used combined forecasts from two separate groups of models: index and econometric models.

Index models are based on the idea of prospective voting. That is, they rely on the idea that voters evaluate the candidates and where they stand on the issues when deciding for whom to vote. On average, the five available index models overestimated Clinton’s support, particularly due to two models that were especially far off (the bio-index and the issue-index). The other three models (the big-issue model, the Issues and Leaders model, and the Keys to the White House) were close to the final election outcome.

In contrast, most econometric models are based on the idea of retrospective voting. That is, voters are expected to look back at how well the incumbent government has done its job, particularly in handling the economy. In addition, many models include some measure of how long the incumbent party has been in office to account for Americans’ desire for change, and some also include a measure of the president’s popularity. These models predicted a very close race, at least on average, and thus were the most accurate component method this election, the first time that has happened since 1992. That said, the final forecasts from the 18 individual models differed by as much as 10 points, ranging from 44.0% to 53.9% of the two-party vote.

Combining forecasts

We know from prior research, not just on election forecasting but in other fields as well, that the relative accuracy of different methods varies from one forecast to the next. We saw this again in 2016. Prediction markets, which were among the most accurate methods historically, were dramatically off, while econometric models, historically high in error, turned out to be more accurate this time. That is one of the reasons why combining forecasts usually works well: it is extremely difficult to predict ex ante which method will end up being most accurate.

In 2016, five of the six component methods erred in the same direction, over-predicting Clinton’s vote share by an average of more than 2 percentage points. As already mentioned, the IEM in particular was way off the mark. Econometric models were the only component method to slightly underestimate Clinton’s support. As a result, there was little “bracketing” of the true value, which limits the benefits of combining forecasts.
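A toy example with invented numbers illustrates the point: when forecasts bracket the true value, their errors partly cancel in the average; when they all err in the same direction, combining cannot reduce the error below the typical individual error.

```python
truth = 50.9  # true vote share, in %

def combined_and_typical_error(forecasts):
    # Error of the simple average vs. the average of individual errors.
    combined = abs(sum(forecasts) / len(forecasts) - truth)
    typical = sum(abs(f - truth) for f in forecasts) / len(forecasts)
    return combined, typical

# Forecasts that bracket the truth: errors cancel when averaged.
print(combined_and_typical_error([49.9, 51.9]))  # ≈ (0.0, 1.0)

# Forecasts that all overshoot: no cancellation; the combined error
# merely matches the typical individual error.
print(combined_and_typical_error([51.9, 53.9]))  # ≈ (2.0, 2.0)
```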

Given this lack of bracketing, the PollyVote performed only slightly better than the typical forecast. It performed worse than econometric models, citizen forecasts, and polls, but outperformed expert judgment, index models, and prediction markets.

The principle of combining forecasts does not claim that the combined forecast will always be more accurate than any of its components. While that can happen – and did happen, for instance, in 2004 and 2012 – it cannot be expected in any single election.

The claim is that, over time, as the relative accuracy of the methods varies, the combined forecast will outperform its components. This can be seen by looking at the mean absolute forecast errors across all seven elections, including this year’s. On average, the PollyVote error was lower than that of any of its components.

[Figure: polly_pm2]

Interestingly, citizen forecasts performed nearly as well as the PollyVote. So why not just use this one method in the future, you might ask? One major advantage of combining forecasts is that the combination is reliably among the most accurate methods and, most importantly, avoids large errors. There is no guarantee that citizen forecasts will perform as well in future elections.

How to improve

At the PollyVote, we are currently reviewing what we can learn from this election. This process includes reviewing which forecasts to include, how best to combine them, and how to better communicate the uncertainty surrounding them.

The PollyVote aims at communicating scientific forecasting principles that can be used by any organization:

  1. We know from prior research that the combined forecast will always be at least as accurate as the typical component forecast, in any single election (see the sketch after this list). Combining thus protects against large errors.
  2. We also know that, over time and across many elections, the combined forecast will be among the most accurate forecasts available, because the performance of individual forecasts varies widely.
  3. If you accept that it is extremely difficult to predict which forecast will turn out to be most accurate in a particular election, there is no better way to forecast than to combine forecasts.
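As a quick sanity check on point 1: the guarantee is just the triangle inequality, since the error of the average forecast can never exceed the average of the individual errors. The simulation below uses arbitrary random numbers, not real forecasts.

```python
import random

truth = 50.9
for _ in range(1000):
    # Six random component forecasts around the truth (arbitrary numbers).
    forecasts = [truth + random.uniform(-5, 5) for _ in range(6)]
    combined_error = abs(sum(forecasts) / len(forecasts) - truth)
    typical_error = sum(abs(f - truth) for f in forecasts) / len(forecasts)
    # Triangle inequality: the combined error never exceeds the typical error.
    assert combined_error <= typical_error + 1e-9
print("Combined error never exceeded the typical component error.")
```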

The PollyVote forecasters are (in alphabetical order): J. Scott Armstrong, Alfred G. Cuzán, Andreas Graefe, and Randall J. Jones Jr.

A terrible day for election forecasters. Where are the winners?

The 2016 presidential election result may well be one of the biggest upsets in the history of election forecasting. Usually, after an election, people look for the best forecasters. This time, it’s hard to find anyone who got it right, us included.

Regarding the Electoral College, the PollyVote combined state-level forecasts from 20 different sources, none of which predicted Trump to win a majority of electoral votes.

Now, people are pointing out that some models, such as those by Alan Abramowitz or Helmut Norpoth, did predict Trump to win.

Yet, these models predicted Trump to win the popular vote, which, according to the latest projections, he most likely won’t. Norpoth’s model, for example, predicted Trump to gain 52.5% of the two-party vote and thus might miss by about three points, which is a larger error than most of the other econometric models will end up with.

So are there any winners? Well, you could look at the individual models’ vote share predictions to judge their accuracy. For example, Jim Campbell’s well-known trial-heat model predicted Clinton to gain 50.7% of the vote and thus might be very close. But these models do not provide the most important piece of information: who will become president. So, no winners in sight.