
Post-election analysis: Accuracy of popular vote forecasts


As in previous elections since 2004, PollyVote once again did a very good job of predicting the popular vote. In its final forecast, the PollyVote predicted that Joe Biden would gain 52.2% of the two-party popular vote, a forecast that has remained remarkably stable since its first publication on May 15. Although votes are still being counted, and we probably won’t know the actual result for several weeks, Biden’s two-party vote will likely end up around 52%, very close to what PollyVote predicted.

PollyVote did well compared to the closely watched forecasts published by FiveThirtyEight and the Economist. Due to their heavy reliance on polls, which substantially overestimated Biden’s lead not only on the state level but also nationwide, these models predicted that Biden would win the popular vote by eight to nine points. The following chart, which will automatically update until the final vote count is known, shows the extent to which the PollyVote outperformed these benchmarks over the course of the campaign.

 

It’s only one election. Did PollyVote just get lucky?

PollyVote’s 2020 performance is well in line with previous elections, as we have shown in our research, which received an “Outstanding Paper Award” from the International Journal of Forecasting.

The average error of the PollyVote’s popular vote forecast across the last 100 days prior to the seven elections from 1992 to 2020 (ex post for the three elections from 1992 to 2000) was only 1.1 points, and more accurate than forecasts from any other method.

Now, let’s take a look at the historical performance of the PollyVote in predicting the popular vote, compared to well-known benchmark methods that also have a few elections under their belts.

The following chart compares the accuracy of the PollyVote and FiveThirtyEight across the three elections from 2012 to 2020 for which we have data from both forecasts (FiveThirtyEight was first launched in 2008, but we don’t have historical data available). For each day, the chart shows the average absolute error one would have achieved by relying on the PollyVote or the FiveThirtyEight model from that day until Election Day, both across and within each of the three elections. The line for the PollyVote is consistently below the respective line for FiveThirtyEight, which means that, on average, the PollyVote forecasts were more accurate. For example, over the last four weeks (28 days) before each of the last three elections, the PollyVote’s average error was 1.2 percentage points, compared to 1.6 points for FiveThirtyEight.
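The error measure behind this comparison can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and all forecast values are made up (they are not PollyVote or FiveThirtyEight data), and the sketch assumes each election contributes one forecast per day over the same number of days.

```python
from statistics import mean


def error_from_day_onward(daily_forecasts, actual):
    """For each day, return the mean absolute error of all forecasts
    issued from that day through the last day before Election Day.

    daily_forecasts is ordered from the earliest to the latest day.
    """
    abs_errors = [abs(forecast - actual) for forecast in daily_forecasts]
    return [mean(abs_errors[i:]) for i in range(len(abs_errors))]


def average_across_elections(curves):
    """Average several per-election error curves day by day.

    Assumes all curves have the same length and are aligned on
    days remaining until Election Day.
    """
    return [mean(day_values) for day_values in zip(*curves)]


# Hypothetical two-party vote forecasts (in %) for two made-up elections.
election_a = error_from_day_onward([54.0, 53.0, 52.5, 52.5], actual=52.0)
election_b = error_from_day_onward([51.0, 51.5, 52.0, 52.0], actual=51.0)

overall = average_across_elections([election_a, election_b])
print([round(e, 3) for e in overall])
```

A lower curve means that relying on the method from that day onward would have produced smaller errors on average, which is how the lines for the two forecasts are compared.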

Now, let’s compare the PollyVote’s forecast accuracy to a simple polling average. Here, we use RealClearPolitics (RCP), which allows us to compare forecasts back to the 2004 election, when the PollyVote was first launched. Again, on average, one would have clearly fared better by relying on the PollyVote instead of the RCP poll average. For example, over the last four weeks (28 days) before each of the last five elections, the PollyVote’s average error was 0.9 percentage points, compared to 1.5 points for the RCP poll average.

 

These accuracy gains of roughly half a point may seem small in absolute terms. However, given the closeness of recent elections, errors of that magnitude can make a big difference. Also, if one looks at the percentage of error reduction, the improvement in forecast accuracy is substantial, as can be seen in the chart on the right.

For example, if one had relied on the PollyVote instead of the FiveThirtyEight model for the last four weeks before the elections from 2012 to 2020, one would have reduced the forecast error by roughly 30%. 

Compared to the RealClearPolitics poll average across the five elections from 2004 to 2020, error reductions are even higher and exceed 40%, depending on the time remaining to Election Day.
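The percentage error reduction is simply the difference in error relative to the benchmark's error. As a minimal sketch (the function name is ours, and the inputs are the rounded four-week figures quoted above, so results are approximate):

```python
def error_reduction(benchmark_error, forecast_error):
    """Relative error reduction, as a fraction of the benchmark's error."""
    if benchmark_error <= 0:
        raise ValueError("benchmark_error must be positive")
    return (benchmark_error - forecast_error) / benchmark_error


# Rounded four-week errors quoted in the text: RCP 1.5 points, PollyVote 0.9 points.
rcp_reduction = error_reduction(benchmark_error=1.5, forecast_error=0.9)
print(f"{rcp_reduction:.0%}")  # prints 40%
```

Because the quoted errors are rounded to one decimal, plugging them in will not reproduce every percentage exactly. With the rounded FiveThirtyEight figures (1.6 versus 1.2 points), the same formula gives 25%, so the "roughly 30%" above presumably rests on unrounded errors.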

Benefits of combining forecasts

The PollyVote relies on a fundamental principle from nearly half a century of forecasting research: combining forecasts that rely on different methods and different information reduces error. This approach has two major advantages.

Combining protects from picking a poor forecast

The relative accuracy of different forecasting methods often varies across time. That is, methods that have worked well in one election might not work well in another, and vice versa. For example, polls performed pretty well in 2012, OK (at least at the national level) in 2016, and poorly in 2020. When predicting a single event (e.g., the 2020 election), a combined forecast such as the PollyVote will always be at least as accurate as the typical component forecast. Therefore, the PollyVote prevents the forecaster from making large errors when relying on a particular method that ends up being far off, such as polls in 2020.
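The "at least as accurate as the typical component" guarantee can be checked numerically. The sketch below assumes a simple unweighted average as the combination and the mean absolute error of the components as the "typical" error. The component names and values are hypothetical, not actual PollyVote inputs.

```python
from statistics import mean


def combined_vs_typical(component_forecasts, actual):
    """Return (error of the averaged forecast, average error of the components)."""
    combined = mean(component_forecasts)
    combined_error = abs(combined - actual)
    typical_error = mean([abs(f - actual) for f in component_forecasts])
    return combined_error, typical_error


# Hypothetical component forecasts of the two-party vote share (in %).
components = {
    "polls": 54.2,
    "models": 51.5,
    "experts": 53.0,
    "citizens": 52.1,
}

combined_error, typical_error = combined_vs_typical(
    list(components.values()), actual=52.0
)
print(round(combined_error, 2), round(typical_error, 2))
```

In this made-up example, the component that is far off (here, polls) pulls the average up only partially, so the combined error stays below the average component error. For a simple average and absolute error, the combined error can never exceed that average.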

Combining likely to prevail in the long run

Combining forecasts reduces error by canceling out systematic and random errors of individual forecasts, an effect that is particularly strong if the individual forecasts rely on different methods and data. If the different component forecasts “bracket” the final outcome, the combined forecast will be among the most accurate forecasts available. In the long run (e.g., when used across many elections), it is difficult to think of a better way to forecast than by combining different methods that use different information.
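A small numerical illustration of bracketing, again assuming a simple unweighted average and using made-up numbers: when the component forecasts fall on both sides of the outcome, their errors partly cancel; when they all miss in the same direction, the average only matches the typical component.

```python
from statistics import mean


def brackets(forecasts, actual):
    """True if the forecasts lie on both sides of the actual outcome."""
    return min(forecasts) < actual < max(forecasts)


def combined_error(forecasts, actual):
    """Absolute error of the simple average of the forecasts."""
    return abs(mean(forecasts) - actual)


def typical_error(forecasts, actual):
    """Average absolute error of the individual forecasts."""
    return mean([abs(f - actual) for f in forecasts])


actual = 52.0                # hypothetical outcome (in %)
bracketing = [51.0, 54.0]    # one forecast too low, one too high
one_sided = [53.0, 54.0]     # both forecasts too high

for forecasts in (bracketing, one_sided):
    print(
        brackets(forecasts, actual),
        combined_error(forecasts, actual),
        typical_error(forecasts, actual),
    )
```

With bracketing, the combined error (0.5) is well below the typical error (1.5). Without it, the two are equal (1.5 each), so combining does no harm but also brings no gain in that single case.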
