The bio-index model, developed by PollyVote team members Scott Armstrong and Andreas Graefe, predicts U.S. presidential election winners based on information about candidates’ biographies. The model uses 58 biographical cues that are expected to influence a candidate’s chances of being elected. The bio-index is useful to political decision-makers, as it can provide advice on questions such as whether a candidate should run for office or which candidate a party should nominate.
A large stream of research, particularly in psychology, analyzes questions such as what makes people emerge as leaders? For example, meta-analyses found intelligence and height to have a positive impact on both leader performance and leader emergence. Such findings from prior research were used to identify and code the majority of variables in the bio-index. In addition, some variables are based on common sense. For example, it was assumed that voters are more attracted to candidates who are married but not divorced. As shown in Table 1 at the end of this page, the model distinguishes two types of variables:
- Yes / No variables (n=47): For this type of variable, candidates are assigned a score of 1 if they possess a certain attribute and 0 otherwise.
- Comparative variables (n=11): For this type of variable, the candidates of the two major parties are compared on the underlying attribute. The candidate who scores better than the opponent is assigned a score of 1; the other candidate is assigned 0.
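The two coding rules can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function names `code_yes_no` and `code_comparative` are hypothetical, and giving both candidates 0 when values are tied or missing is an assumption (the text only states that such variables are not coded).

```python
from typing import Optional, Tuple


def code_yes_no(has_attribute: bool) -> int:
    """Yes/No variable: 1 if the candidate possesses the attribute, else 0."""
    return 1 if has_attribute else 0


def code_comparative(
    value_a: Optional[float], value_b: Optional[float]
) -> Tuple[int, int]:
    """Comparative variable: the candidate with the better (here: larger)
    value gets 1, the other gets 0.

    Assumption: if either value is missing, or the values are tied,
    neither candidate receives a point.
    """
    if value_a is None or value_b is None:
        return (0, 0)
    if value_a > value_b:
        return (1, 0)
    if value_b > value_a:
        return (0, 1)
    return (0, 0)


# Hypothetical inputs, for illustration only.
married_score = code_yes_no(True)
height_scores = code_comparative(188.0, 180.0)   # candidate A is taller
missing_scores = code_comparative(None, 180.0)   # data not available
```

The sketch assumes that a larger value is the better one; for an attribute where smaller is better, the inputs would need to be negated or the comparison reversed.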
The model is based on the index method, which is useful in situations with many variables and good prior knowledge about the variables’ directional effect on the target criterion.
Predicting the election winner
After all variables have been coded, the total index scores for each candidate are calculated by summing up the scores using equal weights. Then, the candidate who achieves the higher overall score is predicted as the election winner.
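The equal-weights summation and the winner rule described above can be sketched as follows. The candidate labels, the variable subset, and the helper names `index_score` and `predict_winner` are hypothetical; returning `None` on a tie is an assumption, since the text does not say how ties are resolved.

```python
from typing import Dict, Optional


def index_score(coding: Dict[str, int]) -> int:
    """Sum the 0/1 codes with equal (unit) weights."""
    return sum(coding.values())


def predict_winner(scores: Dict[str, int]) -> Optional[str]:
    """Return the candidate with the higher total index score.

    Assumption: a tie yields no prediction (None).
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


# Hypothetical coding for two made-up candidates on a few variables.
candidate_a = {"is_married": 1, "has_children": 1, "went_to_college": 1, "is_taller": 0}
candidate_b = {"is_married": 1, "has_children": 0, "went_to_college": 1, "is_taller": 1}

totals = {"A": index_score(candidate_a), "B": index_score(candidate_b)}
winner = predict_winner(totals)
```

With equal weights, no estimation is needed at this stage; only the direction of each variable's effect has to be known in advance.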
Simple linear regression is then used to relate the incumbent party candidate’s relative bio-index score (bio) to the dependent variable, which is the actual two-party popular vote share received by the candidate of the incumbent party (V). Using data from the 30 elections from 1896 to 2012 leads to the following vote equation:
V = 19.9 + 60.5 * bio-score (incumbent) / [bio-score (incumbent) + bio-score (challenger)]
That is, the bio-index model predicts that an incumbent would start with 19.9% of the vote, plus a share depending on bio. If the incumbent’s relative bio score went up by 10 percentage points, the incumbent’s vote share would go up by about 6 percentage points.
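To illustrate how a simple linear regression produces an intercept and a slope like those in the vote equation, here is a minimal ordinary-least-squares sketch. The historical data are not reproduced on this page, so the example uses synthetic points generated from the published equation itself; they are not real election results. The function name `fit_simple_ols` is hypothetical.

```python
from typing import Sequence, Tuple


def fit_simple_ols(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Synthetic points lying exactly on the published line (NOT historical data).
toy_relative_score = [0.40, 0.50, 0.60, 0.70]
toy_vote_share = [19.9 + 60.5 * s for s in toy_relative_score]

intercept, slope = fit_simple_ols(toy_relative_score, toy_vote_share)

# Effect of a 10-percentage-point (0.10) rise in the relative score.
effect_of_10_points = slope * 0.10
```

Because the toy points sit exactly on the line, the fit recovers the published coefficients, and the slope implies a gain of roughly 6 vote-share points for a 10-point rise in the relative score.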
Table 1 shows the coding for Hillary Clinton and Donald Trump. For some comparative variables, such as intelligence, data are not available. Other comparative variables, such as weight or height, are not coded because the Clinton-Trump race is an inter-gender comparison.
Based on the coding, Clinton achieves a bio-index score of 19 points (compared to 11 points for Trump) and is thus predicted to win the election. Inserting these figures into the vote equation derived above leads to:
V = 19.9 + 60.5 * ( 19 / (19 + 11)) = 58.3
That is, the bio-index model predicts Clinton to achieve 58.3% of the vote, compared to 41.7% for Trump.
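The calculation can be reproduced with a short function. This is a sketch under stated assumptions: the function name `forecast_vote_share` is hypothetical, the coefficients are the rounded values printed in the vote equation, and the scores of 19 and 11 are those inserted into the equation above.

```python
def forecast_vote_share(
    incumbent_score: float,
    challenger_score: float,
    intercept: float = 19.9,
    slope: float = 60.5,
) -> float:
    """Two-party vote share (in %) for the incumbent party's candidate."""
    total = incumbent_score + challenger_score
    if total == 0:
        raise ValueError("At least one candidate needs a non-zero index score.")
    relative_score = incumbent_score / total
    return intercept + slope * relative_score


clinton_share = forecast_vote_share(19, 11)
trump_share = 100.0 - clinton_share
lead = clinton_share - trump_share

print(f"Clinton: {clinton_share:.1f}%, Trump: {trump_share:.1f}%, lead: {lead:.1f} points")
```

With the rounded coefficients as printed, the result is about 58.2%, marginally below the 58.3% reported above. The small gap is presumably due to rounding in the published coefficients.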
Compute your own forecast
You can also compute your own bio-index model forecasts. This feature allows you to:
- See how the model forecast would change for different variable coding.
- Compare two hypothetical candidates against each other.
- See how you would perform as a candidate.
To do so, scroll all the way down to Table 1 and adjust the variable values in the green highlighted cells. Whenever you change a variable value, the forecast will be updated at the bottom of the table.
Forecasts for other match-ups
The following chart shows the bio-index model’s forecasts for Clinton vs. other candidates. Trump stands out as one of the worst possible candidates (only Boehner scores lower than him). In comparison, Clinton performs well against the vast majority of potential opponents. In fact, there are only two candidates among those we coded that the model would predict to defeat Clinton: Jeb Bush and Lindsey Graham, both of whom have already dropped out of the race.
The following chart shows the model’s predicted and actual percentage point lead in the two-party vote for the winners of the 30 elections from 1896 to 2012. The vote-share predictions are calculated in-sample. If both the grey and the orange bars are on the right-hand side of zero, the model correctly predicted the final election winner.
As can be seen, the model’s forecasts failed only two times: it wrongly predicted Ford to beat Carter in 1976 as well as Bush to defeat Clinton in 1992. For the remaining 28 elections, the model correctly predicted the winner. This record of 93% correct predictions compares favorably to other statistical models as well as to polls and prediction markets. The forecast for the 2012 election, published in August 2011, correctly predicted Obama to defeat Romney.
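The hit rate quoted above can be checked with a few lines of Python. The sketch assumes one election every four years from 1896 to 2012 and takes the two missed years from the text; the variable names are illustrative.

```python
# Presidential election years from 1896 to 2012, one every four years.
election_years = list(range(1896, 2013, 4))

# Elections in which the model picked the wrong winner, per the text.
missed_years = {1976, 1992}

correct_years = [year for year in election_years if year not in missed_years]
hit_rate = len(correct_years) / len(election_years)

print(f"{len(correct_years)} of {len(election_years)} correct ({hit_rate:.0%})")
```

Note that this is an in-sample hit rate, as the vote-share predictions in the chart are calculated in-sample.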
The chart also puts the model’s 2016 forecast in historical perspective. The predicted 16.6-point lead for Clinton over Trump is the third-largest margin in the sample (tied with the 1996 race between Clinton and Dole). The two elections in which the method predicted an even larger lead for a candidate were 1904 (Theodore Roosevelt vs. Alton Parker) and 1940 (Franklin D. Roosevelt vs. Wendell Willkie).
Like any model, the bio-index is subject to limitations.
- The bio-index model ignores many factors that are also important for predicting election outcomes. Examples include information about the state of the economy, the time the incumbent party has held the White House, the perceived ability to handle issues, or the effectiveness of the advertising campaigns.
- The wrong forecasts in 1976 (predicted Ford) and 1992 (predicted Bush) indicate a certain bias towards experience. This is unsurprising, given that the model assumes prior experience to be a predictor of leader emergence.
- The 2016 election will most likely be the first male-female race for president in U.S. electoral history. In this situation, some variables (e.g., height, weight) are not coded since their predictive validity is unclear.
That said, the model’s major aim is not to produce the most accurate forecasts possible. If this is what you are looking for, check out the combined PollyVote forecast. Instead, the major goal of the bio-index is to provide decision-making implications by advising parties on whom they should nominate. Given the predicted 16.6-point lead for Clinton in a hypothetical race against Trump, the model’s implications are clear: Donald Trump would be the worst possible choice of the remaining candidates.
|Table 1: The bio-index variables and their coding (1: True; 0: False) for a Trump – Clinton race
|1||Has adopted children||See variable 3 (has children). Voters might favor candidates who adopted children.|
|2||Descends from a presidential family||Descent from renowned families has been shown to have a positive impact on an individual’s career chances.|
|3||Has children||The social norm is to have children, so voters might favor candidates who have children.|
|4||Has not been divorced||Although divorces are common, they violate the social norm.|
|5||Has a father who held a political office||The role of a candidate’s father may have an impact on a candidate’s chances of being elected. Similar to Simonton (1981), a score of 1 was assigned if a candidate’s father held one of the offices listed in variables 18 to 30.|
|6||Is the first-born child in the family||First-born children tend to achieve more than later-born children. One study analyzed samples of 45 male U.S. Governors and 24 Australian prime ministers. Compared to the population at large, the politicians in both samples were more likely to be first-born and less likely to be middle-born.|
|7||Is the single child||Single children have an advantage over children from larger families. For example, one study found a negative correlation between family size and political performance for the 38 U.S. presidents up to Jimmy Carter. Another study analyzed birth order data for almost 1200 Dutch politicians and found single children to be overrepresented compared to the general population, whereas middle children were underrepresented. Candidates are still considered single children if they have half siblings.|
|8||Is married||It is the social norm to get married.|
|9||Went to college||Similar to Simonton (1981), the level of formal education is coded by assigning values of 1 if a candidate went to college, graduated from college, obtained a Master’s degree, obtained a PhD degree, obtained a Law (J.D.) degree, or worked as a university professor.|
|10||Graduated from college|
|11||Has a Law (J.D.) degree|
|12||Has a Master’s degree|
|13||Has a PhD/doctoral degree|
|14||Has been a college or university professor|
|15||Is a member of Phi Beta Kappa||Similar to Simonton (1981), scholastic performance is measured by coding whether a candidate was an in-course (not alumnus or honorary) member of Phi Beta Kappa.|
|16||Attended an Ivy League college||To have an objective and unambiguous criterion for the reputation of a college, all Ivy League colleges as well as the U.S. Naval and Military Academies were considered prestigious.|
|17||Went to U.S. Naval/Military Academy|
|18||Is/was U.S. or State Attorney General||Similar to Simonton (1981), prior political experience was assessed by assigning values of 1 if a candidate had occupied one of the offices listed in variables 18 to 30.|
|19||Is/was a city mayor|
|20||Is/was a state governor|
|21||Is/was a judge|
|22||Is/was Lieutenant Governor|
|23||Is/was U.S. Solicitor General|
|24||Is/was a state representative|
|25||Is/was a state senator|
|26||Is/was U.S. president|
|27||Is/was a U.S. representative|
|28||Is/was a U.S. Secretary|
|29||Is/was a U.S. senator|
|30||Is/was Vice President of the U.S.|
|31||Has not been defeated in a political election|
|32||Suffers from physical or sensory disability||Traumatic experiences that may have a positive impact on leader emergence may be the survival of a major life-threatening disease, physical or sensory disability, or chronic illness in childhood.|
|33||Survived a major life-threatening disease|
|34||Has suffered from chronic illness in childhood or adolescence (before the age of 30)|
|35||Has lost one or more children||Empirical evidence supports the idea that the development of genius may be fostered by traumatic experiences, particularly in childhood or adolescence. For example, prior literature finds that people who lost a parent during childhood are more likely to achieve more in life. Following Simonton (1981), a candidate is considered an orphan if one (or both) of the candidate’s parents died before the candidate reached the age of 30. Similarly, scores of 1 are assigned if a candidate lost one (or more) children, siblings, or a spouse.|
|36||Has lost one or more siblings|
|37||Has lost a spouse|
|38||Is an orphan|
|39||Is between 47 and 64 years old||Candidates might have a disadvantage if they are either too young or too old. Prior research supports this assumption for high-level positions in large public firms. In analyzing a sample of more than 10,000 CEOs, one study found that the median age was 57 years, the 10th percentile 47 years, and the 90th percentile 64 years.|
|40||Is known as athletic||One review of the literature summarizes several studies that found a positive relationship between leadership and athletic ability.|
|41||Has authored one or more books||The number of books that a president published prior to being elected has been found to have a positive impact on his political performance. In addition, a publishing record should have a positive impact on the wide recognition of a candidate among voters.|
|42||Is/was a celebrity in a field other than politics||Being a famous person in a field other than politics should have a positive impact on the wide recognition of a candidate among voters. This can include being a famous actor, athlete, artist, or TV (radio) moderator.|
|43||Is clean-shaven||Several studies examine how facial hair (i.e., clean-shaven, mustache, goatee, or beard) influences how people are perceived. For example, one study found beards to be associated with lessened competence. Findings from an experiment show that bearded applicants are selected for management positions at a lower rate than non-bearded applicants. By contrast, results from another experiment found consistently more positive perceptions of social/physical attractiveness, personality, competency, and composure for men with facial hair. Given that most politicians, especially in recent years (note that William Taft was the last U.S. president with facial hair), are clean-shaven, facial hair is expected to have a negative effect on the evaluation of candidates.|
|44||Wears glasses||One study found people wearing eyeglasses to be perceived as more industrious, dependable, and honest. A lab experiment found that eyeglasses enhance an individual’s perceived authority. Another study found eyeglasses to be associated with heightened competence but also diminished forcefulness. On balance, eyeglasses were expected to have a positive impact on the evaluation of candidates.|
|45||Is not bald||Although not identifying a voter bias, one study found that bald and balding men are underrepresented among governors and Congress members as compared to the general public.|
|46||Has military experience||Similar to Simonton (1981), military experience is coded if a candidate served as a wartime recruit, professional soldier, or military general.|
|47||Has been awarded military honors||Scores of 1 are assigned if a candidate was awarded military honors.|
|48||Looks more competent||Several studies measure competence ratings based on people’s assessments of candidates’ headshots (Todorov et al., 2005; Antonakis and Dalgas, 2009; Armstrong et al., 2010). These studies show that candidates with higher ratings of ‘facial competence’ are more likely to win elections.|
|49||Has the more common first name||Candidates with the more common first name were expected to have an advantage. Name popularity was obtained from the 1990 U.S. census (http://names.mongabay.com).|
|50||Is taller||Height is a well-known predictor of leadership emergence and performance. One meta-analysis found physical height to be positively correlated with esteem (r=.41), leader emergence (r=.24), performance (r=.18), and income (r=.26). In estimating factors to predict presidential greatness, both McCann (1992) and Simonton (1981) find a positive correlation between height and political performance.|
|51||Is from a state with more electoral votes||Candidates are likely to win the votes of their home state. Thus, the candidate coming from the state with more electoral votes was assumed to have an advantage.|
|52||Is more intelligent||Results from a meta-analysis show that intelligence predicts leader emergence. Simonton (2006) correlates IQ scores for all 42 U.S. Presidents before Barack Obama with evaluations of presidential leadership performance. He found that intelligence is positively correlated with political success.|
|53||Is more attractive||One study assessed the beauty of political candidates from major political parties and then estimated the effect of beauty on vote share for candidates in the 2004 Australian election. It found that beautiful candidates are more likely to win elections. Berggren et al. (2010) report a similar effect. In analyzing more than 10,000 visual assessments of almost 2000 Finnish political candidates, the authors report a positive relationship between attractiveness and the received vote share of candidates.|
|54||Represents the larger race||Voters were expected to be more likely to endorse a candidate who represents their race. Thus, the candidate who represents the larger race was expected to have an advantage. Also, in analyzing ballot photographs for low-information elections, one study found that the probability of winning for white candidates is 38% greater than for nonwhite candidates.|
|55||Is affiliated with the larger religion||Voters were expected to be more likely to endorse a candidate who shares their religious beliefs. Thus, the candidate who identifies with the larger religion was expected to have an advantage.|
|56||Has the more common surname||Candidates with the more common surname were expected to have an advantage. Name popularity was obtained from the 1990 U.S. census (http://names.mongabay.com).|
|57||Has the more dominant voice||One study analyzed the acoustic frequency of candidates’ voices in presidential debates. The authors find that this nonverbal vocal communication reveals social dominance and thus can be helpful to predict the popular vote. The bio-index uses the data from the eight elections in that study’s sample.|
|58||Is heavier||A review of the literature provides evidence that weight is positively correlated with leadership (r=.23): seven studies find leaders to be heavier, whereas two studies find leaders to be lighter; another two studies find no difference.|
|Total index score|
- Armstrong, J. S. & Graefe, A. (2011). Predicting elections from biographical information about candidates. Journal of Business Research, 64(7), 699-706.
- Graefe, A. & Armstrong, J. S. (2011). Conditions under which index models are useful: Reply to bio-index commentaries. Journal of Business Research, 64(7), 693-695.
- Graefe, A. & Armstrong, J. S. (2011). Who should be nominated to run in the 2012 U.S. Presidential Election? Long-term forecasts based on candidates’ biographies. 2011 APSA Annual Meeting Paper.