Guide to Automated Journalism

Algorithms for automatically generating stories from machine-readable data are shaking up the news industry, not least since the Associated Press, one of the world’s largest and most reputable news organizations, has started to automate the production and publication of quarterly earnings reports. Once developed, such algorithms can create thousands of routine news stories for a particular topic, usually faster, cheaper, and with fewer errors than any human journalist ever could. See the Guide to Automated Journalism for a summary of the status quo of automated news generation and a discussion of key questions and potential implications for journalists, news consumers, media outlets, and society at large.

Despite its potential, the technology is in an early market phase. Automated news generation is still limited to routine and repetitive topics for which (a) clean and accurate data are available, (b) the stories merely summarize facts, and therefore (c) leave little room for uncertainty and interpretation. Popular examples include recaps of lower-league sports events, financial news, crime reports, and weather forecasts. For such topics, research finds little difference in people’s relative perception of human-written and automated news. Also, due to the low-involvement nature of these topics, readers may be less concerned about issues regarding algorithmic transparency and accountability.

But what if the stories cover a high-involvement topic that also involves uncertainty? How do users perceive automated news for such topics and to what extent are they interested in how the underlying algorithms work?

To address these questions, two members of the PollyVote team, Andreas Graefe and Mario Haim, have embarked on a project to study the creation and consumption of automated news for forecasts of the 2016 U.S. presidential election. With funding provided by the Volkswagen Foundation, we will develop automated news based on the PollyVote data for national-level forecasts. In addition, for the first time ever, we will expand the PollyVote to the state level and generate automated news for those forecasts as well. This endeavor is funded by Columbia University’s Tow Center for Digital Journalism.