What's in a Name?
In our view, data journalism as a field encompasses a suite of practices for collecting, analyzing, visualizing, and publishing data for journalistic purposes. This definition may well be debated. The history of data journalism is full of arguments about what it should be called and what it includes.
In fact, data journalism has been evolving ever since CBS used a computer to successfully predict the outcome of the presidential election in 1952. As technology has advanced, so has the ability of journalists to tap that technology and use it for important storytelling.
One key definition of data journalism can be found in a 2014 report by Alexander Howard for the Tow Center for Digital Journalism and Knight Foundation. Data journalism is “gathering, cleaning, organizing, analyzing, visualizing, and publishing data to support the creation of acts of journalism,” Howard wrote. “A more succinct definition might be simply the application of data science to journalism, where data science is defined as the study of the extraction of knowledge from data.”1
But news games, drone journalism, and virtual reality, approaches that some may not consider mainstream data journalism today, may be far more prominent tomorrow. Or data journalism may evolve in yet another direction, perhaps toward common applications of machine learning and algorithms. Data journalists are already working more with unstructured information (text, video, audio) than with the field’s historical staples: spreadsheets and databases full of rows and columns of numbers.
“I think the one good thing about the name discussion is that people are realizing there are different kinds of approaches to data for journalism,” said Brant Houston, the Knight Chair in Investigative and Enterprise Reporting at the University of Illinois at Urbana-Champaign and a former executive director of Investigative Reporters and Editors (IRE).
The ever-evolving practice of data journalism has at heart represented what journalists do best: pushing against the boundaries of what is expected. Editors used to argue that readers wouldn’t understand a scatterplot published in the newspaper. Today, the New York Times’s Upshot, FiveThirtyEight.com, and others regularly publish informative data graphics and visualizations.
At the same time, each generation of data journalists has informed the next and balanced the desire to try new methods with foundational ethics and transparency.
Our study aims to provide a broad evaluation of many areas of journalistic practice involving data and to identify best practices for teaching these skills and the “data frame of mind” that goes with them. In doing so, we looked at multiple forms of data journalism and defined them as best we could to ensure clear communication.
1. Howard, “The Art and Science of Data-Driven Journalism,” p. 4.