Model 1: Integrating Data as a Core Class
Foundations of Data Journalism
This is a model for a required introductory course at the graduate or undergraduate level. What follows is a narrative account of how such a class may proceed in developing data literacy among beginning journalists. We realize that this course may need to fit the unique contours of different journalism programs, some of which contain boot camps and other introductory programs with idiosyncratic durations and varying levels of intensity and focus on different skills. The point is for this course to be given equal footing with other skills or subject matters that are currently treated as essential in a journalism education.
Course Description: This course is an introduction to the collection, analysis, presentation, and critique of structured information by journalists. As students are introduced to the basics of reporting and the range of journalistic methods that they may pursue in later coursework, an introduction to data and computation is an essential component of their journalism education.
Over the course of a term, students should begin to develop a frame of mind in which they approach every story looking for data possibilities. They should learn basic methods for working with spreadsheets and relational databases. They should get a primer on using and understanding statistical concepts. They should learn how to take their data findings and locate the people who illustrate those findings for their stories. They should also learn how to convert their data analysis into a pitch for a journalistic story.
Students will learn how to find data online, how to maintain personal records as they report stories, and how to use simple visualization methods to find new information: how keeping a timeline can help reveal discrepancies and how cross-checking sources of information may lead to new avenues of inquiry. The use of data in these contexts will benefit students no matter which area of journalism they choose to practice. Just like interviewing, which is a ubiquitous journalistic skill, the art of gathering and understanding data should extend widely across the field of journalism.
The trouble with data is that it so often appears clinical or detached from the richness of people’s lives. To reduce things to abstractions may seem limiting to some students. Early exercises may help to counter this presupposition. If you ask the class to gather information about each other, such as their birth dates, blood types, eye colors, and birthplaces, they may see within a 20-minute exercise how interesting data can be when it tells us something we care to know.
From this point, the class may move to more journalistic exercises with spreadsheets. As students become more comfortable with spreadsheets, the class may turn to methods of data analysis such as pivot tables and basic plotting methods.
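For instructors who want to show what a pivot table does behind the scenes, the following is a minimal sketch in Python rather than a spreadsheet. The payroll rows, the department names, and the `pivot_sum` helper are all hypothetical and serve only to illustrate how rows are grouped and summed.

```python
from collections import defaultdict

# Hypothetical payroll rows: (department, pay type, amount in dollars).
rows = [
    ("Police", "base", 52000),
    ("Police", "overtime", 18000),
    ("Fire", "base", 48000),
    ("Fire", "overtime", 9000),
    ("Police", "overtime", 7000),
]

def pivot_sum(records):
    """Sum amounts by department (rows) and pay type (columns)."""
    table = defaultdict(lambda: defaultdict(int))
    for department, pay_type, amount in records:
        table[department][pay_type] += amount
    # Convert nested defaultdicts to plain dicts for easier reading.
    return {dept: dict(cols) for dept, cols in table.items()}

pivot = pivot_sum(rows)
for dept in sorted(pivot):
    print(dept, pivot[dept])
```

A spreadsheet pivot table performs the same grouping interactively, which is usually the more approachable route for beginners.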
Skills: This class should prepare students to use spreadsheets and databases to find and tell stories.
Central concerns include spreadsheet training and how to find data, clean it, look for patterns and outliers, and question the biases and omissions in how it was gathered. Instructors may choose to use an introductory data set with a good story for beginners to find (examples are listed in the appendix).
Students should also learn how to critically assess claims surrounding data. Reporting on data may go astray when it presumes this information is complete and accurate. Reporters should be trained to look for problems in data. It is necessary to question every source of information.
Another foundational aspect of data-driven reporting is to recognize patterns and anomalies in data. Two skills that should emerge from this class are to look for trends and to identify outliers. Every data point is a possible source or anecdote.
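One common way to flag outliers is the interquartile range (IQR) rule, sketched below. This is an illustrative choice and not a method prescribed by this course model. The overtime figures and the `find_outliers` helper are hypothetical, and the 1.5 multiplier is a convention, not a requirement.

```python
import statistics

# Hypothetical overtime payments, in dollars, for seven employees.
overtime = [1200, 1500, 1100, 1300, 1400, 1250, 9800]

def find_outliers(values, k=1.5):
    """Return values lying more than k * IQR below Q1 or above Q3."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

print(find_outliers(overtime))
```

A flagged value is a reporting lead, not a finding: it may reflect an error in the data as easily as an unusual case worth an interview.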
This class will introduce data visualization, but mainly as a means to explore a data set. Using an approachable program such as Excel, Fusion Tables, or Tableau, students will learn to display data in graphic form as an inroad to asking journalistic questions. The goal is not to design a graphic for publication, but to graph for the sake of understanding the data. Students may think of this as a research method or a sketchpad for further reporting. Instruction should include discussion of the ways that different visualization methods can be misleading.
Along the way, this class can cover basic numeracy and descriptive statistics, skills that every journalist needs. This may include reminders about how to calculate percentages and percent change, how to work with units and measures, and even how to make sense of large numbers like billions and trillions. Once those are covered, the material could move to statistical principles and methods such as standard deviation and regression analysis.
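The arithmetic named above can be sketched in a few lines of Python. The figures and the helper names `percent_of` and `percent_change` are hypothetical; the summary measures come from Python's standard `statistics` module.

```python
import statistics

def percent_of(part, whole):
    """Return what percentage `part` is of `whole`."""
    return part / whole * 100

def percent_change(old, new):
    """Return the percent change from `old` to `new`."""
    if old == 0:
        raise ValueError("percent change is undefined when the old value is 0")
    return (new - old) / old * 100

# Hypothetical figures: 200 bridge inspections last year, 250 this year.
change = percent_change(200, 250)

# Hypothetical salaries with one very high earner.
salaries = [42000, 45000, 47000, 51000, 150000]
mean_pay = statistics.mean(salaries)
median_pay = statistics.median(salaries)
spread = statistics.stdev(salaries)  # sample standard deviation

print(change, mean_pay, median_pay, spread)
```

Comparing the mean with the median in the salary example shows students how a single large value can pull the average away from the typical case.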
Topics: Data sources, importing data, negotiating for data, checking the veracity of data, data cleaning, using formulas in spreadsheets, querying databases, finding social significance in the data, writing a data story, visualizing a data story.
Course Structure: Mix of hands-on practice and lectures, primarily using spreadsheet tools and perhaps relational database software; some limited exposure to data visualization for story exploration.
Example Assignments:
- Homework: Bring a piece of data journalism to critique in class.
- Homework: Find a data set and explain why it’s interesting and what it might reveal.
- Classwork: Discuss basic data analysis and cleaning on prepared example data.
- Spreadsheet assignment: Analyze a public data set, such as a government’s payroll (including overtime), a city or county budget, or bridge inspection records.
- Final: Produce a data story in three assignments: pitch, draft, final submission.