Model 3: Concentration in Data & Computation

A data journalism concentration should begin with several core, required classes before moving into a track of electives offering data journalism analysis, visualization, and online research/backgrounding.

The curriculum detailed below should provide a framework for a school to begin offering specialized coursework to students who wish to concentrate in data-driven reporting or computational journalism.

This section describes some of the courses that may form such a degree. Depending on the availability of instructors and other resources, classes like these may form either the mandatory core of a concentration in data and computation, or else a range of electives.

Please note that we would not expect any journalism school to offer all of these classes, nor only these, in its data and computational curriculum. This is just one picture of the skills and thematic exposure that could constitute a journalism degree specializing in data and computation.

Core Classes Required for Concentration in Data & Computation

Foundations of Data Journalism

This is the course outlined in the opening of this chapter (see full description on page 50) as a requirement for all journalism students. If students enter journalism school without declared concentrations, this introduction will be suitable for future data concentrators to learn the basics before proceeding to other required courses and electives. Schools may also choose to require applicants to be specifically accepted into the data concentration, in which case it may be advisable to offer a summer boot camp (see “Note on Incoming Skills, Technical Literacies, and Specialized Boot Camps,” page 74) to get students up to speed on the tools and methods they will need. In this case, data concentrators may be placed in a more advanced fall foundations course with their peers.

Introduction to Journalistic Programming

Course description: The purpose of this course is to introduce students to several foundational computer-programming skills that they will use to find and tell stories. This should be a requirement for those who concentrate in data and computation, but also open to students from other tracks.

Course structure: Meets twice weekly, first for lecture and then for an intensive workshop.

Skills: The Unix command line; basic Python programming for scraping, parsing, connecting to APIs; introduction to JavaScript for web work.

Tools: Bash utilities, Jupyter/IPython Notebook, Pandas, Matplotlib, JavaScript.

Example assignments:

  • Test command-line proficiency with a quiz, or even a screencast demonstrating completion of a series of tasks using Bash alone.
  • Story assignment reported and submitted in a Jupyter/IPython notebook.

Statistics for Journalism

Course description: The methods and principles of statistics have proven to be powerful tools in the hands of journalists. This course should be a rigorous introduction to stats work taught from within a framework of journalistic concerns. That means the course is story-based, in the sense of precision journalism and the CAR tradition.

Course structure: Weekly lectures with in-class exercises, regular homework, and a final exam.

Skills: Developing and testing hypotheses; understanding and applying the central limit theorem, normal distribution, and confidence intervals; frequentist versus Bayesian statistics; linear regression; analysis of variance.

Tools: RStudio, Excel, MySQL, Microsoft Access, SAS, SPSS (proprietary) or PSPP (F/OSS).

Example assignments:

  • Analyze crime statistics, look for a trend, and try to explain its cause.
  • Look at the distribution of cancer cases and try to decide if there is evidence of an increase in more polluted areas.
  • Analyze statistical evidence for U.S. and international cases to predict whether reducing the number of guns would have an effect on gun violence.
  • Analyze the stats in a research paper and report them in plain language.
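
As one illustration of the confidence-interval skill listed above, the following is a minimal sketch in Python using only the standard library. It assumes a simple random sample and uses the normal approximation for a proportion; the poll figures and the function name proportion_confidence_interval are hypothetical, invented for this example.

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(successes, n, confidence=0.95):
    """Return (low, high) for a sample proportion using the normal approximation."""
    p = successes / n
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * sqrt(p * (1 - p) / n)
    return p - margin, p + margin


# Hypothetical poll, invented for illustration: 520 of 1,000 respondents favor a measure.
low, high = proportion_confidence_interval(520, 1000)
print(f"52% support, 95% confidence interval: {low:.1%} to {high:.1%}")
```

Because the interval in this invented poll straddles 50 percent, a story claiming that "a majority" supports the measure would overstate what the sample shows.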

Distribution of Electives

For the concentration, the school may offer elective courses to fulfill requirements in two or three areas of data and computational work. We have divided these into three categories: presentation/visualization, analysis for story, and journalistic programming. As a matter of designing degree requirements, a program might choose to require at least one class from each category in addition to fulfilling overall credit requirements.

Presentation & Visualization

  • Data Visualization
  • Visual Journalism with Data and Computation
  • Advanced Data Visualization
  • Advanced Journalistic Mapping

Analysis for Story

  • Writing About Data
  • Statistical Analysis for Journalism
  • Advanced Computational Reporting Methods (Using CAR)

Journalistic Programming

  • Introduction to Journalistic Programming
  • Methods of Collecting Data and Automating Reporting
  • News App Development
  • Advanced Computational Journalism

Elective Coursework: Graduate Degree with Concentration in Data & Computation

Methods of Collecting Data & Automating Reporting

Course description: This course focuses on developing expertise in gathering data, cleaning it, storing it in a database, and retrieving it with ease. It also emphasizes building automated tools to serve as data sources in reporting.

Course structure: Weekly workshop or lab-based instruction.

Skills: Web scraping, APIs, cron jobs, bash scripting, digitizing paper documents, regular expressions, parsing text and data, fuzzy string matching, record linkage, content analysis.

Tools: Python, Beautiful Soup, Mechanize, Scrapy, Tabula, SQL, MongoDB, data formats (CSV, JSON), Tesseract (OCR), Twitter bots.

Example assignments:

  • Classwork: Design Google Alerts to monitor subjects of interest.
  • Homework: Write a program to scrape the Congressional Record for everything a particular representative has said on the floor of the House.
  • Homework: Write a web scraper in Python and automate it with a cron job.
  • Homework: Build a web app or Twitter bot to post useful information from an API.
  • Group project: Build a sensor network to automatically post temperature or air quality measurements online.
  • Final project: Gather a useful body of data, previously unavailable, and share it publicly.
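
To make the fuzzy string matching and record linkage skills concrete, here is a minimal sketch using Python's standard difflib and re modules. The names, the 0.8 threshold, and the helper functions normalize, similarity, and link_records are all assumptions made for illustration; real record linkage usually needs more fields (address, employer) and manual review of borderline matches.

```python
import re
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase, drop punctuation, and sort tokens so 'Smith, John' equals 'John Smith'."""
    tokens = re.sub(r"[^a-z\s]", " ", name.lower()).split()
    return " ".join(sorted(tokens))


def similarity(a, b):
    """Return a 0-1 similarity ratio between two normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def link_records(names, candidates, threshold=0.8):
    """Map each name to its best-scoring candidate, or None if below the threshold."""
    links = {}
    for name in names:
        best = max(candidates, key=lambda c: similarity(name, c))
        links[name] = best if similarity(name, best) >= threshold else None
    return links


# Hypothetical donor names, invented for illustration.
donors = ["SMITH, JOHN A.", "Maria Garcia-Lopez", "Acme Holdings LLC"]
canonical = ["John A. Smith", "Maria Garcia Lopez", "Robert Jones"]
for donor, match in link_records(donors, canonical).items():
    print(f"{donor!r} -> {match!r}")
```

The threshold is a judgment call: set it too low and different people are merged, too high and variant spellings of one person are missed, so matches near the cutoff deserve a human check.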

Visual Journalism with Data and Computation

Course description: This course covers a range of methods, media, and formats for the graphic presentation of information. Readings should introduce principles of visual design and integrate these into regular assignments. Beginning with a fairly basic program like Tableau, the class should highlight the effective and accurate presentation of information in graphic form. By the middle of the term, students should branch out into using a programming library such as D3 to design their own graphics outside the constraints of existing software.

Course structure: Weekly seminar to discuss readings, followed by hands-on workshop.

Skills: Data visualization, news apps, GIS/mapping for presentation.

Tools: Tableau, JavaScript, D3, QGIS, CartoDB.

Example assignments:

  • Homework: Use Tableau to find a story in a previously unexplored data set.
  • Final project: Create an original visualization or interactive piece programmed by hand (presumably in D3 or even pure JavaScript if it was taught in an earlier class).

Advanced Data Analysis & Journalistic Algorithms

Course description: This course should build upon the core, required classes to bring together data and computation for finding stories and making predictions using algorithmic and computational analysis.

Course structure: Weekly lecture and workshop with regular homework and a final project.

Skills: Python for machine learning, clustering, classifying documents, standardizing and matching algorithms.

Tools: R, Python (Pandas, Matplotlib, SciPy, scikit-learn), clustering and classification algorithms (k-means, k-nearest neighbors), topic modeling algorithms (LDA or NMF).

Example assignments:

  • Classwork: Record linkage for data cleaning. For example, analyze Federal Election Commission data to find top donors, which requires standardizing names, a task well suited to machine learning.
  • Homework: Analyze State of the Union speeches since 1790 to make a visualization of how key topics have changed over time.
  • Homework: Implement clustering to detect outliers in a data set.
  • Final project option: Build an election or market prediction model.
  • Final project option: Reverse engineer a pricing, lending, or credit score algorithm.
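
The clustering-for-outliers homework above could be approached along the following lines. This is a sketch only: it is a small pure-Python k-means written for illustration (a class would more likely use scikit-learn), it seeds the centroids with the first k points so the run is repeatable, and the sample points and the distance cutoff of 3.0 are invented assumptions.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain k-means (Lloyd's algorithm) on a list of equal-length numeric tuples."""
    # Assumption for repeatability: seed the centroids with the first k points.
    centroids = list(points[:k])
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            tuple(sum(axis) / len(c) for axis in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids


def flag_outliers(points, centroids, cutoff):
    """Return points farther than `cutoff` from their nearest centroid."""
    return [p for p in points if min(math.dist(p, c) for c in centroids) > cutoff]


# Hypothetical data, invented for illustration: two tight groups and one stray point.
points = [(1, 1), (1.2, 0.9), (0.8, 1.1), (5, 5), (5.1, 4.9), (4.9, 5.2), (9, 0.5)]
centroids = kmeans(points, k=2)
print(flag_outliers(points, centroids, cutoff=3.0))
```

Note that the stray point pulls its cluster's centroid toward itself, which is a known weakness of k-means for this purpose; the cutoff is a judgment call that students should be asked to justify against the data.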

Advanced Data Visualization

Course description: This course would pick up from the data visualization skills developed in the core course in visual journalism. The technical aspects should be conducted entirely through programming. The most likely tools are JavaScript and D3, but others will certainly emerge. The key point is that data visualization at this level should be programmatic so that the software itself does not limit design possibilities.

Course structure: The course may alternate between seminars (high-level reading, discussion, and analysis of visual communication and information design principles, focusing on where visualization is most effective and where it can mislead) and lab classes (advanced practical instruction in applications and coding frameworks for information design).

Skills: Designing for clarity, precision, impact.

Tools: D3, JavaScript, or another suitable programming framework.

Example assignments:

  • Homework: Regular data assignments in different media: static web, video, interactive.
  • Final project: An original analysis of unexplored data, presented in an original visualization programmed more or less from scratch, with cross-platform consistency.

Advanced Journalistic Mapping

Course description: This course should build on previous coursework in mapping to cover more advanced manipulations of data, to develop a higher degree of design sophistication, and to develop a high level of news judgment in the selection of timely, compelling, and original topics. This involves using GIS technologies, joining spatial data with other information, and using density and other spatial analysis to inform stories rather than only to build presentations.

Course structure: Hands-on workshop and lab.

Skills: Clustering, binning, heat maps, joining different geographic data sets.

Tools: Both GIS analysis software and presentation software, including Esri, QGIS, CartoDB, and Leaflet, as well as sensors like DustDuino.

Example assignments:

  • Homework: Weekly pitches and journalistic mapping assignments.
  • Final project: An interactive map or set of maps telling a story about a timely or unexplored subject, and/or a narrative story using findings from the mapping analysis.
  • Class project: Build a sensor network or otherwise amass an unexplored data set, then work in small groups to build a package of maps to explore the data.

Advanced Journalistic Text Mining

Course description: Text is data. The purpose of this class is to teach journalists to gather, analyze, and present stories using large amounts of textual data. This may build on material from the course “Methods of Collecting Data and Automating Reporting.”

Course structure: Weekly lecture and lab, working toward a final project.

Skills: Web scraping, analyzing large bodies of text, sentiment analysis, topic modeling.

Tools: Overview, DocumentCloud, Natural Language Toolkit (NLTK) or Stanford NLP.

Example assignments:

  • Homework: Use sentiment analysis to reproduce the before-and-after change in tone of a story.
  • Final project: Build a scraper to crawl a significant chunk of the Web, for example, collecting the blogosphere of a country that’s in the news and learning what people are talking about.
  • Final project: Gather and analyze a large body of documents, such as looking for a story in a leaked cache of documents.

Advanced Computational Journalism

Course description: This course should reflect the state of computational tools in journalistic practice while looking toward novel applications of emerging and unexplored tools. By this time, students should have already developed a strong foundation of programming and data analysis skills. This class should build on that foundation and encourage in-depth, independent projects centered on reporting stories or developing a piece of software.

Course structure: Meets twice weekly, once for lecture and once for lab.

Tools: Python, Ruby, or a similarly powerful and versatile scripting language. Additionally, physical computing tools like Arduino.

Example assignments:

  • Homework: Use regular expressions to mine the Congressional Record for a senator’s stated positions on a political issue over the course of his or her career.
  • Mock coding interview: In the common style of interviewing for tech jobs, solve a given programming problem on a whiteboard and narrate your line of thinking.
  • Final project: An in-depth story reported using an advanced tool, including but not limited to a journalistic algorithm, machine learning, analysis of personally gathered data, or development of a piece of software.
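
The regular-expression homework above might start from something like the following sketch, written with Python's standard re module. The sample text is invented and is not actual Congressional Record content; the assumed layout (a title, a surname in capitals, a period, then the remarks on one line) and the names TURN and statements_by are hypothetical and would need adjusting to the real source.

```python
import re

# Hypothetical excerpt, invented for illustration; not actual Congressional Record text.
record = """\
Mr. SMITH. Mr. President, I rise in support of the farm bill.
Ms. JONES. I oppose the amendment to the farm bill.
Mr. SMITH. I cannot support raising the debt ceiling.
"""

# Assumed layout of a speaker turn: title, surname in capitals, a period, then the remarks.
TURN = re.compile(r"^(?:Mr|Ms|Mrs)\. ([A-Z]+)\. (.+)$", re.MULTILINE)


def statements_by(speaker, text, keyword):
    """Return a speaker's remarks that mention `keyword` (case-insensitive)."""
    pattern = re.compile(re.escape(keyword), re.IGNORECASE)
    return [
        remarks
        for name, remarks in TURN.findall(text)
        if name == speaker and pattern.search(remarks)
    ]


for remarks in statements_by("SMITH", record, "farm bill"):
    print(remarks)
```

A keyword match only locates candidate passages; deciding what position a passage actually expresses still requires reading it in context.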

Capstone or Thesis Project

A thesis in data journalism will arise, ideally, from work with instructors and an adviser. It could take the form of a reported story, a technical report, a piece of software, or a substantial design piece such as a map or data visualization.

A capstone project for concentrators in data and computation may take a class of students and coordinate a project using and honing the skills they have developed in their earlier coursework. Each student’s work should then be supplemented with an individual contribution such as a reported piece or data visualization.
