Fall 2015
this website: http://github.com/sarahcnyt/data-journalism
Sarah Cohen / sarah.cohen@columbia.edu / @sarahcnyt (Thursday, 6-9pm)
Rob Gebeloff / rgebeloff@nytimes.com (Wednesday, 6-9pm) : SCHEDULE
## Description
This is a five-week class in the branch of data journalism dedicated to reporting, particularly public affairs and investigative reporting. Maurice Tamman of Reuters has dubbed this specialty the "empirical spine" of investigative reporting, in the tradition of computer-assisted reporting and precision journalism.
Using records in electronic form gives you a powerful way to document patterns and find stories that no one will provide for you.
This five-week skills class will give you a grounding in the basic skills needed to take advantage of electronic records and data, and, we hope, the confidence to tackle new tools and techniques.
There are two sections of the class: one with Sarah Cohen, the other with Rob Gebeloff. Check each instructor's schedule for details on what is happening each day. This repository is a shared resource for both sections.
Generally, we will cover:
- Week 1: Excel refresher; understanding the power of data; using Excel as a reporting tool.
- Week 2: Newsroom math; using Excel for basic newsroom analysis.
- Week 3: Using Excel as a database: pivot tables and filtering; demo of database programs and their power.
- Week 4: Getting data into Excel: text, PDFs, and web scraping.
- Week 5: Basic visualization and mapping with Fusion Tables; more advanced data cleanup with OpenRefine.
We will supply most of the handouts and materials you need for this class. We expect you to have access to a computer with Microsoft Excel (the actual program, not Google Sheets) and to a computer on which you have permission to install software.
We'll provide links to resources each week. Optionally, you can purchase:
- Computer-Assisted Reporting: A Practical Guide, by Brant Houston
- Numbers in the Newsroom: Using Math and Statistics in News, by Sarah Cohen
- Scraping for Journalists, by Paul Bradshaw (Most of the specific tools he writes about have changed or disappeared since this book was written, but the general advice is still quite useful.)
- The Data Journalism Handbook, edited by Jonathan Gray
Most of the work for this session will be done during class. We'll ask you to review some materials and practice on your own during the week to the extent you feel you need it. Everyone starts this class with a different level of comfort, and some may need a little more time than others to master the concepts and techniques.
At the end of the course, we'll ask you to write a 200-word story memo based on your analysis of a dataset. We will provide suggested datasets, or you can analyze one of your own.