As a Data Reporter on the IndyStar's Investigations team, I led the data-driven reporting process for the IndyStar's investigation of hazardous material transportation accidents in the Midwest.
I gathered data sources, asked research questions of the data, conducted exploratory data analysis and data integrity checks, merged the data with other relevant datasets, and conceptualized and created data visualizations. I fielded data-related questions from my collaborators and also engaged in shoe-leather reporting and interviewing.
This repo includes the final Rmd file that generates the statistics found in the final published project and also generates data that is correctly shaped for visualizations.
I studied a decade's worth of data from the Pipeline and Hazardous Materials Safety Administration (PHMSA) about the movement of hazardous chemicals across Indiana, Illinois, Kentucky, Michigan and Ohio. I also used the Bureau of Economic Analysis GDP Chained Price Index, the National Center for Health Statistics Urban-Rural classification scheme and other urban-rural classifications.
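As an illustration of how a chained price index is typically applied to dollar figures such as accident damages, here is a minimal sketch. It is not the code in the Rmd file, and it is written in Python rather than R; the function name `to_constant_dollars` and all index values and damage amounts are hypothetical placeholders, not figures from the project.

```python
def to_constant_dollars(nominal, index_then, index_base):
    """Rescale a nominal dollar amount into base-year dollars.

    nominal:    amount in the dollars of the year it was recorded
    index_then: price index value for the year it was recorded
    index_base: price index value for the chosen base year
    """
    if index_then <= 0:
        raise ValueError("price index values must be positive")
    return nominal * index_base / index_then


# Hypothetical index values, NOT real published figures.
price_index = {2014: 100.0, 2023: 125.0}

# Hypothetical (year, reported damages) pairs.
damages = [(2014, 1_000_000.0), (2023, 1_000_000.0)]

# Express every amount in 2023 dollars so that years are comparable.
adjusted = [
    to_constant_dollars(amount, price_index[year], price_index[2023])
    for year, amount in damages
]
```

Under these made-up numbers, the 2014 amount grows by a quarter once expressed in 2023 dollars, while the 2023 amount is unchanged.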
Read about the full methodology behind this project here.
-
Challenge 1 | The Scope of the Project: Many of the preliminary decisions I had to make involved defining the scope of the investigation. I learned to balance the competing priorities of feasibility, deadline considerations, comprehensiveness, and relevance to readers. For example, I worked with my editor and spoke to PHMSA about the relevance of pipeline accidents to this project and ultimately decided that the collection methodology and structure of the pipeline accident data did not fit with the data for the other modes of transportation.
-
Challenge 2 | Data Inconsistencies: I found and documented instances of missing and incorrect data through quantitative and qualitative measures. Based on my contextual research and communication with PHMSA, I identified a major error in PHMSA's data concerning damages in the East Palestine train derailment and ensured the information that the IndyStar and collaborators reported was accurate.
-
Challenge 3 | Demographic Information: Based on conversations with sources, I considered it incredibly important to study how communities of different demographics are affected by hazardous material transportation accidents. I tested Census Bureau race and ethnicity data against the PHMSA dataset, but found that the results were inconclusive due to data quality. I also tackled the question of how best to categorize urban versus rural communities. I spoke to subject experts about different categorizations of urban and rural and tested three different categorizations to ensure the results were similar.
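One way to picture the comparison of categorizations described above is to compute the same statistic under each scheme and check how far the results spread. The sketch below is only an illustration, in Python rather than the repo's R; the county names, the three lookup tables, and the `rural_share` helper are all hypothetical and do not reproduce the classifications or results used in the project.

```python
def rural_share(incident_counties, scheme):
    """Fraction of incidents that fall in counties the scheme labels 'rural'."""
    labels = [scheme[county] for county in incident_counties]
    return labels.count("rural") / len(labels)


# Hypothetical incidents, each recorded by the county where it happened.
incidents = ["A", "A", "B", "C"]

# Three hypothetical county-to-label lookups standing in for
# three different urban-rural classification schemes.
schemes = {
    "scheme_1": {"A": "urban", "B": "rural", "C": "rural"},
    "scheme_2": {"A": "urban", "B": "urban", "C": "rural"},
    "scheme_3": {"A": "urban", "B": "rural", "C": "rural"},
}

# Same statistic under each scheme.
shares = {name: rural_share(incidents, lookup) for name, lookup in schemes.items()}

# A small spread suggests the finding does not hinge on the scheme chosen.
spread = max(shares.values()) - min(shares.values())
```

If the spread were large, the choice of classification would be driving the result and would need to be reported alongside it.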
- Huge amounts of hazardous materials pass through the Midwest every day. How safe are you?, USA Today Network
- 6 things to know about the transportation of hazardous materials in the Midwest, IndyStar
- Behind the numbers: Hazmat transportation accidents are on the rise in Midwest, IndyStar
- Norfolk Southern behind 4 of 5 costliest hazmat incidents in Ohio in past decade, The Cincinnati Enquirer
- Michigan's 5 costliest hazmat accidents in the past decade all involved tanker trucks, The Detroit Free Press
- Kentucky's 5 costliest hazardous material transportation incidents of the past decade, The Courier Journal