The Canada Energy Regulator (CER) publishes a pipeline incidents dataset quarterly, covering 2008 to the present. As with any dataset, it required some cleaning, so I wrote a Python script that parsed the entire file and removed unnecessary columns. I later repeated the same cleaning step, more efficiently, in an Alteryx workflow.
The Open Government website provides access to the dataset, which can be viewed here.
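To give a rough idea of what the Python cleaning step looks like, here is a minimal sketch using only the standard library. It is not my original script: the column names in `KEEP_COLUMNS` and the function name `clean_incidents` are made up for illustration, and the real CER file uses its own headers, so adjust them to match the download.

```python
import csv

# Hypothetical column names for illustration only; replace them with the
# headers that actually appear in the CER incidents file.
KEEP_COLUMNS = ["Incident Number", "Reported Date", "Province", "Substance"]


def clean_incidents(source, destination, keep=KEEP_COLUMNS):
    """Copy only the `keep` columns from source to destination.

    `source` and `destination` are open text file objects. Rows in which
    every kept column is empty are skipped. Returns the number of rows written.
    """
    reader = csv.DictReader(source)
    writer = csv.DictWriter(destination, fieldnames=keep)
    writer.writeheader()

    written = 0
    for row in reader:
        # Missing fields come back as None, so normalise them to "".
        trimmed = {col: (row.get(col) or "").strip() for col in keep}
        if not any(trimmed.values()):
            continue  # nothing useful in this row
        writer.writerow(trimmed)
        written += 1
    return written
```

To run it against a downloaded file, open both files with `newline=""` (as the `csv` module recommends) and pass them in, for example `clean_incidents(open("incidents.csv", newline=""), open("incidents_clean.csv", "w", newline=""))`, where both file names are placeholders.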
Alteryx provides an intuitive way of analyzing data (intuitive for me, at least, because I have an engineering background). The data stream, called a workflow in Alteryx, provides an efficient way of importing, cleaning, and exporting data: Extract, Transform, and Load (ETL) all in one tool!
Using Alteryx saved a massive amount of data preparation time: about 90%!
The Alteryx workflow is shown below.
Below is a snapshot of the dashboard I recently uploaded to Tableau Public. It can be accessed here.