fannykassapian/metro-traffic-data-analysis

Paris metro traffic

Data cleaning, analysis and visualization of Paris metro traffic with Python, Pandas, Matplotlib, iPyLeaflet & kepler.gl.

Example: Hourly traffic visualized with kepler.gl

Source:

Île-de-France Mobilités, formerly STIF, is the organising authority that controls and coordinates the transport companies operating the public transport network of Paris and the Île-de-France region.

Since 2016, the STIF has made some of its raw data available through an open data portal.

The STIF operates both a road network (bus) and a rail network (train, metro, RER, funicular).

This analysis focuses on the rail network and, more specifically, on metro stations within Paris.

Datasets:

The STIF provides the following data about the rail network:

  • Daily traffic per stop (number of check-ins per day and per ticket type)
  • Hourly profiles per stop (traffic distribution per hour of a typical day)
  • Geographical coordinates (arranged by stop or by line of transport)
  • Repositories of all stops (arranged by stop or by line of transport)
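Because the hourly profiles describe how a typical day's traffic is distributed across hours, they can be combined with the daily check-in counts to estimate traffic per hour. The snippet below is a minimal sketch of that idea with Pandas. The column names (`stop`, `checkins`, `hour`, `pct`), the hour labels and the sample values are illustrative assumptions, not the actual schema or contents of the STIF datasets.

```python
import pandas as pd

# Hypothetical daily totals: number of check-ins per stop for one day.
daily = pd.DataFrame({
    "stop": ["Stop A", "Stop B"],
    "checkins": [1000, 400],
})

# Hypothetical hourly profile: share (in %) of a typical day's traffic
# that falls in each hourly slot, per stop.
profile = pd.DataFrame({
    "stop": ["Stop A", "Stop A", "Stop B", "Stop B"],
    "hour": ["08H-09H", "18H-19H", "08H-09H", "18H-19H"],
    "pct": [60.0, 40.0, 25.0, 75.0],
})

# Attach each stop's daily total to its hourly rows, then scale by the share.
hourly = profile.merge(daily, on="stop", how="left")
hourly["estimated_checkins"] = hourly["checkins"] * hourly["pct"] / 100
```

The result is an estimate only: a profile describes a typical day, so it will not reproduce the real hourly counts of any particular date.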

Data about daily traffic and hourly profiles is available for the years 2015 through 2018.

We will focus our analysis on the year 2018.

For the year 2018, data is split across two datasets, corresponding to the first and second halves of the year.

Below are the links to the aforementioned datasets:
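Since the 2018 data comes in two parts, the two datasets have to be stacked before any full-year analysis. Here is a minimal sketch of that step with Pandas, using small in-memory frames in place of the downloaded files. The column names (`day`, `stop`, `checkins`) and the values are assumptions made for illustration; with the real files, each frame would typically be loaded with `pd.read_csv`.

```python
import pandas as pd

# Stand-ins for the two half-year datasets (hypothetical columns and values).
first_half = pd.DataFrame({
    "day": ["2018-01-01", "2018-01-02"],
    "stop": ["Stop A", "Stop A"],
    "checkins": [120, 150],
})
second_half = pd.DataFrame({
    "day": ["2018-07-01"],
    "stop": ["Stop A"],
    "checkins": [90],
})

# Stack both halves into one frame and rebuild a clean 0..n-1 index.
year_2018 = pd.concat([first_half, second_half], ignore_index=True)

# Parse the dates so they can be filtered or resampled later.
year_2018["day"] = pd.to_datetime(year_2018["day"])

# Example aggregate: total check-ins per stop over the whole year.
total_per_stop = year_2018.groupby("stop")["checkins"].sum()
```

Passing `ignore_index=True` avoids duplicate index labels, which would otherwise carry over from the two source frames.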

Purpose:

  • Clean & wrangle available data to visualize & analyze the traffic within the Paris metro network in 2018.
  • Use Matplotlib, iPyLeaflet & kepler.gl to produce visualizations.

Example: Traffic visualized with iPyLeaflet

Notebooks:

This project consists of 4 separate notebooks:

  • Notebook 1/4: General introduction and initial data exploration
  • Notebook 2/4: Data cleaning (daily traffic and hourly profiles)
  • Notebook 3/4: Data cleaning (repository and geographical coordinates)
  • Notebook 4/4: Analysis and visualization
