Relevant training course: Data Scientist
Level of difficulty: 07/10
The objective of this project is to predict the severity of road accidents in France, based on historical data. The problem is well suited to working through all the stages of a Data Science project. The first step is to study and apply methods to clean the dataset. Once the dataset is clean, the second step is to extract from the accident history the features that seem relevant for estimating accident severity. Then, from these results, the objective is to score risk zones according to meteorological information and geographical location (GPS coordinates, satellite images, …). Once the model has been trained, we will compare its predictions with the historical data.
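As a rough illustration of the stages described above (cleaning, then scoring zones by location and weather), here is a minimal sketch on toy data. The field names (`lat`, `lon`, `weather`, `severity`), the ordinal 1 to 4 severity scale, and the grid-rounding approach are all assumptions made for this example; they are not the schema of the actual dataset, and a real project would likely replace the simple average with a trained model.

```python
from collections import defaultdict

# Toy records. Field names and values are hypothetical, not the real schema.
# "severity" is assumed to be ordinal here: higher means more severe.
RAW = [
    {"lat": 48.82, "lon": 2.33, "weather": "rain", "severity": 3},
    {"lat": 48.84, "lon": 2.34, "weather": "rain", "severity": 1},
    {"lat": None, "lon": 2.35, "weather": "clear", "severity": 2},      # missing coordinate
    {"lat": 43.30, "lon": 5.37, "weather": "clear", "severity": 4},
    {"lat": 43.31, "lon": 5.38, "weather": "clear", "severity": None},  # missing label
]


def clean(records):
    """Cleaning step: keep only records in which every field is present."""
    return [r for r in records if all(v is not None for v in r.values())]


def zone(record, precision=1):
    """Map GPS coordinates to a coarse grid cell by rounding (an assumed zoning scheme)."""
    return (round(record["lat"], precision), round(record["lon"], precision))


def risk_scores(records):
    """Scoring step: mean severity for each (zone, weather) pair."""
    totals = defaultdict(lambda: [0, 0])  # key -> [severity sum, record count]
    for r in records:
        key = (zone(r), r["weather"])
        totals[key][0] += r["severity"]
        totals[key][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


scores = risk_scores(clean(RAW))
for (cell, weather), score in sorted(scores.items()):
    print(cell, weather, score)
```

Averaging severity per grid cell is only a baseline; it gives a reference point against which a trained model's predictions can later be compared.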
https://www.kaggle.com/ahmedlahlou/accidents-in-france-from-2005-to-2016
Annual databases of road traffic injury accidents, years 2005 to 2020 ("Bases de données annuelles des accidents corporels de la circulation routière - Années de 2005 à 2020"), data.gouv.fr
- an exploration, data visualization, and data pre-processing report;
- a modeling report;
- a final report and associated GitHub.
Link to document: https://docs.google.com/document/d/1m2ibEY6n6zcnqqxuJWyTgvIjmhyGQADBCmZThVQnQpA/edit