This project uses data to support informed business decisions. It combines techniques from the Data Science and Software Quality disciplines to analyze data, generate insights, and define actionable metrics. The data comes from Olist, a Brazilian e-commerce platform.
- Extracted data from Olist's dataset provided in CSV format.
- Cleaned and preprocessed the data to ensure accuracy and consistency.
- Used the database diagram to understand the data's relationships and structure.
- Conducted exploratory data analysis (EDA) to gain insights into the dataset's characteristics and trends.
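
The extraction and cleaning steps above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the project's actual code: the helper names `load_csv_tables` and `clean_rows` are hypothetical, and the cleaning rules (trim whitespace, treat empty strings as missing, drop exact duplicate rows) are assumptions about what "cleaned and preprocessed" might involve.

```python
import csv
import io
import zipfile


def load_csv_tables(archive):
    """Read every .csv member of a zip archive into a list of dict rows.

    `archive` may be a file path or a binary file-like object.
    """
    tables = {}
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            if not name.lower().endswith(".csv"):
                continue  # skip anything that is not a CSV file
            with zf.open(name) as raw:
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                tables[name] = list(csv.DictReader(text))
    return tables


def clean_rows(rows):
    """Trim whitespace, map empty strings to None, and drop exact duplicates.

    Assumes well-formed rows (no more fields than header columns).
    """
    seen = set()
    cleaned = []
    for row in rows:
        normalized = {
            key: (value.strip() or None) if value is not None else None
            for key, value in row.items()
        }
        fingerprint = tuple(normalized.items())
        if fingerprint in seen:
            continue  # exact duplicate of an earlier row
        seen.add(fingerprint)
        cleaned.append(normalized)
    return cleaned
```

With the archive listed in this README, a call might look like `tables = load_csv_tables("olist_data.zip")`, followed by `clean_rows(...)` on each table. The names of the CSV files inside the archive are not specified here, so the sketch simply loads every CSV it finds.
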
- Identified patterns, outliers, and correlations within the data.
- Utilized visualizations such as graphs and tables to present findings.
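
As a rough illustration of the outlier and correlation checks mentioned above, the sketch below flags outliers with the interquartile-range (IQR) rule and computes a Pearson correlation coefficient. Both techniques are assumptions chosen for the example; the README does not state which methods the project used, and the function names are hypothetical.

```python
import math
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (dev_x * dev_y)
```

The IQR rule is a common default because it does not assume normally distributed data, which is often a reasonable precaution for skewed quantities such as prices or delivery times.
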
- Identified five key decisions that a company could make based on the data.
- Each decision was supported by relevant analysis, graphs, and tables.
- Metrics were defined for tracking and evaluating the outcomes of each decision.
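
To make the idea of a trackable metric concrete, here is a minimal sketch of one possible metric: an on-time delivery rate. The metric itself, the function name, and the keys `delivered_at` and `estimated_at` are hypothetical and chosen only for illustration; the project's five decisions and their actual metrics are described in the code and notebook.

```python
from datetime import datetime


def on_time_delivery_rate(orders, delivered_key="delivered_at",
                          estimated_key="estimated_at"):
    """Share of delivered orders that arrived on or before the estimate.

    `orders` is an iterable of dicts holding ISO-formatted timestamps.
    Orders missing either timestamp are excluded from the calculation.
    Returns None when no order has both timestamps.
    """
    delivered = 0
    on_time = 0
    for order in orders:
        actual = order.get(delivered_key)
        estimate = order.get(estimated_key)
        if not actual or not estimate:
            continue  # not delivered yet, or no estimate recorded
        delivered += 1
        if datetime.fromisoformat(actual) <= datetime.fromisoformat(estimate):
            on_time += 1
    return on_time / delivered if delivered else None
```

A metric like this can be recomputed per month or per seller to check whether a decision is having the intended effect over time.
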
- Ensured the quality of the information presented in the project.
- Evaluated the accuracy, completeness, and reliability of data.
- Implemented best practices in data analysis and visualization.
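
One way to approach the completeness and reliability checks above is sketched below. It is an assumption about how such checks could look, not the project's implementation: `completeness_report` gives the share of non-missing values per column, and `duplicate_keys` lists identifier values that occur more than once. Both names are hypothetical.

```python
from collections import Counter


def completeness_report(rows, columns):
    """Return the fraction of non-missing values for each column."""
    total = len(rows)
    report = {}
    for column in columns:
        filled = sum(1 for row in rows if row.get(column) not in (None, ""))
        report[column] = filled / total if total else 0.0
    return report


def duplicate_keys(rows, key):
    """Return the values of `key` that appear in more than one row."""
    counts = Counter(row.get(key) for row in rows)
    return sorted(value for value, n in counts.items()
                  if n > 1 and value is not None)
```

Columns with low completeness, or identifier columns with duplicates, are natural candidates for a closer look before any conclusions are drawn from them.
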
- Documented the entire process including data preprocessing, analysis, and decision-making steps.
- Provided clear explanations for each decision and associated metrics.
- Included instructions for replicating the project on other datasets.
- Code: Available in .py and .ipynb formats (if possible, view the .ipynb version for better project structure and visualization).
- olist_data.zip: Compressed file containing the dataset in CSV format.
- Database Diagram: Visual representation of the dataset's structure and relationships.
- The project successfully demonstrates the application of data science techniques in making data-driven decisions for a company. By analyzing Olist's dataset, valuable insights were obtained, leading to actionable decisions. The inclusion of metrics and quality assessment supports the reliability and effectiveness of the decision-making process. This project serves as a valuable resource for understanding the importance of data-driven decision-making and its implementation in real-world scenarios.