This repository contains a Jupyter Notebook for a technical test focusing on NLP (Natural Language Processing) and data manipulation, specifically tailored for e-commerce data analysis.
- Data Preprocessing 🔄: Importing libraries, reading data, renaming columns, and date conversion.
- Dimension & Color Extraction 📏🎨: Functions to extract dimensions and colors from product descriptions.
- Categorization Correction 🏷️: Algorithms to check and correct product categorization.
- Data Analysis 📈: Visualization and statistics of the processed data.
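As a rough illustration of the dimension and color extraction step listed above, here is a minimal sketch using only Python's standard `re` module. It is not the notebook's actual implementation: the function names `extract_dimensions` and `extract_colors`, the `KNOWN_COLORS` vocabulary, and the dimension pattern are all hypothetical and chosen for this example.

```python
import re

# Hypothetical colour vocabulary (an assumption; the notebook may use a different list).
KNOWN_COLORS = ("black", "white", "red", "blue", "green", "grey")

# Illustrative pattern for sizes such as "120x60 cm" or "30 x 40 x 10 mm".
DIMENSION_RE = re.compile(
    r"(\d+(?:\.\d+)?(?:\s*x\s*\d+(?:\.\d+)?){1,2})\s*(mm|cm|m)\b",
    re.IGNORECASE,
)


def extract_dimensions(description):
    """Return (values, unit) for the first dimension pattern found, or None."""
    match = DIMENSION_RE.search(description)
    if match is None:
        return None
    # Split "120x60" or "30 x 40 x 10" into its numeric parts.
    parts = re.split(r"\s*x\s*", match.group(1), flags=re.IGNORECASE)
    return [float(p) for p in parts], match.group(2).lower()


def extract_colors(description):
    """Return the known colours that appear as whole words, in vocabulary order."""
    text = description.lower()
    return [c for c in KNOWN_COLORS if re.search(rf"\b{c}\b", text)]


if __name__ == "__main__":
    sample = "Oak table 120x60 cm, black legs"
    print(extract_dimensions(sample))
    print(extract_colors(sample))
```

Matching colors on word boundaries keeps a description such as "blackboard" from being tagged as black. A real product catalogue would likely need a larger vocabulary and more unit formats.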
Follow these steps to run the notebook locally:
- Clone the Repository: run `git clone https://github.com/labrijisaad/Technical-Test-NLP-Category-Correction.git`.
- Set Up the Environment: run `make setup` to create a virtual environment and install dependencies.
- Launch Jupyter Lab: execute `make jupyter` to activate the virtual environment and start Jupyter Lab.
- Navigate to the Notebook: open the `/notebooks` directory and run the Jupyter Notebook to explore the data.
Your contributions are welcome! Check out the issues page.