A telecom company (Vodafone) wants to find out the likelihood of a customer leaving the company. This project aims to build a classification model that predicts if a customer will churn or not.
In this project, we use supervised machine learning (classification) to explore the significance of churn analytics as a strategic tool for telecommunication companies to proactively identify potential risk factors for churn, optimize retention efforts, and cultivate lasting customer relationships. The project follows the CRISP-DM (Cross-Industry Standard Process for Data Mining) framework to explore and analyze customer churn within the Vodafone network service.
The churn analytics predictive model is a data-driven solution designed to address the persistent challenge of customer churn in subscription-based industries. This model aims to identify customers at risk of churn, enabling businesses to take proactive measures and implement targeted retention strategies.
- Project Overview
- Project Links
- Some Tools Used For The Project
- Dataset
- Process
- Model Performance
- Dashboard
- Conclusion and Recommendation
- How to use this repository
- Author
Notebook | Published Article | PowerBI Dashboard |
---|---|---|
Customer Churn Classification Model | Medium Article | View Dashboard |
Feature Name | Description | Data Type |
---|---|---|
customerID | Contains customer ID | categorical |
gender | Whether the customer is female or male | categorical |
SeniorCitizen | Whether the customer is a senior citizen or not (1, 0) | numeric, int |
Partner | Whether the customer has a partner or not (Yes, No) | categorical |
Dependents | Whether the customer has dependents or not (Yes, No) | categorical |
tenure | Number of months the customer has stayed with the company | numeric, int |
PhoneService | Whether the customer has a phone service or not (Yes, No) | categorical |
MultipleLines | Whether the customer has multiple lines or not (Yes, No, No phone service) | categorical |
InternetService | Customer’s internet service provider (DSL, Fiber optic, No) | categorical |
OnlineSecurity | Whether the customer has online security or not (Yes, No, No internet service) | categorical |
OnlineBackup | Whether the customer has online backup or not (Yes, No, No internet service) | categorical |
DeviceProtection | Whether the customer has device protection or not (Yes, No, No internet service) | categorical |
TechSupport | Whether the customer has tech support or not (Yes, No, No internet service) | categorical |
streamingTV | Whether the customer has streaming TV or not (Yes, No, No internet service) | categorical |
streamingMovies | Whether the customer has streaming movies or not (Yes, No, No internet service) | categorical |
Contract | The contract term of the customer (Month-to-month, One year, Two year) | categorical |
PaperlessBilling | Whether the customer has paperless billing or not (Yes, No) | categorical |
PaymentMethod | The customer’s payment method (Electronic check, Mailed check, Bank transfer, Credit card) | categorical |
MonthlyCharges | The amount charged to the customer monthly | numeric, int |
TotalCharges | The total amount charged to the customer | object |
Churn | Whether the customer churned or not (Yes or No) | categorical |
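
`TotalCharges` is listed as `object` rather than numeric, which usually means the column holds text that has to be converted before modelling. Below is a minimal, standard-library sketch of that conversion. It assumes (the source does not confirm this) that blank or non-numeric entries are what prevent a numeric type; the helper name `to_float_or_none` and the sample rows are hypothetical. With pandas, `pd.to_numeric(..., errors="coerce")` performs a comparable conversion.

```python
from typing import Optional


def to_float_or_none(raw: str) -> Optional[float]:
    """Convert a text cell to float; return None when it is blank or not a number."""
    text = raw.strip()
    if not text:
        return None
    try:
        return float(text)
    except ValueError:
        return None


# Made-up rows for illustration only (not taken from the project dataset).
rows = [
    {"customerID": "A-001", "tenure": "12", "TotalCharges": "358.20"},
    {"customerID": "A-002", "tenure": "0", "TotalCharges": " "},
]

for row in rows:
    row["TotalCharges"] = to_float_or_none(row["TotalCharges"])

# Customers whose TotalCharges could not be parsed and need imputing or dropping.
missing = [row["customerID"] for row in rows if row["TotalCharges"] is None]
print(missing)
```

Rows flagged in `missing` can then be imputed or dropped, depending on what the analysis calls for.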
- Pull data from multiple sources, including a remote SQL Server database
- Develop the hypothesis and some analytical questions to answer
- Data preprocessing and cleaning, and EDA (univariate, bivariate, and multivariate analysis)
- Answering the analytical questions with visualizations
- Deploying visualizations with Power BI
- Balancing the dataset with the SMOTE algorithm for oversampling
- Feature engineering and scaling
- Model training and evaluation
- Hyperparameter tuning
- Prediction test and model improvements
- Conclusion and article writing
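
The oversampling step in the list above can be illustrated with a simplified, dependency-free sketch of the idea behind SMOTE: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbours. This is neither the project's code nor the imbalanced-learn implementation; the function name `smote_like_oversample`, its parameters, and the toy points are assumptions made for illustration.

```python
import math
import random
from typing import List, Sequence

Point = Sequence[float]


def smote_like_oversample(
    minority: List[Point], n_new: int, k: int = 2, seed: int = 0
) -> List[List[float]]:
    """Create n_new synthetic points by interpolating between minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself).
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Random position on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy, made-up feature vectors standing in for scaled minority-class rows.
churned = [[0.1, 0.9], [0.2, 0.8], [0.15, 0.7]]
new_points = smote_like_oversample(churned, n_new=4)
print(len(new_points))
```

A common precaution is to oversample only the training portion of the data, so that synthetic points do not end up in the evaluation set.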
Accuracy Scores of trained models |
---|
- Tenure (the number of months the customer has stayed with the company) and contract (the customer's contract term) are the most important features and correlate strongly with customer churn.
- Vodafone should enhance the early customer experience: churn is higher during the first 5-10 months of tenure, suggesting that customer experience in the initial stages is vital. Improving onboarding processes and service quality, and addressing customer concerns with tech support during this crucial period, can enhance customer satisfaction and loyalty.
- Vodafone should promote long-term contracts, since the analysis indicates that customers with month-to-month contracts have a significantly higher churn rate than those with one-year or two-year contracts. Encouraging customers to opt for longer-term contracts through incentives, benefits, and increased tech support can potentially reduce churn rates and foster customer commitment.
- Hyperparameter tuning does not always drastically improve model performance.
- With an 80/20 train/eval split, the random forest model achieved an accuracy of ~86% after hyperparameter tuning.
- Ensemble methods perform well on classification tasks compared to single classifiers.
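
The 80/20 train/eval split and the accuracy metric mentioned above can be sketched without any libraries. This is an illustration under stated assumptions rather than the notebook's code: the helper names `stratified_split` and `accuracy`, the seed, and the toy labels are hypothetical, and the split is stratified here so that both parts keep roughly the same churn ratio.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def stratified_split(
    labels: Sequence[str], eval_fraction: float = 0.2, seed: int = 42
) -> Tuple[List[int], List[int]]:
    """Return (train_indices, eval_indices), splitting each class separately."""
    rng = random.Random(seed)
    by_class: Dict[str, List[int]] = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    train, evaluation = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_eval = round(len(indices) * eval_fraction)
        evaluation.extend(indices[:n_eval])
        train.extend(indices[n_eval:])
    return sorted(train), sorted(evaluation)


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Toy labels, made up for illustration: 30 "No" and 10 "Yes".
labels = ["No"] * 30 + ["Yes"] * 10
train_idx, eval_idx = stratified_split(labels)
print(len(train_idx), len(eval_idx))

# Three of four made-up predictions match the true labels.
print(accuracy(["Yes", "No", "No", "No"], ["Yes", "No", "Yes", "No"]))
```

Splitting each class separately keeps the minority "Yes" class represented in the evaluation set, which matters when churners are a small share of the data.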
You need to have Python 3 installed on your system. Then clone this repo and, from the repo's root (`repository_name> ...`), follow these steps:
- Clone this repository:
git clone https://github.com/Azie88/ML-Classification-Customer-Churn-Prediction
- In your IDE, create a virtual environment and install the required packages for the project:
- Windows:
python -m venv venv; venv\Scripts\activate; python -m pip install -q --upgrade pip; python -m pip install -qr requirements.txt
- Linux & macOS:
python3 -m venv venv; source venv/bin/activate; python -m pip install -q --upgrade pip; python -m pip install -qr requirements.txt
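The two long command lines have the same structure. They chain multiple commands with the `;` symbol, which runs them one after another, but you can also execute them manually one at a time. On Windows, `;` works as a separator in PowerShell; in the classic Command Prompt, use `&` instead. Each command line does the following: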
- Create the Python virtual environment, which isolates the project's required libraries to avoid conflicts;
- Activate the virtual environment so that the Python kernel and libraries are those of the isolated environment;
- Upgrade pip, the package manager, to an up-to-date version that will work correctly;
- Install the required libraries/packages listed in the `requirements.txt` file so that they can be imported into the Python script and notebook without any issue.
NB: macOS users, please install Xcode if you have an issue.
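
After activating the environment, a short check like the one below can confirm that Python is running from the virtual environment and that the packages can be found. It is only a sketch: the module names in `wanted` are placeholders, since the contents of `requirements.txt` are not listed here, and the helper names `in_virtualenv` and `check_modules` are hypothetical.

```python
import importlib.util
import sys
from typing import Dict, Iterable


def in_virtualenv() -> bool:
    """True when the interpreter runs inside a venv (prefix differs from base)."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def check_modules(names: Iterable[str]) -> Dict[str, bool]:
    """Map each top-level module name to whether it can be found for import."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


# Placeholder names: replace with top-level modules from requirements.txt.
wanted = ["json", "a_module_that_is_not_installed"]
print("virtual environment active:", in_virtualenv())
print(check_modules(wanted))
```

Any module reported as `False` points to an installation step that did not complete inside the active environment.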
- Explore the Jupyter notebook for detailed steps and code execution.
- Check out the Power BI dashboard for interactive visualizations.
- Read the published article for a comprehensive understanding of the project.
Andrew Obando
Feel free to star ⭐ this repository if you find it helpful!