Machine learning classification model for telco service provider that predicts if a customer will churn or not.

Azie88/ML-Classification-Customer-Churn-Prediction

ML-Classification-Customer-Churn-Prediction 🤖

Customer Churn

A telecom company (Vodafone) wants to find out the likelihood of a customer leaving the company. This project aims to build a classification model that predicts if a customer will churn or not.

Project Overview

In this project, we use supervised machine learning (classification) to explore the significance of churn analytics as a strategic tool for telecommunication companies to proactively identify potential risk factors for churn, optimize retention efforts, and cultivate lasting customer relationships. The project follows the CRISP-DM (Cross-Industry Standard Process for Data Mining) framework to explore and analyze customer churn within the Vodafone network service.

The churn analytics predictive model is a data-driven solution designed to address the persistent challenge of customer churn in subscription-based industries. This model aims to identify customers at risk of churn, enabling businesses to take proactive measures and implement targeted retention strategies.

Project Links 🔗

| Notebook | Published Article | PowerBI Dashboard |
| --- | --- | --- |
| Customer Churn Classification Model | Medium Article | View Dashboard |

Some Tools Used For The Project 🧰

VS Code, pandas, NumPy, Python, Jupyter

Dataset 💾

| Feature Name | Description | Data Type |
| --- | --- | --- |
| customerID | Contains customer ID | categorical |
| gender | Whether the customer is female or male | categorical |
| SeniorCitizen | Whether the customer is a senior citizen or not (1, 0) | numeric, int |
| Partner | Whether the customer has a partner or not (Yes, No) | categorical |
| Dependents | Whether the customer has dependents or not (Yes, No) | categorical |
| tenure | Number of months the customer has stayed with the company | numeric, int |
| PhoneService | Whether the customer has a phone service or not (Yes, No) | categorical |
| MultipleLines | Whether the customer has multiple lines or not (Yes, No, No phone service) | categorical |
| InternetService | Customer’s internet service provider (DSL, Fiber optic, No) | categorical |
| OnlineSecurity | Whether the customer has online security or not (Yes, No, No internet service) | categorical |
| OnlineBackup | Whether the customer has online backup or not (Yes, No, No internet service) | categorical |
| DeviceProtection | Whether the customer has device protection or not (Yes, No, No internet service) | categorical |
| TechSupport | Whether the customer has tech support or not (Yes, No, No internet service) | categorical |
| streamingTV | Whether the customer has streaming TV or not (Yes, No, No internet service) | categorical |
| streamingMovies | Whether the customer has streaming movies or not (Yes, No, No internet service) | categorical |
| Contract | The contract term of the customer (Month-to-month, One year, Two year) | categorical |
| PaperlessBilling | Whether the customer has paperless billing or not (Yes, No) | categorical |
| PaymentMethod | The customer’s payment method (Electronic check, Mailed check, Bank transfer, Credit card) | categorical |
| MonthlyCharges | The amount charged to the customer monthly | numeric, float |
| TotalCharges | The total amount charged to the customer | object |
| Churn | Whether the customer churned or not (Yes or No) | categorical |
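
The table lists TotalCharges as object even though it holds an amount. The snippet below is a minimal sketch of one common way to handle that with pandas. It assumes the column holds numeric strings with occasional blanks, which the source does not confirm. The three rows are made-up illustration data, not project data.

```python
import pandas as pd

# Hypothetical rows using column names from the table above; values are made up.
df = pd.DataFrame(
    {
        "customerID": ["0001-A", "0002-B", "0003-C"],
        "tenure": [1, 0, 24],
        "MonthlyCharges": [29.85, 52.55, 70.70],
        "TotalCharges": ["29.85", " ", "1696.8"],  # stored as text
        "Churn": ["No", "No", "Yes"],
    }
)

# Convert text to numbers; anything unparseable (such as a blank) becomes NaN.
df["TotalCharges"] = pd.to_numeric(df["TotalCharges"], errors="coerce")

# Count rows that failed to parse so they can be inspected before imputing or dropping.
n_missing = int(df["TotalCharges"].isna().sum())

# Encode the target as 0/1 for modelling.
df["Churn"] = df["Churn"].map({"No": 0, "Yes": 1})

print(df.dtypes)
print("unparseable TotalCharges rows:", n_missing)
```

Whether to impute or drop the rows that fail to parse depends on the data; the notebook is the place to check what this project actually does.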

Process

  • Pull data from multiple sources, including a remote SQL Server database

  • Develop the hypothesis and some analytical questions to answer

  • Data preprocessing, cleaning, and EDA (univariate, bivariate, and multivariate analysis)

  • Answering the analytical questions with visualizations

  • Deploying visualizations with PowerBI

  • Balancing the dataset with SMOTE algorithm for oversampling

  • Feature engineering and scaling

  • Model training and Evaluation

  • Hyperparameter Tuning

  • Prediction test and model improvements

  • Conclusion and article writing
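
To illustrate the SMOTE balancing step listed above, here is a simplified NumPy sketch of the core idea: each synthetic minority sample is an interpolation between a real minority sample and one of its nearest minority neighbours. The function name `smote_sketch`, its parameters, and the toy data are all hypothetical. The project may well use a library implementation instead, so treat this as an explanation of the technique rather than the project's code.

```python
import numpy as np


def smote_sketch(X_min, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by interpolating between a
    chosen sample and one of its k nearest minority-class neighbours."""
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    k = min(k, n - 1)  # cannot have more neighbours than other samples

    # Pairwise Euclidean distances within the minority class.
    diffs = X_min[:, None, :] - X_min[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))
    np.fill_diagonal(dists, np.inf)  # a point is not its own neighbour
    neighbours = np.argsort(dists, axis=1)[:, :k]

    base = rng.integers(0, n, size=n_new)   # sample to start from
    pick = rng.integers(0, k, size=n_new)   # which of its neighbours to use
    nb = neighbours[base, pick]
    gap = rng.random((n_new, 1))            # position along the connecting segment
    return X_min[base] + gap * (X_min[nb] - X_min[base])


# Made-up, imbalanced toy data: 40 "stayed" rows vs 10 "churned" rows.
rng = np.random.default_rng(42)
X_major = rng.normal(0.0, 1.0, size=(40, 2))
X_minor = rng.normal(3.0, 1.0, size=(10, 2))

X_new = smote_sketch(X_minor, n_new=len(X_major) - len(X_minor))
X_balanced = np.vstack([X_major, X_minor, X_new])
y_balanced = np.array([0] * len(X_major) + [1] * (len(X_minor) + len(X_new)))
print(np.bincount(y_balanced))
```

A common recommendation is to oversample only the training split, so that evaluation is done on real customers only.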

Model Performance Accuracy 📊

Accuracy Scores of trained models

Power BI Dashboard 📺

Dashboard

View Dashboard

Conclusion and Recommendation

  • Tenure (the number of months the customer has stayed with the company) and Contract (the customer's contract term) are the most important features and are strongly correlated with churn.

  • Vodafone should enhance the early customer experience. Churn is highest among customers in their first 5-10 months of tenure, which suggests that the initial stages are vital. Improving onboarding, service quality, and the handling of customer concerns (including tech support) during this period can enhance customer satisfaction and loyalty.

  • Vodafone should promote long-term contracts. The analysis indicates that customers on month-to-month contracts churn at a significantly higher rate than those on one-year or two-year contracts. Encouraging customers to opt for longer terms through incentives, benefits, and increased tech support can potentially reduce churn and foster customer commitment.

  • Hyperparameter tuning does not always drastically improve model performance.

  • With an 80/20 train/eval split, the random forest model achieved an accuracy of ~86% after hyperparameter tuning.

  • Ensemble methods perform well on classification tasks compared with single classifiers.
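
The 80/20 split and random forest tuning mentioned above can be sketched with scikit-learn as follows. This is an illustration under stated assumptions, not the project's training code: the data is synthetic (generated with `make_classification`), and the parameter grid, class weights, and random seeds are arbitrary choices made for the example. The scores it prints say nothing about the ~86% reported for the real data.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the churn data, with an arbitrary class imbalance.
X, y = make_classification(
    n_samples=1000, n_features=10, n_informative=5,
    weights=[0.7, 0.3], random_state=0,
)

# 80/20 train/eval split, stratified so both parts keep the class ratio.
X_train, X_eval, y_train, y_eval = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Baseline forest with default hyperparameters.
baseline = RandomForestClassifier(random_state=0).fit(X_train, y_train)
baseline_acc = accuracy_score(y_eval, baseline.predict(X_eval))

# Small illustrative grid; the project's actual grid is not given in this README.
param_grid = {"n_estimators": [50, 100], "max_depth": [None, 5]}
search = GridSearchCV(
    RandomForestClassifier(random_state=0), param_grid, cv=3, scoring="accuracy"
)
search.fit(X_train, y_train)
tuned_acc = accuracy_score(y_eval, search.best_estimator_.predict(X_eval))

print(f"baseline accuracy: {baseline_acc:.3f}")
print(f"tuned accuracy:    {tuned_acc:.3f}")
print("best params:", search.best_params_)
```

Comparing the baseline and tuned scores on the same held-out split is one simple way to see how much tuning actually helps on a given dataset.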

How to use this repository 🧐

You need to have Python 3 on your system. Then clone this repo and run the commands below from the repo's root (repository_name> ...):

  1. Clone this repository: git clone https://github.com/Azie88/ML-Classification-Customer-Churn-Prediction
  2. In your IDE's terminal, create a virtual environment and install the required packages for the project:
  • Windows:

      python -m venv venv; 
      venv\Scripts\activate; 
      python -m pip install -q --upgrade pip; 
      python -m pip install -qr requirements.txt  
    
  • Linux & macOS:

      python3 -m venv venv; 
      source venv/bin/activate; 
      python -m pip install -q --upgrade pip; 
      python -m pip install -qr requirements.txt  
    

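The two long command lines have the same structure. They chain several commands with the ; separator so that they run one after the other (on Windows this works in PowerShell), but you can also run the commands manually, one at a time. In order, the commands: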

  • Create the Python virtual environment, which isolates the project's required libraries to avoid conflicts;
  • Activate the virtual environment so that the Python kernel and libraries used are those of the isolated environment;
  • Upgrade pip, the package manager, so that an up-to-date version is used;
  • Install the libraries/packages listed in the requirements.txt file so that they can be imported into the Python script and notebook without any issue.

NB: For macOS users, please install Xcode if you run into an issue.
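
After running the setup commands, a quick check like the one below can confirm that the virtual environment is active and that key packages are importable. It is a small sketch using only the standard library. The helper names are hypothetical, and the package names in `wanted` are assumptions based on the tools listed in this README; requirements.txt remains the authoritative list.

```python
import importlib.util
import sys


def in_virtualenv():
    """True when the interpreter runs inside a venv (prefix differs from base)."""
    return sys.prefix != sys.base_prefix


def missing_packages(names):
    """Return the import names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed package names; adjust to match requirements.txt.
wanted = ["pandas", "numpy"]
print("virtual environment active:", in_virtualenv())
print("missing packages:", missing_packages(wanted))
```

If the first line prints False, activate the environment again before opening the notebook.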

  3. Explore the Jupyter notebook for detailed steps and code execution.
  4. Check out the Power BI dashboard for interactive visualizations.
  5. Read the published article for a comprehensive understanding of the project.

Author ✍️

Andrew Obando

Andrew Obando | LinkedIn Medium


Feel free to star ⭐ this repository if you find it helpful!