Building a clustering model for a Bank Customer Dataset using K-Means clustering
To increase the flow of money into the bank, the digital marketing team wants to know which customer segment has the highest balance percentage. Knowing this helps identify which services best suit that kind of customer. In this model I use K-Means clustering as the main algorithm.
Requirements: numpy, pandas, matplotlib, seaborn, scikit-learn (sklearn), yellowbrick
In this model I use three-level clustering. The first level separates the balance outliers from the non-outliers, an analysis aided by a boxplot.
From the boxplot above, I separated the data into two clusters: avg_balance and high_balance.
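The first-level split can be sketched as follows. This is a minimal illustration, not the project's actual code: it assumes the cut-off is the standard upper boxplot whisker (Q3 + 1.5 × IQR), which the text above does not state explicitly, and it uses synthetic balances in place of the real dataset. The helper name `split_by_boxplot_rule` is hypothetical.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the bank data (assumption: the real dataset is not reproduced here).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "balance": np.concatenate([
        rng.normal(1000, 300, 95),    # typical balances
        rng.normal(20000, 2000, 5),   # a few very high balances
    ])
})

def split_by_boxplot_rule(data, column="balance"):
    """Split rows at the upper boxplot whisker, Q3 + 1.5 * IQR (assumed threshold)."""
    q1 = data[column].quantile(0.25)
    q3 = data[column].quantile(0.75)
    upper_whisker = q3 + 1.5 * (q3 - q1)
    is_outlier = data[column] > upper_whisker
    # Non-outliers first, then the high-balance outliers.
    return data[~is_outlier].copy(), data[is_outlier].copy()

avg_balance, high_balance = split_by_boxplot_rule(df)
print(len(avg_balance), "average-balance rows,", len(high_balance), "high-balance rows")
```

Only the upper whisker is used here because the two groups are named avg_balance and high_balance; if low outliers also matter, the same rule applies with Q1 − 1.5 × IQR.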
The second level applies the K-Means clustering algorithm on "balance", "age", and "duration" within each cluster from the first level. This is the result of the second clustering.
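A rough sketch of this second level is shown below. Several details are assumptions rather than facts from the project: the features are standardized before clustering, the number of clusters is fixed at 3, and the data is synthetic. The helper name `add_kmeans_labels` is hypothetical. In practice, yellowbrick (listed in the requirements) offers an elbow visualizer that could be used to choose the number of clusters instead of fixing it.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

FEATURES = ["balance", "age", "duration"]

def add_kmeans_labels(data, n_clusters=3, random_state=0):
    """Return a copy of `data` with a 'cluster' column from K-Means on FEATURES."""
    # Assumption: scale features so that balance does not dominate the distance.
    scaled = StandardScaler().fit_transform(data[FEATURES])
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=random_state)
    labelled = data.copy()
    labelled["cluster"] = model.fit_predict(scaled)
    return labelled

# Synthetic stand-in for one first-level group (for example avg_balance).
rng = np.random.default_rng(1)
avg_balance = pd.DataFrame({
    "balance": rng.normal(1000, 300, 200),
    "age": rng.integers(20, 65, 200),
    "duration": rng.integers(30, 900, 200),
})

avg_clustered = add_kmeans_labels(avg_balance)
# Per-cluster feature means give a quick profile of each cluster.
print(avg_clustered.groupby("cluster")[FEATURES].mean())
```

The same function would be called once per first-level group, so avg_balance and high_balance each get their own set of clusters.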
Once the main clusters are known, the last level clusters them again based on 'job', 'marital', and 'education'. The segment with the highest balance can be seen in the table below (high-balance data).
The average-balance data can also be seen in the table below.
From the two tables above, we can conclude that both segments are in their early thirties, single, have tertiary-level education, and work in the management field. With that information, the digital marketing team can determine the best service for them.
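The third level can be approximated with a simple frequency count. The sketch below is one possible reading of that step, not necessarily how the project computes it: it picks the K-Means cluster with the highest mean balance and counts the ('job', 'marital', 'education') combinations inside it. All rows are hypothetical example data.

```python
import pandas as pd

# Hypothetical rows that already carry a second-level 'cluster' label.
customers = pd.DataFrame({
    "cluster":   [0, 0, 0, 1, 1, 1, 1],
    "balance":   [500, 700, 650, 9000, 9500, 8800, 9100],
    "job":       ["services", "technician", "services",
                  "management", "management", "technician", "management"],
    "marital":   ["married", "single", "married",
                  "single", "single", "married", "single"],
    "education": ["secondary", "secondary", "primary",
                  "tertiary", "tertiary", "secondary", "tertiary"],
})

# Cluster whose customers hold the highest mean balance.
top_cluster = customers.groupby("cluster")["balance"].mean().idxmax()

# Count each (job, marital, education) combination inside that cluster.
segment_counts = (
    customers[customers["cluster"] == top_cluster]
    .groupby(["job", "marital", "education"])
    .size()
    .sort_values(ascending=False)
)

top_segment = segment_counts.index[0]
print(segment_counts)
print("Most common segment:", top_segment)
```

Running the same count on the high_balance and avg_balance groups separately would produce one table per group, matching the two tables described above.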
For the complete code, you can see it here