Market Segmentation: A Case Study

What is market segmentation?

Market segmentation is the practice of grouping similar customers together. By segmenting their customers, companies can target each group separately, so customers are offered products that they want or need.

Overall, market segmentation allows a company to save money: its marketing efforts are distributed methodically instead of being scattered all over the place.

Hypothetical Case Study:

A credit card company wants to deliver an ad campaign to gain more card holders.


We have worked with this client in the past. Although the company gained customers from our last ad campaign, it has noticed that customers are slowly leaving. This time, we want to focus the ad campaign on customers who are more likely to stay.

-----

The data used for this case study was retrieved from kaggle.com.

"This dataset consists of 10,000 customers mentioning their age, salary, marital_status, credit card limit, credit card category, etc. There are nearly 18 features."

-----
Explore the Data

Figure 1: We have lost 16.07% of our customers
We do not want to get customers that will leave. We want to focus our ad campaign on people who are similar to our existing customers.


Figure 2: The age of our customers follows a normal distribution
The age of the customers varies from 26 to 73 years. Most of the customers are 40-52 years old, with a median of 46 years and an average of 46.32 years.


Figure 3: A semi-even distribution between males and females
We would expect 50% males and 50% females; however, we have 5% more females than males.
Ttest_1sampResult(statistic=5.862586246115964, pvalue=4.699545299388183e-09)
A one-sample t-test gives a significant result: there are more females than males, and the difference is unlikely to be due to chance.
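
The result above has the shape of SciPy's `ttest_1samp` output. As a rough illustration of what that statistic measures, the sketch below recomputes a one-sample t-statistic by hand using only the standard library. The counts (530 women, 470 men) are invented for the example and are not the case-study data, and the p-value uses a normal approximation that is reasonable only for large samples.

```python
import math
import statistics


def one_sample_t(values, expected_mean):
    """Return (t statistic, approximate two-sided p-value)."""
    n = len(values)
    mean = statistics.fmean(values)
    # Sample standard deviation (n - 1 in the denominator).
    std = statistics.stdev(values)
    t_stat = (mean - expected_mean) / (std / math.sqrt(n))
    # Normal approximation to the t distribution; fine for large n only.
    p_value = math.erfc(abs(t_stat) / math.sqrt(2))
    return t_stat, p_value


# Hypothetical sample: 1 = female, 0 = male (made-up counts).
gender_is_female = [1] * 530 + [0] * 470

t_stat, p_value = one_sample_t(gender_is_female, expected_mean=0.5)
print(f"t = {t_stat:.3f}, p ~ {p_value:.4f}")
```

For a 0/1 variable such as gender, a binomial test or a one-proportion z-test would be a more conventional choice. With thousands of rows the t-test gives nearly the same answer.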


Figure 4: Most of our customers earn less than $40,000
We can see that most of our customers are in the lower income bracket. However, we do not know the income of 11% of our card holders.


Figure 5: Most have 2-3 dependents
The results in figure 5 are as expected. In the USA the average household has 2-3 dependents, and so do the card holders.


Figure 6: Most of our customers hold the blue card


Figure 7: Attrited vs Existing Customers
Existing customers have more relationships, higher credit limits, higher total revolving balances, and higher total transaction amounts, and overall use their credit cards more than the attrited customers.
Attrited customers were inactive and used the credit card less before they left the credit card services.
We can see that on average existing customers have slightly more dependents than attrited customers.

A two-sample t-test on the number of dependents of attrited and existing customers shows no significant difference between the two populations at the 0.05 level.
Ttest_indResult(statistic=1.9398433006370537, pvalue=0.052519604053987055)

A two-sample t-test on the proportion of males/females among attrited and existing customers shows a significant difference between the two populations.
Ttest_indResult(statistic=3.7529997388764267, pvalue=0.0001757076182398843)
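
The `Ttest_indResult` lines above have the shape of SciPy's `ttest_ind` output, which pools the two sample variances by default. The sketch below shows how a pooled two-sample t-statistic is computed, using only the standard library. The two small samples of dependent counts are made up for illustration and do not come from the dataset.

```python
import math
import statistics


def two_sample_t(sample_a, sample_b):
    """Pooled-variance (Student) two-sample t statistic and degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a, mean_b = statistics.fmean(sample_a), statistics.fmean(sample_b)
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    dof = n_a + n_b - 2
    # Weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / dof
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_err, dof


# Hypothetical dependent counts for two small groups (made-up values).
existing = [2, 3, 2, 4, 3, 1, 2, 3]
attrited = [2, 1, 3, 2, 2, 1, 3, 2]

t_stat, dof = two_sample_t(existing, attrited)
print(f"t = {t_stat:.3f} with {dof} degrees of freedom")
```

The larger the absolute value of t relative to the degrees of freedom, the smaller the p-value. For a categorical variable such as gender, a chi-square test of independence is a common alternative.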


Figure 8: Gender vs Attrition vs Dependents
The number of dependents follows a normal distribution in all 4 groups. This is expected, as the number of dependents in the overall population also follows a normal distribution.
 

Unsupervised Learning

The dataset consists of both numerical and categorical data. K-means is an algorithm used to find clusters in a dataset, but it works best with numerical data. K-modes is also used to find clusters, but it works best with categorical data. K-prototypes combines both algorithms, so it can handle mixed data.
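
To make the "mixture" concrete, here is a minimal sketch of the kind of mixed dissimilarity K-prototypes is built on: squared Euclidean distance on the numerical fields plus a weight `gamma` times the number of mismatched categorical fields. The field names, the scaled values, and the choice `gamma=0.5` are assumptions made for this example, not values from the case study.

```python
def mixed_dissimilarity(record, prototype, numeric_keys, categorical_keys, gamma=1.0):
    """Squared Euclidean distance on numeric fields plus gamma times
    the number of mismatched categorical fields."""
    numeric_part = sum((record[k] - prototype[k]) ** 2 for k in numeric_keys)
    categorical_part = sum(record[k] != prototype[k] for k in categorical_keys)
    return numeric_part + gamma * categorical_part


# Hypothetical, already-scaled customer and cluster prototype.
customer = {"age": 0.2, "credit_limit": -0.5, "gender": "F", "card": "Blue"}
prototype = {"age": 0.0, "credit_limit": 0.5, "gender": "F", "card": "Silver"}

distance = mixed_dissimilarity(
    customer, prototype,
    numeric_keys=["age", "credit_limit"],
    categorical_keys=["gender", "card"],
    gamma=0.5,
)
print(distance)
```

A larger `gamma` makes categorical mismatches count for more relative to numerical differences, which is why the numerical columns are usually scaled first.
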
Figure 9: Elbow Method
To find the optimal number of clusters in a dataset, we use the elbow method: plot the SSE against the number of clusters K and pick the smallest K beyond which the SSE stops dropping sharply. Although there is no clear "elbow" here, we use 4 as it is the closest. For future analysis we can also try computing silhouette scores.
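
The elbow computation can be sketched as follows. This is a simplified stand-in: it runs plain k-means on made-up 2-D numerical points rather than K-prototypes on the mixed card-holder data, and it uses a single random start per K. It only illustrates how the SSE-versus-K curve is produced.

```python
import random


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_sse(points, k, iterations=20, seed=0):
    """Run a basic k-means (Lloyd's algorithm) and return the final SSE."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster went empty
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return sum(min(squared_distance(p, c) for c in centroids) for p in points)


# Hypothetical 2-D data: three well-separated blobs (made-up values).
rng = random.Random(42)
points = [
    (cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
    for cx, cy in [(0, 0), (5, 5), (0, 5)]
    for _ in range(30)
]

sse_by_k = {k: kmeans_sse(points, k) for k in range(1, 7)}
for k, sse in sse_by_k.items():
    print(k, round(sse, 2))
```

Plotting `sse_by_k` gives the elbow curve. In practice each K is usually run from several random starts and the lowest SSE is kept, since a single start can get stuck in a poor local optimum.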

Figure 10: Most customers belong in cluster 3

Cluster  Income_Category
0        2.187128
1        3.797280
2        2.129550
3        1.871679

The unsupervised machine learning algorithm clustered the data into 4 groups, and group 3 has the greatest number of card holders. The full dataset showing the clusters can be found here.

Cluster 0 consists of mostly female customers (57%), most of whom have attrited. They barely use the credit card, with an average utilization ratio of 0.055. Their total revolving balance is the lowest of all 4 clusters.

Cluster 1 consists of mostly male customers. Around 14.30% of the customers in cluster 1 are males. 88% of them are existing customers. Their average utilization ratio is the lowest at 0.054. However, this is the group with the highest credit limit and the highest open-to-buy amount: it has a high limit but rarely uses its credit card.

Cluster 2 consists of mostly female customers (58%), and 86.47% of these customers are existing customers. This group has the second-highest average utilization ratio at 37%. This cluster has the oldest customers, with an average age of 56 years. They also have the fewest dependents, 1.4 on average compared to around 2.5 in the other clusters. These customers are loyal and have stayed with us the longest, with an average of 44.5 months on books.


Cluster 3 consists of 65.55% female customers, and 93.40% of these customers are existing customers. They have an average utilization ratio of 0.511. They have the greatest total revolving balance, yet their credit limit is the lowest. They are also the youngest, with an average age of 42 years. Cluster 3 is the largest cluster and sits in the lowest income bracket.
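
Profiles like the four above come from grouping the labelled rows by cluster and summarising each group. Below is a minimal sketch of that step using only the standard library. The rows, field names, and values are invented for the example and are not taken from the dataset.

```python
from collections import defaultdict
from statistics import fmean

# Hypothetical labelled rows; field names and values are made up.
customers = [
    {"cluster": 0, "gender": "F", "attrited": True, "utilization": 0.05},
    {"cluster": 0, "gender": "M", "attrited": True, "utilization": 0.07},
    {"cluster": 3, "gender": "F", "attrited": False, "utilization": 0.55},
    {"cluster": 3, "gender": "F", "attrited": False, "utilization": 0.45},
    {"cluster": 3, "gender": "M", "attrited": True, "utilization": 0.50},
]

# Group the rows by their cluster label.
by_cluster = defaultdict(list)
for row in customers:
    by_cluster[row["cluster"]].append(row)

# Summarise each cluster: size, gender mix, retention, and card usage.
profiles = {}
for cluster, rows in sorted(by_cluster.items()):
    profiles[cluster] = {
        "size": len(rows),
        "share_female": fmean(r["gender"] == "F" for r in rows),
        "share_existing": fmean(not r["attrited"] for r in rows),
        "mean_utilization": fmean(r["utilization"] for r in rows),
    }

for cluster, profile in profiles.items():
    print(cluster, profile)
```

Comparing these summaries side by side is what lets us describe one cluster as, for example, younger or more active than another.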

Recommendations

Although most of our attrited customers are females, this is because most of our customers in general are females. From this we can narrow the population for our advertisements and market to female customers.
From there we can narrow the population further and market towards lower income customers.
Low income customers tend to use their credit cards a lot more than the other groups.
And unsurprisingly, lower income customers are on average younger than the rest.

For our next ad campaign we should target young, low-income females, while making sure they will not default on credit card payments.
