Hypermarts frequently use promotions such as mail-in rebate coupons and bulk-buy discounts to encourage customers to purchase more products from their stores. The potential benefit to the Hypermart can be increased significantly if the right promotions are targeted at the right customers: specifically, by identifying a new customer as a potential high margin customer and targeting him or her with promotions on high margin products to increase the sales turnover of those products.
New Customer: A customer who has made exactly one purchase at the Hypermart.
High Margin Customer ('H'): A customer who purchases high margin products more than 50% of the time.
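The labeling rule above can be sketched in a few lines. This is a minimal illustration, not the project's actual code: the function name `label_customer` is hypothetical, and it assumes "more than 50% of the time" is measured per purchased item, with one boolean flag per item. The definition could equally be read per basket.

```python
def label_customer(high_margin_flags):
    """Return 'H' if more than 50% of purchases are high margin, else 'L'.

    high_margin_flags: one boolean per purchased item
    (True = high margin product). Assumed input format.
    """
    if not high_margin_flags:
        raise ValueError("customer has no purchases to label")
    share = sum(high_margin_flags) / len(high_margin_flags)
    # Strictly greater than 0.5: exactly half does not qualify as 'H'.
    return "H" if share > 0.5 else "L"
```

Under this reading, a customer whose purchases are exactly half high margin products is labeled 'L', since the definition requires more than 50%.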
The data mining problem is to classify a new customer as either a high margin or a low margin customer using supervised learning techniques.
We used logistic regression to classify customers into the high or low margin category using the following predictors obtained from the first basket purchase: # of SubDepts, Quantity Sold, Price of Basket, Age, Sex, and Day of week. The predictors were chosen by stepwise regression, keeping the best subset of predictors.
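As a rough sketch of how such a model turns first-basket predictors into a probability, the snippet below applies the logistic function to a linear score. The coefficient values are invented for illustration (the fitted values are not reported here), and the encodings of Sex as a 0/1 flag and Day of week as a weekend flag are assumptions, as are all names in the code.

```python
import math

# Hypothetical coefficients for illustration only; not the fitted model.
COEFFICIENTS = {
    "intercept": -1.0,
    "num_subdepts": 0.30,
    "quantity_sold": -0.05,
    "basket_price": 0.01,
    "age": 0.02,
    "is_female": 0.10,   # Sex encoded as 0/1 (assumed encoding)
    "is_weekend": 0.20,  # Day of week collapsed to a weekend flag (assumed)
}

def prob_high_margin(features, coefficients=COEFFICIENTS):
    """P('H') = 1 / (1 + exp(-(b0 + sum of b_i * x_i)))."""
    z = coefficients["intercept"]
    for name, value in features.items():
        z += coefficients[name] * value
    return 1.0 / (1.0 + math.exp(-z))

def classify(features, cutoff=0.5):
    """Label the customer 'H' when the probability reaches the cutoff."""
    return "H" if prob_high_margin(features) >= cutoff else "L"
```

Predictors omitted from `features` are treated as zero, so `classify({"num_subdepts": 5})` scores a basket spanning five sub-departments with every other predictor at zero.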
After experimenting with the logistic regression and CART classification methods on partitioned data (50% training, 30% validation, 20% testing), we compared the accuracy of the models using their confusion matrices on the test data (cutoff probability for a high margin customer = 0.5). We also performed an ensemble analysis on the results of the two models and found that the ensemble's error rate (21%) is higher than that of the logistic regression model. Logistic regression gives the best accuracy (error rate on 'H' prediction: 19%) and CART gives the lowest accuracy (error rate: 23%).
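The evaluation steps described above can be sketched as follows. This is an illustration under stated assumptions rather than the analysis actually run: the ensemble is assumed to average the two models' predicted probabilities, and "error rate on 'H' prediction" is read as the share of customers predicted 'H' who are actually 'L'. All function names are hypothetical.

```python
import random

def partition(records, seed=0):
    """Shuffle and split records 50% / 30% / 20% (train, validation, test)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    n_train = n * 50 // 100
    n_valid = n * 30 // 100
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_valid],
            shuffled[n_train + n_valid:])

def confusion_matrix(actual, probs, cutoff=0.5):
    """Count (actual, predicted) label pairs at the given probability cutoff."""
    counts = {("H", "H"): 0, ("H", "L"): 0, ("L", "H"): 0, ("L", "L"): 0}
    for truth, p in zip(actual, probs):
        predicted = "H" if p >= cutoff else "L"
        counts[(truth, predicted)] += 1
    return counts

def h_prediction_error_rate(counts):
    """Share of 'H' predictions that were actually 'L' (assumed reading)."""
    predicted_h = counts[("H", "H")] + counts[("L", "H")]
    if predicted_h == 0:
        return None  # no customer was predicted 'H'
    return counts[("L", "H")] / predicted_h

def ensemble_probs(probs_a, probs_b):
    """Average two models' probabilities (assumed ensemble scheme)."""
    return [(a + b) / 2 for a, b in zip(probs_a, probs_b)]
```

Each model's test-set probabilities would be passed through `confusion_matrix` with the same 0.5 cutoff, so that the logistic regression, CART, and ensemble error rates are compared on equal terms.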
We recommend that our Hypermart client implement the logistic regression model in real time, while the new customer is checking out. Based on the model's prediction, the customer may be offered promotions on high margin products, leading to an increased return on marketing spend.