Predicting Loyal Customers for Sellers on Tmall to Increase Return on Promoting Cost

Project Details


Fall 2017


Wendy Huang, Yu-Chih Shih, Jessy Yang, Zoe Cheng





Sellers on e-commerce platforms sometimes run big promotions (e.g., discounts or cash
coupons) on particular dates (e.g., Boxing Day sales, "Black Friday", or "Double 11" (Nov 11th)) in
order to attract a large number of new buyers. Unfortunately, many of the attracted buyers are
one-time deal hunters, so these promotions may have little long-lasting impact on sales. To
alleviate this problem, sellers need to identify which buyers can be converted into repeat
buyers, in other words, loyal customers. By targeting these potential loyal customers,
sellers can greatly reduce promotion cost and improve return on investment (ROI). It is
well known in the field of online advertising that customer targeting is extremely challenging,
especially for first-time buyers. However, with the 6 months of user behavior logs accumulated by Tmall, we may be able to solve this problem.
To increase the return on promotion cost, it is useful to build a model that predicts
which new buyers of a given seller will become loyal customers in the future. Our
stakeholders are therefore the sellers on Tmall, and our goal is to predict loyal customers for
them in order to increase the return on promotion cost. We use the data provided by Tmall, a set of
sellers and the new buyers they acquired during the promotion on the "Double 11"
day, to derive a supervised classification model that predicts the probability that these new
buyers will purchase items from the same sellers again within 6 months.
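
The target described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name label_repeat_buyers, the tuple layout (user_id, seller_id, purchase_date), the 183-day approximation of "6 months", and the year used in the usage note are all hypothetical and are not taken from the Tmall data.

```python
from datetime import date, timedelta

def label_repeat_buyers(purchases, promo_day, window=timedelta(days=183)):
    """Label new (user, seller) pairs acquired on promo_day.

    purchases: iterable of (user_id, seller_id, purchase_date) tuples
               (a hypothetical layout, assumed for this sketch).
    Returns {(user_id, seller_id): label}, where label is 1 if the user
    bought from the same seller again after promo_day and no later than
    promo_day + window, and 0 otherwise.
    """
    purchases = list(purchases)
    # Pairs with an earlier purchase are not "new" buyers for that seller.
    prior = {(u, s) for u, s, d in purchases if d < promo_day}
    # Start every new pair acquired on the promotion day with label 0.
    labels = {(u, s): 0
              for u, s, d in purchases
              if d == promo_day and (u, s) not in prior}
    # Flip the label to 1 when a repeat purchase falls inside the window.
    for u, s, d in purchases:
        if (u, s) in labels and promo_day < d <= promo_day + window:
            labels[(u, s)] = 1
    return labels
```

For example, calling label_repeat_buyers(log, promo_day=date(2014, 11, 11)) on a purchase log returns one 0/1 label per new buyer-seller pair; the year here is a placeholder. These labels are what a supervised classifier would be trained to predict.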
We built models with logistic regression, Random Forest, and XGBoost, and chose as our
final model the Random Forest trained on data from 5/11 to 10/25 with a cutoff probability of 0.588 (the
predicted probability that marks the top 10% of the outputs). However, the model has some implementation
limitations and potential risks, and we cannot guarantee its stability.
As business policy, we recommend collecting more data about the characteristics of the
sellers and increasing customers' willingness to add the sellers' products to their favorites. Last but
not least, lurkers are potentially loyal customers, and sellers should not ignore them.
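
A cutoff defined as "the probability of the top 10% of the outputs" can be computed directly from the predicted probabilities. The sketch below shows one way to do this in plain Python; the function names top_fraction_cutoff and flag_loyal and the sample scores in the usage note are illustrative assumptions, not the project's actual code or results.

```python
import math

def top_fraction_cutoff(probabilities, fraction=0.10):
    """Return the lowest predicted probability within the top `fraction`."""
    if not probabilities:
        raise ValueError("probabilities must be non-empty")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ranked = sorted(probabilities, reverse=True)
    # Round before ceil so float noise (e.g. 3.0000000000000004) does not
    # push the count up by one; always keep at least one prediction.
    k = max(1, math.ceil(round(fraction * len(ranked), 9)))
    return ranked[k - 1]

def flag_loyal(probabilities, cutoff):
    """Mark each prediction at or above the cutoff as a predicted loyal buyer."""
    return [p >= cutoff for p in probabilities]
```

With made-up scores such as [0.91, 0.12, 0.58, 0.30, 0.45, 0.07, 0.22, 0.61, 0.18, 0.05] and fraction=0.2, the cutoff is the second-highest score, and flag_loyal marks exactly the two buyers at or above it. Ties at the cutoff would all be flagged, so the flagged share can slightly exceed the requested fraction.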

Application Area: