Predicting Customer Cancellation of Cab Bookings for

Project Details




Anupama Atmuri, Garrett Butler, Leena Bhai, Priyanka Paul, Rohith Lokareddy, Shweta Agarwal





Our client, YourCabs, is a Bangalore-based technology platform that aggregates fleet owners and vehicles in the car rental space. The company, founded by Rajath Kedilaya in 2011, has built an intelligent network that manages real-time supply and demand of cabs.

In this assignment we predict cancellations of cab bookings by customers, using data obtained from the company. Our goal is to reduce the cost YourCabs incurs when customers cancel. By predicting likely cancellations an hour before the pickup time, YourCabs can give its vendors and drivers up-to-date information and avoid sending cabs to booking locations where the customer has cancelled. We assume that sending a cab for a booking that the customer cancels costs Rs 100, and that calling a customer flagged by our model an hour before the pickup time to confirm the booking costs Rs 10. Each cancellation that the model predicts correctly therefore saves YourCabs Rs 90 (the Rs 100 wasted dispatch avoided, less the Rs 10 call). Each booking that the model incorrectly flags as a cancellation costs YourCabs Rs 10 for the confirmation call. We define success as a reduction in the overall cost to the company of customer-side cancellations.
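The cost assumptions above can be turned into a small calculation. The following is a minimal sketch, not the project's actual evaluation code: the function names and the example counts (60 correctly flagged cancellations, 200 incorrectly flagged bookings, 40 missed cancellations) are hypothetical and chosen only to illustrate the arithmetic. It also assumes that a confirmation call always reveals a cancellation in time to avoid the dispatch.

```python
# Cost assumptions stated in the text (in rupees).
COST_WASTED_DISPATCH = 100   # cab sent to a booking the customer cancels
COST_CONFIRMATION_CALL = 10  # call to a customer flagged by the model


def baseline_cost(true_pos, false_neg):
    """Cost without a model: every cancellation leads to a wasted dispatch."""
    return (true_pos + false_neg) * COST_WASTED_DISPATCH


def model_cost(true_pos, false_pos, false_neg):
    """Cost with a model: one call per flagged booking, plus a wasted
    dispatch for every cancellation the model failed to flag."""
    calls = (true_pos + false_pos) * COST_CONFIRMATION_CALL
    missed = false_neg * COST_WASTED_DISPATCH
    return calls + missed


def savings(true_pos, false_pos, false_neg):
    """Reduction in cost from using the model instead of dispatching blindly."""
    return baseline_cost(true_pos, false_neg) - model_cost(true_pos, false_pos, false_neg)


# Hypothetical counts, for illustration only.
print(savings(true_pos=60, false_pos=200, false_neg=40))  # 60*90 - 200*10 = 3400
```

Under these assumptions the saving is Rs 90 per correctly flagged cancellation minus Rs 10 per incorrectly flagged booking, so a model is worthwhile only if its correct flags outweigh its false alarms by more than one in nine.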

We analyzed the data with several methods: classification tree, k-nearest neighbors, Naïve Bayes, and an ensemble. We chose the final model based on its accuracy together with the business goal of reducing cost for the company. The model we selected was Naïve Bayes: it has a low overall error rate and also yields the lowest cost to the company. We recommend running the model in real time, every hour, on all bookings whose pickup time falls within the next hour. The model flags all likely cancellations, and an operator calls those customers to confirm their bookings. Once the customer confirms, the cab is dispatched to the pickup location. Using the model to predict customer cancellations in this way should reduce the cost of sending cabs to pickup locations where the customer is not present.
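The hourly workflow recommended above could be sketched as follows. This is an illustration under stated assumptions, not the deployed procedure: the booking fields (`id`, `pickup_time`), the function `bookings_to_confirm`, and the `predict_cancel_prob` callback (standing in for the trained Naïve Bayes model) are all hypothetical names. The flagging threshold is also an assumption: with a Rs 10 call and a Rs 100 wasted dispatch, a call pays for itself in expectation when the predicted cancellation probability exceeds 10/100 = 0.1.

```python
from datetime import datetime, timedelta

CALL_COST = 10              # Rs, from the cost assumptions in the text
WASTED_DISPATCH_COST = 100  # Rs, from the cost assumptions in the text

# Assumed rule: call when expected loss (p * 100) exceeds the call cost (10).
THRESHOLD = CALL_COST / WASTED_DISPATCH_COST


def bookings_to_confirm(bookings, now, predict_cancel_prob,
                        window=timedelta(hours=1)):
    """Return ids of bookings due within `window` that the model flags."""
    flagged = []
    for booking in bookings:
        due_soon = now <= booking["pickup_time"] <= now + window
        if due_soon and predict_cancel_prob(booking) > THRESHOLD:
            flagged.append(booking["id"])
    return flagged


# Hypothetical bookings; "p" stands in for the model's predicted probability.
now = datetime(2013, 1, 1, 9, 0)
bookings = [
    {"id": 1, "pickup_time": datetime(2013, 1, 1, 9, 30), "p": 0.40},
    {"id": 2, "pickup_time": datetime(2013, 1, 1, 9, 45), "p": 0.05},
    {"id": 3, "pickup_time": datetime(2013, 1, 1, 11, 0), "p": 0.90},
]
print(bookings_to_confirm(bookings, now, lambda b: b["p"]))  # [1]
```

Booking 2 is due soon but falls below the threshold, and booking 3 is likely to cancel but lies outside the one-hour window, so it would be picked up by a later hourly run.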

Application Area: