Predicting AsiaYo Users’ Spending for Improved Search Results

Project Details

Term: 

Fall 2017

Students: 

Travis Greene, Martin Hsia, Letitia She, Leo Lee

University: 

NTHU

Presentation: 

Report: 

I. Business Problem: How can we improve our conversion rate?
AsiaYo earns revenue by taking a fixed percentage of total booking costs, so increasing the number of bookings and the amount spent per user would directly increase sales revenue. To that end, we propose sorting each user's search results by an estimated budget, with the properties closest to the predicted nightly budget listed first, in the most prominent area of the screen. Currently, the sorting algorithm returns search results with a very wide range of prices. We believe that a better sorting algorithm will contribute to a better user experience and ultimately to an improved conversion rate, especially under the assumption that AsiaYo customers make booking decisions largely based on price.
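The proposed ordering can be illustrated with a minimal sketch. It assumes each search result carries a nightly price; the field name `nightly_price`, the function `sort_by_budget`, and the sample listings are hypothetical and are not taken from AsiaYo's system.

```python
from typing import Dict, List


def sort_by_budget(listings: List[Dict], predicted_budget: float) -> List[Dict]:
    """Return listings ordered by how close their nightly price is to the budget."""
    return sorted(
        listings,
        key=lambda item: abs(item["nightly_price"] - predicted_budget),
    )


# Hypothetical search results (prices are illustrative only).
listings = [
    {"name": "A", "nightly_price": 3200},
    {"name": "B", "nightly_price": 1500},
    {"name": "C", "nightly_price": 2100},
    {"name": "D", "nightly_price": 900},
]

ranked = sort_by_budget(listings, predicted_budget=2000)
ranked_names = [item["name"] for item in ranked]
print(ranked_names)
```

In practice, budget proximity would more likely be one input among several ranking signals rather than the sole sort key.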
II. Data
To build a statistical learning model that can predict users’ nightly spending budgets, we obtained a dataset of past transactions. Each row in the dataset represented one completed booking on AsiaYo. After cleaning, filtering, and reshaping the data into a usable form, we had approximately 50,000 rows of transactions and 16 variables (columns). Key variables used as input to the model included day of check-in, day of booking order, user country, accommodation city and country, number of guests, and month of booking order. Our output column was the nightly spending amount.
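As a rough sketch of how one booking might be turned into a model row, the example below derives the inputs listed above from a single transaction. The raw field names (`check_in`, `order_date`, `total_cost`, `nights`, and so on) and the assumption that nightly spending equals total cost divided by nights are ours for illustration; the actual dataset schema is not described in this report.

```python
from datetime import date


def build_features(booking: dict) -> dict:
    """Derive model inputs and the target from one completed booking."""
    check_in = booking["check_in"]
    ordered = booking["order_date"]
    return {
        # date.weekday(): Monday is 0, Sunday is 6.
        "check_in_weekday": check_in.weekday(),
        "order_weekday": ordered.weekday(),
        "order_month": ordered.month,
        "user_country": booking["user_country"],
        "city": booking["city"],
        "country": booking["country"],
        "guests": booking["guests"],
        # Target column: assumed to be total cost spread evenly over nights.
        "nightly_spend": booking["total_cost"] / booking["nights"],
    }


# Hypothetical booking record (values are illustrative only).
booking = {
    "check_in": date(2017, 11, 4),
    "order_date": date(2017, 10, 16),
    "user_country": "TW",
    "city": "Tokyo",
    "country": "JP",
    "guests": 2,
    "total_cost": 6000,
    "nights": 2,
}

features = build_features(booking)
print(features)
```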
Finally, after running several predictive models trained on the data, we found that the variables most strongly associated with per-night spending were the day of check-in (Saturday), the accommodation city (particularly cities in Japan), and the number of days booked in advance. We suspect this is because travelers to Japan and Korea spend more on average and also book further in advance.
III. Analytics Solution
Our best-performing model was an ensemble that took the outputs of three separate models as input and produced a predicted nightly spending budget as output. Though slightly more complicated to build, the ensemble’s predictions were more accurate than those of any individual model. We believe this tradeoff in speed and complexity was necessary, as the predictions are only useful if they are relatively accurate. The time needed to compute a new user’s predicted nightly budget is still negligible.
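The general idea of combining three models can be sketched as follows. This report does not state how the ensemble combined its inputs, so the simple average used here is an assumption, and the three stand-in models are hypothetical placeholders rather than the models we actually trained.

```python
from statistics import mean
from typing import Callable, Dict, Sequence

# A "model" here is any function mapping a feature dict to a nightly budget.
Model = Callable[[Dict], float]


def ensemble_predict(models: Sequence[Model], features: Dict) -> float:
    """Average the predictions of the base models (assumed combination rule)."""
    return mean([model(features) for model in models])


# Hypothetical stand-ins for the three base models.
def model_a(features: Dict) -> float:
    return 2000.0 + 300.0 * features["guests"]


def model_b(features: Dict) -> float:
    # Saturday check-ins (weekday 5) are assumed to cost more.
    return 2400.0 if features["check_in_weekday"] == 5 else 1800.0


def model_c(features: Dict) -> float:
    return 2200.0


example = {"guests": 2, "check_in_weekday": 5}
predicted_budget = ensemble_predict([model_a, model_b, model_c], example)
print(predicted_budget)
```

Other combination rules, such as a weighted average or a second-stage model trained on the base predictions, fit the same interface.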
Using root mean squared error (RMSE) as our performance metric, the ensemble model scored 846. In more concrete terms, the typical difference between a customer’s true nightly spending and our prediction was around $846, bearing in mind that RMSE weights large errors more heavily than a simple average of absolute differences. Further, by building a separate model for Taipei alone, we found that RMSE could be reduced to nearly $600. We are confident that with careful tuning of the default parameters of the input models and the creation of city-level models (based on the top 1-3 cities by country), we could generate predictions accurate enough to have a tangible impact on the booking conversion rate and user experience.
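For reference, RMSE can be computed as in the sketch below. The sample values are made up for illustration and are not drawn from the AsiaYo data; the mean absolute error is included only to show how a single large miss pulls RMSE upward.

```python
import math
from typing import Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error between true and predicted values."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error, shown for comparison with RMSE."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Illustrative nightly spending values (not real data).
actual = [2000, 3000, 1500, 4000]
predicted = [2300, 2600, 1500, 2800]

rmse_value = rmse(actual, predicted)
mae_value = mae(actual, predicted)
print(rmse_value, mae_value)
```

Here the one prediction that misses by 1,200 raises RMSE well above the mean absolute error.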
IV. Recommendations
We expect that with more detailed information about accommodation cities, such as city district, we could produce even more accurate predictions. However, it is not clear how we could use this information at the time of search, given that the current AsiaYo page only allows users to select a city, not a district. Additionally, if we could connect a user on the search page with her previous booking history, we could improve our predictions immensely. This past transaction data could also be linked with textual information in the form of user reviews and ratings. Overall, we believe that user budget predictions could be an effective input into a broader search results algorithm. Given the relatively low cost, ease of implementation, and potential upside in user conversion, we regard this as a worthwhile business project for AsiaYo.
