Forecasting fruit demand - Intelligent Procurement

Project Details


2012 (Dec)


Dinesh Ganti, Ravi Shankar, Shouri Kamtala, Supreet Kaur, Rachna Lalwani





Problem Description/Business Goal - As the fruit supplier to the hypermarket, we wish to match our
procurement to the fruit demand. This matters because fruit is perishable: matching procurement to
demand reduces both over-stocking and under-stocking. Forecasting demand accurately also lets us add
value for the hypermarket and stay ahead of the competition.

Forecasting goal - To forecast the demand for five chosen fruit SKUs over a forecast period of 2 days.
The chosen SKUs are Pineapple cuts (Kg) (Mb), Apple red delicious, Watermelon striped, Packham pear
and Premium banana. They were selected for their high volume of transactions.

Data - The main features of the data file are: (a) it covers only 13 months, so annual seasonality
and monthly predictions were out of the question; (b) it gives no information about the reason for
zero demand (no demand versus stock-out). Quantity sold for each SKU was aggregated at the daily
level. Missing values were filled with either the seasonal naïve value or zero, decided case by case
according to whether the gap was judged to be a stock-out or genuine zero demand. Please refer to the
exhibits for charts showing actual and predicted values for each SKU.
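
The seasonal naïve fill described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function name `fill_seasonal_naive`, the use of `None` to mark a missing day, and the weekly season length (`period=7`) are all assumptions, since the summary does not state the season length used.

```python
from typing import List, Optional


def fill_seasonal_naive(series: List[Optional[float]], period: int = 7) -> List[float]:
    """Replace missing daily quantities (None) with the value one season earlier.

    Assumption: a weekly season (period=7). Gaps in the first season have no
    earlier value to copy, so they fall back to 0.0.
    """
    filled: List[float] = []
    for i, value in enumerate(series):
        if value is not None:
            filled.append(float(value))
        elif i >= period:
            # Copy the (possibly already filled) value from one season back.
            filled.append(filled[i - period])
        else:
            filled.append(0.0)
    return filled


# Hypothetical daily sales in Kg; None marks a day judged to be a stock-out.
daily_kg = [12.0, 15.0, None, 14.0, None]
print(fill_seasonal_naive(daily_kg, period=2))
```

Days judged to be genuine zero demand would simply be set to 0 before calling the function, so that only the suspected stock-outs are left as `None`.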

Forecasting methods and performance metrics - The primary metric used to evaluate the different
methods was the combined cost of under-stocking and over-stocking over the validation period. For the
fruit vendor, the cost of over-stocking is wastage, equal to the cost of the fruit. The cost of
under-stocking is lost sales and lost reputation with the hypermarket. To capture this difference, we
used a 3:1 weighting: Cost (Under Stocking) = 3 * Cost (Over Stocking). In all cases, the seasonal
naïve forecast was our benchmark.
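
A minimal sketch of the weighted cost metric and the seasonal naïve benchmark is given here for illustration. The function names, the per-unit cost parameter `unit_cost`, and the weekly season length (`period=7`) are assumptions; only the 3:1 weighting and the 2-day horizon come from the description above.

```python
from typing import List


def stocking_cost(actual: List[float], forecast: List[float],
                  unit_cost: float = 1.0, under_weight: float = 3.0) -> float:
    """Combined over-stocking and under-stocking cost over a validation period.

    Over-stocking costs unit_cost per excess unit (wastage); under-stocking
    costs under_weight times as much per missing unit (the 3:1 weighting).
    """
    total = 0.0
    for a, f in zip(actual, forecast):
        if f >= a:
            total += (f - a) * unit_cost                  # procured too much
        else:
            total += (a - f) * unit_cost * under_weight   # procured too little
    return total


def seasonal_naive_forecast(history: List[float], horizon: int,
                            period: int = 7) -> List[float]:
    """Forecast each future day with the value observed one season earlier.

    Assumption: weekly seasonality; history must hold at least `period` values.
    """
    n = len(history)
    return [history[n - period + (h % period)] for h in range(horizon)]


# Hypothetical numbers: eight days of history, a 2-day forecast, then scoring.
history = [10.0, 12.0, 11.0, 9.0, 14.0, 16.0, 13.0, 10.0]
benchmark = seasonal_naive_forecast(history, horizon=2)
print(benchmark, stocking_cost([13.0, 10.0], benchmark))
```

Any candidate method can be scored with `stocking_cost` on the same validation days and compared against the benchmark's cost.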

Conclusions/Recommendations - Different models work best for different fruits. Repeating the
forecasting exercise separately for each SKU would be costly when there are hundreds of them, so
automated data-driven methods are preferable. Data quality and consistency must be ensured for
data-driven models to work. One should also keep in mind that data from the hypermarket reaches the
vendor only after a lag of at least a day. When predicting for a very short period, it is not
possible to come up with prediction intervals around forecasts.

Application Area: