|Title||Automated Time Series Forecasting for Biosurveillance|
|Publication Type||Journal Article|
|Year of Publication||2007|
|Authors||Burkom, H. S., S. Murphy, and G. Shmueli|
|Journal||Statistics in Medicine|
For robust detection performance, traditional control chart monitoring for biosurveillance is based on input data free of trends, day-of-week effects, and other systematic behaviour. Time series forecasting methods may be used to remove this behaviour by subtracting forecasts from observations to form residuals for algorithmic input. We describe three forecast methods and compare their predictive accuracy on each of 16 authentic syndromic data streams. The methods are (1) a non-adaptive regression model using a long historical baseline, (2) an adaptive regression model with a shorter, sliding baseline, and (3) the Holt-Winters method for generalized exponential smoothing. Criteria for comparing the forecasts were the root-mean-square error, the median absolute per cent error (MedAPE), and the median absolute deviation. The median-based criteria showed best overall performance for the Holt-Winters method. The MedAPE measures over the 16 test series averaged 16.5, 11.6, and 9.7 for the non-adaptive regression, adaptive regression, and Holt-Winters methods, respectively. The non-adaptive regression forecasts were degraded by changes in the data behaviour in the fixed baseline period used to compute model coefficients. The mean-based criterion was less conclusive because of the effects of poor forecasts on a small number of calendar holidays. The Holt-Winters method was also most effective at removing serial autocorrelation, with most 1-day-lag autocorrelation coefficients below 0.15. The forecast methods were compared without tuning them to the behaviour of individual series. We achieved improved predictions with such tuning of the Holt-Winters method, but practical use of such improvements for routine surveillance will require reliable data classification methods.