Abstract
This study presents and applies a methodology for selecting anomaly detection algorithms for biosurveillance time series data. It uses both an authentic dataset and a simulated dataset, both freely available so that the results can be replicated and the analysis extended. With this approach, a public health monitor can choose algorithms suited to the scale and behavior of the data of interest by calculating simple discriminants from a limited sample. Typical time series behaviors are classified in tabular form using these discriminants, and the classification is derived with the receiver operating characteristic (ROC) approach of detection theory, with realistic, stochastic, simulated signals injected into the data. The study catalogues the detection performance of six algorithms across data types and shows that, at practical alert rates, sensitivity gains of 20% or more may be achieved by appropriate algorithm selection.
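The evaluation idea summarized above, injecting stochastic signals into background data and measuring sensitivity at a fixed alert rate, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the Poisson background, the lognormal shape of the injected signal, the sliding z-score detector (a stand-in, not one of the six algorithms studied), and all function and parameter names.

```python
import math
import random


def poisson(rng, lam):
    """Draw one Poisson variate (Knuth's method, adequate for small lam)."""
    limit = math.exp(-lam)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def score_day(series, t, baseline=28):
    """Z-score of day t against the preceding `baseline` days (hypothetical detector)."""
    window = series[t - baseline:t]
    mean = sum(window) / baseline
    var = sum((x - mean) ** 2 for x in window) / baseline
    sd = max(math.sqrt(var), 0.5)  # floor keeps sparse data from dividing by ~0
    return (series[t] - mean) / sd


def inject_signal(rng, series, start, total_cases, mu=1.0, sigma=0.5):
    """Add cases whose onset delays (in days) are lognormal, beginning at `start`."""
    out = list(series)
    for _ in range(total_cases):
        day = start + int(rng.lognormvariate(mu, sigma))
        if day < len(out):
            out[day] += 1
    return out


def sensitivity_at_alert_rate(rng, background, total_cases,
                              alert_rate=0.02, trials=200, baseline=28, window=7):
    """Fraction of injected signals flagged within `window` days of onset.

    The threshold is set empirically so that roughly `alert_rate` of the
    signal-free background days would alert.
    """
    bg_scores = sorted(score_day(background, t, baseline)
                       for t in range(baseline, len(background)))
    idx = min(len(bg_scores) - 1, int((1 - alert_rate) * len(bg_scores)))
    threshold = bg_scores[idx]
    detected = 0
    for _ in range(trials):
        start = rng.randrange(baseline, len(background) - window)
        series = inject_signal(rng, background, start, total_cases)
        if any(score_day(series, t, baseline) > threshold
               for t in range(start, start + window)):
            detected += 1
    return detected / trials


rng = random.Random(0)
background = [poisson(rng, 10.0) for _ in range(365)]  # simulated daily counts
for cases in (0, 15, 60):
    print(cases, sensitivity_at_alert_rate(rng, background, cases))
```

Repeating the last loop over several alert rates traces out one ROC-style curve per detector and data type; comparing such curves across detectors is the kind of comparison on which an algorithm selection table could be built.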
© 2007 Springer Berlin Heidelberg
About this paper
Cite this paper
Burkom, H., Murphy, S. (2007). Data Classification for Selection of Temporal Alerting Methods for Biosurveillance. In: Zeng, D., et al. Intelligence and Security Informatics: Biosurveillance. BioSurveillance 2007. Lecture Notes in Computer Science, vol 4506. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72608-1_6
DOI: https://doi.org/10.1007/978-3-540-72608-1_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-72607-4
Online ISBN: 978-3-540-72608-1
eBook Packages: Computer Science, Computer Science (R0)