From Cluster-Based Outlier Detection to Time Series Discord Discovery

Kha, Nguyen Huy; Anh, Duong Tuan

doi:10.1007/978-3-319-25660-3_2

Nguyen Huy Kha & Duong Tuan Anh

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9441)

Abstract

Anomalous patterns, or discords, are a kind of outlier in time series. In this paper, we present a new approach to time series discord discovery that builds on cluster-based outlier detection. First, candidate subsequences are extracted from the time series using a segmentation method. These candidates are then transformed to the same length and passed to an appropriate clustering algorithm. Finally, discords are identified using the measure proposed in the cluster-based outlier detection method of He et al. (2003). The experimental results show that our approach is much more efficient than the HOT SAX algorithm at detecting time series discords, while the anomalous patterns discovered by the two methods match each other perfectly.
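The three stages named in the abstract can be illustrated with a minimal sketch. This preview does not specify the segmentation method, the clustering algorithm, or the exact form of the outlier measure, so everything below is an assumption made for illustration: segments are cut at known breakpoints, linear interpolation brings them to a common length, a plain k-means stands in for the clustering step, and the score is a simplified, unweighted distance-to-large-cluster variant in the spirit of He et al. (2003), not their exact CBLOF. The function names and the parameters `k` and `alpha` are hypothetical.

```python
import numpy as np

def resample(segment, length):
    """Linearly interpolate a 1-D segment onto `length` evenly spaced points."""
    segment = np.asarray(segment, dtype=float)
    old = np.linspace(0.0, 1.0, num=len(segment))
    new = np.linspace(0.0, 1.0, num=length)
    return np.interp(new, old, segment)

def assign(points, centroids):
    """Index of the nearest centroid (Euclidean) for every point."""
    diffs = points[:, None, :] - centroids[None, :, :]
    return np.linalg.norm(diffs, axis=2).argmin(axis=1)

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means, a stand-in for the paper's clustering step."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(iters):
        labels = assign(points, centroids)
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # an empty cluster keeps its old centroid
                centroids[j] = members.mean(axis=0)
    return assign(points, centroids), centroids

def outlier_scores(points, labels, centroids, alpha=0.9):
    """Simplified cluster-based score (assumption, not the exact CBLOF).

    Clusters are taken largest-first until they cover a fraction `alpha`
    of the points; those are the "large" clusters. A point in a large
    cluster is scored by the distance to its own centroid, a point in a
    small cluster by the distance to the nearest large centroid.
    """
    sizes = np.bincount(labels, minlength=len(centroids))
    order = np.argsort(-sizes)
    covered = np.cumsum(sizes[order])
    n_large = int(np.searchsorted(covered, alpha * len(points))) + 1
    large = order[:n_large]
    own = np.linalg.norm(points - centroids[labels], axis=1)
    diffs = points[:, None, :] - centroids[large][None, :, :]
    nearest_large = np.linalg.norm(diffs, axis=2).min(axis=1)
    return np.where(np.isin(labels, large), own, nearest_large)

# Toy series: 20 sine periods of alternating length, one replaced by a
# double-frequency period that plays the role of the discord.
lengths = [40 if i % 2 == 0 else 50 for i in range(20)]
pieces = [np.sin(2 * np.pi * np.arange(n) / n) for n in lengths]
pieces[7] = np.sin(4 * np.pi * np.arange(lengths[7]) / lengths[7])
series = np.concatenate(pieces)

# Stage 1: segmentation (here the breakpoints are simply assumed known).
breakpoints = np.cumsum([0] + lengths)
segments = [series[a:b] for a, b in zip(breakpoints[:-1], breakpoints[1:])]

# Stage 2: bring every candidate to the same length, then cluster.
points = np.stack([resample(s, 32) for s in segments])
labels, centroids = kmeans(points, k=3)

# Stage 3: rank candidates; the highest score is the discord candidate.
scores = outlier_scores(points, labels, centroids)
discord_index = int(scores.argmax())
print("most anomalous segment:", discord_index)
```

In this toy setting the planted segment sits far from every sine-shaped cluster, so it receives the highest score. On real data the result would depend on the segmentation method, the number of clusters, and the large-cluster threshold, none of which are fixed by this sketch.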

References

Bu, Y., Leung, T.W., Fu, A., Keogh, E., Pei, J., Meshkin, S.: WAT: finding Top-K discords in time series database. In: Proceedings of the 2007 SIAM International Conference on Data Mining (SDM 2007), Minneapolis, MN, USA (26–28 April 2007)
Chuah, M.C., Fu, F.: ECG anomaly detection via time series analysis. In: Thulasiraman, P., He, X., Xu, T.L., Denko, M.K., Thulasiram, R.K., Yang, L.T. (eds.) ISPA Workshops 2007. LNCS, vol. 4743, pp. 123–135. Springer, Heidelberg (2007)
Duan, L.D., Xu, L., Guo, F., Lee, J., Yan, B.: A local density based spatial clustering with noise. Inf. Syst. 32(7), 978–986 (2007)
Duan, L.D., Xu, L., Liu, Y., Lee, J.: Cluster-based outlier detection. Ann. Oper. Res. 168, 151–168 (2009)
Ester, M., Kriegel, H., Sander, J., Xu, X.: A density-based algorithm for discovering clusters in large spatial databases with noise. In: Proceedings of 2nd International Conference on Knowledge Discovery and Data Mining, pp. 226–231. AAAI Press, Portland (1996)
Guha, S., Rastogi, R., Shim, K.: CURE: an efficient clustering algorithm for large databases. In: Tiwary, A., Franklin, M. (eds.) Proceedings of 1998 ACM SIGMOD International Conference on Management of Data, Seattle, Washington, USA, pp. 73–84 (01–04 June 1998)
Gruber, C., Coduro, M., Sick, B.: Signature verification with dynamic RBF network and time series motifs. In: Proceedings of 10th International Workshop on Frontiers in Handwriting Recognition (2006)
He, Z., Xu, X., Deng, S.: Squeezer: an efficient algorithm for clustering categorical data. J. Comput. Sci. Technol. 17(5), 611–624 (2002)
He, Z., Xu, X., Deng, S.: Discovering cluster-based local outliers. Pattern Recogn. Lett. 24(9–10), 1641–1650 (2003)
Hawkins, D.: Identification of Outliers. Chapman and Hall, London (1980)
Jiang, M.F., Tseng, S.S., Su, C.M.: Two phase clustering process for outlier detection. Pattern Recogn. Lett. 22(6–7), 691–700 (2001)
Keogh, E., Chakrabarti, K., Pazzani, M., Mehrotra, S.: Dimensionality reduction for fast similarity search in large time series databases. J. Knowl. Inf. Syst. 3(3), 263–286 (2001)
Keogh, E., Lonardi, S., Chiu, B.: Finding surprising patterns in a time series database in linear time and space. In: KDD 2002: Proceedings of 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, NY, USA, pp. 550–556 (2002)
Keogh, E., Lin, J., Fu, A.: HOT SAX: efficiently finding the most unusual time series subsequence. In: Proceedings of 5th IEEE International Conference on Data Mining (ICDM), pp. 226–233 (2005)
Keogh, E., Xi, X., Wei, L., Ratanamahatana, C.A.: The UCR time series classification/clustering homepage (2013). www.cs.ucr.edu/~eamonn/time_series_data
Lin, J., Keogh, E., Lonardi, S., Chiu, B.: A symbolic representation of time series, with implications for streaming algorithms. In: Proceedings of 8th ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery (DMKD 2003), pp. 2–11 (13 June 2003)
Ma, J., Perkins, S.: Online novelty detection on temporal sequences. In: Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, NY, USA, pp. 614–618. ACM Press (2003)
Oliveira, A.L.I., Neto, F.B.L., Meira, S.R.L.: A method based on RBF-DAA neural network for improving novelty detection in time series. In: Proceedings of 17th International FLAIRS Conference, AAAI Press, Miami Beach, Florida, USA (2004)
Pratt, K.B., Fink, E.: Search for pattern in compressed time series. Int. J. Image Graph. 2(1), 89–106 (2002)
Truong, C.D., Anh, D.T.: An efficient method for discovering motifs in large time series. In: Selamat, A., Nguyen, N.T., Haron, H. (eds.) ACIIDS 2013, Part I. LNCS, vol. 7802, pp. 135–145. Springer, Heidelberg (2013)
Zhang, T., Ramakrishnan, R., Livny, M.: BIRCH: an efficient data clustering method for very large databases. In: Proceedings of the 1996 ACM SIGMOD International Conference on Management of Data, Montreal, Quebec, Canada, pp. 103–114 (04–06 June 1996)

Acknowledgement

We are grateful to Prof. Eamonn Keogh for kindly providing all the test datasets used in this work.

Author information

Authors and Affiliations

Faculty of Computer Science and Engineering, Ho Chi Minh City University of Technology, Ho Chi Minh City, Vietnam
Nguyen Huy Kha & Duong Tuan Anh

Authors

Nguyen Huy Kha
Duong Tuan Anh

Corresponding author

Correspondence to Duong Tuan Anh.

Editor information

Editors and Affiliations

Institute of Infocomm Research, Singapore, Singapore
Xiao-Li Li
Ho Chi Minh City University of Technology, Ho Chi Minh City, Vietnam
Tru Cao
School of Information Systems, Singapore Management University, Singapore, Singapore
Ee-Peng Lim
Nanjing University, Nanjing, China
Zhi-Hua Zhou
Japan Advanced Institute of Science & Technology, Nomi-shi, Ishikawa, Japan
Tu-Bao Ho
The University of Hong Kong, Hong Kong, China
David Cheung

About this paper

Cite this paper

Kha, N.H., Anh, D.T. (2015). From Cluster-Based Outlier Detection to Time Series Discord Discovery. In: Li, XL., Cao, T., Lim, EP., Zhou, ZH., Ho, TB., Cheung, D. (eds) Trends and Applications in Knowledge Discovery and Data Mining. Lecture Notes in Computer Science, vol 9441. Springer, Cham. https://doi.org/10.1007/978-3-319-25660-3_2

DOI: https://doi.org/10.1007/978-3-319-25660-3_2
Published: 26 November 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-25659-7
Online ISBN: 978-3-319-25660-3
eBook Packages: Computer Science, Computer Science (R0)