Anomaly detection using piecewise aggregate approximation in the amplitude domain

Applied Intelligence

Abstract

Anomaly detection has received much attention because of its wide range of applications. The first step in discovering anomalies is generally a data representation method that reduces dimensionality while preserving key information. Anomaly detection based on real-valued representations is attractive because such representations are convenient for numeric operations. A typical real-valued representation is the Piecewise Aggregate Approximation (PAA), which is simple and intuitive: it captures the mean values of the segments of a sequence. However, when segments have the same or similar mean values but different oscillation amplitudes, PAA cannot effectively describe a sequence composed of such segments. To address this issue, we propose a representation method called the Piecewise Aggregate Approximation in the Amplitude Domain (AD-PAA). To discover anomalies, a sequence is first partitioned into subsequences by a sliding window. In the AD-PAA method, each subsequence is then divided into equal-size subsections according to the amplitude domain. The mean values of these subsections are computed, so that the amplitude oscillation of the subsequence is captured effectively. Applying AD-PAA to every subsequence yields the AD-PAA representation of the sequence. Anomalies are determined by anomaly scores that are based on the similarities among the representation results. Experimental results on various data sets confirm that the proposed method is more accurate than the PAA-based method and the other methods compared. The proposed algorithm is also better at differentiating anomalies.
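
The limitation of PAA described in the abstract can be illustrated with a small sketch. The abstract does not give the exact AD-PAA construction or scoring rule, so everything beyond classic PAA here is an assumption made for illustration: the hypothetical function `amplitude_domain_means` reads "equal-size subsections according to the amplitude domain" as sorting the values by amplitude and splitting them into equal-size groups, and the hypothetical function `anomaly_scores` uses the Euclidean distance to the nearest non-overlapping window as a stand-in score. Neither should be taken as the authors' definition.

```python
import numpy as np


def paa(x, n_segments):
    """Classic PAA: mean of each of n_segments consecutive time segments."""
    x = np.asarray(x, dtype=float)
    return np.array([seg.mean() for seg in np.array_split(x, n_segments)])


def amplitude_domain_means(x, n_sections):
    """ASSUMED reading of AD-PAA (not taken from the paper):
    sort values by amplitude, split into equal-size groups, take group means."""
    x = np.sort(np.asarray(x, dtype=float))
    return np.array([sec.mean() for sec in np.array_split(x, n_sections)])


def anomaly_scores(series, window, n_sections):
    """ASSUMED scoring rule: for each sliding window, the Euclidean distance
    from its representation to the nearest non-overlapping window's one."""
    series = np.asarray(series, dtype=float)
    n_windows = len(series) - window + 1
    reps = np.array([
        amplitude_domain_means(series[s:s + window], n_sections)
        for s in range(n_windows)
    ])
    positions = np.arange(n_windows)
    scores = np.zeros(n_windows)
    for i in range(n_windows):
        dist = np.linalg.norm(reps - reps[i], axis=1)
        non_overlapping = np.abs(positions - i) >= window
        if non_overlapping.any():
            scores[i] = dist[non_overlapping].min()
    return scores


# Two segments with the same mean but different oscillation amplitude.
flat = [0.0, 0.0, 0.0, 0.0]
oscillating = [-1.0, 1.0, -1.0, 1.0]

print("PAA:", paa(flat, 2), paa(oscillating, 2))
print("Amplitude domain:",
      amplitude_domain_means(flat, 2),
      amplitude_domain_means(oscillating, 2))

# A periodic series whose mean never changes, with one high-amplitude burst.
series = np.tile([-1.0, 1.0], 20)
series[20:24] = [-3.0, 3.0, -3.0, 3.0]
scores = anomaly_scores(series, window=4, n_sections=2)
print("Highest-scoring window starts at index", int(np.argmax(scores)))
```

In this toy case the two PAA vectors coincide, because both segments average to zero in every time segment, while the amplitude-domain means differ. That difference is the property the abstract attributes to AD-PAA.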

Acknowledgements

The authors extend their appreciation to the International Scientific Partnership Program (ISPP) at King Saud University for funding this research work through ISPP#0799.

Author information

Corresponding author

Correspondence to Huorong Ren.

About this article

Cite this article

Ren, H., Liao, X., Li, Z. et al. Anomaly detection using piecewise aggregate approximation in the amplitude domain. Appl Intell 48, 1097–1110 (2018). https://doi.org/10.1007/s10489-017-1017-x

  • DOI: https://doi.org/10.1007/s10489-017-1017-x
