Abstract
Datasets may contain small sets of data objects whose characteristics deviate from those of the majority of objects in the dataset. These data objects, which are not noise, may carry valuable information and are called outliers. Outlier detection is a research topic in many fields, such as detecting malware in cyber security, finding fraudulent financial transactions, identifying defects in industrial products, and detecting abnormalities in health data. Researchers have developed several application-specific methods for detecting outliers, along with a few generic ones. These methods can be grouped into unsupervised, supervised and semi-supervised methods, depending on the availability of class labels. In this paper, we present the performance of three outlier detection algorithms on real-world datasets: one-class SVM, elliptic envelope and local outlier factor. To improve performance, the three algorithms were combined into an ensemble using a voting mechanism. The influence of dimensionality reduction on the proposed ensemble method was also studied. Experiments on publicly available datasets show that the proposed technique outperforms the individual outlier detectors.
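The voting idea described in the abstract can be illustrated with a minimal sketch. It assumes scikit-learn's `OneClassSVM`, `EllipticEnvelope` and `LocalOutlierFactor` as the three detectors and a simple majority rule; the helper names `majority_vote` and `ensemble_outliers`, the `contamination` value and the synthetic data are hypothetical choices for illustration, not the configuration used in the chapter.

```python
import numpy as np
from sklearn.covariance import EllipticEnvelope
from sklearn.neighbors import LocalOutlierFactor
from sklearn.svm import OneClassSVM


def majority_vote(votes):
    """Combine an (n_detectors, n_samples) array of +1/-1 votes by majority.

    With an odd number of detectors the column sum is never zero,
    so its sign is the majority label.
    """
    return np.sign(np.asarray(votes).sum(axis=0)).astype(int)


def ensemble_outliers(X, contamination=0.05):
    """Label each row +1 (inlier) or -1 (outlier) by majority vote.

    The detector settings below are illustrative assumptions.
    """
    detectors = [
        OneClassSVM(nu=contamination, gamma="scale"),
        EllipticEnvelope(contamination=contamination, random_state=0),
        LocalOutlierFactor(n_neighbors=20, contamination=contamination),
    ]
    # fit_predict returns +1 for inliers and -1 for outliers in all three.
    votes = [detector.fit_predict(X) for detector in detectors]
    return majority_vote(votes)


# Synthetic example: a Gaussian cloud plus a few distant points.
rng = np.random.RandomState(0)
X = np.vstack([
    rng.normal(0.0, 1.0, size=(200, 2)),
    rng.uniform(6.0, 8.0, size=(10, 2)),
])
labels = ensemble_outliers(X)
```

A dimensionality reduction step (for example PCA) could be applied to `X` before calling `ensemble_outliers` to study its effect; the reduction method used in the chapter is not stated in the abstract.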
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Thomas, R., Judith, J.E. (2020). Voting-Based Ensemble of Unsupervised Outlier Detectors. In: Jayakumari, J., Karagiannidis, G., Ma, M., Hossain, S. (eds) Advances in Communication Systems and Networks. Lecture Notes in Electrical Engineering, vol 656. Springer, Singapore. https://doi.org/10.1007/978-981-15-3992-3_42
DOI: https://doi.org/10.1007/978-981-15-3992-3_42
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-3991-6
Online ISBN: 978-981-15-3992-3
eBook Packages: Engineering, Engineering (R0)