Efficient Algorithms for Local Density Based Anomaly Detection

Sinha, Ankita; Jana, Prasanta K.

doi:10.1007/978-3-319-72344-0_30

Ankita Sinha & Prasanta K. Jana

Conference paper
First Online: 29 November 2017

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10722)

Abstract

Anomaly detection is a crucial problem in data mining. However, prevailing anomaly detection algorithms are serial in nature and therefore cannot handle large volumes of data. In this paper, we propose two parallel local density based algorithms, namely MapReduce based Local Outlier Factor (MRLOF) and Spark based Local Outlier Factor (SLOF). Each of the proposed algorithms has a time complexity of \( O(N) \), an improvement over the Simplified LOF (Local Outlier Factor), which has a time complexity of \( O(N^{2}) \), where \( N \) is the data size. We conducted extensive experiments with MRLOF and SLOF on various real-life and synthetic datasets. The proposed algorithms are shown to outperform the serial Simplified LOF.
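
To make the quadratic baseline concrete, the following is a minimal serial sketch of a Simplified LOF computation. It is not the authors' MRLOF or SLOF. It assumes the common formulation in which the local density of a point is the inverse of its distance to its k-th nearest neighbor, and the score is the average ratio of the neighbors' densities to the point's own density. The function name `simplified_lof`, the parameter `k`, and the sample points are illustrative assumptions.

```python
import math


def simplified_lof(points, k):
    """Brute-force Simplified LOF sketch (assumed formulation, see lead-in).

    Every point is compared with every other point, so the number of
    distance computations grows as N^2.
    Assumes the k-th nearest neighbor distance of every point is non-zero.
    """
    n = len(points)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < len(points)")

    neighbors = []  # indices of the k nearest neighbors of each point
    k_dist = []     # distance from each point to its k-th nearest neighbor
    for i, p in enumerate(points):
        # All pairwise distances from p: this is the quadratic part.
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        nearest = dists[:k]
        neighbors.append([j for _, j in nearest])
        k_dist.append(nearest[-1][0])

    scores = []
    for i in range(n):
        # density = 1 / k_dist, so density(o) / density(p) = k_dist(p) / k_dist(o)
        ratios = [k_dist[i] / k_dist[j] for j in neighbors[i]]
        scores.append(sum(ratios) / k)
    return scores


# Hypothetical data: four clustered points and one distant point.
sample = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
scores = simplified_lof(sample, k=2)
```

Under this formulation, points inside a uniform cluster score close to 1, while an isolated point scores well above 1. The nested comparison of every point with every other point is the cost that a parallel formulation would aim to avoid.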

Acknowledgements

The authors would like to thank the Council of Scientific and Industrial Research (CSIR), New Delhi, India, for the financial support of this research work (File No. 09/085(0111)/2014.EMR.1).

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Indian Institute of Technology (ISM), Dhanbad, India
Ankita Sinha & Prasanta K. Jana

Authors

Ankita Sinha
Prasanta K. Jana

Corresponding author

Correspondence to Ankita Sinha.

Editor information

Editors and Affiliations

University of Hyderabad, Hyderabad, India
Atul Negi
University of Cincinnati, Cincinnati, Ohio, USA
Raj Bhatnagar
IBM Thomas J. Watson Research Center, Yorktown Heights, New York, USA
Laxmi Parida

About this paper

Cite this paper

Sinha, A., Jana, P.K. (2018). Efficient Algorithms for Local Density Based Anomaly Detection. In: Negi, A., Bhatnagar, R., Parida, L. (eds) Distributed Computing and Internet Technology. ICDCIT 2018. Lecture Notes in Computer Science, vol 10722. Springer, Cham. https://doi.org/10.1007/978-3-319-72344-0_30

DOI: https://doi.org/10.1007/978-3-319-72344-0_30
Published: 29 November 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-72343-3
Online ISBN: 978-3-319-72344-0
eBook Packages: Computer Science, Computer Science (R0)
