Spark Based ANFIS Approach for Anomaly Detection Using Big Data

Santosh, Thakur; Ramesh, Dharavath

doi:10.1007/978-981-10-8660-1_34

Part of the book series: Communications in Computer and Information Science (CCIS, volume 828)

Included in the following conference series:

International Conference on Next Generation Computing Technologies

Abstract

Business intelligence is one of the applications that can benefit from techniques and methodologies for handling anomalies in unlabelled big data. To address this issue, we present a model that identifies anomalies in big data within a Spark environment. We use Spark, an open-source software framework, to analyze the big data. Spark provides powerful APIs for machine learning and soft computing algorithms. To detect anomalous instances in big data, Apache Spark is installed on top of Hadoop, and an Adaptive Neuro-Fuzzy Inference System (ANFIS) is implemented in Spark. The Hadoop Distributed File System (HDFS) serves as the data source, and its data is fetched into Spark as resilient distributed datasets (RDDs). Experimental results show that the proposed method performs well in a fault-tolerant manner and records anomalous instances accurately in the distributed environment.
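The abstract does not give the ANFIS configuration, so the following is only a minimal sketch of the general technique: the forward pass of a first-order Sugeno-type ANFIS used as a per-record anomaly scorer. The Gaussian membership functions, the grid-style rule base, the 0.5 threshold, and all function and parameter names (`gaussian`, `anfis_forward`, `is_anomaly`, `mfs`, `consequents`) are assumptions made for illustration and are not taken from the paper.

```python
import math
from itertools import product


def gaussian(x, center, sigma):
    """Layer 1: Gaussian membership degree of x in one fuzzy set (assumed shape)."""
    return math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))


def anfis_forward(x, mfs, consequents):
    """Forward pass of a first-order Sugeno ANFIS (illustrative sketch).

    x           : list of feature values for one record
    mfs         : mfs[i] is a list of (center, sigma) pairs for feature i
    consequents : one list [p_1, ..., p_n, r] per rule; rules enumerate every
                  combination of one fuzzy set per feature, with the last
                  feature varying fastest
    """
    # Layer 1: membership degree of each feature in each of its fuzzy sets.
    degrees = [[gaussian(xi, c, s) for (c, s) in feature_mfs]
               for xi, feature_mfs in zip(x, mfs)]
    # Layer 2: firing strength of each rule (product T-norm).
    strengths = [math.prod(combo) for combo in product(*degrees)]
    # Layer 3: normalise the firing strengths.
    total = sum(strengths)
    if total == 0.0:
        return 0.0  # no rule fires (numerical underflow); assumed fallback
    weights = [w / total for w in strengths]
    # Layer 4: linear consequent f_k = p_1*x_1 + ... + p_n*x_n + r per rule.
    rule_outputs = [sum(p * xi for p, xi in zip(coef[:-1], x)) + coef[-1]
                    for coef in consequents]
    # Layer 5: weighted sum of rule outputs.
    return sum(w * f for w, f in zip(weights, rule_outputs))


def is_anomaly(x, mfs, consequents, threshold=0.5):
    """Flag a record when its ANFIS score exceeds a hypothetical threshold."""
    return anfis_forward(x, mfs, consequents) > threshold
```

Because the scorer is a pure function of a single record, in a Spark job it could be applied to each element of an RDD (for example one loaded from HDFS with `SparkContext.textFile` and parsed into feature lists) through `RDD.map`. Training of the membership and consequent parameters is outside the scope of this sketch.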

Acknowledgments

This work is partially supported by Indian Institute of Technology (ISM), Govt. of India. The authors wish to express their gratitude and thanks to the Department of Computer Science & Engineering, Indian Institute of Technology (ISM), Dhanbad, India for providing their support in arranging necessary computing facilities.

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Indian Institute of Technology (ISM), Dhanbad, 826004, Jharkhand, India
Thakur Santosh & Dharavath Ramesh

Corresponding author

Correspondence to Dharavath Ramesh.

Editor information

Editors and Affiliations

Indian Institute of Technology Patna, Patna, Bihar, India
Pushpak Bhattacharyya
University of Petroleum and Energy Studies, Dehradun, India
Hanumat G. Sastry
University of Petroleum and Energy Studies, Dehradun, India
Venkatadri Marriboyina
University of Petroleum and Energy Studies, Dehradun, India
Rashmi Sharma

Cite this paper

Santosh, T., Ramesh, D. (2018). Spark Based ANFIS Approach for Anomaly Detection Using Big Data. In: Bhattacharyya, P., Sastry, H., Marriboyina, V., Sharma, R. (eds) Smart and Innovative Trends in Next Generation Computing Technologies. NGCT 2017. Communications in Computer and Information Science, vol 828. Springer, Singapore. https://doi.org/10.1007/978-981-10-8660-1_34

DOI: https://doi.org/10.1007/978-981-10-8660-1_34
Published: 09 June 2018
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-8659-5
Online ISBN: 978-981-10-8660-1
eBook Packages: Computer Science, Computer Science (R0)