Comparision of Classifiers Accuracies from FAVF and NOFI for Categorical Data

Lakshmi Sreenivasa Reddy, D.; Raveendra Babu, B.; Govardhan, A.; Kalpana, A.; Krishna Murthy, Mudimbi

doi:10.1007/978-81-322-2250-7_12


Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 339)


Abstract

Outlier analysis is an important task in data science, and finding outliers in categorical data is particularly difficult. To build an accurate classifier, the right number of outliers must be eliminated from the data. If too few outliers are found, the remaining ones stay in the data as obstacles, and an accurate classifier cannot be built on it. Similarly, if too many records are flagged and eliminated, some genuine records are lost, and again an accurate classifier cannot be built. Because the data are categorical, the most infrequent records are treated as outliers in classification modeling: these infrequent records disturb the data when the classifier is modeled, whereas highly frequent records are the most useful. The open problem is how many outliers should be found. This paper presents a new approach, Normally distributed Outlier Factor by Infrequency (NOFI), to improve classifier accuracy. Many effective approaches exist for detecting outliers in numerical data, but few methods exist for categorical datasets. The new method is evaluated experimentally on the bank dataset taken from the UCI ML Repository. The approach does not require k, the number of outliers, as an input: NOFI determines the number of outliers automatically using the infrequency of all possible combinations of the attribute values in each record.
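The abstract does not give the NOFI formulas, so the following is only a rough sketch of the general idea it describes, not the authors' algorithm. It assumes that a record's score is the average relative frequency of its attribute-value combinations (up to a hypothetical `max_size`), and that a record is flagged when its score falls more than `z` standard deviations below the mean score, which is one possible reading of "normally distributed". The function names, `max_size`, and `z` are all illustrative assumptions.

```python
from collections import Counter
from itertools import combinations
from statistics import mean, pstdev


def infrequency_scores(records, max_size=2):
    """Score each record by the average relative frequency of its
    attribute-value combinations (assumed scoring rule, not from the paper).
    Low scores mean the record is built from rare combinations."""
    counts = Counter()
    for rec in records:
        # Pair each value with its attribute index so equal values in
        # different attributes are not confused.
        items = list(enumerate(rec))
        for size in range(1, max_size + 1):
            counts.update(combinations(items, size))

    n = len(records)
    scores = []
    for rec in records:
        items = list(enumerate(rec))
        combos = [c for size in range(1, max_size + 1)
                  for c in combinations(items, size)]
        scores.append(sum(counts[c] for c in combos) / (len(combos) * n))
    return scores


def flag_outliers(scores, z=2.0):
    """Return indices of records whose score lies more than z standard
    deviations below the mean (assumed cutoff; no k is supplied)."""
    mu = mean(scores)
    sigma = pstdev(scores)
    if sigma == 0:
        return []  # all records equally frequent: nothing to flag
    cutoff = mu - z * sigma
    return [i for i, s in enumerate(scores) if s < cutoff]


if __name__ == "__main__":
    # Toy categorical data: 20 identical records and one rare record.
    data = [("married", "yes", "cellular")] * 20 + [("single", "no", "unknown")]
    print(flag_outliers(infrequency_scores(data)))
```

The point of the sketch is that the number of flagged records follows from the score distribution rather than from a user-supplied k. Enumerating all combinations grows exponentially with the number of attributes, which is why the sketch caps the combination size.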


References

Koufakou, A., Georgiopoulos, M.: A fast outlier detection strategy for distributed high-dimensional data sets with mixed attributes. Data Min. Knowl. Disc 20, 259–289 (2010)
Lakshmi Sreenivasa Reddy, D., Raveendra Babu, B., Govardhan, A.: Outlier analysis of categorical data using NAVF. Informatica Economica 17(1), 5–13 (2013)
Lakshmi Sreenivasa Reddy, D., Raveendra Babu, B.: Outlier analysis of categorical data using FuzzyAVF. Presented at IEEE International Conference ICCPCT-2013, pp. 1259–1263
He, Z., Xu, X., Huang, J., Deng, S.: FP-Outlier: frequent pattern based outlier detection. Comput. Sci. Inf. Syst. (ComSIS’05) 2(1), 103–118 (2005)
Agrawal, R., Srikant, R.: Fast algorithms for mining association rules in large databases. In: Proceedings International Conference on Very Large Data Bases, pp. 487–499 (1994)
He, Z., Deng, S., Xu, X.: A fast greedy algorithm for outlier mining. In: Proceedings of PAKDD (2006)
Lakshmi Sreenivasa Reddy, D., Raveendra Babu, B.: Efficient model to find outliers in categorical data using outlier factor by infrequency. Presented at IEEE International Conference ICCPCT-2014, pp. 1324–1328
Frank, A., Asuncion, A.: UCI Machine Learning Repository. University of California, School of Information and Computer Science, Irvine, CA. http://archive.ics.uci.edu/ml (2010)


Author information

Authors and Affiliations

Department of CSE, RISE Gandhi Group of Institutions, Ongole, India
D. Lakshmi Sreenivasa Reddy
Department of CSE, VNRVJIET, Hyderabad, India
B. Raveendra Babu
SIT, JNTUH, Hyderabad, India
A. Govardhan
Department of MCA, RISE Gandhi Group of Institutions, Ongole, India
A. Kalpana & Mudimbi Krishna Murthy

Corresponding author

Correspondence to D. Lakshmi Sreenivasa Reddy.

Editor information

Editors and Affiliations

University of Kalyani, Kalyani, West Bengal, India
J. K. Mandal
Department of Computer Science and Engineering, Anil Neerukonda Institute of Technology and Sciences, Vishakapatnam, India
Suresh Chandra Satapathy
Dean, Faculty of Engineering, Technology, University of Kalyani, Kalyani, West Bengal, India
Manas Kumar Sanyal
Engineering and Technological Studies, University of Kalyani, Kalyani, West Bengal, India
Partha Pratim Sarkar
Department Computer Science & Engineering, University of Kalyani, Kalyani, India
Anirban Mukhopadhyay

About this paper

Cite this paper

Lakshmi Sreenivasa Reddy, D., Raveendra Babu, B., Govardhan, A., Kalpana, A., Krishna Murthy, M. (2015). Comparision of Classifiers Accuracies from FAVF and NOFI for Categorical Data. In: Mandal, J., Satapathy, S., Kumar Sanyal, M., Sarkar, P., Mukhopadhyay, A. (eds) Information Systems Design and Intelligent Applications. Advances in Intelligent Systems and Computing, vol 339. Springer, New Delhi. https://doi.org/10.1007/978-81-322-2250-7_12

DOI: https://doi.org/10.1007/978-81-322-2250-7_12
Published: 21 January 2015
Publisher Name: Springer, New Delhi
Print ISBN: 978-81-322-2249-1
Online ISBN: 978-81-322-2250-7
eBook Packages: Engineering, Engineering (R0)
