Abstract
In this paper, a new feature selection method based on the generalized data field (FR-GDF) is proposed. The goal of feature selection is to select useful features while excluding irrelevant ones from a given feature set. Measuring the "distance" between data points is important in existing feature selection approaches. To measure this distance, FR-GDF adopts the potential value of the data field, and the information entropy of the potential values is used to measure the inter-class and intra-class distances. The method eliminates unimportant or noisy features from the original feature set and extracts an optimal feature subset. Experiments show that the FR-GDF algorithm performs well and is independent of the specific classification algorithm.
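The pipeline the abstract describes can be sketched roughly as follows. The paper's exact formulas are not reproduced in this abstract, so everything below is an illustrative assumption, not FR-GDF itself: the Gaussian-form potential is a common choice in data-field theory, `potential_entropy` is the standard uncertainty measure of a data field, and the within-class/overall potential ratio used for ranking is a simplified stand-in for the paper's entropy-based inter-/intra-class criterion.

```python
import numpy as np

def potential(values, sigma=1.0):
    """Data-field potential of each point: the sum of Gaussian-form
    contributions from every point in `values` (including itself)."""
    d = values[:, None] - values[None, :]
    return np.exp(-(d / sigma) ** 2).sum(axis=1)

def potential_entropy(pot):
    """Shannon entropy of the normalized potential distribution --
    the usual uncertainty measure of a data field (max log(n) when
    the field is uniform)."""
    p = pot / pot.sum()
    return float(-(p * np.log(p)).sum())

def rank_features(X, y, sigma=1.0):
    """Filter-style ranking: score each feature by how much of each
    point's potential comes from its own class. A score near 1.0 means
    the classes are compact and well separated along that feature.
    (Simplified criterion, assumed for illustration.)"""
    scores = []
    for j in range(X.shape[1]):
        col = X[:, j]
        overall = potential(col, sigma).mean()
        within = np.mean([potential(col[y == c], sigma).mean()
                          for c in np.unique(y)])
        scores.append(within / overall)
    # Best (most class-separating) feature first.
    return np.argsort(scores)[::-1]
```

As a filter method, the ranking depends only on the data and labels, not on any downstream classifier, which matches the abstract's claim that the approach is independent of the specific classification algorithm.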
Copyright information
© 2014 Springer International Publishing Switzerland
Cite this paper
Zhao, L., Wang, S., Lin, Y. (2014). A New Filter Approach Based on Generalized Data Field. In: Luo, X., Yu, J.X., Li, Z. (eds) Advanced Data Mining and Applications. ADMA 2014. Lecture Notes in Computer Science, vol. 8933. Springer, Cham. https://doi.org/10.1007/978-3-319-14717-8_25
DOI: https://doi.org/10.1007/978-3-319-14717-8_25
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-14716-1
Online ISBN: 978-3-319-14717-8