Abstract
This article presents a multiple-classifier approach to recognizing obstructive nephropathy, a disease that poses a significant threat to newborns. The data exhibit the problem known as high dimensionality with small sample size. In the presented approach, the feature space is divided among a number of classifiers to balance the ratio between the number of objects and the number of features. Feature selection methods are applied to split the feature space optimally for the classifier ensemble. The optimal subspace size and the choice of classifiers for the ensemble are thoroughly tested. Comprehensive performance tests are presented to highlight the most efficient parameter tuning for the presented approach, which is then compared with classical solutions in this field.
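The idea of dividing a feature space among committee members can be illustrated with a minimal sketch. The abstract does not name the feature selection method, the base classifier, or the combination rule, so everything below is an assumption made for illustration only: a simple mean-difference filter score, a round-robin split of the ranked features into disjoint subspaces, a nearest-centroid base classifier, and majority voting. The data are synthetic and all function and class names are hypothetical.

```python
import random
from statistics import mean, pstdev


def filter_scores(X, y):
    """Score each feature by |mean0 - mean1| / (std0 + std1) (assumed filter criterion)."""
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, label in zip(X, y) if label == 0]
        b = [row[j] for row, label in zip(X, y) if label == 1]
        spread = pstdev(a) + pstdev(b) + 1e-12  # guard against division by zero
        scores.append(abs(mean(a) - mean(b)) / spread)
    return scores


def split_feature_space(scores, n_subspaces):
    """Deal features, best first, round-robin into disjoint subspaces."""
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return [ranked[k::n_subspaces] for k in range(n_subspaces)]


class NearestCentroid:
    """Tiny base classifier that only sees one feature subspace."""

    def __init__(self, features):
        self.features = features
        self.centroids = {}

    def fit(self, X, y):
        for label in sorted(set(y)):
            rows = [row for row, l in zip(X, y) if l == label]
            self.centroids[label] = [mean([r[j] for r in rows]) for j in self.features]
        return self

    def predict_one(self, row):
        def squared_distance(label):
            centroid = self.centroids[label]
            return sum((row[j] - c) ** 2 for j, c in zip(self.features, centroid))

        return min(self.centroids, key=squared_distance)


def committee_predict(members, row):
    """Majority vote over the committee; ties go to the smallest label."""
    votes = [m.predict_one(row) for m in members]
    return max(sorted(set(votes)), key=votes.count)


# Synthetic stand-in for a small-sample, high-dimensional problem:
# 20 objects, 60 features, of which only the first 6 carry class signal.
rng = random.Random(0)
n_samples, n_features = 20, 60
y = [i % 2 for i in range(n_samples)]
X = [[rng.gauss(2.0 * label if j < 6 else 0.0, 1.0) for j in range(n_features)]
     for label in y]

subspaces = split_feature_space(filter_scores(X, y), n_subspaces=3)
members = [NearestCentroid(features).fit(X, y) for features in subspaces]
predictions = [committee_predict(members, row) for row in X]
```

With three subspaces, each committee member works with 20 features instead of 60, which is the kind of rebalancing between object count and feature count that the abstract describes. Dealing the ranked features round-robin is one way to give every member a share of the strongest features; the chapter itself may use a different splitting rule.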
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Krawczyk, B. (2011). Classifier Committee Based on Feature Selection Method for Obstructive Nephropathy Diagnosis. In: Katarzyniak, R., Chiu, TF., Hong, CF., Nguyen, N.T. (eds) Semantic Methods for Knowledge Management and Communication. Studies in Computational Intelligence, vol 381. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23418-7_11
DOI: https://doi.org/10.1007/978-3-642-23418-7_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23417-0
Online ISBN: 978-3-642-23418-7
eBook Packages: Engineering, Engineering (R0)