Abstract
We introduce the notion of a bireduct, which is an extension of the notion of a reduct developed within the theory of rough sets. For a decision system \(\mathbb{A}=(U,A\cup\{d\})\), a bireduct is a pair (B,X), where B ⊆ A is a subset of attributes that discerns all pairs of objects in X ⊆ U with different values of the decision attribute d, and where B and X cannot be, respectively, reduced and extended without losing this property. We investigate the ability of ensembles of bireducts (B,X) characterized by significant diversity with respect to both B and X to represent knowledge hidden in data and to serve as the means for learning robust classification systems. We show fundamental properties of bireducts and provide algorithms aimed at searching for ensembles of bireducts in data. We also report results obtained for some benchmark data sets.
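The definition above can be made concrete with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' algorithm: the function names `discerns` and `is_bireduct`, the representation of the decision system as a list of attribute dictionaries plus a list of decision values, and the toy data are all hypothetical. The sketch relies on the fact that discernibility is preserved when attributes are added to B or objects are removed from X, so it is enough to test the removal of single attributes and the addition of single objects.

```python
from itertools import combinations

def discerns(rows, decisions, B, X):
    """Return True if attributes B tell apart every pair of objects in X
    that have different decision values."""
    for i, j in combinations(sorted(X), 2):
        if decisions[i] != decisions[j] and all(rows[i][a] == rows[j][a] for a in B):
            return False
    return True

def is_bireduct(rows, decisions, B, X):
    """Check the bireduct conditions for the pair (B, X) by brute force."""
    B, X = set(B), set(X)
    if not discerns(rows, decisions, B, X):
        return False
    # B cannot be reduced: dropping any single attribute must break discernibility.
    if any(discerns(rows, decisions, B - {a}, X) for a in B):
        return False
    # X cannot be extended: adding any single outside object must break it.
    outside = set(range(len(rows))) - X
    return not any(discerns(rows, decisions, B, X | {u}) for u in outside)

# Hypothetical toy decision system: objects 0 and 3 agree on all
# attributes but differ on the decision, so no bireduct contains both.
rows = [
    {"a": 0, "b": 0},
    {"a": 0, "b": 1},
    {"a": 1, "b": 1},
    {"a": 0, "b": 0},
]
decisions = [0, 1, 1, 1]

print(is_bireduct(rows, decisions, {"b"}, {0, 1, 2}))       # True
print(is_bireduct(rows, decisions, {"a", "b"}, {0, 1, 2}))  # False: "a" is redundant
print(is_bireduct(rows, decisions, set(), {1, 2, 3}))       # True: one decision class
```

In this toy system, ({b}, {0, 1, 2}) and (∅, {1, 2, 3}) are both bireducts: each covers a different part of U with a different attribute subset, which is the kind of diversity in both B and X that the abstract refers to.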
The authors were supported by the grant N N516 077837 from the Ministry of Science and Higher Education of the Republic of Poland and by the National Centre for Research and Development (NCBiR) under the grant SP/I/1/77065/10 by the Strategic scientific research and experimental development program: “Interdisciplinary System for Interactive Scientific and Scientific-Technical Information”.
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ślęzak, D., Janusz, A. (2011). Ensembles of Bireducts: Towards Robust Classification and Simple Representation. In: Kim, Th., et al. Future Generation Information Technology. FGIT 2011. Lecture Notes in Computer Science, vol 7105. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-27142-7_9
DOI: https://doi.org/10.1007/978-3-642-27142-7_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-27141-0
Online ISBN: 978-3-642-27142-7
eBook Packages: Computer Science (R0)