Abstract
Dimension reduction is one of the main data reduction approaches, used to reduce storage and processing time while maintaining the integrity of the original data. A wide range of dimension reduction approaches build on classical methods such as PCA and Bayesian methods, and on machine learning methods such as clustering and feature selection. However, many of these approaches do not consider incomplete information systems, in which some attribute values are missing. Only a few studies have addressed incomplete information systems, specifically attribute selection, because of the problem's complexity. The most popular approaches rely on probability theory to replace missing values with the most common values, or remove objects with missing values from the information system. However, they require the probability distribution of the data to be known in advance. To overcome these issues, we propose a new approach based on conditional entropy to reduce dimensionality. The results show that the proposed approach achieves better data reduction with higher accuracy for objects and dimensionality reduction in incomplete information systems.
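To make the idea of entropy-driven attribute selection concrete, the following is a minimal sketch and not the algorithm proposed in the paper, whose definitions the abstract does not give. It assumes a tolerance relation in which a missing value (marked `"*"`, an assumed convention) matches any value, and one form of conditional entropy found in the rough-set literature: the average of -log2(|T_B(x) ∩ [x]_D| / |T_B(x)|) over all objects x, where T_B(x) is the tolerance class of x under attribute set B and [x]_D is its decision class. The function names, the greedy removal order, and the toy data are all hypothetical.

```python
from math import log2

MISSING = "*"  # assumed marker for a missing attribute value

def tolerant(x, y, attrs):
    """True if x and y agree on every attribute, treating MISSING as a wildcard."""
    return all(x[a] == y[a] or x[a] == MISSING or y[a] == MISSING for a in attrs)

def conditional_entropy(rows, decisions, attrs):
    """Assumed form of H(D | attrs): mean of -log2 of the share of each
    object's tolerance class that carries the same decision as the object."""
    n = len(rows)
    total = 0.0
    for i in range(n):
        # Tolerance class of object i (always contains i itself).
        cls = [j for j in range(n) if tolerant(rows[i], rows[j], attrs)]
        same = sum(1 for j in cls if decisions[j] == decisions[i])
        total -= log2(same / len(cls))
    return total / n

def greedy_reduct(rows, decisions, attrs, eps=1e-12):
    """Drop attributes one at a time while the entropy does not increase."""
    kept = list(attrs)
    base = conditional_entropy(rows, decisions, kept)
    for a in list(attrs):
        trial = [b for b in kept if b != a]
        if trial and conditional_entropy(rows, decisions, trial) <= base + eps:
            kept = trial
    return kept

# Hypothetical toy decision table with two missing values.
rows = [
    {"a": 1, "b": 0, "c": 1},
    {"a": 1, "b": MISSING, "c": 0},
    {"a": 0, "b": 1, "c": 1},
    {"a": 0, "b": 1, "c": MISSING},
]
decisions = ["yes", "yes", "no", "no"]

reduct = greedy_reduct(rows, decisions, ["a", "b", "c"])  # → ['a']
```

In this toy table attribute `a` alone separates the two decision classes, so the sketch keeps only `a`. No object is deleted and no missing value is imputed, which is the contrast with the probability-based strategies mentioned above.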
Acknowledgment
The research was supported by the Ministry of Higher Education through the Fundamental Research Grant Scheme (FRGS), vote number 1643.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Deris, M.M., Senan, N., Abdullah, Z., Mamat, R., Handaga, B. (2019). Dimensional Reduction Using Conditional Entropy for Incomplete Information Systems. In: Malyshkin, V. (eds) Parallel Computing Technologies. PaCT 2019. Lecture Notes in Computer Science, vol 11657. Springer, Cham. https://doi.org/10.1007/978-3-030-25636-4_21
DOI: https://doi.org/10.1007/978-3-030-25636-4_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-25635-7
Online ISBN: 978-3-030-25636-4
eBook Packages: Computer Science, Computer Science (R0)