Abstract
Data processing and knowledge discovery for massive data have long been active topics in data mining, and with the arrival of the cloud computing era, mining massive data has become a prominent research focus. This paper studies attribute reduction for massive data based on rough set theory. The MapReduce parallel programming model is introduced and combined with a rough set attribute reduction algorithm, yielding a parallel attribute reduction algorithm based on MapReduce. Experimental results show that the proposed method is more efficient for massive data mining than the traditional method, and that it is an effective method for data mining on cloud computing platforms.
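As a rough illustration of how rough set attribute reduction can be phrased in map and reduce steps, the sketch below counts the positive region of a decision table and uses it in a greedy search. It is not the algorithm proposed in the paper, which the abstract does not detail: the function names, the in-process stand-in for the shuffle phase, the greedy strategy, and the sample table are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical decision table: three condition attributes, then the decision.
sample = [
    (0, 0, 0, "n"),
    (0, 1, 0, "y"),
    (1, 0, 1, "y"),
    (1, 1, 1, "y"),
    (0, 0, 1, "n"),
]

def map_record(record, attrs):
    """Map step: key = values of the chosen condition attributes, value = decision."""
    *conditions, decision = record
    return tuple(conditions[a] for a in attrs), decision

def reduce_class(decisions):
    """Reduce step: an equivalence class counts only if all its decisions agree."""
    return len(decisions) if len(set(decisions)) == 1 else 0

def positive_region_size(records, attrs):
    """Number of records whose equivalence class under attrs is consistent."""
    groups = defaultdict(list)  # in-process stand-in for the shuffle phase
    for record in records:
        key, decision = map_record(record, attrs)
        groups[key].append(decision)
    return sum(reduce_class(d) for d in groups.values())

def greedy_reduct(records, n_attrs):
    """Add the attribute that enlarges the positive region most, until it
    matches the positive region of the full attribute set."""
    target = positive_region_size(records, range(n_attrs))
    chosen = []
    while positive_region_size(records, chosen) < target:
        best = max(
            (a for a in range(n_attrs) if a not in chosen),
            key=lambda a: positive_region_size(records, chosen + [a]),
        )
        chosen.append(best)
    return sorted(chosen)
```

On a real cluster the grouping by key would be done by the framework's shuffle phase, and each call to positive_region_size would correspond to one MapReduce job over the partitioned data. A greedy search of this kind returns an attribute subset that preserves the positive region, but not necessarily a minimal one.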
This paper is partially supported by the National Natural Science Foundation of China under Grants No. 60773113 and No. 60573068, the Natural Science Foundation of Chongqing under Grants No. 2008BA2017 and No. 2008BA2041, and the CQUPT-ICST Foundation under Grants No. JK-Y-2010002 and CY-CNCL-2009-02.
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yang, Y., Chen, Z., Liang, Z., Wang, G. (2010). Attribute Reduction for Massive Data Based on Rough Set Theory and MapReduce. In: Yu, J., Greco, S., Lingras, P., Wang, G., Skowron, A. (eds) Rough Set and Knowledge Technology. RSKT 2010. Lecture Notes in Computer Science, vol 6401. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16248-0_91
DOI: https://doi.org/10.1007/978-3-642-16248-0_91
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16247-3
Online ISBN: 978-3-642-16248-0
eBook Packages: Computer Science (R0)