Attribute Reduction for Massive Data Based on Rough Set Theory and MapReduce

Yang, Yong; Chen, Zhengrong; Liang, Zhu; Wang, Guoyin

doi:10.1007/978-3-642-16248-0_91

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6401)

Included in the following conference series:

International Conference on Rough Sets and Knowledge Technology

Abstract

Data processing and knowledge discovery for massive data have long been active topics in data mining, and with the arrival of the cloud computing era, mining massive data is becoming a prominent research topic. In this paper, attribute reduction for massive data based on rough set theory is studied. The MapReduce parallel programming model is introduced and combined with a rough set attribute reduction algorithm, and a parallel attribute reduction algorithm based on MapReduce is proposed. Experimental results show that the proposed method is more efficient for massive data mining than the traditional method, and that it is an effective method for data mining on cloud computing platforms.
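
The abstract does not spell out the algorithm, so the following is only an illustrative sketch of how a rough set attribute reduction step might be phrased in map/reduce terms; it is not the authors' method. It assumes a decision table whose last column is the decision attribute, uses the size of the positive region as the quality measure, and picks attributes by greedy forward selection. The names `map_phase`, `reduce_phase`, `positive_region_size`, `greedy_reduct` and the `ROWS` table are all hypothetical, and the two phases run in a single Python process rather than on a Hadoop cluster.

```python
from collections import defaultdict

# Hypothetical toy decision table: three condition attributes, then the decision.
ROWS = [
    (0, 0, 0, "n"),
    (0, 1, 0, "y"),
    (1, 0, 0, "y"),
    (1, 1, 0, "y"),
]

def map_phase(rows, attrs):
    """Map: emit (values of the selected condition attributes, decision) per record."""
    for *cond, dec in rows:
        yield tuple(cond[a] for a in attrs), dec

def reduce_phase(pairs):
    """Reduce: group records by key and count those in decision-consistent groups."""
    groups = defaultdict(list)
    for key, dec in pairs:
        groups[key].append(dec)
    # A group belongs to the positive region if all its records share one decision.
    return sum(len(decs) for decs in groups.values() if len(set(decs)) == 1)

def positive_region_size(rows, attrs):
    """Number of records classified consistently using only the attributes in attrs."""
    return reduce_phase(map_phase(rows, attrs))

def greedy_reduct(rows, n_attrs):
    """Greedy forward selection (an assumed heuristic, not taken from the paper)."""
    target = positive_region_size(rows, list(range(n_attrs)))
    chosen = []
    while positive_region_size(rows, chosen) < target:
        remaining = [a for a in range(n_attrs) if a not in chosen]
        best = max(remaining, key=lambda a: positive_region_size(rows, chosen + [a]))
        chosen.append(best)
    return chosen

if __name__ == "__main__":
    print(greedy_reduct(ROWS, 3))
```

In a real MapReduce job the map step would run independently on each data split and the grouping would be done by the framework's shuffle, which is what makes this counting step a natural fit for parallel execution.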

This paper is partially supported by the National Natural Science Foundation of China under Grants No. 60773113 and No. 60573068, the Natural Science Foundation of Chongqing under Grants No. 2008BA2017 and No. 2008BA2041, and the CQUPT-ICST Foundation under Grants No. JK-Y-2010002 and CY-CNCL-2009-02.

Author information

Authors and Affiliations

Institute of Computer Science & Technology, Chongqing University of Posts and Telecommunications, Chongqing, 400065, P.R. China
Yong Yang, Zhengrong Chen, Zhu Liang & Guoyin Wang

Editor information

Editors and Affiliations

School of Computer and Information Technology, Beijing Jiaotong University, 100044, Beijing, China
Jian Yu
Faculty of Economics, University of Catania, Corso Italia, 55, 95129, Catania, Italy
Salvatore Greco
Department of Mathematics and Computing Science, Saint Mary’s University, B3H 3C3, Halifax, Nova Scotia, Canada
Pawan Lingras
Institute of Computer Science and Technology, Chongqing University of Posts and Telecommunications, 400065, Chongqing, China
Guoyin Wang
Institute of Mathematics, Warsaw University, Banacha 2, 02-097, Warsaw, Poland
Andrzej Skowron

About this paper

Cite this paper

Yang, Y., Chen, Z., Liang, Z., Wang, G. (2010). Attribute Reduction for Massive Data Based on Rough Set Theory and MapReduce. In: Yu, J., Greco, S., Lingras, P., Wang, G., Skowron, A. (eds) Rough Set and Knowledge Technology. RSKT 2010. Lecture Notes in Computer Science, vol 6401. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16248-0_91

DOI: https://doi.org/10.1007/978-3-642-16248-0_91
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16247-3
Online ISBN: 978-3-642-16248-0
eBook Packages: Computer Science, Computer Science (R0)
