Development of the Data Preprocessing Agent’s Knowledge for Data Mining Using Rough Set Theory

  • Conference paper
Rough Sets and Knowledge Technology (RSKT 2009)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5589)

Abstract

Data preprocessing is one of the important tasks in Knowledge Discovery in Databases (KDD), or data mining. Preprocessing is complex and tedious, especially when it involves large datasets. It is crucial for a data miner to be able to determine the appropriate preprocessing techniques for a particular dataset, as doing so saves processing time and retains the quality of the data for mining. Data mining researchers currently use agents as tools to assist the data mining process. However, very little research focuses on using agents in data preprocessing. Applying agents that are autonomous, flexible, and intelligent reduces the cost of obtaining quality, precise, and up-to-date data or knowledge. The most important part of having an agent perform a data mining task, particularly data preprocessing, is the generation of the agent’s knowledge. The data preprocessing agent’s knowledge is meant to let the agent decide the appropriate preprocessing technique to use on a particular dataset. Therefore, in this paper we propose a methodology for creating the data preprocessing agent’s knowledge using rough set theory. The experimental results show that the generated knowledge is significant enough to be used for automated selection of data preprocessing techniques.
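
The abstract does not describe the methodology itself, so the following is only a generic sketch of how rough set theory can turn a decision table into selection rules. It is not the authors' procedure. The table contents, the attribute names (`missing`, `type`, `technique`), and every function name are hypothetical, chosen purely for illustration. The sketch computes indiscernibility classes, the lower and upper approximations of one decision value, and the certain rules that an agent could store as knowledge.

```python
from collections import defaultdict

# Hypothetical decision table (not from the paper): each row describes a
# dataset by two condition attributes and records the preprocessing
# technique assumed to have worked best on it.
TABLE = [
    {"missing": "high", "type": "numeric", "technique": "imputation"},
    {"missing": "high", "type": "numeric", "technique": "imputation"},
    {"missing": "low", "type": "numeric", "technique": "discretization"},
    {"missing": "low", "type": "numeric", "technique": "normalization"},
    {"missing": "low", "type": "nominal", "technique": "none"},
]
CONDITIONS = ("missing", "type")
DECISION = "technique"


def indiscernibility_classes(table, attributes):
    """Group row indices that share the same values on `attributes`."""
    classes = defaultdict(set)
    for index, row in enumerate(table):
        classes[tuple(row[a] for a in attributes)].add(index)
    return dict(classes)


def approximations(table, attributes, decision, value):
    """Return (lower, upper) approximations of the rows where decision == value."""
    target = {i for i, row in enumerate(table) if row[decision] == value}
    lower, upper = set(), set()
    for members in indiscernibility_classes(table, attributes).values():
        if members <= target:
            lower |= members  # class lies entirely inside the concept
        if members & target:
            upper |= members  # class overlaps the concept
    return lower, upper


def certain_rules(table, attributes, decision):
    """Emit one rule per indiscernibility class whose decision is unambiguous."""
    rules = []
    for key, members in indiscernibility_classes(table, attributes).items():
        decisions = {table[i][decision] for i in members}
        if len(decisions) == 1:
            rules.append((dict(zip(attributes, key)), decisions.pop()))
    return rules


if __name__ == "__main__":
    print(approximations(TABLE, CONDITIONS, DECISION, "discretization"))
    for condition, technique in certain_rules(TABLE, CONDITIONS, DECISION):
        print(condition, "=>", technique)
```

In this toy table the two rows with low missingness and numeric attributes disagree on the technique, so they fall in the boundary region and yield no certain rule. Only the consistent classes produce rules the agent could apply without ambiguity.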

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Othman, Z.A., Bakar, A.A., Othman, Z., Rosli, S. (2009). Development of the Data Preprocessing Agent’s Knowledge for Data Mining Using Rough Set Theory. In: Wen, P., Li, Y., Polkowski, L., Yao, Y., Tsumoto, S., Wang, G. (eds) Rough Sets and Knowledge Technology. RSKT 2009. Lecture Notes in Computer Science, vol 5589. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02962-2_11

  • DOI: https://doi.org/10.1007/978-3-642-02962-2_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-02961-5

  • Online ISBN: 978-3-642-02962-2

  • eBook Packages: Computer Science, Computer Science (R0)
