Decomposing Data Mining by a Process-Oriented Execution Plan

  • Conference paper
Artificial Intelligence and Computational Intelligence (AICI 2010)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6319)

Abstract

Data mining deals with the extraction of hidden knowledge from large amounts of data. Current systems typically rely on coarse-grained data mining modules. This traditional black-box approach focuses on improving specific algorithms and is not flexible enough to support more general optimization or component reuse. This paper decomposes data mining tasks into execution process plans composed of finer-grained data mining operators. The cost of each operator can be analyzed, which provides a basis for more holistic optimization. The process-based concept is evaluated through OGSA-DAI based implementations of association rule mining, which show the feasibility of the approach as well as the reusability of some of the data mining operators.
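As a rough illustration of the idea described in the abstract, the following minimal Python sketch composes association rule mining from small operators and records a cost per operator. It is not the paper's OGSA-DAI implementation: the `Operator` and `Plan` classes, the three operator functions, the wall-clock cost measure, and the sample transactions are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from itertools import combinations
from time import perf_counter
from typing import Any, Callable, Dict, List


@dataclass
class Operator:
    """One fine-grained step of a plan (hypothetical structure)."""
    name: str
    fn: Callable[[Any], Any]


@dataclass
class Plan:
    """Runs operators in sequence and records elapsed time per operator."""
    operators: List[Operator]
    costs: Dict[str, float] = field(default_factory=dict)

    def run(self, data: Any) -> Any:
        for op in self.operators:
            start = perf_counter()
            data = op.fn(data)
            # Wall-clock time stands in for a real cost model here.
            self.costs[op.name] = perf_counter() - start
        return data


def count_itemsets(transactions, max_size=2):
    """Count occurrences of every itemset of size 1..max_size."""
    counts = {}
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, max_size + 1):
            for combo in combinations(items, k):
                counts[combo] = counts.get(combo, 0) + 1
    return {"n": len(transactions), "counts": counts}


def filter_frequent(state, min_support=0.5):
    """Keep itemsets whose relative support reaches min_support."""
    n = state["n"]
    return {s: c / n for s, c in state["counts"].items() if c / n >= min_support}


def generate_rules(frequent, min_confidence=0.7):
    """Derive single-item rules lhs -> rhs from frequent 2-itemsets."""
    rules = []
    for itemset, support in frequent.items():
        if len(itemset) != 2:
            continue
        a, b = itemset
        for lhs, rhs in ((a, b), (b, a)):
            confidence = support / frequent[(lhs,)]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, round(support, 2), round(confidence, 2)))
    return sorted(rules)


count_op = Operator("count", count_itemsets)
filter_op = Operator("filter", filter_frequent)
rules_op = Operator("rules", generate_rules)

transactions = [
    ["bread", "milk"],
    ["bread", "butter"],
    ["bread", "milk", "butter"],
    ["milk"],
]

# Full association rule plan.
rule_plan = Plan([count_op, filter_op, rules_op])
rules = rule_plan.run(transactions)

# A second plan reuses the first two operators unchanged.
itemset_plan = Plan([count_op, filter_op])
frequent = itemset_plan.run(transactions)

print(rules)
print(sorted(rule_plan.costs))
```

Because each step is a separate operator, the counting and filtering operators serve both plans without modification, and the recorded per-operator costs indicate where an optimizer might focus.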

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zhang, Y., Li, H., Wöhrer, A., Brezany, P., Dai, G. (2010). Decomposing Data Mining by a Process-Oriented Execution Plan. In: Wang, F.L., Deng, H., Gao, Y., Lei, J. (eds) Artificial Intelligence and Computational Intelligence. AICI 2010. Lecture Notes in Computer Science, vol 6319. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16530-6_13

  • DOI: https://doi.org/10.1007/978-3-642-16530-6_13

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-16529-0

  • Online ISBN: 978-3-642-16530-6

  • eBook Packages: Computer Science (R0)
