KDD Support Services Based on Data Semantics

  • Conference paper
Journal on Data Semantics IV

Part of the book series: Lecture Notes in Computer Science (JODS, volume 3730)

Abstract

The primary goal of Knowledge Discovery in Databases (KDD) is to identify valid, novel and interesting models in large volumes of data. Achieving such a complex goal requires many kinds of semantic information about the KDD and business domains. In this paper, we present an approach to characterizing semantic domain information for a particular kind of KDD process: classification. In particular, we show how, by estimating the properties of the true but unknown classification model, one can derive domain information on the classification problem at hand. We discuss how, when these properties are saved with the data, users can draw on this knowledge and avoid spending time experimenting with many classifiers and parameters.

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Diamantini, C., Potena, D., Panti, M. (2005). KDD Support Services Based on Data Semantics. In: Spaccapietra, S. (ed.) Journal on Data Semantics IV. Lecture Notes in Computer Science, vol 3730. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11603412_9

  • DOI: https://doi.org/10.1007/11603412_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-31001-3

  • Online ISBN: 978-3-540-31447-9

  • eBook Packages: Computer Science, Computer Science (R0)
