Abstract
The identification of valid, novel and interesting models from large volumes of data is the primary goal of Knowledge Discovery in Databases (KDD). Achieving such a complex goal requires many kinds of semantic information about the KDD and business domains. In this paper, we present an approach to characterizing semantic domain information for a particular kind of KDD process: classification. In particular, we show how, by estimating the properties of the true but unknown classification model, one can derive domain information about the classification problem at hand. We discuss how, when these properties are saved with the data, users can draw on this knowledge and save the time otherwise spent experimenting with many classifiers and parameters.
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Diamantini, C., Potena, D., Panti, M. (2005). KDD Support Services Based on Data Semantics. In: Spaccapietra, S. (eds) Journal on Data Semantics IV. Lecture Notes in Computer Science, vol 3730. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11603412_9
DOI: https://doi.org/10.1007/11603412_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-31001-3
Online ISBN: 978-3-540-31447-9
eBook Packages: Computer Science (R0)