Abstract
The paper presents a clustering method that can be applied to populated ontologies to discover interesting groupings of the resources they contain. The method exploits a simple yet effective, language-independent semi-distance measure for individuals. The measure is based on their underlying semantics along a number of dimensions, each corresponding to one of a set of concept descriptions (a committee of discriminating features). The clustering algorithm is a partitional method based on the notion of medoids w.r.t. the adopted semi-distance measure. Ultimately, it produces a hierarchical organization of groups of individuals. A final experiment demonstrates the validity of the approach using absolute quality indices. We propose two possible exploitations of these clusterings: concept formation and the detection of concept drift or novelty.
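To make the two central notions concrete, the following is a minimal sketch of a semi-distance computed over a feature committee and of medoid selection under it. The abstract does not give the formula, so everything here is an assumption: the encoding of each individual as a vector of projections (1.0 for known instance of a feature concept, 0.0 for known non-instance, 0.5 for unknown membership), the Minkowski-style form with exponent `p`, and the names `semi_distance` and `medoid` are all hypothetical.

```python
from typing import Sequence

# Assumed encoding: one projection value per concept in the feature committee.
# 1.0 = known instance, 0.0 = known non-instance, 0.5 = membership unknown.
Projection = Sequence[float]


def semi_distance(a: Projection, b: Projection, p: int = 2) -> float:
    """Minkowski-style dissimilarity over projections, scaled by 1/m.

    It is only a semi-distance: two different individuals with identical
    projections on every feature are at distance 0.
    """
    if len(a) != len(b):
        raise ValueError("projections must cover the same feature committee")
    m = len(a)
    total = sum(abs(x - y) ** p for x, y in zip(a, b))
    return (total ** (1.0 / p)) / m


def medoid(cluster: Sequence[Projection], p: int = 2) -> Projection:
    """Return the cluster member with the smallest summed distance to the others."""
    return min(
        cluster,
        key=lambda c: sum(semi_distance(c, other, p) for other in cluster),
    )
```

A medoid, unlike a centroid, is always an actual member of the cluster, which is what makes it usable when individuals are only compared through a dissimilarity and cannot be averaged.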
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Fanizzi, N., d’Amato, C., Esposito, F. (2008). Conceptual Clustering and Its Application to Concept Drift and Novelty Detection. In: Bechhofer, S., Hauswirth, M., Hoffmann, J., Koubarakis, M. (eds) The Semantic Web: Research and Applications. ESWC 2008. Lecture Notes in Computer Science, vol 5021. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68234-9_25
DOI: https://doi.org/10.1007/978-3-540-68234-9_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68233-2
Online ISBN: 978-3-540-68234-9
eBook Packages: Computer Science (R0)