Abstract
This paper addresses the problem of cluster discovery in Semantic Web knowledge bases. It presents a partitional clustering algorithm for grouping resources that are contained in knowledge bases and expressed in the standard ontology languages. The method exploits a language-independent semi-distance measure for individuals, based on the semantics of the resources with respect to a context represented by a set of concept descriptions (discriminating features). The clustering algorithm adapts the Bisecting k-Means method to work with medoids. In addition, we propose simple mechanisms for assigning each cluster an intensional definition, which may suggest new concepts for the knowledge base (vivification). A final experiment demonstrates the validity of the approach through absolute quality indices for the clustering results.
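The abstract does not spell out the measure or the algorithm, so the following Python sketch is only an illustration of the general idea, not the chapter's implementation. It assumes each individual has already been projected onto the feature concepts as a vector of values in {0, 0.5, 1} (here read as non-member, unknown, member; in practice these values would come from a reasoner). It also assumes a Minkowski-style semi-distance over those vectors and a simple farthest-pair seeding for each bisection. All function names and the toy data are hypothetical.

```python
from itertools import combinations


def semi_distance(u, v, p=2):
    """Minkowski-style dissimilarity of two projection vectors, scaled by 1/m.

    Assumed form for illustration; two distinct individuals with identical
    projections get distance 0, hence "semi"-distance.
    """
    m = len(u)
    return sum(abs(a - b) ** p for a, b in zip(u, v)) ** (1 / p) / m


def medoid(cluster, dist):
    """Return the member with the smallest total distance to the others."""
    return min(cluster, key=lambda c: sum(dist(c, o) for o in cluster))


def diameter(cluster, dist):
    """Largest pairwise distance inside a cluster (0.0 for singletons)."""
    return max((dist(a, b) for a, b in combinations(cluster, 2)), default=0.0)


def bisect(cluster, dist, max_iter=20):
    """Split one cluster in two by iterating a 2-medoids step."""
    # Assumed seeding: start from the two most distant members.
    m1, m2 = max(combinations(cluster, 2), key=lambda pair: dist(*pair))
    for _ in range(max_iter):
        c1 = [x for x in cluster if dist(x, m1) <= dist(x, m2)]
        c2 = [x for x in cluster if dist(x, m1) > dist(x, m2)]
        n1, n2 = medoid(c1, dist), medoid(c2, dist)
        if (n1, n2) == (m1, m2):  # medoids are stable, so stop
            break
        m1, m2 = n1, n2
    return c1, c2


def bisecting_k_medoids(items, dist, k):
    """Repeatedly bisect the cluster with the largest diameter until k clusters."""
    clusters = [list(items)]
    while len(clusters) < k:
        worst = max(clusters, key=lambda c: diameter(c, dist))
        if diameter(worst, dist) == 0:  # nothing left that can be separated
            break
        clusters.remove(worst)
        clusters.extend(bisect(worst, dist))
    return clusters


# Toy projections of four individuals onto four hypothetical feature concepts.
proj = {
    "a1": (1, 1, 0, 0.5),
    "a2": (1, 1, 0, 0),
    "b1": (0, 0, 1, 1),
    "b2": (0, 0.5, 1, 1),
}
clusters = bisecting_k_medoids(
    proj, lambda x, y: semi_distance(proj[x], proj[y]), k=2
)
```

Working with medoids rather than centroids means every cluster representative is an actual individual of the knowledge base, so only pairwise distances are needed and no "average individual" has to be constructed.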
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Esposito, F., Fanizzi, N., d’Amato, C. (2009). Partitional Conceptual Clustering of Web Resources Annotated with Ontology Languages. In: Berendt, B., et al. Knowledge Discovery Enhanced with Semantic and Social Information. Studies in Computational Intelligence, vol 220. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01891-6_4
DOI: https://doi.org/10.1007/978-3-642-01891-6_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-01890-9
Online ISBN: 978-3-642-01891-6
eBook Packages: Engineering, Engineering (R0)