Abstract
We consider the problem of finding a compact labelling for large, rooted web taxonomies that can be used to encode all local path information for each taxonomy element. This research is motivated by the problem of developing standards for taxonomic data, and addresses the data-intensive problem of evaluating semantic similarities between taxonomic elements. Evaluating such similarities often requires processing large sets of common ancestors shared by pairs of elements. We propose a new class of compact labelling schemes, designed for directed acyclic graphs and tailored to large web taxonomies. Our labelling schemes significantly reduce the complexity of evaluating similarities among taxonomy elements by allowing inferences to be drawn from the labels alone, without searching the data structure. We provide an analysis of the label lengths for the proposed schemes based on structural properties of the taxonomy. Finally, we provide supporting empirical evidence for the quality of these schemes by evaluating their performance on the WordNet taxonomy.
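To make the idea of answering common-ancestor queries from labels alone concrete, here is a minimal sketch. It is not the scheme proposed in the paper: it uses a naive bitmask label with one bit per taxonomy element, so the labels are not compact, and it only illustrates the query pattern. The toy taxonomy and the names `build_labels` and `common_ancestors` are assumptions made for illustration.

```python
from typing import Dict, List, Set

# Hypothetical toy taxonomy: each node maps to its direct parents.
# "dog" and "cat" have two parents, so this is a DAG rather than a tree.
TAXONOMY: Dict[str, List[str]] = {
    "entity": [],
    "animal": ["entity"],
    "pet": ["entity"],
    "dog": ["animal", "pet"],
    "cat": ["animal", "pet"],
    "fish": ["animal"],
}


def build_labels(parents: Dict[str, List[str]]) -> Dict[str, int]:
    """Give each node a bitmask covering itself and all of its ancestors."""
    index = {node: i for i, node in enumerate(parents)}
    labels: Dict[str, int] = {}

    def label(node: str) -> int:
        # Memoised walk up the DAG: a node's label is its own bit
        # OR-ed with the labels of all its parents.
        if node not in labels:
            mask = 1 << index[node]
            for parent in parents[node]:
                mask |= label(parent)
            labels[node] = mask
        return labels[node]

    for node in parents:
        label(node)
    return labels


def common_ancestors(labels: Dict[str, int], a: str, b: str,
                     names: List[str]) -> Set[str]:
    """Decode the shared ancestors of a and b using only their two labels."""
    shared = labels[a] & labels[b]
    return {names[i] for i in range(len(names)) if (shared >> i) & 1}


NAMES: List[str] = list(TAXONOMY)   # bit position -> node name
LABELS = build_labels(TAXONOMY)

print(common_ancestors(LABELS, "dog", "cat", NAMES))
print(common_ancestors(LABELS, "dog", "fish", NAMES))
```

Once the labels are built, each query is a single bitwise AND and never touches the graph. The cost is a label of n bits for a taxonomy of n elements, which is the kind of overhead a compact scheme aims to reduce.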
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Strunjaš-Yoshikawa, S., Annexstein, F.S., Berman, K.A. (2006). Compact Encodings for All Local Path Information in Web Taxonomies with Application to WordNet. In: Wiedermann, J., Tel, G., Pokorný, J., Bieliková, M., Štuller, J. (eds) SOFSEM 2006: Theory and Practice of Computer Science. SOFSEM 2006. Lecture Notes in Computer Science, vol 3831. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11611257_49
DOI: https://doi.org/10.1007/11611257_49
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-31198-0
Online ISBN: 978-3-540-32217-7
eBook Packages: Computer Science (R0)