Abstract
We propose an approach for searching large RDF graphs using advanced vector space models, in particular Random Indexing (RI). We first generate documents from an RDF graph and then index them using RI to build a semantic index, which is used to find similarities between graph nodes. We experimented with large RDF graphs in the life sciences domain and engaged domain experts in two stages: first, to generate a set of keywords of interest to them, and second, to judge the quality of the output of the Random Indexing method, which generated a set of similar terms (literals and URIs) for each keyword of interest.
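The pipeline summarised above (documents from the graph, a Random Indexing pass, then similarity lookups) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say how documents are generated or which RI parameters are used, so everything specific here is an assumption made for illustration: the toy `ex:` triples, the choice of one document per node built from the triples that mention it, the vector dimension, the number of non-zero entries, and all function names.

```python
import math
import random
from collections import defaultdict

# Hypothetical toy triples (subject, predicate, object); not data from the paper.
TRIPLES = [
    ("ex:aspirin", "ex:treats", "ex:headache"),
    ("ex:aspirin", "ex:inhibits", "ex:COX1"),
    ("ex:ibuprofen", "ex:treats", "ex:headache"),
    ("ex:ibuprofen", "ex:inhibits", "ex:COX1"),
    ("ex:insulin", "ex:treats", "ex:diabetes"),
    ("ex:insulin", "ex:binds", "ex:INSR"),
]


def build_documents(triples):
    """Assumed scheme: one document per node, holding the terms of every
    triple in which that node is the subject or the object."""
    docs = defaultdict(set)
    for s, p, o in triples:
        docs[s].update((s, p, o))
        docs[o].update((s, p, o))
    return dict(docs)


def index_vector(rng, dim, nonzeros):
    """Sparse ternary index vector: a few random positions set to +1 or -1."""
    positions = rng.sample(range(dim), nonzeros)
    return {pos: rng.choice((-1, 1)) for pos in positions}


def train(docs, dim=1000, nonzeros=10, seed=0):
    """Document-based Random Indexing: a term's vector is the sum of the
    index vectors of the documents it occurs in (binary occurrence)."""
    rng = random.Random(seed)
    term_vectors = defaultdict(lambda: [0] * dim)
    for doc_id in sorted(docs):  # sorted so the seed gives repeatable vectors
        iv = index_vector(rng, dim, nonzeros)
        for term in docs[doc_id]:
            vec = term_vectors[term]
            for pos, sign in iv.items():
                vec[pos] += sign
    return dict(term_vectors)


def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def most_similar(term, vectors, top_n=3):
    """Rank all other terms (URIs or literals) by cosine similarity to `term`."""
    scores = [(other, cosine(vectors[term], vec))
              for other, vec in vectors.items() if other != term]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_n]


if __name__ == "__main__":
    vectors = train(build_documents(TRIPLES))
    for other, score in most_similar("ex:aspirin", vectors):
        print(f"{other}\t{score:.3f}")
```

In this toy graph, `ex:aspirin` and `ex:ibuprofen` occur in overlapping documents and so end up with similar vectors, while `ex:insulin` shares no documents with `ex:aspirin` and scores near zero. The index vectors are only nearly orthogonal, which is why unrelated terms score close to zero rather than exactly zero.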
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Damljanovic, D. et al. (2012). Random Indexing for Finding Similar Nodes within Large RDF Graphs. In: García-Castro, R., Fensel, D., Antoniou, G. (eds) The Semantic Web: ESWC 2011 Workshops. ESWC 2011. Lecture Notes in Computer Science, vol 7117. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25953-1_13
DOI: https://doi.org/10.1007/978-3-642-25953-1_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-25952-4
Online ISBN: 978-3-642-25953-1
eBook Packages: Computer Science (R0)