Efficiently Finding Paths Between Classes to Build a SPARQL Query for Life-Science Databases

  • Atsuko Yamaguchi
  • Kouji Kozaki
  • Kai Lenz
  • Hongyan Wu
  • Yasunori Yamamoto
  • Norio Kobayashi
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9544)


Many life-science databases are provided in the Resource Description Framework (RDF) model with SPARQL Protocol and RDF Query Language (SPARQL) endpoints. However, users who are not familiar with Semantic Web technologies may find it difficult to write a SPARQL query. Helping users build SPARQL queries is therefore an important task for expanding the range of users of RDF databases. We developed a web application called SPARQL Builder that enables users to access life-science RDF datasets by assisting them in writing SPARQL queries. One of the key technologies used in SPARQL Builder is the extraction of possible relationships in an RDF dataset between two classes, one for the input data and one for the output data. We express such relationships as paths on a labeled graph, called a class graph, that represents the class–predicate–class relations in a dataset. In addition, we present an efficient algorithm to compute all the possible paths between two classes on a class graph. To evaluate the performance of the proposed algorithm, we compared it with a naive method using RDF datasets with various numbers of classes and confirmed that our algorithm runs much faster when the numbers of classes and relations are relatively large.
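The idea of a class graph and of paths between two classes can be illustrated with a minimal sketch. This is not the efficient algorithm proposed in the paper; it is closer to a naive depth-first enumeration of simple paths. The function names, the `max_len` bound, the example IRIs, and the choice to follow predicates only in their forward direction are all assumptions made for illustration.

```python
from collections import defaultdict


def build_class_graph(relations):
    """Map each class to its outgoing (predicate, class) edges.

    `relations` is an iterable of (subject_class, predicate, object_class)
    triples, i.e. the class-predicate-class relations of a dataset.
    """
    graph = defaultdict(list)
    for subj_class, predicate, obj_class in relations:
        graph[subj_class].append((predicate, obj_class))
    return graph


def enumerate_paths(graph, start, goal, max_len):
    """Naively yield every simple path from start to goal.

    A path is a list of (class, predicate, class) edges with at most
    max_len edges; no class is visited twice on the same path.
    """
    stack = [(start, [], {start})]
    while stack:
        node, path, visited = stack.pop()
        if node == goal and path:
            yield path
            continue
        if len(path) == max_len:
            continue
        for predicate, nxt in graph.get(node, []):
            if nxt not in visited:
                edge = (node, predicate, nxt)
                stack.append((nxt, path + [edge], visited | {nxt}))


def path_to_sparql(path):
    """Turn one non-empty path into a SPARQL SELECT query string."""
    patterns = [f"?v0 a <{path[0][0]}> ."]
    for i, (_, predicate, dst) in enumerate(path):
        patterns.append(f"?v{i} <{predicate}> ?v{i + 1} .")
        patterns.append(f"?v{i + 1} a <{dst}> .")
    body = "\n  ".join(patterns)
    return f"SELECT ?v0 ?v{len(path)} WHERE {{\n  {body}\n}}"


if __name__ == "__main__":
    # Hypothetical class-predicate-class relations, for illustration only.
    ex = "http://example.org/"
    relations = [
        (ex + "Protein", ex + "encodedBy", ex + "Gene"),
        (ex + "Gene", ex + "locatedOn", ex + "Chromosome"),
        (ex + "Protein", ex + "foundIn", ex + "Chromosome"),
    ]
    graph = build_class_graph(relations)
    for p in enumerate_paths(graph, ex + "Protein", ex + "Chromosome", 3):
        print(path_to_sparql(p))
        print()
```

Each enumerated path corresponds to one candidate query linking the input class to the output class. The cost of this naive enumeration grows quickly with the number of classes and relations, which is the situation the abstract says the proposed algorithm is designed to handle.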


Keywords: Semantic web, SPARQL, Intelligent query generation, Database integration, Life-science databases



This work was supported by JSPS KAKENHI Grant Numbers 25280081 and 24120002 and by the National Bioscience Database Center (NBDC) of the Japan Science and Technology Agency (JST).



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Atsuko Yamaguchi (1, corresponding author)
  • Kouji Kozaki (2)
  • Kai Lenz (3)
  • Hongyan Wu (1)
  • Yasunori Yamamoto (1)
  • Norio Kobayashi (3)

  1. Database Center for Life Science (DBCLS), Research Organization of Information and Systems, Kashiwa, Japan
  2. The Institute of Scientific and Industrial Research (ISIR), Osaka University, Osaka, Japan
  3. Advanced Center for Computing and Communication (ACCC), RIKEN, Wako, Japan
