Implementing LOD Surfer as a Search System for the Annotation of Multiple Protein Sequence Alignment
Many life science databases are now provided as Linked Open Data (LOD). To promote the use of these databases, we previously developed a method called LOD Surfer, which performs a federated query search along a path of class–class relationships. In this study, we developed a specialized version of LOD Surfer for the annotation of multiple protein sequence alignments. The system comprises a web application programming interface (API) and a client for the API. The web API provides a list of classes and a list of paths between the classes specified by a user. The client presents the classes and paths obtained from the API and assists the user in selecting those needed to acquire the required protein annotations. The client also generates SPARQL queries to execute a federated query search for a selected path. During development of the system, we observed that (1) the client should display some instances with human-readable information, because class selection is not an easy task for biological researchers, and (2) the client should preferably store the paths selected by a user so that other users can reuse them, because path selection can be time-consuming and the selected paths may be valuable to other researchers.
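As an illustration of how a client might turn a selected path of class–class relationships into a federated SPARQL query, the following is a minimal sketch and not the system's actual implementation. The `Step` structure, the function name `build_federated_query`, and all endpoint, class, and property IRIs are hypothetical; the only assumption about SPARQL itself is the standard `SERVICE` clause of SPARQL 1.1 Federated Query.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Step:
    """One class-class relationship on a path (hypothetical structure)."""
    endpoint: str       # SPARQL endpoint assumed to hold this step's triples
    subject_class: str  # IRI of the class at the start of the step
    predicate: str      # IRI of the property linking the two classes
    object_class: str   # IRI of the class at the end of the step


def build_federated_query(path: List[Step], limit: int = 100) -> str:
    """Build one SELECT query with a SERVICE block per step of the path."""
    if not path:
        raise ValueError("path must contain at least one step")
    blocks = []
    for i, step in enumerate(path):
        # Consecutive steps share a variable, which chains them together.
        s, o = f"?n{i}", f"?n{i + 1}"
        blocks.append(
            f"  SERVICE <{step.endpoint}> {{\n"
            f"    {s} a <{step.subject_class}> .\n"
            f"    {s} <{step.predicate}> {o} .\n"
            f"    {o} a <{step.object_class}> .\n"
            f"  }}"
        )
    variables = " ".join(f"?n{i}" for i in range(len(path) + 1))
    body = "\n".join(blocks)
    return f"SELECT DISTINCT {variables}\nWHERE {{\n{body}\n}}\nLIMIT {limit}"


if __name__ == "__main__":
    # Placeholder IRIs only; they do not refer to real datasets.
    example_path = [
        Step("http://example.org/sparql/a", "http://example.org/Protein",
             "http://example.org/encodedBy", "http://example.org/Gene"),
        Step("http://example.org/sparql/b", "http://example.org/Gene",
             "http://example.org/involvedIn", "http://example.org/Pathway"),
    ]
    print(build_federated_query(example_path, limit=10))
```

Chaining the steps through shared variables keeps each `SERVICE` block confined to a single endpoint, so the query can be extended by appending further steps to the path.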
Keywords: Linked Open Data; Class-class relationships; Multiple protein sequence alignment; Database integration in life sciences
The authors would like to thank the members of the LOD Surfer project, including Kouji Kozaki (Osaka University), Norio Kobayashi and Hiroshi Masuya (RIKEN), and Yasunori Yamamoto (DBCLS), for valuable discussions. This work was supported by JSPS KAKENHI grant number 17K00434 and by the National Bioscience Database Center (NBDC) of the Japan Science and Technology Agency (JST).
- 1. Heath, T., Bizer, C.: Linked Data: Evolving the Web into a Global Data Space. Synthesis Lectures on the Semantic Web: Theory and Technology, 1st edn., vol. 1, no. 1, pp. 1–136. Morgan & Claypool (2011)
- 2. Heim, P., Hellmann, S., Lehmann, J., Lohmann, S., Stegemann, T.: RelFinder: revealing relationships in RDF knowledge bases. In: Chua, T.-S., Kompatsiaris, Y., Mérialdo, B., Haas, W., Thallinger, G., Bailer, W. (eds.) SAMT 2009. LNCS, vol. 5887, pp. 182–187. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-10543-2_21
- 3. Yamaguchi, A., Kozaki, K., Yamamoto, Y., Masuya, H., Kobayashi, N.: Semantic graph analysis for federated LOD surfing in life sciences. In: Wang, Z., Turhan, A.-Y., Wang, K., Zhang, X. (eds.) JIST 2017. LNCS, vol. 10675, pp. 268–276. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-70682-5_18
- 6. Yamaguchi, A., Kozaki, K., Lenz, K., Yamamoto, Y., Masuya, H., Kobayashi, N.: Semantic data acquisition by traversing class–class relationships over linked open data. In: Li, Y.-F., et al. (eds.) JIST 2016. LNCS, vol. 10055, pp. 136–151. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-50112-3_11
- 7. Yamamoto, Y., Yamaguchi, A., Splendiani, A.: YummyData: providing high-quality open life science data. Database, 2018 (2018). https://doi.org/10.1093/database/bay022