Abstract
The Open PHACTS Discovery Platform aims to provide an integrated information space to advance pharmacological research in the area of drug discovery. Effective drug discovery requires comprehensive data coverage, i.e. integrating all available sources of pharmacology data. While many relevant data sources are available on the linked open data cloud, their content needs to be combined with that of commercial datasets and the licensing of these commercial datasets respected when providing access to the data. Additionally, pharmaceutical companies have built up their own extensive private data collections that they require to be included in their pharmacological dataspace. In this paper we discuss the challenges of incorporating private and commercial data into a linked dataspace: focusing on the modelling of these datasets and their interlinking. We also present the graph-based access control mechanism that ensures commercial and private datasets are only available to authorized users.
Chapter PDF
References
Alexander, K., Cyganiak, R., Hausenblas, M., Zhao, J.: Describing linked datasets with the void vocabulary. Note, W3C (March 2011), http://www.w3.org/TR/void/
Azzaoui, K., Jacoby, E., Senger, S., Rodríguez, E.C., Loza, M., Zdrazil, B., Pinto, M., Williams, A.J., de la Torre, V., Mestres, J., Pastor, M., Taboureau, O., Rarey, M., Chichester, C., Pettifer, S., Blomberg, N., Harland, L., Williams-Jones, B., Ecker, G.F.: Scientific competency questions as the basis for semantically enriched open pharmacological space development. Drug Discovery Today (to appear), http://dx.doi.org/10.1016/j.drudis.2013.05.008
Banff manifesto (May 2007), http://sourceforge.net/apps/mediawiki/bio2rdf/index.php?title=Banff_Manifesto
Berners-Lee, T.: Linked data. Technical report, W3C (2006), http://www.w3.org/DesignIssues/LinkedData.html
Callahan, A., Cruz-Toledo, J., Ansell, P., Dumontier, M.: Bio2rdf release 2: Improved coverage, interoperability and provenance of life science linked data. In: Cimiano, P., Corcho, O., Presutti, V., Hollink, L., Rudolph, S. (eds.) ESWC 2013. LNCS, vol. 7882, pp. 200–212. Springer, Heidelberg (2013)
Chen, B., Wild, D., Guha, R.: Pubchem as a source of polypharmacology. Journal of Chemical Information and Modeling 49(9), 2044–2055 (2009)
Cobden, M., Black, J., Gibbins, N., Carr, L., Shadbolt, N.: A research agenda for linked closed dataset. In: Proceedings of the Second International Workshop on Consuming Linked Data (COLD 2011). CEUR Workshop Proceedings, Bonn, Germany (2011)
Dalby, A., Nourse, J.G., Hounshell, W.D., Gushurst, A.K.I., Grier, D.L., Leland, B.A., Laufer, J.: Description of several chemical structure file formats used by computer programs developed at molecular design limited. Journal of Chemical Information and Modeling 32(3), 244 (1992)
Gaulton, A., Bellis, L., Chambers, J., Davies, M., Hersey, A., Light, Y., McGlinchey, S., Akhtar, R., Atkinson, F., Bento, A., Al-Lazikani, B., Michalovich, D., Overington, J.: ChEMBL: A large-scale bioactivity database for chemical biology and drug discovery. Nucleic Acids Research. Database Issue 40(D1), D1100–D1107 (2012)
Gray, A.J.G., Groth, P., Loizou, A., Askjaer, S., Brenninkmeijer, C., Burger, K., Chichester, C., Evelo, C.T., Goble, C., Harland, L., Pettifer, S., Thompson, M., Waagmeester, A., Williams, A.J.: Applying linked data approaches to pharmacology: Architectural decisions and implementation. Semantic Web Journal (to appear), http://semantic-web-journal.net/sites/default/files/swj258.pdf
Gray, A.: Dataset descriptions for the open pharmacological space. Working Draft, Open PHACTS (October 2012), http://www.openphacts.org/specs/datadesc/
Haupt, C., Waagmeester, A., Zimmerman, M., Willighagen, E.: Guidelines for exposing data as RDF in Open PHACTS. Working Draft, Open PHACTS (August 2012), http://www.openphacts.org/specs/rdfguide/
Heath, T., Bizer, C.: Linked Data: Evolving the Web into a Global Data Space. In: Synthesis Lectures on the Semantic Web: Theory and Technology, 1st edn., vol. 1. Morgan & Claypool (2011)
Karapetyan, K., Tkachenko, V., Batchelor, C., Sharpe, D., Williams, A.J.: Rsc chemical validation and Standardization platform: A potential path to quality-conscious databases. In: 245th American Chemical Society National Meeting and Exposition, New Orleans, LA, USA (April 2013)
Kelder, T., van Iersel, M., Hanspers, K., Kutmon, M., Conklin, B., Evelo, C., Pico, A.: WikiPathways: building research communities on biological pathways. Nucleic Acids Research 40(D1), D1301–D1307 (2012)
Marshall, M.S., Boyce, R., Deus, H.F., Zhao, J., Willighagen, E.L., Samwald, M., Pichler, E., Hajagos, J., Prud’hommeaux, E., Stephens, S.: Emerging practices for mapping and linking life sciences data using RDF - a case series. Journal of Web Semantics 14, 2–13 (2012)
McNaught, A.: The IUPAC international chemical identifier: InChI. Chemistry International 28(6) (2006)
Ogata, H., Goto, S., Sato, K., Fujibuchi, W., Bono, H., Kanehisa, M.: Kegg: Kyoto encyclopedia of genes and genomes. Nucleic Acids Research 27(1), 29–34 (1999)
Pence, H.E., Williams, A.: Chemspider: An online chemical information resource. Journal of Chemical Education 87(11), 1123–1124 (2010)
Schomburg, I., Chang, A., Ebeling, C., Gremse, M., Heldt, C., Huhn, G., Schomburg, D.: Brenda, the enzyme database: updates and major new developments. Nucleic Acids Research 32(Database issue), D431–D433 (2004)
Southan, C., Várkonyi, P., Muresan, S.: Quantitative assessment of the expanding complementarity between public and commercial databases of bioactive compounds. Journal of Cheminformatics 1(10) (2009)
The UniProt Consortium: Update on activities at the universal protein resource (UniProt) in 2013. Nucleic Acids Research 41(D1), D43–D47 (2013)
US Food and Drug Administration: Food and Drug Administration Substance Registration System Standard Operating Procedure, 5c edn. (June 2007), http://www.fda.gov/downloads/ForIndustry/DataStandards/SubstanceRegistrationSystem-UniqueIngredientIdentifierUNII/ucm127743.pdf
Vempati, U.D., Przydzial, M.J., Chung, C., Abeyruwan, S., Mir, A., Sakurai, K., Visser, U., Lemmon, V.P., Schürer, S.C.: Formalization, annotation and analysis of diverse drug and probe screening assay datasets using the BioAssay ontology (BAO). PLoS ONE 7(11), e49198+ (2012)
Wang, Y., Bolton, E., Dracheva, S., Karapetyan, K., Shoemaker, B., Suzek, T., Wang, J., Xiao, J., Zhang, J., Bryant, S.: An overview of the pubchem bioassay resource. Nucleic Acids Research 38(Database issue), D255–D266 (2010)
Williams, A.J., Harland, L., Groth, P., Pettifer, S., Chichester, C., Willighagen, E.L., Evelo, C.T., Blomberg, N., Ecker, G., Goble, C., Mons, B.: Open PHACTS: Semantic interoperability for drug discovery. Drug Discovery Today 17(21-22), 1188–1198 (2012)
Williams, A.J., Wilbanks, J., Ekins, S.: Why open drug discovery needs four simple rules for licensing data and models. PLoS Computational Biology 8(9) (September 2012)
Willighagen, E.: Encoding units and unit types in RDF using QUDT. Working Draft, Open PHACTS (June 2013)
Willighagen, E.L., Waagmeester, A., Spjuth, O., Ansell, P., Williams, A.J., Tkachenko, V., Hastings, J., Chen, B., Wild, D.J.: The ChEMBL database as linked open data. Journal of Cheminformatics 5(23) (2013)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Goble, C. et al. (2013). Incorporating Commercial and Private Data into an Open Linked Data Platform for Drug Discovery. In: Alani, H., et al. The Semantic Web – ISWC 2013. ISWC 2013. Lecture Notes in Computer Science, vol 8219. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41338-4_5
Download citation
DOI: https://doi.org/10.1007/978-3-642-41338-4_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-41337-7
Online ISBN: 978-3-642-41338-4
eBook Packages: Computer ScienceComputer Science (R0)