Pathway annotation and analysis with Reactome: The solute carrier class of membrane transporters
Reactome is an expert-authored, peer-reviewed knowledge base of human reactions and pathways that functions as a data-mining resource and electronic textbook. Its current release covers approximately 23 per cent of the complete human proteome from UniProt. The pathway browser, search and data-mining tools facilitate searching and visualising pathway data and the analysis of user-supplied high-throughput datasets. All members of the solute carrier (SLC) class of transporters that have known ligands have been catalogued and annotated in Reactome, which provides a detailed and interactive view of this set of transport reactions. Using the SLC class of transporters as an example, we show how these pathway data can be overlaid with protein-protein interaction, protein-drug interaction and gene expression data and compared with equivalent pathways in other species, to facilitate over-representation, expression and other pathway analyses.
Keywords: solute transport, protein-ligand interaction, protein-protein interaction, pathway analysis, gene expression analysis
Reactome (http://www.reactome.org) is a freely available and open-source database of biological pathways [1, 2]. Expert scientists curate information into Reactome, which is then peer-reviewed to obtain a consensus representation of the process or pathway. The data are extensively cross-linked to major protein and nucleotide sequence databases, as well as to the Gene Ontology and PubMed databases. A new website for Reactome was recently released. This includes new functionality for interacting with curated pathways and analysing them with linked or user-supplied datasets. Tools in this new version of Reactome allow users to overlay interaction or expression data, further enriching the pathway information. The results of these analyses can then be downloaded in a range of formats.
Curated SLC-mediated transmembrane transport pathway in Reactome
We have recently completed a catalogue of the solute carrier (SLC) class of transmembrane transporters. These proteins are well conserved in all eukaryotes, as well as most prokaryotes, and play a gate-keeping role for cells and organelles, controlling the uptake and efflux of many types of substrates, such as sugars, inorganic cations and anions, organic anions and carboxylates, amino acids and oligopeptides, fatty acids and lipids, and neurotransmitters and vitamins. The SLC superfamily comprises 55 gene families, with 362 putatively functional protein-coding genes reported [3, 4]. Of these, 231 meet the criterion that the transporter has a known substrate which it transports across a membrane, and these have been catalogued in Reactome. The remainder are orphan proteins, with no characterised substrates at this time. This pathway will be used as the basis for describing the use of Reactome's analysis tools.
Reactome's website can be viewed on PCs, Macs or Linux computers with recent versions of the Internet Explorer (IE), Firefox or Safari browsers. Reactome's entire content can be downloaded as a MySQL data dump (http://www.reactome.org/download/index.html); other downloadable Reactome datasets and code are available from the same page.
Briefly, a query was constructed in the Universal Protein Resource (UniProt) (http://www.uniprot.org/), a comprehensive resource for protein sequence and annotation data. Reactome uses UniProt identifiers as the primary reference for proteins in its database. The query searched for manually annotated and reviewed human SLC transporters:
gene: SLC* AND organism:human AND reviewed:yes
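For readers who want to repeat a search of this kind programmatically, the following is a minimal sketch and not the procedure used for the catalogue. It only builds a request URL and performs no network access. The endpoint `https://rest.uniprot.org/uniprotkb/search`, the field names `organism_id` and `reviewed:true`, and the helper names `build_slc_query` and `build_search_url` are assumptions based on the current UniProt REST service, whose syntax differs from the query shown above.

```python
from urllib.parse import urlencode

# Assumed endpoint of the current UniProt REST service (not given in the text).
UNIPROT_SEARCH_URL = "https://rest.uniprot.org/uniprotkb/search"


def build_slc_query(gene_prefix="SLC", taxon_id=9606, reviewed=True):
    """Build a query with the same intent as the one in the text.

    Assumption: 'organism_id:9606' (human) and 'reviewed:true' are the
    present-day equivalents of 'organism:human' and 'reviewed:yes'.
    """
    terms = [
        f"gene:{gene_prefix}*",
        f"organism_id:{taxon_id}",
        f"reviewed:{str(reviewed).lower()}",
    ]
    return " AND ".join(terms)


def build_search_url(query, fmt="tsv", size=500):
    """Return a URL-encoded search request; nothing is sent over the network."""
    params = {"query": query, "format": fmt, "size": size}
    return f"{UNIPROT_SEARCH_URL}?{urlencode(params)}"


query = build_slc_query()
url = build_search_url(query)
print(query)
print(url)
```

A search run today would not be expected to reproduce the counts reported here, because UniProt annotation has changed since the catalogue was compiled.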
The UniProt query returned 367 results: the 362 proteins reviewed by He et al. and five newly characterised ones. These combined results were used as the basis of the information entered into Reactome. Transporters for which substrates were experimentally identified were catalogued in Reactome using an in-house graphical user interface (GUI) called the Curator Tool, which allows curators to structure data around Reactome's data model and commit them to a central repository. The base unit of Reactome is the reaction. The basic attributes captured for a reaction are the input and output molecule(s), the modulating protein(s), the compartments for these entities, supporting literature reference(s), a textual summary describing the reaction and the species (e.g. human).
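The reaction attributes listed above can be pictured as a simple record. The sketch below is illustrative only and does not reproduce Reactome's actual data model or the Curator Tool. The class names `Entity` and `Reaction`, the field names and the `is_transport` helper are all hypothetical, and the example record was written for this illustration and is not copied from Reactome.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Entity:
    """A molecule or protein together with the compartment it occupies."""
    name: str
    compartment: str


@dataclass
class Reaction:
    """Hypothetical record holding the attributes named in the text."""
    name: str
    species: str
    inputs: List[Entity]
    outputs: List[Entity]
    modulators: List[Entity] = field(default_factory=list)
    literature_refs: List[str] = field(default_factory=list)
    summary: str = ""

    def is_transport(self) -> bool:
        """True if a molecule appears as input and output in different compartments."""
        start = {e.name: e.compartment for e in self.inputs}
        return any(
            e.name in start and start[e.name] != e.compartment
            for e in self.outputs
        )


# Illustrative example: glucose uptake mediated by an SLC2 family transporter.
example = Reaction(
    name="Glucose transport across the plasma membrane",
    species="Homo sapiens",
    inputs=[Entity("glucose", "extracellular region")],
    outputs=[Entity("glucose", "cytosol")],
    modulators=[Entity("SLC2A1", "plasma membrane")],
    summary="Illustrative record only; not copied from Reactome.",
)
print(example.is_transport())
```

Keeping the compartment on each entity, and not on the reaction, is what lets a transport event be expressed as the same molecule changing location between input and output.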
Tools in Reactome
Reactome is a freely available database of pathways. The SLC family of transporters plays a vital role in mediating the movement of essential metals, ions, drugs and many endogenous compounds into and out of the cell and cellular organelles. Information about the SLC family of transporters has been systematically annotated in Reactome, and this provides a basis for a number of analyses which can be performed on these data. These analyses include the overlay of interaction and expression data, over-representation analysis and species comparison. The results of such analyses can be starting points for further investigations using systems biology. The value of the database to users should continue to grow as additional pathways are annotated and new software for data analysis and integration is developed. Work is now under way to improve the visual overview of expression data and provide closer integration with Cytoscape (http://www.cytoscape.org/), an open-source platform for the visualisation and data integration of biological pathways and networks. Tools are being developed to support additional analysis of interactors, including functional interactors; to pull data from other omics sources, such as expression data or transcription factors; and to support integration with medical data.
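As an illustration of what an over-representation analysis computes, the sketch below uses a one-sided hypergeometric test, a common choice for this kind of question. The text does not state which statistic Reactome's tool applies, so this is an assumption, and the function name `hypergeom_pvalue` is hypothetical. The numbers in the usage example are invented apart from the 231 annotated SLC transporters mentioned earlier.

```python
from math import comb


def hypergeom_pvalue(population, pathway_size, sample_size, hits):
    """Probability of at least `hits` pathway members in a random sample.

    population   -- number of genes or proteins in the background set
    pathway_size -- number of background members annotated to the pathway
    sample_size  -- size of the user-supplied list
    hits         -- members of the user list that fall in the pathway
    """
    upper = min(pathway_size, sample_size)
    total = comb(population, sample_size)
    # Sum the upper tail of the hypergeometric distribution exactly.
    tail = sum(
        comb(pathway_size, k) * comb(population - pathway_size, sample_size - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical usage: background of 20000 proteins, a pathway of 231 SLC
# transporters, and a user list of 100 proteins of which 8 are in the pathway.
p_value = hypergeom_pvalue(20000, 231, 100, 8)
print(f"{p_value:.3g}")
```

A small p-value suggests that the pathway is represented in the submitted list more often than chance alone would predict; in practice a correction for testing many pathways at once would also be applied.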
Development of the Reactome data model and database is a collaborative project and this work benefited greatly from interactions with fellow Reactome team members (http://www.reactome.org/about.html), as well as colleagues at the EBI from GO, ChEBI, ChEMBL, UniProt, ArrayExpress and IntAct. This work was supported by grants from the US National Human Genome Research Institute, NIH (P41 HG003751) and the EU 6th Framework Programme 'ENFIN' (LSHG-CT-2005-518254).