Medicago truncatula transporter database: a comprehensive database resource for M. truncatula transporters
Medicago truncatula has been chosen as a model species for genomic studies. It is closely related to an important legume, alfalfa. Transporters are a large group of membrane-spanning proteins. They deliver essential nutrients, eject waste products, and assist the cell in sensing environmental conditions by forming a complex system of pumps and channels. Although studies have effectively characterized individual M. truncatula transporters in several databases, until now there has been no available systematic database that includes all transporters in M. truncatula.
The M. truncatula transporter database (MTDB) contains comprehensive information on the transporters in M. truncatula. We developed a novel prediction pipeline based on the TransportTP method. A total of 3,665 putative transporters have been annotated based on the International Medicago Genome Annotated Group (IMGAG) V3.5 V3 and the M. truncatula Gene Index (MTGI) V10.0 releases and assigned to 162 families according to the transporter classification system. These families were further classified into seven types according to their transport mode and energy coupling mechanism. Extensive annotations were generated for each protein, including basic protein function, expressed sequence tag (EST) mapping, genome locus, three-dimensional template prediction, transmembrane segment, and domain annotation. A chromosome distribution map and text-based and Basic Local Alignment Search Tool (BLAST) search tools were also created. In addition, we have provided a way to explore the expression of putative M. truncatula transporter genes under stress treatments.
In summary, the MTDB enables the exploration and comparative analysis of putative transporters in M. truncatula. A user-friendly web interface and regular updates make MTDB valuable to researchers in related fields. The MTDB is freely available now to all users at http://bioinformatics.cau.edu.cn/MtTransporter/.
Keywords: Protein Data Bank; Transporter Gene; Basic Local Alignment Search Tool; Basic Local Alignment Search Tool Search; Tentative Consensus
Medicago truncatula is closely related to an important forage legume, alfalfa. Because of its advantageous characteristics such as small size, short generation time, self-fertility, and diploid genome, M. truncatula has been used as a model species in genomic studies [1, 2]. Arabidopsis thaliana is a model plant whose genome was sequenced by an international consortium and is well annotated. Very high sequence identity exists between genes from M. truncatula and their counterparts from alfalfa (98.7% at the amino acid level for isoflavone reductase and 99.1% at the amino acid level for vestitone reductase), so M. truncatula serves as a genetically tractable model for alfalfa, which is tetraploid. In addition to alfalfa, M. truncatula can act as a model organism for economically important legumes such as soybeans. The legume family is second only to the grass family in its importance to humans as a source of food, feed for livestock, and raw materials for industry. In a symbiotic association with rhizobia, legumes supply their own nitrogen by reducing N2 to NH3. This mutually beneficial association supplies a free and renewable source of available nitrogen for legumes and other crops. By establishing symbiosis with mycorrhizal fungi, legumes also obtain phosphorus and other nutrients from the soil.
Transporters represent a large and diverse group of membrane-spanning proteins. They deliver essential nutrients, eject waste products, and assist the cell in sensing environmental conditions by forming a complex system of pumps and channels. They differ in membrane topology, energy-coupling mechanisms, and substrate specificities. Numerous studies have demonstrated that transporters play indispensable roles in the fundamental cellular processes of all organisms. In addition, transporters provide pathogenic bacteria with resistance to antibiotics and provide cancer cells with resistance to chemotherapies. Systematic studies have been performed to identify and characterize the transporters in a variety of plant species, such as Arabidopsis and rice. With the assistance of databases containing known and characterized transport proteins, transporters in new species can be identified and classified via sequence similarity. Perhaps the most comprehensive of these databases is the Transporter Classification Database (TCDB), which contains a large group of functionally characterized transporters. It also categorizes new transporters into families and subfamilies based on molecular, evolutionary, and functional properties [8, 9].
However, although studies have characterized individual M. truncatula transporters in several databases, there has been no systematic database that includes all transporters in M. truncatula. Extensive cDNA and genomic DNA sequencing of several legume species (e.g., M. truncatula, soybeans, and Lotus japonicus) has been carried out over the past few years and has provided an interesting model system for analyzing whole-genome transporters [10, 11, 12, 13]. The genomic sequence of M. truncatula is being annotated by the International Medicago Genome Annotated Group (IMGAG), which described 47,529 genes in version 3.5v3 of the genome sequence (http://www.medicagohapmap.org/downloads_genome/Mt3.5/). Additional resources relevant to Medicago functional genomics include the Medicago genome portal at the Noble Foundation, which provides final annotation analysis results on Medicago genes. To help researchers interested in M. truncatula transport proteins, we report the development of the M. truncatula transporter database (MTDB), which contains information about M. truncatula transporters derived from a comparison to the protein sequences of TCDB and A. thaliana, the most well-studied genetic model plant. The MTDB archives 3,665 putative M. truncatula transport proteins belonging to 162 families. This represents 7.5% of all predicted proteins in Medicago and is in line with what has been found in other plant species. For example, transporter genes account for 4.6% of all Arabidopsis genes and 5% of all rice genes [16, 17]. The aim of the MTDB is to present the comprehensive transporter profiles of sequenced M. truncatula, as well as to provide comparative tools and phylogenetic trees for viewing, searching, and comparing the transporter data in an easy-to-navigate format.
Construction and content
Genome sequence data acquisition
Protein sequences of M. truncatula and their annotations were derived from the IMGAG. Transport protein sequences of A. thaliana and their annotations came from TransportDB. Reference transporter data were downloaded from the TCDB web site in March 2011 and contained 6,068 transporters. Pfam annotations came from the Pfam database, version 24.0. Three-dimensional (3D) structure annotations were provided by the Protein Data Bank (PDB). Medicago transporter annotations based on the IMGAG V3.5 V3 were derived from the Medicago genome portal at the Noble Foundation.
Identification of putative transporters
We also mapped Medicago EST data (M. truncatula Gene Index [MTGI] version 10.0) onto the 3,598 putative transport proteins using mutual BLASTn. The BLASTn search results were analyzed with Perl scripts using an e-value cut-off of 10^-30. Of the 68,848 Tentative Consensus (TC) sequences and singletons in the EST database, 6,623 encoded proteins similar to our putative transport proteins.
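The filtering step described above can be illustrated with a minimal Python sketch. It is not the authors' Perl code. It assumes BLAST results in the standard 12-column tabular layout (e-value in column 11) and reads "mutual BLASTn" as keeping pairs that are each other's best hit, which is only one possible interpretation. The function names and the toy identifiers are hypothetical.

```python
import csv
import io

E_VALUE_CUTOFF = 1e-30  # cut-off stated in the text


def best_hits(tabular_text, cutoff=E_VALUE_CUTOFF):
    """Map each query to its lowest-e-value subject among hits passing the cutoff."""
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank and comment lines
        query, subject, evalue = row[0], row[1], float(row[10])
        if evalue > cutoff:
            continue
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return {query: subject for query, (subject, _) in best.items()}


def mutual_hits(forward_text, reverse_text, cutoff=E_VALUE_CUTOFF):
    """Keep (query, subject) pairs that are each other's best hit (assumed reading)."""
    fwd = best_hits(forward_text, cutoff)
    rev = best_hits(reverse_text, cutoff)
    return sorted((q, s) for q, s in fwd.items() if rev.get(s) == q)


def _row(query, subject, evalue):
    # Toy 12-column record; only query, subject and e-value matter here.
    return "\t".join([query, subject, "99.0", "500", "2", "0",
                      "1", "500", "1", "500", evalue, "900"])


forward = "\n".join([_row("gene1", "TC1", "1e-50"),
                     _row("gene1", "TC2", "1e-35"),
                     _row("gene2", "TC3", "1e-10")])
reverse = "\n".join([_row("TC1", "gene1", "1e-50"),
                     _row("TC2", "gene9", "1e-40")])
pairs = mutual_hits(forward, reverse)
print(pairs)
```

In this toy input, the gene2 hit is discarded because its e-value is above the cut-off, and only the gene1/TC1 pair is reciprocal.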
Transport proteins in Medicago truncatula transporter database (MTDB) and classification according to Transporter Classification Database (TCDB) classes.
2. Electrochemical potential-driven transporters
3. Primary active transporters
4. Group translocators
5. Transmembrane electron carriers
8. Accessory factors involved in transport
9. Incompletely characterized transport systems
We constructed and configured MTDB on a typical LAMP (Linux + Apache + MySQL + PHP) platform. The data set was stored in MySQL 4.1 (http://www.mysql.com), and the web interface was implemented using PHP scripts (PHP version 4.4; http://www.php.net) on Red Hat Linux, served by an Apache server. The database schema in the current version of MTDB consists of five tables [Additional file 1]. Table pro stores whole-genome transport protein predictions and expressed sequence tag (EST) mapping data; table domain stores the protein domain annotations predicted by Pfam; table tmhmm stores the transmembrane segments predicted by TMHMM; and table structure stores the experimentally determined 3D structures of membrane transporters. An additional table, express, stores information on the expression of putative M. truncatula transporter genes under stress treatments.
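To make the five-table layout concrete, here is a small Python sketch. It uses the standard-library sqlite3 module as a stand-in for MySQL. Only the table names (pro, domain, tmhmm, structure, express) come from the text; every column name and the sample row are hypothetical, and the real schema is the one given in Additional file 1.

```python
import sqlite3

# Table names follow the text; all columns are illustrative assumptions.
SCHEMA = """
CREATE TABLE pro (
    protein_id  TEXT PRIMARY KEY,
    tc_family   TEXT,
    description TEXT,
    est_id      TEXT
);
CREATE TABLE domain (
    protein_id TEXT REFERENCES pro(protein_id),
    pfam_id    TEXT,
    start_pos  INTEGER,
    end_pos    INTEGER
);
CREATE TABLE tmhmm (
    protein_id  TEXT REFERENCES pro(protein_id),
    helix_count INTEGER,
    topology    TEXT
);
CREATE TABLE structure (
    protein_id TEXT REFERENCES pro(protein_id),
    pdb_id     TEXT,
    ffas_score REAL
);
CREATE TABLE express (
    protein_id TEXT REFERENCES pro(protein_id),
    probe_set  TEXT,
    time_point TEXT,
    value      REAL
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Hypothetical example record, not taken from MTDB.
conn.execute("INSERT INTO pro VALUES (?, ?, ?, ?)",
             ("example_protein_1", "2.A.1", "hypothetical sugar transporter", "TC_example"))
conn.execute("INSERT INTO domain VALUES (?, ?, ?, ?)",
             ("example_protein_1", "PF00083", 10, 450))

# Join a protein to its domain annotation, as a detail page might do.
rows = conn.execute(
    "SELECT p.protein_id, p.tc_family, d.pfam_id "
    "FROM pro AS p JOIN domain AS d ON d.protein_id = p.protein_id"
).fetchall()
tables = {name for (name,) in
          conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")}
print(rows)
```

Keying every annotation table on the protein identifier keeps each prediction type in its own table while still allowing a single join per detail page.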
Utility and discussion
Comparative tools and references
A map of gene loci on the chromosomes was generated and visualized using GenomePixelizer, which gives users a direct view of the distribution of M. truncatula's putative transporter genes on chromosomes and is especially useful for observing tandem duplications [Additional file 3]. Three sections are included in the advanced tools function: structure, transmembrane segment, and expression. The "structure" section describes putative transporters that have sequence homology with experimentally determined 3D structures. We used the transport protein sequences of M. truncatula to conduct a Position-Specific Iterative (PSI)-BLAST search against protein sequences provided by the PDB. We set the maximum number of iterations at three and the e-value cut-off at 0.001 (PSI-BLAST-based method). In addition, we used the FFAS tools to filter the PSI-BLAST results. Lower (more negative) FFAS scores indicate stronger similarity. FFAS scores lower than -9.5 are expected to contain less than 3% false positives, as indicated by comprehensive benchmarks of known structures. A total of 1,950 putative transporters were represented by structures in the PDB. Links to the PDB and MTDB are also provided.
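The two thresholds in this paragraph can be combined in a short filter. The sketch below is illustrative only: it assumes each candidate template has already been reduced to a record with a PSI-BLAST e-value and an FFAS score, and the record layout, function name, and sample values are hypothetical.

```python
PSIBLAST_EVALUE_CUTOFF = 0.001  # e-value cut-off stated in the text
FFAS_SCORE_CUTOFF = -9.5        # scores below this are treated as reliable


def confident_templates(hits):
    """Keep hits that pass both the PSI-BLAST and the FFAS threshold."""
    return [
        hit for hit in hits
        if hit["evalue"] <= PSIBLAST_EVALUE_CUTOFF
        and hit["ffas_score"] < FFAS_SCORE_CUTOFF
    ]


# Hypothetical candidate templates for illustration.
candidates = [
    {"protein": "protein_A", "pdb_id": "xxxA", "evalue": 1e-20, "ffas_score": -45.0},
    {"protein": "protein_B", "pdb_id": "xxxB", "evalue": 5e-4, "ffas_score": -7.2},
    {"protein": "protein_C", "pdb_id": "xxxC", "evalue": 0.5, "ffas_score": -30.0},
]
kept = confident_templates(candidates)
print([hit["protein"] for hit in kept])
```

Here protein_B fails the FFAS threshold (its score is not negative enough) and protein_C fails the e-value threshold, so only protein_A is retained.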
In the "transmembrane segment" section, users can submit a single protein sequence to the service at http://www.cbs.dtu.dk/services/TMHMM/, and then TMHMM outputs statistics and a list of the locations of the predicted transmembrane helices and the predicted locations of the intervening loop regions. This information can be shown graphically.
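For readers who want to post-process such predictions, the following Python sketch parses one line of TMHMM's one-line-per-protein ("short") output. The field layout (len=, ExpAA=, First60=, PredHel=, Topology=) is assumed from that output mode, and the example line and function name are made up for illustration.

```python
import re


def parse_tmhmm_short(line):
    """Parse one tab-separated line of TMHMM short-format output."""
    seq_id, *fields = line.strip().split("\t")
    info = dict(field.split("=", 1) for field in fields)
    topology = info["Topology"]
    # Each "start-end" pair in the topology string is one predicted helix.
    helices = [(int(start), int(end))
               for start, end in re.findall(r"(\d+)-(\d+)", topology)]
    return {
        "id": seq_id,
        "length": int(info["len"]),
        "pred_hel": int(info["PredHel"]),
        "helices": helices,
        # The leading letter gives the side of the N-terminus (i = inside).
        "n_terminus": "inside" if topology.startswith("i") else "outside",
    }


example = "example_protein\tlen=120\tExpAA=44.10\tFirst60=22.05\tPredHel=2\tTopology=i7-29o44-66i"
result = parse_tmhmm_short(example)
print(result["helices"], result["n_terminus"])
```

The residues between consecutive helices correspond to the intervening loop regions mentioned above.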
Mapping of probe sets onto transporter genes
The consensus sequence of each probe set was provided by Affymetrix. A total of 3,665 predicted transporter coding sequences were matched to probe sets using mutual BLASTn. The annotation of the best match was assigned to the probe set (best BLAST hit method). We set the e-value cut-off at 10^-4 and required the high-scoring segment pair to be longer than 100 bp, and then used Perl and BioPerl scripts to analyze the BLAST search results (Mapping method in MtED). In total, of the 3,665 putative transporters, 2,039 were represented by probe sets on the Affymetrix Medicago GeneChip. Probe set mapping information for all identified transporters was imported into the MTDB database to facilitate web searches and displays.
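A best-hit assignment with these two filters might look like the following Python sketch. It is a simplified stand-in for the Perl and BioPerl scripts mentioned above: hits are assumed to be pre-parsed into records, the best hit is taken to be the one with the highest bit score (an assumption), and all identifiers are hypothetical.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    probe_set: str
    transporter: str
    align_len: int    # length of the high-scoring segment pair, in bp
    evalue: float
    bit_score: float


def map_probe_sets(hits, max_evalue=1e-4, min_len=100):
    """Assign each probe set to its best-scoring transporter hit passing both filters."""
    best = {}
    for hit in hits:
        if hit.evalue > max_evalue or hit.align_len <= min_len:
            continue  # fails the e-value or the HSP-length requirement
        current = best.get(hit.probe_set)
        if current is None or hit.bit_score > current.bit_score:
            best[hit.probe_set] = hit
    return {probe: hit.transporter for probe, hit in best.items()}


# Hypothetical hits for illustration.
hits = [
    Hit("probe_A", "transporter_1", 350, 1e-80, 600.0),
    Hit("probe_A", "transporter_2", 300, 1e-40, 320.0),
    Hit("probe_B", "transporter_3", 80, 1e-20, 150.0),   # HSP too short
    Hit("probe_C", "transporter_4", 200, 1e-2, 40.0),    # e-value too high
]
mapping = map_probe_sets(hits)
print(mapping)
```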
Microarray expression value
We further provided a way to explore the expression of putative M. truncatula transporter genes under stress treatments. To this end, we retrieved and categorized microarray expression data from the Gene Expression Omnibus (GEO). We selected two independent GEO series, GSE13921 and GSE27991. GSE13921 was provided by MtED, which includes functional category analysis, query and mapping tools, and tools for the comparison and visualization of expression profiles. We mainly used MtED's data because of its high quality and experimental continuity. In the MtED experiments, roots were collected at 0 h, 6 h, 24 h, and 48 h after salt stress for microarray analysis. Probe sets whose expression at a given time point increased or decreased by more than two-fold relative to 0 h were described as up-regulated or down-regulated, respectively. Based on the expression analysis, at 6 h of stress, 47.7% of the transporter genes (972) were up-regulated and 52.2% (1,064) were down-regulated. At 24 h, 49.6% of the transporter genes (1,011) were up-regulated and 50.4% (1,028) were down-regulated. At 48 h, 50.3% of the transporter genes (1,026) were up-regulated and 49.5% (1,009) were down-regulated. The other GEO series, GSE27991, contains expression data from M. truncatula roots treated with auxin transport inhibitors. We made pairwise comparisons within each series among samples grown under the same conditions, and users can directly inspect gene expression values by searching for any MtID. Each result contains links to the experiment page, which provides users with expression curve graphs and other annotation links.
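The classification and the reported percentages can be reproduced with a few lines of Python. This sketch makes two assumptions that are not stated in the source: expression values are on a linear (not log) scale, and the percentages use the 2,039 transporters represented on the array as the denominator. The function names are hypothetical.

```python
def classify(baseline, treated, fold=2.0):
    """Label a probe set by its fold change relative to the 0 h value."""
    ratio = treated / baseline
    if ratio >= fold:
        return "up"
    if ratio <= 1.0 / fold:
        return "down"
    return "unchanged"


def percent(count, total):
    """Percentage rounded to one decimal place."""
    return round(100.0 * count / total, 1)


TRANSPORTERS_ON_ARRAY = 2039  # transporters represented by probe sets

# Counts reported in the text for the 6 h time point.
up_6h = percent(972, TRANSPORTERS_ON_ARRAY)
down_6h = percent(1064, TRANSPORTERS_ON_ARRAY)

print(classify(100.0, 250.0), classify(100.0, 40.0), classify(100.0, 150.0))
print(up_6h, down_6h)
```

With that denominator, the 6 h counts give 47.7% and 52.2%, matching the figures quoted above.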
The M. truncatula Gene Expression Atlas (MtGEA) is a comprehensive platform that provides complete transcription profiles of all major organ systems of M. truncatula. We included suitable links to MtGEA on the expression page and transporter detail page so that users can readily examine transcriptome information for their probe set of interest in the MTDB.
In the future, we will continue to incorporate new expression information. Regular updates and related analyses will provide users with up-to-date transporter expression information.
MTDB was developed as a relational database for the comprehensive representation of M. truncatula transporter systems. As the M. truncatula genome is currently being annotated by an international consortium, the available information on this model legume (including sequences, 3D structures, expression, and pathway information) will become more comprehensive and accurate. MTDB will be updated monthly with new annotation information.
In summary, we built MTDB as a MySQL relational database with a PHP web interface running on a Linux server. The MTDB is the first convenient web-based index database of transporters in the model legume M. truncatula. It will assist researchers in related fields by providing comprehensive information on transporter gene families and the members of these families. The MTDB enables the exploration and comparative analysis of putative transporters in M. truncatula. A total of 3,665 putative transport proteins have been annotated and assigned to 162 families according to the TC classification system. These families are further classified into seven types according to their transport mode and energy-coupling mechanism. Both manual curation and automated searches were used to identify putative protein sequences. Extensive annotations were generated for each protein, including basic protein function, genome locus, sequence annotations, EST mapping results, 3D template predictions, transmembrane segments, and domain annotation. A chromosome distribution map and text-based and BLAST search tools against known sequences of M. truncatula were also created. A user-friendly web interface and regular updates make MTDB valuable to researchers in related fields. We further provided a way to explore the expression of M. truncatula transporter genes under stress treatments. The MTDB is freely available now to all users at http://bioinformatics.cau.edu.cn/MtTransporter/.
Availability and requirements
We would like to thank Wenying Xu and Zhou Du for their assistance in the construction of the distribution map and for their helpful feedback on other aspects of the MTDB web site. This work was supported by grants from the Ministry of Science and Technology of China (2012CB215300) and the Ministry of Education of China (NCET-09-0735).
- 1. Barker DG, Bianchi S, Blondon F, Dattée Y, Duc G, Essad S, Flament P, Gallusci P, Génier G, Guy P, Muel X, Tourneur J, Dénarié J, Huguet T: Medicago truncatula, a model plant for studying the molecular genetics of the Rhizobium-legume symbiosis. Plant Molecular Biology Reporter. 1990, 8: 40-49. 10.1007/BF02668879.
- 6. Smith SE, Read DJ: Mycorrhizal Symbiosis. 2008, San Diego: Academic Press.
- 9. Saier MH, Tran CV, Barabote RD: TCDB: the Transporter Classification Database for membrane transport protein analyses and information. Nucleic Acids Res. 2006, 34 (Database issue): D181-186.
- 14. Medicago Sequencing Resources. [http://www.medicago.org/]
- 15. The Medicago genome portal at the Noble Foundation. [http://bioinfo3.noble.org/medicago/MT3.5/]
- 18. Ren Q, Kang KH, Paulsen IT: TransportDB: a relational database of cellular membrane transport systems. Nucleic Acids Res. 2004, 32 (Database issue): D284-288.
- 19. Pfam database. [ftp://ftp.sanger.ac.uk/pub/databases/Pfam]
- 22. Chukkapalli G, Guda C, Subramaniam S: SledgeHMMER: a web server for batch searching the Pfam database. Nucleic Acids Res. 2004, 32 (Web Server issue): W542-544.
- 23. BioPerl. [http://www.bioperl.org/]
- 27. MTGI version 10.0. [http://compbio.dfci.harvard.edu/tgi/cgi-bin/tgi/gimain.pl?gudb=medicago/]
- 31. Jaroszewski L, Rychlewski L, Li Z, Li W, Godzik A: FFAS03: a server for profile-profile sequence alignments. Nucleic Acids Res. 2005, 33 (Web Server issue): W284-288.
- 32. Affymetrix GeneChip Medicago Genome Array. [http://www.affymetrix.com/products_services/arrays/specific/medicago.affx]
- 35. Barrett T, Troup DB, Wilhite SE, Ledoux P, Rudnev D, Evangelista C, Kim IF, Soboleva A, Tomashevsky M, Edgar R: NCBI GEO: mining tens of millions of expression profiles - database and tools update. Nucleic Acids Res. 2007, 35 (Database issue): D760-765.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.