Abstract
Membrane proteins are prime constituents of a cell and act as mediators between intracellular and extracellular processes. Predicting transmembrane (TM) helices and their topology provides essential information about the function and structure of membrane proteins. However, this prediction remains a challenging problem in bioinformatics and computational biology because of experimental complexities and the lack of established structures. Therefore, the location and orientation of TM helix segments are predicted from topogenic sequences. In this regard, we propose the WRF-TMH model for effectively predicting TM helix segments. In this model, information is extracted from membrane protein sequences using the compositional index and physicochemical properties. Redundant and irrelevant features are eliminated through singular value decomposition. The selected features from the two feature extraction strategies are then fused to develop a hybrid model. A weighted random forest is adopted as the classifier. We used two benchmark datasets: a low-resolution and a high-resolution dataset. Tenfold cross-validation is employed to assess the performance of the WRF-TMH model at different levels, including per protein, per segment, and per residue. The success rates of the WRF-TMH model are quite promising and are the best reported so far on the same datasets. The WRF-TMH model might therefore play a substantial role in, and provide essential information for, further structural and functional studies on membrane proteins. The accompanying web predictor is accessible at http://111.68.99.218/WRF-TMH/.
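The feature extraction and fusion steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the log-odds form of the compositional index, the choice of the Kyte-Doolittle hydropathy scale as the physicochemical property, the window size, the pseudocount, and all function names are assumptions made for illustration. The singular value decomposition and weighted random forest stages are omitted.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy scale, used here as ONE example of a
# physicochemical property (assumption: the abstract does not name the scales).
KD = {"A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
      "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
      "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
      "Y": -1.3, "V": 4.2}


def compositional_index(tm_segments, non_tm_segments, pseudocount=1.0):
    """Log-odds of each amino acid occurring in TM versus non-TM regions.

    Assumed definition: ln(f_TM / f_nonTM) with additive smoothing.
    """
    tm = Counter("".join(tm_segments))
    non_tm = Counter("".join(non_tm_segments))
    n = len(AMINO_ACIDS)
    tm_total = sum(tm[a] for a in AMINO_ACIDS) + pseudocount * n
    non_total = sum(non_tm[a] for a in AMINO_ACIDS) + pseudocount * n
    return {
        a: math.log(((tm[a] + pseudocount) / tm_total)
                    / ((non_tm[a] + pseudocount) / non_total))
        for a in AMINO_ACIDS
    }


def window_mean(values, i, half_window):
    """Mean of values in a window centred on i, truncated at sequence ends."""
    lo = max(0, i - half_window)
    hi = min(len(values), i + half_window + 1)
    return sum(values[lo:hi]) / (hi - lo)


def fused_features(sequence, ci, half_window=3):
    """Per-residue feature vector: (smoothed compositional index, smoothed hydropathy)."""
    ci_vals = [ci.get(a, 0.0) for a in sequence]
    kd_vals = [KD.get(a, 0.0) for a in sequence]
    return [
        (window_mean(ci_vals, i, half_window), window_mean(kd_vals, i, half_window))
        for i in range(len(sequence))
    ]


# Toy data, invented purely for illustration.
ci = compositional_index(["LLIVALLFAI"], ["KRDESNQGTK"])
seq = "KRDELLIVALLKRD"
features = fused_features(seq, ci)
```

Each residue ends up with one fused feature vector. In a full pipeline these vectors would be reduced with singular value decomposition and passed to a class-weighted random forest, with performance estimated by tenfold cross-validation.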
Acknowledgments
This work was supported by the Higher Education Commission of Pakistan under the indigenous PhD scholarship program 17-5-3 (Eg3-045)/HEC/Sch/2006.
Cite this article
Hayat, M., Khan, A. WRF-TMH: predicting transmembrane helix by fusing composition index and physicochemical properties of amino acids. Amino Acids 44, 1317–1328 (2013). https://doi.org/10.1007/s00726-013-1466-4