INTERACT-O-FINDER: A Tool for Prediction of DNA-Binding Proteins Using Sequence Features

Samant, Monika; Jethva, Minesh; Hasija, Yasha

doi:10.1007/s10989-014-9446-4

INTERACT-O-FINDER: A Tool for Prediction of DNA-Binding Proteins Using Sequence Features

Published: 09 December 2014

Volume 21, pages 189–193, (2015)
Cite this article

International Journal of Peptide Research and Therapeutics Aims and scope Submit manuscript

Monika Samant¹,
Minesh Jethva² &
Yasha Hasija¹

215 Accesses
1 Citation
Explore all metrics

Abstract

Protein–DNA interactions carry out many important regulatory functions in our body which are essential for growth and survival. The molecular level understanding of these interactions helps us to decipher the mechanism of regulation. Conventionally these interactions were identified using small scale experimental techniques and high throughput technologies. But these approaches being time consuming and expensive, arouse the need of a computational approach for the prediction of interactions. To this end, a machine learning approach for predicting and classifying DNA-binding proteins has been used. Weka and LibSVM platform have been used to develop an effective classifying model. The different classifiers were applied on the attributes generated from protein sequence and tried to develop an efficient model for classifying a protein as DNA-interacting or DNA-non-interacting. Among several classifying algorithms applied to generate models, best performance was achieved using LibSVM with 87.83 % accuracy. The tool named INTERACT-O-FINDER, based on the prediction model is available at http://interacto.eurekanow.org/index.html.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

iDNAProt-ES: Identification of DNA-binding Proteins Using Evolutionary and Structural Features

Article Open access 02 November 2017

An Empirical Investigation of Discretization Techniques on the Classification of Protein–Protein Interaction

Enhanced Prediction of DNA-Binding Proteins and Classes

References

Ahmad S, Sarai A (2004) Moment-based prediction of DNA-binding proteins. J Mol Biol 341:65–71
Article CAS PubMed Google Scholar
András S, Jeffrey S (2006) Efficient prediction of nucleic acid binding function from low-resolution protein structures. J Mol Biol 358:922–933
Article Google Scholar
Andrea S, Ondřej K, Filip Ž, Jakub T (2012) Prediction of DNA-binding proteins from relational features. Proteome Sci 10:66
Article Google Scholar
Bhardwaj N, Langlois RE, Zhao G, Lu H (2005) Kernel-based machine learning protocol for predicting DNA-binding proteins. Nucleic Acids Res 33:6486–6493
Article PubMed Central CAS PubMed Google Scholar
Blouin S, Craggs TD, Lafontaine DA, Penedo JC (2009) Functional studies of DNA–protein interactions using FRET techniques. Methods Mol Biol 543:475–502
Article CAS PubMed Google Scholar
Caragea C, Caragea D, Silvescu A, Honavar V (2010) Semi-supervised prediction of protein subcellular localization using abstraction augmented Markov models. BMC Bioinform 11:S6
Article Google Scholar
Chang CC and Lin CJ (2011) LIBSVM: a library for support vector machines. ACM Trans Intell Syst Technol 2:27:1–27:27
Chawla NV, Japkowicz N, Kotcz A (2004) Editorial: special issue on learning from imbalanced data sets. SIGKDD Explor Newsl 6:1–6
Article Google Scholar
Chu WY, Huang YF, Huang CC, Cheng YS, Huang CK, Oyang YJ (2009) ProteDNA: a sequence-based predictor of sequence-specific DNA-binding residues in transcription factors. Nucleic Acids Res 37:W396–W401
Article PubMed Central CAS PubMed Google Scholar
Chuanxin Z, Jiayu G, Honglin L (2013) An improved sequence based prediction protocol for DNA-binding proteins using SVM and comprehensive feature analysis. BMC Bioinform 14:90
Article Google Scholar
Fang Y, Guo Y, Feng Y, Li M (2008) Predicting DNA-binding proteins: approached from Chou’s pseudo amino acid composition and other specific sequence features. Amino Acids 34:103–109
Article CAS PubMed Google Scholar
Gao M, Skolnick J (2009) A threading-based method for the prediction of DNA-binding proteins with application to the human genome. PLoS Comput Biol 5:e1000567
Article PubMed Central PubMed Google Scholar
Guy N, András S, Christina L, Nir BT (2009) Identification of DNA-binding proteins using structural, electrostatic and evolutionary features. J Mol Biol 387:1040–1053
Article Google Scholar
Huang Y, Niu B, Gao Y, Limin F, Li W (2010) CD-HIT Suite: a web server for clustering and comparing biological sequences. Bioinformatics 26:680–682
Article PubMed Central CAS PubMed Google Scholar
Hwang S, Gou Z, Kuznetsov IB (2007) DP-Bind: a web server for sequence-based prediction of DNA-binding residues in DNA-binding proteins. Bioinformatics 23:634–636
Article CAS PubMed Google Scholar
Jones DT (1999) Protein secondary structure prediction based on position-specific scoring matrices. J Mol Biol 292:195–202
Article CAS PubMed Google Scholar
Jones S, Shanahan HP, Berman HM, Thornton JM (2003) Using electrostatic potentials to predict DNA-binding sites on DNA-binding proteins. Nucleic Acids Res 31:7189–7198
Article PubMed Central CAS PubMed Google Scholar
Kumar M, Gromiha M, Raghava G (2007) Identification of DNA-binding proteins using support vector machines and evolutionary profiles. BMC Bioinform 8:463
Article CAS Google Scholar
Langlois RE, Lu H (2010) Boosting the prediction and understanding of DNA-binding domains from sequence. Nucleic Acids Res 38:3149–3158
Article PubMed Central CAS PubMed Google Scholar
Larrañaga Pedro, Calvo Borja, Santana Roberto, Bielza Concha et al (2006) Machine learning in bioinformatics. Brief Bioinform 7:86–112
Article PubMed Google Scholar
Lin WZ, Fang JA, Xiao X, Chou KC (2011) IDNA-prot: identification of DNA binding proteins using random forest with grey model. PLoS ONE 6:e24756
Article PubMed Central CAS PubMed Google Scholar
Liu LA, Bradley P (2012) Atomistic modeling of protein–DNA interaction specificity: progress and applications. Curr Opin Struct Biol 22:397–405
Article PubMed Central PubMed Google Scholar
Luscombe NM, Austin SE, Berman HM, Thornton JM (2000) An overview of the structures of protein–DNA complexes. Genome Biol 1:1–37
Article Google Scholar
Mark Hall, Eib Frank, Geoffrey Holmes, Bernhard Pfahringer, Peter Reutemann, Ian Witten H (2009) The WEKA data mining software: an update. SIGKDD Explor Newsl 11:10–18
Article Google Scholar
Nagarajan R, Shander A, Gromiha M (2013) Novel approach for selecting best predictor for identifying binding sites in DNA binding proteins. Nucleic Acids Res 41:7606–7614
Article PubMed Central CAS PubMed Google Scholar
Ofran Y, Mysore V, Rost B (2007) Prediction of DNA-binding residues from sequence. Bioinformatics 23:i347–i353
Article CAS PubMed Google Scholar
Ren B, Robert F, Wyrick JJ, Aparicio O, Jennings EG, Simon I, Zeitlinger J, Schreiber J, Hannett N, Kanin E et al (2000) Genome-wide location and function of DNA binding proteins. Science 290:2306–2309
Article CAS PubMed Google Scholar
Sarai A, Kono H (2005) Protein–DNA recognition patterns and predictions. Annu Rev Biophys Biomol Struct 34:379–398
Article CAS PubMed Google Scholar
Shao X, Tian Y, Wu L, Wang Y, Jing L, Deng N (2009) Predicting DNA-and RNA-binding proteins from sequences with kernel methods. J Theor Biol 258:289–293
Article CAS PubMed Google Scholar
Si J, Zhang Z, Lin B, Schroeder M, Huang B (2011) MetaDBSite: a meta approach to improve protein DNA-binding sites prediction. BMC Syst Biol 5:S7
Article PubMed Central CAS PubMed Google Scholar
Stawiski EW, Gregoret LM, Mandel-Gutfreund Y (2003) Annotating nucleic acid-binding function based on protein structure. J Mol Biol 326:1065–1079
Article CAS PubMed Google Scholar
Tarca AL, Carey VJ, Chen XW, Romero R, Draghici S (2007) Machine learning and its applications to biology. PLoS Comput Biol 3:e116
Article PubMed Central PubMed Google Scholar
Tjong H, Zhou HX (2007) DISPLAR: an accurate method for predicting DNA-binding sites on protein surfaces. Nucleic Acids Res 35:1465–1477
Article PubMed Central CAS PubMed Google Scholar
Wang L, Brown SJ (2006) BindN: a web-based tool for efficient prediction of DNA and RNA binding sites in amino acid sequences. Nucleic Acids Res 34:W243–W248
Article PubMed Central CAS PubMed Google Scholar
Wang G, Dunbrack RL Jr (2003) PISCES: a protein sequence culling server. Bioinformatics 19:1589–1591
Article CAS PubMed Google Scholar
Xiong Y, Liu J, Wei DQ (2011) An accurate feature-based method for identifying DNA-binding residues on protein surfaces. Proteins 79:509–517
Article CAS PubMed Google Scholar
Yan C, Terribilini M, Wu F, Jernigan R, Dobbs D, Honavar V (2006) Predicting DNA-binding sites of proteins from amino acid sequence. BMC Bioinform 7:262
Article Google Scholar
Yang L, Xia JF, Gui J (2010) Prediction of protein–protein interactions from protein sequence using local descriptors. Protein Pept Lett 17:1085–1090
Article CAS PubMed Google Scholar
Zhao H, Yang Y, Zhou Y (2010) Structure-based prediction of DNA-binding proteins by structural alignment and a volume-fraction corrected DFIRE-based energy function. Bioinformatics 26:1857–1863
Article PubMed Central CAS PubMed Google Scholar
Zhou W, Yan H (2011) Prediction of DNA-binding protein based on statistical and geometric features and support vector machines. Proteome Sci 9(Suppl 1):S1. doi:10.1186/1477-5956-9-S1-S1
Article PubMed Central PubMed Google Scholar

Download references

Acknowledgement

Authors thank Nitin Thukral for scientific discussions.

Conflict of interest

The authors declare no conflict of interest.

Human and Animal Rights and Informed Consent

This article does not contain any studies with human or animal subjects performed by any of the authors.

Author information

Authors and Affiliations

Department of Biotechnology, Delhi Technological University, Shahbad Daulatpur, Main Bawana Road, Delhi, 110042, India
Monika Samant & Yasha Hasija
Synthetic Biology and Biofuels, International Center for Genetic Engineering and Biotechnology, Aruna Asaf Ali Marg, Delhi, 110067, India
Minesh Jethva

Authors

Monika Samant
View author publications
You can also search for this author in PubMed Google Scholar
Minesh Jethva
View author publications
You can also search for this author in PubMed Google Scholar
Yasha Hasija
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Yasha Hasija.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Samant, M., Jethva, M. & Hasija, Y. INTERACT-O-FINDER: A Tool for Prediction of DNA-Binding Proteins Using Sequence Features. Int J Pept Res Ther 21, 189–193 (2015). https://doi.org/10.1007/s10989-014-9446-4

Download citation

Accepted: 02 December 2014
Published: 09 December 2014
Issue Date: June 2015
DOI: https://doi.org/10.1007/s10989-014-9446-4

Keywords

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

INTERACT-O-FINDER: A Tool for Prediction of DNA-Binding Proteins Using Sequence Features

Abstract

Access this article

Similar content being viewed by others

iDNAProt-ES: Identification of DNA-binding Proteins Using Evolutionary and Structural Features

An Empirical Investigation of Discretization Techniques on the Classification of Protein–Protein Interaction

Enhanced Prediction of DNA-Binding Proteins and Classes

References

Acknowledgement

Conflict of interest

Human and Animal Rights and Informed Consent

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Keywords

Navigation

INTERACT-O-FINDER: A Tool for Prediction of DNA-Binding Proteins Using Sequence Features

Abstract

Access this article

Similar content being viewed by others

iDNAProt-ES: Identification of DNA-binding Proteins Using Evolutionary and Structural Features

An Empirical Investigation of Discretization Techniques on the Classification of Protein–Protein Interaction

Enhanced Prediction of DNA-Binding Proteins and Classes

References

Acknowledgement

Conflict of interest

Human and Animal Rights and Informed Consent

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation