Bioinformatics: An Application in Information Science

  • Conference paper
First International Conference on Artificial Intelligence and Cognitive Computing

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 815)

Abstract

Bioinformatics is an interdisciplinary field that closely links computer science, mathematics, and molecular biology. The volume of biological information keeps growing rapidly. Molecular biologists specialize in solving bioinformatics problems such as storing, analyzing, and retrieving biological data by applying algorithms and techniques from computer science. This review is written from the computer science perspective. It begins with the fundamental terminology of bioinformatics and its definitions, which are essential for understanding the field in depth. It then describes the three main components of bioinformatics and its data types, which serve as the input formats for tools and software. Real-life bioinformatics databases, which are important for analyzing algorithms, are also discussed. We then present applications of bioinformatics in various areas. Because bioinformatics draws on many disciplines, it faces many research issues and challenges, of which the computational and biological ones are the most significant. Finally, because a tremendous amount of biological data is now being generated, we present emerging research trends for bioinformatics in big data, machine learning, and deep learning.

Corresponding author

Correspondence to Parth Goel.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Goel, P., Padole, M. (2019). Bioinformatics: An Application in Information Science. In: Bapi, R., Rao, K., Prasad, M. (eds) First International Conference on Artificial Intelligence and Cognitive Computing. Advances in Intelligent Systems and Computing, vol 815. Springer, Singapore. https://doi.org/10.1007/978-981-13-1580-0_22
