NO SQL Approach for Handling Bioinformatics Data Using MongoDB

  • Conference paper

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 813)

Abstract

The proliferation of genomic, diagnostic, medical, and other forms of biological data has led to biological data being categorized as big data. Low-cost sequencing machinery, now available even in small research labs, is generating large volumes of data that need to be mined for useful biological features and knowledge. In this paper, we use a NoSQL approach to handle the repeat information of the entire human genome. A total of 12 million repeats were extracted from the human genome and stored in MongoDB, a popular NoSQL database. A web application has been developed to query the database with ease. Bioinformaticians are evidently shifting their database development approach from the traditional relational model to newer approaches such as NoSQL in order to handle massive amounts of biological data.
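As a rough illustration of the approach summarized in the abstract, the sketch below shows how one repeat record might be represented as a MongoDB document and how a region query could be expressed. The abstract does not describe the paper's schema, so every field name (`chromosome`, `start`, `end`, `motif`, `copies`), the database and collection names, and the connection URI are assumptions made for this example. The `store_and_query` helper uses the PyMongo driver and needs a running MongoDB server; the two builder functions run without one.

```python
from typing import Any, Dict, List


def make_repeat_doc(chrom: str, start: int, end: int,
                    motif: str, copies: int) -> Dict[str, Any]:
    """Build one repeat record as a JSON-like document.

    All field names are hypothetical; the paper's schema is not given.
    """
    return {
        "chromosome": chrom,
        "start": start,
        "end": end,
        "motif": motif,
        "motif_length": len(motif),
        "copies": copies,
    }


def build_filter(chrom: str, region_start: int, region_end: int,
                 min_copies: int = 0) -> Dict[str, Any]:
    """Build a MongoDB filter for repeats lying entirely inside a region."""
    return {
        "chromosome": chrom,
        "start": {"$gte": region_start},
        "end": {"$lte": region_end},
        "copies": {"$gte": min_copies},
    }


def store_and_query(docs: List[Dict[str, Any]], query: Dict[str, Any],
                    uri: str = "mongodb://localhost:27017") -> List[Dict[str, Any]]:
    """Insert documents and run the filter (requires PyMongo and a server).

    The database name, collection name, and URI are assumptions.
    """
    from pymongo import MongoClient  # imported lazily so the builders work without it

    client = MongoClient(uri)
    coll = client["repeats_demo"]["human_repeats"]
    coll.insert_many(docs)  # docs must be a non-empty list
    # A compound index keeps region lookups from scanning every document.
    coll.create_index([("chromosome", 1), ("start", 1)])
    return list(coll.find(query))


if __name__ == "__main__":
    doc = make_repeat_doc("chr1", 10500, 10519, "CA", 10)
    flt = build_filter("chr1", 10000, 20000, min_copies=5)
    print(doc)
    print(flt)
```

Because each repeat is a self-describing document, extra attributes (for example an imperfection percentage) could be added to some records without a schema migration, which is one common motivation for choosing a document store over a relational table.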

Acknowledgements

The authors would like to thank Ms. Kranthi Chennamsetti, Centre for Bioinformatics Research and SRKR, for her help in the extraction of microsatellites from the human genome. This work is supported by SERB, Department of Science and Technology (DST), India (Grant ID: ECR/2016/000346).

Author information

Corresponding author

Correspondence to Suresh B. Mudunuri.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Chigurupati, S., Vegesna, K., Siva Rama Krishna Boddu, L.V., Nookala, G.K.M., Mudunuri, S.B. (2019). NO SQL Approach for Handling Bioinformatics Data Using MongoDB. In: Abraham, A., Dutta, P., Mandal, J., Bhattacharya, A., Dutta, S. (eds) Emerging Technologies in Data Mining and Information Security. Advances in Intelligent Systems and Computing, vol 813. Springer, Singapore. https://doi.org/10.1007/978-981-13-1498-8_25
