NO SQL Approach for Handling Bioinformatics Data Using MongoDB

Chigurupati, Swaroop; Vegesna, Kiran; Siva Rama Krishna Boddu, L. V.; Nookala, Gopala Krishna Murthy; Mudunuri, Suresh B.

doi:10.1007/978-981-13-1498-8_25


Conference paper
First Online: 02 September 2018

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 813)

Abstract

The proliferation of genomic, diagnostic, medical, and other forms of biological data has led to biological data being categorized as big data. Low-cost sequencing machinery, now available even in small research labs, is generating large volumes of data that need to be mined for useful biological features and knowledge. In this paper, we use a NoSQL approach to handle the repeat information of the entire human genome. A total of 12 million repeats were extracted from the human genome and stored in MongoDB, a popular NoSQL database. A web application was developed to query the database with ease. Bioinformaticians are evidently shifting their database development approach from the traditional relational model to newer approaches such as NoSQL in order to handle massive amounts of biological data.
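As a rough illustration of what storing and querying repeat records in a document database might look like, the following minimal sketch builds one repeat document and one query filter as plain Python dictionaries. The field names (`chromosome`, `start`, `end`, `motif`, `motif_length`, `tract_length`) and both helper functions are hypothetical and chosen only for illustration; the schema actually used in the paper is not given here.

```python
# Hypothetical document layout for one microsatellite repeat.
# All field names are illustrative assumptions, not the paper's schema.

def make_repeat_doc(chromosome, start, end, motif):
    """Build one schemaless record describing a tandem repeat."""
    return {
        "chromosome": chromosome,
        "start": start,                   # 1-based start coordinate
        "end": end,                       # 1-based end coordinate (inclusive)
        "motif": motif,                   # repeating unit, e.g. "CA"
        "motif_length": len(motif),
        "tract_length": end - start + 1,  # total length of the repeat tract
    }


def repeats_filter(chromosome, motif_length, min_tract_length):
    """Build a MongoDB-style query filter using the $gte operator."""
    return {
        "chromosome": chromosome,
        "motif_length": motif_length,
        "tract_length": {"$gte": min_tract_length},
    }


doc = make_repeat_doc("chr1", 10001, 10024, "CA")
query = repeats_filter("chr1", 2, 20)
```

With a running MongoDB server and the PyMongo driver, dictionaries of this shape could be passed to `collection.insert_one(doc)` and `collection.find(query)`; the sketch stops short of connecting so that it runs without a database.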

Acknowledgements

The authors would like to thank Ms. Kranthi Chennamsetti, Centre for Bioinformatics Research and SRKR, for her help in the extraction of microsatellites from the human genome. This work is supported by SERB, Department of Science and Technology (DST), India (Grant ID: ECR/2016/000346).

Author information

Authors and Affiliations

Sagi Ramakrishnam Raju Engineering College, Bhimavaram, 534204, India
Swaroop Chigurupati, Kiran Vegesna, L. V. Siva Rama Krishna Boddu, Gopala Krishna Murthy Nookala & Suresh B. Mudunuri

Corresponding author

Correspondence to Suresh B. Mudunuri.

Editor information

Editors and Affiliations

Machine Intelligence Research Labs, Auburn, WA, USA
Ajith Abraham
Department of Computer and System Sciences, Visva-Bharati University, Santiniketan, West Bengal, India
Paramartha Dutta
Department of Computer Science and Engineering, University of Kalyani, Kalyani, India
Jyotsna Kumar Mandal
Institute of Engineering and Management, Kolkata, West Bengal, India
Abhishek Bhattacharya
Institute of Engineering and Management, Kolkata, West Bengal, India
Soumi Dutta

Cite this paper

Chigurupati, S., Vegesna, K., Siva Rama Krishna Boddu, L.V., Nookala, G.K.M., Mudunuri, S.B. (2019). NO SQL Approach for Handling Bioinformatics Data Using MongoDB. In: Abraham, A., Dutta, P., Mandal, J., Bhattacharya, A., Dutta, S. (eds) Emerging Technologies in Data Mining and Information Security. Advances in Intelligent Systems and Computing, vol 813. Springer, Singapore. https://doi.org/10.1007/978-981-13-1498-8_25

DOI: https://doi.org/10.1007/978-981-13-1498-8_25
Published: 02 September 2018
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-1497-1
Online ISBN: 978-981-13-1498-8
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)
