Gene Presence and Absence in Genomic Big Data for Precision Medicine

Adhil, Mohamood; Agarwal, Mahima; Ghosh, Krittika; Sule, Manas; Talukder, Asoke K.

doi:10.1007/978-981-10-7245-1_22

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 673)

Abstract

Twenty-first-century precision medicine aims to use a systems-oriented approach to find the root cause of disease specific to an individual by including molecular pathology tests. The challenges of genomic data analysis for precision medicine are manifold: the data are big, high-dimensional, and often multimodally distributed. Advanced investigations use techniques such as Next Generation Sequencing (NGS), which rely on complex statistical methods to yield useful insights. Analysis of exome and transcriptome data allows in-depth study of the 22 thousand human genes, many of which relate to phenotype and disease state. Not all genes are expressed in all tissues. In a disease state, some genes are even deleted from the genome. Therefore, as part of knowledge discovery, exome and transcriptome big data need to be analyzed to determine whether a gene is actually absent (deleted or not expressed) or present. In this paper, we present a statistical technique to identify the genes that are present or absent in exome or transcriptome data (big data) to improve the accuracy of precision medicine.
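
The abstract does not say which statistical technique the chapter uses. Purely as an illustration of how a present/absent call can be made on multimodal expression data, the sketch below fits a two-component Gaussian mixture with expectation-maximization to synthetic log-scale expression values and labels a gene "present" when it more probably belongs to the higher-mean component. The function names, the synthetic data, and the 0.5 posterior cut-off are all assumptions made for this example, not the authors' method.

```python
import numpy as np


def _weighted_densities(x, w, mu, var):
    """Weighted Gaussian density of each value under each component."""
    diff = x[:, None] - mu
    return w * np.exp(-0.5 * diff ** 2 / var) / np.sqrt(2.0 * np.pi * var)


def fit_two_component_gmm(x, n_iter=200, tol=1e-8):
    """Fit a 1-D two-component Gaussian mixture by EM (illustrative only)."""
    x = np.asarray(x, dtype=float)
    # Start the two means at the lower and upper quartiles of the data.
    mu = np.percentile(x, [25, 75]).astype(float)
    var = np.full(2, x.var() + 1e-6)
    w = np.array([0.5, 0.5])
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each value.
        dens = _weighted_densities(x, w, mu, var)
        total = dens.sum(axis=1, keepdims=True) + 1e-300
        resp = dens / total
        # M-step: re-estimate weights, means and variances.
        nk = resp.sum(axis=0) + 1e-12
        w = nk / x.size
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-6
        # Stop once the log-likelihood no longer changes appreciably.
        ll = float(np.log(total).sum())
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    return w, mu, var


def call_presence(x, w, mu, var, cutoff=0.5):
    """True where a value more likely belongs to the higher-mean component."""
    x = np.asarray(x, dtype=float)
    dens = _weighted_densities(x, w, mu, var)
    resp = dens / (dens.sum(axis=1, keepdims=True) + 1e-300)
    return resp[:, int(np.argmax(mu))] > cutoff


# Synthetic data (assumption): 300 "absent" genes near background level
# and 700 "present" genes with clearly higher log-scale expression.
rng = np.random.default_rng(0)
absent = rng.normal(0.2, 0.15, 300)
expressed = rng.normal(5.0, 1.0, 700)
values = np.concatenate([absent, expressed])

w, mu, var = fit_two_component_gmm(values)
present = call_presence(values, w, mu, var)
print("genes called present:", int(present.sum()), "of", present.size)
```

A mixture model is only one way to separate a background mode from an expressed mode. On real data, the number of components, the transformation applied to the counts, and the cut-off would all need to be justified against the data.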

Author information

Authors and Affiliations

Interpretomics, 5th Floor, Shezan Lavelle, 15 Walton Road, Bangalore, 560001, India
Mohamood Adhil, Mahima Agarwal, Krittika Ghosh, Manas Sule & Asoke K. Talukder

Corresponding author

Correspondence to Asoke K. Talukder.

Editor information

Editors and Affiliations

Department of Computer Software, University of Aizu, Aizuwakamatsu, Fukushima, Japan
Subhash Bhalla
Department of Electronics and Communication Engineering, Shri Ramswaroop Memorial Group of Professional Colleges, Lucknow, Uttar Pradesh, India
Vikrant Bhateja
Department of Information Technology, MIT College of Engineering, Pune, Maharashtra, India
Anjali A. Chandavale
Department of Information Technology, MIT College of Engineering, Pune, Maharashtra, India
Anil S. Hiwale
Department of Computer Science and Engineering, Anil Neerukonda Institute of Technology and Sciences, Visakhapatnam, Andhra Pradesh, India
Suresh Chandra Satapathy

Cite this paper

Adhil, M., Agarwal, M., Ghosh, K., Sule, M., Talukder, A.K. (2018). Gene Presence and Absence in Genomic Big Data for Precision Medicine. In: Bhalla, S., Bhateja, V., Chandavale, A., Hiwale, A., Satapathy, S. (eds) Intelligent Computing and Information and Communication. Advances in Intelligent Systems and Computing, vol 673. Springer, Singapore. https://doi.org/10.1007/978-981-10-7245-1_22

DOI: https://doi.org/10.1007/978-981-10-7245-1_22
Published: 20 January 2018
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-7244-4
Online ISBN: 978-981-10-7245-1
eBook Packages: Engineering, Engineering (R0)
