Allele-Specific Expression Analysis in Cancer Using Next-Generation Sequencing Data

  • Alessandro Romanel
Part of the Methods in Molecular Biology book series (MIMB, volume 1878)


Allele-specific expression arises when transcriptional activity at the different alleles of a gene differs considerably. Although extensive research has been carried out to detect and characterize this phenomenon, the landscape of allele-specific expression in cancer is still poorly understood. In this chapter, we describe a fast and reliable analysis pipeline to study allele-specific expression in cancer using next-generation sequencing data. The pipeline provides a gene-level analysis approach that exploits paired germline DNA and tumor RNA sequencing data and benefits from parallel computation resources when available.
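The gene-level idea in the abstract can be illustrated with a minimal sketch. It is not the chapter's pipeline or any published tool's algorithm. It assumes that heterozygous SNPs have already been called from germline DNA and that reference and alternative read counts at those SNPs have already been extracted from tumor RNA. The function names, the input layout, and the thresholds (`alpha`, `min_depth`, `min_fraction`) are hypothetical choices made for illustration only.

```python
from collections import defaultdict
from math import comb


def binom_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value.

    Sums the probability of every outcome that is no more likely than
    the observed count k out of n trials.
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = pmf[k]
    # Small relative tolerance guards against floating-point ties.
    return min(1.0, sum(q for q in pmf if q <= observed * (1 + 1e-9)))


def gene_level_ase(snps, alpha=0.05, min_depth=10, min_fraction=0.5):
    """Call allele-specific expression per gene (illustrative rule only).

    snps: iterable of dicts with keys 'gene', 'ref', 'alt', where 'ref'
    and 'alt' are tumor RNA read counts at a SNP that is heterozygous
    in the germline DNA (hypothetical input layout).
    A gene is called ASE when at least min_fraction of its informative
    SNPs deviate significantly from the expected 50/50 allelic ratio.
    """
    significant = defaultdict(int)
    informative = defaultdict(int)
    for snp in snps:
        depth = snp["ref"] + snp["alt"]
        if depth < min_depth:
            continue  # too few RNA reads to test this SNP
        informative[snp["gene"]] += 1
        if binom_two_sided_p(snp["ref"], depth) < alpha:
            significant[snp["gene"]] += 1
    return {
        gene: significant[gene] / count >= min_fraction
        for gene, count in informative.items()
    }


# Made-up counts, for demonstration only.
example = [
    {"gene": "GENE_A", "ref": 30, "alt": 2},
    {"gene": "GENE_A", "ref": 25, "alt": 3},
    {"gene": "GENE_B", "ref": 12, "alt": 11},
    {"gene": "GENE_B", "ref": 9, "alt": 10},
    {"gene": "GENE_C", "ref": 3, "alt": 1},  # below min_depth, skipped
]
print(gene_level_ase(example))
```

A real analysis would also need to account for factors this sketch ignores, such as read-mapping bias toward the reference allele, copy number changes, tumor purity, and correction for multiple testing.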

Key words

Allele-specific features; Genome analysis; Parallel computation; Transcriptome analysis; Next-generation sequencing; SNPs



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. Centre for Integrative Biology (CIBIO), University of Trento, Trento, Italy
