Using RNA Sequencing to Characterize the Tumor Microenvironment

  • C. C. Smith
  • L. M. Bixby
  • K. L. Miller
  • S. R. Selitsky
  • D. S. Bortone
  • K. A. Hoadley
  • B. G. Vincent
  • J. S. Serody
Part of the Methods in Molecular Biology book series (MIMB, volume 2055)


RNA sequencing (RNA-seq) is an integral tool in immunogenomics, allowing for interrogation of the transcriptome of a tumor and its microenvironment. Analytical methods to deconstruct the genomics data can then be applied to infer gene expression patterns associated with the presence of various immunocyte populations. High-quality RNA-seq is possible from formalin-fixed, paraffin-embedded (FFPE), fresh-frozen, and fresh tissue, with a wide variety of sequencing library preparation methods, sequencing platforms, and downstream bioinformatics analyses currently available. Selection of an appropriate library preparation method is largely determined by tissue type, RNA quality, and RNA quantity. Downstream of sequencing, many analyses can be applied to the data, including differential gene expression analysis, immune gene signature analysis, gene pathway analysis, T/B-cell receptor inference, HLA inference, and viral transcript quantification. In this chapter, we describe our workflow for RNA-seq from bulk tissue to evaluable data, including extraction of RNA, library preparation methods, sequencing of libraries, alignment and quality assurance of data, and initial downstream analyses of RNA-seq data to extract relevant immunogenomics features. Systems biology methods that draw additional insights by integrating these features are covered further in Chapters 28–30.

Key words

RNA-seq, Immunogenomics, RNA extraction, Library preparation, Next-generation sequencing, Alignment, Quantification



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2020

Authors and Affiliations

  • C. C. Smith (1, 2)
  • L. M. Bixby (2)
  • K. L. Miller (2)
  • S. R. Selitsky (3)
  • D. S. Bortone (3)
  • K. A. Hoadley (2, 4)
  • B. G. Vincent (1, 3, 5, 6)
  • J. S. Serody (1, 2, 5; corresponding author)

  1. Department of Microbiology and Immunology, UNC School of Medicine, Chapel Hill, USA
  2. Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, USA
  3. Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, USA
  4. Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, USA
  5. Division of Hematology/Oncology, Department of Medicine, Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, USA
  6. Curriculum in Bioinformatics and Computational Biology, University of North Carolina at Chapel Hill, Chapel Hill, USA
