Differential Expression and Functional Analysis of High-Throughput -Omics Data Using Open Source Tools

  • Protocol
Oral Biology

Part of the book series: Methods in Molecular Biology ((MIMB,volume 1537))

Abstract

Today, –omics analyses, including the systematic cataloging of messenger RNA and microRNA sequences or DNA methylation patterns in a cell population, organ, or tissue sample, allow for an unbiased, comprehensive genome-level analysis of complex diseases, a substantial advantage over earlier “candidate” gene or pathway analyses. A primary goal in analyzing these high-throughput assays is to detect those features, among several thousand, that differ between groups of samples. In the context of oral biology, our group has successfully used –omics technology to identify key molecules and pathways in different diagnostic entities of periodontal disease.

A major issue when inferring biological information from high-throughput –omics studies is that the sheer volume of high-dimensional data generated by contemporary technology cannot be analyzed appropriately with the common statistical methods employed in the biomedical sciences.
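One concrete reason is multiple testing: if 20,000 features are each tested at a significance level of 0.05 and none truly differ, about 1,000 are still expected to pass by chance. The sketch below illustrates one widely used correction, the Benjamini-Hochberg false discovery rate adjustment, in plain Python. It is an illustration under stated assumptions rather than the chapter's own code; the function name `benjamini_hochberg` and the example p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for four features (illustrative numbers only).
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)

# Features whose adjusted p-value stays at or below a 5% false discovery rate.
significant = [i for i, q in enumerate(adjusted) if q <= 0.05]
```

In a real analysis the raw p-values would come from a per-feature statistical test, and the adjustment would be applied across all tested features at once.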

In this chapter, we outline a robust and well-accepted bioinformatics workflow, built on open-source tools, for the initial analysis of –omics data generated by microarrays or next-generation sequencing. Starting with quality control measures and the preprocessing steps required for data originating from different –omics technologies, we then outline a differential expression analysis pipeline that can be used for data from both microarray and sequencing experiments and that can account for random or fixed effects. Finally, we present an overview of the options for functional analysis of the resulting data.

Acknowledgments

This work was supported by grants from the German Society for Periodontology (DG PARO) and the German Society for Oral and Maxillo-Facial Sciences (DGZMK) to M.K. and by grants from NIH/NIDCR (DE015649, DE021820 and DE024735) and by an unrestricted gift from Colgate-Palmolive Inc. to author P.N.P.

Author information

Corresponding author

Correspondence to Moritz Kebschull.

Copyright information

© 2017 Springer Science+Business Media LLC

About this protocol

Cite this protocol

Kebschull, M., Fittler, M.J., Demmer, R.T., Papapanou, P.N. (2017). Differential Expression and Functional Analysis of High-Throughput -Omics Data Using Open Source Tools. In: Seymour, G., Cullinan, M., Heng, N. (eds) Oral Biology. Methods in Molecular Biology, vol 1537. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-6685-1_19

  • DOI: https://doi.org/10.1007/978-1-4939-6685-1_19

  • Publisher Name: Humana Press, New York, NY

  • Print ISBN: 978-1-4939-6683-7

  • Online ISBN: 978-1-4939-6685-1

  • eBook Packages: Springer Protocols