Gene Set Approaches and Prognostic Subgroup Prediction

Kim, Ju Han

doi:10.1007/978-981-13-1942-6_8

Part of the book series: Learning Materials in Biosciences (LMB)

Abstract

In this chapter, we implement Gene Set Enrichment Analysis (GSEA) to analyze microarray data. We perform Kaplan-Meier survival analysis for the clusters obtained by microarray data clustering analysis and test whether the difference in prognosis between clusters is statistically significant. The chapter also explains how biological interpretation relates to GO and pathway analysis of the clustered genes and how to interpret the clustered genes with GSEA.
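
As a rough illustration of the Kaplan-Meier step mentioned above, the following is a minimal sketch of the product-limit estimator in plain Python. It is not the chapter's own code; the function name `kaplan_meier` and the toy follow-up data are hypothetical, and the significance test between clusters (for example a log-rank test) is not included.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  : follow-up time for each patient
    events : 1 if the event (e.g. death) was observed, 0 if censored
    Returns a list of (time, survival probability) pairs, one per
    distinct time at which at least one event occurred.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")

    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)  # events + censorings at each time

    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving time t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        # Everyone who died or was censored at t leaves the risk set.
        at_risk -= leaving[t]
    return curve


if __name__ == "__main__":
    # Hypothetical toy data for two clusters (time in months, event flag).
    cluster_a = ([5, 8, 12, 20, 24], [1, 1, 0, 1, 0])
    cluster_b = ([10, 15, 22, 30, 36], [0, 1, 0, 0, 1])
    for name, (t, e) in {"A": cluster_a, "B": cluster_b}.items():
        print(name, kaplan_meier(t, e))
```

Censored patients reduce the number at risk without lowering the curve, which is what distinguishes this estimate from a simple fraction of survivors.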

Notes

1. Theoretically, once the number of genes is fixed, the number of possible gene sets is determined by combinatorics (i.e., “Stirling numbers”). However, this number is so large that testing every possible set is computationally infeasible. In addition, not all combinations are biologically meaningful. In practice, therefore, only the small fraction of possible gene sets that are meaningful is tested.
2. ASCII (American Standard Code for Information Interchange) is a 7-bit character code for representing English text in a computer. It has 128 codes. Codes 0–31 (and 127) are control codes, originally used to control peripherals such as printers; codes 32–47 are the space and punctuation symbols; 48–57 are the digits; 65–90 are the uppercase letters; and 97–122 are the lowercase letters.
3. Excel’s auto-formatting function can introduce an “auto-error” when a gene name is entered in an Excel file, for example by converting a gene symbol into a date (Zeeberg et al. 2004).
4. The line-ending convention (CR/LF) is OS-dependent: Windows uses CR followed by LF, whereas Unix-like systems such as Linux and macOS use LF alone.
5. Do not use a hyphen “-” in the file name. Because of some Java libraries, such a file name is not recognized in the GSEA input window.
6. When a file is saved as tab-delimited text, Excel warns that the file may contain features that the tab-delimited format does not support. Select “Yes” to save anyway.
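
The combinatorial growth described in Note 1 can be made concrete with a small sketch. It assumes that “Stirling numbers” refers to Stirling numbers of the second kind, S(n, k), which count the ways to partition n genes into k non-empty sets; the note itself does not say which kind is meant. The function names `stirling2` and `bell` are illustrative only.

```python
def stirling2(n: int, k: int) -> int:
    """Stirling number of the second kind S(n, k).

    Counts the partitions of n labelled items into k non-empty,
    unlabelled subsets, using the recurrence
        S(n, k) = k * S(n - 1, k) + S(n - 1, k - 1).
    """
    if k < 0 or k > n:
        return 0
    row = [1] + [0] * k  # row[j] holds S(i, j); start with i = 0
    for i in range(1, n + 1):
        # Go right to left so row[j - 1] still holds S(i - 1, j - 1).
        for j in range(min(i, k), 0, -1):
            row[j] = j * row[j] + row[j - 1]
        row[0] = 0  # S(i, 0) = 0 for i >= 1
    return row[k]


def bell(n: int) -> int:
    """Total number of partitions of n items (sum of S(n, k) over k)."""
    return sum(stirling2(n, k) for k in range(n + 1))


if __name__ == "__main__":
    # Even modest gene counts give astronomically many partitions.
    for n in (5, 10, 20, 50):
        print(n, bell(n))
```

For example, `stirling2(4, 2)` is 7 and `bell(5)` is 52; the totals grow faster than exponentially in n, which is why only curated, biologically motivated gene sets are tested in practice.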

Bibliography

Cancer Genome Atlas Research Network (2011) Integrated genomic analyses of ovarian carcinoma. Nature 474(7353):609–615. https://doi.org/10.1038/nature10166
Chiaretti S et al (2004) Gene expression profile of adult T-cell acute lymphocytic leukemia identifies distinct subsets of patients with different response to therapy and survival. Blood 103(7):2771–2778. Epub 2003 Dec 18
Jacobsen A (2017) cgdsr: R-Based API for Accessing the MSKCC Cancer Genomics Data Server (CGDS). R package version 1.2.6. https://CRAN.R-project.org/package=cgdsr
Liberzon A et al (2011) Molecular signatures database (MSigDB) 3.0. Bioinformatics 27(12):1739–1740. https://doi.org/10.1093/bioinformatics/btr260. Epub 2011 May 5
Mootha VK et al (2003) PGC-1alpha-responsive genes involved in oxidative phosphorylation are coordinately downregulated in human diabetes. Nat Genet 34(3):267–273
Subramanian A et al (2005) Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A 102(43):15545–15550
Therneau T (2015) A package for survival analysis in S. Version 2.38. https://CRAN.R-project.org/package=survival
Zeeberg BR et al (2004) Mistaken identifiers: gene name errors can be introduced inadvertently when using excel in bioinformatics. BMC Bioinformatics 5:80

Author information

Authors and Affiliations

Division of Biomedical Informatics, Seoul National University College of Medicine, Seoul, South Korea
Ju Han Kim

1 Electronic Supplementary Material

(ZIP 3318 kb)

About this chapter

Cite this chapter

Kim, J.H. (2019). Gene Set Approaches and Prognostic Subgroup Prediction. In: Genome Data Analysis. Learning Materials in Biosciences. Springer, Singapore. https://doi.org/10.1007/978-981-13-1942-6_8

DOI: https://doi.org/10.1007/978-981-13-1942-6_8
Published: 01 May 2019
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-1941-9
Online ISBN: 978-981-13-1942-6
eBook Packages: Biomedical and Life Sciences, Biomedical and Life Sciences (R0)
