Abstract
The nodes of a correlation network correspond to the columns of a numeric data matrix datX. Based on the singular value decomposition (SVD) of datX, we characterize approximately factorizable correlation networks, i.e., adjacency matrices that factor into node-specific contributions. The SVD yields singular vectors that have important practical applications. For example, the first left singular vector (referred to as the eigenvector) explains the maximum amount of variation of the columns of datX. In the context of a gene co-expression network module, the eigenvector is also known as the module eigengene. Right singular vectors can be used for signal balancing, e.g., to remove batch effects and other technical artifacts. Based on the eigenvector, we define a new type of network concept, referred to as an eigenvector-based network concept. Eigenvector-based concepts are analogous to approximate conformity-based network concepts but have a major advantage: they often allow for a geometric interpretation based on the angular interpretation of correlations. The underlying structure of correlation networks affects network analysis results. For example, there are geometric reasons why intramodular hub nodes in important modules tend to be important, and why hub nodes in one module cannot be hubs in another distinct module. The hub node significance of a module can be interpreted in terms of the angle between a sample trait and the module eigengene. Since the intramodular connectivity kIM_i is highly related to the module membership measure kME_i, it can be interpreted in terms of the angle between x_i and the module eigenvector ME. A short dictionary for translating between the languages of data mining and network theory may facilitate communication between the two fields. Mouse and brain gene co-expression network applications are used to illustrate the results.
This work reviews and extends work with Jun Dong (Horvath and Dong PLoS Comput Biol 4(8):e1000117, 2008).
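The eigenvector and the angular reading of kME_i described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the chapter's implementation: the function name module_eigengene, the column standardization step, and the sign convention (aligning the eigenvector with the average standardized profile) are assumptions made for this example. It treats rows of datX as samples and columns as nodes, takes the first left singular vector as the eigenvector ME, and computes kME_i as the correlation between column x_i and ME.

```python
import numpy as np

def module_eigengene(datX):
    """Sketch: eigenvector ME, module membership kME, and variance explained.

    Assumes rows of datX are samples and columns are nodes (hypothetical helper).
    """
    X = np.asarray(datX, dtype=float)
    # Standardize each column to mean 0 and variance 1 (an assumption of this sketch).
    X = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    U, d, Vt = np.linalg.svd(X, full_matrices=False)
    ME = U[:, 0]  # first left singular vector, one entry per sample
    # Singular vectors are defined up to sign; align ME with the average profile.
    if np.corrcoef(ME, X.mean(axis=1))[0, 1] < 0:
        ME = -ME
    # kME_i = cor(x_i, ME); both vectors are centered, so this is the cosine
    # of the angle between x_i and ME.
    kME = np.array([np.corrcoef(X[:, i], ME)[0, 1] for i in range(X.shape[1])])
    prop_var = d[0] ** 2 / np.sum(d ** 2)  # share of variation explained by ME
    return ME, kME, prop_var

# Simulated module: 8 nodes sharing one signal across 50 samples, plus noise.
rng = np.random.default_rng(0)
signal = rng.normal(size=50)
datX = np.outer(signal, np.ones(8)) + 0.5 * rng.normal(size=(50, 8))

ME, kME, prop_var = module_eigengene(datX)
angles_deg = np.degrees(np.arccos(np.clip(kME, -1.0, 1.0)))
print(np.round(kME, 3), np.round(angles_deg, 1), round(prop_var, 3))
```

Under the standardization used here, the proportion of variance explained by ME equals the mean of the squared kME_i values, which gives a quick consistency check on the output.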
References
Dobra A, Hans C, Jones B, Nevins JR, Yao G, West M (2004) Sparse graphical models for exploring gene expression data. J Multivar Anal 90(1):196–212
Alter O, Brown PO, Botstein D (2000) Singular value decomposition for genome-wide expression data processing and modelling. Proc Natl Acad Sci USA 97:10101–10106
Carlson M, Zhang B, Fang Z, Mischel P, Horvath S, Nelson SF (2006) Gene connectivity, function, and sequence conservation: Predictions from modular yeast co-expression networks. BMC Genomics 7(7):40
Dong J, Horvath S (2007) Understanding network concepts in modules. BMC Syst Biol 1(1):24
Fuller TF, Ghazalpour A, Aten JE, Drake T, Lusis AJ, Horvath S (2007) Weighted gene coexpression network analysis strategies applied to mouse weight. Mamm Genome 18(6–7):463–472
Ghazalpour A, Doss S, Zhang B, Plaisier C, Wang S, Schadt EE, Thomas A, Drake TA, Lusis AJ, Horvath S (2006) Integrating genetics and network analysis to characterize genes related to mouse weight. PLoS Genet 2(2):8
Hibbs MA, Hess DC, Myers CL, Huttenhower C, Li K, Troyanskaya OG (2007) Exploring the functional landscape of gene expression: Directed search of large microarray compendia. Bioinformatics 23(20):2692–2699
Holter NS, Mitra M, Maritan A, Cieplak M, Banavar JR, Fedoroff NV (2000) Fundamental patterns underlying gene expression profiles: Simplicity from complexity. Proc Natl Acad Sci USA 97(15):8409–8414
Horvath S, Dong J (2008) Geometric interpretation of gene co-expression network analysis. PLoS Comput Biol 4(8):e1000117
Horvath S, Zhang B, Carlson M, Lu KV, Zhu S, Felciano RM, Laurance MF, Zhao W, Shu Q, Lee Y, Scheck AC, Liau LM, Wu H, Geschwind DH, Febbo PG, Kornblum HI, Cloughesy TF, Nelson SF, Mischel PS (2006) Analysis of oncogenic signaling networks in glioblastoma identifies ASPM as a novel molecular target. Proc Natl Acad Sci USA 103(46):17402–17407
Langfelder P, Horvath S (2007) Eigengene networks for studying the relationships between co-expression modules. BMC Syst Biol 1(1):54
Liao JC, Boscolo R, Yang YL, Tran LM, Sabatti C, Roychowdhury VP (2003) Network component analysis: Reconstruction of regulatory signals in biological systems. Proc Natl Acad Sci USA 100(26):15522–15527
Oldham MC, Konopka G, Iwamoto K, Langfelder P, Kato T, Horvath S, Geschwind DH (2008) Functional organization of the transcriptome in human brain. Nat Neurosci 11(11):1271–1282
Shen R, Ghosh D, Chinnaiyan A, Meng Z (2006) Eigengene-based linear discriminant model for tumor classification using gene expression microarray data. Bioinformatics 22(21):2635–2642
Spellman PT, Sherlock G, Zhang MQ, Iyer VR, Anders K, Eisen MB, Brown PO, Botstein D, Futcher B (1998) Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Mol Biol Cell 9(12):3273–3297
Tamayo P, Scanfeld D, Ebert BL, Gillette MA, Roberts CW, Mesirov JP (2007) Metagene projection for cross-platform, cross-species characterization of global transcriptional states. Proc Natl Acad Sci USA 104(14):5959–5964
West M, Blanchette C, Dressman H, Huang E, Ishida S, Spang R, Zuzan H, Olson JA, Marks JR, Nevins JR (2001) Predicting the clinical status of human breast cancer by using gene expression profiles. Proc Natl Acad Sci USA 98(20):11462–11467
© 2011 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Horvath, S. (2011). Geometric Interpretation of Correlation Networks Using the Singular Value Decomposition. In: Weighted Network Analysis. Springer, New York, NY. https://doi.org/10.1007/978-1-4419-8819-5_6
DOI: https://doi.org/10.1007/978-1-4419-8819-5_6
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4419-8818-8
Online ISBN: 978-1-4419-8819-5
eBook Packages: Biomedical and Life Sciences (R0)