Abstract
We explore the use of a network meta-modeling approach to compare the effects of similarity metrics used to construct biological networks on the topology of the resulting networks. This work reviews various similarity metrics for the construction of networks and various topology measures for the characterization of resulting network topology, demonstrating the use of these metrics in the construction and comparison of phylogenomic and transcriptomic networks.
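As an illustration of the comparison described above, the following is a minimal sketch and not the authors' implementation. It builds two thresholded networks from the same data, one using Pearson and one using Spearman correlation, and compares a single topology measure (edge density). The toy profiles, the threshold of 0.9, and all function names are assumptions chosen for illustration only.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def build_network(profiles, metric, threshold):
    """Edge set of node pairs whose absolute similarity meets the threshold."""
    return {
        (u, v)
        for u, v in combinations(sorted(profiles), 2)
        if abs(metric(profiles[u], profiles[v])) >= threshold
    }


def density(edges, n_nodes):
    """Fraction of possible undirected edges that are present."""
    return 2 * len(edges) / (n_nodes * (n_nodes - 1))


# Hypothetical toy profiles (e.g. expression of four genes in five samples).
profiles = {
    "g1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "g2": [2.0, 4.0, 6.0, 8.0, 10.0],     # linear in g1
    "g3": [1.0, 4.0, 9.0, 16.0, 100.0],   # monotone but nonlinear in g1
    "g4": [5.0, 1.0, 4.0, 2.0, 3.0],      # weakly related to the others
}

pearson_net = build_network(profiles, pearson, threshold=0.9)
spearman_net = build_network(profiles, spearman, threshold=0.9)

print("Pearson edges: ", sorted(pearson_net), density(pearson_net, len(profiles)))
print("Spearman edges:", sorted(spearman_net), density(spearman_net, len(profiles)))
```

In this toy case the rank-based metric links g3 to g1 and g2 because the relationship is monotone, whereas the linear metric does not at the chosen threshold, so the two networks differ in density even though they are built from identical data.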
This work was supported by the Plant-Microbe Interfaces Scientific Focus Area (http://pmi.ornl.gov) in the Genomic Science Program, the Office of Biological and Environmental Research (BER) in the U.S. Department of Energy Office of Science, and the BER's BioEnergy Science Center (BESC) at the Oak Ridge National Laboratory (contract DE-PS02-06ER64304). Oak Ridge National Laboratory is managed by UT-Battelle, LLC, for the U.S. Department of Energy under contract DE-AC05-00OR22725.
This manuscript has been authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the chapter for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes. The Department of Energy provides public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).
The authors would also like to acknowledge the Centre for High Performance Computing and the Stellenbosch High Performance Computing Cluster for computing resources, and the South African National Research Foundation (www.nrf.ac.za) Technology and Human Resources Programme and Winetech. The financial assistance of the National Research Foundation (NRF) toward this research is hereby acknowledged. Opinions expressed and conclusions reached are those of the author and are not necessarily to be attributed to the NRF. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
References
Barabasi AL, Oltvai ZN (2004) Network biology: understanding the cell’s functional organization. Nat Rev Genet 5(2):101–113
Pearson K (1895) Note on regression and inheritance in the case of two parents. Proc R Soc Lond 58(347–352):240–242
Rodgers JL, Nicewander WA (1988) Thirteen ways to look at the correlation coefficient. Am Stat 42(1):59–66
Spearman C (1904) The proof and measurement of association between two things. Am J Psychol 15(1):72–101
Pinto da Costa J, Soares C (2005) A weighted rank measure of correlation. Aust N Z J Stat 47(4):515–529
Jaccard P (1912) The distribution of the flora in the alpine zone. 1. New Phytol 11(2):37–50
Lipkus AH (1999) A proof of the triangle inequality for the Tanimoto distance. J Math Chem 26(1–3):263–265
Hamers L, Hemeryck Y, Herweyers G, Janssen M, Keters H, Rousseau R, Vanhoutte A (1989) Similarity measures in scientometric research: the Jaccard index versus Salton’s cosine formula. Inform Process Manage 25(3):315–318
Sørensen T (1948) A method of establishing groups of equal amplitude in plant sociology based on similarity of species and its application to analyses of the vegetation on Danish commons. Biologiske Skrifter 5:1–34
Dice LR (1945) Measures of the amount of ecologic association between species. Ecology 26(3):297–302
Yoshioka PM (2008) Misidentification of the Bray-Curtis similarity index. Mar Ecol Prog Ser 368:309–310
Bray JR, Curtis JT (1957) An ordination of the upland forest communities of southern Wisconsin. Ecol Monogr 27(4):325–349
Lance G, Williams W (1966) Computer programs for hierarchical polythetic classification (“similarity analyses”). Comput J 9(1):60–64
Schubert A (2013) Measuring the similarity between the reference and citation distributions of journals. Scientometrics 96(1):305–313
Schubert A, Telcs A (2014) A note on the Jaccardized Czekanowski similarity index. Scientometrics 98(2):1397–1399
Reshef DN, Reshef YA, Finucane HK, Grossman SR, McVean G, Turnbaugh PJ, Lander ES, Mitzenmacher M, Sabeti PC. Detecting novel associations in large data sets - supplementary material. http://www.sciencemag.org/content/334/6062/1518/suppl/DC1. Accessed Feb 2013
Reshef DN, Reshef YA, Finucane HK, Grossman SR, McVean G, Turnbaugh PJ, Lander ES, Mitzenmacher M, Sabeti PC (2011) Detecting novel associations in large data sets. Science 334(6062):1518–1524
Horvath S, Dong J (2008) Geometric interpretation of gene coexpression network analysis. PLoS Comput Biol 4(8):e1000117
Zhang B, Horvath S et al (2005) A general framework for weighted gene co-expression network analysis. Stat Appl Genet Mol Biol 4(1):5144–6115
Ravasz E, Somera AL, Mongru DA, Oltvai ZN, Barabási AL (2002) Hierarchical organization of modularity in metabolic networks. Science 297(5586):1551–1555
Watts DJ, Strogatz SH (1998) Collective dynamics of ‘small-world’ networks. Nature 393(6684):440–442
Reijneveld JC, Ponten SC, Berendse HW, Stam CJ (2007) The application of graph theoretical analysis to complex networks in the brain. Clin Neurophysiol 118(11):2317–2331
Latora V, Marchiori M (2001) Efficient behavior of small-world networks. Phys Rev Lett 87(19):198701
Snijders TA (1981) The degree variance: an index of graph heterogeneity. Soc Networks 3(3):163–174
Dong J, Horvath S (2007) Understanding network concepts in modules. BMC Syst Biol 1:24
Freeman LC (1979) Centrality in social networks: conceptual clarification. Soc Networks 1(3):215–239
Meilă M (2005) Comparing clusterings: an axiomatic view. In: Proceedings of the 22nd international conference on machine learning, ACM, pp 577–584
Van Dongen S (2000) Graph clustering by flow simulation. Ph.D. thesis, University of Utrecht
Van Dongen S (2008) Graph clustering via a discrete uncoupling process. SIAM J Matrix Anal Appl 30(1):121–141
Wagner S, Wagner D (2007) Comparing clusterings: an overview. Universität Karlsruhe, Fakultät für Informatik
Meilă M (2007) Comparing clusterings - an information based distance. J Multivar Anal 98(5):873–895
Berlingerio M, Koutra D, Eliassi-Rad T, Faloutsos C. A scalable approach to size-independent network similarity. Available: http://arxiv.org/pdf/1209.2684.pdf
Bloom SA (1981) Similarity indices in community studies: potential pitfalls. Mar Ecol Prog Ser 5(2):125–128
Qlucore (2008) http://www.qlucore.com/. Accessed 14 Feb 2013
Shannon P, Markiel A, Ozier O, Baliga N, Wang J, Ramage D, Amin N, Schwikowski B, Ideker T (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13(11):2498–2504
Irizarry RA, Hobbs B, Collin F, Beazer-Barclay YD, Antonellis KJ, Scherf U, Speed TP (2003) Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics 4(2):249–264
Li L, Stoeckert C, Roos D (2003) OrthoMCL: identification of ortholog groups for eukaryotic genomes. Genome Res 13(9):2178–2189
Enright A, Van Dongen S, Ouzounis C (2002) An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res 30(7):1575–1578
Setati ME, Jacobson D, Andong UC, Bauer F (2012) The vineyard yeast microbiome, a mixed model microbial map. PLoS One 7(12):e52609
Federhen S (2012) The NCBI taxonomy database. Nucleic Acids Res 40(D1):D136–D143
Weighill DA (2014) Exploring the topology of complex phylogenomic and transcriptomic networks. Master’s thesis, Stellenbosch University
Authors’ Contributions and Acknowledgments
D. Weighill and D. Jacobson conceived of and designed the methods, D. Weighill wrote the code and created the networks, D. Weighill and D. Jacobson discussed and interpreted the networks, D. Weighill drafted the manuscript, and D. Jacobson critically revised and edited the manuscript.
The research reported in this chapter was performed at Stellenbosch University, South Africa as part of a Master's thesis [41], and subsequent editing for publication in this book was performed at Oak Ridge National Laboratory and University of Tennessee, Knoxville.
Competing Interests
The authors declare that they have no competing financial interests.
Copyright information
© 2016 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Weighill, D.A., Jacobson, D. (2016). Network Metamodeling: Effect of Correlation Metric Choice on Phylogenomic and Transcriptomic Network Topology. In: Nookaew, I. (eds) Network Biology. Advances in Biochemical Engineering/Biotechnology, vol 160. Springer, Cham. https://doi.org/10.1007/10_2016_46
DOI: https://doi.org/10.1007/10_2016_46
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-56459-3
Online ISBN: 978-3-319-56460-9
eBook Packages: Chemistry and Materials Science, Chemistry and Material Science (R0)