Abstract
How can we “scale down” an n-node network G to a smaller network \(G'\), with \(k \ll n\) nodes, so that \(G'\) (approximately) maintains the important structural properties of G? There is a voluminous literature on many versions of this problem if k is given in advance, but one’s tolerance for approximation (and the resulting value of k) will vary. Here, then, we formulate a “rescalable” version of this approximation task for complex networks. Specifically, we propose a node ordering version of graph summarization: permute the nodes of G so that the subgraph induced by the first k nodes is a good size-k approximation of G, averaged over the full range of possible sizes k. We consider as a case study the phonological network of English words, and discover two natural word orders (word frequency and age of acquisition) that do a surprisingly good job of rescalably summarizing the lexicon.
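To make the node-ordering formulation concrete, the following is a minimal sketch of how one might score a candidate ordering. It is not the authors' evaluation procedure: the choice of mean degree as the structural property, the absolute-difference error, and the names `mean_degree` and `rescalable_error` are all illustrative assumptions. The sketch takes an undirected graph as an adjacency dictionary, measures the property on the subgraph induced by the first k nodes of the ordering, and averages the deviation from the full graph over every prefix size k.

```python
def induced_edge_count(adj, nodes):
    """Count the edges of the subgraph induced by `nodes` (undirected graph)."""
    node_set = set(nodes)
    # Each undirected edge is seen once from each endpoint, so halve the total.
    return sum(1 for u in node_set for v in adj[u] if v in node_set) // 2


def mean_degree(adj, nodes):
    """Mean degree of the induced subgraph (assumed property, for illustration)."""
    n = len(nodes)
    return 2 * induced_edge_count(adj, nodes) / n if n else 0.0


def rescalable_error(adj, order, prop=mean_degree):
    """Average, over all prefix sizes k, of |prop(first k nodes) - prop(G)|.

    Hypothetical scoring function: lower values mean the ordering's prefixes
    track the full graph's property more closely across all scales.
    """
    target = prop(adj, order)
    errors = [abs(prop(adj, order[:k]) - target) for k in range(1, len(order) + 1)]
    return sum(errors) / len(errors)


# Toy example: the path graph a - b - c - d.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
contiguous = ["a", "b", "c", "d"]   # each prefix is a connected path
scattered = ["a", "c", "b", "d"]    # the first two nodes share no edge

print(rescalable_error(adj, contiguous))
print(rescalable_error(adj, scattered))
```

On this toy path graph the contiguous ordering scores lower than the scattered one, because its short prefixes already contain edges. Any other structural property with the same `(adj, nodes)` signature could be passed as `prop`.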
This work grew out of portions of a research project that was carried out by the authors of this work in collaboration with Aman Panda and Duo Tao. We gratefully acknowledge their contributions. This work was supported in part by Carleton College. Comments are welcome.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Brown, V. et al. (2019). Node Ordering for Rescalable Network Summarization (or, the Apparent Magic of Word Frequency and Age of Acquisition in the Lexicon). In: Aiello, L., Cherifi, C., Cherifi, H., Lambiotte, R., Lió, P., Rocha, L. (eds) Complex Networks and Their Applications VII. COMPLEX NETWORKS 2018. Studies in Computational Intelligence, vol 812. Springer, Cham. https://doi.org/10.1007/978-3-030-05411-3_6
DOI: https://doi.org/10.1007/978-3-030-05411-3_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-05410-6
Online ISBN: 978-3-030-05411-3
eBook Packages: Intelligent Technologies and Robotics (R0)