Node Ordering for Rescalable Network Summarization (or, the Apparent Magic of Word Frequency and Age of Acquisition in the Lexicon)

  • Violet Brown
  • Xi Chen
  • Maryam Hedayati
  • Camden Sikes
  • Julia Strand
  • Tegan Wilson
  • David Liben-Nowell
Conference paper
Part of the Studies in Computational Intelligence book series (SCI, volume 812)


How can we “scale down” an n-node network G to a smaller network \(G'\), with \(k \ll n\) nodes, so that \(G'\) (approximately) maintains the important structural properties of G? There is a voluminous literature on many versions of this problem if k is given in advance, but one’s tolerance for approximation (and the resulting value of k) will vary. Here, then, we formulate a “rescalable” version of this approximation task for complex networks. Specifically, we propose a node ordering version of graph summarization: permute the nodes of G so that the subgraph induced by the first k nodes is a good size-k approximation of G, averaged over the full range of possible sizes k. We consider as a case study the phonological network of English words, and discover two natural word orders (word frequency and age of acquisition) that do a surprisingly good job of rescalably summarizing the lexicon.
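The node-ordering formulation can be made concrete with a minimal sketch. It scores a candidate ordering by comparing a structural property of each prefix-induced subgraph against the same property of the full network, then averaging the error over all sizes k. The property (edge density), the error measure (absolute difference), the unweighted average, and the names `induced_subgraph`, `density`, and `ordering_score` are all illustrative assumptions, not the measures used in the paper.

```python
from typing import Callable, Dict, Hashable, Sequence, Set

# Undirected graph as adjacency sets (an assumed representation).
Graph = Dict[Hashable, Set[Hashable]]


def induced_subgraph(graph: Graph, nodes: Sequence[Hashable]) -> Graph:
    """Return the subgraph induced by `nodes`."""
    keep = set(nodes)
    return {u: graph[u] & keep for u in keep}


def density(graph: Graph) -> float:
    """Edge density: edges present divided by edges possible.

    Hypothetical stand-in for an "important structural property".
    """
    n = len(graph)
    if n < 2:
        return 0.0
    m = sum(len(nbrs) for nbrs in graph.values()) / 2  # each edge counted twice
    return m / (n * (n - 1) / 2)


def ordering_score(
    graph: Graph,
    order: Sequence[Hashable],
    prop: Callable[[Graph], float] = density,
) -> float:
    """Mean absolute error of `prop` over every prefix size k = 1..n.

    Lower is better under this (assumed) error measure.
    """
    target = prop(graph)
    errors = [
        abs(prop(induced_subgraph(graph, order[:k])) - target)
        for k in range(1, len(order) + 1)
    ]
    return sum(errors) / len(errors)


# Toy network: a triangle a-b-c with a pendant node d attached to c.
toy: Graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}

score_core_first = ordering_score(toy, ["a", "b", "c", "d"])
score_pendant_first = ordering_score(toy, ["d", "a", "b", "c"])
print(score_core_first, score_pendant_first)
```

On this toy network, the ordering that starts with the triangle scores lower (better) than the one that starts with the pendant node. Recomputing the property from scratch for every prefix is only practical for small graphs; a larger network such as a lexicon would call for incremental updates as each node is added.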


Network summarization, Node ordering, Phonological networks



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Violet Brown (1)
  • Xi Chen (2)
  • Maryam Hedayati (1, 2)
  • Camden Sikes (2)
  • Julia Strand (1)
  • Tegan Wilson (2)
  • David Liben-Nowell (2)
  1. Department of Psychology, Carleton College, Northfield, USA
  2. Department of Computer Science, Carleton College, Northfield, USA
