The Structure and Dynamics of Linguistic Networks

Human beings are unique in the biological world: they are the only organisms known to be capable of thinking, communicating and preserving a potentially infinite number of ideas, which form the pillars of modern civilization. This ability is a consequence of complex and powerful human languages, characterized by recursive syntax and compositional semantics [40]. It has been argued that language is a dynamic complex adaptive system that has evolved through self-organization to serve human communication needs [80]. The complexity of human languages has long attracted the attention of physicists, who have tried to explain several linguistic phenomena through models of physical systems (see, e.g., [32, 42]).

Like any physical system, a linguistic system (i.e., a language) can be viewed from three different perspectives [52]. At one extreme, a language is a collection of utterances produced by the speakers of a linguistic community in the course of their interactions with one another. This is analogous to the microscopic view of a thermodynamic system, where every utterance and its corresponding context contribute to the identity of the language, i.e., the grammar. At the other extreme, a language can be characterized by a set of grammar rules and a vocabulary. This is analogous to the macroscopic view. Between these two extremes lies a mesoscopic view of language, in which linguistic entities such as letters, words or phrases are the basic units and the grammar is an emergent property of the interactions among them.
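
As a rough illustration of the mesoscopic view, the sketch below treats words as nodes and links two words whenever they occur in the same utterance. The toy utterances, the function names and the choice of utterance-level co-occurrence as the interaction are all assumptions made for this example, not a construction taken from the chapter.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_network(utterances):
    """Build an undirected, weighted word co-occurrence network.

    Nodes are words. The weight of an edge counts the utterances in
    which both words appear (an assumed, simplified notion of
    "interaction" between linguistic units).
    """
    weights = defaultdict(int)
    for utterance in utterances:
        # Sort the distinct words so each pair has one canonical key.
        words = sorted(set(utterance.lower().split()))
        for u, v in combinations(words, 2):
            weights[(u, v)] += 1
    return dict(weights)


def degrees(network):
    """Return the number of distinct neighbours of each word."""
    neighbours = defaultdict(set)
    for u, v in network:
        neighbours[u].add(v)
        neighbours[v].add(u)
    return {word: len(ns) for word, ns in neighbours.items()}


# Hypothetical toy corpus: the utterances are the microscopic data.
utterances = ["the dog barks", "the cat sleeps", "the dog sleeps"]
network = cooccurrence_network(utterances)
print(degrees(network))
```

Even in this tiny example the function word "the" ends up with the most neighbours, which hints at how structural regularities might be read off the network rather than stated as explicit rules.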

Notes

  1. While it is true that syntactic dependencies have a tendency to avoid crossing, there are systematic exceptions to that generalization in languages with relatively free constituent order. In German, for example, about one-third of all relative clauses are extraposed, thus creating crossing dependencies.

References

  1. A. E. Motter, A. P. S. de Moura, Y. C. Lai, and P. Dasgupta. Topology of the conceptual network of language. Physical Review E, 65(065102):1–4, 2002.

  2. A. Agarwal, S. Chakrabarti, and S. Aggarwal. Learning to rank networked entities. In Proceedings of KDD, 2006.

  3. A. Akmajian. Linguistics. An introduction to Language and Communication. MIT Press, Cambridge, MA, 1995.

  4. A. Albright and B. Hayes. Rules vs. analogy in English past tenses: A computational/experimental study. Cognition, 90:119–161, 2003.

  5. A.-L. Barabási and R. Albert. Emergence of scaling in random networks. Science, 286:509–512, 1999.

  6. C. Biemann. Chinese whispers - an efficient graph clustering algorithm and its application to natural language processing problems. In Proceedings of TextGraphs: the Second Workshop on Graph Based Methods for Natural Language Processing, pages 73–80, New York, NY, June 2006. Association for Computational Linguistics.

  7. C. Biemann. Unsupervised part-of-speech tagging employing efficient graph clustering. In Proceedings of the COLING/ACL 2006 Student Research Workshop, pages 7–12, Sydney, Australia, July 2006. Association for Computational Linguistics.

  8. C. Biemann, I. Matveeva, R. Mihalcea, and D. Radev, editors. Proceedings of the Second Workshop on TextGraphs: Graph-Based Algorithms for Natural Language Processing. Association for Computational Linguistics, Rochester, NY, 2007.

  9. S. Brin and L. Page. The anatomy of a large-scale hypertextual Web search engine. CNIS, 30(1–7):107–117, 1998.

  10. N. Chomsky. The Minimalist Program. MIT Press, Cambridge, MA, 1995.

  11. M. Choudhury, M. Thomas, A. Mukherjee, A. Basu, and N. Ganguly. How difficult is it to develop a perfect spell-checker? A cross-linguistic analysis through complex network approach. In Proceedings of the Second Workshop on TextGraphs: Graph-Based Algorithms for Natural Language Processing, pages 81–88, Rochester, NY, 2007. Association for Computational Linguistics.

  12. A. Clark. Inducing syntactic categories by context distribution clustering. In C. Cardie, W. Daelemans, C. Nédellec, and E. T. K. Sang, editors, Proceedings of the Fourth Conference on Computational Natural Language Learning and of the Second Learning Language in Logic Workshop, Lisbon, 2000, pages 91–94. Association for Computational Linguistics, Somerset, NJ, 2000.

  13. A. M. Collins and M. R. Quillian. Retrieval time from semantic memory. Journal of Verbal Learning and Verbal Behavior, 8:240–247, 1969.

  14. W. Croft. Typology and Universals. Cambridge University Press, Cambridge, 1990.

  15. B. de Boer. Self-organisation in vowel systems. Journal of Phonetics, 28(4): 441–465, 2000.

  16. I. S. Dhillon, S. Mallela, and D. S. Modha. Information-theoretic co-clustering. In Proceedings of The Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2003), pages 89–98, 2003.

  17. W. B. Dolan, L. Vanderwende, and S. Richardson. Automatically deriving structured knowledge base from on-line dictionaries. In Proceedings of the Pacific Association for Computational Linguistics, 1993.

  18. S. N. Dorogovtsev and J. F. F. Mendes. Language as an evolving word Web. Proceedings of the Royal Society of London B, 268(1485):2603–2606, December 22, 2001.

  19. G. Erkan and D. Radev. LexRank: Graph-based lexical centrality as salience in text summarization. JAIR, 22:457–479, December 4, 2004.

  20. C. Fellbaum. WordNet, an Electronic Lexical Database for English. MIT Press, Cambridge, MA, 1998.

  21. R. Ferrer-i-Cancho. The structure of syntactic dependency networks: Insights from recent advances in network theory. In G. Altmann, V. Levickij, and V. Perebyinis, editors, Problems of Quantitative Linguistics, pages 60–75. Ruta, Chernivtsi, 2005.

  22. R. Ferrer-i-Cancho. Why do syntactic links not cross? Europhysics Letters, 76:1228–1235, 2006.

  23. R. Ferrer-i-Cancho, A. Capocci, and G. Caldarelli. Spectral methods cluster words of the same class in a syntactic dependency network. International Journal of Bifurcation and Chaos, 17(7), 2007.

  24. R. Ferrer-i-Cancho and R. V. Solé. The small world of human language. Proceedings of The Royal Society of London. Series B, Biological Sciences, 268(1482):2261–2265, November 2001.

  25. R. Ferrer-i-Cancho and R. V. Solé. Two regimes in the frequency of words and the origin of complex lexicons: Zipf's law revisited. Journal of Quantitative Linguistics, 8:165–173, 2001.

  26. R. Ferrer-i-Cancho and R. V. Solé. Patterns in syntactic dependency networks. Physical Review E, 69(051915), 2004.

  27. S. Finch and N. Chater. Bootstrapping syntactic categories using statistical methods. In Background and Experiments in Machine Learning of Natural Language: Proceedings of the 1st SHOE Workshop, pages 229–235. Katholieke Universiteit, Brabant, Holland, 1992.

  28. D. Freitag. Toward unsupervised whole-corpus tagging. In COLING '04: Proceedings of the 20th International Conference on Computational Linguistics, page 357, Morristown, NJ, 2004. Association for Computational Linguistics.

  29. M. Galley and K. McKeown. Improving word sense disambiguation in lexical chaining. In Proceedings of IJCAI, 2003.

  30. M. Gamon. Graph-based text representation for novelty detection. In Proceedings of the Workshop on TextGraphs at HLT-NAACL, pages 17–24, 2006.

  31. S. Gauch and R. Futrelle. Experiments in Automatic Word Class and Word Sense Identification for Information Retrieval. In Proceedings of the 3rd Annual Symposium on Document Analysis and Information Retrieval, pages 425–434, Las Vegas, NV, April 1994.

  32. M. Gell-Mann. Language and complexity. In J. W. Minett and W. S.-Y. Wang, editors, Language Acquisition, Change and Emergence: Essays in Evolutionary Linguistics. City University of Hong Kong Press, July 2005.

  33. D. Gibson, J. M. Kleinberg, and P. Raghavan. Inferring Web communities from link topology. In Proceedings of the Ninth ACM Conference on Hypertext and Hypermedia, pages 225–234, 1998.

  34. A. B. Goldberg and J. Zhu. Seeing stars when there aren’t many stars: Graph-based semi-supervised learning for sentiment categorization. In HLT-NAACL 2006 Workshop on Textgraphs: Graph-based Algorithms for Natural Language Processing, 2006.

  35. J. H. Greenberg and J. J. Jenkins. Studies in the psychological correlates of the sound system of American English. Word, 20:157–177, 1964.

  36. T. M. Gruenenfelder and D. B. Pisoni. Modeling the mental lexicon as a complex system: Some preliminary results using graph theoretic measures. In Research on Spoken Language Processing Progress Report No. 27, Bloomington, Indiana University, 27–47, 2005.

  37. Z. Gyöngyi, H. Garcia-Molina, and J. Pedersen. Combating Web spam with TrustRank. In Proceedings of VLDB, pages 576–587, 2004.

  38. A. D. Haghighi, A. Y. Ng, and C. D. Manning. Robust textual inference via graph matching. In HLT '05: Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing, pages 387–394, Morristown, NJ, 2005. Association for Computational Linguistics.

  39. Z. S. Harris. Mathematical Structures of Language. Wiley, New York, 1968.

  40. M. D. Hauser, N. Chomsky, and W. T. Fitch. The faculty of language: What is it, who has it, and how did it evolve? Science, 298:1569–1579, 2002.

  41. R. Ferrer-i-Cancho, A. Mehler, O. Pustylnikov, and A. Díaz-Guilera. Correlations in the organization of large-scale syntactic dependency networks. In TextGraphs-2: Graph-Based Algorithms for Natural Language Processing, pages 65–72. Association for Computational Linguistics, 2007.

  42. Y. Itoh and S. Ueda. The Ising model for changes in word ordering rules in natural languages. Physica D: Nonlinear Phenomena, 198(3–4):333–339, 2004.

  43. J. Jannink and G. Wiederhold. Thesaurus entry extraction from an on-line dictionary. In Proceedings of Fusion, 1999.

  44. B. Jedynak and D. Karakos. Unigram language models using diffusion smoothing over graphs. In Proceedings of the Second Workshop on TextGraphs: Graph-Based Algorithms for Natural Language Processing, pages 33–36, Rochester, NY, 2007. Association for Computational Linguistics.

  45. V. Kapatsinski. Sound similarity relations in the mental lexicon: Modeling the lexicon as a complex network. Speech Research Lab Progress Report, Indiana University, Bloomington, IN, 2006.

  46. V. Kapustin and A. Jamsen. Vertex degree distribution for the graph of word co-occurrences in Russian. In Proceedings of the Second Workshop on TextGraphs: Graph-Based Algorithms for Natural Language Processing, pages 89–92, Rochester, NY, 2007. Association for Computational Linguistics.

  47. J. Ke, M. Ogura, and W. S.-Y. Wang. Optimization models of sound systems using genetic algorithms. Computational Linguistics, 29(1):1–18, 2003.

  48. J. M. Kleinberg. Authoritative sources in a hyperlinked environment. Journal of ACM, 46, 1999.

  49. R. Kumar, J. Novak, P. Raghavan, and A. Tomkins. Structure and evolution of blogspace. Communications of the ACM, 47(12):35–39, 2004.

  50. M. Lesk. Automatic sense disambiguation using machine readable dictionaries: How to tell a pine cone from an ice cream cone. In Proceedings of SIGDOC, 1986.

  51. J. Liljencrants and B. Lindblom. Numerical simulation of vowel quality systems: the role of perceptual contrast. Language, 48:839–862, 1972.

  52. H. Liljenstrom. Micro Meso Macro: Addressing Complex Systems Couplings. World Scientific Publishing, Singapore, 2005.

  53. P. A. Luce and D. B. Pisoni. Recognizing spoken words: The neighborhood activation model. Ear and Hearing, 19:1–36, 1998.

  54. I. Maddieson. Patterns of Sounds. Cambridge University Press, Cambridge, 1984.

  55. W. Marslen-Wilson. Activation, competition, and frequency in lexical access. In: G. T. M. Altmann (ed.), Cognitive Models of Speech Processing: Psycholinguistic and Computational Perspectives, MIT Press, Cambridge, MA, pages 148–173, 1990.

  56. R. McDonald, F. Pereira, K. Ribarov, and J. Hajič. Non-projective dependency parsing using spanning tree algorithms. In HLT '05: Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing, pages 523–530, Morristown, NJ, 2005. Association for Computational Linguistics.

  57. R. Mihalcea. Graph-based ranking algorithms for large vocabulary word sense disambiguation. In Proceedings of HLT-EMNLP, 2005.

  58. R. Mihalcea and D. Radev. Graph-based algorithms for information retrieval and natural language processing. Tutorial at HLT/NAACL 2006, 2006.

  59. R. Mihalcea and D. Radev, editors. Proceedings of the Second Workshop on TextGraphs: Graph-Based Algorithms for Natural Language Processing. Association for Computational Linguistics, 2006.

  60. R. Mihalcea and P. Tarau. TextRank: Bringing order into texts. In Proceedings of EMNLP, 2004.

  61. R. Mihalcea, P. Tarau, and E. Figa. PageRank on semantic networks with applications to word sense disambiguation. In Proceedings of COLING, 2004.

  62. G. A. Miller and W. G. Charles. Contextual correlates of semantic similarity. Language and Cognitive Processes, 6(1):1–28, 1991.

  63. G. A. Miller and P. M. Gildea. How children learn words. Scientific American, 257(3):86–91, 1987.

  64. A. Mukherjee, M. Choudhury, A. Basu, and N. Ganguly. Modeling the co-occurrence principles of the consonant inventories: A complex network approach. International Journal of Modern Physics C, 18(2):281–295, 2007.

  65. A. Mukherjee, M. Choudhury, A. Basu, and N. Ganguly. Self-organization of sound inventories: Analysis and synthesis of the occurrence and co-occurrence networks of consonants. Journal of Quantitative Linguistics, http://arXiv.org/physics/0610120.

  66. J. Nath, M. Choudhury, A. Mukherjee, C. Biemann, and N. Ganguly. Unsupervised parts-of-speech induction for Bengali. In Proceedings of the Sixth International Language Resources and Evaluation Conference (LREC), 2008.

  67. D. Nettle. Using social impact theory to simulate language change. Lingua, 108: 95–117, 1999.

  68. H. G. Nusbaum, D. B. Pisoni, and C. K. Davis. Sizing up the Hoosier mental lexicon: Measuring the familiarity of 20,000 words, Indiana University. Research on Speech Perception Progress Report No. 10, pages 357–376, 1984.

  69. B. Pang and L. Lee. A sentimental education: Sentiment analysis using subjectivity summarization based on minimum cuts. In Proceedings of the 42nd Meeting of the Association for Computational Linguistics (ACL '04), Main Volume, pages 271–278, Barcelona, Spain, July 2004.

  70. S. Pinker. The Language Instinct: How the Mind Creates Language. HarperCollins, New York, 1994.

  71. S. Pinker and A. Prince. On language and connectionism: Analysis of a parallel distributed processing model of language acquisition. Cognition, 28:195–247, 1988.

  72. R. Rapp. A practical solution to the problem of automatic part-of-speech induction from text. In Conference Companion Volume of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL-05), Ann Arbor, MI, 2005.

  73. M. Richardson, A. Prakash, and E. Brill. Beyond PageRank: Machine learning for static ranking. In Proceedings of WWW, pages 707–715, 2006.

  74. H. Schütze. Part-of-speech induction from scratch. In Proceedings of the 31st Annual Meeting on Association for Computational Linguistics, pages 251–258, Morristown, NJ, 1993. Association for Computational Linguistics.

  75. H. Schütze. Distributional part-of-speech tagging. In Proceedings of the 7th Conference on European Chapter of the Association for Computational Linguistics, pages 141–148, San Francisco, CA, 1995. Morgan Kaufmann Publishers Inc.

  76. J.-L. Schwartz, L.-J. Boë, N. Vallée, and C. Abry. The dispersion-focalization theory of vowel systems. Journal of Phonetics, 25:255–286, 1997.

  77. M. Sigman and G. A. Cecchi. Global organization of the WordNet lexicon. Proceedings of the National Academy of Sciences, 99(3):1742–1747, 2002.

  78. M. M. Soares, G. Corso, and L. S. Lucena. The network of syllables in Portuguese. Physica A: Statistical Mechanics and its Applications, 355(2–4): 678–684, 2005.

  79. Z. Solan, D. Horn, E. Ruppin, and S. Edelman. Unsupervised learning of natural languages. Proceedings of the National Academy of Sciences, 102(33):11629–11634, 2005.

  80. L. Steels. Language as a complex adaptive system. In Proceedings of PPSN VI, pages 17–26, 2000.

  81. D. Steriade. Knowledge of similarity and narrow lexical override. BLS, 29: 583–598, 2004.

  82. M. Steyvers and J. B. Tenenbaum. The large-scale structure of semantic networks: Statistical analyses and a model of semantic growth. Cognitive Science, 29(1): 41–78, 2005.

  83. M. Tamariz. Exploring the Adaptive Structure of the Mental Lexicon. Ph.D. thesis, Department of Theoretical and Applied Linguistics, University of Edinburgh, Scotland, 2005.

  84. K. Toutanova, C. D. Manning, and A. Y. Ng. Learning random walk models for inducing word dependency distributions. In ICML '04: Proceedings of the Twenty-First International Conference on Machine Learning, page 103, New York, NY, 2004.

  85. J. Véronis. HyperLex: Lexical cartography for information retrieval. Computer Speech and Language, 18(3):223–252, 2004.

  86. M. S. Vitevitch. Phonological neighbors in a small world (network): What can graph theory tell us about the mental lexicon? Departmental Colloquy co-sponsored by the Linguistics and Psychology Departments, Rice University, January 27, 2006.

  87. D. Widdows and B. Dorow. A graph model for unsupervised lexical acquisition. In Proceedings of COLING, 2002.

Author information

Correspondence to Monojit Choudhury or Animesh Mukherjee.

Copyright information

© 2009 Birkhäuser Boston, a part of Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Choudhury, M., Mukherjee, A. (2009). The Structure and Dynamics of Linguistic Networks. In: Ganguly, N., Deutsch, A., Mukherjee, A. (eds) Dynamics On and Of Complex Networks. Modeling and Simulation in Science, Engineering and Technology. Birkhäuser Boston. https://doi.org/10.1007/978-0-8176-4751-3_9

  • DOI: https://doi.org/10.1007/978-0-8176-4751-3_9

  • Publisher Name: Birkhäuser Boston

  • Print ISBN: 978-0-8176-4750-6

  • Online ISBN: 978-0-8176-4751-3

  • eBook Packages: Computer Science (R0)
