The Structure and Dynamics of Linguistic Networks

Part of the Modeling and Simulation in Science, Engineering and Technology book series (MSSET)

Human beings are unique in the biological world: they are the only organisms known to be capable of thinking, communicating, and preserving a potentially infinite number of ideas, ideas that form the pillars of modern civilization. This unique ability is a consequence of the complex and powerful human languages, characterized by their recursive syntax and compositional semantics [40]. It has been argued that language is a dynamic complex adaptive system that has evolved through self-organization to serve the needs of human communication [80]. The complexity of human languages has long attracted the attention of physicists, who have tried to explain several linguistic phenomena through models of physical systems (see, e.g., [32, 42]).

Like any physical system, a linguistic system (i.e., a language) can be viewed from three different perspectives [52]. At one extreme, a language is the collection of utterances produced by the speakers of a linguistic community in the course of their interactions with one another. This is analogous to the microscopic view of a thermodynamic system, where every utterance and its context contributes to the identity of the language, i.e., the grammar. At the other extreme, a language can be characterized by a set of grammar rules and a vocabulary; this is analogous to a macroscopic view. Between these two extremes lies a mesoscopic view of language, in which linguistic entities such as letters, words, or phrases are the basic units, and the grammar is an emergent property of the interactions among them.
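To make the mesoscopic view concrete, the short sketch below treats words as nodes and sentence-level co-occurrence as edges, and then measures two properties commonly reported for linguistic networks: the degree distribution and the clustering coefficient. The sketch is illustrative only; the toy corpus and the use of Python with the networkx library are assumptions made for this example, not material from the chapter.

```python
# A minimal sketch of the mesoscopic view of language: words are the basic
# units (nodes) and an edge connects two words that co-occur in a sentence.
# The toy corpus and the networkx-based measurements are illustrative only.
import itertools
from collections import Counter

import networkx as nx

corpus = [
    "language is a complex adaptive system",
    "a language can be seen as a network of words",
    "words interact and the grammar emerges from these interactions",
]

G = nx.Graph()
for sentence in corpus:
    words = set(sentence.split())
    # link every pair of distinct words that share a sentence
    for u, v in itertools.combinations(sorted(words), 2):
        G.add_edge(u, v)

# degree distribution: how many nodes have each degree
degree_counts = Counter(deg for _, deg in G.degree())
print("degree distribution:", dict(sorted(degree_counts.items())))

# clustering coefficient: how densely a word's neighbours interconnect
print("average clustering:", round(nx.average_clustering(G), 3))
```

On real corpora, word co-occurrence networks of this kind have been reported to show heavy-tailed degree distributions and high clustering [18, 24], which is what motivates growth models based on preferential attachment [5].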


Keywords: Information Retrieval, Degree Distribution, Natural Language Processing, Cluster Coefficient, Preferential Attachment


1. A. E. Motter, A. P. S. de Moura, Y. C. Lai, and P. Dasgupta. Topology of the conceptual network of language. Physical Review E, 65(065102):1–4, 2002.
2. A. Agarwal, S. Chakrabarti, and S. Aggarwal. Learning to rank networked entities. In Proceedings of KDD, 2006.
3. A. Akmajian. Linguistics: An Introduction to Language and Communication. MIT Press, Cambridge, MA, 1995.
4. A. Albright and B. Hayes. Rules vs. analogy in English past tenses: A computational/experimental study. Cognition, 90:119–161, 2003.
5. A.-L. Barabási and R. Albert. Emergence of scaling in random networks. Science, 286:509–512, 1999.
6. C. Biemann. Chinese whispers – an efficient graph clustering algorithm and its application to natural language processing problems. In Proceedings of TextGraphs: the Second Workshop on Graph Based Methods for Natural Language Processing, pages 73–80, New York, NY, June 2006. Association for Computational Linguistics.
7. C. Biemann. Unsupervised part-of-speech tagging employing efficient graph clustering. In Proceedings of the COLING/ACL 2006 Student Research Workshop, pages 7–12, Sydney, Australia, July 2006. Association for Computational Linguistics.
8. C. Biemann, I. Matveeva, R. Mihalcea, and D. Radev, editors. Proceedings of the Second Workshop on TextGraphs: Graph-Based Algorithms for Natural Language Processing. Association for Computational Linguistics, Rochester, NY, 2007.
9. S. Brin and L. Page. The anatomy of a large-scale hypertextual Web search engine. Computer Networks and ISDN Systems, 30(1–7):107–117, 1998.
10. N. Chomsky. The Minimalist Program. MIT Press, Cambridge, MA, 1995.
11. M. Choudhury, M. Thomas, A. Mukherjee, A. Basu, and N. Ganguly. How difficult is it to develop a perfect spell-checker? A cross-linguistic analysis through complex network approach. In Proceedings of the Second Workshop on TextGraphs: Graph-Based Algorithms for Natural Language Processing, pages 81–88, Rochester, NY, 2007. Association for Computational Linguistics.
12. A. Clark. Inducing syntactic categories by context distribution clustering. In C. Cardie, W. Daelemans, C. Nédellec, and E. T. K. Sang, editors, Proceedings of the Fourth Conference on Computational Natural Language Learning and of the Second Learning Language in Logic Workshop, Lisbon, 2000, pages 91–94. Association for Computational Linguistics, Somerset, NJ, 2000.
13. A. M. Collins and M. R. Quillian. Retrieval time from semantic memory. Journal of Verbal Learning and Verbal Behavior, 8:240–247, 1969.
14. W. Croft. Typology and Universals. Cambridge University Press, Cambridge, 1990.
15. B. de Boer. Self-organisation in vowel systems. Journal of Phonetics, 28(4):441–465, 2000.
16. I. S. Dhillon, S. Mallela, and D. S. Modha. Information-theoretic co-clustering. In Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2003), pages 89–98, 2003.
17. W. B. Dolan, L. Vanderwende, and S. Richardson. Automatically deriving structured knowledge base from on-line dictionaries. In Proceedings of the Pacific Association for Computational Linguistics, 1993.
18. S. N. Dorogovtsev and J. F. F. Mendes. Language as an evolving word Web. Proceedings of the Royal Society of London B, 268(1485):2603–2606, 2001.
19. G. Erkan and D. Radev. LexRank: Graph-based lexical centrality as salience in text summarization. Journal of Artificial Intelligence Research, 22:457–479, 2004.
20. C. Fellbaum, editor. WordNet: An Electronic Lexical Database. MIT Press, Cambridge, MA, 1998.
21. R. Ferrer-i-Cancho. The structure of syntactic dependency networks: Insights from recent advances in network theory. In G. Altmann, V. Levickij, and V. Perebyinis, editors, Problems of Quantitative Linguistics, pages 60–75. Ruta, Chernivtsi, 2005.
22. R. Ferrer-i-Cancho. Why do syntactic links not cross? Europhysics Letters, 76:1228–1235, 2006.
23. R. Ferrer-i-Cancho, A. Capocci, and G. Caldarelli. Spectral methods cluster words of the same class in a syntactic dependency network. International Journal of Bifurcation and Chaos, 17(7), 2007.
24. R. Ferrer-i-Cancho and R. V. Solé. The small world of human language. Proceedings of the Royal Society of London B, 268(1482):2261–2265, 2001.
25. R. Ferrer-i-Cancho and R. V. Solé. Two regimes in the frequency of words and the origin of complex lexicons: Zipf's law revisited. Journal of Quantitative Linguistics, 8:165–173, 2001.
26. R. Ferrer-i-Cancho and R. V. Solé. Patterns in syntactic dependency networks. Physical Review E, 69(051915), 2004.
27. S. Finch and N. Chater. Bootstrapping syntactic categories using statistical methods. In Background and Experiments in Machine Learning of Natural Language: Proceedings of the 1st SHOE Workshop, pages 229–235. Katholieke Universiteit, Brabant, Holland, 1992.
28. D. Freitag. Toward unsupervised whole-corpus tagging. In COLING '04: Proceedings of the 20th International Conference on Computational Linguistics, page 357, Morristown, NJ, 2004. Association for Computational Linguistics.
29. M. Galley and K. McKeown. Improving word sense disambiguation in lexical chaining. In Proceedings of IJCAI, 2003.
30. M. Gamon. Graph-based text representation for novelty detection. In Proceedings of the Workshop on TextGraphs at HLT-NAACL, pages 17–24, 2006.
31. S. Gauch and R. Futrelle. Experiments in automatic word class and word sense identification for information retrieval. In Proceedings of the 3rd Annual Symposium on Document Analysis and Information Retrieval, pages 425–434, Las Vegas, NV, April 1994.
32. M. Gell-Mann. Language and complexity. In J. W. Minett and W. S.-Y. Wang, editors, Language Acquisition, Change and Emergence: Essays in Evolutionary Linguistics. City University of Hong Kong Press, July 2005.
33. D. Gibson, J. M. Kleinberg, and P. Raghavan. Inferring Web communities from link topology. In Proceedings of the Ninth ACM Conference on Hypertext and Hypermedia, pages 225–234, 1998.
34. A. B. Goldberg and J. Zhu. Seeing stars when there aren't many stars: Graph-based semi-supervised learning for sentiment categorization. In HLT-NAACL 2006 Workshop on TextGraphs: Graph-Based Algorithms for Natural Language Processing, 2006.
35. J. H. Greenberg and J. J. Jenkins. Studies in the psychological correlates of the sound system of American English. Word, 20:157–177, 1964.
36. T. M. Gruenenfelder and D. B. Pisoni. Modeling the mental lexicon as a complex system: Some preliminary results using graph theoretic measures. In Research on Spoken Language Processing Progress Report No. 27, pages 27–47, Indiana University, Bloomington, IN, 2005.
37. Z. Gyöngyi, H. Garcia-Molina, and J. Pedersen. Combating Web spam with TrustRank. In Proceedings of VLDB, pages 576–587, 2004.
38. A. D. Haghighi, A. Y. Ng, and C. D. Manning. Robust textual inference via graph matching. In HLT '05: Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing, pages 387–394, Morristown, NJ, 2005. Association for Computational Linguistics.
39. Z. S. Harris. Mathematical Structures of Language. Wiley, New York, 1968.
40. M. D. Hauser, N. Chomsky, and W. T. Fitch. The faculty of language: What is it, who has it, and how did it evolve? Science, 298:1569–1579, 2002.
41. R. Ferrer-i-Cancho, A. Mehler, O. Pustylnikov, and A. Díaz-Guilera. Correlations in the organization of large-scale syntactic dependency networks. In TextGraphs-2: Graph-Based Algorithms for Natural Language Processing, pages 65–72. Association for Computational Linguistics, 2007.
42. Y. Itoh and S. Ueda. The Ising model for changes in word ordering rules in natural languages. Physica D: Nonlinear Phenomena, 198(3–4):333–339, 2004.
43. J. Jannink and G. Wiederhold. Thesaurus entry extraction from an on-line dictionary. In Proceedings of Fusion, 1999.
44. B. Jedynak and D. Karakos. Unigram language models using diffusion smoothing over graphs. In Proceedings of the Second Workshop on TextGraphs: Graph-Based Algorithms for Natural Language Processing, pages 33–36, Rochester, NY, 2007. Association for Computational Linguistics.
45. V. Kapatsinski. Sound similarity relations in the mental lexicon: Modeling the lexicon as a complex network. Speech Research Lab Progress Report, Indiana University, Bloomington, IN, 2006.
46. V. Kapustin and A. Jamsen. Vertex degree distribution for the graph of word co-occurrences in Russian. In Proceedings of the Second Workshop on TextGraphs: Graph-Based Algorithms for Natural Language Processing, pages 89–92, Rochester, NY, 2007. Association for Computational Linguistics.
47. J. Ke, M. Ogura, and W. S.-Y. Wang. Optimization models of sound systems using genetic algorithms. Computational Linguistics, 29(1):1–18, 2003.
48. J. M. Kleinberg. Authoritative sources in a hyperlinked environment. Journal of the ACM, 46, 1999.
49. R. Kumar, J. Novak, P. Raghavan, and A. Tomkins. Structure and evolution of blogspace. Communications of the ACM, 47(12):35–39, 2004.
50. M. Lesk. Automatic sense disambiguation using machine readable dictionaries: How to tell a pine cone from an ice cream cone. In Proceedings of SIGDOC, 1986.
51. J. Liljencrants and B. Lindblom. Numerical simulation of vowel quality systems: The role of perceptual contrast. Language, 48:839–862, 1972.
52. H. Liljenstrom. Micro Meso Macro: Addressing Complex Systems Couplings. World Scientific Publishing, Singapore, 2005.
53. P. A. Luce and D. B. Pisoni. Recognizing spoken words: The neighborhood activation model. Ear and Hearing, 19:1–36, 1998.
54. I. Maddieson. Patterns of Sounds. Cambridge University Press, Cambridge, 1984.
55. W. Marslen-Wilson. Activation, competition, and frequency in lexical access. In G. T. M. Altmann, editor, Cognitive Models of Speech Processing: Psycholinguistic and Computational Perspectives, pages 148–173. MIT Press, Cambridge, MA, 1990.
56. R. McDonald, F. Pereira, K. Ribarov, and J. Hajič. Non-projective dependency parsing using spanning tree algorithms. In HLT '05: Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing, pages 523–530, Morristown, NJ, 2005. Association for Computational Linguistics.
57. R. Mihalcea. Graph-based ranking algorithms for large vocabulary word sense disambiguation. In Proceedings of HLT-EMNLP, 2005.
58. R. Mihalcea and D. Radev. Graph-based algorithms for information retrieval and natural language processing. Tutorial at HLT/NAACL 2006, 2006.
59. R. Mihalcea and D. Radev, editors. Proceedings of the Second Workshop on TextGraphs: Graph-Based Algorithms for Natural Language Processing. Association for Computational Linguistics, 2006.
60. R. Mihalcea and P. Tarau. TextRank: Bringing order into texts. In Proceedings of EMNLP, 2004.
61. R. Mihalcea, P. Tarau, and E. Figa. PageRank on semantic networks with applications to word sense disambiguation. In Proceedings of COLING, 2004.
62. G. A. Miller and W. G. Charles. Contextual correlates of semantic similarity. Language and Cognitive Processes, 6(1):1–28, 1991.
63. G. A. Miller and P. M. Gildea. How children learn words. Scientific American, 257(3):86–91, 1987.
64. A. Mukherjee, M. Choudhury, A. Basu, and N. Ganguly. Modeling the co-occurrence principles of the consonant inventories: A complex network approach. International Journal of Modern Physics C, 18(2):281–295, 2007.
65. A. Mukherjee, M. Choudhury, A. Basu, and N. Ganguly. Self-organization of sound inventories: Analysis and synthesis of the occurrence and co-occurrence networks of consonants. Journal of Quantitative Linguistics,
66. J. Nath, M. Choudhury, A. Mukherjee, C. Biemann, and N. Ganguly. Unsupervised parts-of-speech induction for Bengali. In Proceedings of the Sixth International Language Resources and Evaluation Conference (LREC), 2008.
67. D. Nettle. Using social impact theory to simulate language change. Lingua, 108:95–117, 1999.
68. H. G. Nusbaum, D. B. Pisoni, and C. K. Davis. Sizing up the Hoosier mental lexicon: Measuring the familiarity of 20,000 words. Research on Speech Perception Progress Report No. 10, pages 357–376, Indiana University, 1984.
69. B. Pang and L. Lee. A sentimental education: Sentiment analysis using subjectivity summarization based on minimum cuts. In Proceedings of the 42nd Meeting of the Association for Computational Linguistics (ACL '04), Main Volume, pages 271–278, Barcelona, Spain, July 2004.
70. S. Pinker. The Language Instinct: How the Mind Creates Language. HarperCollins, New York, 1994.
71. S. Pinker and A. Prince. On language and connectionism: Analysis of a parallel distributed processing model of language acquisition. Cognition, 28:195–247, 1988.
72. R. Rapp. A practical solution to the problem of automatic part-of-speech induction from text. In Conference Companion Volume of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL-05), Ann Arbor, MI, 2005.
73. M. Richardson, A. Prakash, and E. Brill. Beyond PageRank: Machine learning for static ranking. In Proceedings of WWW, pages 707–715, 2006.
74. H. Schütze. Part-of-speech induction from scratch. In Proceedings of the 31st Annual Meeting of the Association for Computational Linguistics, pages 251–258, Morristown, NJ, 1993. Association for Computational Linguistics.
75. H. Schütze. Distributional part-of-speech tagging. In Proceedings of the 7th Conference of the European Chapter of the Association for Computational Linguistics, pages 141–148, San Francisco, CA, 1995. Morgan Kaufmann Publishers Inc.
76. J.-L. Schwartz, L.-J. Boë, N. Vallée, and C. Abry. The dispersion-focalization theory of vowel systems. Journal of Phonetics, 25:255–286, 1997.
77. M. Sigman and G. A. Cecchi. Global organization of the Wordnet lexicon. Proceedings of the National Academy of Sciences, 99(3):1742–1747, 2002.
78. M. M. Soares, G. Corso, and L. S. Lucena. The network of syllables in Portuguese. Physica A: Statistical Mechanics and its Applications, 355(2–4):678–684, 2005.
79. Z. Solan, D. Horn, E. Ruppin, and S. Edelman. Unsupervised learning of natural languages. Proceedings of the National Academy of Sciences, 102(33):11629–11634, 2005.
80. L. Steels. Language as a complex adaptive system. In Proceedings of PPSN VI, pages 17–26, 2000.
81. D. Steriade. Knowledge of similarity and narrow lexical override. BLS, 29:583–598, 2004.
82. M. Steyvers and J. B. Tenenbaum. The large-scale structure of semantic networks: Statistical analyses and a model of semantic growth. Cognitive Science, 29(1):41–78, 2005.
83. M. Tamariz. Exploring the Adaptive Structure of the Mental Lexicon. Ph.D. thesis, Department of Theoretical and Applied Linguistics, University of Edinburgh, Scotland, 2005.
84. K. Toutanova, C. D. Manning, and A. Y. Ng. Learning random walk models for inducing word dependency distributions. In ICML '04: Proceedings of the Twenty-First International Conference on Machine Learning, page 103, New York, NY, 2004.
85. J. Véronis. HyperLex: Lexical cartography for information retrieval. Computer Speech and Language, 18(3):223–252, 2004.
86. M. S. Vitevitch. Phonological neighbors in a small world (network): What can graph theory tell us about the mental lexicon? Departmental Colloquy co-sponsored by the Linguistics and Psychology Departments, Rice University, January 27, 2006.
87. D. Widdows and B. Dorow. A graph model for unsupervised lexical acquisition. In Proceedings of COLING, 2002.

Copyright information

© Birkhäuser Boston, a part of Springer Science+Business Media, LLC 2009

Authors and Affiliations

  1. Microsoft Research India, Bangalore, India
  2. Department of Computer Science and Engineering, Indian Institute of Technology Kharagpur, India
