Learning Lexical Properties from Word Usage Patterns: Which Context Words Should be Used?

  • Conference paper

Part of the book series: Perspectives in Neural Computing (PERSPECT.NEURAL)

Abstract

Several recent papers have described how lexical properties of words can be captured by simple measurements of which other words tend to occur close to them. At a practical level, word co-occurrence statistics are used to generate high-dimensional vector space representations, and appropriate distance metrics are defined on those spaces. The resulting co-occurrence vectors have been used to account for phenomena ranging from semantic priming to vocabulary acquisition. We have developed a simple and highly efficient system for computing useful word co-occurrence statistics, along with a number of criteria for optimizing and validating the resulting representations. Other workers have advocated various methods for reducing the number of dimensions in the co-occurrence vectors. Lund & Burgess [10] have suggested using only the most variant components; Landauer & Dumais [5] stress that, to be of explanatory value, the dimensionality of the co-occurrence vectors must be reduced to around 300 using singular value decomposition, a procedure related to principal components analysis; and Lowe & McDonald [8] have used a statistical reliability criterion. We have used a simpler framework that orders and truncates the dimensions according to their word frequency. Here we compare how the different methods perform for two evaluation criteria and briefly discuss the consequences of the different methodologies for work within cognitive or neural computation.
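
The frequency-ordered truncation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' system: the abstract does not specify the statistic, window size, or distance metric, so the raw counts, the symmetric window, the cosine measure, and the names `cooccurrence_vectors`, `cosine`, `n_context`, and `window` are all assumptions made for illustration.

```python
from collections import Counter
import math


def cooccurrence_vectors(tokens, n_context=3, window=1):
    """Build raw co-occurrence count vectors (illustrative assumption,
    not the paper's statistic), keeping only the n_context most
    frequent words as dimensions."""
    freq = Counter(tokens)
    # Order candidate context words by corpus frequency
    # (ties broken alphabetically), then truncate.
    ranked = sorted(freq.items(), key=lambda kv: (-kv[1], kv[0]))
    context = [word for word, _ in ranked[:n_context]]
    index = {word: i for i, word in enumerate(context)}

    vectors = {word: [0] * len(context) for word in freq}
    for pos, target in enumerate(tokens):
        lo = max(0, pos - window)
        hi = min(len(tokens), pos + window + 1)
        for j in range(lo, hi):
            # Count neighbours inside the window that are context words.
            if j != pos and tokens[j] in index:
                vectors[target][index[tokens[j]]] += 1
    return context, vectors


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


tokens = "the cat sat on the mat the dog sat on the rug".split()
context, vectors = cooccurrence_vectors(tokens, n_context=3, window=1)
print(context)
print(vectors["cat"], vectors["dog"], cosine(vectors["cat"], vectors["dog"]))
```

In this toy corpus "cat" and "dog" occur in identical contexts, so their vectors coincide. Raising or lowering `n_context` is the knob that corresponds to choosing how many frequency-ranked context words to keep.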

References

  1. Aston, G. & Burnard, L. (1998). The BNC Handbook: Exploring the British National Corpus with SARA. Edinburgh University Press.

  2. Battig, W.F. & Montague, W.E. (1969). Category norms for verbal items in 56 categories: A replication and extension of the Connecticut category norms. Journal of Experimental Psychology Monograph, 80, 1–45.

  3. Dagan, I., Marcus, S. & Markovitch, S. (1993). Contextual word similarity and estimation from sparse data. In Proceedings of the 31st Annual Meeting of the ACL, 164–171.

  4. Finch, S. & Chater, N. (1992). Bootstrapping syntactic categories. In Proceedings of the 14th Annual Meeting of the Cognitive Science Society, 820–825.

  5. Landauer, T. & Dumais, S. (1997). A solution to Plato’s problem: The latent semantic analysis theory of acquisition, induction, and representation of knowledge. Psychological Review, 104(2), 211–240.

  6. Levy, J.P., Bullinaria, J.A. & Patel, M. (1998). Explorations in the derivation of word co-occurrence statistics. South Pacific Journal of Psychology, 10(1), 99–111.

  7. Levy, J.P. & Bullinaria, J.A. (1999). The emergence of semantic representations from language usage. Paper given at the EPSRC Workshop on Self-Organising Systems - Future Prospects for Computing, UMIST, October 1999.

  8. Lowe, W. & McDonald, S. (2000). The direct route: Mediated priming in semantic space. In Proceedings of the 22nd Annual Meeting of the Cognitive Science Society.

  9. Lund, K., Burgess, C. & Atchley, R.A. (1995). Semantic and associative priming in high-dimensional semantic space. In Proceedings of the 17th Annual Meeting of the Cognitive Science Society, 660–665.

  10. Lund, K. & Burgess, C. (1996). Producing high-dimensional semantic spaces from lexical co-occurrence. Behavior Research Methods, Instruments, & Computers, 28(2), 203–208.

  11. Manning, C.D. & Schütze, H. (1999). Foundations of Statistical Natural Language Processing. Cambridge, MA: MIT Press.

  12. Patel, M., Bullinaria, J.A. & Levy, J.P. (1998). Extracting Semantic Representations from Large Text Corpora. In Bullinaria, J.A., Glasspool, D.W. & Houghton, G. (eds), 4th Neural Computation and Psychology Workshop, London, 9–11 April 1997: Connectionist Representations, 199–212. London: Springer-Verlag.

  13. Redington, M. & Chater, N. (1997). Probabilistic and distributional approaches to language acquisition. Trends in Cognitive Sciences, 1(7), 273–281.

  14. Schütze, H. (1993). Word Space. In S.J. Hanson, J.D. Cowan & C.L. Giles (Eds.) Advances in Neural Information Processing Systems 5, 895–902. San Mateo, CA: Morgan Kaufmann.

  15. Zhu, H. (1997). Bayesian Geometric Theory of Learning Algorithms. In: Proceedings of the International Conference on Neural Networks (ICNN’97), Vol. 2, 1041–1044.

Copyright information

© 2001 Springer-Verlag London

About this paper

Cite this paper

Levy, J.P., Bullinaria, J.A. (2001). Learning Lexical Properties from Word Usage Patterns: Which Context Words Should be Used?. In: French, R.M., Sougné, J.P. (eds) Connectionist Models of Learning, Development and Evolution. Perspectives in Neural Computing. Springer, London. https://doi.org/10.1007/978-1-4471-0281-6_27

  • DOI: https://doi.org/10.1007/978-1-4471-0281-6_27

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-85233-354-6

  • Online ISBN: 978-1-4471-0281-6

  • eBook Packages: Springer Book Archive
