Context Analysis for Computer-Assisted Near-Synonym Learning

  • Liang-Chih Yu
  • Wei-Nan Chien
  • Kai-Hsiang Hsu
Part of the Chinese Language Learning Sciences book series (CLLS)


Despite their similar meanings, near-synonyms may have different usages in different contexts. For second-language learners, such differences are not easily grasped in practical use. This chapter introduces several context analysis techniques, namely pointwise mutual information (PMI), n-gram language models, latent semantic analysis (LSA), and independent component analysis (ICA), to verify whether a near-synonym fits a given context. Applications can use these techniques to provide contextual information that helps learners understand the different usages of near-synonyms. Based on these techniques, we build a prototype computer-assisted near-synonym learning system. In experiments, we evaluate the context analysis methods on both Chinese and English sentences and compare their performance with that of several previously proposed supervised and unsupervised methods. Experimental results show that training on independent components, which contain useful contextual features with minimized term dependence, can improve the classifiers' ability to discriminate among near-synonyms, thus yielding better performance.
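As a rough illustration of the PMI-based check named in the abstract, the sketch below scores each candidate near-synonym by summing its PMI with the words of the surrounding context and picks the highest-scoring candidate. It is a minimal sketch under stated assumptions, not the chapter's implementation: the toy corpus, the function names, the use of whole sentences as the co-occurrence window, and the choice to score unseen pairs as 0.0 are all hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

# Toy corpus (hypothetical); each sentence is a list of tokens.
CORPUS = [
    ["strong", "tea", "is", "popular"],
    ["she", "drinks", "strong", "tea"],
    ["a", "powerful", "computer", "is", "fast"],
    ["he", "bought", "a", "powerful", "computer"],
    ["strong", "coffee", "and", "strong", "tea"],
]


def build_counts(corpus):
    """Count the sentences containing each word and each unordered word pair."""
    word_counts = Counter()
    pair_counts = Counter()
    for sentence in corpus:
        vocab = set(sentence)  # count each word at most once per sentence
        word_counts.update(vocab)
        pair_counts.update(frozenset(p) for p in combinations(sorted(vocab), 2))
    return word_counts, pair_counts, len(corpus)


def pmi(x, y, word_counts, pair_counts, n):
    """PMI(x, y) = log2(P(x, y) / (P(x) * P(y))), estimated over n sentences.

    Assumption: pairs never seen together score 0.0 instead of -infinity.
    """
    joint = pair_counts[frozenset((x, y))]
    if joint == 0:
        return 0.0
    return math.log2(joint * n / (word_counts[x] * word_counts[y]))


def best_near_synonym(candidates, context, counts):
    """Return the candidate with the highest summed PMI over the context words."""
    word_counts, pair_counts, n = counts
    scores = {
        c: sum(pmi(c, w, word_counts, pair_counts, n) for w in context)
        for c in candidates
    }
    return max(scores, key=scores.get), scores
```

Calling `best_near_synonym(["strong", "powerful"], ["tea"], build_counts(CORPUS))` on this toy data selects "strong", because "powerful" never co-occurs with "tea". A real system would estimate the counts from a large corpus and would need a more careful treatment of unseen pairs.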



Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  1. Yuan Ze University, Taoyuan, Taiwan
  2. Yuanze University, Taoyuan, Taiwan
