Web-Assisted Detection and Correction of Joint and Disjoint Malapropos Word Combinations
An experiment on Web-assisted detection and correction of malapropisms is reported. A malapropos word destroys the meaning of the collocation it occurs in, while usually retaining its syntactic links with the other words. A hundred English malapropisms were gathered, each supplied with its correction candidates, i.e. word combinations in which one word is an editing variant of the corresponding word in the malapropism. Google statistics of occurrences and co-occurrences were gathered for each malapropism and each correction candidate. The components of a collocation may be adjacent or separated by other words in a sentence, so statistics were accumulated for the most probable distance between them. The raw Google occurrence statistics are then converted into numeric values of a specially defined Semantic Compatibility Index (SCI). Heuristic rules are proposed to signal a malapropism when its SCI value is lower than a predetermined threshold and to retain a few of the highest SCI-ranked correction candidates. Within certain limitations, the experiment gave promising results.
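The detection and ranking steps described above can be illustrated with a minimal sketch. The abstract does not give the SCI formula, so the score below (log of the co-occurrence count over the geometric mean of the individual counts) is only an assumed stand-in, not the authors' definition. The function names, the threshold value, and all counts are hypothetical and chosen for illustration; they are not Google statistics from the experiment.

```python
import math

def compatibility_score(pair_count, count_a, count_b):
    """Assumed stand-in for SCI: log2(N(a,b) / sqrt(N(a) * N(b))).

    Returns -inf when any count is zero, so never-seen pairs rank lowest.
    """
    if pair_count <= 0 or count_a <= 0 or count_b <= 0:
        return float("-inf")
    return math.log2(pair_count / math.sqrt(count_a * count_b))

def check_combination(pair, candidates, word_counts, pair_counts,
                      threshold, keep=3):
    """Flag `pair` as a suspected malapropism and rank its candidates.

    Returns (is_suspect, ranked), where `ranked` holds at most `keep`
    (candidate, score) tuples that score higher than the original pair.
    """
    def score(p):
        a, b = p
        return compatibility_score(pair_counts.get(p, 0),
                                   word_counts.get(a, 0),
                                   word_counts.get(b, 0))

    original = score(pair)
    if original >= threshold:
        return False, []          # compatible enough, nothing to signal
    better = [(c, score(c)) for c in candidates if score(c) > original]
    better.sort(key=lambda item: item[1], reverse=True)
    return True, better[:keep]

# Hypothetical counts, invented for this example only.
word_counts = {"travel": 1_000_000, "word": 2_000_000,
               "world": 4_000_000, "ward": 100_000}
pair_counts = {("travel", "word"): 10,
               ("travel", "world"): 50_000,
               ("travel", "ward"): 5}

# "word" is the suspected malapropos word; "world" and "ward" are
# one-letter editing variants of it.
suspect, ranked = check_combination(
    ("travel", "word"),
    [("travel", "world"), ("travel", "ward")],
    word_counts, pair_counts,
    threshold=-10.0,              # assumed threshold, not from the paper
)
for candidate, value in ranked:
    print(candidate, round(value, 2))
```

With these invented counts the original pair falls below the assumed threshold and the candidate with "world" ranks first. In practice the threshold and the number of retained candidates would have to be tuned on real occurrence statistics.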
Keywords: Content Word, Editing Operation, Semantic Error, Word Combination, Probable Distance