Method to Evaluate Difficulty of Technical Terms
We have developed an automatic annotation system. To support this system, we conducted experiments on a method for evaluating the difficulty of technical terms in documents using Wikipedia data. Based on the hypothesis that basic, easy terms appear frequently in Wikipedia, we examined the relationship between the subjective difficulty of a term and its frequency of appearance in Wikipedia. As a result, we were able to classify technical terms as easy or difficult with an accuracy of 0.70.
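The abstract does not say how frequency is mapped to a difficulty class. One plausible reading is a single cut-off on log-scaled frequency, chosen to maximise agreement with subjective labels. The Python sketch below illustrates that reading only. The function names, the log scaling, and the toy counts and labels are all hypothetical and are not taken from the paper or from real Wikipedia data.

```python
import math


def best_threshold(freqs, labels):
    """Pick the log-frequency cut-off that best separates easy from difficult terms.

    freqs  : appearance counts of each term (assumed to come from a Wikipedia dump)
    labels : True if annotators judged the term easy, False if difficult
    Rule being fitted: log10(freq + 1) >= threshold  ->  easy
    """
    scores = [math.log10(f + 1) for f in freqs]
    best_t, best_acc = None, -1.0
    # Only observed scores need to be tried as candidate cut-offs.
    for t in sorted(set(scores)):
        predictions = [s >= t for s in scores]
        correct = sum(p == y for p, y in zip(predictions, labels))
        acc = correct / len(labels)
        if acc > best_acc:  # ties keep the lowest cut-off
            best_t, best_acc = t, acc
    return best_t, best_acc


def classify(freq, threshold):
    """Label a single term using the fitted cut-off."""
    return "easy" if math.log10(freq + 1) >= threshold else "difficult"


# Hypothetical toy data, for illustration only.
toy_freqs = [12000, 8500, 300, 45, 9000, 20]
toy_labels = [True, True, False, False, False, True]

threshold, accuracy = best_threshold(toy_freqs, toy_labels)
```

In this sketch a frequent term labelled difficult, or a rare term labelled easy, counts as a misclassification, so accuracy stays below 1 wherever the frequency hypothesis fails for individual terms.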
Keywords: Word clustering, Automatic annotation, Information assistance
This work was partially supported by JSPS KAKENHI Grant No. 25240043 and a TISE Research Grant of Chuo University.