Abstract
This study presents a series of experiments using contextual next-word prediction to aid controlled language authoring. The goal is to assess how well n-gram language modelling can support the generation of controlled language in restricted domains with minimal supervision. We evaluate how different dimensions of language model design affect prediction quality and textual coherence. In particular, we present evaluations of suggestion ranking, perplexity gain and language model combination. We show that word prediction can provide adequate suggestions, which could offer an alternative to the costly manual configuration of rules in controlled language applications.
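As a rough illustration of the two quantities named in the abstract, suggestion ranking and perplexity, the sketch below trains a toy bigram model and ranks next-word candidates. It is not the authors' configuration: the add-one smoothing, the three-sentence corpus and all function names (`train_bigrams`, `prob`, `suggest`, `perplexity`) are assumptions made for the example.

```python
import math
from collections import Counter, defaultdict

def train_bigrams(sentences):
    """Count bigram frequencies over tokenized sentences."""
    bigrams = defaultdict(Counter)
    vocab = set()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        vocab.update(padded)
        for prev, nxt in zip(padded, padded[1:]):
            bigrams[prev][nxt] += 1
    return bigrams, vocab

def prob(bigrams, vocab, prev, word):
    """Add-one smoothed estimate of P(word | prev) (an assumed smoothing choice)."""
    counts = bigrams.get(prev, Counter())
    return (counts[word] + 1) / (sum(counts.values()) + len(vocab))

def suggest(bigrams, vocab, prev, k=3):
    """Rank candidate next words by probability, breaking ties alphabetically."""
    candidates = vocab - {"<s>"}
    ranked = sorted(candidates, key=lambda w: (-prob(bigrams, vocab, prev, w), w))
    return ranked[:k]

def perplexity(bigrams, vocab, tokens):
    """Per-token perplexity of one sentence under the bigram model."""
    padded = ["<s>"] + tokens + ["</s>"]
    log_sum = sum(math.log(prob(bigrams, vocab, p, w))
                  for p, w in zip(padded, padded[1:]))
    return math.exp(-log_sum / (len(padded) - 1))

# Hypothetical restricted-domain corpus, invented for this example.
corpus = [
    ["open", "the", "valve"],
    ["close", "the", "valve"],
    ["open", "the", "panel"],
]
bigrams, vocab = train_bigrams(corpus)
print(suggest(bigrams, vocab, "the", k=2))
print(perplexity(bigrams, vocab, ["open", "the", "valve"]))
```

A sentence that follows the patterns of the training corpus should receive a lower perplexity than a scrambled one, which is the intuition behind using perplexity as a proxy for textual coherence.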
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Palmaz, S., Cuadros, M., Etchegoyhen, T. (2016). Statistically-Guided Controlled Language Authoring. In: Davis, B., Pace, G., Wyner, A. (eds) Controlled Natural Language. CNL 2016. Lecture Notes in Computer Science, vol 9767. Springer, Cham. https://doi.org/10.1007/978-3-319-41498-0_4
DOI: https://doi.org/10.1007/978-3-319-41498-0_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-41497-3
Online ISBN: 978-3-319-41498-0
eBook Packages: Computer Science, Computer Science (R0)