Statistically-Guided Controlled Language Authoring

  • Conference paper
Controlled Natural Language (CNL 2016)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9767)

Abstract

This study presents a series of experiments using contextual next-word prediction to aid controlled language authoring. The goal is to assess how far n-gram language modelling can improve the generation of controlled language in restricted domains with minimal supervision. We evaluate how different dimensions of language model design affect prediction and textual coherence. In particular, we present evaluations of suggestion ranking, perplexity gain and language model combination. We show that word prediction can provide adequate suggestions, which could offer an alternative to the costly manual configuration of rules in controlled language applications.
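The three evaluation dimensions named in the abstract (suggestion ranking, perplexity and language model combination) can be illustrated with a minimal sketch. It assumes a toy add-one smoothed bigram model trained on a handful of invented maintenance-style sentences. The names `BigramModel`, `suggest`, `perplexity` and `interpolate` are hypothetical, and neither the smoothing method nor the corpus is taken from the paper.

```python
import math
from collections import Counter, defaultdict


class BigramModel:
    """Toy add-one smoothed bigram model (illustrative, not the authors' setup)."""

    def __init__(self, sentences):
        self.unigrams = Counter()
        self.bigrams = defaultdict(Counter)
        for sentence in sentences:
            tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
            self.unigrams.update(tokens)
            for prev, word in zip(tokens, tokens[1:]):
                self.bigrams[prev][word] += 1
        self.vocab = sorted(self.unigrams)

    def prob(self, prev, word):
        # Add-one smoothing over the training vocabulary.
        count = self.bigrams[prev][word]
        total = sum(self.bigrams[prev].values())
        return (count + 1) / (total + len(self.vocab))

    def suggest(self, prev, k=3):
        # Rank candidate next words by probability, ties broken alphabetically.
        candidates = [w for w in self.vocab if w != "<s>"]
        ranked = sorted(candidates, key=lambda w: (-self.prob(prev, w), w))
        return ranked[:k]

    def perplexity(self, sentence):
        # exp of the average negative log-probability per predicted token.
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        pairs = list(zip(tokens, tokens[1:]))
        log_sum = sum(math.log(self.prob(p, w)) for p, w in pairs)
        return math.exp(-log_sum / len(pairs))


def interpolate(models, weights, prev, word):
    # Linear combination of several models; weights are assumed to sum to 1.
    return sum(lam * m.prob(prev, word) for m, lam in zip(models, weights))


# Invented in-domain sentences, used only to exercise the sketch.
corpus = [
    "open the valve",
    "close the valve",
    "open the panel",
    "check the valve",
]
model = BigramModel(corpus)
print(model.suggest("the", k=2))  # ['valve', 'panel']
print(model.perplexity("open the valve") < model.perplexity("valve the open"))  # True
```

A lower perplexity on in-domain word order than on a scrambled sentence is the kind of signal a perplexity-based evaluation relies on. In practice the interpolation weights would be tuned on held-out text rather than fixed by hand.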

Notes

  1. http://www.sumitsoft.com/

  2. http://www.goqsoftware.com/wordQ.php

  3. http://opus.lingfil.uu.se/

Author information

Corresponding author

Correspondence to Susana Palmaz.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Palmaz, S., Cuadros, M., Etchegoyhen, T. (2016). Statistically-Guided Controlled Language Authoring. In: Davis, B., Pace, G., Wyner, A. (eds) Controlled Natural Language. CNL 2016. Lecture Notes in Computer Science, vol 9767. Springer, Cham. https://doi.org/10.1007/978-3-319-41498-0_4

  • DOI: https://doi.org/10.1007/978-3-319-41498-0_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-41497-3

  • Online ISBN: 978-3-319-41498-0

  • eBook Packages: Computer Science, Computer Science (R0)
