Combined Methodology Based on Kernel Regression and Kernel Density Estimation for Sign Language Machine Translation

  • Mehrez Boulares
  • Mohamed Jemni
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8866)

Abstract

Most current research in Machine Translation focuses on spoken languages. The aim is to find the most likely translation of a given source sentence using statistical learning techniques applied to very large parallel corpora. In this work, we focus on gesture languages, and on Sign Language in particular, in order to present a new methodological foundation for Sign Language Machine Translation. Our approach is based on Kernel Regression combined with Kernel Density Estimation, applied to Sign Language n-grams. The translation process is modelled as an n-gram to n-gram mapping that takes into account the positions of the n-grams in the source and target phrases. To this end, we propose a new feature mapping process (Weighted Sub n-gram Feature Mapping), which is a modified version of the String Subsequence Kernel (SSK) feature mapping. The Weighted Sub n-gram mapping generates feature vectors for both source and target n-grams. Then, to learn the function that maps source n-grams to target n-grams, we used and compared four learning techniques (Gaussian Process Regressor, K-Nearest Neighbors Regressor, Support Vector Regressor with Gaussian kernel, and Kernel Ridge Regression) in order to choose the one that minimizes the SSE (Sum of Squared Errors). To solve the pre-image problem, we rely on a De Bruijn multigraph search applied to the target n-grams. To obtain the best translation, we search for the most frequently observed bilingual n-gram alignment, that is, the one that maximizes the translation probability. For unknown n-grams, we use kernel ridge regression to predict the probability by learning the density estimation function of the bilingual n-gram alignments. We obtained encouraging experimental results on a small-scale, reduced-domain corpus.
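
The n-gram to n-gram mapping described above can be illustrated with a minimal sketch of dual-form kernel ridge regression over string features. It is not the authors' implementation. The abstract does not specify the Weighted Sub n-gram weighting, so the decay-weighted contiguous sub-n-gram features used here are an assumption. The pre-image step is simplified to a nearest-candidate search over a given list instead of the De Bruijn multigraph search. The names `sub_ngram_features`, `solve`, `translate`, the parameters `decay` and `lam`, and the toy gloss pairs are all hypothetical.

```python
from collections import Counter

def sub_ngram_features(tokens, decay=0.5):
    """Map an n-gram to weights over its contiguous sub-n-grams.
    The weight decay**length is an assumed scheme, not the paper's."""
    feats = Counter()
    for i in range(len(tokens)):
        for j in range(i + 1, len(tokens) + 1):
            feats[tuple(tokens[i:j])] += decay ** (j - i)
    return feats

def dot(a, b):
    """Inner product of two sparse feature vectors (the string kernel)."""
    return sum(v * b.get(k, 0.0) for k, v in a.items())

def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [x / piv for x in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [row[n:] for row in M]

def translate(source, pairs, candidates, lam=0.1):
    """Kernel ridge regression in dual form, followed by a simplified
    pre-image step: pick the candidate target n-gram whose feature
    vector is closest to the regression output."""
    src = [sub_ngram_features(s) for s, _ in pairs]
    tgt = [sub_ngram_features(t) for _, t in pairs]
    n = len(pairs)
    # Regularised Gram matrix K + lam * I over the source n-grams.
    K = [[dot(src[i], src[j]) + (lam if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    q = sub_ngram_features(source)
    # beta = (K + lam * I)^-1 k(source)
    beta = [row[0] for row in solve(K, [[dot(src[i], q)] for i in range(n)])]

    def distance(cand):
        # Squared distance to the predicted vector, dropping the
        # term that does not depend on the candidate.
        c = sub_ngram_features(cand)
        return dot(c, c) - 2 * sum(b * dot(t, c) for b, t in zip(beta, tgt))

    return min(candidates, key=distance)

# Invented toy pairs (English words to sign glosses), for illustration only.
pairs = [
    (("i", "am", "happy"), ("ME", "HAPPY")),
    (("i", "am", "sad"), ("ME", "SAD")),
    (("you", "are", "happy"), ("YOU", "HAPPY")),
]
candidates = [t for _, t in pairs] + [("YOU", "SAD")]
print(translate(("you", "are", "sad"), pairs, candidates))
```

Scoring candidates through kernel evaluations avoids building the target feature vectors explicitly. A graph search over target n-grams, as the abstract describes, would replace the fixed candidate list.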

Keywords

Kernel ridge regression, String kernel, De-Bruijn, Kernel density estimation, Sign language, Gaussian process for regression, KNN-Regressor, SVR Gaussian kernel, ASL signing space

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  1. Research Laboratory of Technologies of Information and Communication and Electrical Engineering (LaTICE), Ecole Supérieure des Sciences et Techniques de Tunis, Tunis, Tunisia
