Modeling Aspects of Multimodal Lithuanian Human-Machine Interface

  • Rytis Maskeliunas
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5398)


The paper deals with modeling a multimodal human-machine interface. Multimodal access to an information retrieval system (a computer) is made possible by combining three approaches: 1) data input and retrieval by voice; 2) traditional data input and retrieval; 3) confirmation or rejection by recognizing and displaying human facial expressions and emotions. A prototype of multimodal access to a web application that combines these three modalities is presented. Results of Lithuanian speech recognition experiments on transcriptions and the outcomes of discriminant analysis are also presented.


Keywords: Multimodal interface; Lithuanian speech recognition





Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Rytis Maskeliunas, Speech Research Lab, Kaunas University of Technology, Kaunas, Lithuania
