Engineering Variance: Software Techniques for Scalable, Customizable, and Reusable Multimodal Processing

  • Marc Erich Latoschik
  • Martin Fischbach
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8510)


This article describes four software techniques that enhance the overall quality of multimodal processing software and account for concurrency as well as for variance due to individual characteristics and cultural context. First, the processing steps are decentralized and distributed using the actor model. Second, functor objects decouple domain- and application-specific operations from universal processing methods. Third, domain-specific languages (DSLs) are provided inside specialized feature processing units to define the necessary algorithms in a human-readable and comprehensible format. Fourth, the constituents of the DSLs (including the functors) are semantically grounded in a common ontology, which supports syntactic and semantic correctness checks as well as code generation. These techniques provide scalable, customizable, and reusable technical solutions for recurring multimodal processing tasks.
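
The second technique can be illustrated with a minimal sketch. It is not taken from the article: the names `Sample`, `Above`, and `held_for`, the millisecond timestamps, and the choice of Python are all assumptions made for illustration. The sketch shows a generic temporal check (the universal processing method) that knows nothing about the domain, and a functor object that carries the domain-specific test together with its parameters.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class Sample:
    """One timestamped feature value (hypothetical data type)."""
    t_ms: int      # timestamp in milliseconds
    value: float   # e.g. hand height in metres


class Above:
    """Functor object: a domain-specific predicate with its own parameter."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold

    def __call__(self, sample: Sample) -> bool:
        return sample.value > self.threshold


def held_for(samples: Iterable[Sample],
             predicate: Callable[[Sample], bool],
             min_duration_ms: int) -> bool:
    """Universal step: does `predicate` hold without interruption
    for at least `min_duration_ms` milliseconds?"""
    start: Optional[int] = None
    for sample in samples:
        if predicate(sample):
            if start is None:
                start = sample.t_ms          # predicate just became true
            if sample.t_ms - start >= min_duration_ms:
                return True
        else:
            start = None                     # interruption resets the window
    return False


# Hypothetical stream of hand-height samples.
stream = [Sample(0, 0.2), Sample(100, 1.3), Sample(200, 1.4),
          Sample(300, 1.5), Sample(400, 0.1)]

hand_raised = held_for(stream, Above(1.2), min_duration_ms=200)
hand_raised_high = held_for(stream, Above(1.45), min_duration_ms=200)
```

Under these assumptions, adapting the check to another user or cultural context means swapping or re-parameterizing the functor (for example a different threshold), while `held_for` stays unchanged.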


Keywords: Multimodal processing, interactive systems, software architecture, actor system, DSL, reactive manifesto, software patterns





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Marc Erich Latoschik (1)
  • Martin Fischbach (1)
  1. HCI Group, University of Würzburg, Germany
