Multimodal Interface for Effective Man Machine Interaction

  • N. S. Sreekanth
  • Nobby Varghese
  • C. H. Pradeepkumar
  • Pal Vaishali
  • R. Ganga Prasad
  • N. Pal Supriya
  • N. K. Narayanan
Part of the Media Business and Innovation book series (MEDIA)


Providing a human–human style of interaction for man–machine communication remains a research challenge. It is widely believed that as computing, communication, and display technologies progress further, existing HCI techniques may become a constraint on the effective use of the available information flow. Multimodal interaction offers the user multiple modes of interfacing with a system beyond traditional keyboard and mouse input. This article discusses the effectiveness of multimodal interaction for man–machine communication and examines implementation issues across various platforms and media. The convergence of input and output technologies can reduce the difficulties humans face in communicating with machines and thereby make the most of converged media platforms. This chapter describes the implementation of a multimodal interface system through a case study, and also discusses several challenging application areas that call for a solution of this kind.
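The fusion of input modalities described above can be illustrated with a minimal late-fusion sketch: a spoken command is paired with a pointing gesture that occurs close to it in time, in the spirit of "put-that-there"-style interaction. The event types, field names, and the one-second pairing window here are illustrative assumptions, not the chapter's actual implementation:

```python
from dataclasses import dataclass

@dataclass
class InputEvent:
    modality: str    # e.g. "speech" or "gesture" (illustrative labels)
    payload: str     # recognized command, or the object pointed at
    timestamp: float  # seconds since session start

def fuse(events, window=1.0):
    """Late fusion: pair each speech command with the nearest
    gesture event occurring within `window` seconds of it."""
    speech = [e for e in events if e.modality == "speech"]
    gestures = [e for e in events if e.modality == "gesture"]
    commands = []
    for s in speech:
        nearby = [g for g in gestures
                  if abs(g.timestamp - s.timestamp) <= window]
        if nearby:
            # Pick the gesture closest in time to the utterance.
            g = min(nearby, key=lambda g: abs(g.timestamp - s.timestamp))
            commands.append((s.payload, g.payload))
    return commands

events = [
    InputEvent("speech", "move", 0.2),
    InputEvent("gesture", "file_icon", 0.5),
    InputEvent("speech", "delete", 5.0),  # no gesture nearby: stays unpaired
]
print(fuse(events))  # [('move', 'file_icon')]
```

Real systems typically add confidence scores from each recognizer and resolve conflicts in a fusion engine, but the core idea of temporally aligning independently recognized modalities is the same.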


Keywords: Speech Recognition, Gesture Recognition, Automatic Speech Recognition, Hand Gesture, Parse Tree



Copyright information

© Springer-Verlag Berlin Heidelberg 2016

Authors and Affiliations

  • N. S. Sreekanth (1)
  • Nobby Varghese (1)
  • C. H. Pradeepkumar (1)
  • Pal Vaishali (1)
  • R. Ganga Prasad (1)
  • N. Pal Supriya (1)
  • N. K. Narayanan (2)

  1. Center for Development of Advanced Computing (C-DAC), Bangalore, India
  2. Department of Information Technology, Kannur University, Kannur, India
