Fusion and Fission: Improved MMIA for Multi-modal HCI Based on WPS and Voice-XML

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4541)

Abstract

This paper implements the Multi-Modal Instruction Agent (hereinafter, MMIA), which synchronizes the audio and gesture modalities, and proposes improved fusion and fission rules that depend on the SNNR (Signal Plus Noise to Noise Ratio) and a fuzzy value. The MMIA is built on an embedded KSSL (Korean Standard Sign Language) recognizer using the WPS (Wearable Personal Station) and Voice-XML. Our approach fuses and recognizes the sentence- and word-based instruction models represented by speech and KSSL, and then translates the recognition result, which is fissioned according to a weight decision rule, into synthetic speech and a visual illustration (a graphical display on an HMD, Head Mounted Display) in real time. To verify the validity of our approach, we evaluate its performance using the average recognition rates and the recognition time of the MMIA. In the experimental results, the average recognition rates of the MMIA for the prescribed 65 sentential and 156 word instruction models were 94.33% and 96.85%, respectively, in clean environments, and 92.29% and 92.91% in noisy environments. In addition, the average recognition time was approximately 0.36 ms in both environments.
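
The abstract names two quantities that drive the fusion and fission rules, the SNNR and a fuzzy value, but does not state the rules themselves. The following is a minimal sketch of how such a decision could look, not the authors' method. The SNNR is computed with the standard decibel power ratio; the function names, the two thresholds (15 dB and 0.5), and the four outcome labels are all hypothetical and chosen only for illustration.

```python
import math
from typing import Sequence


def mean_power(samples: Sequence[float]) -> float:
    """Mean squared amplitude of a block of samples."""
    if not samples:
        raise ValueError("samples must be non-empty")
    return sum(x * x for x in samples) / len(samples)


def snnr_db(speech_plus_noise: Sequence[float],
            noise_only: Sequence[float]) -> float:
    """Signal-plus-noise to noise ratio in dB: 10 * log10(P_sn / P_n)."""
    p_sn = mean_power(speech_plus_noise)
    p_n = mean_power(noise_only)
    if p_n == 0.0:
        return math.inf   # no measurable noise
    if p_sn == 0.0:
        return -math.inf  # silent input
    return 10.0 * math.log10(p_sn / p_n)


def choose_modality(snnr: float,
                    fuzzy_value: float,
                    snnr_threshold_db: float = 15.0,  # assumed, not from the paper
                    fuzzy_threshold: float = 0.5) -> str:  # assumed, not from the paper
    """Hypothetical weight decision rule.

    Speech is trusted when the SNNR is high enough; the gesture (KSSL)
    result is trusted when its fuzzy membership value is high enough.
    """
    speech_ok = snnr >= snnr_threshold_db
    gesture_ok = fuzzy_value >= fuzzy_threshold
    if speech_ok and gesture_ok:
        return "fused"
    if speech_ok:
        return "speech"
    if gesture_ok:
        return "gesture"
    return "reject"
```

Under these assumptions, a noisy room lowers the SNNR below the threshold, so the sketch falls back to the gesture result alone, which is the kind of behavior a noise-dependent rule is meant to provide.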

Editor information

Takeshi Okadome, Tatsuya Yamazaki, Mounir Makhtari

Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

Kim, JH., Hong, KS. (2007). Fusion and Fission: Improved MMIA for Multi-modal HCI Based on WPS and Voice-XML. In: Okadome, T., Yamazaki, T., Makhtari, M. (eds) Pervasive Computing for Quality of Life Enhancement. ICOST 2007. Lecture Notes in Computer Science, vol 4541. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73035-4_15

  • DOI: https://doi.org/10.1007/978-3-540-73035-4_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-73034-7

  • Online ISBN: 978-3-540-73035-4

  • eBook Packages: Computer Science, Computer Science (R0)
