Fusion and Fission: Improved MMIA for Multi-modal HCI Based on WPS and Voice-XML

Kim, Jung-Hyun; Hong, Kwang-Seok

doi:10.1007/978-3-540-73035-4_15


Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4541)

Abstract

This paper implements a Multi-Modal Instruction Agent (hereinafter, MMIA) that synchronizes the audio and gesture modalities, and proposes improved fusion and fission rules that depend on the SNNR (Signal Plus Noise to Noise Ratio) and a fuzzy value. The agent builds on an embedded KSSL (Korean Standard Sign Language) recognizer using the WPS (Wearable Personal Station) and Voice-XML. Our approach fuses and recognizes sentence- and word-based instruction models expressed in speech and KSSL, then fissions the recognition result according to a weight decision rule and translates it into synthetic speech and a visual illustration (a graphical display on an HMD, or Head Mounted Display) in real time. To verify the validity of our approach, we evaluate the MMIA by its average recognition rate and recognition time. In the experiments, the average recognition rates of the MMIA for the prescribed 65 sentential and 156 word instruction models were 94.33% and 96.85% in clean environments, and 92.29% and 92.91% in noisy environments. The average recognition time was approximately 0.36 ms in both environments.
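
The abstract names the inputs to the weight decision rule (the SNNR and a fuzzy value) but does not state the rule itself. The Python sketch below is only an illustration of how such inputs could be combined, not the authors' method. The decibel form of the SNNR, 10·log10((S+N)/N), is the conventional definition; the function names, the 15 dB threshold, and the normalization scheme are all assumptions introduced here.

```python
import math


def snnr_db(signal_plus_noise_power: float, noise_power: float) -> float:
    """Signal-plus-noise to noise ratio in decibels: 10*log10((S+N)/N)."""
    if signal_plus_noise_power <= 0 or noise_power <= 0:
        raise ValueError("powers must be positive")
    return 10.0 * math.log10(signal_plus_noise_power / noise_power)


def modality_weights(snnr: float, fuzzy_value: float,
                     snnr_threshold_db: float = 15.0) -> tuple[float, float]:
    """Hypothetical weight decision rule (an assumption, not from the paper).

    Speech confidence grows linearly with the SNNR up to an assumed
    threshold; the gesture confidence is the fuzzy value in [0, 1].
    Returns normalized (speech_weight, gesture_weight).
    """
    if not 0.0 <= fuzzy_value <= 1.0:
        raise ValueError("fuzzy_value must lie in [0, 1]")
    # Clamp the speech confidence to [0, 1].
    speech_conf = min(max(snnr, 0.0) / snnr_threshold_db, 1.0)
    total = speech_conf + fuzzy_value
    if total == 0.0:
        # Neither modality is trustworthy: fall back to equal weights.
        return 0.5, 0.5
    return speech_conf / total, fuzzy_value / total


if __name__ == "__main__":
    quiet = snnr_db(100.0, 1.0)   # speech clearly above the noise floor
    noisy = snnr_db(1.0, 1.0)     # speech indistinguishable from noise
    print(modality_weights(quiet, 0.8))
    print(modality_weights(noisy, 0.8))
```

With these assumed settings, a low SNNR shifts the weight toward the gesture (KSSL) result, which matches the general idea of a noise-dependent rule; the actual thresholds and fuzzy membership functions would have to come from the full paper.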



Author information

Authors and Affiliations

School of Information and Communication Engineering, Sungkyunkwan University, 300, Chunchun-dong, Jangan-gu, Suwon, KyungKi-do, 440-746, Korea
Jung-Hyun Kim & Kwang-Seok Hong


Editor information

Takeshi Okadome, Tatsuya Yamazaki, Mounir Makhtari


Cite this paper

Kim, JH., Hong, KS. (2007). Fusion and Fission: Improved MMIA for Multi-modal HCI Based on WPS and Voice-XML. In: Okadome, T., Yamazaki, T., Makhtari, M. (eds) Pervasive Computing for Quality of Life Enhancement. ICOST 2007. Lecture Notes in Computer Science, vol 4541. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73035-4_15


DOI: https://doi.org/10.1007/978-3-540-73035-4_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73034-7
Online ISBN: 978-3-540-73035-4
eBook Packages: Computer Science, Computer Science (R0)
