Personalized Mobile Multimodal Services: CHAT Project Experiences

  • Giovanni Frattini
  • Federico Ceccarini
  • Fabio Corvino
  • Ivano De Furio
  • Francesco Gaudino
  • Pierpaolo Petriccione
  • Roberto Russo
  • Vladimiro Scotto di Carlo
  • Gianluca Supino
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5232)

Abstract

Despite optimistic expectations, the spread of multimodal mobile applications is proceeding slowly. Nevertheless, the power of new high-end devices offers the opportunity to create a new class of applications with advanced synergic multimodal features. In this paper we present the results the CHAT group achieved in defining and building a platform for developing synergic mobile multimodal services. CHAT is a project co-funded by the Italian Ministry of Research, aimed at providing multimodal, context-sensitive services to mobile users. Our architecture is based on the following key concepts: a thin-client approach, a modular client interface, asynchronous content push, distributed recognition, natural language processing, and speech-driven semantic fusion. The core of the system is based on a mix of web and telecommunication technologies. This choice proved very useful for creating highly personalized, context-sensitive services. One of the main features is the ability to push appropriate content to the user terminal, reducing unfriendly user interactions.
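As an illustration of the asynchronous content push named in the abstract, the following is a minimal sketch in Java of a server-side push channel serving thin clients over long-lived connections. All names here (ContentPushChannel, register, push, the session id) are hypothetical and assumed for the example; they are not taken from the CHAT platform's API.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch of asynchronous content push: the server keeps one
// outbound queue per connected thin client and delivers content without
// the client having to issue a request first.
public class ContentPushChannel {

    private final Map<String, BlockingQueue<String>> sessions = new ConcurrentHashMap<>();

    // Called when a thin client opens its long-lived connection.
    public BlockingQueue<String> register(String sessionId) {
        return sessions.computeIfAbsent(sessionId, id -> new LinkedBlockingQueue<>());
    }

    // Business logic pushes context-sensitive content to a known session.
    public void push(String sessionId, String content) {
        BlockingQueue<String> queue = sessions.get(sessionId);
        if (queue != null) {
            queue.offer(content);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ContentPushChannel channel = new ContentPushChannel();
        BlockingQueue<String> inbox = channel.register("user-42");

        // The client thread blocks until the server pushes something.
        Thread client = new Thread(() -> {
            try {
                System.out.println("client received: " + inbox.take());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        client.start();

        channel.push("user-42", "nearby-services.xml");
        client.join();
    }
}
```

In a real deployment the per-session queue would be drained over a persistent transport (e.g. an open HTTP connection or an SMS/WAP-push trigger); the in-process queue above only stands in for that channel.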

Keywords

Speech Recognition, Automatic Speech Recognition, Business Logic, Multimodal Interface, Internet Engineering Task Force

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Giovanni Frattini (1)
  • Federico Ceccarini (1)
  • Fabio Corvino (1)
  • Ivano De Furio (1)
  • Francesco Gaudino (1)
  • Pierpaolo Petriccione (1)
  • Roberto Russo (1)
  • Vladimiro Scotto di Carlo (1)
  • Gianluca Supino (1)
  1. ENGINEERING.IT S.p.A., Pozzuoli (NA), Italy