
User Expectations from Dictation on Mobile Devices

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 4551)

Abstract

Mobile phones, with their increasing processing power and memory, are enabling a diversity of tasks. The traditional method of text entry using a keypad is falling short in numerous ways. Some solutions to this problem include QWERTY keypads on phones, external keypads, virtual keypads on table tops (Siemens at CeBIT ’05) and, last but not least, automatic speech recognition (ASR) technology. Speech recognition allows for dictation, which facilitates text input via voice. Despite the progress, ASR systems still do not perform satisfactorily in mobile environments. This is mainly due to the complexity of capturing a large vocabulary spoken by diverse speakers in various acoustic conditions. Dictation therefore has its advantages but also comes with its own set of usability problems. The objective of this research is to uncover the various uses and benefits of dictation on a mobile phone. The study focused on users’ needs, expectations, and concerns regarding the new input medium. Focus groups were conducted to investigate and discuss current data entry methods, the potential use and usefulness of a dictation feature, users’ reactions to ASR errors during dictation, and possible error correction methods. Our findings indicate a strong demand for dictation. All participants perceived dictation to be very useful, as long as it is easily accessible and usable. Potential applications for dictation were found in two distinct areas: communication and personal use.

Editor information

Julie A. Jacko

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Basapur, S., Xu, S., Ahlenius, M., Lee, Y.S. (2007). User Expectations from Dictation on Mobile Devices. In: Jacko, J.A. (eds) Human-Computer Interaction. Interaction Platforms and Techniques. HCI 2007. Lecture Notes in Computer Science, vol 4551. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73107-8_24

  • DOI: https://doi.org/10.1007/978-3-540-73107-8_24

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-73106-1

  • Online ISBN: 978-3-540-73107-8

  • eBook Packages: Computer Science, Computer Science (R0)