Abstract
Mobile phones, with their increasing processing power and memory, are enabling a diversity of tasks. The traditional keypad-based text entry method falls short in numerous ways. Proposed solutions to this problem include QWERTY keypads on phones, external keypads, virtual keypads on table tops (Siemens at CeBIT ’05) and, last but not least, automatic speech recognition (ASR) technology. Speech recognition allows for dictation, which facilitates text input via voice. Despite the progress made, ASR systems still do not perform satisfactorily in mobile environments, mainly because of the complexity of capturing a large vocabulary spoken by diverse speakers in various acoustic conditions. Dictation therefore has its advantages but also comes with its own set of usability problems. The objective of this research is to uncover the various uses and benefits of dictation on a mobile phone. The study focused on users’ needs, expectations, and concerns regarding the new input medium. Focus groups were conducted to investigate and discuss current data entry methods, the potential use and usefulness of a dictation feature, users’ reactions to ASR errors during dictation, and possible error correction methods. Our findings indicate a strong need for dictation. All participants perceived dictation to be very useful, as long as it is easily accessible and usable. Potential applications for dictation were found in two distinct areas: communication and personal use.
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Basapur, S., Xu, S., Ahlenius, M., Lee, Y.S. (2007). User Expectations from Dictation on Mobile Devices. In: Jacko, J.A. (eds) Human-Computer Interaction. Interaction Platforms and Techniques. HCI 2007. Lecture Notes in Computer Science, vol 4551. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73107-8_24
DOI: https://doi.org/10.1007/978-3-540-73107-8_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73106-1
Online ISBN: 978-3-540-73107-8
eBook Packages: Computer Science, Computer Science (R0)