Using speech and dialogue for interactive TV navigation

  • Long paper
Universal Access in the Information Society

Abstract

Interaction techniques for interactive television (iTV) are currently complex and difficult to use for a wide range of viewers. Few previous studies have dealt with the potential benefits of multimodal dialogue interaction in the context of iTV for the purpose of flexibility, usability, efficiency, and accessibility. This paper investigates the benefits of introducing speech and connected dialogue for iTV interaction, and presents a case study in which a prototype system was built that allows users to navigate the information space and control the operation of the TV through a speech-based natural language interface. The system was evaluated by analysing the user experience in five categories that capture essential aspects of iTV interaction: interaction style, information load, data access, effectiveness, and initiative. Design considerations relevant for speech and dialogue information systems for TV interfaces also emerged from the analysis.

Notes

  1. This study was sponsored by and conducted at Nokia Home Communications, a developer of home consumer products such as set-top boxes.

  2. The Nokia Mediaterminal is no longer in production.

References

  1. Adams M, Anand P, Fox S (2001) Interactive television: coming soon to a screen near you. In: Mohan S, Ranjay G (eds) Kellogg Tech Ventures 2001

  2. Ali A, Lamont S (2000) Interactive television programs: current challenges and solutions. In: Proceedings of the 8th annual usability professionals’ association conference, UPA’00, North Carolina, pp 14–18

  3. Allen JF, Ferguson G, Stent A (2001) An architecture for more realistic conversational systems. In: Intelligent user interfaces, pp 1–8

  4. Ardissono L, Portis F, Torasso P (2001) Architecture of a system for the generation of personalized electronic program guides. In: Workshop on personalization in future TV, User Modeling 2001, Sonthofen, Germany. http://www.di.unito.it/~liliana/UM01/schedule.html. Cited May 2004

  5. Berglund A, Qvarfordt P (2003) Error resolution strategies for interactive television speech interfaces. In: Proceedings of the 9th IFIP TC13 international conference on human–computer interaction (INTERACT’03), Zurich, pp 105–112

  6. Black A, Bayley O, Burns C, Kuuluvainen I, Stoddard J (1994) Keeping viewers in the picture: real-world usability procedures in the development of a television control interface. In: Proceedings of the conference companion on human factors in computing systems, CHI’94, Boston, pp 243–244

  7. Bonnici S (2003) Which channel is that on? A design model for electronic programme guides. In: Proceedings of the 1st European conference on interactive television: from viewers to actors? (EuroITV), Brighton, UK, pp 49–57

  8. Brennan SE, Hulteen EA (1993) Interaction and feedback in a spoken language system. In: Proceedings of AAAI-93 fall symposium on human–computer collaboration: reconciling theory, synthesizing practice. AAAI Technical Report FS-93-05, pp 1–5

  9. Brennan SE, Hulteen EA (1995) Interaction and feedback in a spoken-language system: a theoretical framework. Knowledge-based systems 8:143–151

  10. Bretan I, Kroon P (1996) Concurrent engineering for an interactive TV interface. In: Proceedings of the conference companion on human factors in computing systems, CHI’96, Vancouver, pp 117–118

  11. Carbonell N (2003) Towards the design of usable multimodal interaction languages. In: Universal access in the information society: special issue on multimodality: a step towards universal access, vol 2, no. 2. Springer, Berlin Heidelberg New York, pp 143–159

  12. Choi H, Choi M, Yu H, Kim J (2003) An empirical study on the adoption of information appliances with a focus on interactive TV. Telematics Inform 20:161–183

  13. Clancey M (1994) The television audience examined. Advertising Res 34(4):77–87

  14. Cohen PR (1992) The role of natural language in a multimodal interface. In: Proceedings of the 5th annual ACM symposium on user interface software and technology (UIST’92), Monterey, pp 143–149

  15. Cohen P, McGee D, Clow J (2000) The efficiency of multimodal interaction for a map-based task. In: Proceedings of the applied natural language processing conference (ANLP’00), Seattle, pp 26–27

  16. Cohen P, Oviatt S (1994) The role of voice in human–machine communication. In: Roe D, Wilpon J (eds) Voice communication between humans and machines. National Academy of Sciences Press, Washington, pp 34–75

  17. Cotter P, Smyth B (2000) PTV: intelligent personalized TV guides. In: Proceedings of the 17th national conference on artificial intelligence, AAAI 2000, Austin, pp 957–964

  18. Enns N, MacKenzie S (1998) Touchpad-based remote control devices. In: Proceedings of the conference on human factors in computing systems, CHI’98, Los Angeles, pp 229–230

  19. Eronen L, Vuorimaa P (2000) User interfaces for digital television: a navigator case study. In: Proceedings of the working conference on advanced visual interfaces, AVI2000, Palermo. ACM Press, pp 276–279

  20. Flycht-Eriksson A, Jönsson A (2000) Dialogue and domain knowledge management in dialogue systems. In: Proceedings of the first SIGdial-workshop on discourse and dialogue, Hong Kong, pp 121–130

  21. Freeman J, Lessiter J (2003) Using attitude based segmentation to better understand viewer’s usability issues with digital and interactive TV. In: Proceedings of the 1st European conference on interactive television: from viewers to actors? (EuroITV), Brighton, UK, pp 19–27

  22. French T, Springett M (2003) Developing novel iTV applications: a user centric analysis. In: Proceedings of the 1st European conference on interactive television: from viewers to actors? (EuroITV), Brighton, UK, pp 29–39

  23. Gill JM, Perera SA (2003) Accessibility of universal design of interactive digital television. In: Proceedings of the 1st European conference on interactive television: from viewers to actors? (EuroITV), Brighton, UK, pp 83–89

  24. Goto J, Komine K, Kim Y-B, Uratani N (2003) A television control system based on spoken natural language dialogue. In: Proceedings of the 9th IFIP TC13 international conference on human–computer interaction, INTERACT 2003, Zürich, pp 765–768

  25. Grasso MA, Ebert DS, Finin TW (1998) The integrality of speech in multimodal interfaces. ACM Trans Comput Hum Interact (TOCHI) 5:303–325

  26. Hackos JT, Redish JC (1998) User and task analysis for interface design. Wiley, New York

  27. Han SH, Yun MH, Kwahk J, Hong SW (2001) Usability of consumer electronic products. Int J Ind Ergon 28:143–151

  28. Hauptmann AG, Witbrock MJ, Rudnicky AI, Reed S (1995) Speech for multimedia information retrieval. In: Proceedings of user interface software and technology, UIST-95, Pittsburgh, pp 79–80

  29. Hearst MA (1999) Trends and controversies: mixed-initiative interaction. IEEE Intell Syst 14(5):14–23

  30. Holtzblatt K, Beyer H (1993) Making customer-centered design work for teams. Commun ACM 36(10):93–103

  31. Ibrahim A, Lundberg J, Johansson J (2001) Speech enhanced remote control for media terminal. In: Proceedings of Eurospeech’01, vol 4. Aalborg, pp 2685–2688

  32. Johansson P (2001) Iterative development of an information-providing dialogue system. Master’s Thesis, Linköping University, Sweden

  33. Johansson P (2003) Natural language interaction in personalized EPGs. In: Proceedings of the 3rd workshop on personalization in TV (9th international conference on user modeling), Johnstown, pp 27–31

  34. Jönsson A (1997) A model for habitable and efficient dialogue management for natural language interaction. Nat Lang Eng 3:103–122

  35. Kang M-H (2002) Interactivity in television: use and impact of an interactive program guide. J Broadcasting Electron Media 46(3):330–345

  36. Kaye B, Sapolsky B (1997) Electronic monitoring of in-home television RCD usage. J Broadcasting Electron Media 41(2):214–228

  37. Klein JA, Karger SA, Sinclair KA (2003) Digital television for all: a report on usability and accessible design. Report available online at: http://www.digitaltelevision.gov.uk/pdf_documents/publications/Digital_TV_for_all.pdf. Cited May 2004

  38. Kurapati K, Gutta S, Schaffer D, Martino J, Zimmerman J (2001) A multi-agent TV recommender. In: Workshop on personalization in future TV, user modeling 2001, Sonthofen. Available at: http://www.di.unito.it/~liliana/UM01/schedule.html. Cited May 2004

  39. Kvale S (1994) Ten standard reactions to qualitative research interviews. J Phenomenol Psychol 25:147–173

  40. Lee B, Lee RS (1995) How and why people watch TV: implications for the future of interactive television. Advertising Res 35(6):9–18

  41. Logan RJ, Lenzi L (1995) Innovations in RCA user interface design. Design Manag J 6(4):16–20

  42. Logan RJ, Augaitis S, Renk T (1994) Design of simplified television remote controls: a case for behavioral and emotional usability. In: Proceedings of the human factors and ergonomics society 38th annual meeting, Santa Monica, 24–28 October, pp 365–369

  43. Mace RL, Hardie GJ, Place JP (1991) Accessible environments: toward universal design. In: Preiser W, Vischer J, White E (eds) Design interventions: toward a more humane architecture. Van Nostrand Reinhold, New York

  44. Mane A, Boyce S, Karis D, Yankelovich N (1996) Designing the user interface for speech recognition applications. ACM SIGCHI Bull 28:29–34

  45. Marshall JP (1992) A manufacturing application of voice recognition for assembly of aircraft wire harnesses. In: Proceedings of speech tech/voice systems worldwide. Media Dimensions, New York

  46. Martin GL (1989) The utility of speech input in user-computer interfaces. Int J Man Machine Stud 30:355–375

  47. Multimodal interaction for information services (2002) Available online at: http://www.ida.liu.se/~nlplab/mifis/. Cited May 2004

  48. Neilson IE, Lee J (1994) Conversations with graphics: implications for the design of natural language/graphics interfaces. Int J Hum Comput Stud 40:509–541

  49. Oviatt S (1999) Mutual disambiguation of recognition errors in a multimodal architecture. In: Proceedings of the SIGCHI conference on human factors in computing systems. ACM Press, pp 576–583

  50. Oviatt S (2000) Multimodal system processing in mobile environments. In: Proceedings of the 13th annual ACM symposium on user interface software and technology (UIST’2000), San Diego, 5–8 November. ACM Press, New York, pp 21–30

  51. Oviatt S (2002) Multimodal interfaces. In: Jacko J, Sears A (eds) The human–computer interaction handbook: fundamentals, evolving technologies and emerging applications, Chap 14. Lawrence Erlbaum, Mahwah, pp 286–304

  52. Portolan N, Nael M, Renoullin JL, Naudin S (1999) Will we speak to our TV remote control in the future? In: Proceedings of the 17th international symposium on human factors in telecommunication, HFT’99, Copenhagen

  53. Renaud K, Cooper R (2000) Feedback in human–computer interaction: characteristics and recommendations. In: South African institute of computer scientists and information technologists. Annual research conference, Cape Town, pp 105–114

  54. Rice M (2003) A study of television and visual impairment: prospects for the accessibility of interactive television. In: Proceedings of European conference on interactive television: from viewers to actors? EuroITV03, Brighton, UK, pp 115–116

  55. Robertson S, Wharton C, Ashworth C, Franzke M (1996) Dual device user interface design: PDAs and interactive television. In: Proceedings of the SIGCHI conference on human factors in computing systems. ACM Press, pp 79–86

  56. Shneiderman B (1983) Direct manipulation: a step beyond programming languages. IEEE Comput 16:57–69

  57. Shneiderman B (1992) Designing the user interface: strategies for effective computer interaction, 2nd edn. Addison-Wesley, Reading

  58. Smyth B, Cotter P, Ryan J (2002) Evolving the personalized EPG—an alternative architecture for the delivery of DTV services. In: Proceedings of the 2nd workshop on personalization in future TV, 2nd international conference on adaptive hypermedia and adaptive web systems, Malaga

  59. Stephanidis C (ed), Salvendy G, Akoumianakis D, Bevan N, Brewer J, Emiliani PL, Galetsas A, Haataja S, Iakovidis I, Jacko J, Jenkins P, Karshmer A, Korn P, Marcus A, Murphy H, Stary C, Vanderheiden G, Weber G, Ziegler J (1998) Toward an information society for all: an international R&D agenda. Int J Hum Comput Interact 10(2):107–134

  60. The Killer App is TV: designing the digital tv interface. In: ERGO/GERO human factors science. Available online at: http://www.ergogero.com/pages/digitaltv.html. Cited May 2004

  61. Quesenbery W, Reichart T (1997) Designing for interactive television. Available online at: http://www.wqusability.com/articles/itv-design.html. Cited May 2004

  62. Zaslow J (2002) If TiVo thinks you are gay, here’s how to set it straight. Wall Street J Online

  63. Van Dijk J, De Vos L (2001) Searching for the Holy grail: images of interactive television. New Media Soc 3(4):443–465

  64. Visick D, Johnson P, Long J (1984) The use of simple speech recognizers in industrial applications. In: Proceedings of INTERACT’84, London

  65. Walker MA (1989) Natural language in a desktop environment. In: Salvendy G, Smith MJ (eds) Designing and using human–computer interfaces and knowledge based systems. Elsevier, Amsterdam, pp 502–509

  66. Walker MA, Whittaker S (1990) Mixed initiative in dialogue: an investigation into discourse segmentation. In: Proceedings of 28th annual meeting of the ACL, Pittsburgh, pp 70–79

  67. Westerink JHDM, van der Korst M, Roberts G (1998) Evaluating the use of pictographical representations for TV menus. In: Proceedings of CHI 98 conference summary on human factors in computing systems, Los Angeles. ACM Press, pp 217–218

Acknowledgements

This work is the result of a project on multimodal interaction for information services, supported by Nokia Home Communications, Santa Anna IT Research Institute, SITI, and VINNOVA [47].

Author information

Corresponding author

Correspondence to Aseel Berglund.

About this article

Cite this article

Berglund, A., Johansson, P. Using speech and dialogue for interactive TV navigation. Univ Access Inf Soc 3, 224–238 (2004). https://doi.org/10.1007/s10209-004-0106-x

  • DOI: https://doi.org/10.1007/s10209-004-0106-x

Keywords

Navigation