International Journal of Speech Technology, Volume 8, Issue 2, pp 203–219

A Context-Aware Language Model for Spoken Query Retrieval



This paper addresses the design and development of a spoken query retrieval (SQR) system for accessing large document databases by voice. The main challenge is to identify and address the adaptation and scalability issues that arise when integrating automatic speech recognition (ASR) with information retrieval (IR). The paper presents a Context-Aware Language Model (CALM) framework that supports voice-based retrieval from large document databases, and discusses findings from a research study that used the framework.


spoken query retrieval, language models, automatic speech recognition, speech user interfaces





Copyright information

© Springer Science + Business Media, Inc. 2005

Authors and Affiliations

  1. Department of Computer Science & Software Engineering, Auburn University, AL, USA
