Query Modulation For Web-Based Question Answering

Advances in Open Domain Question Answering

Part of the book series: Text, Speech and Language Technology ((TLTB,volume 32))

The Web is becoming one of the largest repositories of information and knowledge. Many large-scale search engines (Google, Fast, Northern Light, etc.) have emerged to help users find information. In this paper, we study how to use these existing search engines effectively to mine the Web and discover the “correct” answers to factual natural language questions. We propose a probabilistic algorithm called QASM (Question Answering using Statistical Models) that learns the best query paraphrase of a natural language question. We validate our approach on both local and web search engines using questions from the TREC evaluation.

© 2008 Springer

Radev, D.R. et al. (2008). Query Modulation For Web-Based Question Answering. In: Strzalkowski, T., Harabagiu, S.M. (eds) Advances in Open Domain Question Answering. Text, Speech and Language Technology, vol 32. Springer, Dordrecht. https://doi.org/10.1007/978-1-4020-4746-6_9
