
Learning Question Focus and Semantically Related Features from Web Search Results for Chinese Question Classification

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4182)

Abstract

Recently, machine learning techniques such as support vector machines have been employed for question classification. However, these techniques depend heavily on the availability of large amounts of training data and may struggle when facing new kinds of questions from real users on the Web. To mitigate the lack of sufficient training data, in this paper we present a simple learning method that exploits Web search results to collect more training data automatically from a few seed terms (question answers). In addition, we propose a novel semantically related feature model (SRFM), which uses question focuses and their semantically related features, learned from the larger set of collected training data, to help determine the question type. Our experimental results show that the proposed learning method achieves better classification performance than the bigram language modeling (LM) approach on questions with untrained question focuses.
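For readers unfamiliar with the baseline named in the abstract, the following is a minimal sketch of a bigram language-model question classifier. It is not the authors' implementation: the character-level tokenization (chosen here to avoid Chinese word segmentation), the add-one smoothing, the class name `BigramLMClassifier`, the type labels, and the toy questions are all assumptions made for illustration. One bigram model is trained per question type, and a question is assigned to the type whose model gives it the highest log-probability.

```python
import math
from collections import Counter, defaultdict

BOS, EOS = "<s>", "</s>"  # sentence boundary markers


def bigrams(question):
    """Character-level bigrams with boundary markers (assumed tokenization)."""
    tokens = [BOS] + list(question) + [EOS]
    return list(zip(tokens, tokens[1:]))


class BigramLMClassifier:
    """One add-one-smoothed bigram model per question type (illustrative only)."""

    def __init__(self):
        self.bigram_counts = defaultdict(Counter)   # type -> counts of (prev, cur)
        self.context_counts = defaultdict(Counter)  # type -> counts of prev
        self.vocab = {EOS}

    def fit(self, labeled_questions):
        for question, qtype in labeled_questions:
            for prev, cur in bigrams(question):
                self.bigram_counts[qtype][(prev, cur)] += 1
                self.context_counts[qtype][prev] += 1
                self.vocab.add(cur)
        return self

    def log_prob(self, question, qtype):
        # +1 reserves probability mass for characters never seen in training.
        v = len(self.vocab) + 1
        total = 0.0
        for prev, cur in bigrams(question):
            num = self.bigram_counts[qtype][(prev, cur)] + 1
            den = self.context_counts[qtype][prev] + v
            total += math.log(num / den)
        return total

    def classify(self, question):
        # Pick the question type whose model scores the question highest.
        return max(self.bigram_counts, key=lambda t: self.log_prob(question, t))


# Made-up toy training data; the labels are hypothetical question types.
TOY_TRAIN = [
    ("誰發明了電話", "PERSON"),
    ("誰寫了紅樓夢", "PERSON"),
    ("哪裡是台灣的首都", "LOCATION"),
    ("玉山在哪裡", "LOCATION"),
]

clf = BigramLMClassifier().fit(TOY_TRAIN)
print(clf.classify("誰發明了電燈"))
print(clf.classify("哪裡是日本的首都"))
```

A model of this kind scores a question only by the character sequences it saw during training, so a question whose focus word never appeared in the training data gets little support beyond smoothing. That is the situation the abstract refers to as questions with untrained question focuses.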

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lin, SJ., Lu, WH. (2006). Learning Question Focus and Semantically Related Features from Web Search Results for Chinese Question Classification. In: Ng, H.T., Leong, MK., Kan, MY., Ji, D. (eds) Information Retrieval Technology. AIRS 2006. Lecture Notes in Computer Science, vol 4182. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11880592_22

  • DOI: https://doi.org/10.1007/11880592_22

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-45780-0

  • Online ISBN: 978-3-540-46237-8

  • eBook Packages: Computer Science, Computer Science (R0)
