Phrase Pair Classification for Identifying Subtopics

  • Conference paper
Advances in Information Retrieval (ECIR 2012)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7224)

Abstract

Automatic identification of subtopics for a given topic is desirable because it eliminates the need for manual construction of domain-specific topic hierarchies. In this paper, we design features based on corpus statistics and use them to build a classifier that identifies (subtopic, topic) links between phrase pairs. We combine these features with commonly used syntactic patterns to classify phrase pairs from datasets in Computer Science and WordNet. In addition, we show a novel application of our is-a-subtopic-of classifier to query expansion in Expert Search and compare it with pseudo-relevance feedback.

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Das, S., Mitra, P., Giles, C.L. (2012). Phrase Pair Classification for Identifying Subtopics. In: Baeza-Yates, R., et al. Advances in Information Retrieval. ECIR 2012. Lecture Notes in Computer Science, vol 7224. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28997-2_48

  • DOI: https://doi.org/10.1007/978-3-642-28997-2_48

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-28996-5

  • Online ISBN: 978-3-642-28997-2

  • eBook Packages: Computer Science, Computer Science (R0)
