A novel retrieval approach reflecting variability of syntactic phrase representation

Journal of Intelligent Information Systems

Abstract

In this paper, we introduce the variability of syntactic phrases and propose a new retrieval approach that reflects the variability of syntactic phrase representation. With a variability measure for a phrase, we can estimate how likely a phrase in a given query is to appear in relevant documents and control the impact of syntactic phrases in a retrieval model. Experimental results over different types of queries and document collections show that our retrieval model based on the variability of syntactic phrases is very effective in terms of retrieval performance, especially for long natural language queries.

Author information

Correspondence to Hae-Chang Rim.

About this article

Cite this article

Song, YI., Han, KS., Kim, SB. et al. A novel retrieval approach reflecting variability of syntactic phrase representation. J Intell Inf Syst 31, 265–286 (2008). https://doi.org/10.1007/s10844-007-0045-0

  • DOI: https://doi.org/10.1007/s10844-007-0045-0
