Binary Lexical Relations for Text Representation in Information Retrieval

  • Marco Gonzalez
  • Vera Lúcia Strube de Lima
  • José Valdeni de Lima
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3513)

Abstract

Text representation is crucial for many natural language processing applications. This paper presents an approach to extracting binary lexical relations (BLR) from Portuguese texts in order to represent phrasal cohesion mechanisms. We demonstrate how this automatic strategy may be incorporated into information retrieval systems. Our approach is compared to those using bigrams and noun phrases for text retrieval. The BLR strategy is shown to improve on the best performance in an experimental information retrieval system.

Keywords

Information Retrieval, Noun Phrase, Information Retrieval System, Relative Pronoun, Component Term
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Marco Gonzalez (1, 2)
  • Vera Lúcia Strube de Lima (1)
  • José Valdeni de Lima (2)
  1. PUCRS – Faculdade de Informática, Porto Alegre, Brazil
  2. UFRGS – Instituto de Informática, Porto Alegre, Brazil
