UFRGS@CLEF2008: Using Association Rules for Cross-Language Information Retrieval
For UFRGS’s participation in the TEL task at CLEF2008, our aim was to assess the validity of using association rule mining algorithms to find mappings between concepts in a Cross-Language Information Retrieval scenario. Our approach requires a sample of parallel documents to serve as the basis for generating the association rules. The experimental results show that the performance of our approach is not statistically different from the monolingual baseline in terms of mean average precision (MAP). This indicates that association rules can be effectively used to map concepts between languages. We also tested a modification to BM25 that aims to increase the weight of rare terms. The modified version achieved better performance, and the improvement was statistically significant in terms of MAP on our monolingual runs.
Keywords: association rules, experimentation, performance measurement
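To make the idea in the abstract concrete, the following is a minimal sketch of how association rules might be mined from a sample of parallel documents. It is not the implementation used in the experiments. It assumes that each aligned document pair is treated as one transaction, that only one-to-one rules (source term to target term) are generated, and that rules are filtered by support and confidence. The function name `mine_translation_rules`, the thresholds, and the toy English/Spanish terms are all hypothetical.

```python
from collections import Counter
from itertools import product


def mine_translation_rules(pairs, min_support=0.5, min_confidence=0.8):
    """Mine one-to-one rules (source term -> target term) from aligned documents.

    pairs: list of (source_terms, target_terms), one entry per parallel document.
    Returns {source_term: [(target_term, support, confidence), ...]}.
    Hypothetical helper for illustration only.
    """
    n = len(pairs)
    src_count = Counter()   # number of documents containing each source term
    pair_count = Counter()  # number of documents containing both terms

    for src_terms, tgt_terms in pairs:
        src_set, tgt_set = set(src_terms), set(tgt_terms)
        src_count.update(src_set)
        pair_count.update(product(src_set, tgt_set))

    rules = {}
    for (s, t), c in pair_count.items():
        support = c / n                # fraction of documents with both terms
        confidence = c / src_count[s]  # estimate of P(target term | source term)
        if support >= min_support and confidence >= min_confidence:
            rules.setdefault(s, []).append((t, support, confidence))

    # Rank candidate mappings by confidence, then support, then alphabetically.
    for s in rules:
        rules[s].sort(key=lambda r: (-r[2], -r[1], r[0]))
    return rules


# Toy parallel sample (invented data, not taken from the TEL collections).
pairs = [
    (["house", "red"], ["casa", "roja"]),
    (["house", "big"], ["casa", "grande"]),
    (["dog", "big"], ["perro", "grande"]),
    (["house", "dog"], ["casa", "perro"]),
]
rules = mine_translation_rules(pairs)
```

In this toy sample, "house" co-occurs with "casa" in every document where it appears, so the rule passes both thresholds, whereas "red" appears in only one document and is dropped by the support threshold. A query term could then be replaced by the highest-ranked target term in `rules`, under the same assumptions.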