Abstract
This paper presents an approach to automatic multi-document summarization applied to short message contextualization, in particular tweet contextualization. The proposed method is based on named entity recognition, part-of-speech weighting, and sentence quality measurement. In contrast to previous research, we introduce an algorithm for smoothing from the local context. Our approach exploits the topic-comment structure of a text. Moreover, we developed a graph-based algorithm for sentence reordering. The method was evaluated at the INEX/CLEF Tweet Contextualization track, and we report the evaluation results over the four years of the track. The method was also adapted to snippet retrieval and query expansion. The evaluation results indicate good performance of the approach.
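The abstract does not spell out how smoothing from the local context works. As an illustration only, the sketch below assumes that each sentence already has a relevance score and that smoothing means blending that score with the scores of neighbouring sentences, with weights that decrease linearly with distance. The function name `smooth_scores`, the `window` parameter, and the weighting scheme are assumptions made for this example, not the formulation used in the paper.

```python
from typing import List


def smooth_scores(scores: List[float], window: int = 1) -> List[float]:
    """Blend each sentence score with the scores of nearby sentences.

    Assumed scheme (hypothetical, not taken from the paper): a sentence at
    distance d from the target gets weight (window + 1 - d), so the target
    itself has the largest weight. Weights are renormalised at document
    boundaries, where fewer neighbours are available.
    """
    smoothed = []
    n = len(scores)
    for i in range(n):
        total = 0.0
        norm = 0.0
        # Visit every sentence within `window` positions of sentence i.
        for j in range(max(0, i - window), min(n, i + window + 1)):
            weight = window + 1 - abs(i - j)
            total += weight * scores[j]
            norm += weight
        smoothed.append(total / norm)
    return smoothed


if __name__ == "__main__":
    # A relevant sentence surrounded by low-scoring ones lifts its neighbours.
    print(smooth_scores([0.0, 1.0, 0.0], window=1))
```

With this kind of scheme, a sentence that scores poorly on its own but sits next to highly relevant sentences receives a higher smoothed score, which is one plausible reading of using the local context.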
L. Ermakova: supported by the French Embassy in Russia (Ambassade de France en Russie), joint-supervision (cotutelle) doctoral scholarship.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Ermakova, L. (2015). A Method for Short Message Contextualization: Experiments at CLEF/INEX. In: Mothe, J., et al. Experimental IR Meets Multilinguality, Multimodality, and Interaction. CLEF 2015. Lecture Notes in Computer Science, vol 9283. Springer, Cham. https://doi.org/10.1007/978-3-319-24027-5_38
DOI: https://doi.org/10.1007/978-3-319-24027-5_38
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-24026-8
Online ISBN: 978-3-319-24027-5
eBook Packages: Computer Science, Computer Science (R0)