Microblog Retrieval During Disasters: Comparative Evaluation of IR Methodologies

  • Conference paper
Text Processing (FIRE 2016)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10478)

Abstract

Microblogging sites are important sources of situational information during natural and man-made disasters. Hence, it is important to design and test Information Retrieval (IR) systems that retrieve information from microblogs during disasters. With this perspective, a track focused on microblog retrieval during disaster events was organized at FIRE 2016, the 8th meeting of the Forum for Information Retrieval Evaluation. A collection of about 50,000 microblogs posted during the Nepal earthquake in April 2015 was released, along with a set of seven pragmatic information needs that arise during a disaster situation. The task was to retrieve microblogs relevant to these information needs. Ten teams participated in the task, and fifteen runs were submitted. Evaluating the retrieval methodologies submitted by the participants revealed several challenges associated with microblog retrieval. In this chapter, we describe our experience in organizing the FIRE track on microblog retrieval during disaster events. Additionally, we propose two novel methodologies for this task, which perform better than all the methodologies submitted to the FIRE track.

Notes

  1. http://clef2016.clef-initiative.eu/.

  2. http://research.nii.ac.jp/ntcir/index-en.html.

  3. http://doctorsforyou.org/.

  4. http://www.spadeindia.org/.

  5. Since the different annotators potentially judged different sets of tweets, reporting inter-annotator agreement would not be meaningful under these circumstances.

  6. Note that the Twitter terms and conditions prohibit direct public sharing of tweets. Hence, only the tweet-ids of the tweets were distributed among the participants, along with a Python script with which the tweets could be downloaded via the Twitter API.

  7. https://wordnet.princeton.edu/.

  8. nlp.stanford.edu/software/Stanford-ner-2015-04-20.zip.

  9. https://deeplearning4j.org/word2vec.

  10. https://lucene.apache.org/ (2016, August 20).

  11. http://terrier.org.

  12. The POS tagger included in the Python Natural Language Toolkit was used.

  13. We also tried retrieval with other parts of speech, and observed that forming the query out of nouns, verbs, and adjectives gives the best retrieval performance.

  14. The Gensim implementation of word2vec was used: https://radimrehurek.com/gensim/models/word2vec.html.

  15. We had many ties among the rankings, e.g., the top-ranked tweet for FMT1 and the top-ranked tweet for FMT2 both had the same rank.

  16. nlp.stanford.edu/software/Stanford-ner-2015-04-20.zip.
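
Notes 12 and 13 describe forming the query from the nouns, verbs, and adjectives of a POS-tagged text. The filtering step might look roughly like the sketch below. It assumes Penn Treebank tags (the tag set produced by the default NLTK tagger) and works on already-tagged (token, tag) pairs so that it runs without any model download. The function name, the lowercasing, the de-duplication, and the sample sentence are illustrative assumptions, not details taken from the chapter.

```python
# Penn Treebank tag prefixes for nouns (NN*), verbs (VB*), and adjectives (JJ*).
CONTENT_TAG_PREFIXES = ("NN", "VB", "JJ")


def build_query(tagged_tokens):
    """Return lowercased nouns, verbs, and adjectives in order, without repeats.

    tagged_tokens is a list of (token, tag) pairs, e.g. the output of a POS tagger.
    """
    seen = set()
    terms = []
    for token, tag in tagged_tokens:
        term = token.lower()
        if tag.startswith(CONTENT_TAG_PREFIXES) and term not in seen:
            seen.add(term)
            terms.append(term)
    return terms


# Hypothetical, hand-tagged text used only for illustration.
example = [
    ("What", "WP"), ("resources", "NNS"), ("are", "VBP"), ("available", "JJ"),
    ("in", "IN"), ("the", "DT"), ("affected", "JJ"), ("area", "NN"),
]
print(build_query(example))
```

In practice the (token, tag) pairs would come from a tagger rather than being written by hand, and the resulting terms would be passed to the retrieval system as the query.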

Acknowledgement

We thank the FIRE organizing committee for allowing us to run the track, and all the teams for their participation. This research was partially supported by a grant from the Information Technology Research Academy (ITRA), MeITY, Government of India (Ref. No.: ITRA/15 (58)/Mobile/DISARM/05).

Author information

Corresponding author

Correspondence to Moumita Basu.

Copyright information

© 2018 Springer International Publishing AG

About this paper

Cite this paper

Basu, M., Ghosh, K., Das, S., Bandyopadhyay, S., Ghosh, S. (2018). Microblog Retrieval During Disasters: Comparative Evaluation of IR Methodologies. In: Majumder, P., Mitra, M., Mehta, P., Sankhavara, J. (eds) Text Processing. FIRE 2016. Lecture Notes in Computer Science, vol 10478. Springer, Cham. https://doi.org/10.1007/978-3-319-73606-8_2

  • DOI: https://doi.org/10.1007/978-3-319-73606-8_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-73605-1

  • Online ISBN: 978-3-319-73606-8

  • eBook Packages: Computer Science (R0)
