Document Selection for Extracting Entity and Relationship Instances of Terrorist Events

Sun, Zhen; Lim, Ee-Peng; Chang, Kuiyu; Suryanto, Maggy Anastasia; Gunaratna, Rohan Kumar

doi:10.1007/978-0-387-71613-8_15

Part of the book series: Integrated Series In Information Systems (ISIS, volume 18)

In this chapter, we study the problem of selecting documents from a collection in order to extract information about terrorist events. We represent an event by its entity and relation instances, which very often have to be extracted from multiple documents. We therefore define an information extraction (IE) task, domain-specific event entity and relation extraction, as selecting documents and extracting from them the entity and relation instances relevant to a user-specified event. We adopt domain-specific IE patterns to extract potentially relevant entity and relation instances from documents, and we develop a number of pattern-based document ranking strategies that use the extracted instances to address this task. Each ranking strategy assigns every document a score that estimates the document's contribution to the gain in event-related instances. We conducted experiments on two document collections constructed from two historical terrorism events. The experiments showed that the proposed pattern-based document ranking strategies performed well on this task for document collections of various sizes.
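The idea of scoring a document by its contribution to the gain in event-related instances can be illustrated with a minimal sketch. It is not the chapter's actual ranking strategy, whose details are not given in the abstract. It assumes that pattern-based extraction has already produced a set of instances per document, and it uses a simple greedy rule: pick next the document that adds the most instances not yet collected. The function name `rank_documents`, the `(type, value)` instance encoding, and the `budget` parameter are all hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical encoding of an extracted instance: (instance type, surface value).
Instance = Tuple[str, str]


def rank_documents(extracted: Dict[str, Set[Instance]], budget: int) -> List[str]:
    """Greedily order documents by how many unseen instances each one adds.

    `extracted` maps a document id to the instances extracted from it
    (assumed to be the output of some pattern-based IE step).
    `budget` caps the number of documents selected.
    """
    seen: Set[Instance] = set()
    remaining = dict(extracted)
    order: List[str] = []
    while remaining and len(order) < budget:
        # Score = number of instances not collected so far.
        # Iterating over sorted ids makes ties resolve to the smallest id.
        best = max(sorted(remaining), key=lambda d: len(remaining[d] - seen))
        if not remaining[best] - seen:
            break  # no remaining document contributes anything new
        order.append(best)
        seen |= remaining.pop(best)
    return order


docs = {
    "d1": {("PERSON", "A"), ("PLACE", "X")},
    "d2": {("PERSON", "A")},
    "d3": {("PERSON", "B"), ("PLACE", "X"), ("RELATION", "A-met-B")},
}
print(rank_documents(docs, budget=3))  # -> ['d3', 'd1']
```

In this toy input, `d2` is never selected because everything it contains has already been gathered from `d1`, which is the kind of redundancy a gain-based score is meant to avoid.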

Author information

Authors and Affiliations

Centre for Advanced Information Systems, Nanyang Technological University, Singapore
Zhen Sun, Ee-Peng Lim, Kuiyu Chang & Maggy Anastasia Suryanto
International Center for Political Violence and Terrorism Research Institut, Nanyang Technological University, Singapore
Rohan Kumar Gunaratna

Editor information

Editors and Affiliations

University of Arizona, Tucson, AZ, USA
Hsinchun Chen
Clarion University, Clarion, PA, USA
Edna Reid
The Analysis Corporation, McLean, VA, USA
Joshua Sinai
University of East London, London, UK
Andrew Silke
Lauder School of Government & Diplomacy, Herzliya, Israel
Boaz Ganor

About this chapter

Cite this chapter

Sun, Z., Lim, EP., Chang, K., Suryanto, M.A., Gunaratna, R.K. (2008). Document Selection for Extracting Entity and Relationship Instances of Terrorist Events. In: Chen, H., Reid, E., Sinai, J., Silke, A., Ganor, B. (eds) Terrorism Informatics. Integrated Series In Information Systems, vol 18. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-71613-8_15

DOI: https://doi.org/10.1007/978-0-387-71613-8_15
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-71612-1
Online ISBN: 978-0-387-71613-8
eBook Packages: Business and Economics, Business and Management (R0)
