Text Segmentation and Event Detection

Aggarwal, Charu C.

doi:10.1007/978-3-319-73531-3_14

Abstract

“To improve is to change; to be perfect is to change often.”—Winston Churchill

Notes

1. DARPA stands for Defense Advanced Research Projects Agency, which is an agency of the United States Department of Defense. It is responsible for the development of emerging technologies for use by the military, and often funds academic research efforts.

Author information

Authors and Affiliations

IBM T. J. Watson Research Center, Yorktown Heights, NY, USA
Charu C. Aggarwal

About this chapter

Cite this chapter

Aggarwal, C.C. (2018). Text Segmentation and Event Detection. In: Machine Learning for Text. Springer, Cham. https://doi.org/10.1007/978-3-319-73531-3_14

DOI: https://doi.org/10.1007/978-3-319-73531-3_14
Published: 20 March 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-73530-6
Online ISBN: 978-3-319-73531-3
eBook Packages: Computer Science, Computer Science (R0)
