A Multilevel Text Processing Model of Newsgroup Dynamics

  • Conference paper
Natural Language Processing and Information Systems (NLDB 2002)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2553)

Abstract

We present a multilevel model of discussions in USENET newsgroups that includes the use of statistical and linguistic methods to obtain lexical, semantic and discourse characteristics of the text. We expose constraints that make information extraction and summarization more amenable to analysis at different levels. Our model makes use of posting structure, times of posting, time spans, and length and depth of a thread in order to extract higher-level information on subject matter, interest level, topicality, and discussion trends.
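
The structural features named in the abstract (thread length, thread depth, and time span) can be illustrated with a minimal sketch. The `Posting` record, the `thread_features` function, and the exact definitions used here (length as the number of postings, depth as the longest reply chain, time span as the gap between first and last posting) are assumptions made for illustration; the preview does not state how the authors define or compute them.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class Posting:
    """Hypothetical record for one newsgroup article (not from the paper)."""
    msg_id: str
    parent_id: Optional[str]  # None for the posting that starts the thread
    posted_at: datetime


def thread_features(postings: list[Posting]) -> dict:
    """Compute assumed length, depth, and time span for a single thread."""
    if not postings:
        raise ValueError("a thread needs at least one posting")
    parent = {p.msg_id: p.parent_id for p in postings}

    def depth_of(msg_id: str) -> int:
        # Walk up the reply chain. A parent missing from the data, or a
        # message already visited (malformed cycle), ends the walk.
        seen = {msg_id}
        current = parent[msg_id]
        while current is not None and current in parent and current not in seen:
            seen.add(current)
            current = parent[current]
        return len(seen)

    times = [p.posted_at for p in postings]
    return {
        "length": len(postings),                            # number of postings
        "depth": max(depth_of(p.msg_id) for p in postings),  # longest reply chain
        "time_span": max(times) - min(times),               # first to last posting
    }


# Illustrative data only: a root posting with two replies, one of which
# receives a further reply.
t0 = datetime(2002, 6, 1, 12, 0)
example = [
    Posting("a", None, t0),
    Posting("b", "a", t0 + timedelta(hours=1)),
    Posting("c", "b", t0 + timedelta(hours=5)),
    Posting("d", "a", t0 + timedelta(hours=2)),
]
features = thread_features(example)
```

Features of this kind would only be inputs to the higher-level measures the abstract mentions (interest level, topicality, discussion trends); how the model combines them is not described in the preview.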




Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Sampath, G., Martinovic, M. (2002). A Multilevel Text Processing Model of Newsgroup Dynamics. In: Andersson, B., Bergholtz, M., Johannesson, P. (eds) Natural Language Processing and Information Systems. NLDB 2002. Lecture Notes in Computer Science, vol 2553. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36271-1_19

  • DOI: https://doi.org/10.1007/3-540-36271-1_19

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-00307-6

  • Online ISBN: 978-3-540-36271-5

  • eBook Packages: Springer Book Archive
