Discovering Volatile Events in Your Neighborhood: Local-Area Topic Extraction from Blog Entries

  • Conference paper
Information Retrieval Technology (AIRS 2009)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5839)

Abstract

This paper presents a method for detecting occasional or volatile local events using topic extraction technologies. This is a new application of topic extraction that general location-based services have not addressed. A two-level hierarchical clustering method was applied to topics and their transitions, using time-series blog entries collected with search queries that include place names. In experiments using 764 events from 37 locations in Tokyo and its vicinity, our method achieved 77.0% event findability. The number of blog entries in urban areas was found to be sufficient for topic extraction, and the proposed method could extract typical volatile events, such as performances by music groups, and places of interest, such as popular restaurants.
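
The abstract does not spell out the clustering procedure, so the following is only a minimal sketch of what a two-level scheme over time-series entries could look like, not the authors' implementation. It assumes whitespace-tokenized English text (the original blog entries would need a Japanese tokenizer), raw term counts, centroid-style agglomerative merging with cosine similarity, and an arbitrary similarity threshold of 0.4. The function names `cosine`, `agglomerate`, and `two_level_topics` and the sample entries are hypothetical.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def agglomerate(vectors, threshold):
    """Merge the most similar pair of clusters until no pair reaches the
    threshold. Cluster vectors are summed term counts (an assumed linkage).
    Returns a list of clusters, each a list of input indices."""
    clusters = [[i] for i in range(len(vectors))]
    centroids = [Counter(v) for v in vectors]
    while len(clusters) > 1:
        best_sim, best_i, best_j = threshold, None, None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                s = cosine(centroids[i], centroids[j])
                if s >= best_sim:
                    best_sim, best_i, best_j = s, i, j
        if best_i is None:
            break
        clusters[best_i] += clusters[best_j]
        centroids[best_i] = centroids[best_i] + centroids[best_j]
        del clusters[best_j], centroids[best_j]
    return clusters


def two_level_topics(entries, threshold=0.4):
    """entries: iterable of (day, text) pairs.
    Level 1 clusters the entries of each day into day-level topics.
    Level 2 clusters those day-level topics across days.
    Returns, for each cross-day topic, the sorted list of days it spans."""
    by_day = {}
    for day, text in entries:
        by_day.setdefault(day, []).append(Counter(text.lower().split()))

    day_topics = []  # (day, summed term vector of one day-level topic)
    for day in sorted(by_day):
        vecs = by_day[day]
        for members in agglomerate(vecs, threshold):
            day_topics.append((day, sum((vecs[m] for m in members), Counter())))

    chains = agglomerate([vec for _, vec in day_topics], threshold)
    return [sorted({day_topics[k][0] for k in chain}) for chain in chains]


# Toy data (invented for illustration): entries retrieved for one place name.
entries = [
    (1, "jazz band live shibuya"),
    (1, "shibuya jazz live tonight"),
    (1, "ramen shop shibuya queue"),
    (2, "jazz band live encore"),
    (2, "ramen shop new menu"),
    (3, "ramen shop queue again"),
]
topics = two_level_topics(entries)
print(topics)
```

In this toy run, a topic spanning few days would correspond to a volatile event, while one that persists across all days would look more like a standing place of interest. How the paper actually separates the two is not stated in the abstract.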

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Okamoto, M., Kikuchi, M. (2009). Discovering Volatile Events in Your Neighborhood: Local-Area Topic Extraction from Blog Entries. In: Lee, G.G., et al. Information Retrieval Technology. AIRS 2009. Lecture Notes in Computer Science, vol 5839. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04769-5_16

  • DOI: https://doi.org/10.1007/978-3-642-04769-5_16

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-04768-8

  • Online ISBN: 978-3-642-04769-5

  • eBook Packages: Computer Science (R0)