A Local-Global LDA Model for Discovering Geographical Topics from Social Media

  • Conference paper
  • In: Web and Big Data (APWeb-WAIM 2017)

Abstract

Micro-blogging services can track users’ geo-locations when users check in at places or use geo-tagging, which implicitly reveals their locations. This “geo tracking” can help find topics triggered by particular events in particular regions. However, discovering such topics is challenging because of the large volume of noisy messages (e.g., daily conversations). This paper proposes a method for modeling geographical topics that filters out irrelevant words by weighting them differently in the local and global contexts. The method is based on the Latent Dirichlet Allocation (LDA) model, but each word is generated from either a local or a global topic distribution according to its generation probabilities. We evaluated the model on data collected from Weibo, which is currently the most popular micro-blogging service among Chinese users. The results show that our method outperforms the baseline methods on several metrics, including model perplexity, two kinds of entropy, and the KL-divergence of the discovered topics.
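
The local/global word generation described in the abstract can be illustrated with a small sampling sketch. This is not the authors' model: the abstract does not give the parameterization, so the sketch assumes a per-word Bernoulli switch with probability `p_local`, one local and one global topic distribution, and separate topic-word tables drawn from symmetric Dirichlet priors. All names (`sample_dirichlet`, `generate_message`, `p_local`, and so on) are hypothetical.

```python
import random


def sample_dirichlet(rng, dim, concentration):
    """Draw one sample from a symmetric Dirichlet via normalized gamma draws."""
    draws = [rng.gammavariate(concentration, 1.0) for _ in range(dim)]
    total = sum(draws)
    return [d / total for d in draws]


def sample_index(rng, probs):
    """Draw an index from a categorical distribution given its probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def generate_message(rng, n_words, p_local,
                     theta_local, theta_global, phi_local, phi_global):
    """Generate word ids for one message.

    ASSUMPTION: each word flips a coin with probability p_local to decide
    whether it comes from the local or the global topic distribution.
    Returns a list of (word_id, topic_id, source) tuples.
    """
    message = []
    for _ in range(n_words):
        is_local = rng.random() < p_local
        theta = theta_local if is_local else theta_global
        phi = phi_local if is_local else phi_global
        topic = sample_index(rng, theta)       # pick a topic
        word = sample_index(rng, phi[topic])   # pick a word from that topic
        message.append((word, topic, "local" if is_local else "global"))
    return message


# Toy sizes and hyperparameters, chosen only for illustration.
N_TOPICS, VOCAB_SIZE = 3, 8
rng = random.Random(0)
theta_local = sample_dirichlet(rng, N_TOPICS, 0.5)
theta_global = sample_dirichlet(rng, N_TOPICS, 0.5)
phi_local = [sample_dirichlet(rng, VOCAB_SIZE, 0.5) for _ in range(N_TOPICS)]
phi_global = [sample_dirichlet(rng, VOCAB_SIZE, 0.5) for _ in range(N_TOPICS)]

message = generate_message(rng, 20, 0.6,
                           theta_local, theta_global, phi_local, phi_global)
```

Under these assumptions, a word that is common everywhere (daily conversation) would tend to be explained by the global side, while region-specific words would be explained by the local side. Inference of the switch and the topics from real messages is a separate step that this sketch does not cover.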

Author information

Corresponding author

Correspondence to Yongkun Wang.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Qiang, S., Wang, Y., Jin, Y. (2017). A Local-Global LDA Model for Discovering Geographical Topics from Social Media. In: Chen, L., Jensen, C., Shahabi, C., Yang, X., Lian, X. (eds) Web and Big Data. APWeb-WAIM 2017. Lecture Notes in Computer Science, vol 10366. Springer, Cham. https://doi.org/10.1007/978-3-319-63579-8_3

  • DOI: https://doi.org/10.1007/978-3-319-63579-8_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-63578-1

  • Online ISBN: 978-3-319-63579-8

  • eBook Packages: Computer Science, Computer Science (R0)
