Measuring Social Change Using Text Data: A Simple Distributional Approach

Chapter in: Reconstruction of the Public Sphere in the Socially Mediated Age

Abstract

This paper proposes a simple approach to measuring social change using text data. The approach is based on the idea that any significant change in a society should affect the distribution of the words used in that society. Essentially, we use the total variation distance between the distributions of words in adjacent months as a measure of social change during the latter month. Based on text data from the Nikkei Newspaper from 1989 to 2015, the largest social change observed in Japan during this period took place in March 2011, the month of the Great East Japan Earthquake.
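
The measure described in the abstract can be illustrated with a minimal sketch. It assumes that each month is already available as a non-empty list of word tokens and that month keys are "YYYY-MM" strings, which sort chronologically. The function names and the toy corpus are hypothetical, and any preprocessing or adjustment applied in the chapter itself is not reproduced here.

```python
from collections import Counter


def word_distribution(tokens):
    """Return the relative frequency of each word in a non-empty token list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def total_variation(p, q):
    """Total variation distance: half the L1 distance between p and q."""
    vocabulary = set(p) | set(q)
    return 0.5 * sum(abs(p.get(w, 0.0) - q.get(w, 0.0)) for w in vocabulary)


def monthly_change(tokens_by_month):
    """Map each month (from the second on) to its distance from the previous month."""
    months = sorted(tokens_by_month)  # assumes "YYYY-MM" keys
    dists = {m: word_distribution(tokens_by_month[m]) for m in months}
    return {
        later: total_variation(dists[earlier], dists[later])
        for earlier, later in zip(months, months[1:])
    }


# Toy, made-up corpus purely for illustration (not the Nikkei data).
toy = {
    "2011-01": ["economy", "market", "economy", "election"],
    "2011-02": ["economy", "market", "election", "market"],
    "2011-03": ["earthquake", "tsunami", "earthquake", "economy"],
}
change = monthly_change(toy)
```

In this toy corpus the value attached to "2011-03" is larger than the value attached to "2011-02", because the March vocabulary shares only one word with February. The distance always lies between 0 (identical distributions) and 1 (no words in common).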

Notes

  1. Zhao et al. (2011) used LDA to detect topics in text data from the New York Times over a four-month period.

  2. See Atefeh and Khreich (2015), Goswami and Kumar (2016), Cordeiro and Gama (2016), and Hasan et al. (2017) for recent surveys on event detection in Twitter data.

  3. The data set was purchased from Nikkei Media Marketing, Inc.

  4. http://taku910.github.io/mecab/.

  5. Hiragana is the primary Japanese syllabary.

  6. In fact, we directly computed \(\{n_{w,t}\}_{w \in W, t \in T}\) from the raw text data without explicitly constructing \(D_{t}\).

References

  • Aggarwal, C.C., and K. Subbian. 2012. Event detection in social streams. In Proceedings of the 2012 SIAM International Conference on Data Mining, ed. J. Ghosh, H. Liu, I. Davidson, C. Domeniconi, and C. Kamath, 624–635.

  • AlSumait, L., D. Barbará, and C. Domeniconi. 2008. On-line LDA: adaptive topic models for mining text streams with applications to topic detection and tracking. In Eighth IEEE International Conference on Data Mining, ed. F. Giannotti, D. Gunopulos, F. Turini, C. Zaniolo, N. Ramakrishnan, and X. Wu, 3–12.

  • Andrade, M.A., and A. Valencia. 1998. Automatic extraction of keywords from scientific text: application to the knowledge domain of protein families. Bioinformatics 14: 600–607.

  • Antadze, N., and F.R. Westley. 2012. Impact metrics for social innovation: barriers or bridges to radical change? Journal of Social Entrepreneurship 3: 133–150.

  • Atefeh, F., and W. Khreich. 2015. A survey of techniques for event detection in Twitter. Computational Intelligence 31: 132–164.

  • Bee Dagum, E., and S. Bianconcini. 2016. Seasonal Adjustment Methods and Real Time Trend-Cycle Estimation. Switzerland: Springer.

  • Blei, D.M., A.Y. Ng, and M.I. Jordan. 2003. Latent Dirichlet allocation. Journal of Machine Learning Research 3: 993–1022.

  • Cordeiro, M., and J. Gama. 2016. Online social networks event detection: a survey. In Solving Large Scale Learning Tasks: Challenges and Algorithms: Essays Dedicated to Katharina Morik on the Occasion of Her 60th Birthday, ed. S. Michaelis, N. Piatkowski, and M. Stolpe, 2–41. Switzerland: Springer International Publishing.

  • Dzogang, F., T. Lansdall-Welfare, F.N. Team, and N. Cristianini. 2016. Discovering periodic patterns in historical news. PLoS ONE 11(11): e0165736.

  • Findley, D.F., B.C. Monsell, W.R. Bell, M.C. Otto, and B.-C. Chen. 1998. New capabilities and methods of the X-12-ARIMA seasonal-adjustment program. Journal of Business & Economic Statistics 16: 127–152.

  • Garonna, P., and U. Triacca. 1999. Social change: measurement and theory. International Statistical Review 67: 49–62.

  • Goodwin, R. 2009. Changing Relations: Achieving Intimacy in a Time of Social Transition. Cambridge, UK: Cambridge University Press.

  • Goswami, A., and A. Kumar. 2016. A survey of event detection techniques in online social networks. Social Network Analysis and Mining 6: 107.

  • Griffiths, T.L., and M. Steyvers. 2004. Finding scientific topics. Proceedings of the National Academy of Sciences 101: 5228–5235.

  • Hasan, M., M.A. Orgun, and R. Schwitter. 2017. A survey on real-time event detection from the Twitter data stream. Journal of Information Science 2017: 1–21.

  • Livingstone, S. 2002. The changing social landscape. In Handbook of New Media: Social Shaping and Social Consequences of ICTs, ed. L.A. Lievrouw, and S. Livingstone, 17–21. London: Sage.

  • Phillips, F. 2011. The state of technological and social change: impressions. Technological Forecasting & Social Change 78: 1072–1078.

  • Sayyadi, H., M. Hurst, and A. Maykov. 2009. Event detection and tracking in social streams. In Proceedings of the International Conference on Weblogs and Social Media, 311–314.

  • Swan, R., and J. Allan. 2000. Automatic generation of overview timelines. In Proceedings of the 23rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, ed. N.J. Belkin, M.-K. Leong, and P. Ingwersen, 49–56.

  • U.S. Census Bureau. 2011. X-12-ARIMA Reference Manual, Version 0.3.

  • Wang, X., and A. McCallum. 2006. Topics over Time: A non-Markov continuous-time model of topical trends. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, 424–433.

  • Yang, Y., T. Pierce, and J. Carbonell. 1998. A study of retrospective and on-line event detection. In Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, ed. L. Ungar, M. Craven, and D. Gunopulos, 28–36.

  • Zhao, W.X., J. Jiang, J. Weng, J. He, E.-P. Lim, H. Yan, and X. Li. 2011. Comparing Twitter and traditional media using topic models. In Advances in Information Retrieval: 33rd European Conference on IR Research, ECIR 2011, ed. P. Clough, C. Foley, C. Gurrin, G.J.F. Jones, W. Kraaij, H. Lee, and V. Murdock, 338–349.

Acknowledgements

Financial support from the Japan Society for the Promotion of Science (“Topic-Setting Program to Advance Cutting-Edge Humanities and Social Sciences Research”; KAKENHI 15H05729) is gratefully acknowledged.

Corresponding author

Correspondence to Takashi Kamihigashi.

Copyright information

© 2017 Springer Nature Singapore Pte Ltd.

Cite this chapter

Kamihigashi, T., Seki, K., Shibamoto, M. (2017). Measuring Social Change Using Text Data: A Simple Distributional Approach. In: Endo, K., Kurihara, S., Kamihigashi, T., Toriumi, F. (eds) Reconstruction of the Public Sphere in the Socially Mediated Age. Springer, Singapore. https://doi.org/10.1007/978-981-10-6138-7_8

  • DOI: https://doi.org/10.1007/978-981-10-6138-7_8

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-6137-0

  • Online ISBN: 978-981-10-6138-7

  • eBook Packages: Computer Science, Computer Science (R0)
