Analyzing Big Data

  • Balázs Bodó
  • Bob van de Velde
Chapter

Abstract

This chapter examines how big data and data science methods can support law and policy research with empirical evidence on digital media production and consumption. To this end, we analyze two cases. The simple case concerns the automatic scraping of news media websites to gather data on what news organizations publish. The complex case is Robin, a research infrastructure that allows volunteers to donate their web browsing data stream so that the process of personalized online communication can be studied. We discuss the issues researchers need to consider during the planning, data collection, and analysis phases of big data-based research. We conclude that, despite the limitations, difficulties, and well-justified critique, social scientists, legal scholars, and researchers working in the humanities need to develop individual skills and institutional competencies in big data methods, because data science is quickly becoming an indispensable part of the methodological toolset of these disciplines.
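To make the simple case more concrete, the following is a minimal sketch of the parsing step of a news scraper, written in Python with only the standard library. It is not the chapter's own code. The sample page, the class name `HeadlineParser`, the function `extract_headlines`, and the assumption that headlines sit in `<h2>` elements are all hypothetical choices made for illustration; real news sites differ in markup, and a real scraper would also have to fetch pages and respect each site's terms of use and robots.txt.

```python
from html.parser import HTMLParser


class HeadlineParser(HTMLParser):
    """Collect the text of every <h2> element.

    Assumption (hypothetical): the target site marks headlines with <h2>.
    """

    def __init__(self):
        super().__init__()
        self.headlines = []
        self._in_headline = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        # Start buffering text when a headline element opens.
        if tag == "h2":
            self._in_headline = True
            self._buffer = []

    def handle_data(self, data):
        # Keep only text that appears inside a headline element.
        if self._in_headline:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        # On close, normalize whitespace and store non-empty headlines.
        if tag == "h2" and self._in_headline:
            text = " ".join("".join(self._buffer).split())
            if text:
                self.headlines.append(text)
            self._in_headline = False


def extract_headlines(html):
    """Return the list of headline strings found in an HTML document."""
    parser = HeadlineParser()
    parser.feed(html)
    parser.close()
    return parser.headlines


# Hypothetical stand-in for a downloaded front page.
SAMPLE_PAGE = """
<html><body>
  <h1>Example News</h1>
  <h2><a href="/a">First  story</a></h2>
  <p>Teaser text</p>
  <h2><a href="/b">Second story</a></h2>
</body></html>
"""

if __name__ == "__main__":
    for headline in extract_headlines(SAMPLE_PAGE):
        print(headline)
```

Parsing an in-memory string keeps the sketch runnable offline. In practice the same function could be applied to pages retrieved on a schedule, with each headline stored alongside a timestamp and the source URL so that publication patterns can be analyzed later.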

Further Reading

  1. Bodó, B., Helberger, N., Irion, K., Borgesius Zuiderveen, F. J., Möller, J., van de Velde, B., … de Vreese, C. H. (2017). Tackling the algorithmic control crisis—The technical, legal, and ethical challenges of research into algorithmic agents. Yale Journal of Law & Technology, 19, 133.
  2. Borgesius, F. Z., Gray, J., & Eechoud, M. V. (2015). Open data, privacy, and fair information principles: Towards a balancing framework. Berkeley Technology Law Journal, 30, 2073.
  3. Kitchin, R. (2014). Big data, new epistemologies and paradigm shifts. Big Data & Society, 1(1), 2053951714528481.
  4. Mitchell, R. (2018). Web scraping with Python: Collecting more data from the modern web. Sebastopol: O’Reilly Media.
  5. Shah, D. V., Cappella, J. N., & Neuman, W. R. (2015). Big data, digital media, and computational social science: Possibilities and perils. The ANNALS of the American Academy of Political and Social Science, 659(1), 6–13. https://doi.org/10.1177/0002716215572084.

Copyright information

© The Author(s) 2019

Authors and Affiliations

  • Balázs Bodó (1)
  • Bob van de Velde (2)

  1. Institute for Information Law, University of Amsterdam, Amsterdam, The Netherlands
  2. Lead Architect Data Science, SLTN Inter Access, Hilversum, The Netherlands
