Recording the Web

  • Janne Nielsen
Chapter

Abstract

The world wide web is simultaneously characterized by permanence and ephemerality. The changeability of the web is accompanied by various recording practices, including tracking and web archiving, and the web is thus a place not only for communication but also for recording and registration. This chapter contributes to theorizing about recording as a central feature of digital media by offering insights into the characteristics of the web, of prominent tracking technologies, and of the archived web in web archives. It argues for the necessity of historical studies of tracking and shows how web archives are fundamental for such studies because the web itself is not an archive of the web of the past.

Copyright information

© The Author(s) 2018

Authors and Affiliations

  • Janne Nielsen, Department of Media and Journalism Studies, Aarhus University, Aarhus, Denmark
