Archiving the Web: How to Support Research of Future Heritage?

We would like to invite you to the next CATCH meeting, organized by the project WebART, which will be held on 19 April from 12.00 to 18.00. This meeting is open to any interested party. The venue will be the National Library of the Netherlands (Koninklijke Bibliotheek), The Hague.

Registration at:

Detailed program and schedule:



The web has become the central medium of our time: all our traditional media have become digital, and even our own lives are increasingly taking place ‘on the web’. Preservation and archiving practices haven’t kept pace, resulting in our future heritage being lost to posterity, rapidly and indefinitely. Web archives constantly struggle with the challenges of preserving an ephemeral medium, and have to make crucial decisions on selection policies, storage constraints, and the desired frequency of crawling and harvesting. The evolution of Web-based technologies and services (such as dynamic content, social media, RSS feeds, Tweets, mobile applications and APIs) creates new challenges. The demands of researchers using web archives also evolve rapidly, requiring novel access tools for exploring the existing archived Web strata, or new types of data that are currently not preserved by Web archives. What is the best way forward to make both ends meet?

Keynote speakers:

Helen Hockx-Yu, British Library, “Web archiving and scholarly use of web archives”

This talk will give a general overview of how web archives are used and focus on the UK Web Archive, operated by the British Library: how often it is used and what scholars think of it (based on survey data). It will introduce the various access methods developed by the British Library to encourage scholarly use of archived websites, not only as historical documents but also as datasets for analytics and visualisation. It will also cover the challenges of archiving social media and give an overview of the limited extent to which this is done for the UK Web Archive. Furthermore, it will introduce Twittervane, an application that collects and analyses Twitter feeds and outputs the URLs mentioned in the Tweets. URLs shared on Twitter could potentially point to web resources relevant to web archive collections.

Bernhard Rieder, University of Amsterdam, “This is where we draw the line!”

While often pursuing diverging goals, Internet researchers and archivists are facing similar challenges when engaging the Web and, in particular, Social Networking Services. These challenges link scholarly questions of corpus- and collection-building with technical and often legal and ethical questions that are in many ways dissimilar from the issues encountered in the past. This talk will focus on the Twitter microblogging service and discuss the difficulties posed by a global and increasingly algorithmic medium, as well as the networked contents it enables. I will argue that the problem of demarcation is particularly thorny and detail a number of strategies to approach it conceptually and practically.

William LeFurgy, Library of Congress, “Guiding the Stewardship of Big–Really Big–Digital Collections”

Libraries, archives and other cultural heritage organizations are now facing a conundrum in connection with content that is born digitally on the web. The value of this content for research is apparent, and institutions are increasingly aware of a responsibility to add it to their collections. But preserving digital content requires an infrastructure that is fundamentally different than that used for decades to manage traditional varieties of cultural heritage resources. Implementing the needed skills, tools and practices for collecting and managing digital collections at huge scale requires a new framework of institutional guidelines. This presentation will offer some thoughts on what this framework should look like and describe the major factors that influence its development.

For any questions, please contact:

CATCH secretariat

NWO – Netherlands Organisation for Scientific Research – Division Physical Sciences