Thursday, October 6, 2011

Digital Preservation at the Parliamentary Archives

At the Parliamentary Archives we’re working to develop a digital preservation capability for the UK Parliament. For the Day of Digital Archives, I’ve highlighted some of the things we’re working on this week, which will hopefully be representative of the kinds of issue many digital archives are addressing.

Our ambition is not only to be able to provide an equivalent level of service – in terms both of collection care and user access – for Parliament’s digital records as we have always done for paper and parchment, but also to support innovative ways of creating, managing and exploiting those information resources. We are very much a corporate archive: we preserve and make available the records of our own organisation or, in this case, the two separate institutions of the House of Commons and the House of Lords. As with most memory institutions, our digital preservation activities are being driven by, and hopefully anticipating, new digital ways of working, and the new kinds of information resource which result from these. For example, having the capability to preserve digital records with the same assurance as paper is a prerequisite for moving to digital-only publication and management of information.

We formally started work on our digital preservation project in 2008. As well as the major task of setting up the technical infrastructure required - at the heart of which will be our digital repository - the project is also tackling softer, but equally important, challenges, such as developing procedures and standards, training staff and raising awareness.

We’ve recently completed procurement of our repository software, and are now in the midst of detailed design work. The initial version of our repository will be operational by mid-2012, with further enhancement planned over the following years, moving to a fully business-as-usual basis by 2015.

So what are we doing this week?

· Ingest and delivery: We’re currently doing a lot of analysis to understand exactly how we’ll ingest different types of content into the repository, and how to present that content back to our users. Our collecting scope is essentially defined as any information created by Parliament which is considered to have permanent historical value; much will be the digital equivalent of information previously produced on paper – business records, published information about the work of both Houses, building plans etc. – but we will also receive entirely new kinds of information, such as websites. As well as born-digital, we’ll be collecting digitised surrogates of physical items. In each case, we need to identify how it will come to us, with what metadata, and how users will expect to see it. Subject to technical limitations, we aim to make all the open records we hold available online.

· Describing content: Closely related to this, we’re doing a lot of work to identify the metadata we need to describe the objects in our catalogue and manage them in the repository, and to map this to the metadata we can expect to receive on ingest. This will also enable us to identify any gaps, and think about how to fill them. We’re also thinking about some of the challenges of describing digital records. All archival records which we store in our digital repository will be catalogued as part of our wider collection, through our online catalogue Portcullis. We follow ISAD(G), the international standard for archival cataloguing, and we don’t want to use different approaches to describe analogue and digital records: our catalogue should be format-agnostic. However, ISAD(G) was developed in a pre-digital world, and its application to digital objects requires some careful thought.

· Characterisation tools: I’m in the middle of preparing a presentation on characterisation (and especially file format identification) tools for the GOPORTIS Digital Preservation Summit in Hamburg later this month. I would be really interested to know what people think about the current crop of tools and, in particular, where the major gaps are, in terms both of format coverage and tool functionality.

· Staff seminar: Today we’re starting to plan a seminar on our digital preservation project for all staff at the House of Lords. Setting up a functioning repository is only half the story: we also need to ensure our potential depositors and users are aware of it, and understand its implications. We’re therefore putting a lot of effort into telling people about the project and what we’re aiming to achieve, and working with them to understand how it can help them. If anyone has useful tips on approaches which have worked for them, we’d be very interested to hear them – perhaps relating the institutional issues to personal data such as music or photos, or getting people to think about the digital information they rely on day to day.

· Web archiving: We’re in the midst of QA-ing the latest crawl of the Parliamentary web estate. We’ve been archiving this since 2009, taking thrice-yearly snapshots of c. 30 websites, and the resultant collection is now available online. The results of the QA process are invaluable in feeding back to our web team, so that we can prioritise improvements to the sustainability of the site.

· Demonstrating the benefits: We have a conundrum. Many of the benefits we expect our repository to deliver will only be demonstrable over timescales measured in decades, and indeed centuries. By definition, we can only show that we are able to maintain long-term access to our records... well, over the long term. However, we need to be able to demonstrate some evidence of success much more quickly, not least to those who are funding our activities. So how do you demonstrate that a long-term repository is fit for purpose over the short term? Perhaps the best we can do today is to use some of the emerging Trusted Digital Repository standards (such as TRAC, DRAMBORA and the Data Seal of Approval) to show that we meet current best practice. We’re therefore looking at how we might apply such standards.
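
To make the metadata gap analysis described under "Describing content" a little more concrete, here is a minimal sketch of the idea. It is purely illustrative and is not our actual schema or tooling: the field names, the Dublin Core-style ingest keys and the `find_gaps` helper are all assumptions made up for the example. The only grounded part is the list of six elements that ISAD(G) treats as essential for exchanging descriptions.

```python
# Illustrative only: all field names and the mapping below are hypothetical.

# The six elements ISAD(G) identifies as essential for international exchange.
ISADG_ESSENTIAL = [
    "reference_code",
    "title",
    "creator",
    "dates",
    "extent",
    "level_of_description",
]

# Assumed mapping from metadata supplied on ingest to catalogue elements.
INGEST_TO_CATALOGUE = {
    "dc:title": "title",
    "dc:creator": "creator",
    "dc:date": "dates",
}


def find_gaps(received):
    """Return the essential elements that the received metadata cannot populate."""
    populated = {
        INGEST_TO_CATALOGUE[key]
        for key, value in received.items()
        if key in INGEST_TO_CATALOGUE and value  # ignore unmapped or empty fields
    }
    return [element for element in ISADG_ESSENTIAL if element not in populated]


if __name__ == "__main__":
    sample = {"dc:title": "Minutes", "dc:date": "2011-10-06", "dc:creator": ""}
    print(find_gaps(sample))
```

Anything the function returns would need to be supplied some other way, for example by the cataloguer or by extracting it from the files themselves.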

Follow us @UKParlArchives or via #digitalarchivesday if you’d like to know more.
