Because this is a historical research project, a large share of the
records I produce are research notes taken after studying multiple
sources. Since my workflow is almost entirely digital, a large portion of
my records are born-digital. When I’m conducting research, I create a Word
document and transcribe passages from my sources verbatim. I also have a
folder of the digital journal articles I’ve cited, all as PDFs, since that is the
format in which they are posted online. I typically brainstorm ideas and a thesis with pen
and paper, but I then create another Word document for these notes so that I can
access them while writing the paper on my computer. Due to the nature of my topic, I have limited
access to the primary source documents I would want to use.
Instead, I can only consult secondary sources that discuss them.
Another type of record I could potentially collect is
scanned images of Lollard Bibles and
Lollard-created records, if they have been scanned and the holding institution allows public
access to the images. These would most likely be uploaded as JPEGs, and I would
keep a folder of them on my computer to refer to.
For this week’s blog post, I felt it was best to go back to
my notes for the course INF 2122, “Digital Preservation and Curation,” in order
to work out how to preserve DOC, PDF, and JPEG files. The first step
when assessing how best to preserve digital records is to determine the
significant properties of the object, which can be identified as the content
(the information conveyed, e.g. text, image, programming code), the context (background
information on its creation, e.g. creator, custodian), the structure (the
arrangement of the component parts, e.g. pagination), the behaviour (essential
functionality, e.g. hypertext links, updating calculations), and the appearance
(how the content appears, e.g. font and size, page layout, colour). These all
relate to the “essence” of a digital record, which is the main component being
preserved.
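To make these five categories concrete, here is a minimal sketch of how they might be written down for a single file. The class name, field names, and sample values are my own illustrative assumptions; they do not come from the course notes or from any formal metadata standard.

```python
from dataclasses import dataclass, asdict
import json


@dataclass
class SignificantProperties:
    """Hypothetical record of the five property categories for one file."""
    filename: str
    content: str      # the information conveyed (text, image, code)
    context: str      # background on creation (creator, custodian)
    structure: str    # arrangement of component parts (pagination)
    behaviour: str    # essential functionality (links, calculations)
    appearance: str   # how the content appears (font, layout, colour)


# Sample values are invented for illustration only.
notes = SignificantProperties(
    filename="research_notes.doc",
    content="Passages transcribed from secondary sources",
    context="Created by the researcher for a course paper",
    structure="Sections per source, in page order",
    behaviour="No links or calculations to preserve",
    appearance="Default font; layout is not essential",
)

# Serialise the record so it can be stored alongside the file itself.
print(json.dumps(asdict(notes), indent=2))
```

Writing the properties out like this makes it easier to check, after a format conversion, whether anything judged essential has been lost.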
The National Archives of the UK has a very comprehensive
guide for preserving digital records, accessible to the public. The National
Archives has developed multiple tools for digital preservation including PRONOM
and DROID. PRONOM is an online registry of information about file formats and their
supporting software, with details on over 1,000 different digital file formats.
DROID is a tool that scans a computer’s hard drive and identifies each file’s format, either
through its file extension or its internal signature, by matching it against entries in PRONOM.
Some file formats have greater longevity than others because of file format
obsolescence. For example, it’s extremely difficult to open a WordPerfect file
these days unless you are able to migrate the file to another format, in which
case you could lose essential behaviour, appearance, or content. Alternatively, you can emulate
an environment in which the file can be opened in its original encoding.
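As a rough illustration of what identifying a file by its internal signature means, the sketch below checks the first few bytes of a file against a handful of well-known signatures for the formats I use. This is not DROID’s code and the table is not PRONOM’s data; the function names and the short signature list are assumptions made for the example.

```python
# A small, illustrative subset of well-known leading byte sequences.
# PRONOM's real signature data is far more detailed than this.
SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"II*\x00", "TIFF (little-endian)"),
    (b"MM\x00*", "TIFF (big-endian)"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "OLE2 compound file (e.g. legacy Word DOC)"),
]


def identify_bytes(header: bytes) -> str:
    """Return a format name based on the leading bytes, or 'unknown'."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    return "unknown"


def identify_file(path) -> str:
    """Read the first 16 bytes of a file and identify its format."""
    with open(path, "rb") as fh:
        return identify_bytes(fh.read(16))


# In-memory samples so the sketch runs without any real files.
samples = {
    "article.pdf": b"%PDF-1.4\n",
    "scan.jpg": b"\xff\xd8\xff\xe0\x00\x10JFIF",
    "notes.doc": b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1" + b"\x00" * 8,
    "mystery.bin": b"\x00\x01\x02\x03",
}
for name, header in samples.items():
    print(name, "->", identify_bytes(header))
```

Checking the signature rather than the extension matters because an extension can be renamed or lost, while the leading bytes stay with the file.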
In general, from what I have gathered in the Digital Preservation and Curation
class, it is best practice to convert DOC files to PDF, and JPEG files to
TIFF, as those formats have greater longevity and are not as prone to format
obsolescence or bit rot. I would also keep multiple copies of my files in
differing formats, retaining the original DOC and JPEG files, in multiple places,
including a cloud service like Dropbox and an external hard drive.
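Keeping copies in several places only helps if the copies stay identical, so one common way to check is to compare checksums. The sketch below is a minimal example using SHA-256 from Python’s standard library; the function names are my own, and the demo writes throwaway files to a temporary folder instead of touching real records.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=65536) -> str:
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(paths) -> bool:
    """True only if every listed copy has the same checksum.

    An empty list returns False, since there is nothing to confirm.
    """
    return len({sha256_of(p) for p in paths}) == 1


if __name__ == "__main__":
    # Demo with throwaway files standing in for a laptop copy and a backup.
    with tempfile.TemporaryDirectory() as folder:
        original = os.path.join(folder, "notes_laptop.doc")
        backup = os.path.join(folder, "notes_backup.doc")
        for path in (original, backup):
            with open(path, "wb") as fh:
                fh.write(b"research notes")
        print("identical:", copies_match([original, backup]))

        # Simulate damage to the backup by changing one byte.
        with open(backup, "wb") as fh:
            fh.write(b"research motes")
        print("identical after damage:", copies_match([original, backup]))
```

Running a check like this on a schedule would show which copy has changed, so that it can be replaced from a copy that still matches.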
Wilson, Andrew. “Significant Properties of Digital Objects.” JISC Significant Properties Workshop, British Library, London, UK, April 7, 2008.
“Preserving Digital Records.” The National Archives. http://www.nationalarchives.gov.uk/information-management/manage-information/preserving-digital-records/
“Selecting File Formats for Long-Term Preservation.” The National Archives, August 2008, http://www.nationalarchives.gov.uk/documents/selecting-file-formats.pdf