October 10 is Electronic Records Day and this year the Utah State Archives is featuring several projects from archival institutions that are advancing the preservation and management of electronic records.
The North Carolina State Archives has been capturing web content since 2004, and in 2012 it expanded into the realm of social media. In a partnership with ArchiveSocial, the North Carolina State Archives is developing a searchable database of social media content created by state agencies. Facebook, Instagram, and Twitter content and YouTube videos produced by North Carolina government agencies are available for public access. Although the search function is still in beta testing, it is worth taking the time to explore their social media archive.
The State Library of Virginia is publishing the email of Timothy M. Kaine, the 70th governor of Virginia (2006-2010). Although the library is still processing the 167 gigabytes of data, which contain 1.3 million emails, the messages are being released in batches for public access. All of the emails are searchable by text and organized by mailbox owner. The library’s goal is to create an experience where the user can assume the role of one of Kaine’s administrative officials and “approximate what they saw when they logged into their email accounts.”
The Texas State Library and Archives Commission (TSLAC) recently completed a project in which 26,000 audio tapes of Texas State Senate hearings from 1972-2006 were digitized and placed online for public access. These tapes were often the only record copy of the meetings, and as such, reference use and duplication requests were frequent. Furthermore, regular patron usage and age-related deterioration threatened future access and preservation efforts. Fortunately, an LSTA grant provided TSLAC with the resources to digitize the audio cassettes. The project resulted in a high-quality record copy with easily reproducible copies for research use.
The Utah State Archives is integrating the new, locally developed M-Disc into some of its electronic records projects. Unlike Compact Discs that use an organic polycarbonate layer, M-Discs record information on an inorganic mineral composite layer. Because they can last up to 1,000 years, M-Discs are an excellent medium for long-term electronic records storage. Utah State Archives has begun replacing aging diazo microfilm with M-Disc copies and storing master images for digital archives projects on M-Disc. Furthermore, by storing those master images on M-Disc rather than on a server, the archives is saving IT money and resources for other projects. Since adopting the format, Utah State Archives has created over 700 M-Discs to hold record and reference copies of electronic records as well as master images.
A year ago, we here at Utah State Archives ran an experiment. Could we store electronic records on microfilm? You may well ask how an electronic record could be stored on microfilm and still keep all of its digital attributes. The answer lies in the QR code. Yes, that same little black and white square symbol you can scan with your phone also contains digital information, and the symbol can be microfilmed. Our experiment was to transform an electronic data file into a series of QR codes, transfer those QR codes to an analog storage medium such as microfilm, and then scan them back into an electronic format without losing data integrity.
We began by turning an electronic record into a binary file, uploading it to our server, and calculating its checksum. Then, we had our software convert the binary file to base64, an encoding that represents the data as a long string of text characters. Once the file was in base64, we could chop that single data string into smaller strings and produce a QR code from each of the smaller strings. Since the amount of data a QR code can hold is finite, a series of QR codes was needed to hold all the strings. Once those strings were in a series of QR codes, our software wrote the codes to a PDF document, which was then microfilmed without being printed. After microfilming, we took the finished roll and scanned the frames to convert the QR codes back into digital form. Since nine QR codes were captured per frame, the resulting image required some editing to make each QR code individually recognizable. Subsequently, we uploaded the QR code images back to our server and concatenated the base64 data strings contained in the QR codes back into one base64 string. This base64 string was then converted back into binary form. Finally, we ran a checksum to ensure file integrity had not been compromised by all the data manipulation.
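For readers who want to see the data side of that round trip, here is a minimal Python sketch of the checksum, base64, split, rejoin, and verify steps. It is an illustration only, not the software we used. The QR rendering, PDF, and microfilm steps are left out, so the strings simply pass straight through. The names `encode_to_chunks`, `decode_from_chunks`, and `sha256_hex` are hypothetical, SHA-256 is an assumed checksum algorithm, and the 1,000-character limit per QR code is an arbitrary assumption. Only the nine codes per frame comes from our experiment.

```python
import base64
import hashlib
import math

CHUNK_SIZE = 1000     # assumed characters per QR code (illustrative, not our setting)
CODES_PER_FRAME = 9   # nine QR codes were captured per microfilm frame


def sha256_hex(data: bytes) -> str:
    """Return a checksum for the data (SHA-256 is an assumed choice)."""
    return hashlib.sha256(data).hexdigest()


def encode_to_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[str]:
    """Base64-encode the data and chop the string into QR-sized pieces."""
    text = base64.b64encode(data).decode("ascii")
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


def decode_from_chunks(chunks: list[str]) -> bytes:
    """Concatenate the pieces in order and convert back to binary."""
    return base64.b64decode("".join(chunks))


# Stand-in for an electronic record: 5,120 bytes of sample binary data.
original = bytes(range(256)) * 20
checksum_before = sha256_hex(original)

# Each string in `chunks` would become one QR code.
chunks = encode_to_chunks(original)
frames_needed = math.ceil(len(chunks) / CODES_PER_FRAME)

# In the real workflow the strings would be read back from scanned QR codes.
restored = decode_from_chunks(chunks)
checksum_after = sha256_hex(restored)

print(f"{len(chunks)} QR codes on {frames_needed} frame(s)")
print("integrity preserved:", checksum_before == checksum_after)
```

The order of the strings matters: if the QR codes are reassembled out of sequence, the checksums will not match, which is exactly the kind of damage the final checksum is meant to catch.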
The experiment was both a success and a failure. We were able to create QR codes from binary data and transfer them to a PDF file. We were also able to extract data from QR code images and transform it back to binary. However, the project failed in the microfilming process. Our equipment was unable to film the QR codes with clearly defined edges, resulting in the inability to extract data from individual codes. The problem lay with how many QR codes per frame we were trying to microfilm and the inability of the filming equipment to zoom in on certain sections of the PDF document. We did not try to microfilm only a single QR code per frame, although that would likely have eliminated the fuzzy-edge problem. Although many were involved, Elizabeth Perkes was the mastermind behind our experiment.