The Digital Black Hole: The importance of the preservation of digital born material

Introduction
The current generation of 18-to-35-year-olds has come of age in a very interesting
time. For them, the digital space of the Internet is, to put it simply, a fact of life. The creation of
content and documentation about their lives is happening in a digital realm.
It should be of great concern to historians and keepers of the collective memory that
we are at the threshold of a great conflict over the preservation of the collective memory of an
entire generation. Where libraries and historical societies have undertaken projects that
use digitization to extend preservation efforts, we have the beginnings of a
new preservation strategy. Using digital replicas of still-existing physical artifacts has
furthered preservation by improving access without increasing physical wear and
tear. Yet the profession has been hesitant to jump to the digital as a sole means
of preservation, since previous early-adoption strategies have proven shortsighted. Digital
preservation, whether on CD or DVD media or on hard drives, has not been proven a stable form
of preservation, especially when set against the long record of conserving books, other
print media, and textiles.
Microfilm, microfiche, 5-inch floppy disks, Zip drives, CD-ROMs, DVDs, Blu-ray discs, solid
state drives: these are the footsteps on the road of digital preservation, each one touting the
ability to store more information while taking up less space. While each of these formats has been
put to the test, and some have failed and been replaced, the younger generations have grown up
alongside these technologies, at the same pace. What this also means is that
as time and these technologies move on, we as preservationists and conservators
are faced with a generation that never started with an analog collection. They did not have
photo albums and scrapbooks that were later transferred to Facebook to share with family that
had moved around the country and around the world. They started digital and they remain
digital.
It is of great importance that the library profession, including the fields of preservation,
archives and curation, fully understand this fact and move to set the standard for
preserving this unique generation's memory. As it stands, the profession at large is
taking a step backwards, waiting for some stability in the digital realm before providing a
standard by which to preserve it. It must be understood that there are no books in the digital
sphere. A classic medium that will stand the test of time may come one day, but it will
come only if the library profession steps in to help create it. Currently, the rising digital
generation is turning to service providers (such as Google, YouTube, and Facebook) to house
and keep their memories on a large scale. But that can change, if we are willing to step in and
assist in that change.
Wait a minute, not so fast
While the concerns about the collective memory of this new digital generation, coupled
with currently available technology, might cause (and in some cases have caused, as will be
demonstrated here) individual programs to immediately adopt a new policy of digital
preservation, there are understandable reasons to be cautious.

The history of technology in and of itself is rife with fantastical presentations of the last
item that you will ever need for a particular task.
Nicholson Baker showed a clear example of the rapid adoption of new technology and
the consequences that this can have for the field of preservation. Over the course of his book
Double Fold: Libraries and the Assault on Paper, Baker not only demonstrates the loss of
unique physical artifacts, but also examines the challenge of preservation in and of itself. As
Baker states, "all conservation is preservation, but not all preservation is conservation"
(Baker, p. 108).
Throughout the book, Baker breaks down the rapidly adopted practice of
microfilming, which was meant to address the apparent crisis of library paper rapidly turning to dust. Materials,
particularly physical ones, all face a state of entropy. It is a natural progression. However, as
Baker demonstrated, the widespread belief that books were literally turning to dust and blowing away
was an exaggeration. Books were undergoing the natural decay of time.
Spearheaded in the United States by the Library of Congress, a mass adoption of what was
dubbed "preservation microfilming" took place. A large part of this project involved the large
collections of bound newspaper volumes, and it was a two-step process. The first step was to
take images of the pages of the bound newspapers and place them onto microfilm. The second
step was to throw the newspapers away (Baker). Not only were the newspapers crumbling, but
they were also taking up an inconvenient amount of space on the shelves.
But as time has progressed, the issue of entropy has shown its face again. Much of the
microfilm can suffer the same fate as all film. It degrades due to the chemicals that are used to
create it, and its time frame of degradation is shorter than that of the newspapers and other
physical materials it was replacing. This has been a persistent theme in the modern history
of preservation. New potential tools have arrived, but they have yet to demonstrate as much
stability as the original source material.
There have been a number of high-profile problems with digital preservation efforts that
have cast shadows on the use of digital technology as a first-line tool in the field of preservation.
In 2008 the British Broadcasting Corporation (BBC) approved a plan to move forward with a
project called the Digital Media Initiative (DMI). The scope of the project was to digitize the
BBC's vast archive of still and moving images, and to integrate it into a connected digital
production environment that would give staff more creative abilities directly from their desks
("BBC Careers"). Initial funding of £80 million was invested in the project.
However, the project did not go as planned. Creation of the system was contracted out
to an outside firm. After a year of development, the system was not moving along as scheduled,
and at that time the BBC cancelled the contract and moved the work in house. This did not lead
to an improvement in the creation of the system. The facility charged with digitizing the large
amounts of film and images was so overwhelmed that BBC employees had to collect
the materials themselves and transport them back to the BBC on public transportation in order
to use them. By May of 2013, the DMI project was cancelled. In the end, the total cost of the
project to the BBC, with nothing to show for it, was £98.4 million.
But even if a project can be created, the greatest challenge faced is the issue of digital
obsolescence. This was classically demonstrated by the Domesday Project. The original
Domesday Book (actually a collection of two volumes) was a survey of England and Wales that
was created in 1086. The original manuscripts are still available and preserved by the British
National Archives. In 1984 a project was undertaken to create a new survey as a celebration of
the 900th anniversary of the original manuscript. It was decided that this new survey would be a
multimedia presentation that included the writings of people all over England as well as images,
data sets, and virtual tours of famous landmarks. The material was collected between 1984 and
1986 and the new survey was published in 1986. The format of this new survey was laser disc.
By 2002, concerns began to grow about the preservation of the material on the discs.
The primary concern was that the machines capable of reading the discs were
becoming more and more scarce. Additionally, the material contained on the discs
was encoded in unique formats that could only be read by these very specific machines. By
this time, JPEG had become the standard format for digital still images, but the image formats on
the discs were unique to that medium. The material from the discs has since been recovered. By
rough estimate, the 1986 survey remained usable for about 15 years. This is in comparison to the
original survey, which is working its way towards 1,000 years and counting.
So, with examples like these at the forefront of the minds of preservationists, rushing
into any digital project is a recipe for exorbitant costs and for materials that may not fare any better
than the material you think you are saving. In any attempt to utilize digital preservation,
these examples should be heeded, and the work approached with caution and skepticism.
The only problem now is that there is an entire generation whose original source material is
digital.
Meet Generation C
The current generation of 18-to-35-year-olds goes by a number of different names.
Sometimes called Generation Y, or Millennials, their birth years have been spread across a large
swath of time from the early 1980s to the early 2000s. They are defined primarily as a
generation of technology. They have grown up with the Internet and watched it change the
fields of music distribution and retail itself.
Within this Millennial generation is a subset that many marketing
groups have dubbed "Generation C." These are the individuals born after 1990, who by
the year 2020 will have grown up in a world that is primarily digital (Friedrich). For the marketing
set, the "C" in the name stands for one very important thing: Generation C is
connected. They are connected to each other and connected to products, primarily through a digital
platform. It must be made clear that much of this focus is on individuals living in the
developed world. This includes the United States, Canada, Europe and the BRIC countries
(Brazil, Russia, India and China). According to Friedrich, by the
year 2020 Generation C will comprise 40% of the population in those countries.
"For this new group of consumers, the Internet no longer sits behind a computer screen;
it's the way they live their life, and it's second nature for them to engage with authentic
content across all platforms and all screens, whenever and wherever they want." ("Meet
Gen C: The YouTube Generation Think with Google")
This generation exists in a digital realm. They communicate and share their lives there
as well. The very prominent social networking sites like Facebook, Twitter and YouTube are not
new means of communication for them; these sites are the means of communication. That is an
important distinction to keep in mind. Currently Facebook is reporting 1.23 billion monthly users
(Sedghi). These numbers are certainly not made up entirely of this newly described generation.
A large portion of the users are new adopters, primarily older than this demographic, who are
using Facebook as a simpler means of communication with friends and family. They have
known other ways of communicating before, from simple telephone calls to the classic
handwritten letter.
But as the quote above indicates, Generation C uses the Internet as a means to engage
with authentic content. The word "authentic" is of primary importance to this group. While they
may exist primarily in a digital space, their creations (be they videos, photos, status
updates, or emails) are just as authentic as a tangible letter that one might receive in a mailbox.
This means that preserving those items is already important to this generation,
and it should be important to the world at large in order to preserve a collective memory.
YouTube started from very simple beginnings. It was created as a means
to easily share video amongst friends. Today it has become a prominent website sharing huge
amounts of content. According to statistics from its press room, 100 hours of video are uploaded
to YouTube every minute, and in a given month over 6 billion hours of video are watched ("Statistics").
There are many categories of videos that are provided and not all are created by
individual users. Major companies also use YouTube as a platform to share content, including
major television networks like CBS and NBC, as well as cable channels like Fox News and
Comedy Central.
But if you look at what are currently listed as the top YouTube channels based on
subscribers, you are presented with a very interesting picture. According to the YouTube
statistics website SocialBlade, the current top of the rankings, with over 32 million people
subscribing to his channel and rapidly approaching 7 billion total views of his videos, is a
25-year-old named Felix Arvid Ulf Kjellberg, who goes by the name of PewDiePie. The primary
content on his channel is of him playing video games and commenting as he goes. He also
includes videos of him simply commenting on issues and other topics, such as introducing his
family.
There are many other channels like his. Viewing the top ten most subscribed channels
on YouTube reveals that six of them are created by
individuals. These individuals are all part of this wide digital generation that is recording and
sharing its life, or at least some facet of it, online. There is no tangible product that can be
held to demonstrate their work, but their work has obvious resonance and arguably obvious
value. While much of the shared content of YouTube and other social media sites, and even
web pages in general, can be viewed as shallow and trite, these are the recordings of a
generation.
And this is the only way these recordings are being captured. These creators are not
transferring their media from one form to another in order to share it with others. They are
creating it specifically for this medium, and this is where it exists. Without these individuals and
the content created and shared by so many others, the recordings of this generation would be
few and far between, confined to rapidly dwindling images and writings in print media.

The Challenges
The greatest challenge, first and foremost, is the issue of cost. It is a plain fact that cost is the
umbrella concern of digital preservation. It stands above issues of
digital media decay and hardware obsolescence because it is the single factor underlying them all.
In a survey performed by the Digital Preservation Coalition, questions were asked about
current digital preservation strategies. The survey found that 52% of the respondents were
committed to digital preservation. However, only 18% actually had a strategy in place and only
33% indicated that adequate funding was being provided (Waller, p. 16). Digital preservation is a
complicated issue. To oversimplify grossly, the preservation of a physical artifact can be as
simple as providing a barrier, placing the item on a shelf, and recording its location. The digital
realm requires hardware, software, and human interaction in order to properly maintain the
means of access, let alone the digital artifacts themselves.
A report generated in 2003 by the Washington Secretary of State on the
feasibility of the state's digital archives examined the estimated costs of fully implementing a
digital archiving system. Over a period of seven years, the total (including
hardware, software, media, and training) exceeded $18 million (Washington State, p. 99).
Initial costs of that level are enough to give anyone pause, especially when coupled with
the concerns that lie underneath these costs. The hardware itself is a concern. As
demonstrated by the Domesday Project, without the necessary hardware to access the material,
the material is not much use to researchers and individuals. Upon examination, we can see that
the price of hardware does decrease over time. In general the price will peak upon first release,
when the hardware is brand new, then decrease over time until a new hardware iteration is
released, whereupon the price will peak again (Palm). This has the potential to leave the
preservation realm in a constant state of catch-up, justifying costs against an external business
environment that is constantly updating its products in order to maximize its own profits.
But the fact of the matter remains that we are rapidly approaching a time when new
material is solely digital, and the experience of that material is digital as well. Facebook content
could of course theoretically be printed out and maintained in a very traditional preservation
environment, complete with humidity controls and all of the best acid-free storage material. But
an argument can be made that the experience of that digital realm would not be adequately
preserved along with the content. That interactive element is as essential to the digital
experience as the content itself. As Nicholson Baker pointed out, microfilm had the
potential to capture the content, but the experience of the newspapers was largely lost in the
microfilm reproductions (Baker, inset).
What are the current strategies?
Now, let's not imagine that we are imminently facing
a catastrophic loss of all digital born materials. There is a chance of that, but there are already
practices in place that attempt to address these concerns.
The Library of Congress (LOC) itself has already moved forward with a program that
attempts to archive the World Wide Web. The program was started in
2000 in an effort to collect and preserve websites. In 2003 the LOC joined the International
Internet Preservation Consortium, which included the national libraries of Australia, Canada,
Denmark, Finland, France, Iceland, Italy, Norway, Sweden, the British Library (UK), and the
Internet Archive (USA) ("Web Archiving").
In April 2010, as part of this program, the LOC entered into an agreement with the social
media website Twitter to collect and archive all of its public messages (tweets) since the
company's inception in 2006 (Telegraph). As of a white paper published in January
2013, the LOC had archived approximately 170 billion tweets. In entering into this
project, the LOC has acknowledged that, "As society turns to social media as a primary method
of communication and creative expression, social media is supplementing and in some cases
supplanting letters, journals, serial publications and other sources routinely collected by
research libraries" ("Update on the Twitter Archive"). For younger generations like Generation C, this
is more and more of a reality. They have grown up with a system that allows for faster
communication and sharing of information, one that makes classic means of
communication (paper letters and journals, for example) seem inefficient and unsuited to their
needs.
But while the LOC has embraced this as an important step in preserving the digital
collective memory, it has acknowledged the challenges. At the time of the white paper,
a single search of the initial collection of tweets, spanning 2006 to 2010, could
take almost 24 hours to complete due to the size of the archive, a size that increases with each
day. As stated in the paper, "It is clear that technology to allow for scholarship access to large
data sets is lagging behind technology for creating and distributing such data." This is important
to understand. The technology available to the archiving and preservation realm is lacking. The
private sector is rapidly expanding its technology and creating new systems of creation and
communication, and it will continue to do so.
This has led other organizations to take the reins in the digital preservation of
unique digital born material. One member of the International Internet Preservation Consortium
is the non-profit Internet Archive. The Internet Archive was founded in 1996 with the
aim of preserving websites from the World Wide Web. In 2001 it made its archive
available through its Wayback Machine. The archive allows users to search for
what are essentially snapshots of websites as they were when captured on a given day.
Conclusion
The underlying concern in this presentation is simply that we are running out of time. The
amount of digital born material is increasing at an incredible rate. Without a sustained effort to
address this situation, the world stands to lose a very large and important part of its collective
memory. Waiting for a completely stable digital preservation format will only
result in the loss of more and more material. It is imperative for the library and preservation
community to become leaders on this issue. Maintaining a digital archive and
repository will soon no longer be a luxury afforded by only a few. It will be a necessity, and
preservation programs of all types must be ready to tackle it. We simply owe it to our past and
our future.

CITATIONS
Friedrich, Roman, Michael Peterson, Alex Koster, and Sebastian Blum. "The Rise of Generation C:
Implications for the World of 2020." The Rise of Generation C: Implications for the World of 2020. 26
Mar. 2010. Web. 22 Nov. 2014.
<http://www.strategyand.pwc.com/global/home/what-we-think/reports-white-papers/article-display/rise-generation-implications-world-2020>.
"Meet Gen C: The YouTube Generation Think with Google." Meet Gen C: The YouTube
Generation Think with Google. 1 May 2013. Web. 22 Nov. 2014.
<https://www.thinkwithgoogle.com/articles/meet-gen-c-youtube-generation-in-own-words.html>.

Sedghi, Ami. "Facebook: 10 Years of Social Networking, in Numbers." The Guardian, 4 Feb.
2014. Web. 22 Nov. 2014.
<http://www.theguardian.com/news/datablog/2014/feb/04/facebook-in-numbers-statistics>.
"Statistics." YouTube. YouTube. Web. 22 Nov. 2014.
<https://www.youtube.com/yt/press/statistics.html>.
"Top 100 YouTubers by Subscribed." Top 100 YouTubers Filtered by Subscribers. Web. 22
Nov. 2014. <http://socialblade.com/youtube/top/100/mostsubscribed>.
Baker, Nicholson. Double Fold: Libraries and the Assault on Paper. New York: Random House,
2001. Print.
"Web Archiving." (Library of Congress). Web. 26 Nov. 2014.
<http://www.loc.gov/webarchiving/>.
Telegraph, The. "Library of Congress Is Archiving All Of America's Tweets." Business Insider.
Business Insider, Inc, 22 Jan. 2013. Web. 26 Nov. 2014.
<http://www.businessinsider.com/library-of-congress-is-archiving-all-of-americas-tweets-2013-1>.
"Update on the Twitter Archive at the Library of Congress." Library of Congress, 4 Jan.
2013. Web. 26 Nov. 2014. <http://www.loc.gov/today/pr/2013/files/twitter_report_2013jan.pdf>.
"BBC Careers." BBC. 10 Mar. 2012. Web. 26 Nov. 2014.
<http://web.archive.org/web/20120310041000/http://www.bbc.co.uk/careers/divisions/digital-media-initiative>.
"BBC Abandons £100m Digital Project." BBC News. 24 May 2013. Web. 26 Nov. 2014.
<http://www.bbc.com/news/entertainment-arts-22651126>.
Conway, Paul. "Reformatting 6.4 The Relevance of Preservation in a Digital World." 6.4 The
Relevance of Preservation in a Digital World. Web. 28 Nov. 2014.
<https://www.nedcc.org/free-resources/preservation-leaflets/6.-reformatting/6.4-the-relevance-of-preservation-in-a-digital-world>.
Waller, Martin, and Roger Sharpe. "Digital Preservation Coalition." Mind the Gap: Assessing
Digital Preservation Needs in the UK. 1 Jan. 2006. Web. 28 Nov. 2014.
<http://www.dpconline.org/advocacy/mind-the-gap>.

"Washington State Digital Archives Feasibility Study." Washington State Digital Archives
Feasibility Study. Washington Secretary of State, 1 Aug. 2003. Web. 28 Nov. 2014.
<http://www.digitalarchives.wa.gov/state/washington/StaticContent/Feasibility Study.pdf>.
Palm, Jonas. "The Digital Black Hole." The Digital Black Hole. Training for Audiovisual
Preservation In Europe. Web. 28 Nov. 2014. <http://www.tapeonline.net/docs/Palm_Black_Hole.pdf>.