Rachel Newlin
LIS 882
Fall 2016
Metadata Schema Report

Metadata Schema #1: EAD

DESCRIPTION AND HISTORY

Encoded Archival Description (EAD) began as a project at the University of California,
Berkeley's library in 1993, where the goal was to investigate how feasible and desirable a
nonproprietary encoding standard for machine-readable finding aids would be. According to
the Library of Congress, a finding aid is a detailed description of the content and
intellectual organization of a collection of archival materials. These descriptions can take
the form of inventories, registers, or indexes. The project began as networks were just
beginning to serve information needs in libraries, and archives were exploring the idea of a
standard for publishing finding aids online, thus creating more access to information for
both users and other institutions.

Daniel Pitti was the principal investigator of the Berkeley project, where he and his
team developed the encoding standard that would eventually become EAD. A set of criteria
was developed to define what was necessary for publishing finding aids in a networked
environment: the ability to present extensive and interrelated descriptive information; the
ability to preserve the hierarchical relationships existing between levels of description;
the ability to represent descriptive information that is inherited by one level from
another; the ability to move within a hierarchical informational structure; and support for
element-specific indexing and retrieval. Overall, it was extremely important that the
encoding standard preserve the deeply hierarchical nature of archival collections (and
their finding aids), and with it the way archives organized their collections. The markup
format chosen was SGML (Standard Generalized Markup Language), selected for its
hierarchical functionality and the software available at the time. The standard included an
optional title page, the description of a unit of archival material, and optional back
matter, which could include elements to build with.

The standard was shared with many institutions holding large archival collections, and
in July 1995 the Ann Arbor Accords were reached, which set out to define finding aids for
the purposes of the encoding standard. It was agreed that a finding aid consists of two
segments: one provides information about the finding aid itself, while the other provides
information about the archival material, or collection. The finding aid would be
hierarchically organized, describing units of records and their respective components, as
well as information that would help researchers further but might not directly relate to
the archival materials. The design principles were few: make resources from many
institutions accessible; keep element and attribute names as universal as possible; focus
on information to be shared publicly; create a data structure standard, not a data content
standard; use an SGML/XML-based communication format; and keep technical barriers as low
as possible.

EAD was released in its first form in August 1998. The release was briefly postponed to
consider making the schema more compatible with XML, but it went ahead regardless. In
2002, EAD was revised and updated to be compatible with XML. The EAD tag set has 146
elements. The top level is an overview of the entire collection; the second level describes
groupings of materials within the collection; and the third level is where each file or
item is described (Lubas 68).
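
The three levels described above can be sketched as a small nested structure. The following
is a minimal illustration in Python, not taken from any real finding aid: the collection,
series, and file titles are invented, and the fragment uses only a handful of EAD 2002
element names (<archdesc>, <dsc>, <c>, <did>, <unittitle>) with the level attribute. The
helper names outline and render are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical finding-aid fragment (titles invented for illustration).
SAMPLE = """
<archdesc level="collection">
  <did><unittitle>Example Family Papers</unittitle></did>
  <dsc>
    <c level="series">
      <did><unittitle>Correspondence</unittitle></did>
      <c level="file">
        <did><unittitle>Letters, 1901-1905</unittitle></did>
      </c>
      <c level="file">
        <did><unittitle>Letters, 1906-1910</unittitle></did>
      </c>
    </c>
  </dsc>
</archdesc>
"""


def outline(node, depth=0):
    """Return (depth, level, title) tuples for a unit and its nested components."""
    title = node.findtext("did/unittitle", default="")
    rows = [(depth, node.get("level", ""), title)]
    # Components sit directly under a parent component or inside the <dsc> wrapper.
    children = node.findall("c") + node.findall("dsc/c")
    for child in children:
        rows.extend(outline(child, depth + 1))
    return rows


def render(rows):
    """Indent each unit by its depth so the hierarchy is visible as text."""
    return "\n".join(f"{'  ' * depth}[{level}] {title}" for depth, level, title in rows)


if __name__ == "__main__":
    print(render(outline(ET.fromstring(SAMPLE))))
```

Each unit carries its own <did> block, so the same small function can describe the
collection, a series, or a single file, which is the recursive quality the hierarchy
depends on.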

IMPORTANCE OF SCHEME TO COMMUNITY

In considering the importance of EAD for the archives community, it is important to
consider what archives were doing with their finding aids before EAD was released. The
prevailing belief was that because archives were unique, they required unique approaches,
and standards could thus never be applied (Yakel 1427). This idea of uniqueness made for
many paper finding aids and local rules. When EAD was released, it was seen less as an
imposition of rules on unique archival collections and more as a tool that could be used
for publishing finding aids (Dow 109). EAD was met with interest and excitement as a
solution to long-standing problems of access to archival collections (Dow 110). It was, for
the most part, seen as the most adaptable option available, but some considered it a
"halfway technology" that did the job but didn't transform archival collections the way
some had wanted (Dow 110). Still, within the first few years, 42% of respondents to one
survey reported that they had used EAD in their descriptive programs. While not quite
half, that is still a strong share considering the overhead and training involved in
getting EAD into archives across the country (Yakel 1428). Much of the decision to adopt
EAD came down to the resources available to a particular archive, and whether it had the
staff to train to begin with (Yakel 1430). Most of the initial converts to EAD were
university archives that could afford the transition and training, backed by their
respective institutions in goals and mission. Still, the focus on education that came from
the first team members of the Berkeley project and the initial release of EAD made for
many training opportunities and many library science programs with classes that covered
EAD. As the online environment became more established and pivotal, EAD caught on fairly
quickly (Dow 108), and it was seen as the best available option when considering
complexity, cost, impact, utility, and outcome (Dow 109). Overall, EAD gave archival
collections a place to exist in a networked environment, and while it sometimes carried a
high overhead cost and required technical knowledge, it also increased access, as it was
meant to do. In addition, it standardized finding aids, and archival collections, in a way
that had not been possible before, while also creating a path to easier migration as new
technologies evolve.

DISCUSSION OF SCHEMA

EAD's top-level element is <ead>, followed by two main wrapper elements, <control> and
<archdesc>. The <control> element replaced elements from EAD 1.0 and EAD 2002 that are
now deprecated. In the recently released EAD3, EAD takes a few lessons from other Library
of Congress schemas, opting for some of the same language: these include <agent>,
<control>, and <relations>, which all bring to the forefront the importance of creation,
control, and relationships between items and intellectual information. In addition, new
elements focus more closely on time and geography, including a fair number of event
elements. Overall, the most distinctive characteristic of EAD as a metadata schema is its
control over its own hierarchical nature. This comes back to the design principles of the
first version of EAD, which asked clearly to retain the hierarchical principles that keep
archival collections afloat. For this reason, components play a fairly large role in
creating a space for every attribute, with every attribute having its distinct place.
Among the most distinctive elements EAD has to offer are <bioghist> and <scopecontent>,
where contextual information can be added to further help a researcher. EAD does an
excellent job of giving freedom to archival collections, allowing context to be expressed
in an essay-like format that can be used for research. In addition, the stand-alone nature
of EAD is both a blessing and a curse. The authorized thesauri and controlled vocabularies
used with EAD are mostly unique to archival collections, never defaulting to more general
Library of Congress controlled options. While this can be seen as necessary catering to
unique archival collections, it also makes it much harder to integrate information from
other places, such as a MARC record with relevant information. This sort of disconnect is
one that EAD3 has taken head-on, giving <relation> and other related elements a place for
integrating separate information without damaging the hierarchical nature of EAD itself.

Another distinctive option that EAD provides is linking elements, which are of special
interest to archival collections because of the use of description in units. Because of
this, the ability to link intellectual information together through EAD is quite strong.
The elaborate use of <did> in different places gives a distinct understanding of how
linking works within EAD: of course, in true archives fashion, in a very hierarchical way.
These linking and ID elements are what make EAD incredibly robust, asking for information
that is concrete and unique at every turn. There is no true place for generalizations in
EAD finding aids; instead, the more specific the information, the better the schema will
work out in the end. One downside of the hierarchical nature of EAD is the inflexibility of
the schema when it comes to creating things that are truly beautiful and engaging. This, of
course, is a problem for many schemas, but the finding aid has its own specific problems.
Instead of being able to move things around and make them more engaging, as one might with
other types of databases and web pages, there is little room for movement within an online
finding aid. This, of course, is negligible compared with the wealth of information that a
finding aid can provide a researcher relative to other types of records (MARC, for
instance). The way in which finding aids have been able to keep their personalized
character in the transition to the web is something to be celebrated.

EXAMPLE
An example from Ruth Kitchin Tillman's website, eadiva.com. It has been shortened and
modified.
<control>
<filedesc>
<titlestmt>
<titleproper>Manuscripts of Salazar Slytherin: Finding Aid</titleproper>
<subtitle>A guide to the letters, papers, and spell books held at Hogwarts
School of Witchcraft and Wizardry Archives</subtitle>
<author>Felicia Flitwick</author>
<sponsor>Funding for the initial creation of this electronic finding aid was made
possible in 1992 by a generous gift of Slytherin's class of 1972. Funding for encoding was made possible in
2008 by a gift from Slytherin's class of 1998.</sponsor>
</titlestmt>
<publicationstmt>
<publisher>Hogwarts School of Witchcraft and Wizardry</publisher>
<date normal="2014-03-01">1 March 2014</date>
<address>
<addressline>Hogwarts School of Witchcraft and Wizardry,
Archives</addressline>
<addressline>Hogwarts Castle</addressline>
</address>
</publicationstmt>
<seriesstmt>
<titleproper>Manuscripts of the Founders of Hogwarts School of Witchcraft and
Wizardry</titleproper> <num>4</num>
</seriesstmt>
<notestmt>
<controlnote>
<p>The archivist is aware that certain families who came up through Slytherin
still possess original documents and realia from the house's founder. Please consider donating or
bequeathing these to our archives so that we may present Salazar Slytherin's life and views more
completely.</p>
</controlnote>
</notestmt>
</filedesc>
<maintenancestatus value="revised"/>
<publicationstatus value="published"/>
<maintenanceagency>
<agencycode>ST-In-HSWWA</agencycode>
<otheragencycode localtype="wizlib">H</otheragencycode>
<agencyname>Hogwarts School of Witchcraft and Wizardry Archives</agencyname>
</maintenanceagency>
</control>

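
As a rough illustration of how the finding-aid-level information in <control> might be read
by software, the sketch below parses a shortened fragment modelled on the example above. It
is an assumption-laden sketch rather than a complete EAD3 reader: the XML namespace that
real EAD3 files declare is omitted, only a few elements are kept, and the function name
summarize_control is hypothetical.

```python
import xml.etree.ElementTree as ET

# Shortened <control> fragment modelled on the example above.
# Assumption: no XML namespace is declared, unlike real EAD3 files.
CONTROL = """
<control>
  <filedesc>
    <titlestmt>
      <titleproper>Manuscripts of Salazar Slytherin: Finding Aid</titleproper>
      <author>Felicia Flitwick</author>
    </titlestmt>
    <publicationstmt>
      <publisher>Hogwarts School of Witchcraft and Wizardry</publisher>
      <date normal="2014-03-01">1 March 2014</date>
    </publicationstmt>
  </filedesc>
  <maintenancestatus value="revised"/>
  <publicationstatus value="published"/>
</control>
"""


def summarize_control(xml_text):
    """Collect a few facts about the finding aid itself from a <control> element."""
    control = ET.fromstring(xml_text)
    date = control.find("filedesc/publicationstmt/date")
    status = control.find("maintenancestatus")
    return {
        "title": control.findtext("filedesc/titlestmt/titleproper"),
        "author": control.findtext("filedesc/titlestmt/author"),
        # The machine-readable date lives in the attribute, the display date in the text.
        "published": date.get("normal") if date is not None else None,
        "status": status.get("value") if status is not None else None,
    }


if __name__ == "__main__":
    for key, value in summarize_control(CONTROL).items():
        print(f"{key}: {value}")
```

Everything gathered here describes the finding aid, not the archival materials, which
matches the two-segment division agreed on in the Ann Arbor Accords.
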
RESOURCES

Dow, E. H. (2009). Encoded Archival Description as a Halfway Technology. Journal of
Archival Organization, 7(3), 108-115. doi:10.1080/15332740903117701

EAD: Encoded Archival Description. (2016, October 18). Retrieved October 31, 2016,
from https://www.loc.gov/ead/

Lubas, Rebecca L., Jackson, Amy S., Schneider, Ingrid. (Chapter 4) The metadata
manual: a practical workbook.

Society of American Archivists. (2016). Retrieved October 31, 2016, from
http://www2.archivists.org/

Tillman, R. K. (2014, April). EADiva. Retrieved October 31, 2016, from
http://eadiva.com/ead

Yakel, E., & Kim, J. (2005). Adoption and diffusion of Encoded Archival Description.
Journal Of The American Society For Information Science & Technology, 56(13),
1427-1437. doi:10.1002/asi.20236.

SUMMARY

EAD has revolutionized the way that people think about how archives collect and provide
access. Where some thought that the archives community could never be wrangled into a
standard, it seems that the need for a way to create access online and share with other
institutions was of higher importance than the culture of unique collections. Many
institutions continue to develop their collections with EAD as their encoding standard,
some just at the beginning of their EAD journey, others already experienced with it. The
changes that EAD has made (moving to XML, ridding itself of unnecessary or unused tags)
have been significant, but perhaps not significant enough. Technology advances rapidly, and
it seems that in some areas EAD has been left behind simply because of the bureaucracy that
follows it around, mostly the Library of Congress's involvement in the standard. That
involvement makes for a slow-moving process of improvement, something that most standards
cannot afford, or else another encoding standard will quickly replace them. In addition, it
seems that the future of finding aid encoding might be moving away from EAD altogether,
giving way to more specific options for certain types of collections and items. Still,
there is great respect in EAD for the uniqueness and hierarchical nature of archival
collections, and that might not be something the archival community is willing to give up
so easily. EAD's integration into the education of archive professionals has, for the most
part, been promoted in a way that makes it seem as if it is sticking around for the
foreseeable future, not because it is on the cutting edge, but because it does what archive
professionals want it to, and nothing more. For that reason, it is hard to say whether EAD
will become something more cutting-edge in the future or will remain efficient through
simplicity. The future of archives is certainly one that will incorporate encoding
standards into workflow and collection maintenance, but EAD will need to cultivate what is
best in other standards and best in itself if it is to be the go-to standard for the
archival community.

Metadata Schema #2: CDWA

BRIEF DESCRIPTION AND HISTORY

Categories for the Description of Works of Art (CDWA) is the product of the Art
Information Task Force (AITF), a dialogue between art professionals brought together to
develop guidelines for describing art objects and cultural heritage materials. AITF was
formed in the early 1990s and funded by the J. Paul Getty Trust and the National Endowment
for the Humanities. CDWA is a set of guidelines for best practice in cataloging and
describing works of art, architecture, other material culture, groups and collections of
works, and related images, arranged in a conceptual framework that may be used for
designing databases and accessing information (Getty). CDWA is meant to make information
both more compatible and more accessible, while contributing to the integrity and
longevity of that data. CDWA is a standard made with the end-user in mind, giving
authoritative information priority. Still, many kinds of art information professionals
gathered to create CDWA, including art historians, museum curators, registrars, visual
resource professionals, art librarians, information specialists, and technical
specialists, all of whom brought their own needs and wants to the table.

CDWA was developed as a relational data structure, where records for objects and works
are linked to each other through hierarchical relationships. Its guidelines include:
cataloging as the conscious effort of a knowledgeable cataloger, not an automated method;
the difference between information intended for display and information intended for
retrieval; the difference between specificity and exhaustivity; and a very serious set of
rules for dealing with uncertain or unknown information, as is often the case with
cultural objects. CDWA was meant to be a robust and complete schema that could be used by
many memory institutions, many of which have very little in common. CDWA attempts to
fulfill the needs of all these different institutions, and it does so well, but at the
price of great complexity.

More recently, the Getty has released CDWA Lite, an XML-based schema that brings together
a smaller number of categories to be used more widely, with both art objects and their
surrogates, which was a weakness many professionals found in CDWA (Lubas 95). CDWA Lite
was created by the Getty, ARTstor, and a division of OCLC with a lighter standard in mind.
Instead of over 540 categories, CDWA Lite has only 22 high-level elements. As a more
recent schema, CDWA Lite divides these elements by purpose: 19 are descriptive metadata
and 3 are administrative. Only 9 are required, which lightens the load considerably
compared with the robust set of required elements in CDWA. CDWA Lite continues the
promotion of compatibility, accessibility, integrity of data, and longevity of data. With
the inclusion of ARTstor, there was a direct effort to make CDWA Lite a schema that could
be shared and harvested by many institutions, moving further into a climate of linked data
and data sharing. CDWA Lite is seen as a "low barrier" (IFLA 3) way for institutions to
share and contribute records to union catalogs. What distinguishes CDWA and CDWA Lite from
other visual resources schemas is CDWA's focus on describing materials that may or may not
include text, and the ability of its hierarchical structure to put parts together to make
the whole, even when information is unknowable. Both CDWA and CDWA Lite draw on the
Cataloging Cultural Objects (CCO) initiative, where concepts create the framework for
record creation. CCO focuses on the differences between items, groups, volumes,
collections, series, sets, and components, and the relationships between them, proving to
be a uniquely art-centered standard. CCO is the content standard, but it is important to
note the ways in which CDWA and CCO overlap, especially because CCO is derived from CDWA,
not the other way around as is typical of content standards.

IMPORTANCE FOR COMMUNITY

One of the biggest assets to the community as a whole is the robust nature of CDWA. While
the complexity can be daunting, the treatment of the end-user as paramount and the
distinction between display and indexed fields are more pronounced in this schema than in
many others. The way the information is going to be relayed to the user is of the utmost
importance, in a way that shows that the art community created this schema: what is seen
is just as important as what is not (Lubas 96). CDWA's focus on being a complete and
authoritative schema makes it complex, but that complexity is met with a commitment from
the Getty that goes in every direction: to crosswalking the schema to others; to providing
a lighter alternative in CDWA Lite; to integrating its thesauri into the schema; and to
understanding thoroughly what would serve art communities best. CDWA is the standard
against which all other visual resource schemas are measured, and that will not change
anytime soon (Lubas 96). The importance of CDWA can also be measured by its data structure
and its entity relationships: the decision to keep object/work records separate from
authorities is one that has benefited art research for many years now (IFLA 2). This
record structure has helped solidify the difference between "of-ness" and "about-ness" in
the concept of the work, and the differences and similarities between them (IFLA 3).
Lastly, a definite impact that CDWA has had on the art community is the sheer number of
records created in the years since CDWA was released, a number that has kept growing since
the release of CDWA Lite. A significant body of records was created and shared through
union catalogs, providing a common language among differing institutions (IFLA 3). While
legacy records and flat records still pose a problem when migrating to CDWA, CDWA provides
a new basis from which to judge the information memory institutions collect about their
works. With the advent of CDWA Lite, the number of records will only continue to grow, as
the barrier of CDWA's complexity falls into the background, giving new institutions a
place to start and a place to grow.

DISCUSSION OF SCHEMA

As mentioned above, the recommendation to maintain separate authorities for related visual
works, related textual materials, persons/corporate bodies, locations/places, generic
concepts, and subjects is one that is taken quite seriously and speaks to the core of the
type of schema that CDWA is in purpose and in practice. Each guideline speaks to an
overarching conceptual hierarchical model that is based, mostly, on the separation between
the work and everything else. While that is certainly important in other metadata schemas,
it is particularly important to the art community, where both a wealth and a lack of
knowledge can make for a confusing entanglement between the work, a series, a surrogate
image, and so on. The core categories of CDWA are all very work-specific and give the most
basic information about the work itself. These include: catalog level; object/work type;
classification term; title; measurements; materials and techniques; creator description,
identity, role, date; subject matter; current location; and current repository numbers.
In considering these elements, it is clear that the focus on the physical item is stronger
than in other metadata schemas, because there is an assumption that the end-user will
value this information in ways that non-art users would not.

A distinctive feature of CDWA is that the names of its elements are very specific, in a
way that makes something very complex much easier to understand than even sparse but vague
schemas for other types of materials. The clear distinction between wrapper elements,
sub-elements, indexing elements, and display elements indicates to the encoder and
cataloger that these elements serve different functions for the record and the work, and
therefore should be treated differently. Unlike Dublin Core, which leaves the cataloger to
infer what information needs to be entered, CDWA is flexible while being quite stern. It
gives catalogers a place for their unique work to shine (display fields) while also asking
for very clear and cited information in those fields that will be indexed. As the
guidelines for both CDWA and CDWA Lite state, it is much better to give a broad but
correct term than a narrow and incorrect one: the importance of correct information is at
the heart of each element type and the instructions attached to that element.

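
The split between display and indexing elements can be made concrete with a short sketch.
The Python below reads a shortened fragment in the style of the CDWA Lite example in this
report and keeps the free-text display string apart from the controlled terms meant for
retrieval. It is only an illustration under stated assumptions: XML namespaces are omitted,
and the function name split_creator_fields is hypothetical.

```python
import xml.etree.ElementTree as ET

# Shortened fragment in the style of the CDWA Lite example in this report.
# Assumption: namespaces are left out to keep the paths readable.
RECORD = """
<descriptiveMetadata>
  <displayCreator>Joseph Mallord William Turner (British, 1775-1851)</displayCreator>
  <indexingCreatorWrap>
    <indexingCreatorSet>
      <nameCreatorSet>
        <nameCreator type="personalName" termsource="ULAN">Turner, Joseph Mallord William</nameCreator>
      </nameCreatorSet>
      <nationalityCreator>British</nationalityCreator>
      <roleCreator>watercolorist</roleCreator>
      <roleCreator>painter</roleCreator>
    </indexingCreatorSet>
  </indexingCreatorWrap>
</descriptiveMetadata>
"""


def split_creator_fields(xml_text):
    """Return the display string and a list of (kind, term, source) indexing terms."""
    root = ET.fromstring(xml_text)
    display = root.findtext("displayCreator")
    index_terms = []
    for creator in root.findall("indexingCreatorWrap/indexingCreatorSet"):
        for name in creator.findall("nameCreatorSet/nameCreator"):
            # Indexed names carry the vocabulary they were taken from.
            index_terms.append(("name", name.text, name.get("termsource")))
        for role in creator.findall("roleCreator"):
            index_terms.append(("role", role.text, None))
    return display, index_terms


if __name__ == "__main__":
    display, terms = split_creator_fields(RECORD)
    print("Shown to the user:", display)
    for kind, term, source in terms:
        print("Indexed:", kind, term, source or "")
```

The display string is written for a human reader, while each indexed term is a single value
a search system can match exactly, which is why the two are entered separately.
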
One of the most important core categories in CDWA is the current location. This category
is important both for end-users and for administrative purposes, where large collections
benefit greatly from very specific knowledge of the works they hold. Still, there is an
assumption about the end-user here too: the physical, current location of a resource is
assumed to be relevant not just to online research, but to the in-person research that
someone doing art-historical research might be required to do. The physical work is
central to all the core categories. Another feature that shows the distinctive nature of
CDWA can be found in the data values assigned to each element. These data values can get
quite specific, which highlights the way in which CDWA and CDWA Lite are complex schemas,
but complex with reasoning behind them: there is not a situation for which they have not
thought of a solution. For example, CDWA Lite's attribution qualifier creator sub-element
accepts many data values: attributed to; studio of; workshop of; atelier of; office of;
assistant of; associate of; pupil of; follower of; school of; circle of; style of; after;
copyist of; manner of. All of these are quite specific to the art community, and all are
relevant to different art collections and memory institutions.
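
A controlled list like this lends itself to a simple check at data entry. The sketch below
uses a subset of the qualifier values named above; it is an illustration of the idea, not
part of CDWA Lite itself, and the names ATTRIBUTION_QUALIFIERS and check_qualifier are
hypothetical.

```python
# A subset of the attribution qualifier values named in the paragraph above.
ATTRIBUTION_QUALIFIERS = frozenset({
    "attributed to",
    "studio of",
    "workshop of",
    "follower of",
    "circle of",
    "style of",
    "after",
    "manner of",
})


def check_qualifier(value):
    """Return the normalized qualifier, or raise ValueError if it is not listed."""
    # Lowercase and collapse whitespace so trivial variations still match.
    normalized = " ".join(value.lower().split())
    if normalized not in ATTRIBUTION_QUALIFIERS:
        raise ValueError(f"not a listed attribution qualifier: {value!r}")
    return normalized


if __name__ == "__main__":
    print(check_qualifier("  Workshop  OF "))
    try:
        check_qualifier("copy of")
    except ValueError as error:
        print(error)
```

Rejecting an unlisted value outright, instead of storing it as free text, follows the
guidelines' preference for a broad but correct term over a narrow and incorrect one.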

EXAMPLE

This example is of CDWA-Lite from the blog, Free Moth. It has been shortened and
modified.
<descriptiveMetadata>
<objectWorkTypeWrap>
<objectWorkType termsource="aat">watercolors (paintings)</objectWorkType>
</objectWorkTypeWrap>
<titleWrap>
<titleSet>
<title pref="preferred" type="descriptive">Conway Castle, North Wales</title>
</titleSet>
</titleWrap>
<displayCreator>Joseph Mallord William Turner (British, 1775-1851)</displayCreator>
<indexingCreatorWrap>
<indexingCreatorSet>
<nameCreatorSet>
<nameCreator type="personalName" termsource="ULAN">Turner, Joseph Mallord
William</nameCreator>
</nameCreatorSet>
<nationalityCreator>British</nationalityCreator>
<vitalDatesCreator birthdate="1775" deathdate="1851">1775-1851</vitalDatesCreator>
<roleCreator>watercolorist</roleCreator>
<roleCreator>painter</roleCreator>
</indexingCreatorSet>
</indexingCreatorWrap>
<indexingSubjectWrap>
<indexingSubjectSet>
<subjectTerm type="conceptTerm" termsourceID="300117546"
termsource="aat">seascapes</subjectTerm>
<subjectTerm type="conceptTerm" termsourceID="300008687"
termsource="aat">oceans</subjectTerm>
<subjectTerm type="conceptTerm" termsourceID="300008734"
termsource="aat">coastlines</subjectTerm>
<subjectTerm type="conceptTerm" termsourceID="300006891" termsource="aat">castles
(fortifications)</subjectTerm>
<subjectTerm termsourceID="2009424215" termsource="lcsh">Conwy Castle (Conwy,
Wales)</subjectTerm>
</indexingSubjectSet>
</indexingSubjectWrap>
<classificationWrap>
<classification termsourceID="300033618" termsource="aat">paintings (visual works)</classification>
<classification termsourceID="sh 85007650" termsource="lcsh">Art, European</classification>
</classificationWrap>
</descriptiveMetadata>

RESOURCES

An XML Example of CDWA-Lite [Web log post]. (2012, December 12). Retrieved
October 31, 2016, from https://freemoth.wordpress.com/2012/12/12/an-xml-
example-of-cdwa-lite/

Categories for the Description of Works of Art. (2014, March 25). Retrieved October 31,
2016, from
http://www.getty.edu/research/publications/electronic_publications/cdwa/index.html

Coburn, E., Lanzi, E., O'Keefe, E., Stein, R., & Whiteside, A. (n.d.). 107. Cataloging.
The Cataloging Cultural Objects Experience: Codifying Practice for the Cultural
Heritage Community (pp. 1-19). Retrieved October 31, 2016, from
http://www.ifla.org/past-wlic/2009/107-coburn-en.pdf

Keeton, K. (2013, October 16). Introduction | Categories for Description of Works of Art
| CDWA-LITE. Retrieved October 31, 2016, from
http://www.slideshare.net/Kymizsofly/final-keeton-kcdwalite16october2013

Lubas, Rebecca L., Jackson, Amy S., Schneider, Ingrid. (Chapter 5) The metadata
manual: a practical workbook.

SUMMARY
There is much still to be realized about the potential of CDWA and CDWA-Lite. Both are
relatively new metadata schemas, and they are still evolving with the art community and
its information needs. At the forefront of CDWA is the potential of an end-user-centered
schema, one that not only prioritizes the end-user but thinks of them at every
opportunity, even when it comes to creating a standard that is more easily harvestable.
There are still unknowns about CDWA-Lite, and it has a lot of learning to do as a
hierarchical XML-based schema. The hierarchy should be tweaked slightly from the standard
that was created with CDWA, just to fit the logical and linear thought processes that go
along with XML coding. Still, it is an excellent low-barrier choice for those who want to
use an art-based metadata schema but are not ready to commit to VRA Core or CDWA for
whatever reason. It gives a cataloger a place for art-specific information without asking
too much in the way of supplied information. I expect the distinction that CDWA-Lite draws
between indexed and display fields, as well as between descriptive and administrative
metadata, to be copied across many schemas and platforms in the future. While other
schemas are certainly set up in a similar way, the overt control over what is entered,
signaled by the element names alone, makes encoding mistakes harder to make. CDWA-Lite is
an emerging schema, but one with so much potential for organizations looking for something
light and efficient.
