
EADDY.

FINAL 4/1/2008 10:44:00 PM

NOTES

DATABASE COPYRIGHT ANALYSIS FOR DUMMIES: PROTECTING YOUR
INTELLECTUAL INVESTMENT IS ABOUT THE DESIGN, NOT THE DATA

Jason H. Eaddy*
Abstract: This Note argues that the legal tests for determining copyright
infringement in software should be used to analyze copyright claims over
computer databases. The Note provides readers with historical background
regarding copyright protection and compilations, focusing on the distinction
between a database and a mere collection of facts. It then discusses the
originality requirement for copyright protection and provides an in-depth
review of the database design process and the resultant sources of originality
present in databases. Thereafter, the Note provides an overview of the
software development process and the two main legal tests used in analyzing
software for copyright infringement purposes. Then, the Note discusses the
application of those tests to a database analysis, and finally the Note provides
an example of how a small software developer could use provisions of the
Digital Millennium Copyright Act to further protect databases used in its
products.

* Candidate for Juris Doctor, New England School of Law (2008). A.B., Computer
Science, cum laude, Princeton University (1998). For the past eight years, Mr. Eaddy has
provided testifying and non-testifying services in computer-related litigation matters as a
computer scientist at Elysium Digital in Cambridge, Massachusetts.


300 NEW ENGLAND LAW REVIEW [Vol. 42:299

INTRODUCTION
As technology usage in companies around the world continues to
expand, and as intellectual property becomes increasingly sought after, so
too does the need for lawyers to protect that property.1 Internet companies
spring up seemingly overnight, have an initial public offering,2 or are sold
for massive amounts of money.3 Regardless of what these companies'
mission or product might have been, the major value of many of these
companies is their database of consumer information.4 Another source of
value related to databases is found where software companies distribute
software which uses databases to store both proprietary company
information and information created by the company's clients.5 Protecting
the intellectual property and data underlying those corporations becomes a
primary interest.6 Without protection, there is little to stop an employee
from misappropriating the intellectual property and launching a competing

1. See, e.g., D.M. Osborne, Yahoo 2.0, IP L. & BUS., Oct. 2006, at 48, available at
10/2006 IPLBUS 48 (Westlaw).
2. See, e.g., Why Optimism Could Last This Time, REVOLUTION, June 1, 2005, at 23,
available at 2005 WLNR 9984449 (describing venture capital funding of technology start-
ups reviving after the dot-com crash of the early 2000s).
3. See, e.g., Web Valuations Again, BUSINESS STANDARD, Nov. 21, 2006, at 15,
available at 2006 WLNR 20133169. The rapid manner in which many technology
companies grow adds to the necessity of understanding the ways in which the law can
protect the underlying value. For instance, YouTube was purchased by Google for $1.6
billion a mere eighteen months after its founding. Id. By comparison, the eighty-year-old
Reader's Digest was sold for approximately the same amount. Id.
4. See Edward Robinson & Jonathan Thaw, YouTube, Facebook Spark Copycats,
Bubble Fear in Silicon Valley, BLOOMBERG.COM, Feb. 23, 2007,
http://www.bloomberg.com/apps/news?pid=20601109&sid=a8GnNzotH8IQ&refer=home (explaining the values
being paid for internet companies with large user bases and large amounts of data).
5. See, e.g., Lifeware TEK Launches I-Cook Recipe Management System, WORLDWIDE
DATABASES, Jan. 1, 2007, available at 2007 WLNR 1293 (desktop software which stores
recipes in a database locally and on a web server); Hy-Tek Sports Software for Swimming
and Track and Field, http://www.hy-tekltd.com (Hy-Tek is a software manufacturer
specializing in support of individual-focused sports, such as track and swimming, including
competition management, team organization, and workout planning. Each of Hy-Tek's
products communicates with a database, the design of which is distributed to all purchasers
as a part of the software package). Furthermore, open source database software is a booming
industry providing free versions of database software that companies can use in their
software. See Barbara Grady, Startups Mainstream Open-Source Software, OAKLAND TRIB.,
Mar. 6, 2006, available at 2006 WLNR 3765381 (discussing various open source projects
including the widely used MySQL database and database juggernaut Oracle's entry into this
arena).
6. See generally U.S. Patent and Trademark Office and AeA Join Together to Fight IP
Theft, Stop Fakes, http://www.aeanet.org/GovernmentAffairs/gajl_stopfakes0805.asp (last
visited Mar. 26, 2008).

2008] DATABASE COPYRIGHT ANALYSIS 301

product.7 In Feist Publications, Inc. v. Rural Telephone Service Co.,8 the
United States Supreme Court stated that "[t]he most fundamental axiom of
copyright law is that '[n]o author may copyright his ideas or the facts he
narrates.'"9 This presents the question of how a company can protect the
facts contained in a database. In an industry replete with business process
patents, software patents, and software copyright protection,10 protection of
a database itself, instead of the surrounding software and algorithms, is too
often overlooked.
Imagine that a provider of cafeteria-style meals to institutional
organizations needs software to manage the recipes and meals that it makes
each month. A small software company agrees to develop and license the
software to the larger corporation. Under this type of arrangement, the
software company retains the copyright to the software and any other
intellectual property developed as part of the project. The company designs
and develops the software and stores the underlying data in a Microsoft
Access database.11 The development process takes six months to complete.
The client is satisfied and uses the software productively for multiple years,
paying a yearly fee for upgrades and new features. At this point, the client
decides that it can develop its own version of the software and proceeds to
do so in two weeks. Unsurprisingly, the software company strongly
suspects that its former client has copied not only software aspects of its
program, but also the underlying database which stored the recipes. In the
ensuing litigation, both software and database copying are alleged.
In this instance there is no doubt that the alleged infringer had access
to the work.12 The source code and object code portions of the software
were certainly copyrightable,13 but was the database protectable by
copyright?14 Assuming the database was protectable under copyright law,
was there a substantial similarity to the copyrighted work?15 While the

7. See id.
8. 499 U.S. 340 (1991).
9. Id. at 344-45 (quoting Harper & Row, Publishers, Inc. v. Nat'l Enters., 471 U.S.
539, 556 (1985)).
10. See Martin Goetz, Patents: Where's the Invention?, COMPUTERWORLD, Mar. 6,
2006, at 24, available at 2006 WLNR 8808455 (describing the proliferation of software
patents and the fact that software is already protected by copyright).
11. See generally Microsoft Corp., Access Home Page,
http://office.microsoft.com/en-us/access/default.aspx (last visited Mar. 26, 2008).
12. See Walker v. Time Life Films, Inc., 784 F.2d 44, 48 (2d Cir. 1986) (stating that
proving copyright infringement requires showing access to the copyrighted work).
13. See infra Part IV (discussing copyright infringement analysis in software).
14. See infra Part II for a discussion of the requirements for copyright protection of a
database.
15. See Computer Assocs. Int'l, Inc. v. Altai, Inc., 982 F.2d 693, 701 (2d Cir. 1992)

tests for determining the requisite similarity between two pieces of
software have evolved through litigation, there is not a defined technical
test for determining copyright violations in databases.16
This Note provides an analysis of the theories behind database
protection17 and the necessary features a database must possess in order to
qualify for copyright protection in the United States.18 In order to
understand the requirements for protection, the Note reviews the originality
requirement imposed on compilations by the Supreme Court under Feist19
and reviews the treatment of this requirement in cases decided after Feist.20
The Note also gives an overview of the design process and the choices a
database designer makes while creating a database schema.21 Subsequently,
the Note presents the tests used for software copyright infringement
analysis22 and proposes that infringement analysis of database copying
utilize a combination of these established tests.23 Finally, the Note presents
a simple method of protecting the intellectual property contained in a
database under protections provided by the Digital Millennium Copyright
Act.24

I. THEORIES BEHIND COPYRIGHT PROTECTION OF DATABASES


There are two primary theories of copyright protection for
databases: sweat-of-the-brow and originality.25 As this Note discusses,

(explaining that a defendant's work must be substantially similar to the plaintiff's
copyrightable material); Walker, 784 F.2d at 48 (stating that proving a work was copied
requires showing substantial similarity between the copied work and the copyrightable
aspects of the works); Novelty Textile Mills, Inc. v. Joan Fabrics Corp., 558 F.2d 1090,
1092 (2d Cir. 1977) ("[A] plaintiff may prove copying by showing access and substantial
similarity of the two works.").
16. See, e.g., Lynx Ventures, LLC v. Miller, 45 Fed. Appx. 68, 69-70 (2d Cir. 2002)
(unpublished decision) (overturning the District Court's application of the widely-criticized
"total concept and feel" test instead of looking to whether the protectable facts were simply
quantitatively and qualitatively sufficient to support a finding of [copyright] infringement
(internal quotation omitted) (alteration in original)).
17. See infra Part I.
18. See infra Part II.A-B.
19. See infra Part II.A.
20. See infra Part II.B.
21. See infra Part III.
22. See infra Part IV.
23. See infra Part V.
24. See infra Part VI.
25. See Jennifer Askanazi et al., The Future of Database Protection in U.S. Copyright
Law, 2001 DUKE L. & TECH. REV. 0017 1 (2001),
http://www.law.duke.edu/journals/dltr/articles/2001dltr0017.html.

United States copyright law only grants protection for databases containing
a requisite amount of originality; this law does not provide protection to
databases which would only qualify under the sweat-of-the-brow theory.26
Under the sweat-of-the-brow theory, a person who expends
sufficient effort to obtain the data underlying a database is entitled to
protection of the data.27 This doctrine, also termed "industrious collection,"
"state[s] that copyright protection could be provided to those who exert hard
work in compiling information."28 This philosophy means that if a second
person wishes to create a similar database, he must start from scratch and
obtain all of the data himself.29 The sweat-of-the-brow theory rewards the
effort rather than the creativity.30 This philosophy has been validated in
various European states under a European Council directive on database
protection.31
The second theory, loosely termed the "originality" theory, holds that
databases must be sufficiently original to qualify for protection under
federal copyright law.32 Article I, section 8, clause 8 of the United States
Constitution gives Congress the authority to encourage scientific and
artistic progress by protecting an author's writings.33 This so-called
"writings" requirement is satisfied under federal copyright law by requiring
protected works to be "fixed in any tangible medium of expression, now
known or later developed, from which they can be perceived, reproduced,
or otherwise communicated."34 Just as important to the copyright protection

26. See generally Feist Publ'ns, Inc. v. Rural Tel. Serv. Co., 499 U.S. 340 (1991). Feist
addressed the fact that lower courts adopted the "sweat of the brow" or "industrious
collection" test and extended a compilation's copyright protection beyond selection and
arrangement to the facts. Id. at 352-53. For a more detailed discussion, see infra Part II.A.
27. See Robert Howell, Editorial, Using the Ideas but Not the Expressions from Abroad,
MANAGING INTELL. PROP., Feb. 2006 (discussing the originality requirement in the U.S. and
Canada), available at 2006 WLNR 4066365.
28. Patrick W. Ogilvy, Note, Frozen in Time? New Technologies, Fixation, and the
Derivative Work Right, 8 VAND. J. ENT. & TECH. L. 687, 716 (2006).
29. See id.
30. Id.; see also Feist, 499 U.S. at 344 ("[F]acts are not copyrightable; . . . compilations
of facts generally are.").
31. See Council Directive 96/9, art. 7, 1996 O.J. (L 77) 20 (EC), available at
http://www.wipo.int/clea/docs-new/pdf/en/eu/eu005en.pdf.
32. See Feist, 499 U.S. at 345.
33. U.S. CONST. art. I, § 8, cl. 8 ("Congress shall have Power . . . To promote the
Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors
the exclusive Right to their respective Writings and Discoveries.") (emphasis added).
34. 17 U.S.C. § 102(a) (2000). "[A]lthough the word 'writings' might be limited to
script or printed material, it may be interpreted to include any physical rendering of the
fruits of creative intellectual or aesthetic labor." Goldstein v. California, 412 U.S. 546, 561
(1973).

qualification, this work must be an original work by the author.35 These
original works covered under federal copyright law include both individual
works and compilations.36
A "compilation" is defined as "a work formed by the collection and
assembling of preexisting materials or of data that are selected,
coordinated, or arranged in such a way that the resulting work as a whole
constitutes an original work of authorship."37 The protection ostensibly
provided under these sections is limited by section 103(b), which states that
"[t]he copyright in a compilation . . . extends only to the material
contributed by the author of such work, as distinguished from the
preexisting material employed in the work, and does not imply any
exclusive right in the preexisting material."38 Thus, it is the work added to
the material contained in the compilation, and not the material itself, that is
protected.39 However, as the next section will discuss, merely collecting the
facts into a compilation is not enough to qualify a work for copyright
protection.
The recipe database discussed previously could fall into both the
originality and sweat-of-the-brow categories. If the recipes had been
gathered from the public domain and assimilated into the database, then a
court following the sweat-of-the-brow doctrine would allow protection of
the collected recipes.40 Under the originality theory, the database could be
protected based upon the fact that the recipes contained within are selected
or organized in an original manner.41

II. THE ORIGINALITY REQUIREMENT FOR COMPILATIONS

A. Feist Publications, Inc. v. Rural Telephone Service Company42

1. Factual Background
In Feist, Rural Telephone Service Co. ("Rural") brought suit against
Feist Publications, Inc. ("Feist"), alleging that Feist unlawfully copied a
telephone directory.43 Rural was a local telephone provider that published a

35. Feist, 499 U.S. at 345 ("The sine qua non of copyright is originality.").
36. 17 U.S.C. §§ 102(a), 103(a).
37. Id. § 101.
38. Id. § 103(b).
39. Id.
40. See supra notes 27-31 and accompanying text.
41. See supra notes 32-39 and accompanying text.
42. 499 U.S. 340 (1991).
43. Id. at 344.

local telephone directory, including both white and yellow pages.44 This
lawsuit focused solely on the listings from the white pages portion of the
directory.45 Like any other white pages listing, Rural provided the names of
its subscribers, in alphabetical order, along with the subscribers' towns and
telephone numbers.46 This listing was distributed to all of Rural's
subscribers free of charge.47
In contrast to Rural's local listing, Feist published listings of white
and yellow pages covering a much larger area consisting of multiple
smaller telephone service areas, including the area covered by Rural's
listings.48 Out of eleven local telephone companies providing service in the
areas covered by Feist's listings, only Rural refused to license its listings to
Feist for the larger publication.49 Because Feist itself was not a telephone
service provider, the company did not have a direct resource to obtain the
data from Rural's customers for Feist's book.50 Instead of leaving the Rural
listings out or continuing to negotiate with Rural for its data, Feist hired
personnel to process Rural's records, exclude any listings outside Feist's
intended publication area, and gather additional information, such as street
addresses, for each listing.51 Feist then published its book, including the
listings obtained without a license from Rural.52 Rural subsequently
brought a copyright action as a result of Feist's use of Rural's
information.53

2. Holding
In deciding Feist, the Court reiterated that facts themselves are not
protectable, but that compilations of facts are protectable.54 Here, the two
telephone books constituted compilations, the entirety of each being
copyrightable.55 Federal copyright law protects compilations under section
103(a),56 and limits that protection, under section 103(b), to the work added

44. Id. at 342.
45. See id.
46. Id. The fact that every other white pages directory publishes a list of names and
phone numbers in alphabetical order is actually the downfall of Rural's claim. See id. at 350.
47. Id. at 342.
48. See Feist, 499 U.S. at 342-43.
49. Id. at 343.
50. Id.
51. Id. at 343-44.
52. Id. at 344.
53. Id.
54. Feist, 499 U.S. at 344.
55. See id. at 356.
56. 17 U.S.C. § 103(a) (2000). The subject matter of copyright as specified by section

to the collection.57 Therefore, while collections of otherwise
uncopyrightable facts are not in and of themselves copyrightable, original
arrangements of those facts are protected.58
Furthermore, the Court held that the originality requirement found in
copyright law was imposed by the Constitution itself.59 The Constitution
provides Congress with the authority to protect the "writings" of
"authors."60 Feist states that the definition of these terms in previous cases
"made it unmistakably clear that these terms presuppose a degree of
originality."61 The Court went on to reiterate its previous statements that
copyright was limited to "original intellectual conceptions of the
author"62 and the requirement that an author seeking relief for a copyright
violation prove "the existence of those facts of originality, of intellectual
production, of thought, and conception."63 Therefore, Feist teaches that
compilations must be an original creation derived through intellectual
production, and the Court found that Rural's telephone listings were not
copyrightable as they did not meet the originality requirement.64
In the scope of copyright law, "original" means only that "the work was
independently created by the author . . . and that it possesses at least some
minimal degree of creativity."65 However, [o]riginality does not signify

102 includes compilations and derivative works. Id.


57. Id. § 103(b). "The copyright in a compilation or derivative work extends only to the
material contributed by the author of such work, as distinguished from the preexisting
material employed in the work, and does not imply any exclusive right in the preexisting
material." Id.
58. Feist, 499 U.S. at 345-46. For instance, scores and statistics from sporting events are
simply facts and therefore not copyrightable, whereas a description of the game provides an
organized expression of these facts and is copyrightable. See, e.g., Nat'l Basketball Ass'n v.
Motorola, Inc., 105 F.3d 841, 847 (2d Cir. 1997).
59. Feist, 499 U.S. at 346.
60. U.S. CONST. art. I, § 8, cl. 8 ("Congress shall have Power . . . To promote the
Progress of Science and useful Arts, by securing for limited Times to Authors . . . the
exclusive Right to their respective Writings . . . .").
61. Feist, 499 U.S. at 346.
62. Id. (quoting Burrow-Giles Lithographic Co. v. Sarony, 111 U.S. 53, 58 (1884)).
63. Id. at 346-47 (quoting Burrow-Giles, 111 U.S. at 59-60).
64. Id. at 362.
The selection, coordination, and arrangement of Rural's white pages do
not satisfy the minimum constitutional standards for copyright
protection. As mentioned at the outset, Rural's white pages are entirely
typical. . . . The end product is a garden-variety white pages directory,
devoid of even the slightest trace of creativity.
Id.
65. Id. at 345.

novelty.66 "[A] work may be original even though it closely resembles
other works so long as the similarity is fortuitous, not the result of
copying."67 Thus, if a work appears to be copied from another source, but is
actually created separately, the later work would not constitute
infringement of the previous work's copyright.
With particular regard to databases, the conclusion in Feist gives a
database owner a description of three potential sources of copyright
protection.68 While a database often contains nothing but uncopyrightable
facts, the author may still infuse a degree of originality in its creation.69
First, the author . . . chooses which facts to include in the compilation.70
Second, the author chooses the order in which to place the information.71
And, third, the author decides how to arrange the information so that it
may be used effectively.72 However, even if the collective work is
granted copyright protection, "a subsequent compiler remains free to use
the facts contained in [the collective work] to aid in preparing a competing
work, so long as the competing work does not feature the same selection
and arrangement."73 While the selection of material may involve more of
what is traditionally regarded as the subject matter of copyright, the latter
two sources of database copyright protection fall squarely within the realm
of the database administrator.
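
The three choices described above (selection, order, and arrangement) can be illustrated with a short sketch. The sketch is purely illustrative: the recipe facts, the compile_facts function, and its parameter names are hypothetical and are not drawn from Feist or from any real system. It shows only that two compilers can begin with identical facts and arrive at different compilations.

```python
# Hypothetical public-domain facts about recipes (illustrative values only).
facts = [
    {"name": "Tomato Soup", "course": "starter", "minutes": 30},
    {"name": "Apple Pie", "course": "dessert", "minutes": 90},
    {"name": "Green Salad", "course": "starter", "minutes": 10},
]


def compile_facts(rows, select, order_key, arrange):
    """Build a compilation from three independent choices:
    which facts to include, in what order, and under what grouping."""
    chosen = sorted((r for r in rows if select(r)), key=order_key)
    grouped = {}
    for row in chosen:
        grouped.setdefault(arrange(row), []).append(row["name"])
    return grouped


# A garden-variety compilation: every fact, alphabetical, one undivided list.
typical = compile_facts(
    facts, lambda r: True, lambda r: r["name"], lambda r: "all"
)

# A second compiler using the same facts: quick dishes only,
# fastest first, grouped by course.
quick = compile_facts(
    facts, lambda r: r["minutes"] <= 30, lambda r: r["minutes"], lambda r: r["course"]
)
```

Both results draw on the same underlying facts, which remain free for anyone to use. Whether any particular set of choices is creative enough to satisfy Feist is a legal question that the sketch does not attempt to answer.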

B. Elaborating on the Feist Originality Requirement and Its Implications

1. Marketing Puffery and Abbreviations: Montgomery County
Ass'n of Realtors v. Realty Photo Master Corp.74
The Montgomery County Association of Realtors ("MCAR")
controlled a multiple listing service ("MLS") database containing real
estate listings for Maryland.75 MCAR customers used an access code
through their computers to view the MLS listings.76 Realty Photo Master

66. Id.
67. Feist, 499 U.S. at 345.
68. Id. at 348. The Court actually mentions three such items: selection, order, and
arrangement. Id.
69. See id.
70. Id.
71. Id.
72. Id.
73. Feist, 499 U.S. at 349.
74. 878 F. Supp. 804 (D. Md. 1995).
75. Id. at 808.
76. Id.

Corporation ("RPM") began providing photography services to MCAR
members.77 Each day, RPM used the access code of an MCAR customer to
view the new listings in the MLS, photographed the properties, added the
pictures to its own computer system, and then used software to distribute
the pictures to its clients.78 RPM gave its customers software that would
combine the pictures of the listings with the data from MCAR's MLS.
Unsurprisingly, MCAR was not pleased with this arrangement and brought
suit against RPM for, among other claims, copyright infringement of the
data contained in the MLS compilation.79
As could be expected, RPM argued that under Feist, the database was
not copyrightable as the data at issue were merely facts relating to the
various real estate properties being listed in the MLS database.80
Disagreeing, the court found that a valid copyright existed in the MLS
database.81 The "minimal degree of creativity" required by Feist was found
in the "marketing puffery" which accompanied the basic facts as well as
the unique and elaborate system of abbreviations used in organizing the
database.82 The fact that each real estate listing record contained factual
information, in addition to the original information added by MCAR, did
not preclude copyright protection on the database.83

2. Originality in Formatting: Engineering Dynamics, Inc. v.
Structural Software, Inc.84
This case presented a question of whether the input formats to a
computer program, and output formats created by a computer program,
could be copyrighted.85 In this case, Engineering Dynamics, Inc. ("EDI")
chose to copyright the input and output formats to its software rather than
the software itself.86 The court rejected Structural Software's argument that
these input formats were merely unoriginal organizations of facts and
therefore did not meet the Feist requirements:

77. Id.
78. Id. at 808-09.
79. Id. at 809.
80. Montgomery County, 878 F. Supp. at 810.
81. Id.
82. Id.
83. Id.
84. 26 F.3d 1335 (5th Cir. 1994).
85. See id. at 1340.
86. Id. at 1339. The formats in question are specifications regarding how the data
operated on by the program and the data that is given back to the user are organized. See id.
at 1338. "The . . . input formats instruct the user to place specific kinds of information in a
specific place on the card." Id.

What appears on EDI's input and output formats, however, are
not any kind of formulas or facts as such, but organized,
descriptive tables for entry of data on which the computer will
perform necessary calculations. Facts are entered by the user
and factual algorithms are applied by the computer, but the
appearance and expression of the user interface are not
themselves a representation of facts.87

In its discussion, the court described the organization of these formats
as allowing for the entry of data which aids the computer in performing
necessary calculations.88 The court further ordered that the district court, in
determining the potential copyright infringement, consider the
extent to which industry standards dictated how these formats were
actually arranged.89 Both of these considerations are closely analogous to the
usage and purpose of a computer database.90

3. Clearing the Air: Assessment Technologies of Wisconsin,
L.L.C. v. WIREdata, Inc.91
More recently, Judge Posner issued an opinion for the Seventh Circuit
which squarely addressed some of the questions surrounding copyright
protection of databases.92 In this matter, Assessment Technologies created
software, called "Market Drive," to assist municipalities in collecting and
organizing real estate assessment data.93 This software would allow
assessors to enter property data through a computer interface and later
present that data in various forms.94 As in the recipe software hypothetical
presented previously, Market Drive stored the data entered into the
program in a Microsoft Access database.95 WIREdata wanted access to the
raw assessment data for a portion of southeastern Wisconsin.96 Assessment
Technologies brought suit for copyright infringement and filed for, and
obtained, a preliminary injunction preventing the municipalities from
letting WIREdata obtain the raw data,97 and this appeal followed. In its

87. Id. at 1345.


88. See id.
89. Id. at 1347.
90. See infra Part IV.
91. 350 F.3d 640 (7th Cir. 2003).
92. See id. at 641-44.
93. See id. at 642-43.
94. Id.
95. Id. at 642.
96. Id.
97. Assessment Techs., 350 F.3d at 642.

opinion, the court ruled that WIREdata should be given the raw assessment
data from the Market Drive database; it reversed the judgment and ordered
the district court to dismiss the copyright claim in its entirety.98
Instead of finding a lack of originality in the storage of the data, the
court fell back to the lack of originality in the data itself.99 Additionally,
Assessment Technologies could not assert ownership of this data as it was
in the public domain and gathered by public workers.100 In fact, Wisconsin
had an open records law requiring that public records be provided to
anyone who would pay the copying fee.101 However, there was a specific
carve-out in the law to prevent government agencies from being required to
produce copyrighted data.102 Here, the court found that the basic facts about
the houses, provided in a manner different from how they were stored by
the Market Drive software, were merely uncopyrightable facts and
therefore should have been produced.103
Even so, the opinion makes it clear that the database itself was
protected under copyright even though Assessment Technologies failed to
protect the underlying data.104 Indeed, the court stated that WIREdata's
appeal "g[ot] off on the wrong foot, with the contention that Market Drive
lack[ed] sufficient originality to be copyrightable."105 The court realized
that no other assessment software stored its information in a database
containing "456 fields grouped into . . . 34 categories"106 and that this
structure was not "obvious or inevitable."107 Therefore, Assessment
Technologies held a valid copyright in the Market Drive program and
underlying database based upon the originality in the design of the database
itself, as opposed to the data contained inside.108
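
The distinction between a database's design and its data can be made concrete with a small sketch. The sketch uses Python's built-in sqlite3 module rather than Microsoft Access, and every table name, column name, and value in it is a hypothetical stand-in rather than anything taken from Market Drive. It stores the same assessment facts under two different designs and then extracts the raw facts from each.

```python
import sqlite3

# Hypothetical assessment facts; none of these values come from the case.
facts = [("12 Elm St", 1950, 185000), ("40 Oak Ave", 1987, 240000)]

# Design A: one flat table holding every field.
a = sqlite3.connect(":memory:")
a.execute("CREATE TABLE parcel (address TEXT, year_built INTEGER, assessed_value INTEGER)")
a.executemany("INSERT INTO parcel VALUES (?, ?, ?)", facts)

# Design B: the same fields grouped into two categories (tables).
b = sqlite3.connect(":memory:")
b.execute("CREATE TABLE location (parcel_id INTEGER PRIMARY KEY, address TEXT)")
b.execute("CREATE TABLE valuation (parcel_id INTEGER, year_built INTEGER, assessed_value INTEGER)")
for parcel_id, (address, year_built, value) in enumerate(facts, start=1):
    b.execute("INSERT INTO location VALUES (?, ?)", (parcel_id, address))
    b.execute("INSERT INTO valuation VALUES (?, ?, ?)", (parcel_id, year_built, value))


def table_names(conn):
    """Return the table names, i.e. the designer's grouping choices."""
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")
    return [name for (name,) in rows]


# Extracting the raw facts yields the same rows regardless of the design.
raw_a = a.execute(
    "SELECT address, year_built, assessed_value FROM parcel ORDER BY address"
).fetchall()
raw_b = b.execute(
    "SELECT l.address, v.year_built, v.assessed_value "
    "FROM location l JOIN valuation v ON v.parcel_id = l.parcel_id ORDER BY l.address"
).fetchall()
```

The extracted rows are identical even though the two designs differ. That loosely mirrors the court's treatment: the raw facts could be handed over without reproducing the grouping of fields into categories in which any originality would reside.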

98. Id. at 647-48.
99. See id. at 646. Mere facts are not protectable under copyright law. Feist Publ'ns, Inc.
v. Rural Tel. Serv. Co., 499 U.S. 340, 344 (1991).
100. Assessment Techs., 350 F.3d at 644, 646.
101. Id. at 642.
102. Id.
103. Id. at 644.
104. See id. at 642-43.
105. Id. at 643.
106. Assessment Techs., 350 F.3d at 643. In the design of a database, the fields could also
be called columns and the groupings are called tables. See infra Part III.
107. Assessment Techs., 350 F.3d at 643. Indeed, the court references the Feist decision
by stating that if Assessment Technologies had simply organized the data in alphabetical or
numerical order, there would not be a valid copyright because the storage would lack the
requisite originality. Id.
108. See id.

III. THE DATABASE DESIGN PROCESS: OPPORTUNITIES FOR ORIGINALITY
IN THE CREATION OF A DATABASE SCHEMA

A fundamental and mistaken assumption in the legal field is that a
database is nothing more than a collection of facts.109 Database systems
manage large bodies of information.110 From a computer science
perspective, these large bodies of data are stored using a database
management system or "DBMS."111 Such a system "consists of a collection
of interrelated data and a set of programs to access those data."112 A single
collection of data is a "database."113
One of the main purposes of a database management system is hiding
the underlying storage of data from the end users, "provid[ing] users with
an abstract view of the data."114 This separation gives freedom with regard
to how the data is physically stored on the computers and what specific
DBMS software is actually used.115 To achieve this goal, a DBMS divides a
database into separate levels of abstraction. The three major levels of
abstraction, constituting a spectrum from lowest to highest, are the
physical level, logical level, and view level.116
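
A minimal sketch may help make the three levels concrete. It uses Python's built-in sqlite3 module as a stand-in DBMS, and the recipe tables, column names, and values are hypothetical, chosen only to echo the recipe hypothetical from the Introduction. The designer writes the logical and view levels, while the physical storage format is left to the DBMS.

```python
import sqlite3

# Physical level: the DBMS decides how bytes are laid out. The designer
# only picks a location (here, an in-memory database).
conn = sqlite3.connect(":memory:")

# Logical level: the designer decides which tables and columns exist.
# These table and column names are hypothetical, for illustration only.
conn.execute("CREATE TABLE recipe (id INTEGER PRIMARY KEY, name TEXT, servings INTEGER)")
conn.execute(
    "CREATE TABLE ingredient ("
    "recipe_id INTEGER REFERENCES recipe(id), item TEXT, grams REAL)"
)
conn.execute("INSERT INTO recipe VALUES (1, 'Tomato Soup', 4)")
conn.executemany(
    "INSERT INTO ingredient VALUES (?, ?, ?)",
    [(1, "tomato", 800.0), (1, "onion", 150.0)],
)

# View level: end users see a simplified presentation of the logical design.
conn.execute(
    "CREATE VIEW recipe_summary AS "
    "SELECT r.name AS name, COUNT(i.item) AS ingredient_count, "
    "SUM(i.grams) AS total_grams "
    "FROM recipe r JOIN ingredient i ON i.recipe_id = r.id GROUP BY r.id"
)

summary = conn.execute(
    "SELECT name, ingredient_count, total_grams FROM recipe_summary"
).fetchall()
print(summary)
```

Nothing in the sketch dictates how the rows are written to disk or memory. The designer's choices appear only in the table definitions and in the view.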
The physical level of storage refers to how the data is actually stored
on the computer media.117 The DBMS software determines how the data is
physically stored, meaning that an author or designer of a database does not
generally have control over how the data is actually stored in the
computer.118 Because the user lacks control of the physical storage, the user
cannot provide originality in how the data is physically stored on the

109. See, e.g., id. (appellant argued that there was no valid copyright in a database
because of the public facts contained therein); Matthew Bender & Co. v. West Publ'g Co.,
158 F.3d 693, 698-99 (2d Cir. 1998) (describing aspects of West's legal database as a
collection of uncopyrightable facts, and finding that there was not enough originality in the
presentation for copying to constitute infringement).
110. ABRAHAM SILBERSCHATZ ET AL., DATABASE SYSTEM CONCEPTS 1 (3d ed. 1996).
111. See id.
112. Id.
113. Id.
114. Id. at 4.
115. See id.
116. See SILBERSCHATZ ET AL., supra note 110, at 4.
117. See id. Usually, the underlying data will be stored on one or more hard disk drives,
but these files could be stored in non-volatile RAM. See, e.g., McObject LLC, Embedded
Databases for Real-Time Military and Aerospace, http://www.mcobject.com/milaero.shtml
(last visited Mar. 26, 2008).
118. See SILBERSCHATZ ET AL., supra note 110, at 4; PostgreSQL, PostgreSQL 8.1:
Database Physical Storage, http://www.postgresql.org/docs/8.1/interactive/storage.html (last
visited Mar. 26, 2008) (describing how the PostgreSQL system stores the files for each
database stored in the system and does not describe how a user could change those files).

computer, and therefore cannot qualify for copyright protection at this
level.119 Hence, we must turn to the other levels of abstraction in order to
find computer science-based support for copyright protection in databases.
Even though other types of databases exist, the typical database used
in companies and software today is a relational database.120 Relational
databases can trace their development back to a paper published by E.F.
Codd in 1970 describing a system in which end users would not be aware
of the underlying data structure of the system.121 At this point in their
development, relational databases are supported by a substantial body of
theory underlying how data is stored, organized, and retrieved,122 meaning
that similarities between database schemas will be present, even in
separately developed schemas. This deep theoretical base is part of the
reason for the general assumption that any given database is in fact a
relational database, and both have led to a large number of relational
database systems appearing on the market.123 Relational databases provide
a wealth of opportunity for an author to supply the type of originality
required under Feist.124
Thus, from a copyright law perspective, a lawyer must think of a
database as not merely a collection of uncopyrightable facts,125 but also as a
specifically designed arrangement of data, of which the organization and
expression are protectable by copyright law.126

119. See Feist Publ'ns, Inc. v. Rural Tel. Serv. Co., 499 U.S. 340, 345, 348 (1991).
120. See SILBERSCHATZ ET AL., supra note 110, at 63.
121. E.F. Codd, A Relational Model of Data for Large Shared Data Banks, 13 COMM. OF
THE ACM 377 (1970), available at http://www.seas.upenn.edu/~zives/03f/cis550/codd.pdf.
122. See SILBERSCHATZ ET AL., supra note 110, at 63.
123. As examples of commercial, closed-source database products, see the following
websites: Microsoft, Microsoft SQL Server, http://www.microsoft.com/sql/default.mspx
(last visited Mar. 26, 2008); Microsoft, Microsoft Access Homepage,
http://office.microsoft.com/en-us/access/FX100487571033.aspx (last visited Mar. 26, 2008);
Oracle, Oracle Database 11g,
http://www.oracle.com/features/hp/11g-generalavailability.html (last visited Mar. 26,
2008). Additionally, open-source, freely available database products can be found here:
Firebird Foundation, Inc., Firebird - Innovative RDBMS That's Going Where You're Going,
http://www.firebirdsql.org/index.php (last visited Mar. 26, 2008); MySQL AB, MySQL
Products, http://www.mysql.com/products/ (last visited Mar. 26, 2008); PostgreSQL,
PostgreSQL: About, http://www.postgresql.org/about/ (last visited Mar. 26, 2008).
124. See infra Part IV.
125. See supra text accompanying note 109.
126. See infra Part III.A.

A. Logical Level Abstractions in Relational Databases: The Primary
Location of Original, Intellectual Creation in Database Design
As discussed above, the logical level is the middle of the three levels
of abstraction.127 The logical abstraction level sits above the physical level
and provides a description of what data is stored in the database and any
relationships between the data.128 This abstraction level is where the author
of a database, often called a database analyst or designer, specifies what
types of data are stored in the database and how these pieces of data are
organized in relation to one another.129 When considering the logical level,
the analysis focuses on the database schema (the definition of how data is
stored in the database) rather than a database instance (the data contained
within a particular version of the database at a point in time).130
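The distinction between a schema and an instance can be illustrated with a minimal sketch. It is not drawn from any of the cited sources; it uses Python's built-in sqlite3 module, and the table name "parts" and its columns are hypothetical. The sketch records the stored table definition and the stored rows before and after data is inserted.

```python
import sqlite3

# Hypothetical single-table schema, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parts (part_number TEXT, description TEXT)")


def schema(connection):
    """Return the stored table definitions (the schema)."""
    rows = connection.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [row[0] for row in rows]


def instance(connection):
    """Return the rows currently held in the table (the instance)."""
    return connection.execute(
        "SELECT part_number, description FROM parts ORDER BY part_number"
    ).fetchall()


schema_before = schema(conn)
instance_before = instance(conn)

# Adding data changes the instance but leaves the schema untouched.
conn.execute("INSERT INTO parts VALUES ('A-100', 'hex bolt')")
conn.execute("INSERT INTO parts VALUES ('A-200', 'lock washer')")

schema_after = schema(conn)
instance_after = instance(conn)
```

Under these assumptions, the designer's contribution (the table definition) is the same before and after the inserts, while the instance grows from zero rows to two.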
The highest-level logical groupings inside a database schema are
called tables.131 A table defines a specific grouping of sets of values.132
Each of these values is stored in a column, alternatively called a
field.133 Tables can be thought of as similar to a spreadsheet, where each
row in a table contains properties about a single item,134 and each column
of a particular row contains some part of the information that defines the
particular item.135 A database can contain any number of tables, and each
field in a table is given a unique name.136 Each of these aspects (the
number of tables, the contents of tables, and the table names) is defined
by the database designer before individual pieces of data are actually stored
in the database.137 The choices made in grouping data into tables provide
high-level opportunities for the originality of expression necessary for
127. See supra Part III.


128. SILBERSCHATZ ET AL., supra note 110, at 4.
129. See id.
130. See id. at 65.
131. See id. at 63.
132. See id.
133. ALLIGATOR DESCARTES & TIM BUNCE, PROGRAMMING THE PERL DBI 53 (2000);
Microsoft, Database Design Basics,
http://office.microsoft.com/en-us/access/HA012242471033.aspx (last visited Mar. 26, 2008).
134. DESCARTES & BUNCE, supra note 133, at 54.
135. See id.
136. See SILBERSCHATZ ET AL., supra note 110, at 63.
137. Database Design Basics, supra note 133 (describing good database design practices
and discussing the resulting benefits). However, most database systems provide methods for
modifying the defined structure of the database after the original definition and after data
has been inserted into the database. See, e.g., Microsoft, Add a Field to a Table,
http://office.microsoft.com/en-us/access/HA100728831033.aspx?pid=CH100645681033
(last visited Mar. 26, 2008).

copyright protection.138
The type of information contained in a table's fields is defined when
the table is created.139 A field can contain a wide variety of data types,
including numbers, dates, times, a single character, a string field
consisting of multiple characters, and numerous other data types.140 These
data types can be further refined by the size of data they can hold, e.g., a
small integer (1 byte, can hold values up to the number 255), a large
integer (4 bytes, can hold values up to 4,294,967,295), or a 20-character
string.141 Again, the naming of columns and the selection of the particular
data types when defining columns in a table present potential choices for a
database designer, and therefore additional examples of originality
involved in the design of a database schema.
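The maximum values quoted above follow from the number of bits in each field. As a brief worked check, assuming unsigned storage (many database systems instead use signed integers, which devote one bit to the sign and roughly halve the positive range), the hypothetical helper below computes the largest value for a given width:

```python
def max_unsigned(num_bytes):
    """Largest value an unsigned integer of the given byte width can hold."""
    bits = 8 * num_bytes
    return 2 ** bits - 1


small_integer_max = max_unsigned(1)  # 1-byte field
large_integer_max = max_unsigned(4)  # 4-byte field
```

The choice between widths is one the designer makes column by column, trading storage space against the range of values the column must accommodate.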
After the database designer determines the data to be stored in a
particular table, a further step in the table definition involves specifying the
keys, indices, and constraints placed on the table.142 Each table may, and
usually should, have a single primary key, consisting of one or more
columns whose combined value is unique for each row of data
contained in the table.143 For example, primary keys could be unique part
numbers, vehicle identification numbers of cars, or social security
numbers.144 Alternatively, most relational DBMS solutions provide an
auto-generated primary key, which will populate the primary key field for a
given table with a unique number.145 The unique property of a primary key
means that the database system can store a link to the data for fast retrieval
of a particular row from storage.146
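Both kinds of primary key described above can be sketched briefly. The following is an illustration only, using Python's built-in sqlite3 module; the table names, column names, and sample values are hypothetical. The first table uses a natural key (a vehicle identification number), and the second relies on a key generated by the database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Natural primary key: the designer chooses an existing unique attribute.
conn.execute("CREATE TABLE vehicles (vin TEXT PRIMARY KEY, model TEXT NOT NULL)")

# Auto-generated primary key: the database assigns a unique number.
conn.execute(
    "CREATE TABLE parts ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, description TEXT NOT NULL)"
)

conn.execute("INSERT INTO vehicles VALUES ('VIN-0001', 'sedan')")
try:
    # A second row with the same key value violates uniqueness.
    conn.execute("INSERT INTO vehicles VALUES ('VIN-0001', 'coupe')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

first_id = conn.execute(
    "INSERT INTO parts (description) VALUES ('hex bolt')"
).lastrowid
second_id = conn.execute(
    "INSERT INTO parts (description) VALUES ('lock washer')"
).lastrowid
```

Whether to use a natural key or a generated one is itself a design decision, and two designers modeling the same facts may reasonably choose differently.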
Similarly, a database designer creates indices on tables to facilitate
fast retrieval of data upon requests for data matching particular
properties.147 A library card catalog system for looking up books by authors

138. See Assessment Techs. of Wis., L.L.C. v. WIREdata, Inc., 350 F.3d 640, 643 (7th
Cir. 2003) (defining the grouping of fields as satisfying the originality requirement of Feist).
139. See Microsoft, Add a Field to a Table,
http://office.microsoft.com/en-us/access/HA100728831033.aspx?pid=CH100645681033 (last visited Mar. 26, 2008).
140. See, e.g., POSTGRESQL GLOBAL DEV. GROUP, POSTGRESQL 8.2.0 DOCUMENTATION
91 (2006).
141. See, e.g., id.
142. Database Design Basics, supra note 133; see DATAMIRROR MOBILE SOLUTIONS,
INC., POINTBASE DEVELOPER'S GUIDE VERSION 4.8 at 76 (2004),
http://dlc.sun.com/pdf/817-7464/817-7464.pdf.
143. See Database Design Basics, supra note 133.
144. See id.
145. See, e.g., POSTGRESQL GLOBAL DEV. GROUP, supra note 140, at 95-96.
146. See SILBERSCHATZ ET AL., supra note 110, at 339.
147. See id. An index allows the database server to find and retrieve specific rows much
faster than it could do without an index. PostgreSQL, Documentation: Manuals:

is a good example of an index.148 Instead of looking through the entire
catalog book-by-book, a person finds the author's card, which points to the
particular books written by the author.149 In creating an index, the designer
gives the index a name and chooses the column or columns to be included
in the index.150
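The card catalog analogy can be made concrete with a short sketch. It is an illustration under stated assumptions rather than any particular system's design: it uses Python's built-in sqlite3 module, and the table, the index name "idx_books_author", and the sample rows are hypothetical. The designer names the index and selects the column it covers, and the database then reports whether a lookup by author would use it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE books (id INTEGER PRIMARY KEY, author TEXT, title TEXT)"
)
conn.executemany(
    "INSERT INTO books (author, title) VALUES (?, ?)",
    [
        ("Author A", "Title 1"),
        ("Author A", "Title 2"),
        ("Author B", "Title 3"),
    ],
)

# The designer chooses both the index name and the indexed column.
conn.execute("CREATE INDEX idx_books_author ON books (author)")

# Columns covered by the index, as recorded by the database.
indexed_columns = [
    row[2] for row in conn.execute("PRAGMA index_info('idx_books_author')")
]

# Ask the database how it would execute a lookup by author.
plan_rows = conn.execute(
    "EXPLAIN QUERY PLAN SELECT title FROM books WHERE author = ?",
    ("Author A",),
).fetchall()
plan_text = " ".join(str(row[-1]) for row in plan_rows)

titles = [
    row[0]
    for row in conn.execute(
        "SELECT title FROM books WHERE author = ? ORDER BY title", ("Author A",)
    )
]
```

The exact wording of the query plan differs between SQLite versions, so the sketch inspects it only for the index name.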
Constraints are another method a designer uses to determine what
data may and may not be contained in a table.151 While the column types
provide some level of control over what can be contained in a database
instance, constraints allow for a more fine-grained control of the data.152
Constraints can be placed on individual columns or sets of multiple
columns.153 Much like indexes, constraints are given distinct names.154
These constraints can require the data on a set of columns within a table to
be unique within the table, such that no two rows in a table can have the
same data in the columns subject to the unique constraint.155
Alternatively, the constraints can consist of checking logical expressions on
data, such as an integer field only containing numbers between one and
ten.156 The constraints placed on a table determine what data can appear in
a particular database instance157 and are another method by which a
database author changes a database from a mere collection of facts into a
creative expression.
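A brief sketch of the two kinds of constraint just described follows. It assumes Python's built-in sqlite3 module, and the table, the constraint names, and the sample values are hypothetical. One named constraint requires a pair of columns to be unique within the table; the other checks that an integer falls between one and ten.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE ratings (
        reviewer TEXT NOT NULL,
        item     TEXT NOT NULL,
        score    INTEGER NOT NULL,
        CONSTRAINT uq_ratings_reviewer_item UNIQUE (reviewer, item),
        CONSTRAINT ck_ratings_score CHECK (score BETWEEN 1 AND 10)
    )
    """
)


def try_insert(reviewer, item, score):
    """Return True if the database accepts the row, False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO ratings VALUES (?, ?, ?)", (reviewer, item, score)
        )
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert("ann", "widget", 7),   # accepted
    try_insert("ann", "widget", 9),   # rejected: same reviewer and item
    try_insert("bob", "widget", 11),  # rejected: score outside 1-10
    try_insert("bob", "widget", 10),  # accepted
]
```

The rules themselves (which columns must be unique, which ranges are valid) come from the designer, not from the facts being stored.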
Finally, a designer defines relationships between the tables of a
database.158 Based upon the name, it is unsurprising that relationships
specify how the data from a row of one table relates to, or is connected to,

PostgreSQL 8.2: Indexes, http://www.postgresql.org/docs/8.2/static/indexes.html (last
visited Mar. 26, 2008).
148. SILBERSCHATZ ET AL., supra note 110, at 339.
149. Id.
150. PostgreSQL, Documentation: Manuals: PostgreSQL 8.2: CREATE INDEX,
http://www.postgresql.org/docs/8.2/static/sql-createindex.html (last visited Mar. 26, 2008).
151. See PostgreSQL, Documentation: Manuals: PostgreSQL 8.2: Constraints,
http://www.postgresql.org/docs/8.2/static/ddl-constraints.html (last visited Mar. 26, 2008).
152. See id.
153. PostgreSQL, Documentation: Manuals: PostgreSQL 8.2: CREATE TABLE,
http://www.postgresql.org/docs/8.2/static/sql-createtable.html (last visited Mar. 26, 2008).
154. See id.
155. PostgreSQL, Documentation: Manuals: PostgreSQL 8.2: Constraints: Unique
Constraints, § 5.3.3, http://www.postgresql.org/docs/8.2/static/ddl-constraints.html (last
visited Mar. 26, 2008).
156. See id.
157. See id.
158. See Microsoft, Database Design Basics,
http://office.microsoft.com/en-us/access/HA012242471033.aspx#TableRelationships (last visited Mar. 26, 2008).

the data from a row in another table in the database.159 In the related table,
the field that holds the referenced row's key value is called a foreign
key.160 There are three types of relationships that can be defined between
tables: one-to-one, one-to-many, or many-to-many.161 As their names
suggest, these relationships define data that matches from one object to only
one other object, from one object to multiple other objects, or from multiple
objects to multiple objects, respectively.162
Because primary keys, indexes, constraints, and relationships allow
for the efficient storage of the data and correlation with the other tables in
the database, their usage and definitions supply the originality in the design
which qualifies the database for copyright protection.
Computer science theory has determined how to most efficiently
organize the data, but real-world designs usually differ from the theoretical
ideal.163 The primary reason for this difference is improved
performance.164 The process of moving the database from a structure that
stores the intended information to one that stores it efficiently is called
normalization.165 While more exist, the most common forms of
normalization are first, second, and third normal form.166 The theoretical
details behind these forms are not particularly relevant to this Note; the
point is that there are theoretical reasons for a number of the choices made
by a designer when creating a database schema167 and that these choices
can result in commonalities between otherwise independently developed
databases.
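Although the formal definitions are beyond the scope of this Note, the practical effect of normalization can be sketched in a few lines. The example below is a simplified illustration in plain Python, not a formal treatment of the normal forms; the class title, instructor names, and student names are hypothetical. In the unnormalized layout the instructor's name is repeated on every row, so a partial update leaves inconsistent data; in the normalized layout the name is stored once and referenced by key.

```python
# Unnormalized: the instructor's name is repeated on every enrollment row.
flat_rows = [
    {"class_title": "Algebra", "instructor": "Smith", "student": "Lee"},
    {"class_title": "Algebra", "instructor": "Smith", "student": "Kim"},
]

# An update that touches only one copy leaves the data inconsistent.
flat_rows[0]["instructor"] = "Jones"
flat_instructors = {
    row["instructor"] for row in flat_rows if row["class_title"] == "Algebra"
}

# Normalized: the instructor is stored once, and enrollments refer to the
# class by its key.
classes = {1: {"title": "Algebra", "instructor": "Smith"}}
enrollments = [
    {"class_id": 1, "student": "Lee"},
    {"class_id": 1, "student": "Kim"},
]

# A single update is now seen by every enrollment.
classes[1]["instructor"] = "Jones"
normalized_instructors = {
    classes[row["class_id"]]["instructor"] for row in enrollments
}
```

Because many designers follow the same practice for the same reason, two independently created schemas may both split this data into a class table and an enrollment table without either having copied the other.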

159. See id.


160. See id.
161. See id.
162. See the example database infra at Part III.B for an example of a relationship.
163. See DAVE ENSOR & IAN STEVENSON, ORACLE DESIGN 70-71 (1997); CRAIG S.
MULLINS, DATABASE ADMINISTRATION: THE COMPLETE GUIDE TO PRACTICES AND
PROCEDURES 108 (2002) (discussing the mathematical principles of set theory behind
database organization).
164. ENSOR & STEVENSON, supra note 163, at 70.
165. See id. at 70-71.
166. See Microsoft, Database Design Basics,
http://office.microsoft.com/en-us/access/HA012242471033.aspx#RefineDesign (last visited Mar. 26, 2008).
167. Normalization is a prime example of a database design practice which can create
similarities between otherwise completely original and independent databases because the
practice suggests a method of organizing data that reduces the likelihood of inconsistent
data by limiting the amount of redundancy present in the eventual storage of data in the
database. See MULLINS, supra note 163, at 108-15 (describing normalization, the five
normal forms, and the use of normalization in everyday practice).

B. A Simple Database Schema Example


This section presents an example of a database schema. This schema
represents a simple database for tracking grades in a school. In order to
track the classes, students, and grades, the schema needs three tables, which
we will name Classes, Students, and Grades respectively.
The Classes table will need to have the title of the class, name of the
instructor, semester, and year. The Students table contains the first name,
last name, middle initial, and date of birth of the student. Additionally, each
of these tables will need a primary key column which will be auto-
generated by the relational database management software.168 The columns
will be named in a consistent fashion across tables.169
The Grades table is the first table where a relationship is required.
There is a one-to-many relationship between the Classes table and the
Grades table, and there is another one-to-many relationship between the
Students table and the Grades table. The Grades table therefore contains a
class_id field and a student_id field. These fields are foreign key fields
because they contain the value of the primary key of another table.170
Additionally, the Grades table needs a number to represent the grade.
Finally, the primary key of the Grades table is also an auto-generated
number.
The following tables provide a graphical illustration of this three-table
database schema:

Classes
Name              Data Type       Key?
Id                Auto Number     Primary
Title             Character(60)
name_instructor   Character(30)
Semester          Character(6)
Year              Integer

168. See, e.g., PostgreSQL, Documentation: Manuals: PostgreSQL 8.2: Numeric Types,
§ 8.1.4, http://www.postgresql.org/docs/8.2/static/datatype-numeric.html#DATATYPE-SERIAL
(last visited Mar. 26, 2008).
169. See Drew Georgopulos, Develop a Consistent Naming Convention for Your
Database Objects, DEVX.COM, Feb. 10, 2003,
http://www.devx.com/dbzone/Article/10866/1954 for an article addressing naming
conventions in databases. The larger a
database grows and the more databases a corporation uses in its business, the more
important it becomes to have a consistent naming scheme. See id.
170. Microsoft, Database Design Basics,
http://office.microsoft.com/en-us/access/HA012242471033.aspx (last visited Mar. 26, 2008).

Students
Name              Data Type       Key?
Id                Auto Number     Primary
name_first        Character(20)
name_last         Character(30)
Middle_initial    Character(1)
date_birth        Date

Grades
Name              Data Type       Key?
Id                Auto Number     Primary
student_id        Integer         Foreign
class_id          Integer         Foreign
Grade             Integer

As discussed above, in addition to defining the types of data to be
stored in each column of each table, the schema also reflects more specific
constraints. In this case, the Grades table would have a constraint on the
grade field requiring the value to be between 0 and 100. The Classes
table could have a constraint limiting the years to ones after the system is
implemented. The constraints are meant to maintain data integrity
according to rules in the underlying system, and, as such, constraints can
vary widely from schema to schema according to each system's
requirements.171
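The three-table schema and its constraints can be expressed as a runnable sketch. This is one possible rendering, not the only one: it assumes Python's built-in sqlite3 module, lowercases all table and column names, and treats 2008 as the hypothetical year in which the system is implemented. SQLite accepts the declared character lengths but does not enforce them, and it enforces foreign keys only when that feature is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # foreign keys are off by default in SQLite
conn.executescript(
    """
    CREATE TABLE classes (
        id              INTEGER PRIMARY KEY AUTOINCREMENT,
        title           CHARACTER(60),
        name_instructor CHARACTER(30),
        semester        CHARACTER(6),
        year            INTEGER CHECK (year >= 2008)  -- assumed start year
    );
    CREATE TABLE students (
        id             INTEGER PRIMARY KEY AUTOINCREMENT,
        name_first     CHARACTER(20),
        name_last      CHARACTER(30),
        middle_initial CHARACTER(1),
        date_birth     DATE
    );
    CREATE TABLE grades (
        id         INTEGER PRIMARY KEY AUTOINCREMENT,
        student_id INTEGER NOT NULL REFERENCES students (id),
        class_id   INTEGER NOT NULL REFERENCES classes (id),
        grade      INTEGER CHECK (grade BETWEEN 0 AND 100)
    );
    """
)

class_id = conn.execute(
    "INSERT INTO classes (title, name_instructor, semester, year) "
    "VALUES ('Algebra I', 'Smith', 'Fall', 2008)"
).lastrowid
student_id = conn.execute(
    "INSERT INTO students (name_first, name_last, middle_initial, date_birth) "
    "VALUES ('Pat', 'Lee', 'Q', '1995-04-01')"
).lastrowid


def try_grade(student, klass, grade):
    """Return True if the grade row is accepted, False if a rule rejects it."""
    try:
        conn.execute(
            "INSERT INTO grades (student_id, class_id, grade) VALUES (?, ?, ?)",
            (student, klass, grade),
        )
        return True
    except sqlite3.IntegrityError:
        return False


outcomes = {
    "valid": try_grade(student_id, class_id, 93),
    "grade_out_of_range": try_grade(student_id, class_id, 101),
    "unknown_student": try_grade(999, class_id, 80),
}
```

Because each row of the grades table points to one student and one class, the table also records which students took which classes, a pairing that can involve many students and many classes.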

C. Queries, Views, Triggers, and Functions


In addition to the table definitions present in the database schema,
most relational database management systems have additional features
which provide further areas of comparison and originality.172 A
programmer issues queries to the database to manipulate the information in
the database.173 A query can be expressed in SQL, or Structured Query

171. See infra note 308 and accompanying text.


172. See, e.g., MySQL AB, MySQL 5.0 Reference Manual,
http://www.mysql.org/doc/refman/5.0/en/index.html (last visited Mar. 26, 2008). The table of contents for the
manual lists stored procedures and functions, triggers, and views amongst its features. Id.
173. See Business Objects, S.A. v. Microstrategy, Inc., 393 F.3d 1366, 1368 (Fed. Cir.
2005).

Language.174 A query can insert or update rows in a table or select rows
from a table matching certain characteristics.175 Queries can be created in a
program and passed to the database.176 Queries can also be defined and kept
in the database to be referenced by name.177 Further, a database can define
functions which take an input value and generate some form of output
value based upon the input and/or information already contained in the
database.178 These functions can subsequently be used by programmers
accessing the database. A trigger is a function that is run on the occurrence
of some event, often the addition or deletion of a record in a table.179
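Each of these features can be shown in a single short sketch. It assumes Python's built-in sqlite3 module, and every table, view, trigger, and function name in it is hypothetical. One difference from the systems cited above should be noted: in SQLite a user-defined function is registered by the calling program rather than stored inside the database, so that part of the sketch is only an approximation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE grades (id INTEGER PRIMARY KEY, student TEXT, grade INTEGER);
    CREATE TABLE audit_log (grade_id INTEGER, note TEXT);

    -- A query kept in the database and referenced by name.
    CREATE VIEW honor_roll AS
        SELECT student, grade FROM grades WHERE grade >= 90;

    -- A trigger: runs automatically whenever a row is added to grades.
    CREATE TRIGGER log_new_grade AFTER INSERT ON grades
    BEGIN
        INSERT INTO audit_log VALUES (NEW.id, 'grade recorded');
    END;
    """
)


def letter_grade(score):
    """Map a numeric grade to a letter (hypothetical scale)."""
    if score >= 90:
        return "A"
    if score >= 80:
        return "B"
    return "C or below"


# Make the function callable from SQL under the name letter_grade.
conn.create_function("letter_grade", 1, letter_grade)

# Queries created in the program and passed to the database.
conn.executemany(
    "INSERT INTO grades (student, grade) VALUES (?, ?)",
    [("Lee", 95), ("Kim", 84)],
)
conn.execute("UPDATE grades SET grade = 88 WHERE student = 'Kim'")

honor_roll = conn.execute(
    "SELECT student FROM honor_roll ORDER BY student"
).fetchall()
letters = conn.execute(
    "SELECT student, letter_grade(grade) FROM grades ORDER BY student"
).fetchall()
audit_count = conn.execute("SELECT COUNT(*) FROM audit_log").fetchone()[0]
```

The names, the selection criteria in the view, the event that fires the trigger, and the logic of the function are all chosen by the author, which is why these features offer further points of comparison between two databases.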

D. Conclusion of Database Design Discussion


The purpose of describing the database design process in this level of
detail is to illustrate the numerous places where a lawyer can find
originality within a database schema. In considering these details,
one must remember that the apparent choices in the creation of a database
schema are sometimes actual creative choices and at other times are dictated
by theory. When the database author makes an actual design decision,
this creativity in the database schema serves two purposes: (1)
fulfilling the requirement laid down in the Feist decision180 and (2)
providing a starting location for performing an analysis of copyright
infringement.181

IV. LITERAL AND NON-LITERAL COPYRIGHT INFRINGEMENT ANALYSIS IN
SOFTWARE
Showing copyright infringement requires a plaintiff to show that the
defendant's work is substantially similar to the copyrighted work in
question, and a computer program is no different.182 A computer
program, or software, is explicitly defined under Title 17 as "a set of

174. See id.
175. See id.
176. See, e.g., Sun, JDBC Overview, http://java.sun.com/products/jdbc/overview.html
(last visited Mar. 26, 2008).
177. See, e.g., Microsoft, Create a Simple Select Query,
http://office.microsoft.com/en-us/access/HA100474921033.aspx?pid=CH100645771033 (last visited Mar. 26, 2008).
178. See, e.g., PostgreSQL, User-Defined Functions,
http://www.postgresql.org/docs/8.2/static/xfunc.html (last visited Mar. 26, 2008).
179. See, e.g., MySQL AB, Triggers, http://www.mysql.org/doc/refman/5.0/en/triggers.html
(last visited Mar. 26, 2008).
180. See supra Part II.A.2.
181. See infra Part V for a discussion of copyright infringement analysis testing for
database schemas.
182. See Computer Assocs. Int'l, Inc. v. Altai, Inc., 982 F.2d 693, 701 (2d Cir. 1992).

statements or instructions to be used directly or indirectly in a computer in
order to bring about a certain result."183 Much like how a database provides
a service through its ability to efficiently store and retrieve data, a computer
program serves a particular purpose apart from the copyrighted expression
(for example, creating documents in Microsoft Word or spreadsheets in
Microsoft Excel).184 This dual nature of expression and function
complicates the copyright analysis.
Copyright law does not protect ideas, a principle which originally
stems from the United States Supreme Court's 1879 decision in Baker v.
Selden.185 This common law ruling is now incorporated into statute: "In no
case does copyright protection for an original work of authorship extend to
any idea, procedure, process, system, method of operation, concept,
principle, or discovery, regardless of the form in which it is described,
explained, illustrated, or embodied in such work."186 As copyright does not
protect ideas, part of any copyright analysis of software requires separating
the idea from the expression.187
This section discusses the two primary cases dealing with the subject
of substantial similarity analysis in software. In order to understand the
analysis this Note proposes, it is helpful to provide a brief background on
the software creation process and the copyright tests that courts have
applied to software copyright analysis.

A. Software Development Background


Simply put, software engineering is "[t]he design and development of
software."188 The software development cycle consists of analysis, design,
software programming and testing, and implementation.189 In an ideal
situation, after the requirements are gathered and the system is designed,
programmers create software by writing instructions in one or more
programming languages.190 A programming language is an artificial
language that can be used to define a sequence of instructions that can

183. 17 U.S.C. § 101 (2000).
184. See Computer Assocs., 982 F.2d at 704.
185. See 101 U.S. 99, 104-05 (1879) (determining that the accounting forms produced in
a book on an accounting system were merely an aid to understanding the idea taught by the
work, and therefore were not protectable by copyright).
186. 17 U.S.C. § 102(b) (2000).
187. See Computer Assocs., 982 F.2d at 704 ("The essentially utilitarian nature of a
computer program further complicates the task of distilling its idea from its expression.").
188. PETER AIKEN ET AL., MICROSOFT COMPUTER DICTIONARY 489 (5th ed. 2002).
189. Id. at 154-55.
190. Id. at 425-26; see Computer Assocs., 982 F.2d at 698.

ultimately be processed and executed by a computer.191 Examples of
programming languages include C,192 C++,193 PERL,194 and Java.195
Programming languages are commonly categorized as object-oriented
or procedural.196 A procedural program divides instructions into routines,
or functions, each of which contains a set of operations to be performed on
data.197 An object-oriented language wraps the data and the operations into
groups called objects.198 For example, these operations could be printing
something to the screen, changing a value based upon some predefined
formula, or computing a new value based upon the input and returning that
to another function.199
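The difference between the two styles can be seen by writing the same small task both ways. The sketch below is illustrative only; Python supports both styles, and the function names, the class name GradeRecord, and the numbers are hypothetical. In the procedural version, the data is passed from one routine to the next. In the object-oriented version, the data and the operations on it are wrapped together in an object.

```python
# Procedural style: standalone routines that receive data and return results.
def apply_curve(grade, points):
    """Compute a new value from the input and return it to the caller."""
    return min(grade + points, 100)


def describe(grade):
    """Produce the text that would be printed to the screen."""
    return f"grade: {grade}"


procedural_result = describe(apply_curve(88, 5))


# Object-oriented style: the data and its operations travel together.
class GradeRecord:
    def __init__(self, grade):
        self.grade = grade

    def apply_curve(self, points):
        """Change the stored value according to a predefined formula."""
        self.grade = min(self.grade + points, 100)

    def describe(self):
        """Produce the text that would be printed to the screen."""
        return f"grade: {self.grade}"


record = GradeRecord(88)
record.apply_curve(5)
object_result = record.describe()
```

Both versions produce the same text, yet their organization differs, which is the kind of structural choice that later becomes relevant when comparing two programs.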
A developer creates source files containing instructions, or source
code, written in a programming language.200 A computer cannot execute
these files directly; they are first compiled into object code, or machine
code, for execution.201 When creating source code, a developer breaks the

191. AIKEN ET AL., supra note 188, at 426.


192. See BRIAN W. KERNIGHAN & DENNIS M. RITCHIE, THE C PROGRAMMING LANGUAGE
1 (2d ed. 1988). C was used as a programming language on UNIX operating systems as
early as 1974. See Brian W. Kernighan, Programming in C: A Tutorial,
http://www.lysator.liu.se/c/bwk-tutor.html (last visited Mar. 26, 2008).
193. See BJARNE STROUSTRUP, THE C++ PROGRAMMING LANGUAGE (3d ed. 1997). "C++
is a direct descendant of C that retains almost all of C as a subset." Bjarne Stroustrup,
Bjarne Stroustrup's FAQ, http://www.research.att.com/~bs/bs_faq.html (last visited Mar.
26, 2008).
194. See LARRY WALL ET AL., PROGRAMMING PERL (2d ed. 1996).
195. See Sun Microsystems, Inc., New to Java Center,
http://java.sun.com/developer/onlineTraining/new2java/ (last visited Mar. 26, 2008).
196. See Matt Weisfeld, Moving from Procedural to Object-Oriented Development,
http://www.developer.com/design/article.php/3317571 (last visited Mar. 26, 2008).
Additionally, there are functional programming languages, but these languages are much
less commonly used. See generally Benjamin Goldberg, Functional Programming
Languages, 28 ACM COMPUTING SURVEYS 249 (1996), available at
http://cs.nyu.edu/goldberg/pubs/gold96.pdf.
197. See Functional vs. Procedural Programming Language,
http://amath.colorado.edu/computing/mmm/funcproc.html (last visited Mar. 26, 2008).
198. Weisfeld, supra note 196.
199. See KERNIGHAN & RITCHIE, supra note 192, at 6 (describing how to print text to a
terminal, change a previously defined value, or read input and compute a new value which
is passed to another function).
200. AIKEN ET AL., supra note 188, at 491. "The choice of language depends upon which
computers the programmer intends the program to be used by, for some computers can read
only certain languages." Whelan Assocs., Inc. v. Jaslow Dental Lab., Inc., 797 F.2d 1222,
1230 (3d Cir. 1986).
201. See Computer Assocs. Int'l, Inc. v. Altai, Inc., 982 F.2d 693, 698 (2d Cir. 1992)
("Once the source code has been completed, the second step is to translate or compile it
into object code. Object code is the binary language comprised of zeros and ones through
which the computer directly receives its instructions.").

source code down into multiple files,202 each of which may contain one or
more routines,203 commonly called functions in many programming
languages.204 Generally, a function is a grouping of instructions which the
programmer gives a name and which can be executed or called multiple
times within a program.205
With regard to copyright, the expression of a computer program can
take two shapes: source code and object code.206 Source code for a piece of
software consists of multiple files containing instructions written in a
programming language.207 By grouping instructions into functions and
grouping functions into files which are stored on a computer's hard drive, a
developer fixes the software's source code in a tangible medium of
expression, thereby qualifying the source code for copyright protection.208
In a typical case, the source code files are compiled into object code.209 The
object code is what the computer's operating system loads and eventually
runs.210 Unsurprisingly, courts have acknowledged that direct copying of
source code or object code constitutes copyright infringement.211


202. See AIKEN ET AL., supra note 188, at 426.
203. Id. at 458.
204. Id. at 228.
205. Id. at 458.
206. Lee A. Hollaar, Digital Law Online: Object Code,
http://digital-law-online.info/lpdi1.0/treatise19.html (last visited Mar. 26, 2008) ("Clearly the source code for
a computer program, to the extent that it is original expression, is protectable by
copyright."); see Williams Elecs., Inc. v. Artic Int'l, Inc., 685 F.2d 870, 877 (3d Cir. 1982)
(denying that a loophole exists for object code in copyright protection). As object code is
more often an issue when dealing with literal copying, source code is the main focus of the
software section of this Note.
207. See Karl W. Broman, Coding Practices (viewed generally),
http://www.biostat.jhsph.edu/~bcaffo/statcomp/files/coding_ho.pdf 5 (last visited Mar. 26,
2008) (describing the programming practice of breaking source code into files containing
related functions); see also Torsten Seemann, Code Modularity and the C Programming
Language (Feb. 16, 1999), http://www.csse.monash.edu.au/courseware/cse2304/hndtC.html
(encouraging splitting of source code into multiple files).
208. See 17 U.S.C. § 102 (2000) ("Copyright protection subsists, in accordance with this
title, in original works of authorship fixed in any tangible medium of expression . . . from
which they can be perceived, reproduced, or otherwise communicated, either directly or
with the aid of a machine or device . . . .").
209. See AIKEN ET AL., supra note 188, at 426 (describing a programming language as
usually requiring source code to be translated to machine or object via compilation).
210. See id. at 372.
211. See Whelan Assocs., Inc. v. Jaslow Dental Lab., Inc., 609 F. Supp. 1307, 1319-20
(E.D. Penn. 1985) (acknowledging that software is protected by copyright generally, and
unauthorized sales of software clearly constitute copyright infringements), aff'd 797 F.2d
1222 (3d Cir. 1986).

B. Non-literal Infringement via Substantial Similarity in Structure,
Sequence, and Organization: Whelan Associates, Inc. v. Jaslow
Dental Laboratory, Inc.212
In Whelan, the Third Circuit dealt with a case involving software used
to manage dental laboratories213 and, in affirming the lower court's ruling,
helped determine the scope of copyright protection for software.214

1. Factual Background
In this matter, one of the defendants, Rand Jaslow, ran a dental
laboratory which manufactured dental prosthetics and devices.215 Jaslow
felt that a computer program could help the operation of his business, and
he attempted to write such a program.216 Without any background in
computer programming, Jaslow ultimately came to the conclusion that he
could not create the software himself.217
After the failure, Jaslow hired Elaine Whelan's company to create the
dental laboratory management software, called Dentalab.218 Whelan
followed the standard software development cycle, spending a considerable
amount of time gathering the requirements for the software by examining
the operations at the Jaslow laboratory and other dental laboratories to gain
an understanding of how the software should operate.219 Once the software
was written, the two parties reached an agreement regarding marketing,
sales, and profit sharing that served them well for two years.220
During the period in which the two parties worked together, Rand
Jaslow became more familiar with computer programming and realized that
there was the potential for greater sales if the Dentalab software was
written for a different type of computer.221 After a year of work on his own
software, which he called Dentcom, Jaslow notified Whelan that their
contract was being terminated.222 Jaslow hired another programmer to
complete the work and proceeded to sell the finished product.223

212. 797 F.2d 1222 (3d Cir. 1986), aff'g 609 F. Supp. 1307 (E.D. Penn. 1985).
213. Id. at 1224.
214. See id. at 1224-25.
215. Id. at 1225.
216. Id.
217. Id.
218. Whelan, 797 F.2d at 1225-26.
219. Id.
220. Id. at 1226.
221. Id.
222. Id.
223. Id. at 1226-27.

For copyright purposes, the key distinction between the Dentalab and
Dentcom computer programs is that the two were written in different
languages: Dentalab was written in EDL, and Dentcom was written in
BASIC.224 This fact is important because, like normal spoken languages,
computer programming languages appear remarkably different even when
the languages communicate the same thing.225

2. Structure, Sequence, and Organization of Software


Because of the inherent differences between the two
programming languages, there was no literal infringement in the
software products.226 In literary works, such as plays or novels, copying can
occur even without copying of a work's literal elements if a later creator
copies non-literal elements, such as the work's plot or plot devices, to an
extent that establishes substantial similarity between the two works.227
In comparing two pieces of software, it is important to remember that
copyright does not cover any idea or method but does cover a particular
expression of an idea or method.228 In defining a test for distinguishing the
idea from the expression, the court decided that the idea of a software
program is the overall purpose of the program, and the expression
necessarily constitutes everything else.229 In other words, "the purpose or
function of a utilitarian work would be the work's idea, and everything that
is not necessary to that purpose or function would be part of the expression
of the idea."230 Therefore, when the function of software could be
accomplished in multiple ways, the choice of how to accomplish the task
reflects expression and is protectable by copyright.231
In comparing the two pieces of software, the court looked to the
developmental process and manner in which a developer breaks the
software down into functions.232 When developing software, a significant

224. Whelan, 797 F.2d at 1226. EDL was a programming language used on an IBM
Series/1 mainframe. Id. BASIC is a programming language that was developed in the 1960s and
became popular with the rise of home computers in the 1970s and 1980s. Mary Bellis, The
History of BASIC Beginners All Purpose Symbolic Instruction Code,
http://inventors.about.com/library/inventors/blbasic.htm (last visited Mar. 26, 2008).
225. See Whelan, 797 F.2d at 1233. In this case, however, the district court did not find
any copying of the source or object codes, nor did the plaintiff allege such copying. Id.
226. Id.
227. Id. at 1234. The court goes on to state that it looked for comprehensive nonliteral
similarity when performing the substantial similarity analysis. Id. at 1234 n.26.
228. Id. at 1234.
229. Id. at 1236.
230. Id. (emphasis omitted).
231. See Whelan, 797 F.2d at 1236.
232. Id. at 1229-30.

portion of the costs attributable to the programming are those attributable
to developing the structure and logic of the program.233 Amongst these
decisions are choices regarding which functionality to place into particular
functions; each method chosen to solve a particular problem creates
"efficiencies or inefficiencies, conveniences or quirks[,] that differentiate
[the chosen method] from other solutions and make the overall program
more or less desirable."234 Said another way, original expression exists in
the creation of software because "[t]here are many ways that the same data
may be organized, assembled, held, retrieved and utilized by a
computer."235
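As a purely illustrative sketch (not drawn from the Whelan record; the grade data and function names below are hypothetical), the following Python fragment organizes the same data in two different ways and retrieves the same answer from each. The choice between such organizations is the kind of design decision that the court treated as a potential source of expression.

```python
# Two hypothetical ways to organize and retrieve the same grade data.

def average_row_oriented(records, student):
    """Data held as a flat list of (student, score) rows; scan every row."""
    scores = [score for name, score in records if name == student]
    return sum(scores) / len(scores)


def average_keyed(by_student, student):
    """Data held in a mapping keyed by student; look the scores up directly."""
    scores = by_student[student]
    return sum(scores) / len(scores)


# The same facts, arranged differently (sample values are invented).
records = [("ann", 90), ("bob", 70), ("ann", 80)]
by_student = {"ann": [90, 80], "bob": [70]}

row_result = average_row_oriented(records, "ann")
keyed_result = average_keyed(by_student, "ann")
```

Both functions serve the same purpose, yet they differ in how the data is arranged and traversed, which is the distinction between function and chosen method that the passage above describes.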
Based upon this information, the court held that protection of
computer programs extends beyond the literal source code to the structure,
sequence, and organization of the program.236 It based this conclusion on
the definition of "compilation" in the Copyright Act of 1976, "a work
formed by the collection and assembling of preexisting materials or of data
that are selected, coordinated, or arranged in such a way that the resulting
work as a whole constitutes an original work of authorship."237
Additionally, a derivative work could be a recasting of an original
work.238 Therefore, the court found that even though Congress did not use
the specific language of "structure, sequence, and organization," Congress
nevertheless realized that the sequencing and ordering of materials could be
copyrighted.239 Thus, if one software product is organized in a manner
substantially similar to another product, even without regard to the specific
language used to create the software, the later work can infringe on the
earlier work's copyright.240

C. Substantial Similarity Test Using Abstraction, Filtration, and
Comparison: Computer Assocs. Int'l, Inc. v. Altai, Inc.241
The Computer Associates case involved an employee who left
Computer Associates and took a position at Altai, taking with him source

233. Id. at 1237.


234. Id. at 1230.
235. Id. at 1238.
236. Id. at 1248.
237. Whelan, 797 F.2d at 1239 (quoting The Copyright Act of 1976, 17 U.S.C. § 101
(1982)).
238. Id.
239. Id. at 1248.
240. See id. (upholding the lower court's finding that the structural similarities between
the two programs provided enough evidence to find that Jaslow infringed Whelan's
copyright).
241. 982 F.2d 693 (2d Cir. 1992), aff'g 775 F. Supp. 544 (E.D.N.Y. 1991).

code to Computer Associates' products.242 Predictably, Altai developed a
piece of software based on Computer Associates' product.243 In an
attempt to limit copyright liability after Computer Associates filed suit,
Altai redeveloped the software, using eight programmers who were not
exposed to the previous software.244
The Second Circuit criticized the "structure, sequence, and
organization" test created in Whelan, stating that the test failed to
adequately take practical considerations into account when performing the
analysis.245 In particular, the court supported the assertion that the Whelan
holding was overly protective in that it assumed that there was only one
"idea."246 This could result in an overly broad interpretation of the protectable
pieces.247 Noting this flaw in the reasoning of Whelan, the Second Circuit
argued that because each function in a program is a set of instructions
designed to carry out a purpose, each function could have its own "idea"
under Whelan.248 Therefore, under this reasoning, the Whelan court's test
was inadequate because it left indeterminate the scope of protection
extended under its version of the idea separation analysis.
Additionally, the Computer Associates court found that the "structure,
sequence, and organization" test of Whelan was incorrect because a
program does not execute in the same order each time it runs; changes in
user input can change the path of execution.249
Instead of simply looking to whether the structuring and ordering of
the two works were substantially similar to one another, the court adopted a
three-step procedure for determining substantial similarity.250 The three
steps of the procedure were (1) abstraction, (2) filtration, and (3)
comparison.251

242. Computer Assocs. Int'l, Inc. v. Altai, Inc., 775 F. Supp. 544, 553 (E.D.N.Y. 1991),
aff'd, 982 F.2d 693 (2d Cir. 1992).
243. Id. at 553-54. Altai's program actually took some thirty percent of its code from
Computer Associates' product. Id. at 554.
244. Id. at 554. These programmers were prevented from seeing the previous source code
or speaking with the employee who brought the code to the company. Id.
245. Computer Assocs., 982 F.2d at 706.
246. Id. at 705.
247. Id.
248. Id.
249. See id. at 706.
250. Id.
251. Computer Assocs., 982 F.2d at 706-10. See also generally Donald R. Robertson, III,
Note, An Open Definition: Derivative Works of Software and the Free/Open Source
Movement, 42 NEW ENG. L. REV. 339, 346-61 (2008) (arguing that the abstraction, filtration
and comparison test should be used universally in cases of unauthorized derivative works of
software).

In the abstraction step, the court dissects the software into a series of
abstractions from the lowest level steps of the program to the functional
and file organization to the general purpose of the program.252 Along the
way, "it is necessary essentially to retrace and map each of the designer's
steps[,] in the opposite order in which they were taken during the program's
creation."253 The higher the level of abstraction, the simpler the view of the
program becomes.254 Hence, the court would break out the program at each
level, from the individual instructions, to the functions, to the program as a
whole.255
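The abstraction step can be pictured with a small, hedged illustration. The Python sketch below is not a method described by the court; it uses Python's standard `ast` module on a hypothetical two-function program to list three views of the same code: the statements inside each function, the functions themselves, and the program as a whole.

```python
import ast

# A hypothetical program used only to illustrate levels of abstraction.
SOURCE = '''
def total(prices):
    result = 0
    for p in prices:
        result += p
    return result

def average(prices):
    return total(prices) / len(prices)
'''


def abstraction_levels(source: str) -> dict:
    """Describe a program at three levels, from most to least detailed."""
    tree = ast.parse(source)
    functions = [n for n in tree.body if isinstance(n, ast.FunctionDef)]
    return {
        # Lowest level: the kinds of top-level statements in each function.
        "statements": {
            f.name: [type(stmt).__name__ for stmt in f.body] for f in functions
        },
        # Middle level: how the program is divided into functions.
        "functions": [f.name for f in functions],
        # Highest level: a one-line description of the whole program.
        "program": f"module with {len(functions)} functions",
    }


levels = abstraction_levels(SOURCE)
```

Each successive entry in `levels` leaves out more detail, mirroring the observation that the view of the program becomes simpler as the level of abstraction rises.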
The filtration step separates the portions of the program that are
covered by copyright from those items that are not protectable.256 As
stated previously, ideas are not protectable, but a particular expression of
an idea is protectable by copyright.257 Items which are necessarily incident
to the expression of an idea are not protectable.258 Under this step, the court
looks at each of the pieces of the abstraction from the first step and
determines whether the particular aspect was an "idea" or was "dictated
by considerations of efficiency . . . ; required by factors external to the
program itself; or taken from the public domain."259 If any of these are true,
that particular element is not protectable by copyright and must be removed
from further consideration in determining whether or not the two programs
were substantially similar.260
The exclusion of elements dictated by efficiency rests on the doctrine of
merger, which is based upon the fact that "[w]hen there is
essentially only one way to express an idea, the idea and its expression are
inseparable and copyright is no bar to copying that expression."261 In
software, for example, efficiency is often the reason given for the
commonality of expression and this step protects an innocent second party
from copyright liability when there was only a single good way for a
portion of a program to be written.262
The final step, comparison, involves actually comparing the

252. See id. at 706-07.


253. Id. at 707.
254. Id.
255. See id.
256. Id. at 707-10.
257. Computer Assocs., 982 F.2d at 703.
258. Id. at 707.
259. Id.
260. Id.
261. Id. at 707-08 (quoting Concrete Mach. Co. v. Classic Lawn Ornaments, Inc., 843
F.2d 600, 606 (1st Cir. 1988)) (alteration in original).
262. Id. at 708. This explanation of optimization leading to similar source code in
software translates closely to the practice of normalizing database tables. See supra note 165
and accompanying text.

remaining elements after the court completes the first two steps.263 After
the abstraction and filtration steps, the court may be left with a "core of
protectable expression."264 Here, the court checks to see whether the
defendant copied any of the plaintiff's protected expression and what level
of importance these elements have in relation to the overall work of the
plaintiff.265 It is left to the trier of fact to determine whether the
defendant's copying took such a quantity of material from the plaintiff
that "[the] defendant wrongfully appropriated something which belongs to
the plaintiff."266 Given the nature of the works at hand, however, expert
testimony is acceptable with regard to explaining the various abstractions
and filtrations necessary for the analysis to succeed.267

V. ADAPTING THE SOFTWARE COPYRIGHT INFRINGEMENT ANALYSIS
TECHNIQUES TO DATABASES
By their nature, databases fall somewhere in between a classical
literary work and the computer software described in the previous section
of this Note.268 Given the technical nature of a database, it is reasonable to
assume that the infringement analysis will not merely be passed to the
ordinary observer for a determination as to whether or not the material
portions of a database's copyrightable expression were misappropriated.269
As such, courts should impose some sort of standard process such as those
proposed in Computer Associates or Whelan.270 This standard method of
analysis must keep the overall goal of copyright protection in mind.271 The
law seeks to provide an incentive to create new works while not allowing
companies to obtain a chokehold on an industry through the creation of a
monopoly-like protection.272
This Note proposes using the three-step abstraction, filtration,
comparison test from Computer Associates in conjunction with the

263. Computer Assocs., 982 F.2d at 710-12.


264. Id. at 710.
265. Id.
266. Id. at 713 (quoting Arnstein v. Porter, 154 F.2d 464, 473 (2d Cir. 1946)).
267. See id. at 712-14. The court explained that the use of expert testimony was not
eliminating the ordinary observer standard; rather, expert testimony would allow for an
ordinary observer to understand the otherwise impenetrable nature of the software in
question. Id. at 712-13.
268. See generally supra Parts III-IV (describing the various elements of databases and
software).
269. See Computer Assocs., 982 F.2d at 712-14; see also supra text accompanying note
267.
270. See supra Part IV.B-C.
271. See Computer Assocs., 982 F.2d at 696.
272. Id.

"structure, sequence, and organization" and "idea or expression" tests from
Whelan.273

A. Partial Defense of the Whelan Tests for Use in Database Analysis


As mentioned previously, the Computer Associates case criticized the
tests espoused in Whelan.274 The primary complaint is that the tests fail to
take practical considerations into account because they imply that a
program has only one idea associated with it and therefore do not
adequately exclude sub-ideas contained within a program.275 This is a
well-deserved criticism, but the Whelan court's statement that everything
that does not constitute the idea is part of the expression276 can be adapted
successfully. Applying this "idea or expression" determination at each level
of the abstraction reveals the expressive portions of the
abstraction rather than its ideas.277 As such, the Whelan test for separating ideas
and expression needs only a small tweak to be useful in the filtration stage.
The other complaint of the Computer Associates court was that the
Whelan court's use of "structure, sequence, and organization" constituted a
misunderstanding of the dynamic nature of software by imposing the
restriction of a single ordering or organization.278 However, for the
purposes of database schema analysis, the "structure, sequence, and
organization" description is an adequate way of describing the various
levels of abstraction analyzed during the first portion of the test279 because
a database imposes a rigid structure, sequence, and organization on a set
of data.

B. Abstractions of Structure, Sequence, and Organization from a
Database Schema
The abstraction step of the database analysis test focuses on the
various elements of the database's structure, beginning at the lowest level
and working upward to the database as a whole. At the conclusion, there is

273. See supra Part IV.B.2, C. Like the court in Whelan, this Note treats the terms
"structure," "sequence," and "organization" interchangeably. See Whelan Assocs., Inc. v.
Jaslow Dental Lab., Inc., 797 F.2d 1222, 1224 n.1 (3d Cir. 1986).
274. Computer Assocs., 982 F.2d at 705-06.
275. Id.
276. Whelan, 797 F.2d at 1236.
277. See id.
278. See Computer Assocs., 982 F.2d at 706.
279. See supra Part III.A (describing the various static parts of database design); supra
Part III.C (describing the few potentially dynamic aspects of databases). Because a database
generally has no equivalent of "functions" called interactively, the structure
of a database does not change dynamically as the execution of software does.

sufficient evidence available to make a determination as to whether or not
the originality requirement under Feist is satisfied by the database at issue.
The separation of the underlying ideas from their expression in the
filtering stage requires the use of an abstraction operation like the one
described in Nichols v. Universal Pictures Corp.280 and later adapted in
Computer Associates. In the Nichols case, Judge Learned Hand described
the abstractions process with regard to a play as follows:
Upon any work, and especially upon a play, a great number of
patterns of increasing generality will fit equally well, as more
and more of the incident is left out. The last may perhaps be no
more than the most general statement of what the play is about,
and at times might consist only of its title; but there is a point in
this series of abstractions where they are no longer protected,
since otherwise the playwright could prevent the use of his
"ideas," to which, apart from their expression, his property is
never extended.281
This type of abstraction-based analysis is as appropriate for databases
as it is for plays and software.
The series of abstractions in a database schema begins at the lowest
level of design, the logical level, with the individual fields stored in the
database's tables.282 Each field has a defined name and type and potentially
has a default value associated with it and constraints imposed upon it.283
From the school grades database example, the Students table has three
fields for storing the various pieces of a student's name.284 These fields
have specific lengths assigned to them and names given to them
(name_first, name_last, and middle_initial). Looking at these
individual fields in each table of the database provides the lowest level of
design.
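For illustration only, the field-level view can be sketched in Python with the standard `sqlite3` module. The three name fields come from the example above; the `student_id` column, the specific lengths, the NOT NULL constraints, and the default value are assumptions added to make the sketch complete.

```python
import sqlite3

# Hypothetical definition of the Students table. Only the three name fields
# are taken from the text; every type, length, and constraint is assumed.
STUDENTS_DDL = """
CREATE TABLE Students (
    student_id     INTEGER PRIMARY KEY,
    name_first     VARCHAR(30) NOT NULL,
    name_last      VARCHAR(40) NOT NULL,
    middle_initial CHAR(1) DEFAULT ''
)
"""


def field_level_view(ddl: str, table: str) -> list:
    """Return (name, declared type, not-null flag) for each field of a table."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(ddl)
        rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    finally:
        conn.close()
    # PRAGMA table_info yields: cid, name, type, notnull, dflt_value, pk
    return [(row[1], row[2], bool(row[3])) for row in rows]


fields = field_level_view(STUDENTS_DDL, "Students")
```

Each tuple in `fields` records one of the designer's low-level choices: what the field is called, what type and length it was given, and whether a value is required.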
The next level of abstraction, the table level, is where the fields from the
first level are grouped. In the example database, the Students and
Classes tables hold enough information to identify particular students or
classes.285 This level of abstraction includes the ordering of the individual
fields that were previously abstracted.286

280. 45 F.2d 119, 121 (2d Cir. 1930).


281. Id.
282. See supra Part III.A.
283. See supra text accompanying notes 139-40.
284. See supra Part III.B.
285. Supra Part III.B.
286. See supra text accompanying notes 282-85; see also infra text accompanying notes
287-89.

Next, looking at the Grades table in the example database brings out
the abstraction of the relationships between the tables.287 The information
contained in the Grades table includes pointers to the other two tables
because the information contained in that table is not particularly useful
when considered separately from the other two tables.288 Additionally, the
table level of abstraction provides the opportunity to abstract out the
indices and constraints imposed on the various fields of a table.289 The final
abstraction is that of the entire database itself, a representation of all the
class, student, and grade information for a school.290
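The relationship level can be sketched in the same hedged way. In the Python fragment below, the three table names follow the example database, while the column names and types are assumptions; the "pointers" described above are modeled as foreign keys, and the sketch simply asks the database engine which tables the Grades table refers to.

```python
import sqlite3

# Hypothetical three-table schema; column names and types are assumed.
SCHEMA = """
CREATE TABLE Students (
    student_id INTEGER PRIMARY KEY,
    name_last  TEXT NOT NULL
);
CREATE TABLE Classes (
    class_id INTEGER PRIMARY KEY,
    title    TEXT NOT NULL
);
CREATE TABLE Grades (
    student_id INTEGER NOT NULL REFERENCES Students(student_id),
    class_id   INTEGER NOT NULL REFERENCES Classes(class_id),
    grade      TEXT NOT NULL
);
"""


def relationships(schema: str, table: str) -> list:
    """Return sorted (local column, referenced table, referenced column) triples."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema)
        rows = conn.execute(f"PRAGMA foreign_key_list({table})").fetchall()
    finally:
        conn.close()
    # PRAGMA foreign_key_list yields: id, seq, table, from, to, on_update, ...
    return sorted((row[3], row[2], row[4]) for row in rows)


links = relationships(SCHEMA, "Grades")
```

The resulting list is one way to write down the relationship-level abstraction: a Grades row has meaning only through its links to a student and a class.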
After the abstractions, the court should look at the validity of
copyright protection on the work as a whole. Under Feist, if a work is to be
protected by copyright law, it must "possess[] at least some minimal degree
of creativity" to meet the originality required for protection.291 Feist teaches that
if the database schema merely arranges information in an obvious or
completely trivial manner, then the schema does not have enough
originality in its expression to qualify under this standard.292 This
originality requirement implies that a simple database that merely
stores standard information in a single table, or perhaps even just a few
tables, could be too unoriginal to qualify for protection.293

C. Filtering Out Ideas and Other Non-Protectable Portions of the
Abstractions
The filtering stage involves removing portions of the abstractions
extracted from the first part of the test which are not protectable.294 As part
of the filtering portion of the test, ideas must be filtered from the
expression of those ideas. Both the Whelan and Computer Associates cases
addressed the necessity of separating the idea from the expression in a
software program.295 Just as the court identified in Computer Associates

287. See supra text accompanying notes 158-62.


288. Relationships allow for the separation of data into multiple tables while reducing the
amount of data stored and maintaining data integrity. See supra text accompanying notes
158-62.
289. See generally supra Part III.A.
290. See supra Part III.B.
291. Feist Publ'ns, Inc. v. Rural Tel. Serv. Co., Inc., 499 U.S. 340, 345 (1991).
292. See id.
293. See id. Under this analysis, one could argue that the example database described in
Part III.B of this Note would not qualify for protection for lack of originality in that it
merely stores information about students, classes, and grades, which is not a novel thought.
However, given the various choices that are made, there is a strong argument that even this
sample database schema satisfies the originality requirement.
294. See Computer Assocs. Int'l, Inc. v. Altai, Inc., 982 F.2d 693, 707 (2d Cir. 1992).
295. See supra text accompanying notes 228-31, 245-49.

with regard to software, a database often contains multiple ideas.296 As in
the previous portion of the test, one starts at the lowest level of abstraction
and works upward, filtering the unprotectable ideas from the protectable
expression.
Looking at the fields used for storing a student's name, per
Computer Associates, there is a connecting idea here, the idea of storing a
person's name in three separate fields, and that idea is not protected.297
The definition of these fields is a protectable expression, as the definitions
are not part of the idea, but rather an expression of it.298
When considering the tables of a database, each table stores a subset
of the data needed to carry out the overall idea, and thus, each table has its own
idea behind it.299 The Students table has an idea of storing student
information in a single table, and the Classes table has an idea of storing
class information in a table.300 Again, these ideas are not protectable, but an
expression of these ideas can be found in the ordering of the specific fields
of the table, a new form of expression over the fields themselves.301 The
ordering of the fields in a table is a choice made by the designer and,
therefore, is protectable from an expression standpoint.302 In short, the
ordering of columns is protectable, but the ideas of storing student
information in one table and class information in another table are not
protectable.
Next, the relationships between the tables constitute expression
because there are numerous ways in which the data from the tables could
be expressed.303 While the idea of storing grade information in a table is not
protectable, the specific embodiment of storing this information across
three tables with relationships between them is a choice made by the designer
and qualifies as protectable expression.304
In considering the indices and constraints, the designer has
yet another opportunity for choice in development, and therefore these
choices are a form of creative expression. As discussed, indices provide a
method of organizing information for faster retrieval and constraints

296. See, e.g., infra text accompanying notes 297-302.


297. See The Copyright Act of 1976, 17 U.S.C. § 102(b) (2000).
298. See Computer Assocs., 982 F.2d at 703.
299. The example database tables' ideas are storing information about students, classes,
and grades. See supra Part III.B.
300. Supra Part III.B.
301. See supra text accompanying notes 282-298.
302. See Computer Assocs., 982 F.2d at 702-03.
303. See supra notes 158-62 and accompanying text.
304. See Computer Assocs., 982 F.2d at 702-03.

impose restrictions upon the information that can be kept in a table.305 For
instance, an index could be created on the last name field of the Students
table to allow faster
retrieval of a particular record when searching by last name.306 A constraint
could be imposed on the table requiring the combination of the three name
fields and the date of birth field to be unique in the table.307 These indices
and constraints are optional items which are often imposed as part of the
optimization of a database in order to speed access or ensure data
integrity.308 As such, they are separate from the ideas represented therein
and constitute protectable expression.
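The index and the uniqueness constraint described above can be illustrated with a short, hedged Python sketch using the standard `sqlite3` module. The column names, the index name, and the sample row are assumptions; only the general design (an index on last name and a uniqueness rule over the three name fields plus date of birth) comes from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students (
    student_id     INTEGER PRIMARY KEY,
    name_first     TEXT NOT NULL,
    name_last      TEXT NOT NULL,
    middle_initial TEXT NOT NULL DEFAULT '',
    date_of_birth  TEXT NOT NULL,
    -- Optional design choice: no two students may share all four values.
    UNIQUE (name_first, name_last, middle_initial, date_of_birth)
);
-- Optional design choice: speed up searches by last name.
CREATE INDEX idx_students_name_last ON Students (name_last);
""")

insert = (
    "INSERT INTO Students "
    "(name_first, name_last, middle_initial, date_of_birth) "
    "VALUES (?, ?, ?, ?)"
)
row = ("Jane", "Doe", "Q", "1990-01-01")  # invented sample data

conn.execute(insert, row)
try:
    conn.execute(insert, row)  # the same student a second time
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# PRAGMA index_list yields: seq, name, unique, origin, partial
index_names = [r[1] for r in conn.execute("PRAGMA index_list(Students)")]
conn.close()
```

The table would store the same facts without either addition, which is why the sketch treats both as optional choices layered on top of the underlying idea.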
Finally, the overall database is designed to store a specific
set of information, reflecting an idea such as tracking grades in a school.309 As
this is inherently just an idea, it is not protectable by copyright.

D. Comparison of the Elements Remaining After Filtration


After the abstraction step is completed and the non-protectable elements
are filtered out, what remains is a "core of protectable expression."310 Here,
the court looks to see what, if any, of the remaining elements contain
substantial similarities so as to indicate that the second work
misappropriated enough of the protectable expression from the original
work to find that the original work's copyright was infringed.
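As a rough aid to thinking about this step, and not as a substitute for the judgment of the trier of fact, the comparison can be sketched in Python. Everything in the sketch is hypothetical: the `Element` record, the sample design elements, and above all the `protectable` flag, which stands in for a filtration decision that a human analyst would have to make.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    level: str         # abstraction level: "field", "table", "relationship", ...
    description: str   # normalized statement of the design choice
    protectable: bool  # outcome of filtration, supplied by a human analyst


def protectable_core(elements):
    """Keep only the elements that survived the filtration step."""
    return {(e.level, e.description) for e in elements if e.protectable}


def shared_fraction(original, accused):
    """Fraction of the original's protectable core found in the accused schema."""
    core = protectable_core(original)
    if not core:
        return 0.0
    accused_elements = {(e.level, e.description) for e in accused}
    return len(core & accused_elements) / len(core)


original = [
    Element("database", "track grades in a school", False),  # idea, filtered out
    Element("field", "name_last VARCHAR(40)", True),
    Element("table", "Students column order: first, last, middle", True),
    Element("relationship", "Grades references Students and Classes", True),
    Element("constraint", "unique on name fields plus date of birth", True),
]
accused = [
    Element("database", "track grades in a school", False),
    Element("field", "name_last VARCHAR(40)", True),
    Element("table", "Students column order: first, last, middle", True),
    Element("relationship", "Grades references Students and Classes", True),
    Element("constraint", "unique on student number", True),
]

fraction = shared_fraction(original, accused)
```

A figure such as `fraction` only counts overlapping elements; it says nothing about how important each element is to the original work, which remains part of the substantial similarity inquiry.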

VI. PROTECTING A DATABASE THROUGH THE DIGITAL MILLENNIUM
COPYRIGHT ACT
The test proposed by this Note for determining copyright
infringement between two databases provides one method for protecting
valuable pieces of intellectual property,311 but protection in this manner has
drawbacks. The largest of these drawbacks is probably the expense
associated with the actual litigation of a copyright matter.312 A less

305. See supra text accompanying notes 147, 151.


306. See W3Schools.com, SQL Create Database, Table, and Index,
http://www.w3schools.com/sql/sql_create.asp (last visited Mar. 26, 2008).
307. See Gayathri Gokul, Constraints in Microsoft SQL Server 2000 (Feb. 11, 2004),
http://www.aspfree.com/c/a/Database-Code/Constraints-In-Microsoft-SQL-SERVER-2000/
(follow PDF Version of Article hyperlink).
308. See id. at 1.
309. See supra Part III.B (example database designed for managing grades in a school
environment).
310. See Computer Assocs. Int'l, Inc. v. Altai, Inc., 982 F.2d 693, 710 (2d Cir. 1992).
311. See supra Part V.
312. Litigation of small copyright claims can easily run into the tens of thousands of
dollars. See U.S. Copyright Office, Statement of the United States Copyright Office Before
the Subcomm. on Courts, the Internet, and Intellectual Property, Comm. on the Judiciary,

expensive form of protection for some databases could be found in the
provisions of the Digital Millennium Copyright Act ("DMCA").313 The
DMCA is recognized as legislation which is favorable to copyright
holders,314 and an owner of a sufficiently original database315 can benefit
from this legislation.
Among other items, the DMCA protects a copyright holder's
property through preventing the circumvention of a copy protection
mechanism.316 In effect, the DMCA can be understood to legally reify a
technology of exclusion, prohibiting the circumvention of measures that
effectively control access.317 This form of protection can be available to
software companies who distribute software along with a database to store
the data generated during the use of the software.318 There is an argument
that protection for these types of databases is available to a distributor
merely by setting a password in, for example, a Microsoft Access
database.319 Without a password, a user could normally open the Access
database in Microsoft Access and view the database schema, but once the
password has been set on the database, the password must be supplied in
order to open the database.320 Without access to the schema, copying the
schema becomes impossible, and one of the requirements for showing a
copyright violation is
access to the original work.321 However, a password in Microsoft Access is

109th Cong. 2d Sess. (2006), available at
http://www.copyright.gov/docs/regstat032906.html. The fees associated with
testifying and non-testifying experts in these
litigations can add tens of thousands more to the total litigation costs for even these so-called
small copyright claims. See id. at n.3.
313. Digital Millennium Copyright Act, Pub. L. No. 105-304, 112 Stat. 2860 (1998)
(codified as amended in scattered sections of 17 U.S.C.).
314. John A. Fonstad, Comment, Protecting Fair Use with Fogerty: Toward a New Dual
Standard, 40 U. MICH. J.L. REFORM 623, 633-34 (2007) (discussing the broad protections
allowed preventing fair use and infringement under this statute).
315. See supra Part II.A.2 for an elaboration of the originality requirements for copyright
protection and the application to the database context.
316. See 17 U.S.C. § 1201(a)(1)(A) (2000). "No person shall circumvent a technological
measure that effectively controls access to a work protected under this title." Id.
317. Greg Lastowka, Decoding Cyberproperty, 40 IND. L. REV. 23, 67 (2007).
318. See supra note 5 and accompanying text for real-life examples of these types of
companies. Additionally, the hypothetical situation posited supra is an example of a
company distributing a database for use in conjunction with software. See supra p. 301.
319. See Frank C. Rice, Exploring Microsoft Access Security, July 2002,
http://msdn2.microsoft.com/en-us/library/aa139961(office.10).aspx, for an example of setting a
password on a Microsoft Access database.
320. See id. (describing how to programmatically open an Access database which has
been secured with a password).
321. See Walker v. Time Life Films, Inc., 784 F.2d 44, 48 (2d Cir. 1986) (stating that

easily avoided and does not constitute much of an actual barrier to the
viewing of a database schema.322 This is where the DMCA reifies the
password protection mechanism.
As previously stated, the DMCA provides that no one is allowed to
"circumvent a technological measure" which "effectively controls access"
to a copyrighted work.323 Potentially foreseeing an inevitable debate
surrounding the definition of these terms, Congress helpfully acted as its
own lexicographer and clearly set forth what its intent was in the
legislation.324
A person circumvents a technological measure if he or she
"avoid[s] . . . [or] bypass[es] . . . a technological measure, without the
authority of the copyright owner."325 Clearly, someone using a piece of
software designed to provide access to a password-protected database
without knowing the password avoids the technological measure that is
inherent in the password protection.
However, there is no DMCA violation unless the circumvention is of
a device which effectively controls access to the copyrighted work,326
meaning that the measure must require[] the application of information
during its ordinary course of . . . operation in order to access the
protected work.327 In this hypothetical, the supplied information used to
gain access to the database file is the previous specified password.328
Hence, the password protection mechanism qualifies as an effective
protection mechanism and bypassing that protection constitutes a violation
of the anti-circumvention provision of the DMCA.329

proving copyright infringement requires showing access to the copyrighted work).
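
The statutory notion of a measure that requires the application of information in the ordinary course of its operation can be illustrated with a short Python sketch. This is only an illustration under stated assumptions: the class name PasswordGate, its open method, and the sample schema string are hypothetical, and the code does not represent how Microsoft Access or any other product actually implements password protection.

```python
import hashlib
import hmac
import os


class PasswordGate:
    """Hypothetical stand-in for a password-protected database file."""

    def __init__(self, password: str, schema: str) -> None:
        # Store only a salted hash of the password, never the password itself.
        self._salt = os.urandom(16)
        self._digest = self._derive(password)
        self._schema = schema

    def _derive(self, password: str) -> bytes:
        # Derive a fixed-length digest from the supplied password.
        return hashlib.pbkdf2_hmac(
            "sha256", password.encode("utf-8"), self._salt, 100_000
        )

    def open(self, supplied: str) -> str:
        # Ordinary operation: access requires applying information (the password).
        if not hmac.compare_digest(self._digest, self._derive(supplied)):
            raise PermissionError("incorrect password")
        return self._schema


if __name__ == "__main__":
    gate = PasswordGate("s3cret", "CREATE TABLE parts (id INTEGER)")
    print(gate.open("s3cret"))  # access granted only with the password
```

In this sketch, the schema is returned only when the correct password is supplied; a tool that obtained the schema without calling open with the password would be avoiding the measure rather than operating it.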


322. Searching Google.com for "access password" produces over 37,000,000 hits, with at
least the first page of search results and advertisements being devoted entirely to tools used
to gain access to password-protected Microsoft Access databases. See Google,
http://www.google.com (search for the words "access" and "password") (last visited Mar.
28, 2008).
323. 17 U.S.C. § 1201(a)(1)(A) (2000).
324. Id. § 1201(a)(3) (defining both "circumvent a technological measure" and
"effectively controls access to a work").
325. Id. § 1201(a)(3)(A). Perhaps revealing that its main intent was to prevent the
descrambling of scrambled works or decryption of encrypted works, this section mentions
these actions first when listing the various items that constitute circumvention under the
code. See id.
326. Id. § 1201(a)(1)(A).
327. Id. § 1201(a)(3)(B).
328. See supra text accompanying note 319.
329. See supra text accompanying notes 316-22.

Protecting a database under these DMCA provisions provides more
possibilities for financial recovery. Specifically, the civil remedies
provided by section 1203 allow for the recovery not only of actual
damages,330 but also of statutory damages,331 costs,332
and attorney's fees for a prevailing party.333 Given the ease of protecting a
database via this mechanism, there are few reasons not to do so.334

VII. CONCLUSION
Protecting the intellectual property investment of companies is of
growing importance to the legal industry.335 As such, lawyers need to
understand the various sources of intellectual property in their clients'
possession and the protection options available for each of them. This Note
presented a hypothetical situation in which a client developed a piece of
software and a related database which was thereafter illicitly copied.336
Because of the complicated nature of a database, the "ordinary
observer" is not likely to be able to determine whether or not a copyright in
a database has been violated without the assistance of expert testimony.337
This expert testimony should follow the practices developed in the Whelan
and Computer Associates cases in order to properly extract all of the
expression potentially protected by copyright and arrange it for the
ordinary observer to review.338 The adapted test proposed in this Note
presents an opportunity to provide a standard method of review for
database copyright analysis, which should help lawyers protect their
clients' intellectual property.
Finally, given the protection afforded to copyright holders under the
DMCA, a software company that distributes a copyrightable database in
conjunction with its software should strongly consider protecting its
database schema by password protecting the database.339 The added
protections afforded are worth the minimal effort required to meet the
standards of the statute.340

330. 17 U.S.C. § 1203(c)(1)(A) (2000).
331. Id. § 1203(c)(1)(B). The statutory damages provisions are set forth in section
1203(c)(3).
332. Id. § 1203(b)(4).
333. Id. § 1203(b)(5).
334. One pitfall is the prevention of access to data rightfully owned by the end user, and
therefore, a copyright holder considering this course of action should provide some
mechanism for the end user to extract non-copyrightable data from the program for his or
her own use. See Assessment Techs. of WI, L.L.C. v. WIREdata, Inc., 350 F.3d 640, 645
(7th Cir. 2003) (mentioning in dicta that copying of a copyrighted work is acceptable if
non-copyrightable material contained therein is so entangled as to require the otherwise
illegal action in order to access the non-copyrightable data).
335. See supra text accompanying notes 1-5.
336. See the hypothetical situation presented supra at page 301.
337. See supra note 267 and accompanying text.
338. See supra Part V.

339. See supra Part VI.


340. See supra Part VI.
