Академический Документы
Профессиональный Документы
Культура Документы
Data Deni,on
Thomas Heinis
t.heinis@imperial.ac.uk
wp.doc.ic.ac.uk/theinis
Data deni,on
SQLs Data Deni>on Language (DDL) is concerned with
schema crea>on and modica>on; as well as the specica>on
of constraints and performance op>ons such as materialised
views and indexing.
Well mainly look at base rela>ons (stored tables) and briey
at derived rela>ons (computed views).
Constraints in SQL include: domain (type) constraints, primary
key constraints, foreign key constraints, unique constraints,
not null constraints, check constraints, and asser>ons.
Crea,ng a Rela,on
The create table statement is used to create a new named
rela>on and declare its schema. The rela>on is persistent
and stored on disk in specially organised les.
Example: movie(>tle, year, length, genre)
Primary Key
Each rela>on should have a primary key (a candidate key)
that determines the other aQributes in the rela>on and is
used to uniquely iden>fy each tuple of the rela>on.
The primary key is also used to enforce foreign key
constraints on the rela>on (see later)
Only one primary key is permiQed for a rela>on. If there is a
choice of candidate keys, then choose one where the key will
never be updated (or very rarely updated), otherwise choose
one with few aQributes (a small memory footprint).
Other candidate keys can be dened using unique constraints
(see later).
5
will fail since insert will aQempt to assign null to year, and year is an
aQribute of the primary key.
6
Unique constraints
Addi>onal candidate keys (or superkeys or any aQributes for that maQer) can be declared
using unique constraints, which ensure that no two tuples can have the same set of
values for the aHributes listed in unique, i.e. that every tuple must have a unique value
for the aHributes listed in unique (as a whole).
Example: movie(>tle, year, length, genre, ISAN)
create table movie (
>tle
varchar(120),
year
int default 2011,
length int default 0,
genre
char(20),
ISAN
char(24),
primary key (>tle, year),
unique (ISAN)
)
Exercise
Q.
Declare the SQL schema for the following rela>on - dont worry about ge`ng all the
details right - make educated guesses.
Solu,on
Q.
Declare the SQL schema for the following rela>on - dont worry about ge`ng all the
details right - make educated guesses.
sta(id, opened, openedby, updated, updatedby, validfrom, validto,
login, email, lastname, rstname, telephone, room)
create table sta (
)
id
int,
opened
,mestamp
not null default current_>mestamp,
openedby char(10)
not null,
updated ,mestamp
not null default current_>mestamp,
validfrom date
not null default current_date,
validto
date
not null default date 2020-12-31,
login
char(12)
not null,
email
varchar(80)
not null,
lastname char(30)
not null,
rstname char(30)
not null,
telephone char(20)
not null default ,
room
char(10)
not null default ,
primary key (id), unique(login), unique(email)
10
Check constraints
AQribute types and not null constraints allow us to limit the values that we
store. check constraints allow us to dene predicates that must be sa>sed
when a tuple is inserted or updated.
Example:
Check constraints
Although check constraints are typically simple checks on a
single aQribute they can be arbitrary expressions involving
several aQributes and/or a query.
Example:
movie(>tle, year, length, genre, ISAN)
create table movie (
>tle varchar(120),
year int,
length int,
genre char(20),
ISAN char(24) not null,
primary key (>tle, year),
check(ISAN in (select no from ISANcatalog))
)
12
Exercise
Q. For the sta rela>on add a check constraint for validfrom and validto and one for
room given the following addi>onal rela>on:
13
Solu,on
Q.
Add a check constraint for validfrom and validto and one for room given the rela>on
building(id, ..., room, area, desks, occupancy)
create table sta (
id
int,
...
updated
,mestamp
not null default current_>mestamp,
validfrom
date
not null default current_date,
validto
date
not null default date 2020-12-31,
...
room
char(10)
not null default ,
primary key (id), unique(login), unique(email),
check(validto>=validfrom),
check(room in (select room in building)),
check(updated>=current_>mestamp)
14
Asser,ons
Its also possible to declare asser,ons, which are check constraints over data in several
rela>ons.
Example:
Although asser>ons are a very powerful feature of SQL they are hard to implement eciently.
Triggers are an alterna>ve, more powerful and more opera>onal approach for le`ng the
programmer deal with constraint checking when data is modied.
15
Naming constraints
Its good prac>ce to name constraints. This allows us to drop them
using alter table but also claries error messages when a constraint
viola>on is reported.
Example:
movie(>tle, year, length, genre, ISAN)
create table movie (
>tle varchar(120),
year int,
length int,
genre char(20),
16
Exercise
Q.
18
Solu,on
Q. Write foreign key constraints for some of the following rela>ons:
sta(id, login, email, lastname, rstname, telephone, room, deptrole, department)
student(id, login, email, lastname, status, entryyear, externaldept)
course(id, code, >tle, syllabus, term, classes, popes>mate)
class(id, degreeid, yr, degree, degreeyr, major, majoryr, leQer, leQeryr)
foreign key (degreeid) references degree (id),
degree(id, >tle, code, major, grp, leQer, years)
xcourseclass(courseid, classid, required, examcode)
foreign key (courseid) references course (id),
foreign key (classid) references class (id),
xcoursesta(courseid, stad, stajours, role, term)
foreign key (courseid) references course (id),
foreign key (stad) references sta (id),
xstudentclass(studen/d, classid)
foreign key (studen>d) references student (id),
foreign key (classid) references class (id),
xstudentsta(studen/d, stad, role, grp, projec`tle)
foreign key (studen>d) references student (id),
foreign key (stad) references sta (id),
19
20
Exercise
Q. Rewrite some of the foreign key constraints maintaining referen>al integrity.
sta(id, login, email, lastname, rstname, telephone, room, deptrole, department)
student(id, login, email, lastname, status, entryyear, externaldept)
course(id, code, >tle, syllabus, term, classes, popes>mate)
class(id, degreeid, yr, degree, degreeyr, major, majoryr, leQer, leQeryr)
degree(id, >tle, code, major, grp, leQer, years)
xcourseclass(courseid, classid, required, examcode)
xcoursesta(courseid, stad, stajours, role, term)
xstudentclass(studen/d, classid)
xstudentsta(studen/d, stad, role, grp, projec`tle)
22
Solu,on
Q.
23
Views
Views are rela>ons that are dened using a query (a select). Views are
not physically stored on disk unless they are materialised (see later).
Example:
Exercise
Q.
Write a view on sta for all sta who are currently here (i.e. currently valid).
sta(id, opened, openedby, updated, updatedby, validfrom, validto,
Q.
Write a view on sta who were here in the academic year 2008 to 2009 (October to
September). Tricky.
25
Solu,on
Q.
Write a view on sta for all sta who are currently here (i.e. currently valid).
sta(id, opened, openedby, updated, updatedby, validfrom, validto,
Q.
26
Materialised views
Views are normally recomputed each >me they are needed. If a view is used suciently
oVen then it might be more ecient to materialise (store) the view at the cost of extra
storage and extra >me to keep the view up-to-date when the underlying rela>ons are
changed by inser>ons, updates, and dele>ons.
Materialised view maintenance can be expensive however, for example, requiring lots of
changes to the underlying rela>ons versus few queries on the views. The decision is a
tradeo between extra-storage and view maintenance costs and the faster speed of
querying a materialised view.
Rather than keeping a materialised view eagerly up-to-date, some RDBMSs allow a
materialised view to be brought up-to-date only when the view is accessed (lazy
maintenance). Others allow the view to become stale and only update it periodically e.g.
when database ac>vity is low (overnight) or explicitly by a refresh command. Others create/
remove a materialised view transparently as a query op>misa>on.
Materialised views are a non-standard extension.
28
Updateable views
Views are normally read-only, used to retrieve data. One could consider updateable views -
views that allow inserts, deletes and updates directly on the view.
Updatable views usually dont make sense e.g. dele>ng a tuple in actorgenre for example.
Even in simple cases, for example, inser>ng a tuple into comedies, there is the issue of missing
aQributes - aQributes in the underlying rela>ons that are not in the rela>on. In this case we
could set them to null or their default value.
SQL has a complex set of rules for dening an updateable view including:
1.
2.
3.
4.
5.
No subqueries.
29
Indexes
When rela>ons have many tuples it can be very slow to scan the rela>on tupleby-tuple in order to sa>sfy a query.
Example:
If there were 10,000 movies then reading and tes>ng all 10,000 tuples might be a
liQle slow. Imagine if there were 100 million tuples in the rela>on.
Indexes are copies of an aQributes data that are automa>cally maintained by the
RDBMS but can be searched very very quickly. For the example, we could create
an index on both year and genre together, or separate indexes on one or both
which might be more exible:
create index yearindex on movies(year);
create index genreindex on movies(genre);
Like materialised views, there is a tradeo between the space needed for indexes
and the cost of maintaining them vs the greater speed of access to the indexed
data.
30