DATABASES
Lecturer Guide
Modification History
Version    Date         Revision Description
V1.0       June 2011    For Release
DB Lecturer Guide V1.0
CONTENTS
1.
2.
3. Syllabus
4.
5. Resources
6. Pedagogic Approach
6.1 Lectures
6.2 Tutorials
6.3 Laboratory Sessions
6.4 Private Study
7. Assessment
8.
Topic 1: Introduction to the Module and Database Fundamentals
1.1 Learning Objectives
1.2 Pedagogic Approach
1.3 Timings
1.4 Lecture Notes
1.5 Laboratory Sessions
1.6 Private Study
1.7 Tutorial Notes
Topic 2: Databases and Database Management Systems (DBMS)
2.1 Learning Objectives
2.2 Pedagogic Approach
2.3 Timings
2.4 Lecture Notes
2.5 Laboratory Sessions
2.6 Private Study
2.7 Tutorial Notes
Topic 3:
3.1 Learning Objectives
3.2 Pedagogic Approach
3.3 Timings
3.4 Lecture Notes
3.4.1
3.5 Laboratory Sessions
3.6 Private Study
3.7 Tutorial Notes
Topic 4:
4.1 Learning Objectives
4.2 Pedagogic Approach
4.3 Timings
4.4 Lecture Notes
4.5 Laboratory Sessions
4.6 Private Study
4.7 Tutorial Notes
Topic 5: The Relational Model (1)
5.1 Learning Objectives
5.2 Pedagogic Approach
5.3 Timings
5.4 Lecture Notes
5.5 Laboratory Sessions
5.6 Private Study
5.7 Tutorial Notes
Topic 6: The Relational Model (2)
6.1 Learning Objectives
6.2 Pedagogic Approach
6.3 Timings
6.4 Lecture Notes
6.5 Laboratory Sessions
6.6 Private Study
6.7 Tutorial Notes
Topic 7: SQL (1)
7.1 Learning Objectives
7.2 Pedagogic Approach
7.3 Timings
7.4 Lecture Notes
7.5 Laboratory Sessions
7.6 Private Study
7.7 Tutorial Notes
Topic 8: SQL (2)
8.1 Learning Objectives
8.2 Pedagogic Approach
8.3 Timings
8.4 Lecture Notes
8.5 Laboratory Sessions
8.6 Private Study
8.7 Tutorial Notes
Topic 9: Database Design
9.1 Learning Objectives
9.2 Pedagogic Approach
9.3 Timings
9.4 Lecture Notes
9.5 Laboratory Sessions
9.6 Private Study
9.7 Tutorial Notes
Topic 10: Supporting Transactions
Topic 11: Database Implementation
11.2 Pedagogic Approach
11.3 Timings
11.4 Lecture Notes
11.4.1 Guidance on the Use of the Slides
11.5 Laboratory Sessions
11.5.1 Views
11.5.2 Indexes
11.5.3 Constraints
11.6 Private Study
11.7 Tutorial Notes
Topic 12: Summary
Overview
1.
This unit aims to give the learner a thorough grounding in practical techniques for the design and
development of database systems, and the theoretical frameworks that underpin them.
2.
Learning Outcomes (The Learner will:) and Assessment Criteria (The Learner can:)

1. Understand the concepts associated with database systems
1.1 Summarise the common uses of database systems
1.2 Explain the meaning of the term database
1.3 Explain the meaning of the term database management system (DBMS)
1.4 Describe the components of the DBMS environment
1.5 Describe the typical functions of a DBMS
1.6 Summarise the advantages and disadvantages of a DBMS

2. Understand the concepts associated with the relational model
2.1 Summarise the concept of the relational model
2.2 Explain the terminology associated with the relational model
2.3 Explain the purpose of relational integrity

3. Understand how to design and develop a database system

4. Be able to develop a logical database design
4.1 Identify a set of tables from an ER model
4.2 Check that the tables are capable of supporting the required transactions

5. Be able to develop a database system using SQL
5.1 Create database tables based on a data dictionary
5.2 Insert data into the tables
5.3 Update data in the tables
5.4 Delete data in the tables
3.
Syllabus

Topic No: 1
Title: Introduction to the Module and Database Fundamentals
Proportion: 1/15

Topic No: 2
Title: Databases and Database Management Systems (DBMS)
Proportion: 1/15
Hours: 2 hours of lectures; 1 hour of tutorials; 1 hour of laboratory sessions
Learning Outcome: 1

Content: Constructing ER models; Strong and weak entities; Identifying problems in ER models; Problem solving in ER models
Hours: 2 hours of lectures; 1 hour of tutorials; 1 hour of laboratory sessions
Learning Outcome: 3

Topic No: 4
Hours: 2 hours of lectures; 2 hours of tutorials; 2 hours of laboratory sessions
Learning Outcome: 3

Topic No: 5
Title: The Relational Model (1)
Proportion: 1/10
Hours: 2 hours of lectures; 2 hours of tutorials; 2 hours of laboratory sessions
Learning Outcome: 2

Topic No: 6
Title: The Relational Model (2)
Proportion: 1/10
Hours: 2 hours of lectures; 2 hours of tutorials; 2 hours of laboratory sessions
Learning Outcome: 2

Topic No: 7
Title: SQL (1)
Proportion: 1/12
Content: Understanding requirements; Identify a set of tables from an ER model; The data dictionary; Use of CASE tools; Entities to tables
Hours: 2 hours of lectures; 1 hour of tutorials; 2 hours of laboratory sessions
Learning Outcome: 3

Topic No: 8
Title: SQL (2)
Proportion: 1/12
Hours: 2 hours of lectures; 1 hour of tutorials; 2 hours of laboratory sessions
Learning Outcome: 3

Topic No: 9
Title: Database Design
Proportion: 1/10
Hours: 2 hours of lectures; 2 hours of tutorials; 2 hours of laboratory sessions
Learning Outcome: 4

Topic No: 10
Title: Supporting Transactions
Proportion: 1/10
Hours: 2 hours of lectures; 2 hours of tutorials; 2 hours of laboratory sessions
Learning Outcome: 4

Topic No: 11
Title: Database Implementation
Proportion: 1/10
Hours: 2 hours of lectures; 2 hours of tutorials; 2 hours of laboratory sessions
Learning Outcome: 5

Topic No: 12
Title: Summary
Proportion: 1/30
Content: Summary of module; Identifying links with other modules/subject areas; Clarification of module material and related issues as identified by students
Hours: 2 hours of lectures
Learning Outcomes: All
5.
Resources
Lecturer Guide:
This guide contains notes for lecturers on the organisation of each topic, and
suggested use of the resources. It also contains all of the suggested
exercises and model answers.
PowerPoint Slides:
These are presented for each topic for use in the lectures. They contain many
examples which can be used to explain the key concepts. Handout versions
of the slides are also available; it is recommended that these are distributed to
students for revision purposes as it is important that students learn to take
their own notes during lectures.
Student Guide:
This contains the topic overviews and all of the suggested exercises.
This module also makes use of SQL. You may choose which version of SQL is to be used but
students will need to have access to this during laboratory and private study time. You may wish to
consider MySQL. This is open source software which is available from
http://www.mysql.com/downloads.
6.
Pedagogic Approach
Suggested Learning Hours
Lectures: 24
Tutorial: 17
Seminar: 0
Laboratory: 19
Private Study: 90
Total: 150
The teacher-led time for this module comprises lectures, laboratory sessions and tutorials. The
breakdown of the hours is also given at the start of each topic.
6.1
Lectures
Lectures are designed to start each topic and PowerPoint slides are presented for use during these
sessions. Students should also be encouraged to be active during this time and to discuss and/or
practice the concepts covered. Lecturers should encourage active participation wherever possible.
6.2
Tutorials
These are designed to deal with the questions arising from the lectures and private study sessions.
For some topics these will be structured sessions with students engaging in tasks related to the
lecture. Other sessions will involve problem solving and trouble-shooting discussions related to the
practical work.
6.3
Laboratory Sessions
During these sessions, students are required to work through practical tutorials and various
exercises. The bulk of these tutorials will be devoted to gaining a level of mastery of
the chosen database tool and the SQL language sufficient to implement the assessment task.
Students will be introduced to SQL in the laboratory sessions and this learning will later be
augmented by lecture and tutorial sessions. The details of these are provided in this guide and also
in the Student Guide.
6.4
Private Study
In addition to the taught portion of the module, students will also be expected to undertake private
study. Exercises are provided in the Student Guide for students to complete during this time.
Teachers will need to set deadlines for the completion of this work. These should ideally be before
the tutorial session for each topic, when Private Study Exercises are usually reviewed.
7.
Assessment
This module will be assessed by means of an examination worth 75% of the total mark and an
assignment worth 25% of the total mark. These assessments will be based on the assessment
criteria given above and students will be expected to demonstrate that they have met the module's
learning outcomes. Sample assessments are available through the NCC Education Campus
(http://campus.nccedu.com) for your reference.
Assignments for this module will include topics covered up to and including Topic 7. Questions for
the examination will be drawn from the complete syllabus. Please refer to the Academic Handbook
for the programme for further details.
8.
A selection of sources of further reading around the content of this module must be available in your
Accredited Partner Centre's library. The following list provides suggestions of some suitable
sources:
Beynon-Davies, P. (2003). Database Systems, 3rd Revised Edition. Palgrave Macmillan.
ISBN-10: 1403916012
ISBN-13: 978-1403916013
Connolly, T. and Begg, C. (2009). Database Systems: A Practical Approach to Design,
Implementation, and Management, 5th Edition. Pearson Addison Wesley.
ISBN-10: 0321523067
ISBN-13: 978-0321523068
Hoffer, J., Ramesh, V. and Topi, H. (2010). Modern Database Management, 10th Edition. Pearson
Prentice Hall.
ISBN-10: 1408264315
ISBN-13: 978-1408264317
Topic 1: Introduction to the Module and Database Fundamentals
1.1
Learning Objectives
This topic provides an overview of the module syllabus and a general introduction to databases.
On completion of the topic, students will be able to:
1.2
Pedagogic Approach
Information will be transmitted to the students during the lectures. Students will be encouraged to
participate in activities during the lecture. They will then practise the skills during the tutorial session.
The laboratory sessions will introduce SQL to enable students to begin creating some database
structures that will be used in future topics.
1.3
Timings
Lectures: 2 hours
Private Study: 7.5 hours
Tutorials: 1 hour
1.4
Lecture Notes
The following is an outline of the material to be covered during the lecture time. Please also refer to
the slides.
The structure of this topic is as follows:
Slide 3:
Slide 4:
Slide 5:
Slide 6:
Slide 7:
This slide gives a brief overview of the importance of databases. Point out that they
are a relatively new technology. You may also like to ask students to give their own
explanations/definitions of what a database is. This can be helpful in gauging
students' initial knowledge levels on the topic.
Slides 8-9:
The students should now be asked to write down all the different places where they
think information might be kept about them on a database or where they have
interacted with a database. Run a class feedback session to gather feedback and
collate the results before showing students the examples on Slide 9.
Slides 10-11:
To match this to the right policy, they would have to search through one or more
databases before coming up with the insurance quote.
Sometimes this matching of person to policy is done with a sophisticated piece of
software called an Expert System. This uses sets of rules to match the person to
the policy and uses databases of the personal data, the policy data and the actual
rules themselves to do so.
Slides 12-13:
So far, the lecture has looked at many different uses of databases, but how do we
define what a database is? There are various definitions given in textbooks. This
slide presents the definition given by one of the founding fathers of modern
databases, C.J. Date: "A database is a computerised record keeping system". This
definition is acceptable as a starting point, but highlight to students that some people include
manual filing systems as being a type of database.
Slide 14:
Databases have the capacity to store, manipulate and retrieve data. We keep data
there (storage), we do things to that data via programs and applications
(manipulation) and we need to be able to get the data out of the database when we
need it (retrieval).
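If a live demonstration would help at this point, the three capacities can be shown with one statement each. The following is a minimal sketch only: the contacts table and its contents are hypothetical and are not part of the module's laboratory material, and the syntax assumes a standard SQL product such as MySQL.

```sql
-- Hypothetical table used only for this demonstration.
CREATE TABLE contacts (
    contact_no   INTEGER NOT NULL,
    contact_name VARCHAR(30),
    city         VARCHAR(30),
    PRIMARY KEY (contact_no)
);

-- Storage: a new fact is kept in the database.
INSERT INTO contacts VALUES (1, 'Hattie Smith', 'London');

-- Manipulation: the stored data is changed.
UPDATE contacts SET city = 'Cairo' WHERE contact_no = 1;

-- Retrieval: the data is read back out when it is needed.
SELECT contact_name, city FROM contacts WHERE contact_no = 1;
```

Students meet these statements properly in the laboratory sessions, so the aim here is only to make the three terms concrete.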
Slides 15-16:
Slides 17-18:
This slide makes the point that databases are not just big buckets of data. Most of
these systems are not just a mass of data that isn't in any way organised or
controlled. Thus, Date's definition is not precise enough and a more detailed
definition from Hoffer et al. is presented on Slide 18.
Slides 19-21:
The terms in this definition will be examined in more depth as the module
progresses. How data is organised and logically related will be extremely important
when students examine entity relationship modelling and the relational model later
in the module. These slides present an introduction to the importance of these
ideas. The example of a salespersons' database is given on Slide 21 but this could
be replaced by another database with which students are familiar, for example a
database holding hospital records. We say that data should be related in that
certain data belongs together in the sense that it forms a meaningful set for the
person using it.
Slide 22:
Now ask students to work in pairs or small groups and think about what qualities
there are about themselves that might become data. They should make two lists:
one is a list of their data that might be relevant to a database for a college or
university; the other list should be for a database for a social networking site.
After students have had a few minutes to discuss, run a class feedback session.
The lists should contain different sorts of data. For example a college would be
interested in their qualifications; a social networking site would be interested in what
sort of music and films they liked.
Slides 23-24:
These slides return to the question of what data is. Historically, data meant facts
that could be recorded and stored in a computer system. For example in a sales
database the data would include facts such as a customer name, address and
telephone number. This is quite simple data which consists of text. Numerical data,
such as the amount that customer spent last year, might also be stored. Today, this
definition has to be expanded to reflect a new reality since databases store objects
such as whole documents, photographic images, sound and video.
Slides 25-26:
Traditionally there has been a distinction made between data and information.
Ask students to look at the list on Slide 25 and guess what it means. Highlight that if
it is processed in a certain way, it becomes more meaningful. Information is data
that has been processed in such a way that it can increase the knowledge of the
person who uses it.
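The distinction can also be illustrated with a query, if the lecturer wishes. In the sketch below the sales table and its rows are invented for illustration (they do not come from the slides): the first SELECT shows the raw data, and the second processes it into something a manager could act on.

```sql
-- Hypothetical sales data, invented for this illustration.
CREATE TABLE sales (
    sale_no  INTEGER NOT NULL,
    customer VARCHAR(30),
    amount   DECIMAL(8,2),
    PRIMARY KEY (sale_no)
);

INSERT INTO sales VALUES (1, 'Surani', 120.00);
INSERT INTO sales VALUES (2, 'Argo', 45.50);
INSERT INTO sales VALUES (3, 'Surani', 80.00);

-- Data: the individual recorded facts.
SELECT * FROM sales;

-- Information: the same facts processed into a total per customer,
-- listed with the biggest spender first.
SELECT customer, SUM(amount) AS total_spent
FROM sales
GROUP BY customer
ORDER BY total_spent DESC;
```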
Slide 27:
This slide highlights the importance of information. We are told that we live in a
knowledge economy where information is the most vital resource. Databases
therefore are one of the key technologies to gain access to this information.
Slide 28:
Recap of learning outcomes. Ask the students to go through how they have been
covered in the lecture.
1.5
Laboratory Sessions
Exercise 1
The following SQL scripts will create two tables. Create and run them. If possible learn how to save
them in a file and run them from the SQL prompt. The mechanism for doing this will depend upon
the version of SQL you are using.
Create table departments
(dept_no integer not null,
department_name varchar(30),
location varchar(30),
primary key (dept_no));
Create table workers
(emp_no integer not null,
first_name varchar(30),
last_name varchar(30),
job_title varchar(30),
age integer,
dept_no integer,
primary key (emp_no),
foreign key (dept_no) references departments (dept_no));
Exercise 2
Examine the tables you have created. You do this using the desc <table_name> command.
Suggested Answer:
desc departments
desc workers
Exercise 3: Insert Statements
To insert data into a table, you need to use an insert statement. The structure of insert statements
is:
Insert into departments values (1,'Packing','Cairo');
Now use similar statements to insert the Accounts department in Lagos with reference number 2
and the Human Resources department in London with reference number 3.
Note: single quotation marks (' ') are used around text-based fields and are not required for numeric fields.
Suggested Answer:
Insert into departments values (2,'Accounts','Lagos');
Insert into departments values (3,'Human Resources','London');
Exercise 4
Now use insert statements to create the following workers:
Emp_no  First_name  Last_name  Job_title        Age  Dept_no
        Lawrence    Surani     Manager          56
        Jason       Argo       Manager          33
        Emily       Villa      Manager          32
        Ahmed       Mukani     Packer           23
        Joe         Todj       Packer           24
        Hattie      Smith      Accountant       56
        Sally       Boorman    Admin Assistant  34
Suggested Answer:
Insert statements structured as above with relevant data
e.g. Insert into workers values (1,'Lawrence','Surani','Manager',56,1);
Exercise 5: Looking at the Data
To see the data in your table, you need to use a select statement. The structure of select statements
is:
select <column_name> from <table_name>
To see all the columns:
select * from <table_name>
Use the select command to view the contents of your tables.
Suggested Answer:
Relevant select statement such as Select * from Workers.
1.6
Private Study
The time allocation for private study in this topic is expected to be approximately 7.5 hours.
Lecturer's Notes:
Students have copies of the private study exercises in the Student Guide. Answers are not provided
in their guide.
The Private Study exercises will be reviewed during the tutorial session so students should be
directed to complete this work in advance of that session.
Do they have databases at the moment? If so, what are they used for?
How might they break down the data they are interested in into different categories? These
might be types of people, objects for sale, courses etc.
Suggested Answer:
The answer will depend on the organisations. Students should be encouraged to make educated
guesses about aspects that they are not familiar with and cannot get access to. Students should
also be encouraged to ask questions.
With regard to the breakdown of data, students should not be expected at this stage to have come
up with definitive data structures but to demonstrate that they have thought about different types of
data.
Exercise 2
Many organisations that collect data that will eventually go into a database begin the collection
process with paper forms. Examples of this might be a passport or driving licence application, a job
application or an application to join a library.
Collect some examples of such paper forms and turn each into a list of the data that could then be
input into a database.
Suggested Answer:
Once again, it is not expected that students should come up with fully-fledged data structures. The
aim here is to encourage them to think about different sources of data, data types and the way data
is organised.
Exercise 3
Revise the topics that were discussed in the lecture. Ensure that you understand examples of
databases in use, the definitions of databases that were given, types of data, and the difference
between data and information.
If anything remains unclear after you have revised the topic, make a list of your questions and bring
it to the tutorial session.
1.7
Tutorial Notes
Exercise 1:
In small groups, discuss your findings to Private Study Exercises 1 and 2. You should collate your
findings and report back to the class.
Exercise 2:
Topic 2: Databases and Database Management Systems (DBMS)
2.1
Learning Objectives
2.2
Pedagogic Approach
Information will be transmitted to the students during the lectures. They will investigate some of the
topics further in private study and feed their results back to the tutorial. The laboratory session will
be exercises in the SQL language.
2.3
Timings
Lectures: 2 hours
Private Study: 7.5 hours
Tutorials: 1 hour
2.4
Lecture Notes
The following is an outline of the material to be covered during the lecture time. Please also refer to
the slides.
The structure of this topic is as follows:
Recap of data
Introducing the concept of metadata
Pre-database file processing systems
Structure of a database system
Common applications
DBMS architecture and functions
DBMS advantages and disadvantages
Commercial products
Data models
Scope and Coverage and Learning Outcomes - Go through the topics that will be
covered in this lecture. Recap some of the key concepts covered in the first lecture.
The main features of database systems will be described in more detail. Pre-database
information management tools (file processing systems) will be examined
in order to point out the different approach of database systems and the benefits
and advantages of database systems.
Slides 4-5:
A recap on the previous definition of data is given here: data is the raw facts stored
in a computer system. In a relational database, data is stored in the form of tables
with columns and rows, and each row represents an instance of data. This is
sometimes referred to using the term tuple. Each column represents a different
attribute of that data. MS Access and Oracle are both examples of relational
database systems and are demonstrated here as examples of how their table
structures are shown. Note that in the Oracle slide, the data in the table has been
retrieved using a SELECT statement of the sort the students will be using in the
laboratory sessions.
Slides 6-9:
Metadata is data that describes the structure of other data. The structure of the
tables in a relational database is kept within the database itself in the form of
metadata. This defines the name of the table, the name of the column, the length of
the column and the data-type. Here, the students could be asked about their
understanding of data-types, e.g. what data-types are available? It should be
pointed out that different implementations of relational databases from different
vendors might have slightly different names for the same data type; although there
are standards. These slides show examples of how metadata is held in MS Access
and Oracle. Metadata is important because it is how the database keeps a record of
the structure of the data that is stored in it; the keeping of metadata means that
within the database itself, there is a record of the structure of the tables and
attributes. This enables a database to realise one of the advantages of the
database approach, namely, program-data independence. This will be discussed in
more detail below. The collection of metadata in a database is known as the data
dictionary.
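If the chosen product allows it, the data dictionary can be queried like any other table, which makes the point that metadata is itself data. The sketch below assumes a DBMS that provides the standard INFORMATION_SCHEMA views (MySQL does) and assumes that the workers table from the Topic 1 laboratory session already exists. Oracle exposes the same kind of detail through its own dictionary views, such as USER_TAB_COLUMNS, so the query would need adapting there.

```sql
-- List the metadata held about the workers table:
-- one row per column, giving its name, data type, length and nullability.
-- Assumes the standard INFORMATION_SCHEMA views are available.
SELECT column_name,
       data_type,
       character_maximum_length,
       is_nullable
FROM information_schema.columns
WHERE table_name = 'workers'
ORDER BY ordinal_position;
```

Students can compare the result with the output of the desc command they used in the laboratory session.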
Slide 10:
Activity - The students should be asked to define metadata for the systems
mentioned in the slide. The important point to get across here is that they
understand the distinction between data and metadata; this exercise will test the
learning of the distinction between these concepts.
Slides 11-12:
These slides examine a two-file processing system. The example used is of a car
rental system. One system processes CUSTOMER data, and the other processes
RENTAL data. Each of the files and the applications that use them are totally
separate. Although this is an improvement over older manual systems, there are a
number of problems here:
Data duplication:
For example, a customer's name, address and phone number might be stored
many times over: once in the customers file and once again every time they
make a rental (therefore possibly many times in the rental file).
This wastes space and also raises the more serious problem of compromising
data integrity. Data integrity refers to data being logically consistent. For
example, if a customer changes his or her name or address, then all the files
containing that data must be updated, but the danger with duplication is that this
doesn't happen. The address might be changed in one file and not in another,
which would lead to difficulties in knowing which is the correct address.
Incompatible files:
Due to this application program dependency, files that can be processed by one
programming language will be different to those processed by another. This
makes files difficult to combine, which reinforces the isolation and separation of
data that we discussed earlier.
which we have already noted is difficult, just to make the data appear natural to
the users.
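The data duplication problem described above can be simulated in SQL for students who find it abstract. This is a sketch under stated assumptions: the two tables stand in for the separate CUSTOMER and RENTAL files, and their names, columns and rows are hypothetical rather than taken from the slides.

```sql
-- Two hypothetical tables standing in for the separate files.
-- Each one holds its own copy of the customer's address.
CREATE TABLE customer_file (
    customer_name VARCHAR(30),
    address       VARCHAR(60)
);

CREATE TABLE rental_file (
    rental_no     INTEGER,
    customer_name VARCHAR(30),
    address       VARCHAR(60),
    car           VARCHAR(30)
);

INSERT INTO customer_file VALUES ('H. Smith', '1 Old Road');
INSERT INTO rental_file VALUES (1, 'H. Smith', '1 Old Road', 'Saloon');
INSERT INTO rental_file VALUES (2, 'H. Smith', '1 Old Road', 'Estate');

-- The customer moves, but only one of the files is updated.
UPDATE customer_file
SET address = '9 New Street'
WHERE customer_name = 'H. Smith';

-- The copies now disagree: every row returned is an integrity problem.
SELECT c.customer_name,
       c.address AS customer_file_address,
       r.address AS rental_file_address
FROM customer_file c
JOIN rental_file r ON r.customer_name = c.customer_name
WHERE r.address <> c.address;
```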
Slide 13:
It should be pointed out that the most obvious feature of the diagram of the Basic
Structure of a database system is that now the data is stored in one place. All the
various applications will access it via the Database Management System.
Slide 14:
Slide 15:
The features of the database approach overcome the problems discussed with
regard to file-processing systems. The following points should be discussed:
Integrated data:
In a database system, all the application data is stored in a single facility called
a database. An application program can access customer information and rental
information easily. The program can specify how to combine the data and the
DBMS will do it.
Program/Data independence
Since the record formats are stored in the database itself (as metadata), then
we don't need to include file information in our application programs.
requirements and pre-conceptions, is a major topic of study in systems analysis,
but we should always bear in mind that what we are concerned with is the
people's perceptions and understanding.
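The integrated data point can be shown with a short example in which the program states how the data should be combined and the DBMS does the work. The customers and rentals tables below are hypothetical, loosely modelled on the car rental example, and the syntax assumes a standard SQL product such as MySQL.

```sql
-- Hypothetical integrated design: the address is stored once only.
CREATE TABLE customers (
    customer_no   INTEGER NOT NULL,
    customer_name VARCHAR(30),
    address       VARCHAR(60),
    PRIMARY KEY (customer_no)
);

CREATE TABLE rentals (
    rental_no   INTEGER NOT NULL,
    customer_no INTEGER,
    car         VARCHAR(30),
    PRIMARY KEY (rental_no),
    FOREIGN KEY (customer_no) REFERENCES customers (customer_no)
);

INSERT INTO customers VALUES (1, 'H. Smith', '9 New Street');
INSERT INTO rentals VALUES (1, 1, 'Saloon');
INSERT INTO rentals VALUES (2, 1, 'Estate');

-- The application specifies the combination; the DBMS carries it out.
SELECT r.rental_no, r.car, c.customer_name, c.address
FROM rentals r
JOIN customers c ON c.customer_no = r.customer_no;
```

Because the address is held in one place, a change of address is a single update and every rental picks it up automatically.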
Slide 16:
Slides 17-18:
Slide 19:
The students should be asked what their understanding of an application is. They
can be encouraged to think about how they have interacted with databases. An
overview of the various types of applications is given:
Batch Processes - Programs that perform an activity on the database in one go,
such as multiple updates. If, for example, someone wanted to delete all the
customers from a system who hadn't bought anything for over a year, a
batch process could do this in one go without an end-user having to go
through the records one by one.
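The batch deletion described above could look something like the following. It is a sketch only: the shop_customers table, its rows and the cut-off date are all hypothetical, and the DATE literal syntax is an assumption that holds for MySQL and Oracle but not for every product.

```sql
-- Hypothetical customer table recording the date of the last purchase.
CREATE TABLE shop_customers (
    customer_no   INTEGER NOT NULL,
    customer_name VARCHAR(30),
    last_purchase DATE,
    PRIMARY KEY (customer_no)
);

INSERT INTO shop_customers VALUES (1, 'Villa', DATE '2011-03-14');
INSERT INTO shop_customers VALUES (2, 'Mukani', DATE '2009-11-02');
INSERT INTO shop_customers VALUES (3, 'Todj', DATE '2008-01-20');

-- One statement removes every customer whose last purchase was before
-- the cut-off date (here, one year before an assumed run date of 1 June 2011).
DELETE FROM shop_customers
WHERE last_purchase < DATE '2010-06-01';

-- Only customers with a purchase on or after the cut-off remain.
SELECT * FROM shop_customers;
```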
Slide 14 showed how the Database Management System (DBMS) sat as the
intermediary between the applications and the database itself. The DBMS is the
piece of software that handles all the interactions between applications and the
database. Paul Beynon-Davies provides a useful way of looking at the structure of
the DBMS itself.
Kernel - Central engine, which operates most of the core data management
functions such as those discussed below
Toolkit - The tools and applications that interact with the end-users. There is a
vast range of these available now. These might be provided as part of the
DBMS product or as a separate piece of software.
Interface: handles the interaction between the toolkit and the kernel
The standard functions that the DBMS performs should be outlined. Most of these
will be performed by the kernel. It should be noted that some of these topics will be
covered in more detail later in the module.
CRUD - Stands for Create, Retrieve, Update and Delete. The basic interactions
between applications and the data.
Data Dictionary - The repository for the metadata should be supported. Not only
the structure of tables, but also the primary keys, relationships between tables
etc.
Concurrency Control - The ability for many users to perform transactions at the
same time.
Slide 20:
Slides 21-23:
Data Integrity - Making sure that the database data reflects accurately the
model of the world that data is being kept about. This involves the use of
integrity constraints, such as enforcing that the values of an attribute are valid
values.
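An integrity constraint of the kind mentioned in the last point can be demonstrated with a CHECK clause. The staff table and the age range below are hypothetical. Note also the assumption about the product: most DBMSs enforce CHECK constraints, but MySQL versions before 8.0.16 accept the syntax without enforcing it, so the demonstration should be tried in advance on the version in use.

```sql
-- Hypothetical table with two integrity constraints:
-- last_name must be present, and age must fall in a sensible range.
CREATE TABLE staff (
    emp_no    INTEGER NOT NULL,
    last_name VARCHAR(30) NOT NULL,
    age       INTEGER,
    PRIMARY KEY (emp_no),
    CONSTRAINT age_is_valid CHECK (age BETWEEN 16 AND 100)
);

-- Accepted: the age is a valid value.
INSERT INTO staff VALUES (1, 'Surani', 56);

-- Rejected with an error where CHECK is enforced: 230 is not a valid age.
INSERT INTO staff VALUES (2, 'Argo', 230);
```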
The interface should provide the following languages to support DBMS functions:
These slides list the advantages and disadvantages of the DBMS. Rather than just
going through this list, the lecturer could use each slide in turn as a set of headings
from which the students should discuss what they think these points mean and how
they relate to the content of the lecture so far.
Some concepts might need to be explained at the outset:
Integrity - This refers to the consistency of data. This can be enforced with the
use of constraints.
A full discussion of these points can be found in Connolly and Begg chapter 1.
Slides 24-25:
These slides look at commercial products. The market share of the main vendors is
shown. Some points can be mentioned with regard to the three main vendors. The
students will be investigating this further as part of their private study.
Oracle - The biggest database company which, for a long time, held the largest market
share. It has a mature product and support network. It is favoured by many
professional organisations.
Microsoft SQL Server - This fully integrates with other Microsoft tools.
Slide 26:
It should be noted at this point that the lecture has focused on the relational model.
This is the most widely used data model for databases. It is the model that will be
the focus of this module. It is not, however, the most up-to-date model, in the
sense that there are newer data models available for databases. The order of the
bullet points here is more or less a chronological one.
As part of their private study, the students will be investigating the various different
data models that have existed.
Lecturer's Notes:
Please note that the Private Study exercise for this topic requires organisation in order to ensure
suitable topic coverage (see Section 2.6 below). You should ensure that students know which topic
they have been assigned following the lecture session(s).
2.5
Laboratory Sessions
Exercise 1
Select the emp_no, first_name and last_name from the workers table.
Suggested Solution:
Select emp_no, first_name, last_name from workers;
Exercise 2
Select the emp_no, first_name and last_name from the workers table for all the workers in
department no 1.
Suggested Solution:
Select emp_no, first_name, last_name from workers
Where dept_no = 1;
Exercise 3
Select the first_name, last_name and job_title for all the managers in the workers table.
Suggested Solution:
Select first_name, last_name, job_title from workers
Where job_title = 'Manager';
Exercise 4
Select the first_name and last_name for all the workers whose first names start with the letter J.
Suggested Solution:
Select first_name, last_name from workers
where first_name like 'J%';
Exercise 5
Select all the columns from the workers table for workers over the age of 50.
Suggested Solution:
Select * from workers
Where age > 50;
Exercise 6
Select the emp_no, first_name and last_name for all the managers who are under the age of 40.
Suggested Solution:
Select emp_no, first_name, last_name
from workers
where age < 40
and job_title = 'Manager';
Exercise 7
Select the name and location of all the departments.
Suggested Solution:
Select dept_name, location from departments;
Exercise 8
Select all the columns for the department located in Cairo.
Suggested Solution:
Select * from departments where location = 'Cairo';
2.6
Private Study
The time allocation for private study in this topic is expected to be 7 hours.
Lecturer's Notes:
Students have copies of the private study exercises in the Student Guide. Answers are not provided
in their guide. Students are also expected to use some of their private study time to review the
content of this topic and to conduct any extra reading required to improve their understanding.
You will need to allocate the topics to the students to ensure that there is suitable coverage of each
one. Students should write individual reports and ideally half of the class will cover each topic.
Exercise 1
Write a report on one of the following topics. Your report should be 600-900 words in length and you
should be prepared to discuss your report in the tutorial session. Your lecturer should allocate the
topics to ensure the different content is covered evenly.
Prepare a report on three of the alternative data model approaches that have been used
for database systems: network, hierarchical, relational, object-oriented, deductive and
post-relational. Discuss the history, structure and uses of each of your chosen models.
These reports will be used as the basis for the classroom discussion in the tutorial session.
Suggested Answer:
The report should be about 600 to 900 words with a third given to each of the chosen systems or
data models. Students should be encouraged to carry out research online and use the results of this
to produce a document that is in their own words.
2.7
Tutorial Notes
Exercise 1
Work in a small group with other students who have written a report on the same topic during
private study time.
Discuss the information you have found. You should take the opportunity to add any additional
information to your own notes.
Now prepare to present your information to students who have worked on the other report. You
should work together as a group to prepare a short (5 minutes), informal presentation which will give
the other students a summary of the main information you have found.
Exercise 2
Join together with another small group who have worked on the other report topic.
Work with your group to present your information to students from the other group. You should also
answer any questions they might have.
Now listen to their presentation and take notes.
Topic 3: Entity Relationship (ER) Modelling (I)
3.1
Learning Objectives
3.2
Pedagogic Approach
Information will be transmitted to the students during the lectures. They will then practise the skills
during the tutorial and seminar sessions.
3.3
Timings
Lectures: 2 hours
Private Study: 7.5 hours
Tutorials: 1 hour
3.4
Lecture Notes
The following is an outline of the material to be covered during the lecture time. Please also refer to
the slides.
The structure of this topic is as follows:
Scope and Learning Outcomes. It should be mentioned that this topic will be
covered over 2 lectures and the scope and learning outcomes presented here are
those that relate to this topic's lecture.
Slide 4:
Point out that the ERM technique is used to specify database structures among
professionals within the database industry. It is also used to communicate data models
to non-specialist users, such as clients sponsoring database projects.
Slides 5-7:
The students should be made aware that there are different notations available.
For this unit, the UML notation will be used, but they might come across other
notations in textbooks. The notations are more or less equivalent in terms of what
they can express.
Slides 8-9:
These slides present definitions of an Entity Type and Entity Occurrence. These are
best explored through the use of examples.
Slide 10:
The students should be encouraged to engage in this activity to think about what
entity types might exist in a library. Allow some time for them to do this either
individually or in groups.
Slides 11-12:
Allow students to make notes before going through the likely answers, to illustrate
the difference between entity type and entity occurrence.
Slide 13:
This slide explains UML standard notation for entity type. Note that attributes (more
of which below) will sometimes be listed on the diagram and this will be shown in
more detail in the second ER lecture.
Slides 14-15:
Slide 16:
For these entities identified as part of a Library System, specify the relationships
between them by connecting them with a line.
Slides 17-18:
Discuss with students their own solutions, which might be different. Ask what they
understand as the nature of the relationships they have specified.
Slide 19:
Relationship Names.
Slides 20-23:
Slide 24:
As part of self-study, the students will be asked to specify the multiplicity of the
remaining entities on the Library System diagram. BOOK to LOAN; LOAN to
BORROWER.
Slide 25:
Slide 26:
Slide 27:
Identify attributes for the entity types of the library system. Allow some time for this
and discuss the findings with students.
Slides 28-30:
3.5
Laboratory Sessions
6. Using the COUNT function and joining the two tables, count how many workers there are in
Lagos.
Suggested Solutions:
1. Select d.department_name, w.first_name, w.last_name, w.job_title
from departments d, workers w
where d.dept_no = w.dept_no;
2. Select d.department_name, d.location, w.first_name, w.last_name, w.job_title
from departments d, workers w
where d.dept_no = w.dept_no;
3. Select d.department_name, w.first_name, w.last_name, w.job_title
from departments d, workers w
where d.dept_no = w.dept_no
and d.department_name = 'Packing';
4. Select d.department_name, d.location, w.first_name, w.last_name, w.job_title
from departments d, workers w
where d.dept_no = w.dept_no
and d.location = 'Cairo';
5. Select w.job_title, w.age, d.location
from departments d, workers w
where d.dept_no = w.dept_no
and d.location = 'London';
6. Select Count(*)
from departments d, workers w
where d.dept_no = w.dept_no
and d.location = 'Lagos';
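The suggested solutions join the two tables by listing both in the FROM clause and matching dept_no in the WHERE clause. As a sketch of an equivalent form, and assuming the DBMS in use supports the JOIN ... ON syntax, solution 6 could also be written as shown here. The alias worker_count is illustrative only and does not come from the exercise.

```sql
-- Count the workers in Lagos using an explicit inner join.
-- Table and column names are those used in the suggested solutions.
SELECT COUNT(*) AS worker_count
FROM departments d
JOIN workers w ON d.dept_no = w.dept_no
WHERE d.location = 'Lagos';
```

Both forms should return the same result; the explicit join keeps the join condition separate from the filter on location, which some students find easier to read.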
Exercise 3
The ORDER BY clause is a way of specifying the order in which you want your selected data to
appear. For example, to retrieve the emp_no and first_name of workers, we could order by emp_no
or order by first_name:
To order by emp_no:
Select emp_no, first_name
From workers
Order by emp_no;
To order by first_name:
Select emp_no, first_name
From workers
Order by first_name;
1. Select the department_name from departments, the first_name, last_name and age from
workers. Order by the age.
2. Select the first and last names of the workers who work in Cairo and order them by their age.
Suggested Solutions:
1. Select d.department_name, w.first_name, w.last_name, w.age
from departments d, workers w
where d.dept_no = w.dept_no
order by w.age;
2. Select w.first_name, w.last_name
from departments d, workers w
where d.dept_no = w.dept_no
and d.location = 'Cairo'
order by w.age;
3.6
Private Study
The time allocation for private study in this topic is expected to be 7.5 hours.
Lecturer's Notes:
Students have copies of the private study exercises in the Student Guide. Answers are not provided
in their guide.
Exercise 1
In the Private Study session for Topic 1, you were asked to collect paper data input forms from an
organisation, such as a library or government department or any other organisation.
Examine these forms and specify what entities and attributes might be needed in a database to
capture the data that they collect.
Bring both the paper forms and your analysis to the tutorial for discussion.
Suggested Answer:
The answer will depend on the material brought by the student. However, students should be
looking for field entries for attributes and thinking how these might be grouped under certain entities.
Entities might be identifiable as section headings within forms, such as personal information,
qualifications etc.
Exercise 2
Examine the library system diagram on Slide 24. Identify the missing multiplicities for Book to Loan
and Loan to Borrower.
Suggested Answer:
A Loan is for 1 Book and for 1 Borrower, so if someone took out more than one Book, it would be set
up as a number of different Loans. A Borrower might have zero or more Loans. A Book might have
zero or more Loans.
Exercise 3
Revise the topics studied in the lecture. Make notes on the following topics and make sure you
understand the concepts:
Entity Type
Entity Occurrence
Relationship Type
Relationship Occurrence
Attributes
Multiplicity
Suggested Answer:
The students should make some short notes on each of the topics and demonstrate their
understanding of them in the tutorial. They should be encouraged to raise aspects of the topics that
they find difficult or do not understand.
3.7
Tutorial Notes
Exercise 1:
In small groups, discuss your findings to Private Study Exercises 1 and 2. You should collate your
findings and report back to class.
Exercise 2
Use this time to raise questions regarding the material. In small groups, discuss the concepts listed
below and report your findings back to the class.
Do you understand the following concepts?
Entity Type
Entity Occurrence
Relationship Type
Relationship Occurrence
Attributes
Multiplicity
Topic 4
4.1
Learning Objectives
4.2
Pedagogic Approach
Information will be transmitted to the students during the lectures. They will then practise the skills
during the tutorial and seminar sessions.
4.3
Timings
Lectures: 2 hours
Private Study: 7.5 hours
Tutorials: 2 hours
4.4
Lecture Notes
The following is an outline of the material to be covered during the lecture time. Please also refer to
the slides.
The structure of this topic is as follows:
Identifying entities
Primary and Foreign keys
Chasm and Fan traps
Scope and Learning Outcomes. It should be noted that this is the second lecture
about ER modelling, which expands on topics covered in the first lecture. There is a
focus in this topic on practical application of the model to example scenarios.
Slides 4-8:
Slide 9:
Primary Key - Attributes were introduced in the last topic's lecture; they are qualities
of an entity. This is an example of attributes of the entity Client. Each entity will
have one or more attributes that uniquely identify an entity instance; in this case, it
is ClientNo. No two client numbers will be the same.
Slide 10:
Entities are linked by foreign keys. A foreign key is the copy of an attribute from
another entity. The attribute copied across links the two entities. It is usually (though
not always) the primary key that is copied across. In the example here, a Client has
a Preference for a type of room and a maximum rent. We know whose Preference it
is, because the clientNo is a Foreign Key that has been copied from Client and links
the two entities.
Slide 11:
Slide 12:
Slides 13-15:
These slides show how instances of data are represented on the entities with the
use of foreign keys. Note that the Primary Key of the Student on Module is the
compound primary key composed of the two attributes Module Code and Student
Code. Ask the students whether this would really provide a unique identifier for
each instance of Student on Module. What would happen in a situation where
students were allowed to resit a Module? In that case, there would be duplicate
values in the primary key on Student on Module, which is not allowed. Therefore,
we would need to introduce another attribute as part of the primary key on Student
on Module, for example a date or semester and year.
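The resit discussion can be illustrated with a short sketch. The table names, column names and data types here are hypothetical and are not taken from the slides; the point is only that adding semester and year to the compound primary key allows the same student to appear on the same module more than once.

```sql
-- Hypothetical names and types, for illustration only.
CREATE TABLE student (
    student_code VARCHAR(10) PRIMARY KEY
);
CREATE TABLE module (
    module_code VARCHAR(10) PRIMARY KEY
);
CREATE TABLE student_on_module (
    module_code   VARCHAR(10) NOT NULL,
    student_code  VARCHAR(10) NOT NULL,
    semester      INTEGER NOT NULL,
    academic_year INTEGER NOT NULL,
    -- The compound key now includes when the module was taken.
    PRIMARY KEY (module_code, student_code, semester, academic_year),
    FOREIGN KEY (module_code) REFERENCES module (module_code),
    FOREIGN KEY (student_code) REFERENCES student (student_code)
);
-- A resit becomes a second row that differs only in semester or year.
INSERT INTO student VALUES ('S1');
INSERT INTO module VALUES ('DB101');
INSERT INTO student_on_module VALUES ('DB101', 'S1', 1, 2011);
INSERT INTO student_on_module VALUES ('DB101', 'S1', 2, 2011);
```

With a primary key of only module_code and student_code, the second insert into student_on_module would be rejected as a duplicate.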
Slides 16-17:
Slide 18:
Primary keys on weak entity types will normally include the primary key of the entity
on which they depend (which exists there as a foreign key) as part of their own primary key.
Slides 19-20:
Fan Traps - These slides illustrate that where there is a structure similar to that in
Slide 19, there is a potential problem as shown by the example data in Slide 20. We
have a Campus entity that shows a number of Staff and a number of Departments
as belonging to a particular Campus. However, there is no link between Department
and Staff, and therefore, where we have a Campus that has more than one
Department, we will not always know the Department in which a member of Staff
works.
Slides 21-22:
The solution is to adopt a different structure where we have a Campus having one
or more Departments, which in turn have one or more Staff. It should be clear from
the example data that this solves the problem.
Slides 23-24:
Chasm Traps - These occur where there are relationships between entities, but one
of the relationships is non-mandatory, i.e. there does not HAVE to be an instance of
this relationship. This is shown in the example data in Slide 23; here a Branch has
many Staff members who manage Properties, but not every Property must be
managed by someone. Therefore, because Hill House is not managed by a
member of Staff, we do not know from which Branch that Property is managed.
Slides 25-26:
The solution is to change the structure and represent both relationships. This is
illustrated by the data shown.
Slides 27-29:
Students should be given a fairly generous amount of time to attempt the activity,
either individually or in pairs, during the lecture. The solution should then be
discussed and used as an example of the techniques of entity discovery, setting up
relationships and multiplicity that have been discussed over the two lectures. Note
that the self-study and tutorial exercises will involve more example ER diagrams.
4.5
Laboratory Sessions
Exercise 1:
Inserting Data
1. Insert another new department Public Relations. Its dept_no will be 5. Its location is Madrid.
2. Insert another new department Research and Development. Its dept_no is 6. Its location is
unknown at the moment.
3. A new employee has joined Research and Development. Their first_name is Jonas, their
last_name is McKenzie. They are 38 years old. Their emp_no is 8. Currently their job_title
has not been decided. Insert data into the workers table for them.
IMPORTANT: Make sure you commit your work by writing the command: commit;
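As a sketch only, the two department inserts might look like the following. It assumes the departments table has the columns dept_no, department_name and location, as used in the join solutions for Topic 3 (an earlier exercise refers to dept_name, so check the column name in the actual table), and that the location column accepts NULL.

```sql
-- All three values are known for Public Relations.
INSERT INTO departments (dept_no, department_name, location)
VALUES (5, 'Public Relations', 'Madrid');
-- The location of Research and Development is unknown, so NULL is stored.
-- This assumes the location column allows NULLs.
INSERT INTO departments (dept_no, department_name, location)
VALUES (6, 'Research and Development', NULL);
-- Make both inserts permanent.
COMMIT;
```

Listing the column names explicitly means the statement does not depend on the order in which the columns were created.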
Suggested Solutions:
1.
2.
3. Insert into workers
(emp_no, first_name, last_name, age, dept_no)
values (8, 'Jonas', 'McKenzie', 38, 6);
Exercise 2:
Updating Data
Update workers
Set age = age + 1
Where emp_no = 1;
This will add one to the age of worker one.
Try this Update script.
1. It has now been decided where the new Research and Development department should be
located. Write an update script that sets the location of this department to Berlin.
2. Write an update script to set the job_title of the employee McKenzie to Manager.
Suggested Solutions:
1. Update departments
set location = 'Berlin'
where dept_no = 6;
2. Update workers
set job_title = 'Manager'
where emp_no = 8;
Exercise 3:
Deleting Data
The DELETE statement allows us to get rid of data from our database.
For example, if we wanted to delete all the managers from the database, we would write the
following (do NOT run this; it is an example):
Delete from workers
Where job_title = 'Manager';
1. Use the INSERT statement to insert yourself in the workers table with an ID of 10. Then write
a DELETE statement to get rid of yourself from the database.
Suggested Solution:
The insert statement will depend on the data but will be along the lines of those in Exercise 1 above.
Delete from workers
where emp_no = 10;
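Because a DELETE with the wrong WHERE clause can remove more rows than intended, students can be encouraged to preview the affected rows first. The following is a minimal sketch of that habit using the workers table from the exercise; it assumes the DBMS is not running in auto-commit mode, so that the change only becomes permanent at the COMMIT.

```sql
-- Step 1: preview the rows that the DELETE would remove.
SELECT emp_no, first_name, last_name
FROM workers
WHERE emp_no = 10;
-- Step 2: if only the intended row is listed, delete it
-- using exactly the same WHERE clause.
DELETE FROM workers
WHERE emp_no = 10;
-- Step 3: make the change permanent
-- (ROLLBACK could be used here instead to undo the delete).
COMMIT;
```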
4.6
Private Study
The time allocation for private study in this topic is expected to be 7.5 hours.
Lecturer's Notes:
Students have copies of the private study exercises in the Student Guide. Answers are not provided
in their guide.
[ER diagram: entities CUSTOMER, REGION, CATEGORY and PRODUCT LINE; multiplicities shown are 0..* and 0..N]
Exercise 2:
A boat is rented to a customer for a set period of time. Any damage to the boat is recorded for that
particular rental.
Suggested Solution:
[ER diagram: entities BOAT, RENTAL, CUSTOMER and DAMAGE; multiplicities shown are 1 and 0..*]
Exercise 3:
A Personnel Database
[ER diagram: entities EMPLOYEE, MEMBER and DEPARTMENT; multiplicities shown are 0..N]
Exercise 4:
Each author may have written one or more books. A book might have one or more authors. Each
book belongs to one category.
Suggested Solution:
[ER diagram: entities BOOK TYPE, BOOK, WRITTEN BY and AUTHOR; multiplicities shown are 0..N]
Exercise 5:
Each ticket is for one flight and one customer. A customer may book many flights. A flight has many
customers.
Suggested Solution:
[ER diagram: entities FLIGHT, TICKET and PASSENGER; multiplicities shown are 1 and 0..N]
Exercise 6:
The shop needs to keep track of rentals. A member can rent films. A film can be rented by many
members. A film can be rented by the same member more than once.
Suggested Solution:
[ER diagram: entities FILM, RENTAL and CUSTOMER; multiplicities shown are 0..N]
4.7
Tutorial Notes
Exercise 1:
Review of ER Modelling
Ask your tutor any questions you have with regard to Topics 3 and 4 on Entity Relationship Modelling.
Exercise 2:
Work in a small group and review your answers to Private Study Exercises 1-6. Discuss the
decisions you took in drawing each ER diagram.
Exercise 3
Answer the following questions in your own words:
a. Give an explanation of a fan trap using examples.
b. Give an explanation of a chasm trap using examples.
c. Why is it important to resolve many-to-many relationships into one-to-many relationships?
d. Give examples of entities, from any system or example system, which represent the
following: people, events, concepts, physical objects.
e. Define the following: simple attribute; composite attribute and single-valued attribute.
Suggested Answers:
a. A fan trap is where one-to-many relationships fan out from a central entity, so the link
between the two entities at the many end becomes unclear. For example, if we had a central
entity of Branch (of an organisation) and two entities fanning out from it: Staff and
Department, we cannot clearly link Staff with Department. The students should provide a
suitable example similar to this.
b. A chasm trap is a data model that has relationships where there is a non-mandatory link
between a parent and a child entity type. An example of this (derived from Connolly and
Begg) is where there is a Branch entity that has a one-to-many relationship with a Staff
entity. The Staff entity has a one-to-many relationship with a PropertyForRent entity. If the
relationship between Staff and PropertyForRent is non-mandatory, then the information as to
which Branch the PropertyForRent rows are related to could be lost. The students should
provide a suitable example similar to this.
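c. It is important because a foreign key is placed at the many end of a relationship. In a
many-to-many relationship both ends are the many end, so each entity would need FKs
relating to the other, which makes it unclear what the actual relationship is. Resolving it
into two one-to-many relationships through a link entity removes this ambiguity.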
d. An example of a person would be a student entity type.
An example of an event might be an appointment or lecture entity type.
An example of a concept might be a course entity type.
An example of a physical object might be a book entity type.
Other examples are, of course, acceptable.
e. Simple Attribute - composed of a single component for example 'Sex' is just one value for
any occurrence.
Composite Attribute - composed of more than one component. For example, 'Address' might
have address lines, town, post or zip code.
Single-Valued Attribute - holds a single value for an occurrence of an entity type. Again, 'Sex'
is a good example, because there will only be one value.
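Answer c can be illustrated with the Book and Author example from Private Study Exercise 4, where WRITTEN BY acts as the link entity. The following is a sketch only; the table names, column names and data types are hypothetical and are not taken from the suggested solution diagram.

```sql
-- Hypothetical names and types, for illustration only.
CREATE TABLE book (
    book_id INTEGER PRIMARY KEY,
    title   VARCHAR(100)
);
CREATE TABLE author (
    author_id   INTEGER PRIMARY KEY,
    author_name VARCHAR(100)
);
-- The link table sits at the many end of two one-to-many relationships,
-- so it is the only table that holds foreign keys.
CREATE TABLE written_by (
    book_id   INTEGER NOT NULL,
    author_id INTEGER NOT NULL,
    PRIMARY KEY (book_id, author_id),
    FOREIGN KEY (book_id) REFERENCES book (book_id),
    FOREIGN KEY (author_id) REFERENCES author (author_id)
);
```

Each row of written_by records one author of one book, so a book with three authors has three rows and neither book nor author needs a repeating foreign key column.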
Topic 5: The Relational Model (I)
5.1
Learning Objectives
5.2
Pedagogic Approach
Information will be transmitted to the students during the lectures. They will then practise the skills
during the tutorial and seminar sessions.
5.3
Timings
Lectures: 2 hours
Private Study: 7.5 hours
Tutorials: 2 hours
5.4
Lecture Notes
The following is an outline of the material to be covered during the lecture time. Please also refer to
the slides.
The structure of this topic is as follows:
Note that this is the first of two lectures on the relational model. This first topic will
serve as a background to the relational model and go through some of the
fundamental terms. The second topic will focus more on the process of
normalisation.
Slide 4:
Slide 5:
Point out that the relations in the model are NOT the same as the relationships in
ER modelling; relations should be thought of as tables. This distinction can be a
cause of confusion for students.
Slide 6:
The full advantages that databases have over previous file processing systems,
as discussed in an earlier lecture, were only realised with the coming of the
relational model. The three aspects mentioned here are key to realising those
advantages.
Slide 7:
Data independence means that access to data moves from being the realm of the
programmer to that of, ideally, the end user. The internal storage structure of the
data does not need to concern someone who wants to access the data. All they
need to know about is the structure of the relations (or tables), and the attributes (or
columns). Previously, in a language like COBOL for example, a programmer was
needed to take account of the file structure; this made accessing and changing
data much more difficult.
Slide 8:
Slide 9:
Slide 10:
System R was developed at IBM's San Jose laboratory and involved some of the
key people in the early development of databases, such as Codd and Boyce. They
used to play something called the 'Query Game' to work out how to express queries
as simply as possible. This led to the development of SQL. There were also
commercial implementations of System R and commercial spin-offs.
Slide 12:
The Peterlee Relational Test Vehicle was the first relational database to be able to
handle large volumes of data in terms of both rows and columns.
Slide 13:
The relational model has its own set of terms. These are often unfamiliar names for
concepts and structures with which we are familiar.
Slide 14:
Slide 15:
Slide 16:
Slide 17:
It should be pointed out that the sets of terms that have been used also have a
counterpart in a third set of terms (Alternative 2) that are remnants of older file
processing and pre-relational database technologies.
Slides 18-20:
Slide 21:
This shows the same data rearranged as a relation. Note that there is a lot of
repetition here; for example the name, address and course. Note also that where an
address is not known, there is no data and this column is NULL.
Slide 22:
In order to overcome the problem of repetition, the relation is split into three. This
should reduce repetition to a minimum. Only certain attributes are
repeated and these are foreign keys that link the data in one relation with the
data in another. Understanding this data should be intuitive. So if asked 'What
modules is Guy Smith taking and what are their names?' the students should all be
able to answer this by reading the data from the three different tables.
Slide 23:
Primary Keys are the attributes that uniquely identify a row in a table (a tuple in a
relation).
Slide 24:
A Foreign Key is an attribute or set of attributes within one relation that matches
an attribute (usually the primary key but not always) in another relation. A foreign
key in a relation can match an attribute in the same relation. If the attribute that it
matches is not a primary key, then it will be what is known as a candidate key (see
below). Foreign keys are the way in which relationships (of the sort that exist in ER
diagrams) are represented in the relational model and in implemented databases.
Slide 25:
The process that has been shown here is a flavour of what is known as
normalisation.
Slide 26:
Primary Key and Foreign Key have been discussed - elicit from students what
they are in order to check their understanding.
Candidate Key - A candidate key is a super key in which ALL the attributes are
necessary to uniquely identify a row. No smaller subset of the attributes that
make up the key should itself qualify as a super key.
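In SQL, a candidate key that has not been chosen as the primary key is commonly declared with a UNIQUE constraint, and a foreign key may reference it. The following sketch uses hypothetical tables and columns (staff, parking_permit, national_id) that do not appear in the slides; it assumes a DBMS that allows a foreign key to reference a UNIQUE column, which most do.

```sql
-- Hypothetical names and types, for illustration only.
CREATE TABLE staff (
    staff_no    INTEGER PRIMARY KEY,          -- chosen primary key
    national_id VARCHAR(20) NOT NULL UNIQUE,  -- another candidate key
    staff_name  VARCHAR(60)
);
CREATE TABLE parking_permit (
    permit_no   INTEGER PRIMARY KEY,
    national_id VARCHAR(20) NOT NULL,
    -- This foreign key matches a candidate key rather than the primary key.
    FOREIGN KEY (national_id) REFERENCES staff (national_id)
);
```

Either staff_no or national_id could identify a member of staff on its own, so both are candidate keys; the pair (staff_no, national_id) is a super key but not a candidate key, because neither attribute needs the other.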
5.5
Laboratory Sessions
Exercise 2
Our new table only has one attribute and no primary key. Therefore, we should modify this with the
ALTER TABLE statement as follows:
Add a column for the salary of that job_title:
ALTER TABLE job_type
Add salary FLOAT;
Note that FLOAT designates a floating point data type. Some systems also provide REAL as a floating point type.
Run this script to alter the table.
Exercise 3
We now need to add the primary key for this table. The primary key will be the job_title field.
ALTER TABLE job_type
ADD PRIMARY KEY (job_title);
Run this script.
Exercise 4
We must now enforce the fact that job_title in the workers table is now a foreign key to the job_type
table. We do this in a similar way using the ALTER TABLE statement.
ALTER TABLE workers
ADD FOREIGN KEY (job_title) REFERENCES job_type(job_title);
Run this script.
Be aware that different vendors' versions of SQL may implement these constraints slightly
differently.
Exercise 5
You will notice that the salary field is blank. Update the job_type table to set the salaries as follows:
Suggested Solution:
Update job_type
Set salary = 30000
Where job_title = 'Manager';
The other updates will be similar.
Exercise 6
You should now be confident enough to be able to create tables of your own design.
1.
Design a table that keeps your personal details. This should include your name, address and
date of birth. Create this table using SQL with an appropriate primary key.
2.
Design a table that keeps a list of your qualifications. This will have a foreign key to the table
with your personal details. Create this table using SQL with the appropriate primary and
foreign keys. You should include information about the name of the qualification, the level of
the qualification (e.g. Level 4), the name of the institution the qualification was taken at and
the final grade.
Suggested Solution:
This will depend on the student, but an example is given below:
Create table personal_details
(personal_no integer not null,
first_name varchar(30),
second_name varchar(30),
address varchar(100),
date_of_birth date,
primary key (personal_no));
Create table qualifications
(qual_no integer not null,
qual_name varchar(30),
qual_level integer,
institution varchar(30),
grade varchar(30),
personal_no integer,
primary key (qual_no),
foreign key (personal_no) references personal_details (personal_no));
5.6
Private Study
The time allocation for private study in this topic is expected to be 7.5 hours.
Lecturer's Notes:
Students have copies of the private study exercises in the Student Guide. Answers are not provided
in their guide.
Exercise 1
Look at the following table of data from a hair-care product supplier:
Customer ID | Customer Name  | Customer Products | Product Prices
C1          | Manjeet Islam  | Hair dryer        | $35
            |                | Shampoo           | $7
            |                |                   | $8
C4          | Tolu Amusia    | Hair net          |
C2          | Sid James      |                   | $7
            |                | Hair dryer        | $35
C6          | Ambereen Reeza | Clippers          |
1.
2.
3.
4.
5.
Suggested Answers:
1. There are repeating groups of values in individual cells. The table also has no name.
2. No, we cannot tell.
3. Their values are Null.
4. No, the order of the tuples/rows is not important in a relation.
5. An example is given below:
Name: Customers
Customer ID | Customer Name  | Customer Products | Product Prices
C1          | Manjeet Islam  | Hair dryer        | $35
C1          | Manjeet Islam  | Shampoo           | $7
C1          | Manjeet Islam  |                   | $8
C4          | Tolu Amusia    | Hair net          |
C2          | Sid James      |                   | $7
C2          | Sid James      | Hair dryer        | $35
C6          | Ambereen Reeza | Clippers          |
Exercise 2
Look at the single table you produced for Question 5 of Exercise 1 above. There will still be a
number of problems with it. What issues are there with duplication and the primary key?
Suggested Answer:
There is a lot of duplication now with customer information repeated and price information repeated.
Customer ID cannot be the primary key as it is duplicated across different rows and a primary key
must uniquely identify each row.
Exercise 3
Redraw the single table as three separate tables that have less duplication. You should be guided in
this by the example shown in the lecture for this topic.
Suggested Solution:
Name: Customers
Customer ID | Customer Name
C1          | Manjeet Islam
C4          | Tolu Amusia
C2          | Sid James
C6          | Ambereen Reeza
Name: Products
Product    | Price
Hair dryer | $35
Shampoo    | $7
           | $8
Hair net   |
Clippers   |
Name: Customer Products
Customer ID | Product
C1          | Hair dryer
C1          | Shampoo
C1          |
C4          | Hair net
C2          |
C2          | Hair dryer
C6          | Clippers
Note that Products might be given an ID and Customer Products would have the appropriate FK.
Exercise 4
Identify the primary and foreign keys for each of your new relations.
Suggested Answers:
Customers: the PK is Customer ID.
Products: the PK is Product or possibly a Product ID.
Customer Products: the PK is both columns. Both columns are FKs to the other tables.
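The keys identified above can be expressed in SQL along the following lines. This is a sketch under stated assumptions: the table and column names are adapted from the table headings, and the data types and lengths are guesses, since the exercise does not specify them.

```sql
-- Names adapted from the exercise headings; types are assumptions.
CREATE TABLE customers (
    customer_id   VARCHAR(5) PRIMARY KEY,
    customer_name VARCHAR(60)
);
CREATE TABLE products (
    product VARCHAR(30) PRIMARY KEY,
    price   DECIMAL(8, 2)   -- NULL where the price is not known
);
CREATE TABLE customer_products (
    customer_id VARCHAR(5) NOT NULL,
    product     VARCHAR(30) NOT NULL,
    -- The primary key is both columns together.
    PRIMARY KEY (customer_id, product),
    -- Each column is also a foreign key to its parent table.
    FOREIGN KEY (customer_id) REFERENCES customers (customer_id),
    FOREIGN KEY (product) REFERENCES products (product)
);
```

Note that the rows with a missing product name could not be stored in these tables, because a primary key column cannot be NULL. This is one reason for giving Products its own ID, as the note under the suggested solution proposes.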
Exercise 5
Review the content of this topic and conduct any further reading you need to undertake in order to
ensure that you understand the material. You should make note of anything that you still feel
requires further clarification and bring your questions to the tutorial for this topic.
5.7
Tutorial Notes
Exercise 1:
In small groups, discuss your findings to Private Study Exercises 1-4. Your tutor will then lead a
class feedback session, during which you can also raise any questions you have about the material
covered in this topic.
Exercise 2:
Questions
b.
What was System R and what was its importance in the development of the relational
model?
c.
What is meant by the term NULL and why can't a primary key contain a NULL value?
d.
e.
f.
Look at the following tables that form part of a database from a library system:
Book: BookID, BookName, AuthorID, BookTypeCode, ISBN
BookType: BookTypeCode, BookTypeDescription
Author: AuthorID, AuthorName, NationalityCode
Country: NationalityCode, CountryName
Borrower: BorrowerID, BorrowerName
Loan: BorrowerID, BookID, LoanStartDate, LoanEndDate
Identify the primary and foreign keys in the above schema.
Suggested Answers:
a. Data independence means that the internal storage structure of the data does not need to
concern someone who wants to access the data. All they need to know about is the structure
of the relations (or tables), and the attributes (or columns). Previously, in a language like
COBOL for example, a programmer was needed to take account of the file structure; this
made accessing and changing data much more difficult.
b. System R was an early relational database developed at IBM's San Jose laboratory and
involved some of the key people in the early development of databases, such as Codd and
Boyce.
It was most important as a testing ground for relational concepts and for leading to the
development of SQL.
There were also commercial implementations of System R and commercial spin-offs. Other
aspects that were investigated during the System R project include: transaction
management, concurrency control, recovery techniques, query optimisation, data security,
data integrity, human factors and user interfaces.
c. A NULL is an unknown value. It is worth re-emphasising that a NULL is NOT a blank or a
zero; it is unknown and could potentially have a value. Because of this, it cannot be part of a
primary key, as it could potentially be the same as the value of an already existing primary key.
Thus the primary key would not be unique.
d. This was specified on Slide 18 of the Topic 5 lecture slides.
e. Foreign keys are the links between relations. A foreign key must have the value of an
attribute that exists in another table (or must be null). The foreign key must relate to a
candidate key in its parent table. This is usually, but not always, the primary key.
f.
Book
BookID (PK)
BookName
AuthorID (FK)
BookTypeCode (FK)
ISBN
BookType
BookTypeCode (PK)
BookTypeDescription
Author
AuthorID (PK)
AuthorName
NationalityCode (FK)
Country
NationalityCode (PK)
CountryName
Borrower
BorrowerID (PK)
BorrowerName
Loan
BorrowerID (PK) (FK)
BookID (PK) (FK)
LoanStartDate (PK)
LoanEndDate
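The key choices in answer f can be tried out in a small script. The following is a minimal sketch only, assuming SQLite driven from Python's sqlite3 module; the column types and the sample borrower are assumptions, as the exercise gives column names but no datatypes or data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Column types are assumed for illustration; the keys follow answer f.
conn.executescript("""
CREATE TABLE Country  (NationalityCode TEXT NOT NULL PRIMARY KEY,
                       CountryName TEXT);
CREATE TABLE BookType (BookTypeCode TEXT NOT NULL PRIMARY KEY,
                       BookTypeDescription TEXT);
CREATE TABLE Author   (AuthorID TEXT NOT NULL PRIMARY KEY,
                       AuthorName TEXT,
                       NationalityCode TEXT REFERENCES Country (NationalityCode));
CREATE TABLE Book     (BookID TEXT NOT NULL PRIMARY KEY,
                       BookName TEXT,
                       AuthorID TEXT REFERENCES Author (AuthorID),
                       BookTypeCode TEXT REFERENCES BookType (BookTypeCode),
                       ISBN TEXT);
CREATE TABLE Borrower (BorrowerID TEXT NOT NULL PRIMARY KEY,
                       BorrowerName TEXT);
CREATE TABLE Loan     (BorrowerID TEXT NOT NULL REFERENCES Borrower (BorrowerID),
                       BookID TEXT NOT NULL REFERENCES Book (BookID),
                       LoanStartDate TEXT NOT NULL,
                       LoanEndDate TEXT,
                       PRIMARY KEY (BorrowerID, BookID, LoanStartDate));
""")

conn.execute("INSERT INTO Borrower VALUES ('B1', 'Sample Borrower')")

# A loan for a book that does not exist breaks the BookID foreign key.
try:
    conn.execute("INSERT INTO Loan VALUES ('B1', 'NO-SUCH-BOOK', '2011-06-01', NULL)")
    loan_rejected = False
except sqlite3.IntegrityError:
    loan_rejected = True

print(loan_rejected)
```

Including LoanStartDate in the Loan primary key lets the same borrower take out the same book more than once, which is why the suggested answer marks all three columns as PK.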
Topic 6
Topic 6: The Relational Model (2)
6.1
Learning Objectives
6.2
Pedagogic Approach
Information will be transmitted to the students during the lectures. They will then practise the skills
during the tutorial and seminar sessions.
6.3
Timings
Lectures:
2 hours
Private Study:
7.5 hours
Tutorials:
2 hours
6.4
Lecture Notes
The following is an outline of the material to be covered during the lecture time. Please also refer to
the slides.
The structure of this topic is as follows:
Relational integrity
Normalisation
Anomalies
Functional dependency
The process of normalisation
Slide 4:
Relational integrity refers to the rules that exist within the model to make
sure that the data held in its relations remains valid and consistent.
The concept of nulls was introduced in the last session, but it would be useful to
recap at this point as it is an important concept with regard to integrity rules. Nulls
represent values of an attribute that are unknown. Note that this does NOT mean
blank or zero. Since null means unknown, it is NOT possible to say that an attribute
with a value of null is equal to another attribute with a value of null.
Entity integrity - This rule is about making sure that each tuple (or row) in a relation
is unique. The rule states that no attribute of a primary key can be null.
Activity: Ask the students why an attribute that is a primary key (or part of a
primary key) cannot be null. Why would this potentially violate uniqueness?
Answer: A null value, being unknown, might be the same as the value in the
primary key of another tuple.
Referential integrity - If a foreign key exists in a relation, it must match a candidate
key in its home relation or must be null.
General constraints - Any additional rules that are set up at the request of the users
in order to satisfy their requirements. For example, in a database of voters in an
election, a rule could be set up that says all voters must be over a certain age.
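The rules on this slide can be demonstrated in a few lines. This is a minimal sketch, assuming SQLite through Python's sqlite3 module; the Voter table, its columns and the age limit of 18 are hypothetical values chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Comparing NULL with NULL gives NULL (unknown), not true.
null_equals_null = conn.execute("SELECT NULL = NULL").fetchone()[0]
# IS NULL is the test that actually works.
null_is_null = conn.execute("SELECT NULL IS NULL").fetchone()[0]

# Hypothetical table: NOT NULL on the key column enforces entity integrity,
# and CHECK implements the general constraint (minimum age assumed to be 18).
conn.execute("""
CREATE TABLE Voter (voter_id TEXT NOT NULL PRIMARY KEY,
                    age INTEGER CHECK (age >= 18))
""")


def rejected(sql: str) -> bool:
    """Return True if the database refuses the statement."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


null_key_rejected = rejected("INSERT INTO Voter VALUES (NULL, 30)")
under_age_rejected = rejected("INSERT INTO Voter VALUES ('V1', 16)")
valid_row_rejected = rejected("INSERT INTO Voter VALUES ('V2', 30)")

print(null_equals_null, null_is_null)
print(null_key_rejected, under_age_rejected, valid_row_rejected)
```

NOT NULL is written out on the key column because SQLite, unlike most products, does not always imply it from PRIMARY KEY.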
Slide 5:
Slides 6-8:
Slides 9-11:
The students should be asked to identify the candidate key. Remember the formal
definition of a candidate key is that firstly it is a superkey (an attribute or set of
attributes that uniquely identifies a tuple). Secondly, it is a superkey such that no
part of it could be a superkey on its own. Therefore it is the minimum set of
attributes that could uniquely identify a tuple. It is called a candidate key, because
it is a candidate to become a primary key.
In this example, the functional dependencies can be examined to help the students
identify a candidate key. If student ID is known, does that mean that the other two
attributes are known? The answer is no. If activity is known then the fee would be
known, but not the student ID. Knowing the fee tells us nothing about the other
attributes, so there are no functional dependencies there. What is clear is that the
candidate key must be a combination of attributes: student ID and activity.
However, if this does not feel right semantically, it is because the relation is not
fully normalised and contains anomalies.
Slide 12:
Activity: The students should be asked to think about loss of information if a tuple
is deleted from this relation.
Answer: We would lose the price of Skiing as well as the fact that student 9901
was taking skiing. This deletion anomaly is one of the anomalies that occur when
relations are not fully normalised.
There are also insert and update anomalies.
If we want to record a new activity, but no one has yet taken it, we cannot do so; we
need a student ID because the student ID is part of the primary key and therefore
cannot be null. This is an insert anomaly.
If we wanted to change the cost of swimming to 75, we would have to do it for
every tuple where someone was taking swimming. In a relational database, doing
such repetitive updates should not be necessary. It points to an update anomaly.
The problem with this relation is that it contains details about two separate facts:
who is taking an activity and how much the activity costs. The solution is to split the
relation.
Slide 13:
The anomalies in the previous section should be revisited. They have been
overcome. This is because the relations are now normalised.
The functional dependencies here are much clearer. In student activity, the two
attributes are self contained; since a student may take many activities they both
must be part of the primary key. In activity cost, there is a functional dependency
whereby activity determines cost. If we know the activity, we know the cost. The
attribute activity will be the primary key.
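The three anomalies, and the way the split removes them, can be shown side by side. This is a minimal sketch, assuming SQLite through Python's sqlite3 module; the table names, the students other than 9901, the Squash activity and all the fees are hypothetical sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- The single relation from Slides 9-11 (sample rows are invented).
CREATE TABLE Flat (student_id TEXT NOT NULL, activity TEXT NOT NULL, fee INTEGER,
                   PRIMARY KEY (student_id, activity));
INSERT INTO Flat VALUES ('9901', 'Skiing', 200),
                        ('9902', 'Swimming', 50),
                        ('9903', 'Swimming', 50);

-- The two relations from Slide 13.
CREATE TABLE ActivityCost (activity TEXT NOT NULL PRIMARY KEY, cost INTEGER);
CREATE TABLE StudentActivity (student_id TEXT NOT NULL,
                              activity TEXT NOT NULL REFERENCES ActivityCost (activity),
                              PRIMARY KEY (student_id, activity));
INSERT INTO ActivityCost VALUES ('Skiing', 200), ('Swimming', 50);
INSERT INTO StudentActivity VALUES ('9901', 'Skiing'),
                                   ('9902', 'Swimming'),
                                   ('9903', 'Swimming');
""")

# Update anomaly: the new swimming fee touches every swimmer's row in Flat,
# but only one row in ActivityCost.
flat_rows_updated = conn.execute(
    "UPDATE Flat SET fee = 75 WHERE activity = 'Swimming'").rowcount
split_rows_updated = conn.execute(
    "UPDATE ActivityCost SET cost = 75 WHERE activity = 'Swimming'").rowcount

# Deletion anomaly: removing student 9901 loses the skiing fee from Flat only.
conn.execute("DELETE FROM Flat WHERE student_id = '9901'")
conn.execute("DELETE FROM StudentActivity WHERE student_id = '9901'")
flat_skiing_fee = conn.execute(
    "SELECT fee FROM Flat WHERE activity = 'Skiing'").fetchone()
split_skiing_fee = conn.execute(
    "SELECT cost FROM ActivityCost WHERE activity = 'Skiing'").fetchone()

# Insert anomaly: a new activity with no students fits only in the split design.
try:
    conn.execute("INSERT INTO Flat VALUES (NULL, 'Squash', 40)")
    flat_accepts_new_activity = True
except sqlite3.IntegrityError:
    flat_accepts_new_activity = False
conn.execute("INSERT INTO ActivityCost VALUES ('Squash', 40)")

print(flat_rows_updated, split_rows_updated)
print(flat_skiing_fee, split_skiing_fee)
print(flat_accepts_new_activity)
```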
Slide 14:
There is a more formal process of normalisation. What follows is one example of it.
There are other approaches, although the rules for each normal form are the same.
The starting point is a paper document of the sort that is used to store data in a
manual system.
It is worth noting a number of features of this document. It is a document containing
data about students and the modules they take. There is also information about the
results. Note that each result code has a corresponding description so that P
means Pass and RE means Refer Exam etc.
Note also that this is information for one student and that for that one student, as for
all the other students in the system, there is information about more than one
module. In this example, there are seven rows of data about modules. These rows
are called, in the language of normalisation, repeating groups. For each student
there are repeating groups of information about modules, results etc.
Slide 15:
The first step is to identify which attributes belong to the repeating group. The
attributes are listed as shown. Those attributes where there is one occurrence are
annotated with a 1. Those attributes where there is a repeating group are
annotated with a 2. The 2 in this case simply means more than one. The
tentative primary key is also underlined. In this case it is student number.
Slide 16:
First Normal Form. In the column marked 1NF (for first normal form) the repeating
group information is separated out. Note the important step also: the student
number which is the primary key of the upper set of attributes, is also copied down
to become a foreign key in the lower block. Foreign keys are identified by a star.
This maintains the link between the student information and the module information.
The repeating group data is module information for a particular student. The
primary key of the lower block of attributes (the repeating group) is also identified as
being the combination of student number and module code: once again this makes
sense semantically, because this is data about modules for a particular student.
It is worth stressing that this step, identifying the primary key of the repeating group
block, is often a cause for confusion. If this is not done properly, then the remaining
steps will go wrong.
Slide 17:
For second normal form, we need to examine the functional dependencies that
exist when there is a primary key that is made up of more than one attribute.
Activity: Which block of data do we have to examine now?
Answer: The repeating group block, because it has a primary key made up of more
than one attribute.
Examine each non-key attribute in the relation and see if it is fully dependent on the
WHOLE primary key.
Examining each of the attributes in turn:
Grade Point: Yes. A grade is given for a particular student for a particular
module.
Module Title: No. Module title is dependent just on the module code.
The next step is to separate out the attributes we have identified as dependent on
only one part of the primary key. They are moved to a new relation, together with a copy
of the part of the primary key on which they depend; that part is also left behind in the
original relation, where it becomes a foreign key. In this case, we have separated out the
module information, and module code becomes the primary key of the module
information. Module code is left behind as a foreign key.
Slide 18:
Third Normal Form. The process now looks for any functional dependencies in a
relation that are not on the primary key. Go through each attribute in turn to see if it is
dependent on the primary key directly. There are two examples where this is not so
here. Course title is dependent on the course code, so while it is true that there is a
dependence on student number (in the sense that if I know a student number, I
know the course title), this dependency is what is known as transitive, i.e. through
another attribute; in this case, the course code.
The course code is separated out with the course title. Course code is also left as a
foreign key in the student block.
The second example is the result code and result. Result is transitively dependent
on the student number/module code primary key and so is separated out.
Slides 19-20:
What we are left with at third normal form is different blocks of attributes that
correspond to entities. We can now work from the bottom up and draw our entity
diagram since we know a foreign key means the many end of a relationship. As
part of private study, students will draw the full ER diagram for this set of attributes.
6.5
Laboratory Sessions
Exercise 1
Create a table called Student. The table should have the following attributes all of type varchar:
Student_id
First_name
Last_name
Gender
Student_id is the primary key for the Student table. The Student table is attached to the table
Course in a one-to-many relationship where Student is the many part of the relationship. The
primary key of the Course table is Course_id. You will also need to create the Course table.
Suggested Solution:
The point of this exercise is to ensure that students can create tables with appropriate primary and
foreign keys from a given statement about the relevant tables. The students may have different
sizes for each of the attributes. For example, Student_id could be varchar(10) because a student ID
is not likely to be more than 10 characters, etc.
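One possible shape for the two tables is sketched below. It is a minimal sketch, assuming SQLite through Python's sqlite3 module; the column sizes other than Student_id varchar(10), and the sample course and student rows, are assumptions. Because Student is the many end of the relationship, it carries Course_id as a foreign key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE Course (Course_id   varchar(10) NOT NULL PRIMARY KEY,
                     Course_name varchar(50));
CREATE TABLE Student (Student_id varchar(10) NOT NULL PRIMARY KEY,
                      First_name varchar(30),
                      Last_name  varchar(30),
                      Gender     varchar(10),
                      Course_id  varchar(10) REFERENCES Course (Course_id));
""")

# PRAGMA table_info returns one row per column; index 1 is the column name.
student_columns = [row[1] for row in conn.execute("PRAGMA table_info(Student)")]

# Sample rows (invented) to show the foreign key at work.
conn.execute("INSERT INTO Course VALUES ('C1', 'Software Engineering')")
conn.execute("INSERT INTO Student VALUES ('NCC001', 'Sam', 'Example', 'F', 'C1')")
try:
    conn.execute("INSERT INTO Student VALUES ('NCC002', 'Pat', 'Sample', 'M', 'C9')")
    unknown_course_rejected = False
except sqlite3.IntegrityError:
    unknown_course_rejected = True

print(student_columns)
print(unknown_course_rejected)
```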
Exercise 4
Delete the student with the ID NCC001 from the Student table.
Suggested Solution:
Delete from Student where Student_id = 'NCC001';
Exercise 5
1.
Using the COUNT function and joining Student and Course, count how many students there are
on the Software Engineering course.
2.
Select the first and last names of the students who are on the Database Systems course and
order them by their gender.
Suggested Solution
1.
Select Count(*)
from Student s, Course c
where s.Course_id = c.Course_id
and c.Course_name = 'Software Engineering';
2.
6.6
Private Study
The time allocation for private study in this topic is expected to be 7.5 hours.
Lecturer's Notes:
Students have copies of the private study exercises in the Student Guide. Answers are not provided
in their guide.
Exercise 1
Draw the ER diagram for the set of relations produced in Slide 20.
Suggested Answer:
[ER diagram: the entities Course, Student, Student Module, Module and Module Type, linked by one-to-many relationships marked 1 at one end and 0..* at the other.]
Exercise 2:
Normalisation
Supplier Number | Supplier's Name | Supplier's Product Ref No | Price | Main Supplier Y/N?
099 | Gibbons | WB09 | |
0100 | Jarrolds Fittings | 98383 | 3.50 |
0101 | H Drammond | B010 | 3.75 |
098 | Crambornes | Br 7 | 3.99 |
078 | Jamison | 8383 | 3.99 |
Above is a form used by a firm to keep track of the different suppliers that supply it with the same
part. Supplier's Product Ref No is the reference number given to the part by the supplier. Main
Supplier Y/N indicates whether this is the firm's preferred supplier of the part.
Using the techniques discussed in the lecture, break this document down into a set of third normal
form relations.
Suggested Answer:
UNF:
Product Number, Product Name, Product Type Code, Product Type Name, Supplier Number, Supplier Name, P/S Reference, Product Price, Main Indicator
1NF:
Product Number, Product Name, Product Type Code, Product Type Name
Product Number*, Supplier Number, Supplier Name, P/S Reference, Product Price, Main Indicator
2NF:
Product Number, Product Name, Product Type Code, Product Type Name
Product Number*, Supplier Number*, P/S Reference, Product Price, Main Indicator
Supplier Number, Supplier Name
3NF:
Product Number, Product Name, Product Type Code*
Product Type Code, Product Type Name
Product Number*, Supplier Number*, P/S Reference, Product Price, Main Indicator
Supplier Number, Supplier Name
(* marks a foreign key.)
Entities
PRODUCT
PRODUCT TYPE
SUPPLIER PRODUCT
SUPPLIER
Exercise 3
Review the content of this topic and conduct any further reading you need to undertake in order to
ensure that you understand the material. You should make note of anything that you still feel
requires further clarification and bring your questions to the tutorial for this topic.
6.7
Tutorial Notes
Exercise 1:
In small groups, discuss your findings from Private Study Exercises 1 & 2. Your tutor will then lead a
class feedback session, during which you can also raise any questions you have about the material
covered in this topic.
Exercise 2:
Questions
b. What is the purpose of normalisation? Why is it necessary to split data into separate tables?
c. Why do you think Entity Diagrams are usually referred to as a top-down approach and
normalisation as a bottom-up approach?
d.
e.
Suggested Answers:
a.
First Normal Form: A formal definition might be that it is a table where each cell contains only
one value. For our purposes, the students should be able to express that it is a table which
contains no repeating group information.
Second Normal Form: A relation in first normal form in which every non-key attribute is fully
functionally dependent on the primary key. There are no partial key dependencies.
Third Normal Form: A relation in second normal form in which no non-key attribute is
transitively dependent on the primary key.
b. The purpose of normalisation is to produce a set of relations that has the minimum amount of
duplication of data. It is also to avoid anomalies. Tables should contain data about one topic;
that is they should semantically fit that portion of the real world that the data is trying to
represent. If all the data was put into one big table, then every time something was changed
or added it would mean replicating already existing data.
c. ER diagrams approach the problem of organising data by identifying the larger clusters of
data, known as entities, that correspond to the real-world categorisation of data by an
organisation. The aim is to get an overview of the main ways in which data is grouped. The
details (the attributes, data types and so on) tend to be added later.
Normalisation starts with the attributes and works its way up, arriving at a set of entities at
the end of the process.
d. Functional dependency describes the relationship between attributes in a relation: if there is a
functional dependency between two attributes, then knowing the value of one attribute means
that the value of the other is also known. If A functionally determines B, then for each value
of A there will be exactly one value of B. For example, for each student ID there will be one
student name. But the reverse is not true: the same student name may belong to several student IDs.
e. Within normalisation, each stage will be looking for functional dependencies and applying the
rules of that stage to see if they fit. For example, second normal form is about identifying
non-key attributes that depend on only part of the primary key, while third normal form is
about identifying functional dependencies between non-key attributes.
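The definition in answer d can be turned into a simple check over sample rows. This is a minimal sketch in Python; the function name determines and the sample student data are hypothetical. Note that sample data can only disprove a functional dependency: a dependency is a rule about all possible data, so a passing check is a hint rather than a proof.

```python
from typing import Mapping, Sequence


def determines(rows: Sequence[Mapping[str, str]], a: str, b: str) -> bool:
    """Return True if, in these rows, each value of a goes with exactly one value of b."""
    seen: dict[str, str] = {}
    for row in rows:
        if row[a] in seen and seen[row[a]] != row[b]:
            return False  # the same a-value appears with two different b-values
        seen[row[a]] = row[b]
    return True


# Invented sample data: two different students happen to share a name.
students = [
    {"student_id": "S1", "student_name": "Ali"},
    {"student_id": "S2", "student_name": "Ali"},
    {"student_id": "S3", "student_name": "Maria"},
]

id_determines_name = determines(students, "student_id", "student_name")
name_determines_id = determines(students, "student_name", "student_id")

print(id_determines_name, name_determines_id)
```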
Topic 7
Topic 7: SQL (I)
7.1
Learning Objectives
7.2
Pedagogic Approach
Information will be transmitted to the students during the lectures. They will then practise the skills
during the tutorial and seminar sessions.
7.3
Timings
Lectures:
2 hours
Private Study:
7.5 hours
Tutorials:
1 hour
7.4
Lecture Notes
The following is an outline of the material to be covered during the lecture time. Please also refer to
the slides.
The structure of this topic is as follows:
The objectives of SQL are to create the database and relation structures, perform
basic tasks, such as insertions, updates and deletions of data from base relations
and perform simple and complex queries.
Slides 5-6:
DDL is for defining the database structures and controlling access to data. DML is
for retrieving and updating data.
Slides 7-8:
These slides look at the history of SQL. SQL was developed out of System R,
which was mentioned during the lecture on the relational model. It emerged during the
late 1970s and went on to become THE standard database language. The first SQL
standard was published in 1987 by ISO (International Organization for Standardization), based on
work by ANSI (American National Standards Institute). Revisions to SQL occurred as
follows:
1992 - Major revision introducing new data types like VARCHAR and some
new set operators like UNION JOIN and NATURAL JOIN.
Before the language was fully standardised in 1987, it was being developed by
different vendors. Various vendors added features known as extensions. The result
of this is what is known as the various SQL dialects.
Slide 9:
INSERT, UPDATE and DELETE - These keywords are for updating data; that is
changing it in various ways on the database.
Ask the students if they understand the difference between an update and an insert.
An update changes data that is already there, whereas an insert is to put new data
into the database.
Slide 10:
Slides 11-12:
Activity: Ask the students to look at this example of a query. What are the
functions of the keywords specified in bold here?
SELECT specifies which columns from the table are to appear in the result
FROM specifies which table or tables are to be used to get the results
WHERE specifies some condition that will restrict the rows that are retrieved
GROUP BY groups rows by some column value
HAVING restricts which groups appear in the result
ORDER BY specifies the order in which the result will appear
Mention that the full power of the SELECT construct will be looked at in greater
detail in the next lecture.
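The six keywords can be seen working together in one query. This is a minimal sketch, assuming SQLite through Python's sqlite3 module; the Staff table, its columns and its rows are hypothetical and are not the example shown on the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Staff (name TEXT, department TEXT, salary INTEGER);
INSERT INTO Staff VALUES ('Ann', 'Sales', 30000),
                         ('Bob', 'Sales', 25000),
                         ('Cat', 'Sales', 15000),
                         ('Dan', 'IT',    40000),
                         ('Eve', 'IT',    22000),
                         ('Fay', 'HR',    50000);
""")

result = conn.execute("""
SELECT department, COUNT(*)        -- columns to appear in the result
FROM Staff                         -- table to read from
WHERE salary > 20000               -- keep only rows that pass this condition
GROUP BY department                -- one group per department
HAVING COUNT(*) >= 2               -- keep only groups with at least two rows
ORDER BY department                -- sort the final result
""").fetchall()

print(result)  # [('IT', 2), ('Sales', 2)]
```

Cat is removed by WHERE before grouping, and HR is removed by HAVING after grouping because only one of its rows remains.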
Slide 13:
Database Update. Despite its name, SQL can be used not only to query the
database; it can also be used to add, change and delete data. These constructs are
usually simpler than the SELECT statement.
Slide 14:
This slide shows two examples of an insert statement. In the first, the columns are
specified. The second assumes an insert for all the columns. Note that an insert
statement is to add new data to a table.
Slide 15:
This slide shows two examples of an update statement. In the first, ALL ROWS are
updated. The second updates specific rows based on a condition. Note that an
update statement is for changing data that is already in the database.
Slide 16:
This slide shows two examples of a delete statement. In the first, ALL ROWS are
deleted. It should be pointed out that this needs to be used with care. The second
deletes specific rows based on a condition.
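The two forms of each statement described on Slides 14-16 can be compared directly. This is a minimal sketch, assuming SQLite through Python's sqlite3 module; the Product table and its values are hypothetical and are not the statements shown on the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE Product (product_id INTEGER NOT NULL PRIMARY KEY,
                      name TEXT,
                      price REAL)
""")

# INSERT with the columns specified: unnamed columns (price) are left NULL.
conn.execute("INSERT INTO Product (product_id, name) VALUES (1, 'Hair net')")
# INSERT without a column list: a value is expected for every column, in order.
conn.execute("INSERT INTO Product VALUES (2, 'Shampoo', 7.0)")

# UPDATE without WHERE changes ALL ROWS.
conn.execute("UPDATE Product SET price = 1.0")
# UPDATE with WHERE changes only the matching rows.
conn.execute("UPDATE Product SET price = 8.0 WHERE product_id = 2")
prices = conn.execute(
    "SELECT product_id, price FROM Product ORDER BY product_id").fetchall()

# DELETE with WHERE removes only the matching rows.
conn.execute("DELETE FROM Product WHERE product_id = 1")
rows_after_conditional_delete = conn.execute(
    "SELECT COUNT(*) FROM Product").fetchone()[0]

# DELETE without WHERE removes ALL ROWS, so use it with care.
conn.execute("DELETE FROM Product")
rows_after_full_delete = conn.execute(
    "SELECT COUNT(*) FROM Product").fetchone()[0]

print(prices)
print(rows_after_conditional_delete, rows_after_full_delete)
```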
Slides 17-18:
Activity: Look at the department table that was created in both the first laboratory
session and in subsequent ones. How would you write a statement to insert a new
row of data in the table? The department will be number 8, based in Glasgow and
will be the Complaints department.
Slide 19:
The commit statement is needed to actually save the changes you have made;
otherwise they will be lost.
Slide 20:
The rollback command can be used to undo an action like an insert, update or
delete that has not yet been committed.
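The effect of commit and rollback can be observed in a short script. This is a minimal sketch, assuming SQLite through Python's sqlite3 module, where the connection's commit() and rollback() methods play the role of the SQL commands; the Account table and its values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE Account (account_id INTEGER NOT NULL PRIMARY KEY,
                      balance INTEGER)
""")

conn.execute("INSERT INTO Account VALUES (1, 100)")
conn.commit()  # the insert is now saved

# An uncommitted change can be undone with rollback.
conn.execute("UPDATE Account SET balance = 0 WHERE account_id = 1")
conn.rollback()
balance_after_rollback = conn.execute(
    "SELECT balance FROM Account WHERE account_id = 1").fetchone()[0]

# Once committed, a change survives a later rollback.
conn.execute("UPDATE Account SET balance = 50 WHERE account_id = 1")
conn.commit()
conn.rollback()  # nothing left to undo
balance_after_commit = conn.execute(
    "SELECT balance FROM Account WHERE account_id = 1").fetchone()[0]

print(balance_after_rollback, balance_after_commit)
```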
Slide 21:
Datatypes are used to enforce a domain. Ask the students if they can give a
definition of a domain as discussed in the topic on the relational model. A domain is
a set of allowable values for a column or attribute. Datatypes enforce general
domains, such as whether a column takes a character or number. For more specific
domains, such as Male or Female, then SQL uses something called constraints;
this will be examined in a later topic.
It should be noted that different vendors have extended their versions of SQL to
include new datatypes.
Slides 22-23:
Numeric Datatypes. Numeric and Decimal are the same; the definition is Decimal
(M,N) with the M being the total number of digits and the N the number
of digits after the point. Decimal is abbreviated to DEC.
Integer is a number without a decimal point. It is abbreviated to INT.
Float (also called real) is a number stored with a decimal point that can move as the
number requires it.
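The meaning of M and N can be checked with a little arithmetic. This is a minimal sketch in Python, assuming the standard SQL reading in which M is the total number of digits and N of them follow the point, so that M - N digits are available before the point; the function name fits_decimal is hypothetical and individual products may round rather than reject.

```python
from decimal import Decimal


def fits_decimal(value: str, m: int, n: int) -> bool:
    """Return True if value can be stored exactly in a DECIMAL(m, n) column."""
    sign, digits, exponent = Decimal(value).normalize().as_tuple()
    if not isinstance(exponent, int):
        return False  # NaN or infinity
    if not any(digits):
        return True  # zero always fits
    digits_after_point = max(0, -exponent)
    digits_before_point = max(0, len(digits) + exponent)
    return digits_after_point <= n and digits_before_point <= m - n


fits_a = fits_decimal("123.45", 5, 2)   # 3 digits before, 2 after
fits_b = fits_decimal("1234.5", 5, 2)   # 4 digits before the point is too many
fits_c = fits_decimal("12.345", 5, 2)   # 3 digits after the point is too many

print(fits_a, fits_b, fits_c)
```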
Slide 25:
It should be noted that the representation of dates can vary depending on the
vendor and there are extensions in the various vendors' flavours of SQL. There was
much refinement in SQL systems (along with all other computer systems) in the run
up to the year 2000, because of the need to store the year in a four-digit format.
The students could be asked why this was such an issue at the time and what the
problem is with storing the year as a two-digit number. The answer is the ambiguity
between centuries, e.g. does 01-JAN-20 mean the first of January 1920 or
2020? We only know if we store the year as a four-digit number. The date format
can usually be specified using the given vendor's methods.
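The ambiguity can be shown by formatting two different dates both ways. This is a minimal sketch in Python; the helper names are hypothetical and the DD-MON-YY layout simply follows the 01-JAN-20 example above.

```python
from datetime import date

MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]


def two_digit_year_format(d: date) -> str:
    """Format as DD-MON-YY, dropping the century."""
    return f"{d.day:02d}-{MONTHS[d.month - 1]}-{d.year % 100:02d}"


def four_digit_year_format(d: date) -> str:
    """Format as DD-MON-YYYY, keeping the century."""
    return f"{d.day:02d}-{MONTHS[d.month - 1]}-{d.year:04d}"


first = date(1920, 1, 1)
second = date(2020, 1, 1)

# A century apart, yet identical once the year is cut to two digits.
short_first = two_digit_year_format(first)
short_second = two_digit_year_format(second)
long_first = four_digit_year_format(first)
long_second = four_digit_year_format(second)

print(short_first, short_second)
print(long_first, long_second)
```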
Slide 26:
Slide 27:
C.J. Date pointed out early in 1987 that SQL did not support all the features of the
relational model. Referential integrity is not supported in the sense that while we
can define primary keys and foreign keys, it is possible to create tables without a
primary key (so allowing the insert of duplicate tuples) and to have non-enforced
foreign keys.
There is no one standard and the different flavours of SQL can be confusing. SQL
has had to be extended to support new developments in database technology, such
as object-oriented features.
7.5
Laboratory Sessions
Exercise 1
In Topic 5, you should have designed and created your own personal details tables. Gather details
of at least 8 of your fellow students. You should get data for both tables. Insert the data into the
tables.
Suggested Solution:
The purpose of these exercises is for students to get used to writing queries and performing
operations on data structures of their own devising. The answers will depend on the table structure
that the students have created. The model answers here will be based on the generic solution given
in Topic 5.
Create table Personal_details
(personal_no integer not null,
first_name varchar(30),
second_name varchar(30),
primary key (personal_no));
Create table qualifications
(qual_no integer not null,
qual_name varchar(30),
qual_level integer,
institution varchar(30),
grade varchar(30),
personal_no integer,
primary key (qual_no),
foreign key (personal_no) references personal_details (personal_no));
Please note that some implementations of SQL specify the primary key in a slightly different way:
Create table Personal_details
(personal_no integer not null primary key,
first_name varchar(30),
second_name varchar(30));
Therefore the insert statements will look like this:
Exercise 5
Write a query that shows how many qualifications you have.
Suggested Solution:
Select Count(q.personal_no)
From personal_details p, qualifications q
Where q.personal_no = p.personal_no
And p.second_name = '<Student Last Name>';
The student will put their own last name in place of the angle-bracket placeholder.
Exercise 6
If there is not one already, add a column to the personal_details table to record a person's age.
Suggested Solution:
Alter table personal_details
Add age integer;
Exercise 7
Update the personal_details table with each person's age.
Suggested Solution:
This is an example.
Update personal_details
Set age = 18
Where personal_no = 1;
Exercise 8
Write a query to show all first names, last names and the level 2 qualifications for students who are
under the age of 20.
Suggested Solution:
Select p.first_name, p.second_name, q.qual_name
From personal_details p, qualifications q
Where q.personal_no = p.personal_no
And q.qual_level = 2
And p.age < 20;
Exercise 9
Create a new table called Qualification_Type using the as statement that shows all the
qualifications that exist. There should be one row for each qualification without duplications.
Suggested Solution:
Create table qualification_type as
(Select distinct qual_name
From qualifications);
Exercise 10
Add a column to the qualification_type table to show the level each qualification is at.
Suggested Solution:
Alter table qualification_type
Add qual_level integer;
Exercise 11
Update the qualification_type with the correct level for each qualification.
Suggested Solution:
The simple way to do this without nested sub-queries is to use an update statement of the basic
form:
Update qualification_type
Set qual_level = 3
Where qual_name = 'Certificate in Computing';
Exercise 12
Once the qualification_type table is updated with the level then the level can be deleted from the
qualifications table. Use the drop column script as shown below:
Alter table qualifications
Drop column qual_level;
Exercise 13
Make the qualification attribute (qual_name in the generic solution) the primary key of qualification_type.
Suggested Solution:
Alter table qualification_type
Add primary key (qual_name);
Exercise 14
Now create a foreign key between qualifications and qualification_type using the qualification
attribute.
Suggested Solution:
Alter table qualifications
Add foreign key (qual_name) references qualification_type (qual_name);
Exercise 15
Rewrite the query from Exercise 4. Show all those people who have achieved a Level 3 qualification.
You will now need to include all three tables.
Suggested Solution:
Select p.first_name, p.second_name
From personal_details p, qualifications q, qualification_type t
Where q.personal_no = p.personal_no
And q.qual_name = t.qual_name
And t.qual_level = 3;
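The end state of Exercises 9-15 can be reproduced and queried in one script. This is a minimal sketch, assuming SQLite through Python's sqlite3 module and the column names of the generic solution (second_name, qual_name); the people, institutions and the level 2 qualification are hypothetical sample data. SQLite cannot add a primary or foreign key with Alter table, so the constraints are declared when the tables are created.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE personal_details (personal_no integer NOT NULL PRIMARY KEY,
                               first_name  varchar(30),
                               second_name varchar(30));
CREATE TABLE qualification_type (qual_name  varchar(30) NOT NULL PRIMARY KEY,
                                 qual_level integer);
CREATE TABLE qualifications (qual_no     integer NOT NULL PRIMARY KEY,
                             qual_name   varchar(30)
                                 REFERENCES qualification_type (qual_name),
                             institution varchar(30),
                             grade       varchar(30),
                             personal_no integer
                                 REFERENCES personal_details (personal_no));

-- Invented sample rows.
INSERT INTO personal_details VALUES (1, 'Ada', 'Example'), (2, 'Ben', 'Sample');
INSERT INTO qualification_type VALUES ('Certificate in Computing', 3),
                                      ('Basic IT Skills', 2);
INSERT INTO qualifications VALUES
    (1, 'Certificate in Computing', 'College A', 'Pass', 1),
    (2, 'Basic IT Skills', 'College B', 'Merit', 2);
""")

# The Exercise 15 query: the level now comes from qualification_type.
level_3_people = conn.execute("""
SELECT p.first_name, p.second_name
FROM personal_details p, qualifications q, qualification_type t
WHERE q.personal_no = p.personal_no
AND q.qual_name = t.qual_name
AND t.qual_level = 3
""").fetchall()

print(level_3_people)  # [('Ada', 'Example')]
```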
7.6
Private Study
The time allocation for private study in this topic is expected to be 7.5 hours.
Lecturer's Notes:
Students have copies of the private study exercises in the Student Guide. Answers are not provided
in their guide.
Exercise 1
In a Customer Accounts System, the following tables have been created using SQL DDL
commands.
1.
2.
3.
4.
A user tried to execute the following commands in the given order to insert values into the created
tables. Find those commands that would result in the return of an error message. Explain why.
1. INSERT INTO Item Type values (2345, 'Hand Drill', 25);
2. INSERT INTO Item Type values (2344, 'Electronic Drill');
3. INSERT INTO Item Type values (2346, 'Drill Bit');
1. INSERT INTO Item Type values (2345, 'Hand Drill', 25); ERROR: TOO MANY VALUES
2. INSERT INTO Item Type values (2344, 'Electronic Drill'); OK
3. INSERT INTO Item Type values (2346, 'Drill Bit'); OK
4. INSERT INTO Item values (1010, 2344, 2344); ERROR: INCORRECT DATA TYPE, ITEM NAME IS A CHAR
5. INSERT INTO Item values (1005, 'Dulux Cordless Electronic Drill', 2344); OK
6. INSERT INTO Item values (1005, '5mm Ceramic Drill Bit', 2354); ERROR: FOREIGN KEY DOES NOT MATCH PRIMARY KEY IN TARGET TABLE
7. INSERT INTO Item values (1005, 'Standard Long Cord Electronic Drill', 2344); ERROR: INCORRECT DATA TYPE, PRIMARY KEY IS CHAR NOT A NUMBER
8. INSERT INTO Customer values (5566, 'HASNET', 'LONDON'); OK
9. INSERT INTO Customer values (5667, 'SONGARA', 'BIRMINGHAM'); OK
10. INSERT INTO Customer values (5667, 'SINGH', 'CAIRO'); ERROR: DUPLICATE VALUE FOR CUSTOMER ID
11. INSERT INTO CustomerPurchase values (1005, 5566, '03-FEB-2004', 20); OK
12. INSERT INTO CustomerPurchase values (1007, 5566, '04-FEB-2004', 40); ERROR: FOREIGN KEY DOES NOT MATCH PRIMARY KEY IN TARGET TABLE
Exercise 2
Using online resources, compare the features of any two implementations of SQL, as provided by a
vendor. For example, you could compare Oracle SQL*Plus with MySQL.
Use your own words to write your answer and make sure you include a reference list of the places
where you found the information.
Suggested Answer:
The students should present their answer to the class during the tutorial session.
The answer would depend on the products chosen. For example, if choosing Oracle as one of the
products, there should be discussion of how SQL*Plus has extended the language with nonstandard features, including the procedural extensions. Those choosing MySQL could discuss its fit
with PHP and its availability.
7.7
Tutorial Notes
Exercise 1:
In small groups, discuss your findings from Private Study Exercise 1, asking your tutor for clarification
when needed.
Exercise 2
Work in a small group. Present your findings from Private Study Exercise 2 to the other students and
answer any questions they may have.
Make notes on the findings of the other students to increase your understanding.
Your tutor will then run a whole class feedback session.
Topic 8
8.1
Learning Objectives
8.2
Pedagogic Approach
Information will be transmitted to the students during the lectures. They will then practise the skills
during the tutorial and seminar sessions.
8.3
Timings
Lectures:
2 hours
Private Study:
7.5 hours
Tutorials:
1 hour
8.4
Lecture Notes
The following is an outline of the material to be covered during the lecture time. Please also refer to
the slides.
The structure of this topic is as follows:
Creating tables
More of the select statement
Fixing errors and optimisation
The most important part of data definition that is covered in this module is creating
tables.
The following activity relating to Slide 4 can be discussed with the class. The
answers are on slide 5.
Activity: Examine the Create table statement. Where is the name of the table
defined? Where are the columns defined? What defines the datatype for each of
the columns? What defines whether a column is mandatory or not? What
defines the maximum length of each of the columns?
Slide 6:
This slide shows the specification of a primary key. Note that there are different
ways of doing this depending on the vendor's flavour of SQL.
Slide 7:
Slide 8:
There are various ways in which a table can be modified after it has been created.
List:
Drop a constraint
Drop a default
The significance of being able to do this is that changes in the database structure
will not necessarily mean that a table will have to be recreated from scratch.
Modifying an existing table leaves the data in that table intact. Dropping the table
and creating it from scratch does not, and would entail first copying the data into a
temporary table. There are limitations to altering a table. For example, adding a new
column that is NOT NULL when there are already rows in the table will typically be
rejected unless a default value is supplied, because the existing rows would have no
value for it.
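The NOT NULL limitation can be tried out with a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in DBMS and a made-up Workers table with one invented row; other products word the error differently and may behave differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("CREATE TABLE Workers (emp_no INTEGER PRIMARY KEY, age INTEGER)")
conn.execute("INSERT INTO Workers (emp_no, age) VALUES (1, 34)")  # invented row

# Adding a mandatory column with no default is rejected: the existing
# row would have no value to put in it.
try:
    conn.execute("ALTER TABLE Workers ADD COLUMN job_title VARCHAR(30) NOT NULL")
    add_without_default_failed = False
except sqlite3.Error as exc:
    add_without_default_failed = True
    print("Rejected:", exc)

# Supplying a default gives the existing row a value, so this succeeds.
conn.execute(
    "ALTER TABLE Workers ADD COLUMN job_title VARCHAR(30) NOT NULL DEFAULT 'Unassigned'"
)
rows = conn.execute("SELECT emp_no, age, job_title FROM Workers").fetchall()
print(rows)
```

The existing row keeps its data throughout, which is the advantage of altering a table rather than recreating it.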
Slide 9:
Slide 10:
Slide 11:
Data Manipulation - The select statement was introduced in its basic form in the
previous lecture. Students should be familiar with it and with other aspects of SQL
from the workshop materials. The following slides provide an overview of the key
features listed below:
Select
Order by
Aggregate functions
Group by
Sub-queries
Joins
Select - Simple retrieval uses the keywords SELECT, FROM and WHERE.
Activity: Ask the students if they know the function of each of these keywords.
Answer: Select specifies the columns, From specifies the table(s) and Where
specifies some condition that will limit the rows retrieved. This was covered in
the previous lecture.
Slide 12:
This slide shows retrieving all the columns using the star operator and a simple
retrieval specifying the columns and a where clause.
Slide 13:
Slide 14:
Ascending and Descending - The default order is ascending, but using the DESC
keyword will make it descending in this example.
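As a minimal sketch of the two orderings, the following uses Python's built-in sqlite3 module and three invented rows in a cut-down Workers table. The table contents are assumptions for illustration, not the workshop data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Workers (emp_no INTEGER PRIMARY KEY, age INTEGER)")
conn.executemany("INSERT INTO Workers VALUES (?, ?)", [(1, 34), (2, 52), (3, 27)])

# No keyword after the column name: ascending is the default.
ascending = conn.execute("SELECT emp_no, age FROM Workers ORDER BY age").fetchall()
# DESC reverses the order.
descending = conn.execute("SELECT emp_no, age FROM Workers ORDER BY age DESC").fetchall()

print(ascending)   # youngest first
print(descending)  # oldest first
```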
Slides 15-16:
Standard SQL defines five aggregate functions: Count, Sum, Avg, Min and Max.
Activity: What is the purpose of each of these functions?
Answer: Count returns the number of values in a column, Sum returns the sum
total of values of a column, Avg returns the mean average of values in a column,
Min returns the lowest value in a column and Max returns the highest value in a
column.
Slide 17:
This slide shows an example of the syntax for using aggregate functions.
Slide 18:
The aggregate functions can be used to find summaries of data for particular
groups of rows in a database table. Using the Group By clause means that the
aggregate function used is applied to each group. The slide shows an example
drawn from the workshop.
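The five aggregate functions and the effect of Group By can be sketched as follows. This assumes Python's built-in sqlite3 module and four invented Workers rows, one of which has no age recorded, to show that Count on a column counts only the values actually present.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Workers (emp_no INTEGER PRIMARY KEY, job_title VARCHAR(30), age INTEGER)"
)
conn.executemany(
    "INSERT INTO Workers VALUES (?, ?, ?)",
    [
        (1, "Manager", 50),
        (2, "Packer", 20),
        (3, "Packer", 30),
        (4, "Packer", None),  # age not recorded
    ],
)

# One summary row for the whole table. COUNT(age) skips the NULL age,
# whereas COUNT(*) counts every row.
whole_table = conn.execute(
    "SELECT COUNT(*), COUNT(age), MIN(age), MAX(age), SUM(age), AVG(age) FROM Workers"
).fetchone()

# One summary row per job title.
per_job = conn.execute(
    "SELECT job_title, COUNT(*), AVG(age) FROM Workers "
    "GROUP BY job_title ORDER BY job_title"
).fetchall()

print(whole_table)
print(per_job)
```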
Slide 19:
The group by clause can be modified using the having clause. This plays the same
role for groups that a where clause plays for rows in the main part of the select
statement: it keeps only the groups that meet a condition. This slide shows an
example drawn from the workshop.
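The difference between filtering rows with Where and filtering groups with Having can be seen in a small sketch. It assumes Python's built-in sqlite3 module and five invented Workers rows. The two queries look similar but give different averages.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Workers (emp_no INTEGER PRIMARY KEY, dept_no INTEGER, age INTEGER)"
)
conn.executemany(
    "INSERT INTO Workers VALUES (?, ?, ?)",
    [(1, 10, 30), (2, 10, 50), (3, 20, 25), (4, 20, 33), (5, 30, 60)],
)

# WHERE removes rows before grouping: only workers over 35 are averaged.
where_result = conn.execute(
    "SELECT dept_no, AVG(age) FROM Workers WHERE age > 35 "
    "GROUP BY dept_no ORDER BY dept_no"
).fetchall()

# HAVING removes groups after aggregation: every worker is averaged,
# then departments whose average is 35 or less are dropped.
having_result = conn.execute(
    "SELECT dept_no, AVG(age) FROM Workers "
    "GROUP BY dept_no HAVING AVG(age) > 35 ORDER BY dept_no"
).fetchall()

print(where_result)
print(having_result)
```

Department 10 appears in both results, but with an average of 50 in the first (one worker survives the Where) and 40 in the second (both workers are included).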
Slide 20:
SQL has the capability to nest one query inside another. The nested query is known
as a sub-query. This allows SQL to have queries where the results are based on
the outcome of another query. This slide shows an example drawn from the
workshop.
It is worth noting that there is often more than one way to produce the same query
result in SQL. Often a result that uses a sub-query could equally be done by using a
join of some sort.
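The point that a sub-query and a join can produce the same result is illustrated by this sketch. It assumes Python's built-in sqlite3 module and a few invented rows in cut-down Workers and Departments tables, and finds the workers based in Cairo in both ways.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Departments (dept_no INTEGER PRIMARY KEY, location VARCHAR(30));
    CREATE TABLE Workers (emp_no INTEGER PRIMARY KEY, dept_no INTEGER, age INTEGER);
    INSERT INTO Departments VALUES (10, 'Cairo'), (20, 'Lagos');
    INSERT INTO Workers VALUES (1, 10, 30), (2, 10, 50), (3, 20, 25);
    """
)

# Version 1: the inner query finds the Cairo departments first.
with_subquery = conn.execute(
    "SELECT emp_no FROM Workers "
    "WHERE dept_no IN (SELECT dept_no FROM Departments WHERE location = 'Cairo') "
    "ORDER BY emp_no"
).fetchall()

# Version 2: the two tables are joined on dept_no.
with_join = conn.execute(
    "SELECT w.emp_no FROM Workers w, Departments d "
    "WHERE w.dept_no = d.dept_no AND d.location = 'Cairo' "
    "ORDER BY w.emp_no"
).fetchall()

print(with_subquery, with_join)
```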
Slide 21:
Slide 22:
Not specifying the join condition correctly is one of the commonest mistakes when
starting to use SQL. The result can often be a very long output from a select
statement, because every row of one table is paired with every row of the other.
Unfortunately, there is no magic fix for debugging SQL and the way in
which different vendors have implemented the language does not help. Error
messages are often cryptic and might just point to the line in which the error occurs
(or even the next line, as that is the next point parsed, or examined, by the SQL
compiler). There is also no standard editing feature built into SQL, so it is important
to establish, with the product being used, how editing will take place, for example
using the operating system's text editor.
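The effect of a missing join condition is easy to show with row counts. This sketch assumes Python's built-in sqlite3 module and invented data: three workers and two departments.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Departments (dept_no INTEGER PRIMARY KEY, location VARCHAR(30));
    CREATE TABLE Workers (emp_no INTEGER PRIMARY KEY, dept_no INTEGER);
    INSERT INTO Departments VALUES (10, 'Cairo'), (20, 'Lagos');
    INSERT INTO Workers VALUES (1, 10), (2, 10), (3, 20);
    """
)

# With the join condition: one row per worker.
joined = conn.execute(
    "SELECT w.emp_no, d.location FROM Workers w, Departments d "
    "WHERE w.dept_no = d.dept_no"
).fetchall()

# Join condition forgotten: every worker is paired with every department.
unjoined = conn.execute(
    "SELECT w.emp_no, d.location FROM Workers w, Departments d"
).fetchall()

print(len(joined), len(unjoined))
```

No error is reported for the second query, which is why the mistake is easy to miss. With thousands of rows in each table the output grows to millions of rows.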
Slide 23:
Much time is spent by professionals using SQL to work out the best way to write a
particular query. There are lots of ways of doing the same thing in SQL, especially
when it comes to querying. The example of producing the same query results with
either a join or a sub-query is an example of this. Different ways of writing the query
can result in dramatic differences in the amount of time it takes to produce the
result. This issue is known as performance and is especially important in databases
where there are large numbers of rows. Much work in query optimisation requires
knowledge of one of the underlying languages of the relational model - relational
algebra. This is beyond the scope of this module.
Slide 24:
Slide 25:
The best way of becoming adept with SQL is to use it. The work in the laboratory
sessions and the work that students do as part of their assignment will help them
develop their ability with SQL.
8.5 Laboratory Sessions
8.5.1 Aggregation
Many important queries in a database involve using the aggregation functions COUNT, MIN, MAX,
SUM and AVG.
COUNT counts the number of times something occurs in a database table; the number of rows that
meet a particular condition.
MIN finds the minimum or lowest occurrence of an attribute in a database table.
MAX finds the highest occurrence of an attribute in a database table.
AVG finds the mean average of an attribute in a database table.
SUM finds the total of all the values of an attribute in a database table.
Example: To find the number of rows in the workers table, we use the primary key, Emp_no, as
there will be exactly one occurrence of this value for each row and no duplicates.
Select count(emp_no)
From Workers;
8.5.2 Part One: Using the Workers, Departments and Job_Types Tables
Exercise 1
Try the above select statement.
Exercise 2
Find the average age of the workers in the workers table.
Suggested Solution:
Select Avg(Age) from Workers;
Exercise 3
Find the average age of the managers in the workers table.
Suggested Solution:
Select Avg(Age)
From Workers
Where job_title = 'Manager';
Exercise 4
Find the minimum, maximum, average and the sum of the ages of all the packers in the workers table.
Suggested Solution:
Select min(age), max(age), avg(age), sum(age)
From workers
Where job_title = 'Packer';
Exercise 5
Write a query that tells you the age of the youngest employee in Cairo. You will need to use the
joining of tables that you have studied in previous tutorials.
Suggested Solution:
Select min(age)
From Workers w, Departments d
Where w.dept_no = d.dept_no
and d.location = 'Cairo';
Exercise 6
Write a query that tells you how many employees there are in Lagos.
Suggested Solution:
Select count(emp_no)
From Workers w, Departments d
Where w.dept_no = d.dept_no
And d.location = 'Lagos';
Exercise 7
Write a query that finds the job_type with the highest salary. You will need to use the job_type table
you created in Topic 5.
Suggested Solution:
Select max(salary)
From Job_Type;
Exercise 8
What is the total of all salaries paid?
Suggested Solution:
Select Sum(j.salary)
From job_type j, workers w, departments d
Where w.job_title = j.job_title
And w.dept_no = d.dept_no;
Exercise 9
What is the lowest salary paid in Cairo?
Suggested Solution:
Select Min(j.Salary)
From job_type j, workers w, departments d
Where w.job_title = j.job_title
And w.dept_no = d.dept_no
And d.location = 'Cairo';
Suggested Solution:
Select max(qual_level) from qualification_type;
Exercise 11
Select the highest level of qualification attained by you.
Suggested Solution:
Select max(qual_level)
from qualification_type qt, personal_details p, qualification q
Where qt.qual_name = q.qual_name
And q.personal_no = p.personal_no
And p.second_name = '<students_surname>';
Exercise 12
Select the highest level achieved for those students who are over 20.
Suggested Solution:
Select max(qt.qual_level)
from qualification_type qt, personal_details p, qualifications q
Where qt.qual_name = q.qual_name
And q.personal_no = p.personal_no
And p.age > 20;
Exercise 13
How many students have achieved level 2 qualifications?
Suggested Solution:
Select count(p.personal_no)
from qualification_type qt, personal_details p, qualification q
Where qt.qual_name = q.qual_name
And q.personal_no = p.personal_no
And qt.qual_level = 2;
Exercise 14
What is the average grade for level 3 qualifications?
Suggested Solution:
Select Avg(q.grade)
From qualifications q, qualification_type qt
Where q.qual_name = qt.qual_name
And qt.qual_level = 3;
Exercise 15
What is the average level of qualification achieved by students under 19?
Suggested Solution:
Select Avg(qt.qual_level)
From qualifications q, qualification_type qt, personal_details p
Where q.qual_name = qt.qual_name
And p.personal_no = q.personal_no
And p.age < 19;
8.6 Private Study
The time allocation for private study in this topic is expected to be 7.5 hours.
Lecturer's Notes:
Students have copies of the private study exercises in the Student Guide. Answers are not provided
in their guide.
The following tables are for a garden products database.
Customers
Customer ID    Customer Name
C1             Arthur Smith
C4             Samson Odogo
C2             Jagpal Singh
C6             Jenkins Watson
Products
Product           Price
Lawn Mower        100
Slug Repellent
Trowel
Weed Killer
Knee rest
Customer Products
Customer ID    Product
C1             Lawn Mower
C1             Slug Repellent
C1             Trowel
C4             Weed Killer
C2             Weed Killer
C2             Lawn Mower
C6             Trowel
Exercise 3
Write an SQL statement that finds the average price for all the products.
Suggested Solution:
Select avg(price) from products;
Exercise 4
Write an SQL statement that sets the price of weed killer to 5.
Suggested Solution:
Update products
Set price = 5
Where product = 'Weed Killer';
Exercise 5
Write a query that gives the total spent by each customer.
Suggested Solution:
Select customer_name, sum(price)
From customers, customer_products, products
Where customers.customer_id = customer_products.customer_id
And customer_products.product = products.product
Group by customer_name;
Exercise 6
Review all the material for Topics 7 and 8 (SQL). You should make sure that you understand the
following concepts and be prepared to raise any questions about them in the next tutorial:
8.7 Tutorial Notes
Exercise 1:
Work in a small group and review your answers to Part One of the private study exercises. Your
tutor will then lead a whole class feedback session.
Exercise 2:
Questions
a. SQL has two major components, DDL and DML. What are these components and what are
their functions?
b. What are the disadvantages of the CHAR data-type and how does the VARCHAR data-type
overcome these?
c.
d.
e. What are the advantages of using the ALTER TABLE statement as opposed to creating a
new table from scratch when changes are needed?
f.
Suggested Answers:
a. DDL is data definition language. Its purpose is to define the database objects, such as tables
and columns. Its primary operator is the CREATE statement.
DML is data manipulation language. It is concerned with performing actions on data. This
could be putting data into the database using the INSERT statement, or changing data using
UPDATE, or deleting data using DELETE. It is also concerned with performing retrievals of
data using the SELECT statement.
b. The CHAR data-type causes problems because it is of fixed length. If data is entered into it
that is shorter than the fixed length, then the remaining characters are filled with blanks. This
causes problems with select statements, where an equality comparison does not always match
as expected. The VARCHAR data-type overcomes this by being only as long as the data that is
entered into it.
c. The ROLLBACK statement takes the database back to the state it was in just after the last
COMMIT statement was issued, that is, to the last point at which it was saved. It can be used to
undo operations that have already been carried out but that the user does not want
to be saved.
d. Advantages:
Universal
Easy to Use
Disadvantages:
e.
f. This clause is used when an aggregate function is used in the main part of the select
statement. Its purpose is to group the results by some attribute from the table or tables on
which the select statement operates.
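The ROLLBACK behaviour described in answer c can be tried out with a minimal sketch. It assumes Python's built-in sqlite3 module, where commit() and rollback() on the connection issue COMMIT and ROLLBACK, and a one-row Products table. The accidental update is invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Products (product VARCHAR(30) PRIMARY KEY, price INTEGER)")
conn.execute("INSERT INTO Products VALUES ('Lawn Mower', 100)")
conn.commit()  # the last saved point

# An update made by mistake: not yet committed.
conn.execute("UPDATE Products SET price = 5 WHERE product = 'Lawn Mower'")
price_before_rollback = conn.execute("SELECT price FROM Products").fetchone()[0]

conn.rollback()  # back to the state just after the last COMMIT
price_after_rollback = conn.execute("SELECT price FROM Products").fetchone()[0]

print(price_before_rollback, price_after_rollback)
```

Anything committed before the mistake is untouched. Only the work done since the last COMMIT is undone.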
Exercise 3:
Review of SQL
Take part in a class discussion around the relevant points of Topics 7 and 8 that are listed in Private
Study Exercise 6. Ask your tutor any questions you have about SQL.
Topic 9: Database Design
9.1 Learning Objectives
9.2 Pedagogic Approach
Information will be transmitted to the students during the lectures. They will then practise the skills
during the tutorial and seminar sessions.
9.3 Timings
Lectures: 2 hours
Private Study: 7.5 hours
Tutorials: 2 hours
9.4 Lecture Notes
The following is an outline of the material to be covered during the lecture time. Please also refer to
the slides.
The structure of this topic is as follows:
Understanding requirements
Moving from entities to tables
Documenting attributes with a data dictionary
The use of CASE tools
Slide 6:
There are many different approaches to systems development. What might be termed
the mainstream or traditional approach has fallen
out of favour in recent years, particularly with regard to the question of
understanding user requirements.
Understanding requirements is vital to being able to produce a finished system that
meets the business needs of an organisation.
Slide 7:
The traditional methodology goes under various names, the Systems Development
Life Cycle (SDLC) or sometimes the Waterfall approach.
This involves a complete set of steps that a team follows. The fundamental idea is
to divide the development process into a series of phases or stages, each of which
finishes before the next one starts.
This process is often viewed as a cascade of steps, which is why it has been called
the waterfall approach.
It is worth noting that there are many different variations on the steps involved. This
is because there are particular methodologies known as Structured Methods,
which have been developed to guide developers through the whole lifecycle of
building a computer system. Some examples of these are the Structured Systems
Analysis and Design Method (SSADM) and Information Engineering.
Slide 8:
Feasibility Study
Systems Analysis (or Analysis)
Slide 9:
Design
Implementation
Maintenance
The concern of this module is being able to design a database given a particular
business scenario. It is worth understanding how a developer might have reached
this point following the traditional approach:
Systems Analysis
In this stage, investigation is undertaken to understand the requirements of
both the business and the users. The emphasis should be on what the system
should deliver, rather than how it should be delivered.
Design
In this stage, the purpose is to translate the requirements that have been
gathered during the previous stage into a systems design which details how
they will be satisfied.
Implementation
The actual system is constructed from the given design using a particular choice
of software (a DBMS product). Here the developers will build the
required structures on the database, write the application programs, and
integrate them on the chosen hardware platform.
Slide 10:
We have seen that the fundamental idea of the Waterfall Approach is to do one
thing after another. However, it is worth acknowledging that very few waterfall
projects adhere to the pure waterfall model. Even the most formal project will
benefit from some amount of feedback and rework based on that feedback. But the
principles behind the model remain: to perform certain processes in a certain order.
Historically, there have been serious problems with this approach.
What if the original requirements specified in the original analysis turn out to be wrong
in some way because:
These problems may not become apparent until the users are looking at their new
database system. By now it will be very expensive to make a change. For
example, changing a database field length costs nothing at the analysis stage when
everything is on paper; however, once the system is implemented then that field
has to be changed physically on the database which will be much more costly.
Slide 11:
Alternative approaches have been developed, most of which involve the concept of
what is known as Iteration, which is going over some part of the development and
incorporating user feedback in order to get the requirements right. Usually this
involves some sort of prototyping.
Slide 12:
Slide 13:
The point to stress here is that although the focus will now be on the specifics of
database design, it should be considered that within the context of a larger
Information Technology project, this design stage might also serve as part of a
requirements gathering and verification process. This will involve the database
developer having some of the skills of the systems analyst - primarily good
communication skills.
Slide 14:
Slide 15:
The phases here suggest the sort of linear progression from one to another
characteristic of a traditional systems development life cycle. The outputs of one
phase would be the inputs of another. For example, the output of logical design
might be an Entity Relationship Model. This would become the basis for table
design in the Physical Design phase. More iterative approaches to development
would have the deliverables from previous phases revisited so that, for example,
prototyping at the physical design stage might lead to revisions in the ER model that
was produced during logical design.
Discussions of database phases tend to present them in a linear fashion, but it is
worth noting that current trends in project management and development
methodology are more iterative.
Slide 16:
Slide 17:
Logical Database Design constructs the model without regard for the particular
DBMS that will be used. However, the data model (e.g. the relational model) is
known. A key activity is normalisation.
Slide 18:
Physical Database Design - The move from entities to tables is one of the key
activities here involving what is known as designing the base relations. But this
With regard to this module, the activities that are undertaken are those of logical
and physical design. The process of logical design (identifying entities,
normalisation) has been discussed earlier. Our concern in this topic is to examine
some of the aspects of physical design.
Slide 20:
Slide 21:
Slide 22:
Slide 23:
This slide presents how to document domains and base relations in a data dictionary,
which should include the name of the relation, a list of the attributes, the primary
key and any foreign keys. For each attribute, a number of
different aspects should be listed, beginning with the domain. The domain might be a specifically defined
domain or simply the data type, length, any constraint and any default value. It
should also be noted whether the field is mandatory or whether it can be null.
Constraints will be covered in a coming topic.
Example of data dictionary for base relation Students and its associated domains:
Domains
Domain StudentType varchar, length 30, must be 'Overseas', 'Home'
Domain City varchar, length 30
Base Relations
Students(StudentID Number NOT NULL,
Address_line1 Varchar (30) NOT NULL,
Address_line2 Varchar (30) NOT NULL,
City City,
StudentType StudentType NOT NULL DEFAULT 'Home',
PRIMARY KEY (StudentID),
FOREIGN KEY (City) REFERENCES City (City_name));
In this case, StudentType is a simple domain with two values and would be
implemented by a constraint on the table. City as a domain is enforced by having a
separate table with valid cities in it.
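One way the Students entry might be turned into table definitions is sketched below. This is an illustration under assumptions rather than the module's own solution. It uses Python's built-in sqlite3 module, maps the Number type to INTEGER, writes the StudentType domain as a CHECK constraint, and uses invented sample rows. SQLite only enforces foreign keys once the foreign_keys pragma is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite ignores foreign keys otherwise
conn.executescript(
    """
    CREATE TABLE City (City_name VARCHAR(30) PRIMARY KEY);
    CREATE TABLE Students (
        StudentID     INTEGER NOT NULL,
        Address_line1 VARCHAR(30) NOT NULL,
        Address_line2 VARCHAR(30) NOT NULL,
        City          VARCHAR(30),
        StudentType   VARCHAR(30) NOT NULL DEFAULT 'Home'
                      CHECK (StudentType IN ('Overseas', 'Home')),
        PRIMARY KEY (StudentID),
        FOREIGN KEY (City) REFERENCES City (City_name)
    );
    INSERT INTO City VALUES ('Cairo');
    """
)


def is_rejected(sql):
    """Return True if the DBMS refuses the statement on integrity grounds."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# StudentType is omitted, so the default applies.
conn.execute(
    "INSERT INTO Students (StudentID, Address_line1, Address_line2, City) "
    "VALUES (1, '1 Nile St', 'Zamalek', 'Cairo')"
)
default_type = conn.execute(
    "SELECT StudentType FROM Students WHERE StudentID = 1"
).fetchone()[0]

# A value outside the StudentType domain breaks the CHECK constraint.
bad_type_rejected = is_rejected(
    "INSERT INTO Students VALUES (2, 'a', 'b', 'Cairo', 'Visiting')"
)
# A city missing from the City table breaks the foreign key.
bad_city_rejected = is_rejected(
    "INSERT INTO Students VALUES (3, 'a', 'b', 'Atlantis', 'Home')"
)
print(default_type, bad_type_rejected, bad_city_rejected)
```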
Slide 24:
9.5 Laboratory Sessions
9.5.1 Grouping
In the previous laboratory session we looked at aggregation. You were asked to find the minimum,
maximum, average and the sum of the ages of all the packers in the workers table.
The suggested solution was like this:
Select min(age), max(age), avg(age), sum(age)
From workers
Where job_title = 'Packer';
But what if we want to provide a query that shows us the maximum age for each of the different
types of workers? SQL provides a group by clause that allows us to do this.
Select job_title, max(age)
From Workers
Group by job_title;
Exercise 1
Run the above query and study the results.
Exercise 2
Write a query that finds the average age for the employees in Cairo. Group this by job_title.
Suggested Solution:
Select job_title, avg(age)
From Workers w, Departments d
Where w.dept_no = d.dept_no
And d.location = 'Cairo'
Group by job_title;
Exercise 3
Write a query that shows the age of the oldest worker in each department. Group this by the
dept_no. You do not have to show the department name.
Suggested Solution:
Select dept_no, max(age)
From Workers
Group by dept_no;
SQL also provides the ability to place a selection condition on the group by clause. This is the
Having clause. This example shows the above query modified so that only those departments with
a maximum age above 35 are shown:
Select dept_no, max(age)
From Workers
Group by dept_no
Having max(age) > 35;
Exercise 4
Find the departments that have an average age of over 40. You do not need to show the department
name.
Suggested Solution:
Select dept_no, avg(age)
From Workers
Group by dept_no
Having avg(age) > 40;
Exercise 5
Find the maximum age, the minimum age, the average age and the job title for those jobs where the
average age is above 35. Group this by the job title.
Suggested Solution:
Select job_title, max(age), min(age), avg(age)
From Workers
Group by job_title
Having avg(age) > 35;
Exercise 6
As part of the laboratory exercises in Topic 5, you created two tables that kept personal information
about yourself and the qualifications that you have. In Topic 7 you should have added some new
rows about your friends and their qualifications to these tables.
Now use the aggregate functions from Topic 8's Laboratory Session and the Group By clause from
Topic 9's Laboratory Session to create a set of useful queries using these tables.
Suggested Solution:
This will depend on the tables the students created. A suitable field to operate on would be age. If
there are no suitable fields, students should add them using the modify table statements that they
learnt in Topic 4's Laboratory Session.
9.6 Private Study
The time allocation for private study in this topic is expected to be 7.5 hours.
Lecturer's Notes:
Students have copies of the private study exercises in the Student Guide. Answers are not provided
in their guide.
You should also allow time during the laboratory sessions to check that students are working on
their assignments and answer any general questions on the expected scope of the work. You may
also wish to remind them of the submission deadline and documentation requirements.
This topic's private study time involves practising some database design based on elaboration of a
previous private study exercise.
Exercise 1
In Topic 4, you were asked to draw an ERD for a boat rental system. The requirements were the
following:
You should be able to record that a boat is rented to a customer for a set period.
Any damage to the boat is recorded against the particular rental.
A boat should have a name.
All boats are of the same type (yacht).
Damage is classified as being hull, interior or other.
Using the ERD for this system, produce a data dictionary specifying the base relations (tables),
attributes and domains. The data dictionary should be in the format given in the lecture.
Suggested Answer:
[ERD: entities BOAT, RENTAL, CUSTOMER and DAMAGE, with relationship multiplicities 0..N, 0..N, 1 and 0..N]
Base Relation
Customer(
CustomerID number NOT NULL,
CustomerName varchar 30 NOT NULL,
CustomerAddress varchar 60 NOT NULL,
Primary Key (CustomerID));
Base Relation
Rental(
BoatID number NOT NULL,
CustomerID number NOT NULL,
RentalStartDate date NOT NULL,
RentalEndDate date NOT NULL,
Primary Key (BoatID, CustomerID, RentalStartDate),
Foreign Key (BoatID) REFERENCES Boat (BoatID),
Foreign Key (CustomerID) REFERENCES Customer(CustomerID));
Base Relation
Damage(
BoatID number NOT NULL,
CustomerID number NOT NULL,
RentalStartDate date NOT NULL,
DamageType DamageType);
Exercise 2
Find some examples of CASE tools online. What are their features? For how much of the database
development process do they cater? What might be their disadvantages?
Prepare a brief written discussion for the tutorial.
Suggested Answer:
This will depend on the CASE tool chosen. Oracle Designer, for example, covers logical design and
physical design, and there are tools for modelling entities, tables, processes and functions. There
are code generators for the database structures and for applications, such as forms. Students
should be encouraged to give an outline of the tool they have investigated.
Exercise 3
Investigate a systems development methodology such as SSADM. Each stage or step has what is
known as a set of deliverables. These are the outcomes of that stage which will form the basis of
work in the next stage.
What are the deliverables from analysis, design and implementation stages for the methodology that
you have investigated?
Suggested Answer:
This will depend on the development methodology investigated, but typically it would be something
along these lines:
Stage
Deliverable
1. Analysis
2. Design
3. Implementation
Exercise 4
Review the content of this topic and conduct any further reading you need to undertake in order to
ensure that you understand the material. You should make note of anything that you still feel
requires further clarification and bring your questions to the tutorial for this topic.
9.7 Tutorial Notes
Exercise 1:
In small groups, discuss your findings from the Private Study Exercises, asking your tutor for
clarification when needed.
Exercise 2
Answer the following questions, which relate to approaches to development.
1. What is the difference between analysis and design?
2. Why is the traditional systems development approach called the Waterfall Model?
3. What stages in a traditional waterfall lifecycle do you think overlap with the conceptual,
logical and physical stages of database design?
4. What is prototyping and what are its advantages?
Suggested Answers:
1. Analysis is the investigation to understand the requirements of the business and the user.
The emphasis is on WHAT the system should deliver rather than HOW it should be
delivered.
Design is translating the requirements from analysis into a systems design which details how
they will be satisfied.
2. We divide the development process into a series of phases or stages and each finishes
before the next one starts. This is often viewed as a cascade of steps, hence the name waterfall.
(You could also draw an analogy with water flow and data flow from one step to next in one
direction.)
3. The stages in a traditional waterfall lifecycle are analysis, design and implementation. We
can assume some of the analysis has been done given that the outcomes of analysis would
be the background under which the conceptual design would begin. Logical and physical
design would fall within the design stage of the waterfall lifecycle. Those parts of physical
design which involve creating objects within the database might be classified as part of the
implementation in a waterfall approach.
4. Prototyping is the process whereby a model is built of part of the envisaged system.
Enhancements or amendments are discussed with the user which can then be incorporated
in the finished product.
You can obtain feedback from the user and use this to amend the original requirements,
analysis and design. Prototyping helps the analyst and user to communicate and it can save
time and money to deliver what the user wants.
Key aspects of prototyping:
Exercise 3
Outline the difference between Conceptual, Logical and Physical Design.
Suggested Answer:
Conceptual design is the initial investigation of the data that is needed to support a system. It does
not take into account either a particular data model or the implementation environment.
Logical design involves investigation of the data using the tools of a particular data model, such as
the relational model, but is still independent of the implementation environment.
Physical Design takes into account the chosen DBMS product and the physical structure and
implementation of the database.
Topic 10: Supporting Transactions
10.1 Learning Objectives
This topic provides an overview of supporting transactions.
On completion of the topic, students will be able to:
Identify transactions;
Understand business rules and their implications;
Recognise potential performance issues;
Identify the potential need for de-normalisation.
10.3 Timings
Lectures: 2 hours
Private Study: 7.5 hours
Tutorials: 2 hours
Business Rules
Identifying and documenting transactions
Views and de-normalisation
This topic will use the example of the boat rental system specified in the Topic 9
Private Study exercise.
Slide 6:
The suggested answer shown on the slide specifies the basis for the system in
terms of the relational model. It also embodies essential business rules of the
organisation; for example, the original scenario states that any damage is stored
against the rental, so the entity damage is related to the rental entity. This may
seem counter-intuitive since it is boats that get damaged; however, by storing it
against the rental, the database tells us who was renting the boat when it was
damaged and during what period it was rented, as well as which boat was damaged.
In this case, the business rule is built into the structure of the database. Other
business rules might be embodied in a constraint of the sort created on the Workers
table in the Topic 9 laboratory session (that workers had to be 70 or younger). One
could imagine a business rule on this database that specified a maximum time for a
rental period. This would be enforced with a check constraint. The other rule that is
specified here is for the domain of damage type.
Identifying business rules will form part of the requirements gathering process.
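A business rule limiting the rental period could be enforced with a check constraint along the following lines. This is a sketch under assumptions: the 14-day maximum is a hypothetical rule, Python's built-in sqlite3 module stands in for the DBMS, and SQLite's julianday function is used for the date arithmetic, which other products write differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Rental (
        BoatID          INTEGER NOT NULL,
        CustomerID      INTEGER NOT NULL,
        RentalStartDate DATE NOT NULL,
        RentalEndDate   DATE NOT NULL,
        PRIMARY KEY (BoatID, CustomerID, RentalStartDate),
        -- Hypothetical rule: a rental lasts between 0 and 14 days.
        CHECK (julianday(RentalEndDate) - julianday(RentalStartDate) BETWEEN 0 AND 14)
    )
    """
)


def try_rental(start, end):
    """Return True if the rental is accepted, False if the constraint rejects it."""
    try:
        conn.execute("INSERT INTO Rental VALUES (1, 1, ?, ?)", (start, end))
        return True
    except sqlite3.IntegrityError:
        return False


one_week_accepted = try_rental("2011-06-01", "2011-06-08")   # 7 days
one_month_accepted = try_rental("2011-07-01", "2011-07-31")  # 30 days
print(one_week_accepted, one_month_accepted)
```

Because the rule lives in the table definition, every application that writes to Rental is bound by it without having to repeat the check.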
Slide 7:
Slide 8:
In order to design a database that meets the user requirements, the transactions
that will take place have to be identified. There is both qualitative information
about them (what they are, what they do), and quantitative information (how often
they run, how many rows they will affect).
Slide 9:
Connolly and Begg provide a useful summary of what to look at when identifying
transactions and beginning to analyse the effect they will have on the design of the
database, the applications and the performance of the database when running
queries.
Slide 10:
Trace all transactions to the relations that they use or affect: this will mean
thinking about which tables are written to, read from, etc.
Analyse how the data is used by a given transaction: this will be discussed
below by looking at CRUD matrices, but it is worth noting that this could go to
quite a detailed level, looking at each attribute on a table.
Note: All these investigations can be done much more efficiently with the use of a
CASE tool that allows cross-referencing between transactions and relations.
Slide 11:
This slide shows the requirements for transactions for the boat hire system:
a. Enter the details of all the boats. Update any details for boats. Delete
boats.
b. Enter the details for customers. Update any details for customers.
c. Enter the details for hiring of boats.
d. Enter the details for any damage to boats.
e. List the details of all the boats.
f. List the details of all the customers and their hire, for which boats.
g. List the details for damage, to which boats, during which hire periods and for
which customers.
h. Provide a summary of the hires for a particular period.
Activity: Examine the blank CRUD matrix presented on this slide, copy it and fill in
the relevant operations in the transactions.
Slide 13:
This slide shows the completed CRUD (or IRUD) matrix for the transaction.
[CRUD matrix: transactions against the relations Boat, Customer, Damage and Rental; for example, transaction A performs CUD on Boat]
Slide 14:
Some transactions may affect some attributes, but not others. For example,
updating a customer's details to change their address would only affect the address
attributes and not name or ID.
Slide 15:
Slide 16:
Entity    Type of Access    Average Number    Peak Number
Boats     R                 1 * 50            100 * 50
Slide 17:
tables. Be aware that attributes that are often updated will be slowed down by an
index.
Slide 18:
Note that a lot of the performance related issues presented on Slide 18 pertain to
databases with tables that have a large number of rows. Such tables are
increasingly common as databases become large.
The issue of analysing transactions and their behaviour in databases of different
sizes is known as scalability. Professionals have to investigate whether a
database is scalable. That is, do a database and its transactions behave in a
comparable way when implemented at quite a small scale (with very few rows)
and when implemented with a much larger number of rows?
Slide 19:
How the transactions will be manifest as part of a deployed system will depend on
the set of applications that are built on top of it. This will depend on the type of
system it is. A database can be accessed by web-based forms, by queries
embedded in a website, by forms applications built in a language like Visual Basic
or Oracle Forms. Queries might be run from an SQL prompt, but are more likely to
be embedded inside some other application.
Much specialised database development also involves the development of
applications.
Slide 20:
Not every user of a database has the same role. Within any organisation and
among the population of people in it that are users of the database, there will be
different levels in terms of hierarchy as well as people doing different jobs. These
differing roles will require access to different information. There is also the issue of
data protection with many countries having legislation which means that data
should only be accessed by those who have a genuine need to use it. Some people
will need to just see data, whereas others will have a need to insert, update or
delete data.
Slide 21:
In the boat hire system, we might define two types of users: the manager who
should be able to have full access to the entire database; the administrative
assistant who is hired for the holiday period to insert the rentals, add any new
customers and record any damage to the boats. They would have different
privileges on the database.
We can define this using a similar CRUD technique, but here we are recording
access rather than transactions.
Table/User         Boat    Customer    Rental    Damage
Manager            CRUD    CRUD        CRUD      CRUD
Admin Assistant    -       CRU         CRU       CRU
Slide 22:
SQL has various facilities to manage the way different users are granted access to
different parts of the database. These facilities can enforce the roles and access
rights that have been defined for a database.
Slide 23:
The grant facility gives access to a particular object, e.g. a view or a table created
by a different user.
Grant insert on Boat to Admin: this command will give the role of Admin the right
to insert (create) rows in the table Boat.
Grant all on Boat to Manager: this command will give the role of Manager the right
to carry out any operation on the table Boat.
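The link between a CRUD access matrix and grant statements can be sketched as follows. This is an illustrative Python helper, not part of the slides: the access entries for the Admin role are an assumption based on the description of the administrative assistant (customers, rentals and damage), and the mapping of C, R, U and D to the privileges INSERT, SELECT, UPDATE and DELETE is the usual SQL reading.

```python
# Map each CRUD letter to the SQL privilege that permits it.
PRIVILEGES = {"C": "INSERT", "R": "SELECT", "U": "UPDATE", "D": "DELETE"}

# Assumed access matrix: the manager has full access; the administrative
# assistant can create, read and update customers, rentals and damage.
ACCESS = {
    "Manager": {"Boat": "CRUD", "Customer": "CRUD", "Rental": "CRUD", "Damage": "CRUD"},
    "Admin": {"Customer": "CRU", "Rental": "CRU", "Damage": "CRU"},
}


def grant_statements(access):
    """Turn a role -> table -> CRUD letters matrix into GRANT statements."""
    statements = []
    for role, tables in access.items():
        for table, letters in tables.items():
            privileges = ", ".join(PRIVILEGES[letter] for letter in letters)
            statements.append(f"GRANT {privileges} ON {table} TO {role};")
    return statements


for statement in grant_statements(ACCESS):
    print(statement)
```

The generated statements would still need to be run against a DBMS that supports roles; the sketch only produces the text.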
Slide 24:
Slide 25:
Normalising our data model means we will have the minimum amount of
redundancy. However, this can have an effect on performance. If we are running a
query that joins tables, this will be slower than running a query against a single
table or view.
Denormalisation can be done by including an attribute in a table that should not be
there according to the rules of normalisation, e.g. the name of the boat on the Hire
entity.
Slide 26:
This slide presents the use of a view as a way to improve performance. A view is a
virtual table. The students will have encountered the creation of views in the SQL
laboratory sessions. Views can be used to combine tables, so that instead of joining
tables in a query, the query just accesses the view. Note that a standard view is
re-evaluated whenever it is queried, so any speed-up depends on the DBMS, e.g. on
whether it stores the results of the view (a materialised view).
Slide 27:
In the boat hire system, we could create a view where there is a query that
combines lots of tables. For example, transaction G - List the details for damage: to
which boats, during which hire periods and for which customers (presented on Slide
11) involves joining all the tables in the database. For part of the private study, you
will need to write the SQL for creating a view for this transaction that will contain all
the relevant data.
Exercise 3
Write a query using a nested sub-query to find those department IDs where the average age of the
workers is less than the average age for all the workers in the company.
Suggested Solution:
Select d.dept_no
From departments d, workers w
Where d.dept_no = w.dept_no
And w.age < (select avg(age) from workers);
Exercise 4
Note that the result above produces multiple repeating values. Use the Group By clause after the
closing brackets to group by department ID.
Suggested Solution:
Select d.dept_no
From departments d, workers w
Where d.dept_no = w.dept_no
And w.age < (select avg(age) from workers)
Group by d.dept_no;
Exercise 5
Make this query more user-friendly by changing the department ID to the department name.
Suggested Solution:
Select d.department_name
From departments d, workers w
Where d.dept_no = w.dept_no
And w.age < (select avg(age) from workers)
Group by d.department_name;
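The suggested solutions to Exercises 3 to 5 compare each worker's age with the company average, so a department is listed if any one of its workers is younger than average. If the wording of Exercise 3 is read strictly (the average age of the department against the average age of the company), a Group By with a Having clause gives a different answer. The sketch below contrasts the two readings using Python's sqlite3 module; the sample rows are hypothetical and the table layouts are assumed from the column names used in the exercises.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (dept_no INTEGER PRIMARY KEY, department_name TEXT);
    CREATE TABLE workers (emp_no INTEGER PRIMARY KEY, first_name TEXT,
                          age INTEGER, dept_no INTEGER REFERENCES departments (dept_no));
    INSERT INTO departments VALUES (1, 'Sales'), (2, 'Accounts');
    -- Hypothetical sample data: the company-wide average age is 40.
    INSERT INTO workers VALUES (1, 'Amal', 25, 1), (2, 'Ben', 35, 1),
                               (3, 'Chen', 30, 2), (4, 'Dina', 70, 2);
""")

# Reading used in the suggested solutions: any worker below the company average.
per_worker = conn.execute("""
    SELECT d.department_name
    FROM departments d JOIN workers w ON d.dept_no = w.dept_no
    WHERE w.age < (SELECT AVG(age) FROM workers)
    GROUP BY d.department_name
    ORDER BY d.department_name
""").fetchall()

# Strict reading: the department's own average below the company average.
per_department = conn.execute("""
    SELECT d.department_name
    FROM departments d JOIN workers w ON d.dept_no = w.dept_no
    GROUP BY d.department_name
    HAVING AVG(w.age) < (SELECT AVG(age) FROM workers)
    ORDER BY d.department_name
""").fetchall()

print(per_worker)
print(per_department)
```

With this data, Accounts has one worker younger than the company average but a departmental average above it, so it appears in the first result only.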
Exercise 6
Using a nested sub-query, get the first names of all those workers who have the maximum age for
the whole company. Remember to use Group By if you are getting repeating values.
Suggested Solution:
Select w.first_name
From workers w
Where w.age = (select max(age) from workers)
Group by w.first_name;
Exercise 1
Write the SQL for creating the view for the following transaction:
List the details for damage: to which boats, during which hire periods and for which customers.
Suggested Answer:
Create View DamageDetails As
Select b.BoatName, c.Customer, r.RentalStartDate, r.RentalEndDate, d.DamageType
From Boat b, Customer c, Rental r, Damage d
Where r.BoatID = b.BoatID
And r.CustomerID = c.CustomerID
And d.CustomerID = r.CustomerID
And d.BoatID = r.BoatID
And d.RentalStartDate = r.RentalStartDate;
Exercise 2
Explain what you think the purpose and effect of this view would be for the Boat Hire system.
Suggested Answer:
The purpose is to group all the information about the damage that the users need to see in one
convenient place; in this case it will be stored in a view. The effect should be to speed up retrievals
of damage information. Without the view, each time this information is needed, the whole query
would have to be written out. This query joins all four tables and, as can be seen from the length of
the join statement, means that in a database with lots of rows, there could be a significant effect on
the performance of the query. This would be manifested in the time it took for the query to come
back with a result.
If the view is set up however, then a simple select from the view would give all the data needed.
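The point about a simple select from the view can be tried out with the sketch below, which uses Python's sqlite3 module rather than the DBMS used in the laboratories. The column names follow the suggested answer to Exercise 1; the view name DamageDetails, the table layouts and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Boat (BoatID INTEGER PRIMARY KEY, BoatName TEXT);
    CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, Customer TEXT);
    CREATE TABLE Rental (BoatID INTEGER, CustomerID INTEGER,
                         RentalStartDate TEXT, RentalEndDate TEXT,
                         PRIMARY KEY (BoatID, CustomerID, RentalStartDate));
    CREATE TABLE Damage (BoatID INTEGER, CustomerID INTEGER,
                         RentalStartDate TEXT, DamageType TEXT);

    -- Hypothetical sample data: two rentals, one of which resulted in damage.
    INSERT INTO Boat VALUES (1, 'Kingfisher'), (2, 'Heron');
    INSERT INTO Customer VALUES (10, 'A. Hassan');
    INSERT INTO Rental VALUES (1, 10, '2011-06-01', '2011-06-03'),
                              (2, 10, '2011-07-01', '2011-07-02');
    INSERT INTO Damage VALUES (1, 10, '2011-06-01', 'Scratched hull');

    -- The four-table join is written once, inside the view definition.
    CREATE VIEW DamageDetails AS
    SELECT b.BoatName, c.Customer, r.RentalStartDate, r.RentalEndDate, d.DamageType
    FROM Boat b, Customer c, Rental r, Damage d
    WHERE r.BoatID = b.BoatID
      AND r.CustomerID = c.CustomerID
      AND d.CustomerID = r.CustomerID
      AND d.BoatID = r.BoatID
      AND d.RentalStartDate = r.RentalStartDate;
""")

# Users of the view no longer need to write the join themselves.
rows = conn.execute("SELECT * FROM DamageDetails").fetchall()
print(rows)
```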
Exercise 3
Use online resources to look for jobs advertised for database development work. What sorts of skills
are being required for the jobs? What software is involved?
Suggested Answer:
There should be a discussion during the tutorial session about the sorts of skills and software
that are required. Different database vendors should be highlighted. The development of application
software that accesses databases (PHP, VB.net, Oracle Forms etc.) should be pointed out.
Exercise 4
One of the definitions of a transaction is that it should possess four basic properties, usually
remembered by the abbreviation ACID:
Atomicity
Consistency
Isolation
Durability
Exercise 1:
In groups, review the work you carried out during your private study.
Exercise 2
Give an explanation as to your understanding of a business rule. Using a system you are familiar
with, from an example in the course materials or through personal experience, specify business
rules that apply to that system.
Suggested Answer:
Definition of a business rule: A rule, procedure or way of doing something that applies to a particular
business.
The examples the students give should demonstrate that they understand that business rules are
something that is in addition to the normal integrity rules of relational databases (although a
business rule could be enforced with an integrity constraint.) For example, a rule that all boats must
be blue, red or green is something particular to a business that would have to be enforced, e.g. with
a domain.
Exercise 3
A student record system consists of three tables: Student, Module and StudentModule. The ER
diagram is shown below:
Student 1 ---- 0..* StudentModule 0..* ---- 1 Module
Data dictionary
Student
StudentID (PK)
StudentFirstName
StudentLastName
StudentAddress
StudentAge
StudentModule
StudentID (PK)(FK)
ModuleCode (PK)(FK)
Semester
Year
Result
Module
ModuleCode (PK)
ModuleName
Complete a CRUD matrix for the following transactions:
a.
b.
List a student's personal details and results for all of the modules they have taken. Include
the module name.
c.
d.
e.
f.
Suggested Answer:
Transaction/Table
Student
StudentModule
Module
Exercise 4
A number of business rules have been defined for this student records system:
1.
All students should have an enrolment date recorded for them and a final completion date.
All students should be deleted from the system three years after their completion date.
2.
3.
Discuss how each of these business rules might be enforced on the system. This might require the
creation of new attributes or other database structures.
Suggested Solution:
1.
Attributes for enrolment date and completion date would have to be added to the Student
table. There would have to be a transaction to delete students three years after their
completion date.
2.
A new attribute of StudentType would have to be added to the Student table. The domain
for this (with values Home and Overseas) could be supported in a number of ways. This
could be the creation of a separate domain or a check constraint on the Student table.
3.
The reason why students would NOT be able to retake a module at the moment is that the
Primary Key on StudentModule is the StudentID and the ModuleCode. If the student took
the module again, as things stand, then the Primary Key would be duplicated. Therefore the
change that would need to be made to make it possible for the student to retake the module
would be for some other attribute or attributes to become part of the primary key. This
might be Year and Semester.
Topic 11
Topic 11: Database Implementation
11.1 Learning Objectives
This topic provides an overview of database implementation issues and the implementation
environment.
On completion of the topic, students will be able to:
11.3 Timings
Lectures:
2 hours
Private Study:
7.5 hours
Tutorials:
2 hours
This slide presents aspects of implementation, such as creating the tables from the
data dictionary by writing the create scripts (this will include any tables used to
enforce domains). Aspects of implementation also include creating other structures
in the database, such as domains themselves, indexes and views.
Slide 5:
Slide 6:
Domains can be enforced in a number of ways. The ISO SQL standard has
specified a statement that creates domains as separate structures in the database.
This is an example of the syntax of such a statement; it creates a domain called
Allowable_Colours, which can have one of three values: 'Red', 'Blue' or 'Green'. It
also specifies a default of 'Red', which means that if no colour is specified when this
domain is used, then it will be set to 'Red'.
Create domain Allowable_Colours As Char(5)
Default 'Red'
Check (Value in ('Red', 'Blue', 'Green'));
Slide 7:
Slide 8:
This slide presents an example of referential integrity, which is where a foreign key
references another table. This example is from the laboratory session in Topic 1.
Slide 9:
This slide presents the propagation constraint, which is that when two tables are
related and there is an update in one, it can affect the other. What would happen to
all the records for hire and damage in our Boat Hire system if there was an update
or deletion of one of the Boats? The answer will depend on how we have set up our
database. But if it were just the case that we could delete Boats without doing
anything about the records on the Rental table, then there would be a lot of Rental
records that referred to Boats that no longer existed.
Slide 10:
This slide presents the create script for the rental table:
Base Relation
Create Table Rental
(BoatID number NOT NULL,
CustomerID number NOT NULL,
RentalStartDate date NOT NULL,
RentalEndDate date NOT NULL,
Primary Key (BoatID, CustomerID, RentalStartDate),
Foreign Key (CustomerID) REFERENCES Customer (CustomerID),
Foreign Key (BoatID) REFERENCES Boat (BoatID)
On delete no action
On update cascade);
What the script is saying is that if a boat is deleted from the Boat table, no action
is taken on the Rental records that refer to it. Because those records would then
refer to a boat that no longer exists, the DBMS rejects the deletion, so we keep a
genuine record of our business. However, if a boat is updated in some way that
affects the primary key, then this will also be updated in the rental record.
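The behaviour of the two propagation settings in the Rental script can be observed with the sketch below. It uses Python's sqlite3 module as a stand-in for the DBMS used in the course, so the data types differ from the script on the slide, foreign key enforcement has to be switched on explicitly, and the sample rows are hypothetical. In SQLite, On delete no action leaves the Rental row untouched and the delete of the referenced boat is rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    CREATE TABLE Boat (BoatID INTEGER PRIMARY KEY, BoatName TEXT);
    CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT);
    CREATE TABLE Rental (
        BoatID INTEGER NOT NULL,
        CustomerID INTEGER NOT NULL,
        RentalStartDate TEXT NOT NULL,
        RentalEndDate TEXT NOT NULL,
        PRIMARY KEY (BoatID, CustomerID, RentalStartDate),
        FOREIGN KEY (CustomerID) REFERENCES Customer (CustomerID),
        FOREIGN KEY (BoatID) REFERENCES Boat (BoatID)
            ON DELETE NO ACTION
            ON UPDATE CASCADE
    );
    INSERT INTO Boat VALUES (1, 'Kingfisher');
    INSERT INTO Customer VALUES (10, 'A. Hassan');
    INSERT INTO Rental VALUES (1, 10, '2011-06-01', '2011-06-03');
""")

# Deleting a boat that still has a rental: nothing is done to the Rental row,
# so the foreign key would be left dangling and the statement fails.
try:
    conn.execute("DELETE FROM Boat WHERE BoatID = 1")
    delete_rejected = False
except sqlite3.IntegrityError:
    delete_rejected = True

# Changing the boat's primary key: the new value is cascaded to Rental.
conn.execute("UPDATE Boat SET BoatID = 7 WHERE BoatID = 1")
rental_boat_ids = [row[0] for row in conn.execute("SELECT BoatID FROM Rental")]

print(delete_rejected, rental_boat_ids)
```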
Slide 11:
Slide 12:
Slide 13:
There are a number of ways of dealing with the knock-on effect of an action being
performed on one table that will affect another table. This effect is known as
propagation and the ways of dealing with it are propagation constraints. The various
settings for propagation constraints are presented on this slide.
No action means that nothing is done to the record in the table with the foreign
key; the DBMS rejects the change to the parent table if it would leave that
foreign key referring to a record that no longer exists.
Set Default means that the change in the parent table (in this case Boat)
causes the record in the child table to be set to some sort of default, e.g. a
boat could be deleted and in the Rental record, the foreign key would be set
to some value, such as 'X', which indicates that the record refers to a boat
that no longer exists on the system.
Set Null is similar to Set Default except that the table with the foreign key has
that foreign key set to null.
A domain is the set of allowable values that an attribute can have. Domain
constraints are rules for specifying how this set of allowable values can be enforced
on the database. For example, if we say that a Boat can only be of a number of
types, then how do we make sure that this is the case? We do so by enforcing the
domain with a constraint. There are a number of ways we can implement this
constraint. Firstly, we add a column to the Boat table called BoatType. The
allowable values for BoatType can be enforced in several ways:
This slide shows how to enforce a domain using a check constraint. A check
constraint is a rule placed on an attribute on a table that specifies that the values for
that attribute must obey that rule. In this case, the rule is that the value of BoatType
must be either Yacht, Cruiser or Rower.
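A check constraint of this kind can be sketched as follows, using Python's sqlite3 module as a stand-in for the DBMS used in the course. The three boat types come from the slide notes; the other columns and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Boat (
        BoatID   INTEGER PRIMARY KEY,
        BoatName TEXT NOT NULL,
        BoatType TEXT NOT NULL
                 CHECK (BoatType IN ('Yacht', 'Cruiser', 'Rower'))
    )
""")

# A value inside the domain is accepted.
conn.execute("INSERT INTO Boat VALUES (1, 'Kingfisher', 'Yacht')")

# A value outside the domain violates the check constraint.
try:
    conn.execute("INSERT INTO Boat VALUES (2, 'Heron', 'Canoe')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

boat_count = conn.execute("SELECT COUNT(*) FROM Boat").fetchone()[0]
print(rejected, boat_count)
```

The list of allowable values is fixed in the table definition, which is why this approach suits domains that rarely change.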
Slide 14:
This slide shows how to enforce a domain, by setting up a separate domain object
in the database. Here, a domain of BoatType is set up and the table Boat refers to it
in the definition of the BoatTypeCode attribute.
Slide 15:
This slide shows how to enforce a domain constraint as a foreign key to another
table. As mentioned in the lecture in Topic 10, if the values of the domain are likely
to be dynamic, then it would be better to implement them as a look-up table.
Boat Type is set up as a separate table that is referenced by the Boat table. The
advantage of this is that if a new Boat Type is used by the business, then all that is
needed is to add a row with details of it to the Boat Type table. Note that it is the
BoatTypeCode that is now on Boat. To get the full description of the boat type in
any query, the tables would have to be joined. This could have an effect on
performance.
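The look-up table approach can be sketched as below, again using Python's sqlite3 module as a stand-in. The BoatTypeCode attribute is named in the slide notes; the Description column, the single-letter codes and the new Kayak type are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    CREATE TABLE BoatType (BoatTypeCode TEXT PRIMARY KEY, Description TEXT NOT NULL);
    CREATE TABLE Boat (
        BoatID INTEGER PRIMARY KEY,
        BoatName TEXT NOT NULL,
        BoatTypeCode TEXT NOT NULL REFERENCES BoatType (BoatTypeCode)
    );
    INSERT INTO BoatType VALUES ('Y', 'Yacht'), ('C', 'Cruiser'), ('R', 'Rower');
    INSERT INTO Boat VALUES (1, 'Kingfisher', 'Y');
""")

# A new boat type only needs a new row in the look-up table, not a schema change.
conn.execute("INSERT INTO BoatType VALUES ('K', 'Kayak')")
conn.execute("INSERT INTO Boat VALUES (2, 'Otter', 'K')")

# A code that is not in the look-up table is outside the domain.
try:
    conn.execute("INSERT INTO Boat VALUES (3, 'Heron', 'Z')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# The full description is only available by joining the two tables.
descriptions = conn.execute("""
    SELECT b.BoatName, t.Description
    FROM Boat b JOIN BoatType t ON b.BoatTypeCode = t.BoatTypeCode
    ORDER BY b.BoatID
""").fetchall()
print(rejected, descriptions)
```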
Slide 16:
A table constraint is where a constraint forms part of the table definition, but does
not fall into one of the other categories.
The example on this slide limits the number of times a boat can be rented to fewer
than 10. Table constraints can be dropped using the Alter Table statement.
Slides 17-18:
Note that as well as the statement for inserting multiple rows of data, different
vendors have products which automate the process. These tools make it easier to
load large amounts of data into a database when that data already exists in an
electronic form. It is increasingly the case that databases are built not just to
replace paper-based systems, but to replace other computerised systems
instead. Data in the older systems will be stored in various formats. The data
loading tools read these files and put the data into a format which
will allow loading into a relational database.
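The idea behind a data loading tool can be illustrated with a small sketch: read a file exported from an older system and insert all of its rows in one operation. The example uses Python's csv and sqlite3 modules rather than any vendor's loader, and the exported data is hypothetical and inlined so that the sketch is self-contained.

```python
import csv
import io
import sqlite3

# Hypothetical export from an older system, inlined instead of read from disk.
LEGACY_EXPORT = """BoatID,BoatName,BoatType
1,Kingfisher,Yacht
2,Heron,Cruiser
3,Otter,Rower
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Boat (BoatID INTEGER PRIMARY KEY, BoatName TEXT, BoatType TEXT)")

# Convert each line of the file into a tuple matching the table's columns.
reader = csv.DictReader(io.StringIO(LEGACY_EXPORT))
rows = [(int(r["BoatID"]), r["BoatName"], r["BoatType"]) for r in reader]

# Load every row in a single transaction; it is rolled back if any row fails.
with conn:
    conn.executemany("INSERT INTO Boat VALUES (?, ?, ?)", rows)

loaded = conn.execute("SELECT COUNT(*) FROM Boat").fetchone()[0]
print(loaded)
```

Vendor tools add features this sketch lacks, such as format descriptions, error logs and fast load paths.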
Slide 19:
Slide 20:
Slide 21:
The Oracle environment has tools that support various languages, such as Java
and XML, and help them interact with the database. This means that the database
would be able to interpret data from a wide range of sources.
Slide 22:
Slide 23:
The objects that are supported by the Oracle environment include physical
implementation of all the objects that exist in the logical structure in the relational
model: tables, indexes, views, domains etc.
Additional objects include objects themselves in the object-oriented sense of
having internal structure. For example, an address column could be defined as an
object type address which has a structure of address lines, city and post or zip
code.
PL/SQL is a procedural extension of SQL with all the capacity that this implies: if
statements, loops etc. It allows for the writing of complex programs that can be
stored on the database itself in the form of:
Stored functions
Stored procedures
Triggers.
A trigger is a piece of logic that will begin to operate (or fire) when some action is
carried out. Usually some type of database transaction will cause a trigger to fire.
For example, there is a trigger known as an On-Update trigger. This trigger will fire
when a transaction updates a table. What happens when the trigger fires will
depend on what logic is put into the trigger. An example would be that the trigger
could call a function or procedure that did something like create a record in another
table for audit purposes.
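An update trigger that writes an audit record can be sketched as follows. The example uses SQLite's trigger syntax through Python's sqlite3 module, not PL/SQL, so the syntax differs from what students would write in Oracle; the table, column and trigger names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Boat (BoatID INTEGER PRIMARY KEY, BoatName TEXT);
    CREATE TABLE BoatAudit (BoatID INTEGER, OldName TEXT, NewName TEXT);

    -- Fires once for each row whose BoatName is updated.
    CREATE TRIGGER BoatNameAudit
    AFTER UPDATE OF BoatName ON Boat
    FOR EACH ROW
    BEGIN
        INSERT INTO BoatAudit VALUES (OLD.BoatID, OLD.BoatName, NEW.BoatName);
    END;

    INSERT INTO Boat VALUES (1, 'Kingfisher');
""")

# The update itself makes no mention of the audit table; the trigger adds the row.
conn.execute("UPDATE Boat SET BoatName = 'Heron' WHERE BoatID = 1")
audit_rows = conn.execute("SELECT * FROM BoatAudit").fetchall()
print(audit_rows)
```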
Slide 24:
Slide 25:
Logical structure includes tablespaces. These are logical storage units that
contain the other logical structures, such as tables. Tables and other objects will be
defined as belonging to a particular schema. The logical structure also includes
definitions of users.
Slide 26:
Datafiles are the physical files that contain the logical structures, such as the
tablespace, which in turn contains objects. Redo log files record all the changes to
the data, so that in the event of some kind of system failure, then data can be
recovered. Control files contain a list of all the other files in the database.
Slide 27:
The Oracle instance is the set of processes and memory areas needed to allow and
control access to a database. Some examples of these are:
System Global Area - an area of shared memory used to store data for one
instance, which in turn contains other memory structures: database buffer
cache, redo log buffer, shared pool.
Program Global Area - private (non-shared) memory for each server process.
User Processes
Oracle Processes including log writer and recovery processes.
Lecturer's Notes:
Please note that the Private Study exercise for this topic requires organisation in order to ensure
suitable topic coverage (see Section 11.6 below). You should ensure that students know which topic
they have been assigned following the lecture session(s).
11.5.1 Views
A view can be thought of as a virtual table. It is the result of an SQL select operation and, to the
user, looks like a table with rows and columns. However, unlike a table, it does not necessarily exist
permanently in the database.
The syntax to create a view is similar to a select statement, but with a Create View added. For
example:
Create View WorkersOverThirty
As Select Emp_no, First_name, Last_name
From Workers
Where Age > 30;
Exercise 1
Run the above script and then run a select statement to see all the data from it.
Suggested Solution:
Run the above script then:
Select * from WorkersOverThirty;
Exercise 2
Create a view that will contain the last name and the job title for all the workers in Cairo.
Suggested Solution:
Create View CairoWorkers
As Select last_name, job_title
From Workers where dept_no = (select dept_no from departments where location = 'Cairo');
Note: There will be different ways of doing this, for example using a join instead of a
sub-query, but it is not acceptable to use the Workers table only and have the dept_no = 1
because the student knows this is the dept_no of the department in Cairo.
Exercise 3
Create a summary view that includes the emp_no, first_name, last_name, department_name,
location, the job type and the salary (from the job type table).
Suggested Solution:
Create View Summary
As Select w.emp_no, w.first_name, w.last_name, d.department_name, d.location, j.job_title ,j.salary
From Workers w, Departments d, Job_Type j
Where w.dept_no = d.dept_no
And w.job_title = j.job_title;
Exercise 4
Now recreate the summary view from Exercise 3, but make it only for Workers who earn more than
25000.
Note: You will have to give it a different name from the previous summary.
Suggested Solution:
Create View HighEarnerSummary
As Select w.emp_no, w.first_name, w.last_name, d.department_name, d.location, j.job_title, j.salary
From Workers w, Departments d, Job_Type j
Where w.dept_no = d.dept_no
And w.job_title = j.job_title
And j.salary > 25000;
11.5.2 Indexes
Exercise 5
An index is a structure in a database that helps queries run more quickly. This will be discussed in
more detail in a coming lecture. Indexes can be unique, meaning that they will prevent a duplicate
value from being added to that column, or they can be non-unique.
The syntax to create a unique index for the Workers table column Emp_no is:
Create Unique Index EmpNoIndex on Workers(Emp_No);
Run this script.
If you need to get rid of this index, the syntax is:
Drop Index EmpNoIndex;
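The effect of the unique index can be checked with the sketch below, which runs the same Create and Drop statements through Python's sqlite3 module. The Workers table layout and sample rows are assumptions, and the query plan text is specific to SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Workers (Emp_No INTEGER, First_name TEXT, Last_name TEXT);
    CREATE UNIQUE INDEX EmpNoIndex ON Workers (Emp_No);
    INSERT INTO Workers VALUES (1, 'Amal', 'Said');
""")

# The unique index prevents a second row with the same Emp_No.
try:
    conn.execute("INSERT INTO Workers VALUES (1, 'Ben', 'Okoro')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Ask SQLite how it would run a look-up on Emp_No; the last column of each
# plan row is a text description that names the index if one is used.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM Workers WHERE Emp_No = 1"
).fetchall()
uses_index = any("EmpNoIndex" in row[-1] for row in plan)

# Once the index is dropped, the duplicate value is accepted again.
conn.execute("DROP INDEX EmpNoIndex")
conn.execute("INSERT INTO Workers VALUES (1, 'Ben', 'Okoro')")
worker_count = conn.execute("SELECT COUNT(*) FROM Workers").fetchone()[0]

print(duplicate_rejected, uses_index, worker_count)
```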
11.5.3 Constraints
Exercise 6
As well as having constraints to enforce primary and foreign keys, constraints can also be added to
enforce a business rule. This will be discussed in more detail in a coming lecture. The example
below enforces a rule that all our workers must be 70 or younger.
Alter table Workers
Add Constraint Valid_age
Check (age < 71);
Run this script and then see what happens if you try to update someone's age to over 70.
Exercise 1
You should prepare a short presentation on the database architecture of the vendor that you used to
implement your assignment. Focus should be on the logical structure and the physical structure. The
degree of detail that you will need to present should be guided by the lecture slides, i.e. it should be
an overview in your own words rather than a detailed technical paper.
Exercise 2
Your lecturer will assign one of the following topics, concerning bulk loading facilities, to you.
Prepare a short report about the features and the facilities of the tool that you are assigned to
investigate.
Bulk insert in SQL server
http://sqlserver2000.databases.aspfaq.com/how-do-i-load-text-or-csv-file-data-into-sql-server.html
Oracle SQL loader
http://oreilly.com/catalog/orsqlloader/chapter/ch01.html
MySQL uses something called 'Bulk Insert'
http://mysql.bigresource.com/Bulk-insert-from-text-files-dDPRzHYo.html#2t6P0D5I
Exercise 3:
Review the materials for all the topics up to this week and prepare questions for the final overview
lecture in Topic 12.
Exercise 1:
Vendor Presentation
Give your presentation on the database architecture of the vendor you have chosen to the rest of
the group.
Take notes on interesting points while other students are speaking. Your tutor will also lead a
discussion to summarise the findings of the class at the end of the presentations.
Exercise 2
Work in a small group with other students who have written a report on the same topic during
private study time.
Discuss the information you have found. You should take the opportunity to add any additional
information to your own notes.
Now prepare to present your information to students who have worked on the other report. You
should work together as a group to prepare a short (5 minutes), informal presentation which will give
the other students a summary of the main information you have found.
Exercise 3
Work with your group to present your information to students from the other groups. You should also
answer any questions they might have.
Now listen to their presentations and take notes.
Topic 12
Topic 12: Summary
12.1 Learning Objectives
This topic provides an overview of the module materials as a whole.
On completion of the topic, students will be able to:
12.3 Timings
Lectures:
2 hours
Private Study:
7.5 hours
Summary of module
Clarification of module material and related issues as identified by students
Identify links to other modules/subject areas
This lecture recaps the list of topics covered during the module. It should be noted
that the overview given in this week's topic should serve as a pointer to further
revision of material and not as a definitive statement of all that needs to be known
from the module for the purposes of the final summative assessment. The slides
revisit some of the key features of the topics that have been studied.
Slide 5:
This slide provides examples of databases in use. There are various definitions
given in textbooks and this slide presents the definition given by one of the founding
fathers of modern databases, C. J. Date: "A database is a computerised record
keeping system." This definition is sufficient as a starting point, but highlight to
students that some people include manual filing systems as being a type of
database. Databases have the capacity to store, manipulate and retrieve data. We
keep data there (storage), we do things to that data via programs and applications
(manipulation) and we need to be able to get the data out of the database when we
need it (retrieval).
Slide 6:
Data duplication
A customer's name, address and phone number might be stored many times
over, i.e. once in the customer file and once again every time they make a
rental (therefore possibly many times in the rental file).
This wastes space and also raises the more serious problem of compromising
data integrity. Data integrity refers to data being logically consistent. For
example, if a customer changes his or her name or address, then all the files
containing that data must be updated, but the danger with duplication is that
this does not happen. The address might be changed in one file and not in
another, which would lead to difficulties in knowing which the correct address
is.
Incompatible files
Due to the application program dependency, files that can be processed by
one programming language will be different to those processed by another.
This makes files difficult to combine, which reinforces the isolation and
separation of data that we discussed earlier.
Slide 7:
This slide presents how the database approach overcomes the problems
associated with pre-database systems:
Integrated data
In a database system, all the application data is stored in a single facility
called a database. An application program can access customer information
and rental information easily. The program can specify how to combine the
data and the DBMS will do it.
Program/Data independence
Since the record formats are stored in the database itself (as metadata) then
we do not need to include file information in our application programs.
stored in tables. It is the job of the DBMS to store and retrieve data in these
tables. When a user wants to see the data in other formats, such as on a
screen or in a report, then we have to develop applications to do so.
Slides 8-9:
Data are raw facts kept in a computerised system. An example was given of data
in a sales database. Data in a sales database would include facts, such as a
customer name, address and telephone number. This is quite simple data which is
made up of short pieces of text. Numerical data, such as the amount that a customer
spent last year, might also be stored. Today, this definition has to be expanded to
reflect a new reality since databases store objects, such as whole documents,
photographic images, sound and video.
Traditionally there has been a distinction made between data and information.
Information is data that has been processed in such a way that it can increase the
knowledge of the person who uses it.
Slide 10:
Metadata is data that contains the structure of other data. The structure of the
tables in a relational database is kept within the database itself in the form of
metadata. This defines the name of the table, the name of the column, the length of
the column and the data-type.
Slide 11:
know how to access this in order to be able to retrieve data from the database. The
collection of metadata in a database is known as the data dictionary.
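Most DBMSs let the data dictionary be queried like any other data. The sketch below shows this in SQLite through Python's sqlite3 module; the sqlite_master table and the table_info pragma are SQLite's own mechanisms (other vendors use different catalogue views), and the Customer table is a hypothetical example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Customer (
        CustomerID INTEGER PRIMARY KEY,
        CustomerName VARCHAR(40),
        Phone VARCHAR(20)
    )
""")

# sqlite_master holds one row of metadata for each table, index and view.
tables = [row[0] for row in
          conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")]

# PRAGMA table_info returns (cid, name, type, notnull, dflt_value, pk) per column.
columns = [(name, col_type) for _, name, col_type, *_ in
           conn.execute("PRAGMA table_info(Customer)")]

print(tables)
print(columns)
```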
Slide 12:
The students could be asked at this point about their understanding of data-types,
e.g. what data-types are available? It should be pointed out that different
implementations of relational databases from different vendors might have slightly
different names for the same data-type, even though there are standards.
Slide 13:
Relationship - The way one entity is linked to another. This represents real
world relationships, such as a customer buying a product. It can also
represent concepts within our data model, such as customer types relating to
customers.
Slides 14-15:
The students were asked in Topic 4 to draw the ER for this scenario. They should
be at the stage where they are comfortable being able to understand such a
scenario and construct the relevant ER and accompanying data dictionary with
primary and foreign keys and other appropriate attributes.
Slide 16:
Slides 17-18:
Relations and Tables - Tables can be thought of as the most basic structure
in the relational model. Within the model itself, as opposed to its
implementations, the equivalent of a table is known as a relation. Tables are
made up of attributes. Relations are implemented in the database as
two-dimensional tables made up of columns and rows.
Domain - The set of valid values of an attribute. For example, the valid values
in the domain sex would be Male and Female.
Ask the students to give a definition of 1st, 2nd and 3rd Normal Form. The students
should also be able to normalise a document such as the one in the example. In
Topic 6, this process was illustrated in detail by breaking it down into steps.
Students should revisit this and be confident they understand the process at each
of the steps. There is also an example in the tutorial materials for Topic 6.
Slide 19:
The key concepts in SQL are create, insert, update, delete and select statements.
The students should understand how each of these important parts of SQL works.
Students should refer to the lectures and the laboratory work to make sure they
understand these concepts.
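As a quick revision aid, the five statements can be run in sequence against a throwaway table. The sketch below uses Python's sqlite3 module; the Customer table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Create: define the table.
conn.execute("CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT)")

# Insert: add two rows.
conn.execute("INSERT INTO Customer VALUES (1, 'A. Hassan')")
conn.execute("INSERT INTO Customer VALUES (2, 'B. Okoro')")

# Update: change one attribute of one row.
conn.execute("UPDATE Customer SET CustomerName = 'A. Hassan-Reid' WHERE CustomerID = 1")

# Delete: remove one row.
conn.execute("DELETE FROM Customer WHERE CustomerID = 2")

# Select: retrieve what is left.
remaining = conn.execute("SELECT CustomerID, CustomerName FROM Customer").fetchall()
print(remaining)
```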
Slide 20:
Point out that database development involves skills from other disciplines within
ICT, e.g. systems analysis. There are particular issues with regards to database
development that are different from other types of development, for example the
development of networks or websites. This module focuses on the unique
elements.
Requirements gathering is part of systems analysis. There are different
methodologies available, but recently iterative approaches have been popular.
These entail a large amount of user involvement, for example by using prototypes
that are shown to the user to enable them to better identify what they want.
Slide 21:
Slide 22:
A transaction is one or more operations that are carried out on the database.
Transactions can generally be identified as retrievals, inserts, updates and deletes.
This is sometimes usefully remembered by the acronym CRUD (Create, Retrieve,
Update, Delete). Transactions can be cross-referenced with tables by constructing
a CRUD matrix that shows which of the operations in a transaction (Create,
Retrieve, Update and/or Delete) affect which tables.
Slide 23:
What is de-normalisation and what is its purpose? Normalising our data model
means we will have the minimum amount of redundancy. However, this can have
an effect on performance. If we are running a query that joins tables, this will be
slower than running a query against a single table or view.
Slide 24:
Slide 25:
This slide lists how constraints can be enforced using SQL as discussed in Topic
11.
Slide 26:
This slide looks at links with other modules. The Level 5 Database Development
module focuses specifically on the development aspects. Again, using SQL the
students will develop a system from an example scenario. They will also gain a
greater understanding of the development process and study each phase of it in
detail.
Systems Analysis modules are an important part of understanding how the
requirements for database systems are arrived at in the first place. Other modules
that deal with web technology will also enable students to understand aspects of
the interaction between web applications and databases.
Slides 27-28:
This is a good opportunity for an open question session. Students should have
prepared questions. These can have been e-mailed to the lecturer or raised in open
discussion.
Metadata
Fan traps
Chasm traps
Constraints on data
In order to make sure that you can show your understanding of the above, read through the lecture
slides and make short notes on each of the points. Revise from these notes. You can also ask your
tutor for guidance.
Describe
Entity types
In order to make sure that you can describe the above, read through the lecture slides and make
detailed notes on each of the points. Revise from these notes. You can also ask your tutor for
guidance.
Produce
In order to make sure that you can produce the above, go through the appropriate laboratory, tutorial
and private study exercises and make sure that you can answer the questions. If you are having
difficulties answering the questions, you may need to either revisit the lecture slides or ask your tutor
for guidance.