Problem
Define the following terms: data, database, DBMS, database system, database catalog,
program-data independence, user view, DBA, end user, canned transaction, deductive database
system, persistent object, meta-data, and transaction-processing application.
Step-by-step solution
Step 1 of 14
Data
The word data is derived from the Latin datum, meaning 'something given'; data are real, given facts, from which additional facts can be inferred. Data is a collection of known facts that can be recorded and that have implicit meaning.
Step 2 of 14
Database
A database is a collection of related data or operational data extracted from a firm or organization. In other words, a collection of organized data is called a database.
Step 3 of 14
DBMS (Database Management System)
A DBMS is a collection of programs that enables users to create, maintain, and manipulate a database. The DBMS is a general-purpose software system that facilitates the processes of defining, constructing, and manipulating databases.
Step 4 of 14
Database Systems
A database system comprises a database of operational data together with the processing functionality required to access and manage that data. The combination of the DBMS and the database is called a database system.
Step 5 of 14
Database Catalog
A database catalog contains a complete description of the databases that are stored: the database objects, the database structure, details of users, constraints, and so on.
Step 6 of 14
Program-data independence
In traditional file processing, the structure of the data files is "hard-coded" into the programs. To change the structure of a data file, one or more programs that access that file must be changed, and the process of changing them can introduce errors. In contrast to this more traditional approach, a DBMS stores the structure of the data in a catalog, separating the DBMS programs from the data definition. Storing the data definition separately from the programs is known as program-data independence.
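To make this concrete, here is a minimal sketch using Python's built-in sqlite3 module; the STUDENT table and its columns are illustrative, not from the text. The program discovers the file structure from the DBMS catalog at run time instead of hard-coding it:

```python
import sqlite3

# In-memory database; the table structure lives in the DBMS catalog,
# not in this application program (hypothetical STUDENT table).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE STUDENT "
    "(Name TEXT, Student_number INTEGER, Class INTEGER, Major TEXT)")

# Read the structure from the catalog at run time: adding a column later
# would not require changing this code.
columns = [row[1] for row in conn.execute("PRAGMA table_info(STUDENT)")]
```

Here `PRAGMA table_info` is SQLite's interface to its catalog; other DBMSs expose the same information through tables such as `INFORMATION_SCHEMA.COLUMNS`.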
Step 7 of 14
User View
The way in which the database appears to a particular user is called a user view.
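As a sketch (Python's sqlite3, with an illustrative STUDENT table and data), a view can present only the part of the database one user group needs:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE STUDENT "
    "(Name TEXT, Student_number INTEGER, Class INTEGER, Major TEXT)")
conn.execute(
    "INSERT INTO STUDENT VALUES ('Smith', 17, 1, 'CS'), ('Brown', 8, 2, 'CS')")

# This user group sees only CS students, and only two of the columns.
conn.execute(
    "CREATE VIEW CS_STUDENTS AS "
    "SELECT Name, Class FROM STUDENT WHERE Major = 'CS'")
rows = conn.execute("SELECT Name FROM CS_STUDENTS ORDER BY Name").fetchall()
```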
Step 8 of 14
DBA (Database Administrator)
The DBA is the person responsible for authorizing access to the database, coordinating and monitoring its use, and acquiring software and hardware resources as needed.
Step 9 of 14
End User
End users are the people who want to access the database for different purposes, such as querying, updating, and generating reports.
Step 10 of 14
Canned Transactions
Canned transactions are standardized queries and updates on the database performed using carefully programmed and tested programs.
Step 11 of 14
Deductive Database System
A deductive database system is a database system that supports the proof-theoretic view of a database and, in particular, is capable of deducing or inferring additional facts from the facts given in the extensional database by applying specified deductive axioms or rules of inference to those given facts.
Step 12 of 14
Persistent object
Object-oriented database systems are compatible with programming languages such as C++ and Java. An object that is stored in such a way that it survives the termination of the DBMS program is called a persistent object.
Step 13 of 14
Meta Data
Information about the data is called meta-data. The information stored in the catalog is meta-data; the schema of a table is an example of meta-data.
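A small illustration, assuming Python's sqlite3: the DBMS's own catalog table (sqlite_master in SQLite) stores meta-data describing an illustrative COURSE table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE COURSE "
    "(Course_name TEXT, Course_number TEXT, Credit_hours INTEGER)")

# sqlite_master is the catalog: it holds data about the data,
# i.e. the meta-data describing COURSE.
meta = conn.execute(
    "SELECT type, name FROM sqlite_master WHERE name = 'COURSE'").fetchone()
```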
Step 14 of 14
Transaction-Processing Application
A transaction is a logical unit of database processing that includes one or more database operations, such as insertion, deletion, modification, and retrieval. The database operations that form a transaction can either be embedded within an application program or be specified interactively via a high-level query language such as SQL.
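The idea can be sketched with Python's sqlite3; the ACCOUNT table and transfer amounts are hypothetical. The two updates form one logical unit: they are committed together, or rolled back together if anything fails:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ACCOUNT (Acct_no INTEGER PRIMARY KEY, Balance INTEGER)")
conn.execute("INSERT INTO ACCOUNT VALUES (1, 100), (2, 50)")
conn.commit()

# A transfer is one transaction: both updates succeed or neither does.
try:
    conn.execute("UPDATE ACCOUNT SET Balance = Balance - 30 WHERE Acct_no = 1")
    conn.execute("UPDATE ACCOUNT SET Balance = Balance + 30 WHERE Acct_no = 2")
    conn.commit()
except sqlite3.Error:
    conn.rollback()

balances = dict(conn.execute("SELECT Acct_no, Balance FROM ACCOUNT"))
```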
Chapter 1, Problem 2RQ
Problem
What four main types of actions involve databases? Briefly discuss each.
Step-by-step solution
Step 1 of 5
• Database Administration
• Database Designing
• Database Use
• System Analysis
Step 2 of 5
• Database Administration:
• Database administration is the process of administering the database resources, such as application programs and the database management system.
• The Database Administrator (DBA) is responsible for giving permission to access the database.
• The administrative work also includes acquiring the software and hardware resources.
Step 3 of 5
• Database designing:
• Database designing is the process of designing the database, which includes identifying the data to be stored in the database and the data structures required to store that data.
• The database design should fulfill the requirements of all the user groups of the organization.
Step 4 of 5
• End users are the users who directly access the database for querying, updating, and generating reports. The following are the types of end users:
o Casual end user: These are users who access the database occasionally. Middle- and high-level managers are examples of casual end users.
o Parametric end user: These are users who constantly access the database. Bank tellers are examples of parametric end users.
o Sophisticated end user: These fall into the category of engineers and scientists who implement applications to meet complex requirements.
o Standalone users: These are users who maintain personal databases by using ready-made program packages.
Step 5 of 5
• System analysis is the process that determines the requirements of the end users.
• System analysis is done by system analysts. System analysts develop the specifications for canned transactions that meet the requirements of the end users.
Chapter 1, Problem 3RQ
Problem
Discuss the main characteristics of the database approach and how it differs from traditional file
systems.
Step-by-step solution
Step 1 of 4
Characteristics of the Database Approach:
A fundamental characteristic of the database approach is that the database system contains not only the database itself but also a complete definition or description of the database structure and constraints.
• The information stored in the catalog is called meta-data, and it describes the structure of the primary database.
• In traditional file processing, data definition is typically part of the application programs themselves. Those programs are constrained to work with only one specific database, whose structure is declared in the application programs.
Step 2 of 4
In traditional file processing, the structure of data files is embedded in the application programs, so any changes to the structure of a file may require changing all programs that access that file.
• In the database approach, the structure of data files is stored in the DBMS catalog separately from the access programs.
Step 3 of 4
A database typically has many users, each of whom may require a different perspective or view of the database.
• A multi-user DBMS whose users have a variety of distinct applications must provide facilities for defining multiple views.
Step 4 of 4
• In traditional file processing, no such data sharing is possible, and no such concurrency-control software is available.
Chapter 1, Problem 4RQ
Problem
What are the responsibilities of the DBA and the database designers?
Step-by-step solution
Step 1 of 2
Responsibilities of the DBA:
DBA stands for Database Administrator. The role of a database administrator is highly technical; the DBA is responsible for managing the database used in the organization.
• The database administrator has the responsibility to build the physical design of the database.
o Security enforcement
Step 2 of 2
The database designer is the architect of the database. The database designer's work is versatile, and he/she works with everyone in the organization. The responsibilities of the database designer are as follows:
• They communicate the architecture to business and management and may also participate in business development as advisors.
Chapter 1, Problem 5RQ
Problem
What are the different types of database end users? Discuss the main activities of each.
Step-by-step solution
Step 1 of 2
The end users perform various database operations such as querying, updating, and generating reports. The types of end users are:
• Casual end users
• Naïve or parametric end users
• Sophisticated end users
• Standalone users
Step 2 of 2
Casual end users:
• Each time they access the database, their request will vary.
• They use a sophisticated database query language to retrieve data from the database.
Naïve or parametric end users:
• They spend most of their time querying and updating the database using standard types of queries.
Sophisticated end users:
• They access the database to implement their own applications to meet their specific goals.
• Engineers, scientists, and business analysts are sophisticated end users.
Standalone users:
• They maintain their own databases by creating one using ready-made program packages that provide a graphical user interface.
Chapter 1, Problem 7RQ
Problem
Discuss the differences between database systems and information retrieval systems.
Step-by-step solution
Step 1 of 14
Database Approach:– A database is more than a file; it contains information about more than one entity and information about relationships among the entities.
Information retrieval systems:– In an information retrieval system, data are stored in files, a very old but often-used approach to system development.
Step 2 of 14
Database approach:– Data about a single entity (e.g., product, customer, department) are stored in a “table” in the database.
Step 3 of 14
Information retrieval systems: Each program (system) often had its own unique set of files.
Step 4 of 14
Database approach: Databases are designed to meet the needs of multiple users and to be used
in multiple applications.
Step 5 of 14
Information retrieval systems: Users of information retrieval systems are almost always at the mercy of the information department to write programs that manipulate stored data and produce the needed information.
Step 6 of 14
Database approach: Database systems are relatively complex to design, implement, and maintain.
Step 7 of 14
Information retrieval systems: Information retrieval systems are very simple to design and
implement as they are normally based on a single application or information system.
Step 8 of 14
Database approach: The processing speed is slower in comparison to information retrieval systems.
Step 9 of 14
Information retrieval systems:– The processing speed is faster than other ways of storing data.
Step 10 of 14
Other Differences :–
Step 12 of 14
Improved data sharing is present in a database, but in the case of information retrieval systems there is only limited data sharing.
Step 13 of 14
In a database, flexibility and scalability are present, but in a retrieval system the data are not flexible or scalable.
Step 14 of 14
A database reduces data redundancy, but in the case of information retrieval systems data redundancy is one of the important problems.
Chapter 1, Problem 8E
Problem
Identify some informal queries and update operations that you would expect to apply to the
database shown in Figure 1.2.
Step-by-step solution
Step 1 of 2
Informal Queries:–
b) List the names of students who took the section of the ‘Database’ course offered in fall 2005 and their grades in that section.
Step 2 of 2
Update Operations:–
b) Create a new section for the ‘Database’ course for this semester.
c) Enter a grade of ‘A’ for ‘Smith’ in the ‘Database’ section of last semester.
Chapter 1, Problem 9E
Problem
What is the difference between controlled and uncontrolled redundancy? Illustrate with
examples.
Step-by-step solution
Step 1 of 3
Storing the same facts or data at multiple places in the database is considered redundancy. In other words, duplication of data is known as redundancy. Uncontrolled redundancy can lead to:
• Inconsistency of data
Step 2 of 3
Step 3 of 3
Assume that an employee can work on multiple projects. So, in the WORKS table, empno and deptno are redundant if an employee works on two or more projects.
Figure 1 is an example of controlled redundancy: deptno for empno 100 is the same in all three records.
Figure 2 is an example of uncontrolled redundancy: deptno for empno 100 is inconsistent in the two records.
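Since the figures are not reproduced here, the uncontrolled case can be sketched with Python's sqlite3 (hypothetical WORKS rows): an employee whose redundant deptno copies disagree can be detected with a query:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# WORKS stores deptno redundantly with each project row (hypothetical schema).
conn.execute("CREATE TABLE WORKS (empno INTEGER, deptno INTEGER, projno INTEGER)")
# Uncontrolled redundancy: the two copies of empno 100's deptno disagree.
conn.executemany("INSERT INTO WORKS VALUES (?, ?, ?)",
                 [(100, 10, 1), (100, 20, 2), (200, 30, 3)])

# Count employees whose redundant deptno copies are inconsistent.
inconsistent = conn.execute(
    "SELECT COUNT(*) FROM (SELECT empno FROM WORKS "
    "GROUP BY empno HAVING COUNT(DISTINCT deptno) > 1)").fetchone()[0]
```

Controlled redundancy would keep the copies but guarantee (via the DBMS) that this count is always zero.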
Chapter 1, Problem 10E
Problem
Specify all the relationships among the records of the database shown in Figure 1.2.
Step-by-step solution
Step 1 of 2
Relationships in the database specify how the data tables are related to each other.
Step 2 of 2
• Consider the tables COURSE and SECTION. The two tables have common column
“Course_number”.
• Consider the tables STUDENT and GRADE_REPORT. The two tables have common column
“Student_number”.
• Consider the tables COURSE and PREREQUISITE. The two tables have common column
“Course_number”.
• Consider the tables SECTION and GRADE_REPORT. The two tables have common column
“Section_identifier”.
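As a sketch (Python's sqlite3, with simplified columns from Figure 1.2), the common column Course_number is what lets a record in SECTION be related to its record in COURSE:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE COURSE (Course_name TEXT, Course_number TEXT)")
conn.execute("CREATE TABLE SECTION (Section_identifier INTEGER, Course_number TEXT)")
conn.execute("INSERT INTO COURSE VALUES ('Database', 'CS3380')")
conn.execute("INSERT INTO SECTION VALUES (85, 'CS3380')")

# Joining on the shared Course_number column realizes the relationship.
row = conn.execute(
    "SELECT c.Course_name, s.Section_identifier "
    "FROM COURSE c JOIN SECTION s ON c.Course_number = s.Course_number").fetchone()
```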
Chapter 1, Problem 11E
Problem
Give some additional views that may be needed by other user groups for the database shown in
Figure 1.2.
Step-by-step solution
Step 1 of 2
A new view can be created that lists, for each section, each student in the section and the student's grade.
GRADE_SEC_REPORT
This view is very helpful for the university's administration to print each section's grade report.
Step 2 of 2
An additional view can be created that lists the total number of courses taken by a student and the grade achieved by the student in those courses.
COURSE_GRADE_REPORT
This view is very helpful for the university's administration to determine students' honors.
Chapter 1, Problem 12E
Problem
Cite some examples of integrity constraints that you think can apply to the database shown in
Figure 1.2.
Step-by-step solution
Step 1 of 1
4. The prerequisite of each course must have been a course offered in the past or must be an existing course.
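A constraint like this can be enforced declaratively. The sketch below uses Python's sqlite3 with simplified COURSE/PREREQUISITE tables; note that SQLite enforces foreign keys only when the pragma is enabled:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.execute("CREATE TABLE COURSE (Course_number TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE PREREQUISITE ("
             "Course_number TEXT REFERENCES COURSE(Course_number), "
             "Prerequisite_number TEXT REFERENCES COURSE(Course_number))")
conn.execute("INSERT INTO COURSE VALUES ('CS3380'), ('CS1310')")
conn.execute("INSERT INTO PREREQUISITE VALUES ('CS3380', 'CS1310')")  # both exist: OK

# A prerequisite that is not an existing course violates the constraint.
violated = False
try:
    conn.execute("INSERT INTO PREREQUISITE VALUES ('CS3380', 'CS9999')")
except sqlite3.IntegrityError:
    violated = True
```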
Chapter 1, Problem 13E
Problem
Give examples of systems in which it may make sense to use traditional file processing instead
of a database approach.
Step-by-step solution
Step 1 of 2
Despite the advantages of using a database approach, there are some situations in which a
DBMS may involve unnecessary overhead costs that would not be incurred in traditional file
processing.
Step 2 of 2
The following are examples of systems in which it may make sense to use traditional file
processing instead of a database approach.
• Many computer-aided design (CAD) tools used by chemical and civil engineers have proprietary file and data management software that is geared toward the internal manipulation of drawings and 3D objects.
• Similarly, communication and switching systems designed by companies such as AT&T.
• GIS implementations often implement their own data organization schemes for efficiently implementing functions related to processing maps, physical contours, lines, polygons, and so on. General-purpose DBMSs are inadequate for their purpose.
Chapter 1, Problem 14E
Problem
a. If the name of the ‘CS’ (Computer Science) Department changes to ‘CSSE’ (Computer
Science and Software Engineering) Department and the corresponding prefix for the course
number also changes, identify the columns in the database that would need to be updated.
b. Can you restructure the columns in the COURSE, SECTION, and PREREQUISITE tables so
that only one column will need to be updated?
Step-by-step solution
Step 1 of 2
a) The following columns need to be updated when the name of the department changes along with the course-number prefix.
In the STUDENT table, Major has to be updated. In the COURSE table, Course_number and
Department should be updated. In the SECTION table, Course_number should be updated. In
the PREREQUISITE table, Course_number and Prerequisite_number are to be modified.
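Part a) can be sketched with Python's sqlite3 on a simplified COURSE table (one illustrative row): every column that embeds the 'CS' prefix must be rewritten:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE COURSE (Course_name TEXT, Course_number TEXT, Department TEXT)")
conn.execute("INSERT INTO COURSE VALUES ('Database', 'CS3380', 'CS')")

# Both the Department column and the prefix inside Course_number must change.
conn.execute("UPDATE COURSE SET Department = 'CSSE' WHERE Department = 'CS'")
conn.execute("UPDATE COURSE SET Course_number = 'CSSE' || substr(Course_number, 3) "
             "WHERE Course_number LIKE 'CS%'")
row = conn.execute("SELECT Course_number, Department FROM COURSE").fetchone()
```

The same UPDATE on Course_number would also have to be run against SECTION and PREREQUISITE, which is exactly what part b)'s restructuring aims to avoid.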
Step 2 of 2
Chapter 2, Problem 1RQ
Problem
Define the following terms: data model, database schema, database state, internal schema,
conceptual schema, external schema, data independence, DDL, DML, SDL, VDL, query
language, host language, data sublanguage, database utility, catalog, client/server architecture,
three-tier architecture, and n-tier architecture.
Step-by-step solution
Step 1 of 19
Data model
The data model describes the logical structure of the database and it introduces abstraction in
the DBMS (Database Management System). The data model provides a tool to describe the data
and their relationships.
Step 2 of 19
Database Schema
The database schema describes the overall design of the database. It is a basic structure to
define how the data is organized in the database. The database schema can be depicted by the
schema diagrams.
Step 3 of 19
Database state
The actual data stored in the database at a moment in time is called the database state.
Step 4 of 19
Internal Schema
It is also referred to as the physical-level schema. The internal schema represents the structure of the data as viewed by the DBMS and describes the physical storage structure of the database.
Step 5 of 19
Conceptual Schema
It is also referred to as the Logical level schema. It describes the logical structure of the whole
database for a group of users. It hides the internal details of the physical storage structure.
Step 6 of 19
External Schema
The external schema is also referred to as the user-level schema. It describes the data viewed by the end users. This schema describes the part of the database relevant to a user group and hides the rest of the database from that group.
Step 7 of 19
Data independence
The capacity to change the schema at the physical level of a database system without affecting
the schema at the conceptual or external level is called data independence.
Step 8 of 19
DDL
DDL stands for Data Definition Language. It is used to create, alter, and drop the database
tables, views, and indexes.
Step 9 of 19
DML
DML stands for Data Manipulation Language. It is used to insert, retrieve, update, and delete the
records in the database.
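A minimal sketch contrasting the two, using Python's sqlite3 with an illustrative STUDENT table: the CREATE TABLE statement is DDL, while INSERT, UPDATE, SELECT, and DELETE are DML:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the structure.
conn.execute("CREATE TABLE STUDENT (Student_number INTEGER PRIMARY KEY, Name TEXT)")

# DML: manipulate the records.
conn.execute("INSERT INTO STUDENT VALUES (17, 'Smith')")
conn.execute("UPDATE STUDENT SET Name = 'Smith, J.' WHERE Student_number = 17")
name = conn.execute("SELECT Name FROM STUDENT WHERE Student_number = 17").fetchone()[0]
conn.execute("DELETE FROM STUDENT WHERE Student_number = 17")
count = conn.execute("SELECT COUNT(*) FROM STUDENT").fetchone()[0]
```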
Step 10 of 19
SDL
SDL stands for Storage Definition Language. It is used to specify the internal schema of the database and to specify the mapping between the two schemas.
Step 11 of 19
VDL
VDL stands for View Definition Language. It specifies the user views and their mappings to the
logical schema in the database.
Step 12 of 19
Query Language
The query language is a high-level language used to retrieve the data from the database.
Step 13 of 19
Host Language
The host language is used for application programming in a database. The DML commands are
embedded in a general-purpose language to manipulate the data in the database.
Step 14 of 19
Data Sublanguage
When DML commands are embedded in a general-purpose programming language (the host language), the embedded DML is called the data sublanguage of the host language.
Step 15 of 19
Database utility
The database utility is a software module to help the DBA (Database Administrator) to manage
the database.
Step 16 of 19
Catalog
The catalog stores the complete description of the database structure and its constraints.
Step 17 of 19
Client/server architecture
The client/server architecture is a database architecture that contains two modules. The client module, usually running on a PC or workstation, provides the user interface. The server module responds to user queries and provides services to the client machines.
Step 18 of 19
Three-tier architecture
The three-tier architecture consists of three layers: the client, the application server, and the database server. The client machine usually contains the user interface; the intermediate layer (application server) runs the application programs and stores the business rules; and the database layer stores the data.
Step 19 of 19
n-tier architecture
The n-tier architecture consists of four or five tiers. The intermediate layer or business logic layer is divided into multiple layers, distributing the programming and data throughout the network.
Chapter 2, Problem 2RQ
Problem
Discuss the main categories of data models. What are the basic differences among the relational
model, the object model, and the XML model?
Step-by-step solution
Step 1 of 2
The main categories of data models are:
• High-level or conceptual data models, which provide concepts close to the way many users perceive data (for example, the entity-relationship model).
• Representational or implementation data models, which provide concepts that users can understand but that are not too far from the way data is organized in storage (for example, the relational model).
• Low-level or physical data models, which describe the details of how data is stored in the computer.
Step 2 of 2
The differences between the relational model, the object model, and the XML model are as follows:
Relational Model: The data is represented logically, together with information about the relationship types.
Object Model: It refers to the model that deals with how applications will interact with resources from any external source.
XML Model: The data in the XML model is in hierarchical form; different types of data can be defined in a single XML document.
Chapter 2, Problem 3RQ
Problem
What is the difference between a database schema and a database state?
Step-by-step solution
Step 1 of 1
A database schema is a description of the database, and the database state is the data in the database itself.
The description of a database is called the database schema, which is specified during database design and is not expected to change frequently. Most data models have certain conventions for displaying schemas as diagrams. A displayed schema is called a schema diagram. A schema diagram displays the structure of each record type but not the actual instances of records; it displays only some aspects of a schema, such as the names of record types and data items, and some types of constraints.
The data in the database at a particular moment in time is called a database state. It is also called the current set of occurrences or instances in the database. In a given database state, each schema construct has its own current set of instances. Many database states can be constructed to correspond to a particular database schema. Every time we insert or delete a record, or change the value of a data item in a record, we change one state of the database into another state.
When we define a new database, we specify its database schema to the DBMS. At this point, the corresponding database state is the empty state with no data. The DBMS is partly responsible for ensuring that every state of the database is a valid state, that is, a state that satisfies the structure and constraints specified in the schema.
The schema is sometimes called the intension, and a database state is called an extension of the schema.
Chapter 2, Problem 4RQ
Problem
Describe the three-schema architecture. Why do we need mappings among schema levels? How
do different schema definition languages support this architecture?
Step-by-step solution
Step 1 of 3
Three-schema architecture :-
The goal of the three-schema architecture is to separate the user applications from the physical database. In this architecture, schemas can be defined at the following three levels:
It has an internal schema, which describes the physical storage structure of the database.
It has a conceptual schema, which describes the structure of the whole database for a community of users. The conceptual schema hides the details of physical storage structures and concentrates on describing entities, data types, relationships, user operations, and constraints.
Step 2 of 3
It includes a number of external schemas or user views. Each external schema describes the part of the database that a particular user group is interested in and hides the rest of the database from that group. A high-level data model or an implementation data model can be used at this level.
Need for mappings :-
The processes of transforming requests and results between levels are called mappings.
The conceptual/internal mapping defines the correspondence between the conceptual view and the stored database. It specifies how conceptual records and fields are represented at the internal level.
An external/conceptual mapping defines the correspondence between a particular external view and the conceptual view.
Step 3 of 3
DDL :-
The data definition language is used to specify conceptual and internal schemas for the database and any mappings between the two. The DBMS will have a DDL compiler whose function is to process DDL statements in order to identify descriptions of the schema constructs and to store the schema description in the DBMS catalog.
SDL :-
The storage definition language is used to specify the internal schema. The mappings between the two schemas may be specified in either of these languages. In most relational DBMSs today, there is no specific language that plays the role of SDL; instead, the internal schema is specified by a combination of parameters and specifications related to storage.
VDL :-
The view definition language is used to specify user views and their mappings to the conceptual schema, but in most DBMSs the DDL is used to define both conceptual and external schemas. In relational DBMSs, SQL is used in the role of VDL to define user or application views as results of predefined queries.
Chapter 2, Problem 5RQ
Problem
What is the difference between logical data independence and physical data independence?
Which one is harder to achieve? Why?
Step-by-step solution
Step 1 of 3
Data independence refers to the capacity to change the schema at one level of a database system without affecting the schema at the next higher level.
There are the following two ways in which data independence is achieved:
Step 2 of 3
Logical data independence is the capacity to change the conceptual schema without changing
the external schema. This only requires changing the view definition and the mappings. For
example, changing the constraints of an attribute that does not affect the external schema,
insertion and deletion of data items that changes the table size but does not affect the external
schema.
Physical data independence is the capacity to change the internal schema without changing the
conceptual schema or the external schema. For example, reorganization of files on the physical
storage to enhance the operations on the database and since the data is the same and only the
files are relocated, the conceptual/external schema remains unaffected.
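Physical data independence can be sketched with Python's sqlite3: creating an index changes the internal schema (storage and access paths) while the query and its result, which depend only on the conceptual schema, are unchanged. The table and index names are illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE STUDENT (Name TEXT, Major TEXT)")
conn.execute("INSERT INTO STUDENT VALUES ('Smith', 'CS')")

query = "SELECT Name FROM STUDENT WHERE Major = 'CS'"
before = conn.execute(query).fetchall()

# Internal-schema change: an index alters the physical access path,
# but neither the query text nor its result is affected.
conn.execute("CREATE INDEX idx_major ON STUDENT(Major)")
after = conn.execute(query).fetchall()
```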
Step 3 of 3
Logical data independence is harder to achieve. Changing the attribute constraints or the structure of a table might result in invalid data for the changed attributes. The views or application programs that reference the modified table would be affected, which should not be the case under logical data independence.
Chapter 2, Problem 6RQ
Problem
What is the difference between procedural and nonprocedural DMLs?
Step-by-step solution
Step 1 of 2
Procedural DML :-
Procedural data manipulation language is called low-level DML. A procedural DML must be embedded in a general-purpose programming language. This type of DML typically retrieves individual records or objects from the database and processes each separately. Therefore, it needs to use programming-language constructs, such as looping, to retrieve and process each record from a set of records.
Step 2 of 2
Non-procedural DML :-
Non-procedural DML is called high-level DML. It can be used on its own to specify complex database operations concisely. Many DBMSs allow high-level DML statements either to be entered interactively from a display monitor or terminal, or to be embedded in a general-purpose programming language.
A query in a high-level DML often specifies which data to retrieve rather than how to retrieve it. Therefore, such languages are also called declarative.
A non-procedural DML requires a user to specify what data are needed without specifying how to get the data.
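The contrast can be sketched in Python with sqlite3 (hypothetical EMP table): the procedural style loops over records in the host language, while the declarative SQL statement says only what to retrieve:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMP (Name TEXT, Dept TEXT)")
conn.executemany("INSERT INTO EMP VALUES (?, ?)",
                 [('A', 'Sales'), ('B', 'HR'), ('C', 'Sales')])

# Procedural style: retrieve record by record and process each in a loop.
procedural = []
for name, dept in conn.execute("SELECT Name, Dept FROM EMP"):
    if dept == 'Sales':
        procedural.append(name)

# Non-procedural (declarative) style: state WHAT to retrieve, not HOW.
declarative = [r[0] for r in
               conn.execute("SELECT Name FROM EMP WHERE Dept = 'Sales' ORDER BY Name")]
```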
Chapter 2, Problem 7RQ
Problem
Discuss the different types of user-friendly interfaces and the types of users who typically use
each.
Step-by-step solution
Step 1 of 7
(a)
Menu-Based interfaces:
• These interfaces contain the lists of options through which the user can send the request.
• These types of interfaces are used by the web browsing users and web clients.
Step 2 of 7
(b)
Forms-based interfaces:
• These forms are usually designed and programmed for naive users as interfaces to canned transactions.
• Used by users who want to submit information online by filling in and submitting forms.
• Mostly used to create accounts on a website, enroll in an institution, etc.
Step 3 of 7
(c)
Graphical user interfaces:
• A graphical user interface presents a schema to the user in diagrammatic form.
• These interfaces use a mouse as a pointing device to pick certain parts of the displayed schema diagram.
• Mostly used by users of electronic gadgets such as mobile phones and touch screens.
• Used by users whose applications are accessed through pointing devices.
Step 4 of 7
(d)
Natural language interfaces:
• These interfaces accept a request written in natural language from the user and attempt to interpret it.
• A natural language interface has its own schema, which is similar to the database conceptual schema.
• Search engines nowadays use natural language interfaces.
• Users can use these search engines, which accept words and retrieve the related information.
Step 5 of 7
(e)
Speech input and output interfaces:
• These interfaces accept speech as input and produce speech as output.
• These types of interfaces are used for telephone directory inquiries or to get flight information over smart gadgets, etc.
Step 6 of 7
(f)
Interfaces for parametric users:
• Parametric users, such as bank tellers, have a small set of operations that they must perform repeatedly.
• These interfaces contain commands to perform a request with a minimum of keystrokes.
Step 7 of 7
(g)
Interfaces for the DBA:
• These interfaces contain privileged commands for creating accounts, manipulating the database, and performing other operations on the database.
Chapter 2, Problem 8RQ
Problem
With what other computer system software does a DBMS interact?
Step-by-step solution
Step 1 of 7
A database management system (DBMS) is a set of programs that empowers users to build and maintain a database.
Step 2 of 7
List of other computer system software a database management system (DBMS) interacts with:
The following is a list of other computer system software a DBMS interacts with:
• CASE tools.
• Data dictionary (information repository) systems.
• Application development environments.
• Communication software.
Step 3 of 7
CASE tools:
The design phase of database systems often employs CASE tools.
Step 4 of 7
Data dictionaries:
Data dictionaries are similar to the database management system catalog; however, they include a wider variety of information.
• Typically, data dictionaries can be accessed directly by the database administrator (DBA) whenever required.
Step 5 of 7
Application development environments:
Examples of application development environments include:
o JBuilder (Borland)
o PowerBuilder (Sybase)
Step 6 of 7
• The information repository is a kind of data dictionary that also stores information such as design decisions, application program descriptions, usage standards, and user information.
• Like data dictionaries, an information repository can also be accessed directly by the database administrator.
Step 7 of 7
Communication software:
• The database management system also requires interfacing with communication software.
• The main function of the communication software is to enable users at locations remote from the database system to access the database through personal computers or workstations.
• The communication software connects to the database system through communications hardware such as routers, local networks, phone lines, or satellite communication devices.
Chapter 2, Problem 9RQ
Problem
What is the difference between the two-tier and three-tier client/server architectures?
Step-by-step solution
Step 1 of 2
The difference between a two-tier architecture and a three-tier architecture lies in the number of layers through which data and queries pass during processing.
In a two-tier architecture there are two layers: a client layer (user interface) and a query server or transaction server. Application programs run on the client side, and when data processing is required, a connection is established with the server (DBMS), where the data is stored. Once the connection is established, transaction and query requests are sent using Open Database Connectivity (ODBC) APIs and are processed on the server side. It may also happen that the client side takes care of user interaction and query processing while the server stores data, manages disks, and so on. The exact distribution of functionality differs, but a two-tier architecture always has two layers.
Step 2 of 2
In a three-tier architecture there are three layers; a new application or web layer sits between the client layer and the database server layer. The idea behind the three-tier architecture is to partition roles into different layers, with each layer having a specific task. The user or client layer provides the user interface from which the user can run a query. The query gets processed at the application or web server layer. This layer also checks any business constraints that may be imposed on the type of query a user can send, and verifies the user's credentials and access permissions. This layer can also be called the business logic layer. Finally, the database server manages the storage of data in the system.
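A minimal sketch of the middle (business logic) layer, assuming a made-up table and access rule; sqlite3 again stands in for the database tier:

```python
import sqlite3

# Database tier (bottom layer), with illustrative data.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE flight (num TEXT, seats INTEGER)")
db.execute("INSERT INTO flight VALUES ('AA10', 120)")

ALLOWED_TABLES = {"flight"}  # hypothetical business rule

def query_tier(table, column):
    """Middle tier: check the business constraint, then forward the
    request to the database tier and return the result."""
    if table not in ALLOWED_TABLES:
        raise PermissionError(f"access to {table!r} is not permitted")
    return db.execute(f"SELECT {column} FROM {table}").fetchall()

print(query_tier("flight", "num"))  # [('AA10',)]
```

The client layer would call `query_tier` over the network instead of issuing SQL itself; a production middle tier would also parameterize queries rather than interpolate names.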
Chapter 2, Problem 10RQ
Problem
Discuss some types of database utilities and tools and their functions.
Step-by-step solution
Step 1 of 2
A few categories of database utilities and tools, and their functions, are:
1. Loading:
• Loads existing data files, such as text files, into the database.
• Transfers data easily from one DBMS to another; this is used in many organizations.
• Vendors offer conversion tools for this purpose; such tools are useful loading programs.
2. Backup:
• Copies the entire database onto tape; these backup copies can be used to recover the system state in the case of a catastrophic loss.
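A loading utility of the kind described can be sketched as follows; the file contents, table, and column names are made up:

```python
import csv
import io
import sqlite3

# Minimal sketch of a loading utility: read rows from an existing
# text file (CSV here) and insert them into a database table.
text_file = io.StringIO("1,Ravi\n2,Mina\n")  # stands in for a real file

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE student (number INTEGER, name TEXT)")
db.executemany("INSERT INTO student VALUES (?, ?)", csv.reader(text_file))

count = db.execute("SELECT COUNT(*) FROM student").fetchone()[0]
print(count)  # 2
```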
Step 2 of 2
3. File reorganization:
• It is a utility that can be used to restructure a set of database files into a different file organization to improve the performance of the database.
4. CASE tools:
• These are used to support the design phase of database systems.
5. Information repository:
• This is a repository used to store the design process, user information, and application program descriptions.
6. Performance monitoring:
• This collects statistics on database usage. The statistics are used by the DBA in making decisions related to file restructuring and indexing to improve the performance of the database.
Chapter 2, Problem 11RQ
Problem
Step-by-step solution
Step 1 of 1
It is customary to divide the layers between the user and the stored data in the three-tier architecture into finer components, thereby giving rise to an n-tier architecture, where n may be 4 or 5. Typically, the business logic layer is divided into multiple layers.
2. Each tier can run on an appropriate processor or operating system platform and can be handled independently.
Another layer typically used by vendors of ERP and CRM packages is the middleware layer, which allows the front-end modules to communicate with a number of back-end databases.
Chapter 2, Problem 13E
Problem
Choose a database application with which you are familiar. Design a schema and show a sample database for that application, using the notation of Figures 1.2 and 2.1. What types of additional information and constraints would you like to represent in the schema? Think of several users of your database, and design a view for each.
Step-by-step solution
Step 1 of 2
• Each flight is identified by Number, consists of one or more FLIGHT_LEGs with Leg_no, and flies on certain weekdays.
• Each FLIGHT_LEG has a scheduled arrival and departure time, an arrival and departure airport, and one or more LEG_INSTANCEs, one for each Date on which the flight travels.
• A FARE is kept for each flight, and there are certain sets of restrictions on the FARE.
• For each FLIGHT_LEG instance, SEAT_RESERVATIONs are kept, as are the AIRPLANE used on each leg and the actual arrival and departure times and airports.
Step 2 of 2
a. The asked flight number or flight leg must be available on the given date. This can be checked from the LEG_INSTANCE table.
b. A non-reserved seat must exist for the specified date and flight. The total number of available seats can be obtained from AIRPLANE.
c. A Flight_leg must correspond to an existing flight number.
e. A Leg_instance can have entries only for a valid Flight_number and Leg_number combination.
f. A Flight_number in any relation must be of a valid flight that has its entry in the FLIGHT table.
Chapter 2, Problem 14E
Problem
If you were designing a Web-based system to make airline reservations and sell airline tickets,
which DBMS architecture would you choose from Section 2.5? Why? Why would the other
architectures not be a good choice?
Step-by-step solution
Step 1 of 4
There are four architectures discussed in Section 2.5 of the textbook.
Step 2 of 4
For designing a Web-based system to make airline reservations and sell airline tickets, the three-tier client/server architecture will be the best choice.
• A web user interface is necessary, as different types of users, such as naive or casual users, will interact with the system.
• Users can interact with the user interface and submit transactions.
• The web server can handle those transactions, validate the data, and manipulate the database accordingly.
Step 3 of 4
In the centralized DBMS architecture, the DBMS functionality and the user interface are performed on the same system. But for a Web-based system, they must be on different systems.
Step 4 of 4
In the two-tier architecture, the business logic must be bundled with either the client or the database server, which does not scale well for the many remote users of a web-based system. In the three-tier client/server architecture, the business logic is placed in a separate application server or web server.
Hence, the basic client/server architecture and the two-tier client/server architecture are not appropriate for a web-based system.
Chapter 2, Problem 15E
Problem
Consider Figure 2.1. In addition to constraints relating the values of columns in one table to columns in another table, there are also constraints that impose restrictions on values in a column or a combination of columns within a table. One such constraint dictates that a column or a group of columns must be unique across all rows in the table. For example, in the STUDENT table, the Student_number column must be unique (to prevent two different students from having the same Student_number). Identify the column or the group of columns in the other tables that must be unique across all rows in the table.
Step-by-step solution
Step 1 of 2
The database tables are constructed by using the schema diagram of the database. Each database table contains columns, and certain columns (or combinations of columns) must be unique.
Step 2 of 2
1. STUDENT: Student_number
2. COURSE: Course_number. If the course name is distinct for each course, Course_name can also be a unique column.
3. PREREQUISITE: Course_number can be a unique identifier, but only if a course has a single prerequisite; otherwise Course_number and Prerequisite_number together form the unique combination.
4. SECTION: Section_identifier
• Note that Section_identifier is unique only within a given course offering in a given term.
• The Section_identifier will be different if a student takes the same course or a different course in another term.
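The Student_number constraint above can be sketched with a UNIQUE (here PRIMARY KEY) column; the table layout is illustrative:

```python
import sqlite3

# Sketch: enforcing uniqueness of Student_number across all rows.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE student (
    student_number INTEGER PRIMARY KEY,  -- must be unique across rows
    name TEXT)""")
db.execute("INSERT INTO student VALUES (17, 'Smith')")

try:
    db.execute("INSERT INTO student VALUES (17, 'Brown')")  # duplicate key
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True  # the DBMS refuses the second row
print(duplicate_rejected)  # True
```

A multi-column combination, such as (Course_number, Prerequisite_number), would use a table-level UNIQUE constraint instead.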
Chapter 3, Problem 1RQ
Problem
Discuss the role of a high-level data model in the database design process.
Step-by-step solution
Step 1 of 2
A high-level data model provides concepts for presenting data that are close to the way users perceive data. It helps to express the data requirements of the users as a detailed description of the entity types, relationships, and constraints.
Step 2 of 2
The role of a high-level data model in the database design process is as follows:
• A high-level data model is easy to understand and useful for communicating with non-technical users.
• The model acts as a reference to ensure that all user requirements are met and do not conflict with each other.
• A high-level data model lets database designers concentrate on specifying the properties of the data, without being concerned with storage details, during the database design process.
Chapter 3, Problem 2RQ
Problem
List the various cases where use of a NULL value would be appropriate.
Step-by-step solution
Step 1 of 2
1. When the value of an attribute does not apply to a particular entity, a NULL value is appropriate.
For example: in a schema that stores information about a person, suppose there is an attribute called Company, which stores the name of the company where the person works. For a student who is not working, this attribute value will be irrelevant, so a NULL value can be put in its place.
Step 2 of 2
2. When the value of a particular attribute is not known, either because it is not known whether a value exists or because the existing value is unknown, NULL can be used as the value.
For example: in the same schema, for a given person it is possible that he is not working, or it might be the case that the company where the person works is unknown; in either case a NULL value can be put in its place.
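Both cases come down to storing NULL in the column; a small sketch with made-up data:

```python
import sqlite3

# Sketch: NULL (None in Python) for a not-applicable or unknown
# Company value; table and rows are illustrative.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE person (name TEXT, company TEXT)")
db.execute("INSERT INTO person VALUES ('Asha', 'Acme')")
db.execute("INSERT INTO person VALUES ('Ravi', NULL)")  # student, N/A

# NULL is tested with IS NULL, not with the = operator.
rows = db.execute("SELECT name FROM person WHERE company IS NULL").fetchall()
print(rows)  # [('Ravi',)]
```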
Chapter 3, Problem 3RQ
Problem
Define the following terms: entity, attribute, attribute value, relationship instance, composite
attribute, multivalued attribute, derived attribute, complex attribute, key attribute, and value set
(domain).
Step-by-step solution
Step 1 of 5
1. Entity: An entity is an object (thing) with an independent physical (car, home, person) or conceptual (company, university course) existence in the real world.
2. Attribute: Each real-world entity (thing) has certain properties that represent its significance in the real world or describe it. These properties of an entity are known as attributes.
For example, consider a car: various things that describe a car are its model, manufacturer, color, cost, etc.
All of these are relevant in a miniworld and are important in describing a car. They are attributes of a CAR.
Step 2 of 5
3. Attribute value: Associated with each real-world entity are certain attributes that describe that entity. The value of these attributes for a particular entity is called an attribute value.
For example: the value of the Color attribute of a car entity can be Red.
4. Relationship instance: A relationship instance is an association of entities, with exactly one entity from each participating entity type.
For example: in the relationship type WORKS_FOR between the two entity types EMPLOYEE and DEPARTMENT, which associates each employee with the department for which the employee works, each relationship instance in the relationship set WORKS_FOR associates one EMPLOYEE and one DEPARTMENT.
Step 3 of 5
5. Composite attribute: An attribute that can be divided into smaller subparts, which represent more basic attributes with independent meanings, is called a composite attribute.
For example: consider a phone number attribute of an employee of a company. One can have the phone number as a single attribute or as two attributes, viz. area code and number. Since the phone number can be broken into two independent attributes, it is a composite attribute.
Whether to keep a composite attribute whole or divide it into basic attributes depends on how the attribute is used in the miniworld.
6. Multivalued attribute: For a real-world entity, an attribute may have more than one value. For example: the phone number attribute of a person. A person may have one, two, or three phones, so there is a possibility of more than one value for this attribute. Any attribute that can have more than one value is a multivalued attribute.
Step 4 of 5
7. Derived attribute: For a real-world entity, an attribute may have a value that is independent of other attributes or cannot be derived from other attributes; such attributes are called stored attributes. There are also certain attributes whose values can be derived using the values of other attributes; such attributes are known as derived attributes.
For example: the date of birth of a person is a stored attribute, and using the DOB attribute and the current date, the age of a person can be calculated; so Age is a derived attribute.
8. Complex attribute: Composite and multivalued attributes can be nested arbitrarily. Arbitrary nesting can be represented by grouping the components of a composite attribute between parentheses () and separating the components with commas, and by displaying multivalued attributes between braces {}. Such attributes are called complex attributes.
For example: if a person has more than one residence and each residence has multiple phones, an Address_phone attribute can be specified as:
{Address_phone({Phone(Area_code,Ph_num)},Address(Street_address(Number,Street,Apartment_number),City,State,Zip))}
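The same nesting can be mirrored in an ordinary data structure, with lists standing for the multivalued parts ({}) and dicts for the composite parts (()); the values are made up:

```python
# Sketch: the complex attribute Address_phone as a nested structure.
address_phone = [                       # {Address_phone(...)} - multivalued
    {
        "Phone": [                      # {Phone(...)} - multivalued
            {"Area_code": "212", "Ph_num": "555-0100"},
        ],
        "Address": {                    # Address(...) - composite
            "Street_address": {"Number": "12", "Street": "Main St",
                               "Apartment_number": "4B"},
            "City": "New York", "State": "NY", "Zip": "10001",
        },
    },
]
print(address_phone[0]["Address"]["City"])  # New York
```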
Step 5 of 5
9. Key attribute: Each real-world entity is unique in itself. There are certain attributes whose values are different for all similar types of entities. These attributes are called key attributes. They are used to specify the uniqueness constraint on an entity type.
For example: consider an entity type Car. For all cars, the attributes registration number and car number will have different values. These are the key attributes of all entities of the CAR type.
10. Value set (domain): For an attribute of a real-world entity, there is a range of values from which the attribute can take its value. For example: the Age attribute of an employee must have a value, say, from 18 to 70; then all integers in the range 18-70 are the domain of the attribute Age. In most programming languages, basic data types such as integer, string, float, date, etc. are used to specify the domain of a particular attribute.
Chapter 3, Problem 4RQ
Problem
What is an entity type? What is an entity set? Explain the differences among an entity, an entity
type, and an entity set.
Step-by-step solution
Step 1 of 4
Entity type: An entity type defines a collection (or set) of entities that have the same attributes. A database usually contains groups of entities that are similar. These entities have the same attributes but different attribute values. A collection of such entities is an entity type.
For example, a car dealer might like to store details of all the cars in his showroom in a car database. The collection of all car entities will be called an entity type.
Each entity type in a database is represented by its name and its attributes.
Step 2 of 4
For example, CAR can be the name of the entity type, and Reg_num, Car_num, Manufacturer, Model, Cost, and Color can be its attributes.
Entity set: At a particular time the dealer might have a set of eight cars, and at some other time he might have a set of four different cars.
The collection of all entities of a particular entity type in the database at any point in time is called an entity set. It is referred to by the same name as the entity type.
Step 3 of 4
Name: CAR
Entities:
e1 = (reg_1, DL_1, ford, 1870, 2000000, white)
e2 = (reg_2, DL_3, ford, 1830, 1000000, white)
e3 = (reg_3, DL_3, ford, 1877, 2100000, red)
e4 = (reg_4, DL_4, ford, 1970, 2500000, white)
Step 4 of 4
An entity is a real-world object or thing that has an independent physical or conceptual existence. Often there are many entities of a similar type about which information needs to be stored in the database. The name and the attributes of these entities jointly describe an entity type; in other words, an entity type is a collection of entities that have similar attributes. At two instants of time, the entities in the miniworld about which information is stored in the database can be different. The collection of entities of an entity type at an instant of time is called an entity set.
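The distinction can be sketched in a few lines, with a class playing the role of the entity type and a list of instances playing the role of the entity set at one instant; the attribute values are made up:

```python
from dataclasses import dataclass

# Entity type CAR: the definition (name plus attributes).
@dataclass
class Car:
    reg_num: str
    model: int

# Entity set: the CAR entities in the database at this moment.
car_set = [Car("reg_1", 1870), Car("reg_2", 1830)]
print(len(car_set))  # 2
```

At a later instant the list (entity set) may hold different Car objects, while the Car class (entity type) stays the same.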
Chapter 3, Problem 5RQ
Problem
Step-by-step solution
Step 1 of 2
Attribute:
Every entity has certain properties that represent its importance in the real world. These properties of entities are known as attributes.
Example:
Consider a bus. A bus has different things that describe it, such as its model, color, manufacture date, year, country, etc.
Value set:
For an attribute of an entity, there is a range of values from which the attribute can take a value.
Example:
The Age attribute of an employee must have a value. If Age is an attribute in the range 16-60, then all the integers in that range are known as the value set of the attribute Age.
Step 2 of 2
A table groups the data into rows and columns; the columns are known as the attributes of that table. The value set is the group of values that may be allowed for that attribute for each entity.
Chapter 3, Problem 6RQ
Problem
What is a relationship type? Explain the differences among a relationship instance, a relationship
type, and a relationship set.
Step-by-step solution
Step 1 of 3
Relationship type:
This expresses the type of association occurring between entity types and determines the possible relationships among their entities.
Step 2 of 3
Explanation:
STUDENT and COURSE are entity types and ENROLL is the relationship type.
r1, r2, r3, … are the relationship instances between the entities.
A relationship type is the association between the entity types. In the diagram above, ENROLL is the relationship type.
A relationship instance refers to exactly one instance from each participating entity type: S1 is related to C1 through r1. S1 and C1 form one instance, S2 and C2 form one instance, S3 and C1 another, and so on.
A relationship set refers to all instances of a relationship type: {(S1, C1), (S2, C2), (S1, C3), …} form the relationship set.
Step 3 of 3
Relationship instance: It refers to exactly one instance from each participating entity type.
Relationship type: It refers to the association between the entities.
Relationship set: This is a collection of instances of a relationship type.
Chapter 3, Problem 7RQ
Problem
What is a participation role? When is it necessary to use role names in the description of
relationship types?
Step-by-step solution
Step 1 of 3
A participation role is the role that each entity plays when it participates in a relationship.
• It is necessary to use role names in the description of a relationship type when the same entity type participates more than once in the relationship type, in different roles.
Example:
A relationship may exist between various entities (of the same or different entity types).
Each entity type that participates in a relationship type plays a role in the relationship.
Step 2 of 3
The participation role, or role name, signifies the role that a participating entity from the entity type plays in each relationship instance, and helps to explain what the relationship means.
Example:
In the WORKS_FOR relationship type, EMPLOYEE plays the role of worker and DEPARTMENT plays the role of department or employer. In the figure below, an employee works for a department: E1 and E3 work for D1, and E2 works for D2.
Step 3 of 3
Using role names is not necessary in the description of relationship types where all participating entity types are distinct, as in the example above, because in such cases the name of the entity type generally specifies the role played by each entity type.
But when one entity type participates in a relationship type in more than one role, as in recursive relationships, it becomes necessary to use role names in the description of the relationship type.
Example:
Consider the entity type EMPLOYEE. One employee can supervise another employee. In this case the roles cannot be described using the entity type name alone, as this is a relationship of an entity type with itself; using role names becomes important. In the figure below, the SUPERVISION relationship type relates an employee and a supervisor.
Chapter 3, Problem 8RQ
Problem
Describe the two alternatives for specifying structural constraints on relationship types. What are
the advantages and disadvantages of each?
Step-by-step solution
Step 1 of 3
The two alternatives for specifying structural constraints on relationship types are as follows:
• Cardinality ratio
• Participation constraint
Step 2 of 3
Cardinality ratio:
• For a binary relationship, the cardinality ratios can be 1:1, 1:N, N:1, and M:N.
• The cardinality ratio is represented on an ER diagram by 1, M, and N on the left and right sides of the relationship diamond.
Participation constraint:
• The participation constraint specifies the minimum number of relationship instances in which each entity must participate. It is also called the minimum cardinality constraint.
• There are two types of participation constraints: total and partial.
Step 3 of 3
• The cardinality ratio and participation constraint together specify how entities participate in relationship instances, and both are easy to show on an ER diagram.
• However, some constraints on entities and relationships are difficult or costly to express using only these two modeling constructs.
Chapter 3, Problem 9RQ
Problem
Under what conditions can an attribute of a binary relationship type be migrated to become an
attribute of one of the participating entity types?
Step-by-step solution
Step 1 of 2
• The attributes of a relationship type with cardinality 1:1 or 1:N can be migrated to become attributes of a participating entity type.
• In the case of 1:1 cardinality, the attribute can be moved to either of the entity types in the binary relationship.
• In the case of 1:N cardinality, the attribute can be migrated only to the N side of the relationship.
Step 2 of 2
Example
• Each employee works in one department, but there can be several employees in a single department.
• In this scenario, the attribute Start_date of the relationship type WORKS_FOR can be migrated to the EMPLOYEE entity type; it records the date on which the employee started working for that department.
Chapter 3, Problem 10RQ
Problem
When we think of relationships as attributes, what are the value sets of these attributes? What
class of data models is based on this concept?
Step-by-step solution
Step 1 of 3
Solution:
Relationships as attributes:
• Whenever an attribute of one entity type refers to another entity type, a relationship exists.
For example:
• Here, each employee works in one department, and several employees work in a single department. The Start_date attribute tells when the EMPLOYEE started working for that department.
• Date is the domain or value set for the Start_date of an EMPLOYEE in any department. This does not change or depend on whether any other attribute is present.
Step 2 of 3
In the conceptual design phase of the data model, all entity types, relationships, and constraints are specified as follows:
• The DEPARTMENT entity type contains attributes such as Name, Locations, Number, Manager, and Manager_start_date.
• Here, Locations is a multivalued attribute. Name and Number are both key attributes.
• The PROJECT entity type contains attributes such as Name, Number, Location, and Controlling_department.
• The EMPLOYEE entity type contains attributes such as Name, Sex, Ssn, Salary, Department, Address, Birth_date, and Supervisor.
• The DEPENDENT entity type contains attributes such as Employee, Dependent_name, Sex, Relationship, and Birth_date.
Step 3 of 3
Chapter 3, Problem 11RQ
Problem
What is meant by a recursive relationship type? Give some examples of recursive relationship
types.
Step-by-step solution
Step 1 of 2
Recursive relationship:
A relationship between two entities of the same entity type is called a recursive relationship.
Step 2 of 2
Consider an entity type PERSON with an attribute MOTHER, where the mother is a person herself.
Here a recursive relationship exists, because one row in the PERSON table refers to another row in the same PERSON table.
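The PERSON-MOTHER example can be sketched as a self-referencing table; the names and ids are made up:

```python
import sqlite3

# Sketch of the recursive PERSON-MOTHER relationship as a
# self-referencing table.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE person (
    id INTEGER PRIMARY KEY,
    name TEXT,
    mother_id INTEGER REFERENCES person(id))""")  # refers to same table
db.execute("INSERT INTO person VALUES (1, 'Meera', NULL)")
db.execute("INSERT INTO person VALUES (2, 'Anil', 1)")

# Join the table with itself to resolve the recursive relationship.
row = db.execute("""SELECT m.name FROM person p
                    JOIN person m ON p.mother_id = m.id
                    WHERE p.name = 'Anil'""").fetchone()
print(row[0])  # Meera
```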
Chapter 3, Problem 12RQ
Problem
When is the concept of a weak entity used in data modeling? Define the terms owner entity type,
weak entity type, identifying relationship type, and partial key.
Step-by-step solution
Step 1 of 5
The concept of a weak entity is used in the conceptual phase of data modeling for entity types that do not have key attributes of their own.
Example
• The DEPENDENT attributes can be the same for the relatives of two different employees, so there is no unique way of distinguishing between two records; such entity types are called weak entity types.
Step 2 of 5
Owner entity type
The entities belonging to a weak entity type are identified by being associated with specific entities from another entity type, in combination with some of their own attribute values; this other entity type is called the owner entity type.
Step 3 of 5
Weak entity type
Entity types that do not have key attributes of their own are called weak entity types.
Step 4 of 5
Identifying relationship type
A relationship type that relates a weak entity type to its owner entity type is called the identifying relationship type.
Step 5 of 5
Partial key
A partial key is a set of attributes in weak entity types that can uniquely identify weak entities that
are related to the same owner entity.
Chapter 3, Problem 13RQ
Problem
Can an identifying relationship of a weak entity type be of a degree greater than two? Give
examples to illustrate your answer.
Step-by-step solution
Step 1 of 4
Identifying relationship: The relationship between a strong entity and a weak entity is known as an identifying relationship.
Step 2 of 4
The degree of an identifying relationship of a weak entity can be two or greater than two.
Step 3 of 4
Here,
• Student and Company are the two strong entities and Interview is the weak entity.
• In the above ER diagram, the student applies for a job in a company, and the interview is the selection process for the student to get a job in the company.
Step 4 of 4
Therefore, from the above ER diagram, it can be concluded that the degree of an identifying
relationship of a weak entity can be greater than 2.
Chapter 3, Problem 14RQ
Problem
Step-by-step solution
Step 1 of 1
Chapter 3, Problem 15RQ
Problem
Step-by-step solution
Step 1 of 1
• The names of entity types and relationship types should be written in uppercase letters.
Chapter 3, Problem 16E
Problem
Which combinations of attributes have to be unique for each individual SECTION entity in the
UNIVERSITY database shown in Figure 3.20 to enforce each of the following miniworld
constraints:
a. During a particular semester and year, only one section can use a particular classroom at a
particular DaysTime value.
b. During a particular semester and year, an instructor can teach only one section at a particular
DaysTime value.
c. During a particular semester and year, the section numbers for sections offered for the same
course must all be different.
Step-by-step solution
Step 1 of 4
a.
The attribute combinations that must be unique for the above constraint are as follows:
Step 2 of 4
b.
Only one section can be taught by an instructor at a particular DaysTime value, during a particular semester and year.
The attribute combinations that must be unique for the above constraint are as follows:
Sem, Year, DaysTime, Id (of the INSTRUCTOR teaching the SECTION)
Step 3 of 4
c.
The section numbers corresponding to the sections offered for the same course must all be different during a particular semester and year.
The attribute combinations that must be unique for the above constraint are as follows:
Sem, Year, SecNo, CCode (of the COURSE related to the SECTION)
Step 4 of 4
Some of the other similar constraints related to SECTION entity are as follows:
• In a particular semester and year, a student can take only one section at a particular DaysTime
value.
• In a particular semester and year, an instructor of a particular rank cannot teach two sections at
the same DaysTime value.
• Only one section of a particular course can use only one classroom during each particular
semester and year.
Chapter 3, Problem 17E
Problem
Composite and multivalued attributes can be nested to any number of levels. Suppose we want
to design an attribute for a STUDENT entity type to keep track of previous college education.
Such an attribute will have one entry for each college previously attended, and each such entry
will be composed of college name, start and end dates, degree entries (degrees awarded at that
college, if any), and transcript entries (courses completed at that college, if any). Each degree
entry contains the degree name and the month and year the degree was awarded, and each
transcript entry contains a course name, semester, year, and grade. Design an attribute to hold
this information. Use the conventions in Figure 3.5.
Step-by-step solution
Step 1 of 3
Complex attributes are attributes formed by nesting multivalued attributes and composite attributes.
• Curly braces {} are used to group the components of multivalued attributes.
• Parentheses () are used to group the components of composite attributes.
Step 2 of 3
A multivalued attribute PreviousCollege is used to hold the colleges previously attended by the student.
A multivalued attribute Degree is used to hold the details of the degrees awarded to the student.
A multivalued attribute Transcript is used to hold the details of the transcript of the student.
Step 3 of 3
An attribute that holds the details of PreviousCollege, Degree and Transcript of the STUDENT
entity is as follows:
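One plausible shape for such an attribute (a sketch; the textbook's exact figure may differ) can be mirrored as a nested structure, with lists standing for the multivalued parts ({}) and dicts for the composite parts; the sample values are made up:

```python
# Sketch of the PreviousCollege complex attribute as nested data,
# following the problem statement's entry layout.
previous_college = [              # one entry per college attended
    {
        "CollegeName": "State College",       # made-up sample values
        "StartDate": "2018-08", "EndDate": "2022-05",
        "Degree": [                           # degrees awarded, if any
            {"DegreeName": "BSc", "Month": 5, "Year": 2022},
        ],
        "Transcript": [                       # courses completed, if any
            {"CourseName": "Databases", "Semester": "Fall",
             "Year": 2021, "Grade": "A"},
        ],
    },
]
print(previous_college[0]["Degree"][0]["DegreeName"])  # BSc
```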
Chapter 3, Problem 18E
Problem
Show an alternative design for the attribute described in Exercise that uses only entity types
(including weak entity types, if needed) and relationship types.
Exercise
Composite and multivalued attributes can be nested to any number of levels. Suppose we want
to design an attribute for a STUDENT entity type to keep track of previous college education.
Such an attribute will have one entry for each college previously attended, and each such entry
will be composed of college name, start and end dates, degree entries (degrees awarded at that
college, if any), and transcript entries (courses completed at that college, if any). Each degree
entry contains the degree name and the month and year the degree was awarded, and each
transcript entry contains a course name, semester, year, and grade. Design an attribute to hold
this information. Use the conventions in Figure 3.5.
Step-by-step solution
Step 1 of 3
The alternative design for the entity STUDENT with attribute to keep track of previous college
education as discussed in the previous problem is as shown below:
Step 2 of 3
• STUDENT
• COLLEGE
• DEGREE
• TRANSCRIPT
• ATTENDANCE
Step 3 of 3
• There exists a binary 1:N relationship ATTENDED between COLLEGE and ATTENDANCE.
Chapter 3, Problem 19E
Problem
Consider the ER diagram in Figure, which shows a simplified schema for an airline reservations
system. Extract from the ER diagram the requirements and constraints that produced this
schema. Try to be as precise as possible in your requirements and constraints specification.
Step-by-step solution
Step 1 of 2
Refer to the ER diagram of the AIRLINE database schema given in figure 3.21.
The requirements and constraints that produced this schema are as follows:
AIRPORT
• Each AIRPORT has a unique Airport_code, together with its Name and the City and State where it is
located.
FLIGHT
• Each FLIGHT records the airline that operates it and the days on which it is scheduled.
• One or more FAREs are kept for each flight, and each FARE carries a set of restrictions.
FLIGHT_LEG
• Each FLIGHT_LEG has the details of its scheduled arrival time and departure time, together with its
departure and arrival AIRPORTs.
Step 2 of 2
LEG_INSTANCE
• Each FLIGHT_LEG is flown on particular dates; each such flight of a leg is recorded as a
LEG_INSTANCE, with its actual arrival and departure times and airports.
• The AIRPLANE used and the number of available seats are kept in the LEG_INSTANCE.
RESERVATION
• A RESERVATION on a LEG_INSTANCE records the customer's Name, Phone, and Seat_number(s).
• All the information about the AIRPLANEs and AIRPLANE_TYPEs is included.
• Each AIRPLANE_TYPE has a maximum number of seats and a manufacturing company name.
• CAN_LAND relates each AIRPLANE_TYPE to the AIRPORTs where airplanes of that type can land.
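As a sketch of how one of these constraints could be captured, the CAN_LAND requirement translates to an M:N relationship table; the following uses Python's built-in sqlite3, and all table and column names are illustrative assumptions:

```python
import sqlite3

# CAN_LAND as an M:N relationship table between AIRPLANE_TYPE and AIRPORT.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE AIRPORT (Airport_code TEXT PRIMARY KEY, Name TEXT, City TEXT, State TEXT);
CREATE TABLE AIRPLANE_TYPE (Type_name TEXT PRIMARY KEY, Max_seats INTEGER, Company TEXT);
CREATE TABLE CAN_LAND (
    Type_name    TEXT REFERENCES AIRPLANE_TYPE(Type_name),
    Airport_code TEXT REFERENCES AIRPORT(Airport_code),
    PRIMARY KEY (Type_name, Airport_code));
""")
con.execute("INSERT INTO AIRPORT VALUES ('IAH','Intercontinental','Houston','TX')")
con.execute("INSERT INTO AIRPLANE_TYPE VALUES ('B737', 150, 'Boeing')")
con.execute("INSERT INTO CAN_LAND VALUES ('B737', 'IAH')")
# Each (type, airport) pair appears at most once, as the primary key enforces.
n = con.execute("SELECT COUNT(*) FROM CAN_LAND").fetchone()[0]
print(n)  # -> 1
```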
Chapter 3, Problem 20E
Problem
In Chapters 1 and 2, we discussed the database environment and database users. We can
consider many entity types to describe such an environment, such as DBMS, stored database,
DBA, and catalog/data dictionary. Try to specify all the entity types that can fully describe a
database system and its environment; then specify the relationship types among them, and draw
an ER diagram to describe such a general database environment.
Step-by-step solution
Step 1 of 1
Entity types that can fully describe a database environment and users are:
3. TOOLS (Tool_id, Tool_type, Next_tool): Tool_id uniquely identifies the tool; Tool_type
tells whether the tool is a compiler, an optimizer, or a storage tool; Next_tool gives the Tool_id of the
next tool that this tool uses to complete the transaction.
E-R diagram:
Chapter 3, Problem 21E
Problem
Design an ER schema for keeping track of information about votes taken in the U.S. House of
Representatives during the current two-year congressional session. The database needs to keep
track of each U.S. STATE’s Name (e.g., ‘Texas’, ‘New York’, ‘California’) and include the Region
of the state (whose domain is {‘Northeast’, ‘Midwest’, ‘Southeast’, ‘Southwest’, ‘West’}). Each
CONGRESS_PERSON in the House of Representatives is described by his or her Name, plus
the District represented, the Start_date when the congressperson was first elected, and the
political Party to which he or she belongs (whose domain is {‘Republican’, ‘Democrat’,
‘Independent’, ‘Other’}). The database keeps track of each BILL (i.e., proposed law), including
the Bill_name, the Date_of_vote on the bill, whether the bill Passed_or_failed (whose domain is
{‘Yes’, ‘No’}), and the Sponsor (the congressperson(s) who sponsored—that is, proposed—the
bill). The database also keeps track of how each congressperson voted on each bill (domain of
Vote attribute is {‘Yes’, ‘No’, ‘Abstain’, ‘Absent’}). Draw an ER schema diagram for this
application. State clearly any assumptions you make.
Step-by-step solution
Step 1 of 2
Step 2 of 2
ASSUMPTIONS:
1. Each CONGRESS_PERSON represents exactly one district, and each district is represented by one
CONGRESS_PERSON.
2. CONGRESS_PERSONs are elected from various regions and are related to
US_STATE_REGION by the relationship REPRESENTATIVE.
3. BILL: each bill is related to the CONGRESS_PERSON(s) who sponsor it and is voted on by all
CONGRESS_PERSONs.
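A hedged relational sketch of this schema, with the attribute domains from the problem statement enforced as CHECK constraints (the table names and the SPONSORS/VOTES relationship tables are illustrative assumptions):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE US_STATE (Name TEXT PRIMARY KEY,
    Region TEXT CHECK (Region IN ('Northeast','Midwest','Southeast','Southwest','West')));
CREATE TABLE CONGRESS_PERSON (Name TEXT PRIMARY KEY, District TEXT, Start_date TEXT,
    State TEXT REFERENCES US_STATE(Name),
    Party TEXT CHECK (Party IN ('Republican','Democrat','Independent','Other')));
CREATE TABLE BILL (Bill_name TEXT PRIMARY KEY, Date_of_vote TEXT,
    Passed_or_failed TEXT CHECK (Passed_or_failed IN ('Yes','No')));
-- SPONSORS: a bill may have several sponsoring congresspersons.
CREATE TABLE SPONSORS (Bill_name TEXT REFERENCES BILL(Bill_name),
    Sponsor TEXT REFERENCES CONGRESS_PERSON(Name),
    PRIMARY KEY (Bill_name, Sponsor));
-- VOTES: how each congressperson voted on each bill.
CREATE TABLE VOTES (Bill_name TEXT REFERENCES BILL(Bill_name),
    Name TEXT REFERENCES CONGRESS_PERSON(Name),
    Vote TEXT CHECK (Vote IN ('Yes','No','Abstain','Absent')),
    PRIMARY KEY (Bill_name, Name));
""")
con.execute("INSERT INTO US_STATE VALUES ('Texas','Southwest')")
try:
    # 'North' is outside the stated Region domain, so the CHECK rejects it.
    con.execute("INSERT INTO US_STATE VALUES ('Nowhere','North')")
    ok = True
except sqlite3.IntegrityError:
    ok = False
print(ok)
```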
Chapter 3, Problem 22E
Problem
A database is being constructed to keep track of the teams and games of a sports league. A
team has a number of players, not all of whom participate in each game. It is desired to keep
track of the players participating in each game for each team, the positions they played in that
game, and the result of the game. Design an ER schema diagram for this application, stating any
assumptions you make. Choose your favorite sport (e.g., soccer, baseball, football).
Step-by-step solution
Step 1 of 2
Consider a soccer league in which various teams participate to win the title. The following is the
ER diagram for the database of a sports league.
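The idea that only some of a team's players participate in a given game can be sketched with plain Python objects (class and attribute names here are assumptions for illustration, not the diagram's exact design):

```python
from dataclasses import dataclass, field

# Illustrative object sketch of the soccer-league schema.
@dataclass
class Team:
    name: str
    players: list = field(default_factory=list)

@dataclass
class Game:
    home: Team
    away: Team
    result: str  # e.g. '2-1'
    # (player, team, position) triples: only participating players appear.
    participation: list = field(default_factory=list)

a, b = Team("Lions"), Team("Tigers")
a.players += ["Ann", "Bo", "Cy"]
b.players += ["Di", "Ed"]
g = Game(a, b, "2-1", [("Ann", a, "goalkeeper"), ("Di", b, "striker")])
# Not every team player participates in every game:
played_for_a = [p for (p, t, _) in g.participation if t is a]
print(played_for_a)
```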
Step 2 of 2
Assumptions:
Chapter 3, Problem 23E
Problem
Consider the ER diagram shown in Figure for part of a BANK database. Each bank can have
multiple branches, and each branch can have multiple accounts and loans.
a. List the strong (nonweak) entity types in the ER diagram.
b. Is there a weak entity type? If so, give its name, partial key, and identifying relationship.
c. What constraints do the partial key and the identifying relationship of the weak entity type
specify in this diagram?
d. List the names of all relationship types, and specify the (min, max) constraint on each
participation of an entity type in a relationship type. Justify your choices.
e. List concisely the user requirements that led to this ER schema design.
f. Suppose that every customer must have at least one account but is restricted to at most two
loans at a time, and that a bank branch cannot have more than 1,000 loans. How does this show
up on the (min, max) constraints?
Step-by-step solution
Step 1 of 6
(a)
The strong (nonweak) entity types in the diagram are:
• LOAN
• CUSTOMER
• ACCOUNT
• BANK
Step 2 of 6
(b)
Yes, there is a weak entity type: BANK_BRANCH. Its partial key is Branch_no, and its
identifying relationship is BRANCHES.
Step 3 of 6
(c)
• The partial key Branch_no specifies that branch numbers must be unique among the branches of one
bank.
• The identifying relationship BRANCHES specifies that a bank can have any number of branches, but a
branch belongs to only one bank.
Step 4 of 6
(d)
• BRANCHES: BANK (min, max) = (1, 1) and BANK_BRANCH (min, max) = (1, N). A bank can
have any number of branches, but a branch is owned by a single bank.
• ACCTS: ACCOUNT (min, max) = (1, N) and BANK_BRANCH (min, max) = (1, 1). An account
is held at one branch, but a branch can have many accounts.
• LOANS: LOAN (min, max) = (1, N) and BANK_BRANCH (min, max) = (1, 1). A branch can issue
any number of loans, but a loan is issued by only one branch.
• A_C: ACCOUNT (min, max) = (1, N) and CUSTOMER (min, max) = (1, 1). A customer can have
any number of accounts, but an account is owned by only one customer.
• L_C: CUSTOMER (min, max) = (1, 1) and LOAN (min, max) = (1, N). A customer can take any
number of loans, but a loan is given to only one customer.
Step 5 of 6
(e)
• A bank can have any number of BANK_BRANCHes. Each BANK_BRANCH has a branch number that
is unique among the branches of that bank.
• Each ACCOUNT and each LOAN is identified by a number, has a balance (or an amount, for loans),
and is of a particular type.
• Each customer is identified by an Ssn; the name, address, and phone of the customer are stored.
Step 6 of 6
(f)
Using the same notation as in part (d), the new restrictions change the constraints as follows:
• A_C: ACCOUNT (min, max) = (1, N), since every customer must have at least one account.
• L_C: LOAN (min, max) = (0, 2), since a customer is restricted to at most two loans at a time.
• LOANS: LOAN (min, max) = (0, 1000), since a bank branch cannot have more than 1,000 loans.
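These restrictions can also be checked procedurally; the following is a hypothetical validator over simple in-memory dictionaries (the function and its data structures are assumptions for illustration, not part of the textbook answer):

```python
# Hypothetical check of the part (f) restrictions on plain dictionaries.
def check_bank_constraints(accounts_by_customer, loans_by_customer, loans_by_branch):
    """Return a list of violated constraints (empty list = all satisfied)."""
    violations = []
    for cust, accts in accounts_by_customer.items():
        if len(accts) < 1:                     # every customer: >= 1 account
            violations.append(f"{cust}: no account")
    for cust, loans in loans_by_customer.items():
        if len(loans) > 2:                     # every customer: <= 2 loans
            violations.append(f"{cust}: more than two loans")
    for branch, loans in loans_by_branch.items():
        if len(loans) > 1000:                  # every branch: <= 1000 loans
            violations.append(f"{branch}: more than 1000 loans")
    return violations

v = check_bank_constraints(
    accounts_by_customer={"C1": ["A1"], "C2": []},
    loans_by_customer={"C1": ["L1", "L2", "L3"]},
    loans_by_branch={"B1": ["L1", "L2", "L3"]})
print(v)
```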
Chapter 3, Problem 24E
Problem
Consider the ER diagram in Figure. Assume that an employee may work in up to two
departments or may not be assigned to any department. Assume that each department must
have one and may have up to three phone numbers. Supply (min, max) constraints on this
diagram. State clearly any additional assumptions you make. Under what conditions would the
relationship HAS_PHONE be redundant in this example?
Step-by-step solution
Step 1 of 2
Consider the ER diagram for the COMPANY database. The employee may work in up to two
departments or may not be a part of any department. The (min, max) constraint in this case is (0,
2). Each department must have one phone number and may have up to three phone numbers.
The (min, max) constraint in this case is (1, 3).
The following are the other assumptions made for the COMPANY database:
• Each department must have one employee and may have up to twenty employees. The (min,
max) constraint in this case is (1, 20).
• Each phone is used by only one department. The (min, max) constraint in this case is (1, 1).
• Each phone is assigned to at least one employee and may be assigned to up to five employees. The
(min, max) constraint in this case is (1, 5).
• Each employee must have one phone and may have up to 3 phones. The (min, max) constraint
in this case is (1, 3).
Step 2 of 2
The following is the ER diagram after supplying the (min, max) constraints for the COMPANY
database:
• HAS_PHONE would be redundant if every EMPLOYEE were assigned all the PHONEs of his or her
DEPARTMENT and none from any other department, since the assignment could then be derived from
the employee's department and that department's phones.
Chapter 3, Problem 25E
Problem
Consider the ER diagram in Figure. Assume that a course may or may not use a textbook, but
that a text by definition is a book that is used in some course. A course may not use more than
five books. Instructors teach from two to four courses. Supply (min, max) constraints on this
diagram. State clearly any additional assumptions you make. If we add the relationship ADOPTS,
to indicate the textbook(s) that an instructor uses for a course, should it be a binary relationship
between INSTRUCTOR and TEXT, or a ternary relationship among all three entity types? What
(min, max) constraints would you put on the relationship? Why?
Step-by-step solution
Step 1 of 1
TEACHES: INSTRUCTOR (min, max) = (1,1) and COURSE (min, max) = (2,4). Assumption: One
course is taught by a single teacher.
USES: TEXT (min, max) = (0, 5) and COURSE (min, max) = (1, 1).
If the relationship ADOPTS is added between INSTRUCTOR and TEXT only, the (min, max)
constraints would be:
INSTRUCTOR (min, max) = (1, 1) and TEXT (min, max) = (0, 20).
Since each instructor can teach two to four courses and can use up to five texts for each course or none,
the min and max constraints will be as above. However, because the adopted text depends on both the
instructor and the course, ADOPTS is better modeled as a ternary relationship among INSTRUCTOR,
COURSE, and TEXT; the binary form would lose the information about which course a text is adopted
for.
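The advantage of the ternary form can be sketched in a few lines using Python's built-in sqlite3 (the table and the sample data are illustrative assumptions):

```python
import sqlite3

# A ternary ADOPTS relationship keeps the course context that a binary
# INSTRUCTOR-TEXT relationship would lose (all names are illustrative).
con = sqlite3.connect(":memory:")
con.execute("""
CREATE TABLE ADOPTS (
    Instructor TEXT, Course TEXT, Textbook TEXT,
    PRIMARY KEY (Instructor, Course, Textbook))
""")
con.executemany("INSERT INTO ADOPTS VALUES (?, ?, ?)", [
    ("King", "DB1", "Elmasri"),
    ("King", "DB2", "Date"),   # same instructor, different course and text
])
# Which text does King use for DB2? Recoverable only with the ternary form.
text = con.execute(
    "SELECT Textbook FROM ADOPTS WHERE Instructor = 'King' AND Course = 'DB2'"
).fetchone()[0]
print(text)
```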
Chapter 3, Problem 26E
Problem
Consider an entity type SECTION in a UNIVERSITY database, which describes the section
offerings of courses. The attributes of SECTION are Section_number, Semester, Year,
Course_number, Instructor, Room_no (where section is taught), Building (where section is
taught), Weekdays (domain is the possible combinations of weekdays in which a section can be
offered {‘MWF’, ‘MW’, ‘TT’, and so on}), and Hours (domain is all possible time periods during
which sections are offered {‘9–9:50 a.m.’, ‘10–10:50 a.m.’, …, ‘3:30–4:50 p.m.’, ‘5:30–6:20 p.m.’,
and so on}). Assume that Section_number is unique for each course within a particular
semester/year combination (that is, if a course is offered multiple times during a particular
semester, its section offerings are numbered 1, 2, 3, and so on). There are several composite
keys for section, and some attributes are components of more than one key. Identify three
composite keys, and show how they can be represented in an ER schema diagram.
Step-by-step solution
Step 1 of 4
The attributes of the SECTION entity type are:
• Section_number
• Semester
• Year
• Course_number
• Instructor
• Room_no
• Building
• Weekdays
• Hours
Step 2 of 4
As a unique room is allocated for specific days and hours in a particular semester of a year,
{Semester, Year, Room_no, Weekdays, Hours} can be considered a composite key for the
SECTION entity.
As a unique instructor is allocated to teach on specific days and hours in a particular
semester of a year, {Semester, Year, Instructor, Weekdays, Hours} can be considered a
composite key for the SECTION entity.
As Section_number is unique for each course within a particular semester/year combination,
{Semester, Year, Course_number, Section_number} can be considered a third composite key for the
SECTION entity.
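The three composite keys can equally be declared as UNIQUE constraints in a relational sketch of SECTION; the following uses Python's built-in sqlite3, and the sample rows are assumptions for illustration:

```python
import sqlite3

# The three candidate composite keys declared as UNIQUE constraints.
con = sqlite3.connect(":memory:")
con.execute("""
CREATE TABLE SECTION (
    Section_number INTEGER, Semester TEXT, Year INTEGER,
    Course_number TEXT, Instructor TEXT, Room_no TEXT,
    Building TEXT, Weekdays TEXT, Hours TEXT,
    UNIQUE (Semester, Year, Course_number, Section_number),
    UNIQUE (Semester, Year, Room_no, Weekdays, Hours),
    UNIQUE (Semester, Year, Instructor, Weekdays, Hours))
""")
con.execute("INSERT INTO SECTION VALUES "
            "(1,'Fall',2024,'CS101','Kim','101','ENG','MWF','9-9:50 a.m.')")
try:
    # Same room, days, and hours in the same semester/year -> rejected.
    con.execute("INSERT INTO SECTION VALUES "
                "(1,'Fall',2024,'CS102','Lee','101','ENG','MWF','9-9:50 a.m.')")
    clash = False
except sqlite3.IntegrityError:
    clash = True
print(clash)  # -> True
```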
Step 3 of 4
Step 4 of 4
Chapter 3, Problem 27E
Problem
Cardinality ratios often dictate the detailed design of a database. The cardinality ratio depends on
the real-world meaning of the entity types involved and is defined by the specific application. For
the following binary relationships, suggest cardinality ratios based on the common-sense
meaning of the entity types. Clearly state any assumptions you make.
Step-by-step solution
Step 1 of 3
1. Each student will have a unique social security number. So there exists a 1:1 cardinality ratio
between STUDENT and SOCIAL_SECURITY_NUMBER entities.
2. A student can be taught by many teachers and a teacher can teach many students. So there
exists a M: N cardinality ratio between STUDENT and TEACHER entities.
3. A classroom has four walls, and a wall can be common to two classrooms. So there
exists a 2:4 (M:N) cardinality ratio between CLASSROOM and WALL entities.
4. Each country has only one president, and a person can be president of only one
country. So there exists a 1:1 cardinality ratio between COUNTRY and PRESIDENT entities.
5. A course can have any number of textbooks but a textbook can belong to only one course. So
there exists a 1:N cardinality ratio between COURSE and TEXTBOOK entities.
Step 2 of 3
6. An order can consist of many items and an item can belong to more than one order. So there
exists a M: N cardinality ratio between ORDER and ITEM entities.
7. A student can belong to one class, but a class can consist of many students. So there exists a
N:1 cardinality ratio between STUDENT and CLASS entities.
8. A class can have many instructors and an instructor can belong to more than one class. So
there exists a M: N cardinality ratio between CLASS and INSTRUCTOR entities.
9. An instructor can belong to one office, but an office can have more than one instructor. So
there exists a N:1 cardinality ratio between INSTRUCTOR and OFFICE entities.
10. An eBay auction item can have any number of bids. So there exists a 1:N cardinality ratio
between EBAY_AUCTION_ITEM and EBAY-BID entities.
Step 3 of 3
Chapter 3, Problem 28E
Problem
Assume that MOVIES is a populated database. ACTOR is used as a generic term and includes
actresses. Given the constraints shown in the ER schema, respond to the following statements
with True, False, or Maybe. Assign a response of Maybe to statements that, although not
explicitly shown to be True, cannot be proven False based on the schema as shown. Justify each
answer.
b. There are some actors who have acted in more than ten movies.
l. There are some actors who have done a lead role, directed a movie, and produced a movie.
Step-by-step solution
Step 1 of 13
a.
There exists a many to many (M: N) relationship named PERFORMS_IN between ACTOR and
MOVIE. ACTOR and MOVIE have full participation in relationship PERFORMS_IN.
Step 2 of 13
b.
Maybe. There exists a many-to-many (M:N) relationship named PERFORMS_IN between ACTOR and
MOVIE. The maximum cardinality M or N places no upper bound on the number of movies, so some
actors may have acted in more than ten movies, although this cannot be proven from the schema.
Step 3 of 13
c.
There exists a 2 to N relationship named LEAD_ROLE between ACTOR and MOVIE. The
maximum cardinality for an actor to act in a movie as a lead role is N. N can be 2 or more.
Step 4 of 13
d.
There exists a 2 to N relationship named LEAD_ROLE between ACTOR and MOVIE. The
maximum cardinality 2 indicates that an actor can act as a lead role in only two movies.
Step 5 of 13
e.
There exists a one-to-one (1:1) relationship named ALSO_A_DIRECTOR between ACTOR and
DIRECTOR. DIRECTOR does not have total participation in the relationship named
ALSO_A_DIRECTOR. So, there may be an actor who is also a director, but not every director need
be an actor.
Step 6 of 13
f.
There exists a one to one (1: 1) relationship named ACTOR_PRODUCER between ACTOR and
PRODUCER. Producer does not have total participation in the relationship named
ACTOR_PRODUCER. So, there may be an actor who is also a producer.
Step 7 of 13
g.
Step 8 of 13
h.
There exists a many to many (M: N) relationship named PERFORMS_IN between ACTOR and
MOVIE. The maximum cardinality M indicates that there is no maximum number. A movie can
have more than 12 actors performing in it.
Step 9 of 13
i.
There exists a one to one (1: 1) relationship named ALSO_A_DIRECTOR between ACTOR and
DIRECTOR.
There exists a one to one (1: 1) relationship named ACTOR_PRODUCER between ACTOR and
PRODUCER.
Step 10 of 13
j.
There exists a one to many relationship named DIRECTS between DIRECTOR and MOVIE. A
director can direct N movies.
There exists a many to many relationship named PRODUCES between PRODUCER and
MOVIE. A producer can produce any number of movies.
So, there may be one director and one producer for a movie.
Step 11 of 13
k.
There exists a one to many relationship named DIRECTS between DIRECTOR and MOVIE. A
director can direct N movies.
There exists a many to many relationship named PRODUCES between PRODUCER and
MOVIE. A producer can produce any number of movies.
So, there can be one director and several producers for movies.
Step 12 of 13
l.
There exists a one to one (1: 1) relationship named ALSO_A_DIRECTOR between ACTOR and
DIRECTOR.
There exists a one to one (1: 1) relationship named ACTOR_PRODUCER between ACTOR and
PRODUCER.
So, there may be an actor who has produced a movie, directed a movie, and performed a lead role in a
movie; the statement is Maybe.
Step 13 of 13
m.
There may be a movie whose director also performed in it.
Chapter 3, Problem 29E
Problem
Given the ER schema for the MOVIES database in Figure, draw an instance diagram using three
movies that have been released recently. Draw instances of each entity type: MOVIES,
ACTORS, PRODUCERS, DIRECTORS involved; make up instances of the relationships as they
exist in reality for those movies.
Step-by-step solution
Step 1 of 2
Step 2 of 2
Amir Khan produced a movie he acted in and also directed the movie.
Chapter 3, Problem 30E
Problem
Illustrate the UML diagram for Exercise. Your UML design should observe the following
requirements:
a. A student should have the ability to compute his/her GPA and add or drop majors and minors.
b. Each department should be able to add or delete courses and hire or terminate faculty.
c. Each instructor should be able to assign or change a student’s grade for a course.
Reference Problem 16
Which combinations of attributes have to be unique for each individual SECTION entity in the
UNIVERSITY database shown in Figure 3.20 to enforce each of the following miniworld
constraints:
a. During a particular semester and year, only one section can use a particular classroom at a
particular DaysTime value.
b. During a particular semester and year, an instructor can teach only one section at a particular
DaysTime value.
c. During a particular semester and year, the section numbers for sections offered for the same
course must all be different.
Step-by-step solution
Step 1 of 5
The UML diagram consists of a class, such that the class is equivalent to the entity in ER
diagram. The class consists of following three sections:
• Class name: It is the top section of the UML class diagram. Class name is similar to the entity
type name in ER diagram.
• Attributes: It is the middle section of the UML class diagram. Attributes are the same as the
attributes of an entity in the ER diagram.
• Operations: It is the last section of the UML class diagram. It indicates the operations that can
be performed on individual objects, where each object is similar to the entities in ER diagram.
Step 2 of 5
a.
The operation that indicates the ability of the student to calculate his/her GPA and also to add or
drop the majors and minors is specified in the last section of the UML class diagram. The
operations are as follows:
• compute_gpa
• add_major
• drop_major
• add_minor
• drop_minor
Step 3 of 5
b.
The operation that indicates the ability of each department to add or delete a course and also to
hire or terminate a faculty is specified in the last section of the UML class diagram. The
operations are as follows:
• add_course
• delete_course
• hire_faculty
• terminate_faculty
Step 4 of 5
c.
The operation that indicates the ability of each instructor to assign or change the grade of a
student for a particular course is specified in the last section of the UML class diagram. The
operations are as follows:
• assign_grade
• change_grade
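The operations above can be sketched as Python classes; the method bodies below are simplified assumptions for illustration, not the textbook's UML design:

```python
# Illustrative Python sketch of the UML classes and their operations.
class Student:
    def __init__(self, name):
        self.name, self.grades, self.majors, self.minors = name, [], [], []
    def compute_gpa(self):
        return sum(self.grades) / len(self.grades) if self.grades else 0.0
    def add_major(self, m):    self.majors.append(m)
    def drop_major(self, m):   self.majors.remove(m)
    def add_minor(self, m):    self.minors.append(m)
    def drop_minor(self, m):   self.minors.remove(m)

class Department:
    def __init__(self):
        self.courses, self.faculty = [], []
    def add_course(self, c):        self.courses.append(c)
    def delete_course(self, c):     self.courses.remove(c)
    def hire_faculty(self, f):      self.faculty.append(f)
    def terminate_faculty(self, f): self.faculty.remove(f)

class Instructor:
    def assign_grade(self, student, grade_points):
        student.grades.append(grade_points)
    def change_grade(self, student, old, new):
        student.grades[student.grades.index(old)] = new

s = Student("Ana")
Instructor().assign_grade(s, 4.0)
Instructor().assign_grade(s, 3.0)
print(s.compute_gpa())  # -> 3.5
```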
Step 5 of 5
Chapter 3, Problem 31LE
Problem
Consider the UNIVERSITY database described in Exercise 16. Build the ER schema for this
database using a data modeling tool such as ERwin or Rational Rose.
Reference Exercise 16
Which combinations of attributes have to be unique for each individual SECTION entity in the
UNIVERSITY database shown in Figure 3.20 to enforce each of the following miniworld
constraints:
a. During a particular semester and year, only one section can use a particular classroom at a
particular DaysTime value.
b. During a particular semester and year, an instructor can teach only one section at a particular
DaysTime value.
c. During a particular semester and year, the section numbers for sections offered for the same
course must all be different.
Step-by-step solution
Step 1 of 1
Refer to exercise 3.16 for the UNIVERSITY database. Use the Rational Rose tool to create the
ER schema for the database as follows:
• In the options available on left, right click on the option Logical view, go to New and select the
option Class Diagram.
• Name the class diagram as UNIVERSITY. Select the option Class available in the toolbar and
then click on empty space of the Class Diagram file. Name the class as COLLEGE.
Right click on the class, select the option New Attribute, and name the attribute as CName.
Similarly, create the other attributes COffice and CPhone.
• Now right click on the attribute CName, available on the left under the class COLLEGE, and
select the option Open Specification. Select the Protected option under Export Control. This
will make CName the primary key.
• Similarly create another class INSTRUCTOR; its attributes Id, Rank, IName, IOffice and
IPhone; and Id as the primary key.
• Select the option Unidirectional Association from the toolbar, for creating relationships
between the two classes. Now click on the class COLLEGE; while holding the click drag the
mouse towards the class INSTRUCTOR and release the click. This will create the relationship
between the two selected classes.
Name the association as DEAN. Since the structural constraint in the ER diagram is specified
using (min, max) notation, specify the structural constraints using the Rational Rose tool as
follows:
• Right click on the association close to the class COLLEGE and select 1 from the option
Multiplicity.
• Again, right click on the association close to the class INSTRUCTOR and select Zero or One
from the option Multiplicity.
• Similarly, create other classes and their associated attributes. Specify the relationships and
structural constraints between the classes, as mentioned above.
The ER schema may be specified using an alternative diagrammatic notation, the class diagram,
through the use of the Rational Rose tool as follows:
Chapter 3, Problem 32LE
Problem
Consider a MAIL_ORDER database in which employees take orders for parts from customers.
The data requirements are summarized as follows:
■ The mail order company has employees, each identified by a unique employee number, first
and last name, and Zip Code.
■ Each customer of the company is identified by a unique customer number, first and last name,
and Zip Code.
■ Each part sold by the company is identified by a unique part number, a part name, price, and
quantity in stock.
■ Each order placed by a customer is taken by an employee and is given a unique order number.
Each order contains specified quantities of one or more parts. Each order has a date of receipt
as well as an expected ship date. The actual ship date is also recorded.
Design an entity-relationship diagram for the mail order database and build the design using a
data modeling tool such as ERwin or Rational Rose.
Step-by-step solution
Chapter 3, Problem 35LE
Problem
Consider the ER diagram for the AIRLINE database shown in Figure. Build this design using a
data modeling tool such as ERwin or Rational Rose.
Step-by-step solution
Step 1 of 1
Refer to figure 3.21 for the ER schema of the AIRLINE database. Use the Rational Rose tool to
create the ER schema for the database as follows:
• In the options available on left, right click on the option Logical view, go to New and select the
option Class Diagram.
• Name the class diagram as AIRLINE. Select the option Class available in the toolbar and then
click on empty space of the Class Diagram file. Name the class as AIRPORT.
Right click on the class, select the option New Attribute, and name the attribute as Airport_code.
Similarly, create the other attributes City, State and Name.
• Now right click on the attribute Airport_code, available on the left under the class AIRPORT,
and select the option Open Specification. Select the Protected option under Export Control.
This will make Airport_code as primary key.
• Select the option Unidirectional Association from the toolbar, for creating relationships
between the two classes. Now click on the class AIRPORT; while holding the click drag the
mouse towards the class FLIGHT_LEG and release the click. This will create the relationship
between the two selected classes.
• Right click on the association close to the class AIRPORT and select 1 from the option
Multiplicity.
• Again, right click on the association close to the class FLIGHT_LEG and select n from the
option Multiplicity.
• Similarly, create other classes and their associated attributes. Specify the relationships and
structural constraints between the classes, as mentioned above.
The ER schema may be specified using an alternative diagrammatic notation, the class diagram,
through the use of the Rational Rose tool as follows:
Chapter 4, Problem 1RQ
Problem
Step-by-step solution
Step 1 of 3
Subclass:
A subclass is also called a derived class. This class extends another class (the parent
class), so that it inherits the protected and public members of the parent class.
A subclass entity is the same entity as in the superclass, but in a distinct, specific role.
Step 2 of 3
An entity is an object (thing) with independent physical (car, home, person) or conceptual
(company, university course) existence in the real world.
Each real-world entity (thing) has certain properties that represent its significance in real world or
describes it. These properties of an entity are known as attribute. An entity type defines a
collection (or set) of entities that have the same attributes.
A database usually contains a group of entities that are similar. These entities have same
attributes but different attribute values. A collection of these entities is an entity type.
Within an entity type there may exist smaller groupings of entities, based on one or another
attribute or relationship. Such attributes or relationships may not apply to all entities of the entity
type but are of significant value for that group. Each such group can be represented as a separate
class or entity type; these form subclasses of the larger entity type.
Example:
Consider an entity type VEHICLE. All vehicles have a manufacturer, a number_plate, a
registration_number, a colour, and so on, but there are certain properties that apply only to carrier
vehicles, such as load_capacity and size (the width and height of the product a vehicle can take), and
certain attributes, such as sitting_capacity and ac/non-ac, that apply to passenger vehicles only. So we
can define subclasses of the entity type VEHICLE: PASSENGER_VEHICLE and GOODS_VEHICLE.
PASSENGER_VEHICLE and GOODS_VEHICLE are subclasses of the VEHICLE superclass.
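The VEHICLE example maps naturally onto class inheritance; a minimal Python sketch follows (the attribute names come from the text, while the classes themselves are an assumed illustration):

```python
# The VEHICLE superclass/subclass example as Python inheritance.
class Vehicle:                      # superclass: attributes shared by all vehicles
    def __init__(self, manufacturer, number_plate, colour):
        self.manufacturer = manufacturer
        self.number_plate = number_plate
        self.colour = colour

class PassengerVehicle(Vehicle):    # subclass adds local (specific) attributes
    def __init__(self, manufacturer, number_plate, colour, sitting_capacity):
        super().__init__(manufacturer, number_plate, colour)
        self.sitting_capacity = sitting_capacity

class GoodsVehicle(Vehicle):
    def __init__(self, manufacturer, number_plate, colour, load_capacity):
        super().__init__(manufacturer, number_plate, colour)
        self.load_capacity = load_capacity

bus = PassengerVehicle("Volvo", "KA-01", "white", sitting_capacity=40)
# The subclass inherits every superclass attribute and adds its own:
print(bus.manufacturer, bus.sitting_capacity)
```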
Step 3 of 3
A subclass is needed in data modeling to define an inheritance relationship between two classes.
The concept of a subclass is used in data modeling to represent data more meaningfully and to
represent clearly those attributes and relationships that belong to a group of entities in the superclass
and not to all of its entities.
Chapter 4, Problem 2RQ
Problem
Step-by-step solution
Step 1 of 9
1. Superclass of a subclass: Within an entity type there may exist smaller groupings of entities,
based on one or another attribute or relationship. Such attributes or relationships may not apply to all
entities of the entity type but are of significant value for that particular group. Each such group can be
represented as a separate class or entity type; these form subclasses of the larger entity type.
The larger entity type is known as the superclass.
For example: Consider an entity type VEHICLE. All vehicles have a manufacturer, a number_plate, a
registration_number, a colour, and so on, but certain properties, such as load_capacity and size, apply
only to carrier vehicles, and certain attributes, such as sitting_capacity and ac/non-ac, apply to
passenger vehicles only. So we can define the subclasses PASSENGER_VEHICLE and
GOODS_VEHICLE of the VEHICLE superclass.
Step 2 of 9
Step 3 of 9
For example: As with the VEHICLE entity type above, the common attributes (manufacturer,
number_plate, registration_number, colour) belong to the superclass VEHICLE, while load_capacity
and size apply only to the GOODS_VEHICLE subclass and sitting_capacity and ac/non-ac only to the
PASSENGER_VEHICLE subclass; PASSENGER_VEHICLE and GOODS_VEHICLE are subclasses
of the VEHICLE superclass.
4.
Step 4 of 9
For example: On the basis of whether a vehicle is commercial or not, we can have another
specialization {COMMERCIAL, PRIVATE}.
c. Establish additional specific relationship types between each subclass and other entity types
or other subclasses.
Step 5 of 9
5. Generalization: This is the reverse of specialization: the differences between several
entity types are suppressed, their common features are identified, and they are generalized into a single
superclass of which the original entity types are special subclasses.
For example: PASSENGER_VEHICLE and GOODS_VEHICLE are two classes that share certain
attributes, viz. number_plate, reg_number, colour, etc.; these common attributes can be taken out and a
new superclass VEHICLE created. This is called generalization.
Step 6 of 9
6. Category: Sometimes the need arises to model a single superclass/subclass relationship with
more than one superclass, where the superclasses represent different entity types. In this case, the
subclass represents a collection of objects that is a subset of the union of the distinct entity types;
such a subclass is called a union type or a category.
Step 7 of 9
7.
Step 8 of 9
Specific (local) attributes: Consider the entity type VEHICLE. All vehicles have a manufacturer, a
number_plate, a registration_number, a colour, and so on, but certain properties, such as load_capacity
and size, are linked only to the GOODS_VEHICLE subclass, and certain attributes, such as
sitting_capacity and ac/non-ac, can be attached to the PASSENGER_VEHICLE subclass only. These
attributes that belong only to subclasses and not to the superclass are called local attributes or specific
attributes.
Step 9 of 9
8. Specific relationships: Like local attributes, there are certain relationships that hold only
for one subclass of a superclass and not for all subclasses or for the superclass itself. Such
relationships are called specific relationships.
Chapter 4, Problem 3RQ
Problem
Step-by-step solution
Step 1 of 2
The Enhanced entity relationship (EER) model is the extension of the ER model. The EER model
includes some new concepts in addition to the concepts of the ER model. The EER model
includes the concepts of subclass, superclass, specialization, generalization, category or union
type. The ER model with all these additional concepts is associated with the mechanism of
attribute and relationship inheritance.
Step 2 of 2
The type of each entity is defined by its set of attributes and the relationship types in which it
participates. The members of a subclass entity type inherit the attributes and the relationships of
the superclass entity type. This mechanism is useful because the attributes in the subclass
automatically possess the characteristics of the superclass.
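Attribute and relationship inheritance can be sketched in Python, where subclassing gives the same effect: anything defined for the superclass is available to subclass members. The names (`Person`, `Student`, `lives_in`) are illustrative, not from the text.

```python
class Person:
    def __init__(self, name, ssn):
        self.name = name   # superclass attributes
        self.ssn = ssn

class Student(Person):     # subclass: inherits name and ssn
    def __init__(self, name, ssn, major):
        super().__init__(name, ssn)
        self.major = major  # local attribute of the subclass

# A "relationship" defined over the superclass ...
def lives_in(person: Person, city: str) -> str:
    return f"{person.name} lives in {city}"

# ... automatically applies to subclass members,
# mirroring relationship inheritance in the EER model.
s = Student("Ana", "123-45-6789", "CS")
print(lives_in(s, "Atlanta"))  # Ana lives in Atlanta
```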
Chapter 4, Problem 4RQ
Problem
Discuss user-defined and predicate-defined subclasses, and identify the differences between the
two.
Step-by-step solution
Step 1 of 1
Predicate-defined subclasses: When we decide which entities become members of each
subclass of a specialization by placing a condition on some attribute of the superclass, such
subclasses are called predicate-defined subclasses.
User-defined subclasses: When there is no such condition and the user assigns entities to a
subclass individually, the subclasses are called user-defined.
1. Membership in predicate-defined subclasses can be determined automatically, but this is not
the case for user-defined subclasses, where membership is specified explicitly by the user.
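The difference can be sketched in a few lines of Python. The entity data and the `job_type` attribute are hypothetical, chosen only to show automatic versus manual membership.

```python
# Superclass members, each with an attribute we can place a condition on.
employees = [
    {"name": "Ravi", "job_type": "Secretary"},
    {"name": "Mia", "job_type": "Engineer"},
    {"name": "Lee", "job_type": "Secretary"},
]

# Predicate-defined subclass: membership is decided AUTOMATICALLY
# by a condition on a superclass attribute.
SECRETARY = [e for e in employees if e["job_type"] == "Secretary"]

# User-defined subclass: the user assigns members explicitly,
# one entity at a time; no predicate is involved.
PROJECT_LEADS = []
PROJECT_LEADS.append(employees[1])  # a deliberate, manual insertion

print([e["name"] for e in SECRETARY])      # ['Ravi', 'Lee']
print([e["name"] for e in PROJECT_LEADS])  # ['Mia']
```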
Chapter 4, Problem 5RQ
Problem
Discuss user-defined and attribute-defined specializations, and identify the differences between
the two.
Step-by-step solution
Step 1 of 5
Step 2 of 5
If there is no condition for determining membership in the subclasses, the specialization is called
a user-defined specialization.
Step 3 of 5
Membership in such a specialization is determined by the database users when an operation is
performed to add an entity to the subclass.
Step 4 of 5
Attribute-defined specialization:
If entities become members of each subclass of the specialization based on a condition placed
on some attribute of the superclass, the specialization is called an attribute-defined
specialization.
Step 5 of 5
The user is responsible for identifying proper The value of the same attribute is used in
subclass. defining predicate for all subclasses.
Comment
Chapter 4, Problem 6RQ
Problem
Step-by-step solution
Step 1 of 1
1. Disjointness constraint: This specifies that the subclasses of the specialization must be disjoint.
This means that an entity can be a member of at most one of the subclasses of the
specialization. A specialization that is attribute-defined implies the disjointness constraint if the
attribute used to define the membership predicate is single-valued.
If the disjointness constraint holds, the specialization is disjoint. If instead there can be a set of
entities that are common to several subclasses, this is the condition of overlap.
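The disjointness check described above can be sketched as a small Python function over sets of entity identifiers (the sets and identifiers below are made up for illustration):

```python
def is_disjoint(*subclasses: set) -> bool:
    """True if every entity belongs to at most one subclass."""
    seen = set()
    for members in subclasses:
        if seen & members:
            return False  # an entity appears in two subclasses: overlap
        seen |= members
    return True

salaried = {"e1", "e2"}
hourly = {"e3", "e4"}
print(is_disjoint(salaried, hourly))        # True: disjoint specialization
print(is_disjoint(salaried, {"e2", "e5"}))  # False: e2 overlaps
```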
Chapter 4, Problem 7RQ
Problem
Step-by-step solution
Step 1 of 1
A subclass itself may have further subclasses specified on it, forming a hierarchy or a lattice of
specializations. A specialization hierarchy has the constraint that every subclass participates
as a subclass in only one class/subclass relationship; that is, each subclass has only one parent,
which results in a tree structure.
In contrast, in a specialization lattice a subclass can be a subclass in more than one
class/subclass relationship.
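The hierarchy/lattice distinction maps directly onto single versus multiple inheritance in Python. The class names below are illustrative; a shared subclass such as `StudentAssistant` turns the tree into a lattice.

```python
class Person:
    pass

class Employee(Person):  # hierarchy: each class has exactly one parent
    pass

class Student(Person):
    pass

# Lattice: a shared subclass participates in more than one
# class/subclass relationship, i.e. it has more than one parent.
class StudentAssistant(Employee, Student):
    pass

# The method resolution order shows every superclass reached
# through both parent paths.
print([c.__name__ for c in StudentAssistant.__mro__])
# ['StudentAssistant', 'Employee', 'Student', 'Person', 'object']
```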
Chapter 4, Problem 8RQ
Problem
What is the difference between specialization and generalization? Why do we not display this
difference in schema diagrams?
Step-by-step solution
Step 1 of 2
For example: On the basis of whether a vehicle is commercial or private, we can have another
specialization {COMMERCIAL, PRIVATE}.
c. Establish additional specific relationship types between each subclass and other entity types
or other subclasses.
Step 2 of 2
For example: GOOD_VEHICLE and CARRIER_VEHICLE are two classes with certain attributes,
viz. number_plate, reg_number, color, etc.; the attributes common to both classes can be
factored out and a new superclass VEHICLE can be created. This is called generalization.
Specialization and generalization can be viewed as functionally reverse processes of each other.
We generally do not display this difference in schema diagrams because the decision as to which
process is more appropriate in a particular situation is often subjective.
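Generalization as "factoring out the common attributes" can be sketched with Python set operations. The attribute sets below are illustrative, loosely following the GOOD_VEHICLE / CARRIER_VEHICLE example:

```python
good_vehicle_attrs = {"number_plate", "reg_number", "color", "load_capacity"}
carrier_vehicle_attrs = {"number_plate", "reg_number", "color", "num_axles"}

# Generalization: pull the attributes common to both classes
# up into a new superclass VEHICLE.
vehicle_attrs = good_vehicle_attrs & carrier_vehicle_attrs

# The remaining attributes stay local (specific) to each subclass.
good_specific = good_vehicle_attrs - vehicle_attrs
carrier_specific = carrier_vehicle_attrs - vehicle_attrs

print(sorted(vehicle_attrs))  # ['color', 'number_plate', 'reg_number']
```

Specialization runs the same computation in reverse: starting from `vehicle_attrs`, each subclass adds its specific attributes back.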
Chapter 4, Problem 9RQ
Problem
How does a category differ from a regular shared subclass? What is a category used for?
Illustrate your answer with examples.
Step-by-step solution
Step 1 of 3
1. A category has two or more superclasses that may represent distinct entity types, whereas a
shared subclass participates in multiple distinct superclass/subclass relationships, each with a
single superclass.
Category fig:
Step 2 of 3
2. An entity that is a member of a shared subclass must exist in all of its superclasses; that is, the
shared subclass is a subset of the intersection of the superclasses. In the case of a category, a
member entity need exist in only one of the superclasses; that is, the category is a subset of the
union of the superclasses.
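The intersection/union distinction can be checked directly with Python sets. The entity identifiers and class names below are hypothetical:

```python
employees = {"e1", "e2", "e3"}
students = {"e2", "e3", "s4"}

# A shared subclass (e.g. STUDENT_ASSISTANT) must lie inside the
# INTERSECTION of its superclasses: every member is in all of them.
student_assistants = {"e2"}
assert student_assistants <= (employees & students)

# A category (e.g. OWNER over PERSON and COMPANY) lies inside the
# UNION: each member needs to exist in only one superclass.
persons = {"p1", "p2"}
companies = {"c1"}
owners = {"p1", "c1"}
assert owners <= (persons | companies)

print("both constraints hold")
```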
Step 3 of 3
USE: Sometimes the need arises to model a single superclass/subclass relationship with more
than one superclass, where the superclasses represent different entity types. In this case, the
subclass represents a collection of objects that is a subset of the union of the distinct entity
types; in such cases a union type, or category, is used.
For example: Consider a piece of property. It can be owned by a person, a business firm, a
charitable institution, a bank, etc. All these entities are of different types but jointly form the total
set of land owners. The figure above illustrates this example.
Chapter 4, Problem 10RQ
Problem
For each of the following UML terms (see Sections 3.8 and 4.6) discuss the corresponding term
in the EER model, if any: object, class, association, aggregation, generalization, multiplicity,
attributes, discriminator, link, link attribute, reflexive association, and qualified association.
Step-by-step solution
Step 1 of 1
UML term → Corresponding EER term
1. Object → Entity
5. Generalization → Generalization
7. Attributes → Attributes
Chapter 4, Problem 11RQ
Problem
Discuss the main differences between the notation for EER schema diagrams and UML class
diagrams by comparing how common concepts are represented in each.
Step-by-step solution
Step 1 of 1
Some of the differences between the notation for EER schema diagrams and the notation for
UML class diagrams are as follows:
Chapter 4, Problem 12RQ
Problem
List the various data abstraction concepts and the corresponding modeling concepts in the EER
model.
Step-by-step solution
Step 1 of 3
The four abstraction concepts used in the EER (Enhanced Entity-Relationship) model are as
follows:
• Classification and instantiation
• Identification
• Specialization and generalization
• Aggregation and association
Step 2 of 3
• Classification is used to assign similar entities or objects to an entity type or object type.
• Instantiation is the inverse of classification; it refers to the examination of distinct objects of a
class.
Identification
• Identification is the process by which classes and objects are made uniquely identifiable by
some identifier.
o Identification is used to tell the difference between classes and objects.
o Identification is also used to identify database objects and to relate them to their real-world
counterparts.
• Generalization is the inverse of specialization; it is used to combine several classes into a
higher-level class.
• Aggregation is used to build composite objects from their component objects.
Step 3 of 3
• The modeling concepts in the EER model are almost the same as those of the ER model. In
addition, the EER model contains subclasses and superclasses, which are related to the
concepts of specialization and generalization.
• Another modeling concept in the EER model is the category, or union type, for which there is
no standard terminology among the abstraction concepts.
Chapter 4, Problem 13RQ
Problem
What aggregation feature is missing from the EER model? How can the EER model be further
enhanced to support it?
Step-by-step solution
Step 1 of 2
Missing feature:
The EER (Enhanced Entity Relationship) model includes the possibility of combining objects that
are related by a specific relationship instance into a higher-level aggregate object, but this
feature cannot be expressed explicitly.
• This is sometimes useful because the higher-level aggregate may itself be related to some
other object.
• This type of relationship between a primitive object and an aggregate object is referred to as
IS-A-PART-OF, and its inverse is called IS-A-COMPONENT-OF.
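The IS-A-PART-OF / IS-A-COMPONENT-OF idea corresponds to object composition. A minimal Python sketch, with hypothetical component and aggregate types:

```python
from dataclasses import dataclass, field

@dataclass
class Engine:
    horsepower: int

@dataclass
class Wheel:
    position: str

@dataclass
class Car:
    # Composition: Engine and Wheel objects ARE-A-PART-OF the Car;
    # conversely, each IS-A-COMPONENT-OF the aggregate Car object.
    engine: Engine
    wheels: list = field(default_factory=list)

car = Car(Engine(120), [Wheel("FL"), Wheel("FR"), Wheel("RL"), Wheel("RR")])
print(len(car.wheels))        # 4
print(car.engine.horsepower)  # 120
```

The aggregate `car` can now itself participate in further relationships (e.g. ownership), which is exactly why the higher-level aggregate object is useful.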
Step 2 of 2
Enhancement:
The EER model can be enhanced to support this feature by representing aggregations explicitly
through additional entity types.
Chapter 4, Problem 14RQ
Problem
What are the main similarities and differences between conceptual database modeling
techniques and knowledge representation techniques?
Step-by-step solution
Step 1 of 2
Major similarities and differences between conceptual database modeling techniques and
knowledge representation techniques:
1. Both the disciplines use an abstraction process to identify common properties and important
aspects of objects in the miniworld while suppressing insignificant differences and unimportant
details.
2. Both disciplines provide concepts, constraints, operations, and languages for defining data
and representing knowledge.
3. KR is generally broader in scope than semantic data models. Different forms of knowledge,
such as rules, incomplete and default knowledge, temporal and spatial knowledge, are
represented in KR schemes.
Step 2 of 2
4. KR schemes include reasoning mechanisms that can deduce additional facts from the facts stored in a database.
Hence, whereas most current database systems are limited to answering the direct queries,
knowledge-based systems using KR schemes can answer queries that involve inferences over
the stored data.
5. Whereas most data models concentrate on the representation of database schemas, or meta-
knowledge, KR schemes often mix up the schemas with the instances themselves in order to
provide flexibility in representing exceptions. This often leads to inefficiencies when KR schemes
are implemented, in comparison to databases, especially when large amounts of data need to be
stored.
Chapter 4, Problem 15RQ
Problem
Discuss the similarities and differences between an ontology and a database schema.
Step-by-step solution
Step 1 of 1
The difference between an ontology and a database schema is that a schema is usually limited
to describing a small subset of a miniworld from reality in order to store and manage data. An
ontology is usually considered to be more general in that it attempts to describe a part of reality
or a domain of interest (e.g., medical terms, electronic-commerce applications) as completely as
possible.
Chapter 4, Problem 16E
Problem
Design an EER schema for a database application that you are interested in. Specify all
constraints that should hold on the database. Make sure that the schema has at least five entity
types, four relationship types, a weak entity type, a superclass/subclass relationship, a category,
and an n-ary (n > 2) relationship type.
Step-by-step solution
Step 1 of 2
Step 2 of 2
Here the weak entity type INTERVIEW has a ternary identifying relationship involving
JOB_OFFER, CANDIDATE, and EMPLOYER. An interview is related to the candidate who gives
the interview, the employer that conducts it, and the job offer for which the interview is taken.
An employer can be a government organization or a private firm, and it hires for a department to
which a candidate can apply or for which the candidate wants to work.
Chapter 4, Problem 17E
Problem
Consider the BANK ER schema in Figure, and suppose that it is necessary to keep track of
different types of ACCOUNTS (SAVINGS_ACCTS, CHECKING_ACCTS, …) and LOANS
(CAR_LOANS, HOME_LOANS, …). Suppose that it is also desirable to keep track of each
ACCOUNT’S TRANSACTIONS (deposits, withdrawals, checks, …) and each LOAN's
PAYMENTS; both of these include the amount, date, and time. Modify the BANK schema, using
ER and EER concepts of specialization and generalization. State any assumptions you make
about the additional requirements.
Step-by-step solution
Step 1 of 2
• There are only three types of accounts: SAVINGS, CURRENT, and CHECKING accounts.
• There are only three types of loans: CAR loans, HOME loans, and PERSONAL loans.
Step 2 of 2
Problem
The following narrative describes a simplified version of the organization of Olympic facilities
planned for the summer Olympics. Draw an EER diagram that shows the entity types, attributes,
relationships, and specializations for this application. State any assumptions you make. The
Olympic facilities are divided into sports complexes. Sports complexes are divided into one-sport
and multisport types. Multisport complexes have areas of the complex designated for each sport
with a location indicator (e.g., center, NE corner, and so on). A complex has a location, chief
organizing individual, total occupied area, and so on. Each complex holds a series of events
(e.g., the track stadium may hold many different races). For each event there is a planned date,
duration, number of participants, number of officials, and so on. A roster of all officials will be
maintained together with the list of events each official will be involved in. Different equipment is
needed for the events (e.g., goal posts, poles, parallel bars) as well as for maintenance. The two
types of facilities (one-sport and multisport) will have different types of information. For each type,
the number of facilities needed is kept, together with an approximate budget.
Step-by-step solution
Step 1 of 3
Step 2 of 3
The following is the EER diagram for the organization of Olympic facilities planned for the
summer Olympics.
Step 3 of 3
Explanation:
• The Olympic facilities are divided into sports complexes, which are in turn divided into
one-sport and multisport types.
• There exists a HOLDS relationship between the COMPLEX and EVENT entities: a complex
holds a number of events.
• Both complexes and events have equipment: the complex maintains maintenance equipment
and the event has event equipment.
Chapter 4, Problem 19E
Problem
Identify all the important concepts represented in the library database case study described
below. In particular, identify the abstractions of classification (entity types and relationship types),
aggregation, identification, and specialization/generalization. Specify (min, max) cardinality
constraints whenever possible. List details that will affect the eventual design but that have no
bearing on the conceptual design. List the semantic constraints separately. Draw an EER
diagram of the library database.
Case Study: The Georgia Tech Library (GTL) has approximately 16,000 members, 100,000
titles, and 250,000 volumes (an average of 2.5 copies per book). About 10% of the volumes are
out on loan at any one time. The librarians ensure that the books that members want to borrow
are available when the members want to borrow them. Also, the librarians must know how many
copies of each book are in the library or out on loan at any given time. A catalog of books is
available online that lists books by author, title, and subject area. For each title in the library, a
book description is kept in the catalog; the description ranges from one sentence to several
pages. The reference librarians want to be able to access this description when members
request information about a book. Library staff includes chief librarian, departmental associate
librarians, reference librarians, check-out staff, and library assistants.
Books can be checked out for 21 days. Members are allowed to have only five books out at a
time. Members usually return books within three to four weeks. Most members know that they
have one week of grace before a notice is sent to them, so they try to return books before the
grace period ends. About 5% of the members have to be sent reminders to return books. Most
overdue books are returned within a month of the due date. Approximately 5% of the overdue
books are either kept or never returned. The most active members of the library are defined as
those who borrow books at least ten times during the year. The top 1% of membership does 15%
of the borrowing, and the top 10% of the membership does 40% of the borrowing. About 20% of
the members are totally inactive in that they are members who never borrow.
To become a member of the library, applicants fill out a form including their SSN, campus and
home mailing addresses, and phone numbers. The librarians issue a numbered, machine-
readable card with the member's photo on it. This card is good for four years. A month before a
card expires, a notice is sent to a member for renewal. Professors at the institute are considered
automatic members. When a new faculty member joins the institute, his or her information is
pulled from the employee records and a library card is mailed to his or her campus address.
Professors are allowed to check out books for three-month intervals and have a two-week grace
period. Renewal notices to professors are sent to their campus address.
The library does not lend some books, such as reference books, rare books, and maps. The
librarians must differentiate between books that can be lent and those that cannot be lent. In
addition, the librarians have a list of some books they are interested in acquiring but cannot
obtain, such as rare or out-of-print books and books that were lost or destroyed but have not
been replaced. The librarians must have a system that keeps track of books that cannot be lent
as well as books that they are interested in acquiring. Some books may have the same title;
therefore, the title cannot be used as a means of identification. Every book is identified by its
International Standard Book Number (ISBN), a unique international code assigned to all books.
Two books with the same title can have different ISBNs if they are in different languages or have
different bindings (hardcover or softcover). Editions of the same book have different ISBNs.
The proposed database system must be designed to keep track of the members, the books, the
catalog, and the borrowing activity.
Step-by-step solution
Step 1 of 2
Entity Types:
1. LIBRARY_MEMBER
2. BOOK
3. STAFF_MEMBER
Relationship types:
1. ISSUE_CARD
2. ISSUE_NOTICE
3. ISSUE_BOOK
4. GET_DESCRIPTION
Aggregation:
1. All entity types are aggregations of their constituent attributes, as can be seen from the EER
diagram.
2. Relationship types that have member attributes (see figure) are also aggregations.
Identification:
a. LIBRARY_MEMBER: Ssn
c. STAFF_MEMBER: Ssn
Specialization/generalization:
3.
Step 2 of 2
Chapter 4, Problem 20E
Problem
Design a database to keep track of information for an art museum. Assume that the following
requirements were collected:
■ The museum has a collection of ART_OBJECTS. Each ART_OBJECT has a unique Id_no, an
Artist (if known), a Year (when it was created, if known), a Title, and a Description. The art
objects are categorized in several ways, as discussed below.
■ ART_OBJECTS are categorized based on their type. There are three main types—PAINTING,
SCULPTURE, and STATUE—plus another type called OTHER to accommodate objects that do
not fall into one of the three main types.
■ A PAINTING has a Paint_type (oil, watercolor, etc.), material on which it is Drawn_on (paper,
canvas, wood, etc.), and Style (modern, abstract, etc.).
■ A SCULPTURE or a statue has a Material from which it was created (wood, stone, etc.),
Height, Weight, and Style.
■ An art object in the OTHER category has a Type (print, photo, etc.) and Style.
■ Information describing the country or culture of Origin (Italian, Egyptian, American, Indian, and
so forth) and Epoch (Renaissance, Modern, Ancient, and so forth) is captured for each
ART_OBJECT.
■ The museum keeps track of ARTIST information, if known: Name, DateBorn (if known),
Date_died (if not living), Country_of_origin, Epoch, Main_style, and Description. The Name is
assumed to be unique.
■ Different EXHIBITIONS occur, each having a Name, Start_date, and End_date. EXHIBITIONS
are related to all the art objects that were on display during the exhibition.
■ Information is kept on other COLLECTIONS with which the museum interacts; this information
includes Name (unique), Type (museum, personal, etc.), Description, Address, Phone, and
current Contact_person.
Draw an EER schema diagram for this application. Discuss any assumptions you make, and then
justify your EER design choices.
Step-by-step solution
Step 1 of 2
Step 2 of 2
The EER schema diagram for the art museum database is as follows:
Chapter 4, Problem 21E
Problem
Figure shows an example of an EER diagram for a small-private-airport database; the database
is used to keep track of airplanes, their owners, airport employees, and pilots. From the
requirements for this database, the following information was collected: Each AIRPLANE has a
registration number [Reg#], is of a particular plane type [OF_TYPE], and is stored in a particular
hangar [STORED_IN]. Each PLANE_TYPE has a model number [Model], a capacity [Capacity],
and a weight [Weight]. Each HANGAR has a number [Number], a capacity [Capacity], and a
location [Location]. The database also keeps track of the OWNERs of each plane [OWNS] and
the EMPLOYEES who have maintained the plane [MAINTAIN]. Each relationship instance in
OWNS relates an AIRPLANE to an OWNER and includes the purchase date [Pdate]. Each
relationship instance in MAINTAIN relates an EMPLOYEE to a service record [SERVICE]. Each
plane undergoes service many times; hence, it is related by [PLANE_SERVICE] to a number of
SERVICE records. A SERVICE record includes as attributes the date of maintenance [Date], the
number of hours spent on the work [Hours], and the type of work done [Work_code]. We use a
weak entity type [SERVICE] to represent airplane service, because the airplane registration
number is used to identify a service record. An OWNER is either a person or a corporation.
Hence, we use a union type (category) [OWNER] that is a subset of the union of corporation
[CORPORATION] and person [PERSON] entity types. Both pilots [PILOT] and employees
[EMPLOYEE] are subclasses of PERSON. Each PILOT has specific attributes license number
[Lic_num] and restrictions [Restr]; each EMPLOYEE has specific attributes salary [Salary] and
shift worked [Shift]. All PERSON entities in the database have data kept on their Social Security
number [Ssn], name [Name], address [Address], and telephone number [Phone]. For
CORPORATION entities, the data kept includes name [Name], address [Address], and telephone
number [Phone]. The database also keeps track of the types of planes each pilot is authorized to
fly [FLIES] and the types of planes each employee can do maintenance work on [WORKS_ON].
Show how the SMALL_AIRPORT EER schema in Figure 4.12 may be represented in UML
notation. (Note: We have not discussed how to represent categories (union types) in UML, so
you do not have to map the categories in this and the following question.)
Step-by-step solution
Step 1 of 2
Consider the EER schema for a SMALL_AIRPORT database. The following is the UML diagram
that represents the SMALL_AIRPORT database.
Step 2 of 2
Each entity and relationships are shown in the UML diagram. In the provided EER diagram, there
is a union type (category) specified for OWNER. The OWNER is a subset of the union of
CORPORATION and PERSON. The categories are not mapped in the UML as specified.
Chapter 4, Problem 22E
Problem
Show how the UNIVERSITY EER schema in Figure 4.9 may be represented in UML notation.
Step-by-step solution
Step 1 of 2
• An entity relationship diagram is a diagram that represents the relationships between different
entities and their attributes.
• UML (Unified Modeling Language) is a general-purpose modeling language used in software
engineering.
Step 2 of 2
Problem
Consider the entity sets and attributes shown in the following table. Place a checkmark in one
column in each row to indicate the relationship between the far left and far right columns.
Entity Set | (a) Has a Relationship with | (b) Has an Attribute that is | (c) Is a Specialization of | (d) Is a Generalization of | Entity Set or Attribute
1. MOTHER | | | | | PERSON
2. DAUGHTER | | | | | MOTHER
3. STUDENT | | | | | PERSON
4. STUDENT | | | | | Student_id
5. SCHOOL | | | | | STUDENT
6. SCHOOL | | | | | CLASSROOM
7. ANIMAL | | | | | HORSE
8. HORSE | | | | | Breed
9. HORSE | | | | | Age
Step-by-step solution
Step 1 of 2
Generalization: Generalization is a relationship in which the child class is based on the parent
class. Both child and parent class elements in a generalization relationship must be of the same
type.
Inheritance: A child class's properties are derived from the parent class's properties. This is also
called an "Is a" relationship.
Step 2 of 2
Consider the entity sets and attributes and apply one of the relationships to each row.
Chapter 4, Problem 24E
Problem
Draw a UML diagram for storing a played game of chess in a database. You may look at
http://www.chessgames.com for an application similar to what you are designing. State clearly
any assumptions you make in your UML diagram. A sample of assumptions you can make about
the scope is as follows:
3. The players are assigned a color of black or white at the start of the game.
4. Each player starts with the following pieces (traditionally called chessmen):
a. king
b. queen
c. 2 rooks
d. 2 bishops
e. 2 knights
f. 8 pawns
6. Every piece has its own set of legal moves based on the state of the game. You do not need to
worry about which moves are or are not legal except for the following issues:
c. If a pawn moves to the last row, it is “promoted” by converting it to another piece (queen, rook,
bishop, or knight).
Step-by-step solution
Step 1 of 1
Assumptions:
Chapter 4, Problem 25E
Problem
Draw an EER diagram for a game of chess as described in Exercise. Focus on persistent
storage aspects of the system. For example, the system would need to retrieve all the moves of
every game played in sequential order.
Exercise
Draw a UML diagram for storing a played game of chess in a database. You may look at
http://www.chessgames.com for an application similar to what you are designing. State clearly
any assumptions you make in your UML diagram. A sample of assumptions you can make about
the scope is as follows:
3. The players are assigned a color of black or white at the start of the game.
4. Each player starts with the following pieces (traditionally called chessmen):
a. king
b. queen
c. 2 rooks
d. 2 bishops
e. 2 knights
f. 8 pawns
6. Every piece has its own set of legal moves based on the state of the game. You do not need to
worry about which moves are or are not legal except for the following issues:
c. If a pawn moves to the last row, it is “promoted” by converting it to another piece (queen, rook,
bishop, or knight).
Step-by-step solution
Step 1 of 1
An Enhanced Entity-Relationship (EER) diagram extends the ER model with the concepts of
superclass and subclass entity types.
Here the entity types are PLAYER, MOVES, and PIECES, with attributes such as Name, Color,
Cur_position, Initial_position, Piece_name, Position_before_move, and Changed_position.
Sequence order for game play:
Step 2: PIECES get moved and give the chance for position.
Step 4: PIECES change the position for avoiding the PLAYER move.
Chapter 4, Problem 26E
Problem
Which of the following EER diagrams is/are incorrect and why? State clearly any assumptions
you make.
a.
b.
c.
Step-by-step solution
Step 1 of 3
a.
Step 2 of 3
b.
• E1 and E2 are disjoint subclasses of entity type E. This indicates that an entity of E may be a
member of at most one of E1 and E2.
Step 3 of 3
c.
• E1 and E3 are overlapping subclasses of an entity type E. This indicates that an entity of E
may be a member of E1 or E3 or both.
• The overlapping subclasses E1 and E3 cannot share the relationship R, so there cannot be a
many-to-many relationship between E1 and E3.
Chapter 4, Problem 27E
Problem
Consider the following EER diagram that describes the computer systems at a company. Provide
your own attributes and key for each entity type. Supply max cardinality constraints justifying
your choice. Write a complete narrative description of what this EER diagram represents.
Step-by-step solution
Step 1 of 5
S.No | Entity type | Attribute | Key
4 | DESKTOP | Color | NA
8 | KEYBOARD | Type | NA
9 | MEMORY | Size | NA
12 | SOUND_CARD | Type | NA
13 | VIDEO_CARD | Type | NA
Step 2 of 5
S.No | Relationship name | Entity type 1 name | (min,max) constraint | Entity type 2 name | (min,max) constraint | Reason
Step 3 of 5
As all components and accessories are identified by S_no and all software is identified by
Lic_no, each can belong to a single LAPTOP/DESKTOP/COMPUTER. On the contrary, a
computer can have any number of ACCESSORY/SOFTWARE/OPERATING_SYSTEM/
COMPONENT/MEMORY instances.
A SOFTWARE item may need many supporting COMPONENTs, and a COMPONENT can
SUPPORT many SOFTWARE items.
Step 4 of 5
Step 5 of 5
With a COMPUTER one can get an ACCESSORY. Each ACCESSORY has Cost, S_no, and
Type (audio/video/input/output). ACCESSORY can be categorized into KEYBOARD (type),
MOUSE (type, Is_wired), and MONITOR (size, resolution, type).
Associated with DESKTOP and software we have various COMPONENTs (Manufacturer, S_no,
Cost, Type). COMPONENTs are further divided into MEMORY (size), AUDIO_CARD (type), and
VIDEO_CARD (type). LAPTOP can also have MEMORY_OPTIONS.
Chapter 4, Problem 29LE
Expert Answer
Below are the Database tables designed in MS Access for Teams, Managers, Umpires, Players and Pitchers:
For Managers:
For Players:
For Pitchers:
For Umpires:
These are all the related tables used to manage the baseball game, with a master database as follows:
Master DB Part2:
Chapter 4, Problem 31LE
Problem
Consider the EER diagram for the UNIVERSITY database shown in Figure 4.9. Enter this design
using a data modeling tool such as ERwin or Rational Rose. Make a list of the differences in
notation between the diagram in the text and the corresponding equivalent diagrammatic notation
you end up using with the tool.
Step-by-step solution
Step 1 of 1
Refer to the figure 4.9 for the EER diagram of the UNIVERSITY database. Use the Rational
Rose tool to create the ER schema for the database as follows:
• In the options available on left, right click on the option Logical view, go to New and select the
option Class Diagram.
• Name the class diagram as UNIVERSITY. Select the option Class available in the toolbar and
then click on empty space of the Class Diagram file. Name the class as FACULTY.
Right click on the class, select the option New Attribute, and name the attribute as Rank.
Similarly, create the other attributes Foffice, Fphone and Salary.
Chapter 4, Problem 31LE
Problem
Consider the EER diagram for the UNIVERSITY database shown in Figure 4.9. Enter this design
using a data modeling tool such as ERwin or Rational Rose. Make a list of the differences in
notation between the diagram in the text and the corresponding equivalent diagrammatic notation
you end up using with the tool.
Step-by-step solution
Step 1 of 1
Refer to the figure 4.9 for the EER diagram of the UNIVERSITY database. Use the Rational Rose
tool to create the ER schema for the database as follows:
• In the options available on left, right click on the option Logical view, go to New and select the
option Class Diagram.
• Name the class diagram as UNIVERSITY. Select the option Class available in the toolbar and
then click on empty space of the Class Diagram file. Name the class as FACULTY.
Right click on the class, select the option New Attribute, and name the attribute as Rank.
Similarly, create the other attributes Foffice, Fphone and Salary.
• Similarly create another class GRANT and its attributes Title, No, Agency and St_date.
• Now right click on the attribute No, available on the left under the class GRANT, and select the
option Open Specification. Select the Protected option under Export Control. This will make
the attribute No as primary key.
• Select the option Unidirectional Association from the toolbar, for creating relationships
between the two classes. Now click on the class FACULTY; while holding the click drag the
mouse towards the class GRANT and release the click. This will create the relationship between
the two selected classes.
Name the association as PI. Since the structural constraint in the EER diagram is specified using
a cardinality ratio, specify the structural constraints using the Rational Rose tool as follows:
• Right click on the association close to the class FACULTY and select 1 from the option
Multiplicity.
• Again, right click on the association close to the class GRANT and select n from the option
Multiplicity.
• Similarly, create other classes and their associated attributes. Specify the relationships and
structural constraints between the classes, as mentioned above.
The ER schema may thus be specified using an alternative diagrammatic notation, the class
diagram, through the Rational Rose tool.
The list of differences in notation between the EER diagram used in figure 4.9 and its
equivalent class-diagram notation, drawn through the Rational Rose tool, is as follows:
• In the EER diagram, the entities are specified in rectangles. In contrast, the class diagram in
Rational Rose uses the top section of each class box for specifying the entities.
• The attributes are specified in the EER diagram using ovals. The class diagram in Rational
Rose uses the middle section of each class box for specifying the attributes.
• The primary keys in the EER diagram are specified by underlining the attribute in its oval.
An attribute can be made a primary key in the Rational Rose class diagram by selecting the
option Open Specification, followed by selecting the Protected option under Export Control.
A yellow key icon against the attribute in the class diagram indicates the primary key.
• The relationship between two entities is specified in the EER diagram in a diamond-shaped
box. For example, in figure 4.9, PI is the relationship between FACULTY and GRANT.
The class diagram in Rational Rose uses the option Unidirectional Association for specifying
the relationship or association between two entities. For example, in the above class diagram,
the association named PI is specified on the line joining the two entities.
• The structural constraint in the EER diagram is specified using a cardinality ratio. For example,
in the PI relationship, FACULTY : GRANT has the cardinality ratio 1:N.
In the class diagram made using Rational Rose, the Multiplicity option is used for specifying the
cardinality ratio.
Chapter 4, Problem 32LE
Problem
Consider the EER diagram for the small AIRPORT database shown in Figure. Build this design
using a data modeling tool such as ERwin or Rational Rose. Be careful how you model the
category OWNER in this diagram. (Hint: Consider using CORPORATION_IS_OWNER and
PERSON_IS_OWNER as two distinct relationship types.)
Step-by-step solution
Step 1 of 2
Refer to the figure 4.12 for the EER schema of the small AIRPORT database. Use the Rational
Rose tool to create the EER schema for the database as follows:
• In the options available on left, right click on the option Logical view, go to New and select the
option Class Diagram.
• Name the class diagram as SMALL_AIRPORT. Select the option Class available in the toolbar
and then click on empty space of the Class Diagram file. Name the class as PLANE_TYPE.
Right click on the class, select the option New Attribute, and name the attribute as Model.
Similarly, create the other attributes Capacity and Weight.
• Now right click on the attribute Model, available on the left under the class PLANE_TYPE, and
select the option Open Specification. Select the Protected option under Export Control. This
will make Model as the primary key.
• Similarly, create another class EMPLOYEE and its attributes Salary and Shift.
• Select the option Unidirectional Association from the toolbar, for creating relationships
between the two classes. Now click on the class PLANE_TYPE; while holding the click drag the
mouse towards the class EMPLOYEE and release the click. This will create the relationship
between the two selected classes.
Name the association as WORKS_ON. Since the structural constraint in the EER diagram is
specified using a cardinality ratio, specify the structural constraints using the Rational Rose tool
as follows:
• Right click on the association close to the class PLANE_TYPE and select n from the option
Multiplicity.
• Again, right click on the association close to the class EMPLOYEE and select n from the option
Multiplicity.
• Similarly, create other classes and their associated attributes. Specify the relationships and
structural constraints between the classes, as mentioned above.
The ER schema may thus be specified using the alternative class-diagram notation through the
Rational Rose tool.
Step 2 of 2
In the above class diagram, OWNER is the superclass, and PERSON and CORPORATION are
its subclasses. The subclasses can further participate in specific relationship types.
For example, in the above class diagram the PERSON subclass participates in the
OWNER_TYPE relationship: the subclass PERSON is related to the entity type
PERSON_IS_OWNER via the OWNER_TYPE relationship.
The relationship types can be specified using the Rational Rose as follows:
• Create the subclass PERSON_IS_OWNER of the class PERSON as explained above. Also
create the association between the class PERSON and its subclass PERSON_IS_OWNER, and
name it OWNER_TYPE, as explained above.
Chapter 4, Problem 33LE
Problem
Consider the EER diagram for the UNIVERSITY database shown in Figure 4.9. Enter this design
using a data modeling tool such as ERwin or Rational Rose. Make a list of the differences in
notation between the diagram in the text and the corresponding equivalent diagrammatic notation
you end up using with the tool.
Which combinations of attributes have to be unique for each individual SECTION entity in the
UNIVERSITY database shown in Figure 3.20 to enforce each of the following miniworld
constraints:
a. During a particular semester and year, only one section can use a particular classroom at a
particular DaysTime value.
b. During a particular semester and year, an instructor can teach only one section at a particular
DaysTime value.
c. During a particular semester and year, the section numbers for sections offered for the same
course must all be different.
Step-by-step solution
Step 1 of 1
Refer to Exercise 3.16 for the UNIVERSITY database and the ER schema developed for this
database through the Rational Rose tool. Using Rational Rose, make the required changes and
create the ER schema as follows:
• Consider the class COURSE developed in Exercise 3.31. Select the option Class available in
the toolbar and then click on empty space of the Class Diagram file. Name the subclass as
UNDERGRAD_COURSES.
Right click on the class, select the option New Attribute, and name the attribute as Title.
Similarly, create the other attribute Department.
Similarly, create another subclass GRAD_COURSES of the class COURSE and its attributes
Title and Department.
The relationship types between the subclass and superclass can be specified using the Rational
Rose as follows:
• Select the option Unidirectional Association from the toolbar, for creating relationships
between the two classes. Now click on the class JUNIOR_PROFESSORS; while holding the
click drag the mouse towards the class UNDERGRAD_COURSES and release the click. This will
create the relationship between the two selected classes.
The ER schema with the changes may be specified using the alternative class-diagram notation
through the Rational Rose tool.
Chapter 5, Problem 1RQ
Problem
Define the following terms as they apply to the relational model of data: domain, attribute,
n-tuple, relation schema, relation state, degree of a relation, relational database schema, and
relational database state.
Step-by-step solution
Step 1 of 7
1. Domain: A domain is a set of atomic (indivisible) values that can appear in a particular column
of a relation. A common method of specifying a domain is to specify a data type (integer,
character, floating point, etc.) from which the data values forming the domain are drawn.
For example: Consider a relational schema called STUDENT that stores facts about students
in a particular course. One such fact is the name of the student; the name must be a character
string, so the domain of Name is the set of character strings.
Step 2 of 7
2. Attribute: An attribute is a named property or characteristic of the entity described by a
relation; each attribute corresponds to one column of the relation.
For example: In the relational schema STUDENT, NAME can be one of the attributes of the
relation.
Notations:
• Attributes: A1, A2, …
• Tuple: t
Step 3 of 7
3. n-tuple: An n-tuple t is an ordered list of n values t = <v1, v2, …, vn>, where each value vi is
an element of the domain of the corresponding attribute Ai.
For example: In a relational schema STUDENT with the four attributes Name, Roll No., Class,
and Rank, a 4-tuple for a student could record that the student Ram has roll number 1 and got
rank 5 in his class.
4. Relational Schema: A relational schema is a collection of attributes, together with a name,
that defines the facts about and relationships of a real-world entity. In other words, a relational
schema R, denoted by R(A1, A2, …, An), is made up of a relation name and a list of attributes
A1, A2, …, An.
For example: STUDENT can be the name of a relational schema, and Name, Roll No., Class,
and Rank can be its four attributes.
Step 4 of 7
5. Relation State: A relation state is the set of tuples currently stored in a relation. For example,
in the relational schema STUDENT, the collection of data for two particular students is a relation
state.
Formal Definition: A relation state r(R) is a mathematical relation of degree n on the domains
of all the attributes, i.e., a subset of the Cartesian product of the domains that define R:
r(R) ⊆ (dom(A1) × dom(A2) × … × dom(An))
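This formal definition can be illustrated with a small Python sketch; the two-attribute schema and its toy domains below are illustrative assumptions, not taken from the text:

```python
from itertools import product

# Toy domains for a two-attribute schema R(Name, Rank)
dom_name = {"Ram", "Sita"}
dom_rank = {1, 2}

# The Cartesian product dom(Name) x dom(Rank): every possible tuple
all_possible = set(product(dom_name, dom_rank))

# A relation state r(R) is any subset of that Cartesian product
r = {("Ram", 1), ("Sita", 2)}

print(r.issubset(all_possible))  # True: r is a valid relation state
print(len(all_possible))         # 4 possible tuples (2 x 2)
```

Any subset of the Cartesian product, including the empty set, is a valid relation state for the schema.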
Step 5 of 7
6. Degree of a Relation: The degree (or arity) of a relation is the number of attributes n in its
relational schema.
Step 6 of 7
7. Relational Database Schema: A relational database schema S is a set of relation schemas
S = {R1, R2, …, Rm} together with a set of integrity constraints IC.
Step 7 of 7
8. Relational Database State: A relational database state DB is a set of relation states
DB = {r1, r2, …, rm} such that each ri is a state of Ri and such that the relation states satisfy
the integrity constraints specified in IC.
Chapter 5, Problem 2RQ
Problem
Step-by-step solution
Step 1 of 2
Step 2 of 2
Chapter 5, Problem 3RQ
Problem
Step-by-step solution
Step 1 of 1
Duplicate tuples are not allowed in a relation because they violate the relational integrity
constraints:
• A key constraint states that there must be an attribute or combination of attributes in a relation
whose values are unique.
• No two tuples in a relation may have the same values for all of their attributes.
• If a relation contains duplicate tuples, it therefore violates the key constraint.
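The key constraint can be demonstrated with a minimal sketch using Python's built-in sqlite3 module; the STUDENT table and its sample values are assumptions for illustration:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE STUDENT (RollNo INTEGER PRIMARY KEY, Name TEXT)")
con.execute("INSERT INTO STUDENT VALUES (1, 'Ram')")

# Inserting a tuple with the same key value would create a duplicate,
# so the DBMS rejects it as a key-constraint violation
try:
    con.execute("INSERT INTO STUDENT VALUES (1, 'Ram')")
    violated = False
except sqlite3.IntegrityError:
    violated = True

print(violated)  # True: the duplicate insertion is rejected
```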
Chapter 5, Problem 4RQ
Problem
Step-by-step solution
Step 1 of 2
A superkey SK is a set of attributes that uniquely identifies the tuples of a relation; it satisfies
the uniqueness constraint.
A key K is an attribute or set of attributes that uniquely identifies the tuples of a relation and is
a minimal superkey: if any attribute is removed from K, it is no longer a superkey.
Step 2 of 2
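The minimality condition can be checked mechanically over a toy set of tuples; the relation STUDENT(Ssn, Name, Age) and its sample rows below are illustrative assumptions:

```python
from itertools import combinations

# Sample tuples for STUDENT(Ssn, Name, Age) -- illustrative data only
rows = [("111", "Ram", 20), ("222", "Sita", 20), ("333", "Ram", 21)]
attrs = ("Ssn", "Name", "Age")

def is_superkey(cols):
    """A set of attributes is a superkey if its projection has no duplicates."""
    proj = [tuple(r[attrs.index(c)] for c in cols) for r in rows]
    return len(set(proj)) == len(proj)

def is_key(cols):
    """A key is a minimal superkey: removing any attribute breaks uniqueness."""
    return is_superkey(cols) and all(
        not is_superkey(tuple(c for c in cols if c != x)) for x in cols
    )

print(is_superkey(("Ssn", "Name")))  # True: identifies every tuple uniquely
print(is_key(("Ssn", "Name")))       # False: not minimal, since Ssn alone suffices
print(is_key(("Ssn",)))              # True: a minimal superkey
```

Here (Ssn, Name) is a superkey but not a key, because dropping Name still leaves a superkey.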
Chapter 5, Problem 5RQ
Problem
Why do we designate one of the candidate keys of a relation to be the primary key?
Step-by-step solution
Step 1 of 1
Every relation must contain an attribute or combination of attributes that can be used to uniquely
identify each tuple in the relation.
• An attribute or combination of attributes that can be used to uniquely identify each tuple in a
relation is known as a candidate key.
• Among the several candidate keys, one candidate key, usually one that is single-attribute and
simple, is chosen as the primary key.
Chapter 5, Problem 6RQ
Problem
Discuss the characteristics of relations that make them different from ordinary tables and files.
Step-by-step solution
Step 1 of 2
Tables, relations, and files are key concepts of the relational data model. A relation resembles
a table, but it has some added constraints that allow the link between two tables to be used in
an efficient way.
Step 2 of 2
Even though both a relation and a table are used to store and represent data, there are
differences between them, as shown below:
Chapter 5, Problem 7RQ
Problem
Discuss the various reasons that lead to the occurrence of NULL values in relations.
Step-by-step solution
Step 1 of 2
NULL value: A NULL value represents an attribute value that is missing from a tuple.
For example, if a relation records the pen or pencil that a student brings to an exam and a
particular student has neither:
• For that particular student, the values of those attributes are defined as NULL.
• The NULL value can mean that the value does not exist, that the value is unknown, or that the
value is not yet available.
Step 2 of 2
The main reasons that lead to NULL values in relations are as follows:
• An attribute is marked as NULL when its value is not applicable to that tuple.
• An attribute is marked as NULL when its value exists but is unknown.
• An attribute is marked as NULL when its value is not known or has not been found.
• An attribute is marked as NULL when the value exists but is not available at present.
• Since a single NULL marker cannot distinguish these cases, the different meanings of NULL
can be conveyed by different codes.
Chapter 5, Problem 8RQ
Problem
Discuss the entity integrity and referential integrity constraints. Why is each considered
important?
Step-by-step solution
Step 1 of 2
Entity Integrity Constraint: It states that no primary key value can be NULL.
Importance: Primary key values are used to identify the individual tuples in a relation. Having a
NULL value for the primary key would mean that some tuples could not be identified.
Referential Integrity Constraint: It states that a tuple in one relation that refers to another
relation must refer to an existing tuple in that relation.
Step 2 of 2
Definition using Foreign Key: For two relational schemas R1 and R2, a set of attributes FK in
relation schema R1 is a foreign key of R1 that references relation R2 if it satisfies the following
conditions:
• The attributes in FK have the same domain(s) as the primary key attributes PK of R2; the
attributes FK are said to reference the relation R2.
• A value of FK in a tuple t1 of the current state r1(R1) either occurs as a value of PK for some
tuple t2 in the current state r2(R2) or is NULL. In the former case (t1[FK] = t2[PK]), tuple t1 is
said to refer to the tuple t2.
When these two conditions hold between R1, the referencing relation, and R2, the referenced
relation, the referential integrity constraint is said to hold.
Importance: Referential integrity constraints are specified between two relations and are used
to maintain consistency among the tuples of the two relations.
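Both constraints can be demonstrated with a minimal sqlite3 sketch; the simplified EMPLOYEE and DEPARTMENT tables below are assumptions loosely modeled on the usual COMPANY example:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
con.execute("CREATE TABLE DEPARTMENT (Dnumber INTEGER PRIMARY KEY)")
con.execute("""CREATE TABLE EMPLOYEE (
    Ssn TEXT PRIMARY KEY,                       -- entity integrity: may not be NULL
    Dno INTEGER REFERENCES DEPARTMENT(Dnumber)  -- referential integrity
)""")

con.execute("INSERT INTO DEPARTMENT VALUES (5)")
con.execute("INSERT INTO EMPLOYEE VALUES ('123456789', 5)")     # refers to an existing tuple
con.execute("INSERT INTO EMPLOYEE VALUES ('999887777', NULL)")  # a NULL foreign key is allowed

try:
    con.execute("INSERT INTO EMPLOYEE VALUES ('888665555', 9)")  # no department 9 exists
    accepted = True
except sqlite3.IntegrityError:
    accepted = False

print(accepted)  # False: the referential integrity constraint rejects the insert
```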
Chapter 5, Problem 9RQ
Problem
Step-by-step solution
Step 1 of 2
A foreign key is an attribute, or a composite of attributes, of one relation that is the primary key
of another relation and that is used to maintain the relationship between the two relations.
Step 2 of 2
The concept of a foreign key is used to maintain the referential integrity constraint between two
relations, and hence to maintain consistency among the tuples of the two relations.
• The value of a foreign key should match a value of the primary key in the referenced relation.
• A value that does not exist in the primary key of the referenced relation cannot be added to
the foreign key.
• It is not possible to delete a tuple from the referenced relation if there is any matching record
in the referencing relation.
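The last bullet, the restriction on deleting a referenced tuple, can be demonstrated with a minimal sqlite3 sketch (the COURSE and ENROLL tables are illustrative assumptions):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # enable FK enforcement in SQLite
con.execute("CREATE TABLE COURSE (CourseNo TEXT PRIMARY KEY)")
con.execute("""CREATE TABLE ENROLL (
    Ssn TEXT, CourseNo TEXT REFERENCES COURSE(CourseNo))""")
con.execute("INSERT INTO COURSE VALUES ('CS101')")
con.execute("INSERT INTO ENROLL VALUES ('111', 'CS101')")

# Deleting the referenced course is rejected while an ENROLL row still refers to it
try:
    con.execute("DELETE FROM COURSE WHERE CourseNo = 'CS101'")
    deleted = True
except sqlite3.IntegrityError:
    deleted = False

print(deleted)  # False: the delete is rejected to preserve referential integrity
```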
Chapter 5, Problem 10RQ
Problem
Step-by-step solution
Step 1 of 2
A transaction is an executing program that includes one or more operations performed on the
database.
Step 2 of 2
• In an update operation, only a single attribute value can be changed at a time.
• A transaction can include more than one update operation, along with reading data from the
database and insertion and deletion operations.
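The difference can be sketched with sqlite3: one transaction groups two update operations that must succeed or fail together. The ACCOUNT table and the transfer scenario are illustrative assumptions:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE ACCOUNT (Id INTEGER PRIMARY KEY, Balance INTEGER)")
con.execute("INSERT INTO ACCOUNT VALUES (1, 100), (2, 50)")
con.commit()

# One transaction: a transfer combines two update operations that are
# committed together, or rolled back together if anything fails
try:
    con.execute("UPDATE ACCOUNT SET Balance = Balance - 30 WHERE Id = 1")
    con.execute("UPDATE ACCOUNT SET Balance = Balance + 30 WHERE Id = 2")
    con.commit()
except sqlite3.Error:
    con.rollback()  # undo both updates if either one fails

print(dict(con.execute("SELECT Id, Balance FROM ACCOUNT")))
# {1: 70, 2: 80}
```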
Chapter 5, Problem 11E
Problem
Suppose that each of the following Update operations is applied directly to the database state
shown in Figure 5.6. Discuss all integrity constraints violated by each operation, if any, and the
different ways of enforcing these constraints.
a. Insert <‘Robert’, ‘F’, ‘Scott’, ‘943775543’, ‘1972-06-21’, ‘2365 Newcastle Rd, Bellaire, TX’, M,
58000, ‘888665555’, 1> into EMPLOYEE.
i. Modify the Mgr_ssn and Mgr_start_date of the DEPARTMENT tuple with Dnumber = 5 to
‘123456789’ and ‘2007-10-01’, respectively.
j. Modify the Super_ssn attribute of the EMPLOYEE tuple with Ssn = ‘999887777’ to
‘943775543’.
k. Modify the Hours attribute of the WORKS_ON tuple with Essn = ‘999887777’ and Pno = 10 to
‘5.0’.
Step-by-step solution
Step 1 of 11
(a)
Acceptable operation.
Step 2 of 11
(b)
Not acceptable. It violates the referential integrity constraint, as the value of the department
number, which is a foreign key, is not present in the DEPARTMENT relation. Ways of enforcing
the constraint are as follows:
• Reject the operation and explain the cause of the rejection to the user.
• Prompt the user to first insert a department with department number 2 into the DEPARTMENT
relation, and then perform the operation.
Step 3 of 11
(c)
Not acceptable. It violates the key constraint: a department with department number 4 already
exists. Ways of enforcing the constraint are as follows:
• Reject the operation and explain the cause of the rejection to the user.
Step 4 of 11
(d)
Not acceptable. It violates the entity integrity constraint and the referential integrity constraint:
the value of one of the attributes of the primary key is NULL, and the value of Essn is not
present in the referenced relation, i.e., EMPLOYEE. Ways of enforcing the constraint are as
follows:
• Reject the operation and explain the cause of the rejection to the user.
• Prompt the user to specify correct values for the primary key, and then perform the operation.
Step 5 of 11
(e)
Acceptable
Step 6 of 11
(f)
Acceptable
Step 7 of 11
(g)
Not acceptable.
It violates the referential integrity constraint, as the value of Ssn is used as a foreign key in the
WORKS_ON, EMPLOYEE, DEPENDENT, and DEPARTMENT relations; deleting the record with
Ssn = ‘987654321’ would leave no corresponding entry for the referencing records, e.g., in the
WORKS_ON relation. Ways of enforcing the constraint are as follows:
• Reject the operation and explain the cause of the rejection to the user.
Step 8 of 11
(h)
Not acceptable.
It violates the referential integrity constraint, as the value of Pnumber is used as a foreign key
in the WORKS_ON relation. Deleting the record with Pname = ‘ProductX’ would also delete the
project with Pnumber = ‘1’; since this value is used in the WORKS_ON table, deleting the record
would violate the referential integrity constraint. Ways of enforcing the constraint are as follows:
• Reject the operation and explain the cause of the rejection to the user.
Step 9 of 11
(i)
Acceptable.
Step 10 of 11
(j)
Not acceptable.
It violates the referential integrity constraint, as Super_ssn is a foreign key that references the
EMPLOYEE relation itself. Since no employee with Ssn = ‘943775543’ exists, the Super_ssn of
an employee cannot be ‘943775543’. Ways of enforcing the constraint are as follows:
• Reject the operation and explain the cause of the rejection to the user.
• Prompt the user either to add a record with Ssn = ‘943775543’ to the EMPLOYEE relation or to
change Super_ssn to some valid value.
Step 11 of 11
(k)
Acceptable.
Chapter 5, Problem 12E
Problem
Consider the AIRLINE relational database schema shown in Figure, which describes a database
for airline flight information. Each FLIGHT is identified by a Flight_number, and consists of one or
more FLIGHT_LEGs with Leg_numbers 1, 2, 3, and so on. Each FLIGHT_LEG has scheduled
arrival and departure times, airports, and one or more LEG_INSTANCEs—one for each Date on
which the flight travels. FAREs are kept for each FLIGHT. For each FLIGHT_LEG instance,
SEAT_RESERVATIONs are kept, as are the AIRPLANE used on the leg and the actual arrival
and departure times and airports. An AIRPLANE is identified by an Airplane_id and is of a
particular AIRPLANE_TYPE. CAN_LAND relates AIRPLANE_TYPEs to the AIRPORTs at which
they can land. An AIRPORT is identified by an Airport_code. Consider an update for the AIRLINE
database to enter a reservation on a particular flight or flight leg on a given date.
c. Which of these constraints are key, entity integrity, and referential integrity constraints, and
which are not?
d. Specify all the referential integrity constraints that hold on the schema shown in Figure.
Step-by-step solution
Step 1 of 4
a.
First, it is necessary to check whether seats are available on the particular flight or flight leg on
the given date. This can be done by checking the LEG_INSTANCE relation.
Step 2 of 4
b.
The constraints that need to be checked in order to perform the update are as follows:
• Check whether the particular Seat_number on the particular flight on the particular date is
available or not.
Step 3 of 4
c.
Checking the Seat_number for a particular flight on a particular date comes under the entity
integrity constraint.
Step 4 of 4
d.
A referential integrity constraint specifies that the value of a foreign key must match a value of
the primary key in the referenced relation.
• Flight_number of FARE is a foreign key which references the Flight_number of the FLIGHT
relation.
• Flight_number, Leg_number, and Date of SEAT_RESERVATION are foreign keys which
reference the Flight_number, Leg_number, and Date of the LEG_INSTANCE relation.
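Two of these referential integrity constraints can be sketched as sqlite3 table definitions. The column names and types below are simplified assumptions (for example, Leg_date stands in for the Date attribute, and most non-key columns are omitted):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")
con.execute("CREATE TABLE FLIGHT (Flight_number TEXT PRIMARY KEY)")
con.execute("""CREATE TABLE FARE (
    Flight_number TEXT REFERENCES FLIGHT(Flight_number),
    Fare_code TEXT, Amount REAL,
    PRIMARY KEY (Flight_number, Fare_code))""")
con.execute("""CREATE TABLE LEG_INSTANCE (
    Flight_number TEXT REFERENCES FLIGHT(Flight_number),
    Leg_number INTEGER, Leg_date TEXT,
    PRIMARY KEY (Flight_number, Leg_number, Leg_date))""")
con.execute("""CREATE TABLE SEAT_RESERVATION (
    Flight_number TEXT, Leg_number INTEGER, Leg_date TEXT, Seat_number TEXT,
    PRIMARY KEY (Flight_number, Leg_number, Leg_date, Seat_number),
    -- composite foreign key: a reservation must refer to an existing leg instance
    FOREIGN KEY (Flight_number, Leg_number, Leg_date)
        REFERENCES LEG_INSTANCE (Flight_number, Leg_number, Leg_date))""")

con.execute("INSERT INTO FLIGHT VALUES ('CO123')")
con.execute("INSERT INTO LEG_INSTANCE VALUES ('CO123', 1, '2024-01-01')")
con.execute("INSERT INTO SEAT_RESERVATION VALUES ('CO123', 1, '2024-01-01', '12A')")
print(con.execute("SELECT COUNT(*) FROM SEAT_RESERVATION").fetchone()[0])  # 1
```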
Chapter 5, Problem 13E
Problem
Step-by-step solution
Step 1 of 2
The relation CLASS specifies the uniqueness of and the classes that are taught in the
university.
As per the CLASS relation, the following are the possible candidate keys:
3. – If, at the same given time in a specific semester, the same room cannot be used by more
than one course.
Step 2 of 2
Chapter 5, Problem 14E
Problem
Consider the following six relations for an order-processing database application in a company:
ITEM(Item#, Unit_price)
WAREHOUSE(Warehouse#, City)
Here, Ord_amt refers to total dollar amount of an order; Odate is the date the order was placed;
and Ship_date is the date an order (or part of an order) is shipped from the warehouse. Assume
that an order can be shipped from several warehouses. Specify the foreign keys for this schema,
stating any assumptions you make. What other constraints can you think of for this database?
Step-by-step solution
Step 1 of 2
Foreign Keys:
a. Cust# of ORDER is a FK referencing CUSTOMER: orders are taken from recognized
customers only.
c. Item# of ORDER_ITEM is a FK referencing ITEM: orders are taken only for items in stock.
Step 2 of 2
Other Constraints:
• Ship_date must be greater than (i.e., a later date than) Odate in ORDER: an order must be
taken before it is shipped.
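The Ship_date constraint can be sketched as a CHECK constraint in sqlite3. Placing both dates in a single ORDER table is a simplifying assumption; ISO-formatted date strings compare correctly as text:

```python
import sqlite3

con = sqlite3.connect(":memory:")
# ORDER is a reserved word in SQL, so the table name must be quoted
con.execute("""CREATE TABLE "ORDER" (
    OrderNo INTEGER PRIMARY KEY,
    Odate TEXT,
    Ship_date TEXT,
    CHECK (Ship_date IS NULL OR Ship_date > Odate))""")

con.execute("""INSERT INTO "ORDER" VALUES (1, '2024-01-10', '2024-01-12')""")  # valid
try:
    # shipping before the order date violates the CHECK constraint
    con.execute("""INSERT INTO "ORDER" VALUES (2, '2024-01-10', '2024-01-05')""")
    accepted = True
except sqlite3.IntegrityError:
    accepted = False

print(accepted)  # False: the DBMS rejects the out-of-order dates
```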
Chapter 5, Problem 15E
Problem
Consider the following relations for a database that keeps track of business trips of salespersons
in a sales office:
A trip can be charged to one or more accounts. Specify the foreign keys for this schema, stating
any assumptions you make.
Step-by-step solution
Step 1 of 3
A foreign key is a column, or a composite of columns, that is the primary key of another table
and that is used to maintain the relationship between two tables.
• A foreign key is mainly used for establishing a relationship between two tables.
Step 2 of 3
• Ssn is a foreign key in TRIP relation. It references the Ssn of SALESPERSON relation.
• Trip_id is a foreign key in EXPENSE relation. It references the Trip_id of TRIP relation.
Step 3 of 3
Assume that there are additional tables that store the department information and the account
details. Then the possible foreign keys are as follows:
Chapter 5, Problem 16E
Problem
Consider the following relations for a database that keeps track of student enrollment in courses
and the books adopted for each course:
Specify the foreign keys for this schema, stating any assumptions you make.
Step-by-step solution
Step 1 of 2
A foreign key is a column, or a composite of columns, that is the primary key of another table
and that is used to maintain the relationship between two tables.
• A foreign key is mainly used for establishing a relationship between two tables.
Step 2 of 2
• Ssn is a foreign key in the ENROLL table which references the Ssn of the STUDENT table.
Ssn is a primary key in the STUDENT table.
• Course# is a foreign key in the ENROLL table which references the Course# of the COURSE
table. Course# is a primary key in the COURSE table.
Chapter 5, Problem 17E
Problem
Consider the following relations for a database that keeps track of automobile sales in a car
dealership (OPTION refers to some optional equipment installed on an automobile):
First, specify the foreign keys for this schema, stating any assumptions you make. Next, populate
the relations with a few sample tuples, and then give an example of an insertion in the SALE and
SALESPERSON relations that violates the referential integrity constraints and of another
insertion that does not.
Step-by-step solution
Step 1 of 4
a. Serial_no of OPTION is a FK referencing CAR: optional equipment can be added only to a car
with a valid serial number.
b. Serial_no of SALE is a FK referencing CAR: only a car with a valid serial number can be put
up for sale.
Step 2 of 4
CAR:
1 1987 Ford 7
2 1998 Tata 4
3 1988 Ferrari 20
4 1952 Ford 2
OPTION:
2 Abc 200
4 def 400
Step 3 of 4
SALESPERSON:
Step 4 of 4
An insertion into SALESPERSON cannot violate a referential integrity constraint, since
SALESPERSON contains no foreign key. A valid insertion for SALESPERSON can be:
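The violating and non-violating insertions can be sketched with sqlite3; the column names below (for example, Sid for the salesperson id) are assumptions, since the full schema is not reproduced here:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")
con.execute("CREATE TABLE CAR (Serial_no INTEGER PRIMARY KEY, Model TEXT)")
con.execute("CREATE TABLE SALESPERSON (Sid INTEGER PRIMARY KEY, Name TEXT)")
con.execute("""CREATE TABLE SALE (
    Serial_no INTEGER REFERENCES CAR(Serial_no),
    Sid INTEGER REFERENCES SALESPERSON(Sid),
    Sale_date TEXT)""")
con.execute("INSERT INTO CAR VALUES (1, 'Ford')")
con.execute("INSERT INTO SALESPERSON VALUES (7, 'Abc')")

# Valid insertion: both foreign key values exist in the referenced relations
con.execute("INSERT INTO SALE VALUES (1, 7, '2024-01-01')")

# Violating insertion: no car with Serial_no 99 exists
try:
    con.execute("INSERT INTO SALE VALUES (99, 7, '2024-01-02')")
    fk_ok = True
except sqlite3.IntegrityError:
    fk_ok = False
print(fk_ok)  # False: the violating insertion is rejected
```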
Chapter 5, Problem 18E
Problem
Database design often involves decisions about the storage of attributes. For example, a Social
Security number can be stored as one attribute or split into three attributes (one for each of the
three hyphen-delineated groups of numbers in a Social Security number—XXX-XX-XXXX).
However, Social Security numbers are usually represented as just one attribute. The decision is
based on how the database will be used. This exercise asks you to think about specific situations
where dividing the SSN is useful.
Step-by-step solution
Step 1 of 2
Usually, during database design, the Social Security number (SSN) is stored as a single
attribute.
Step 2 of 2
The situations where it is preferable to store the SSN as parts instead of as a single attribute
are as follows:
• The area number determines the location or state. In some cases, it is necessary to group the
data based on the location to generate some statistical information.
• Analogously, for phone numbers, the area (or city) code, and sometimes the country code, is
needed for dialing international phone numbers.
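Splitting the single SSN attribute into its three hyphen-delimited groups can be sketched in Python (treating the format XXX-XX-XXXX as area-group-serial):

```python
# Split the single SSN attribute "XXX-XX-XXXX" into its three groups
def split_ssn(ssn: str) -> dict:
    area, group, serial = ssn.split("-")
    return {"area": area, "group": group, "serial": serial}

parts = split_ssn("123-45-6789")
print(parts["area"])  # '123': the area number, usable for grouping by location
```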
Chapter 5, Problem 19E
Problem
Consider a STUDENT relation in a UNIVERSITY database with the following attributes (Name,
Ssn, Local_phone, Address, Cell_phone, Age, Gpa). Note that the cell phone may be from a
different city and state (or province) from the local phone. A possible tuple of the relation is
shown below:
a. Identify the critical missing information from the Local_phone and Cell_phone attributes. (Hint:
How do you call someone who lives in a different state or province?)
b. Would you store this additional information in the Local_phone and Cell_phone attributes or
add new attributes to the schema for STUDENT?
c. Consider the Name attribute. What are the advantages and disadvantages of splitting this field
from one attribute into three attributes (first name, middle name, and last name)?
d. What general guideline would you recommend for deciding when to store information in a
single attribute and when to split the information?
e. Suppose the student can have between 0 and 5 phones. Suggest two different designs that
allow this type of information.
Step-by-step solution
Step 1 of 5
Step 2 of 5
b. Since the cell phone and the local phone can be from a different city or state, the additional
information must be added to the Local_phone and Cell_phone attributes.
Step 3 of 5
c. If Name is split into the three attributes First_name, Middle_name, and Last_name, there can
be the following advantages:
• Sorting can be done on the basis of the first name, last name, or middle name.
Disadvantages:
• Splitting the single attribute into three attributes may increase the number of NULL values in
the database (if some students do not have a middle name).
• Extra memory will be consumed for storing the NULL values of attributes that may not exist for
a particular student (e.g., the middle name).
Step 4 of 5
d.
• When storing information in different attributes would create NULL values, a single attribute
should be preferred.
• When atomicity cannot be maintained using a single attribute, different attributes must be
used.
• When information needs to be sorted on the basis of some sub-field of an attribute, or when
any sub-field is needed for decision making, the single attribute must be split into many.
e.
Step 5 of 5
First Design:
Second Design:
Although the schema can be designed in either of the two ways, the first design is better than
the second, as it leaves a smaller number of NULL values.
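The two designs can be sketched as sqlite3 schemas; all table and column names below are illustrative assumptions:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")

# First design: a separate phone relation, one row per phone (0..5 rows per student)
con.execute("CREATE TABLE STUDENT (Ssn TEXT PRIMARY KEY, Name TEXT)")
con.execute("""CREATE TABLE STUDENT_PHONE (
    Ssn TEXT REFERENCES STUDENT(Ssn),
    Phone TEXT,
    PRIMARY KEY (Ssn, Phone))""")

# Second design: five nullable phone columns inside the student relation itself
con.execute("""CREATE TABLE STUDENT2 (
    Ssn TEXT PRIMARY KEY, Name TEXT,
    Phone1 TEXT, Phone2 TEXT, Phone3 TEXT, Phone4 TEXT, Phone5 TEXT)""")

con.execute("INSERT INTO STUDENT VALUES ('111', 'Ram')")
con.execute("INSERT INTO STUDENT_PHONE VALUES ('111', '555-1234')")
# A student with no phone simply has no STUDENT_PHONE rows and no NULLs,
# whereas STUDENT2 would store four NULLs for a one-phone student.
print(con.execute("SELECT COUNT(*) FROM STUDENT_PHONE").fetchone()[0])  # 1
```

Note that neither schema by itself enforces the upper bound of five phones in the first design; that would need a trigger or an application-level check, whereas the second design caps the count structurally at the cost of NULL values.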
Chapter 5, Problem 20E
Problem
Recent changes in privacy laws have disallowed organizations from using Social Security
numbers to identify individuals unless certain restrictions are satisfied. As a result, most U.S.
universities cannot use SSNs as primary keys (except for financial data). In practice, Student_id,
a unique identifier assigned to every student, is likely to be used as the primary key rather than
SSN since Student_id can be used throughout the system.
a. Some database designers are reluctant to use generated keys (also known as surrogate keys)
for primary keys (such as Student_id) because they are artificial. Can you propose any natural
choices of keys that can be used to identify the student record in a UNIVERSITY database?
b. Suppose that you are able to guarantee uniqueness of a natural key that includes last name.
Are you guaranteed that the last name will not change during the lifetime of the database? If last
name can change, what solutions can you propose for creating a primary key that still includes
last name but remains unique?
c. What are the advantages and disadvantages of using generated (surrogate) keys?
Step-by-step solution
Step 1 of 1
(a)
Some operation on the student's name and original local and cell phone numbers can jointly be
used to generate an identifier for the student.
For example, for the 57th entry into the system, the unique identifier could be:
GeorgeGWE_Edwards_555-123430_57.
Assumptions: each student has a different local number unless they share the same address,
and two students with the same address will not have the same name.
A hash operation over various fields can also be used to generate the key.
(b)
If the natural key includes the last name, and the last name can change, we can add a column
called "original last name" that never changes and use it, together with the other key fields, for
identification.
(c)
Immutability:
• Surrogate keys do not change while the row exists. This has two advantages:
• Database applications will not lose their "handle" on a row when its data changes.
• Many database systems do not support cascading updates of keys across foreign keys of
related tables, which makes modifying primary key data difficult; an immutable surrogate key
avoids this problem.
Because of changing requirements, the attributes that uniquely identify an entity might change. In
that case, the attribute(s) initially chosen as the natural key will no longer be a suitable natural
key.
Example :
An employee ID is chosen as the natural key of an employee DB. Because of a merger with
another company, new employees from the merged company must be inserted, who have
conflicting IDs (as their IDs were independently generated when the companies were Separate).
In these cases, generally a new attribute must be added to the natural key (e.g. an attribute
"original_company"). With a surrogate key, only the table that defines the surrogate key must be
changed. With natural keys, all tables (and possibly other, related software) that use the natural
key will have to change. More generally, in some problem domains it is simply not clear what
might be a suitable natural key. Surrogate keys avoid problems from choosing a natural key that
later turns out to be incorrect.
Performance
Often surrogate keys are composed of a compact data type, such as a four-byte integer. This
allows the database to execute queries faster than it could with multiple wider columns.
• If the natural key is a compound key, joining is more expensive as there are multiple columns to
compare. Surrogate keys are always contained in a single column.
Compatibility
Disassociation
Because the surrogate key is completely unrelated to the data of the row to
which it is attached, the key is disassociated from that row. Disassociated
keys are unnatural to the application's world, resulting in an additional level of
indirection from which to audit.
Query Optimization
Normalization
Because surrogate keys are unnatural, flaws can appear when modeling the
business requirements. Business requirements, relying on the natural key,
then need to be translated to the surrogate key.
Inadvertent Disclosure
Inadvertent Assumptions
Sequentially generated surrogate keys create the illusion that events with a
higher primary key value occurred after events with a lower primary key value.
This illusion appears when an event is missed during normal data
entry and is instead inserted after subsequent events. The solution to the
inadvertent-assumption problem is to generate a random primary key;
however, a randomly generated key must be checked for duplicates before
being assigned, or the insert may be rejected.
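The immutability point above can be made concrete. The sketch below uses SQLite via Python's sqlite3; the table and column names are illustrative, not the textbook's exact schema. An INTEGER PRIMARY KEY column in SQLite acts as an auto-generated surrogate key, while the natural candidate key (SSN) is kept as an alternate key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE STUDENT (
    Student_id INTEGER PRIMARY KEY,   -- surrogate key, generated by the DBMS
    Ssn        TEXT UNIQUE,           -- natural candidate key, kept as an alternate key
    Last_name  TEXT NOT NULL)""")
conn.execute("INSERT INTO STUDENT (Ssn, Last_name) VALUES ('123-45-6789', 'Edwards')")
conn.execute("INSERT INTO STUDENT (Ssn, Last_name) VALUES ('987-65-4321', 'Smith')")
# The surrogate key stays stable even when the natural attributes change:
conn.execute("UPDATE STUDENT SET Last_name = 'Edwards-Lee' WHERE Student_id = 1")
ids = [row[0] for row in conn.execute("SELECT Student_id FROM STUDENT ORDER BY Student_id")]
```

Applications that stored Student_id 1 still refer to the same row after the name change.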
Chapter 6, Problem 1RQ
Problem
How do the relations (tables) in SQL differ from the relations defined formally in Chapter 3?
Discuss the other differences in terminology. Why does SQL allow duplicate tuples in a table or in
a query result?
Step-by-step solution
Step 1 of 1
SQL allows a table (relation) to have two or more tuples that are identical in all their attribute
values. Hence, in general, an SQL table is not a set of tuples, because a set does not allow two
identical members; rather, it is a multiset of tuples. Some SQL tables are constrained to be sets
because a key constraint has been declared or because the DISTINCT option has been used in
the SELECT statement.
In contrast, a formally defined relation is a set of tuples; that is, duplicate tuples are not allowed.
The correspondence between the ER and relational models can help in understanding the other
differences in terminology.
SQL allows duplicates because duplicate elimination is an expensive operation, and when an
aggregate function is applied to tuples, in most cases the user does not want duplicates
removed.
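The multiset behavior is easy to observe. The following sketch (SQLite via Python's sqlite3; the table name is illustrative) shows that a table with no declared key accepts identical tuples, and that DISTINCT constrains the result back to a set:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (A INTEGER)")  # no key declared: duplicates allowed
conn.executemany("INSERT INTO T VALUES (?)", [(1,), (1,), (2,)])
all_rows = [r[0] for r in conn.execute("SELECT A FROM T ORDER BY A")]
distinct_rows = [r[0] for r in conn.execute("SELECT DISTINCT A FROM T ORDER BY A")]
```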
Chapter 6, Problem 2RQ
Problem
List the data types that are allowed for SQL attributes.
Step-by-step solution
Step 1 of 1
Numeric (e.g., INTEGER, FLOAT, DECIMAL)
Character string (e.g., CHAR, VARCHAR)
Bit string (e.g., BIT, BIT VARYING)
Boolean
Date and time (e.g., DATE, TIME, TIMESTAMP, INTERVAL)
Chapter 6, Problem 3RQ
Problem
How does SQL allow implementation of the entity integrity and referential integrity constraints
described in Chapter 3? What about referential triggered actions?
Step-by-step solution
Step 1 of 6
An entity integrity constraint specifies that every table must have a primary key and the primary
key should contain unique values and cannot contain null values.
SQL allows implementation of the entity integrity constraint using PRIMARY KEY clause.
• The PRIMARY KEY clause must be specified at the time of creating a table.
Step 2 of 6
Following are the examples to illustrate how the entity integrity constraint is implemented in SQL:
CREATE TABLE BOOKS (
BOOK_ID INT PRIMARY KEY,
BOOK_TITLE VARCHAR(20),
BOOK_PRICE INT );
CREATE TABLE AUTHOR (
AUTHOR_ID INT PRIMARY KEY,
AUTHOR_NAME VARCHAR(20));
Step 3 of 6
A foreign key is an attribute (or a set of attributes) that refers to the primary key of another table
and is used to maintain a relationship between the two tables.
A referential integrity constraint specifies that the value of a foreign key must match a value of
the primary key in the referenced (primary) table.
SQL allows implementation of the referential integrity constraint using FOREIGN KEY clause.
• The FOREIGN KEY clause must be specified at the time of creating a table.
• It ensures that it is not possible to add a value to a foreign key which does not exist in the
primary key of the primary/linked table.
Step 4 of 6
Following is the example to illustrate how the referential integrity constraint is implemented in
SQL:
BOOK_TYPE VARCHAR(20),
In the table BOOKSTORE, BOOK_CODE, AUTHOR_ID together form the primary key.
The use of the foreign key BOOK_CODE is that it is not possible to add a tuple to BOOKSTORE
table unless there is a valid BOOK_CODE in the BOOKS table.
The use of the foreign key AUTHOR_ID is that it is not possible to add a tuple to BOOKSTORE
table unless there is a valid AUTHOR_ID in the AUTHOR table.
Step 5 of 6
When a foreign key constraint is violated, the default action performed by SQL is to reject the
operation.
• The options provided with a referential triggered action are SET NULL, SET DEFAULT, and
CASCADE.
Step 6 of 6
Following is the example to illustrate how the referential triggered action is implemented in SQL:
ENAME VARCHAR(20),
JOB VARCHAR(20),
SALARY INT,
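The enforcement of a foreign key, and the CASCADE triggered action, can be sketched end to end (SQLite via Python's sqlite3; the BOOKS/BOOKSTORE tables here are illustrative stand-ins, and SQLite enforces foreign keys only when the pragma is enabled):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.execute("CREATE TABLE BOOKS (BOOK_CODE INTEGER PRIMARY KEY, BOOK_TITLE TEXT)")
conn.execute("""CREATE TABLE BOOKSTORE (
    BOOK_CODE INTEGER,
    FOREIGN KEY (BOOK_CODE) REFERENCES BOOKS (BOOK_CODE) ON DELETE CASCADE)""")
conn.execute("INSERT INTO BOOKS VALUES (1, 'Databases')")
conn.execute("INSERT INTO BOOKSTORE VALUES (1)")
try:
    conn.execute("INSERT INTO BOOKSTORE VALUES (99)")  # no such BOOK_CODE
    violated = False
except sqlite3.IntegrityError:
    violated = True   # referential integrity rejected the insert
conn.execute("DELETE FROM BOOKS WHERE BOOK_CODE = 1")  # triggers the cascade
remaining = conn.execute("SELECT COUNT(*) FROM BOOKSTORE").fetchone()[0]
```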
Chapter 6, Problem 4RQ
Problem
Describe the four clauses in the syntax of a simple SQL retrieval query. Show what type of
constructs can be specified in each of the clauses. Which are required and which are optional?
Step-by-step solution
Step 1 of 1
The following are the four clauses of a simple SQL retrieval query.
Select:
• It specifies the attributes to retrieve and, combined with the From clause, extracts data from
the database in a human-readable form.
• It is required.
From:
• The From clause is used in combination with the Select statement to retrieve the data.
• It tells the database which table(s) to retrieve the data from; multiple tables can be listed in the
From clause.
• It is required.
Where:
• It is used to impose conditions on the query and remove the rows (tuples) that do not satisfy
the condition.
• More than one condition can be used in the Where clause.
• It is optional.
Order By:
• This clause sorts the values of the output either in ascending or descending order.
• It is optional.
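The four clauses can be shown in one query. A minimal sketch (SQLite via Python's sqlite3; the STUDENT table and its rows are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE STUDENT (Name TEXT, Class INTEGER)")
conn.executemany("INSERT INTO STUDENT VALUES (?, ?)",
                 [("Smith", 1), ("Brown", 2), ("Adams", 2)])
rows = conn.execute("""
    SELECT Name            -- required: attributes to retrieve
    FROM STUDENT           -- required: source table(s)
    WHERE Class = 2        -- optional: selection condition
    ORDER BY Name          -- optional: sort order
""").fetchall()
names = [r[0] for r in rows]
```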
Chapter 6, Problem 5E
Problem
Consider the database shown in Figure 1.2, whose schema is shown in Figure 2.1. What are the
referential integrity constraints that should hold on the schema? Write appropriate SQL DDL
statements to define the database.
Step-by-step solution
Step 1 of 2
From Figure 1.2 in the textbook, the referential integrity constraints that should hold are of the
form R.(A1, ..., An) → S.(B1, ..., Bn), which represents a foreign key from the attributes A1, ...,
An of the referencing relation R to the primary key of the referenced relation S.
Step 2 of 2
StudentNumber INTEGER NOT NULL, Class CHAR NOT NULL, Major CHAR(4),
COURSE (CourseNumber) );
COURSE (CourseNumber) );
SECTION (SectionIdentifier) );
Chapter 6, Problem 6E
Problem
Consider the database shown in Figure 1.2, whose schema is shown in Figure 2.1. What are the
referential integrity constraints that should hold on the schema? Write appropriate SQL DDL
statements to define the database.
Step-by-step solution
Step 1 of 10
Below referential integrity constraints for the AIR LINE data base schema is based on the figure
2.1 from the text book.
FLIGHT_LEG.(FLIGHT_NUMBER, LEG_NUMBER)
Step 2 of 10
CREATE TABLE AIRPORT (AIRPORT_CODE CHAR (3) NOT NULL, NAME VARCHAR (30) NOT NULL, CITY
VARCHAR (30) NOT NULL, STATE VARCHAR (30), PRIMARY KEY (AIRPORT_CODE) );
Step 3 of 10
CREATE TABLE FLIGHT (NUMBER VARCHAR (6) NOT NULL, AIRLINE VARCHAR (20) NOT
NULL, WEEKDAYS VARCHAR (10) NOT NULL, PRIMARY KEY (NUMBER));
Step 6 of 10
FARE_CODE VARCHAR (10) NOT NULL, AMOUNT DECIMAL (8, 2) NOT NULL,
Chapter 6, Problem 7E
Problem
Consider the LIBRARY relational database schema shown in Figure. Choose the appropriate
action (reject, cascade, set to NULL, set to default) for each referential integrity constraint, both
for the deletion of a referenced tuple and for the update of a primary key attribute value in a
referenced tuple. Justify your choices.
Step-by-step solution
Step 1 of 7
The appropriate actions of the LIBRARY relational database schema are as follows:
• The REJECT action will not permit the automatic changes in the LIBRARY database.
• If the BOOK is deleted the CASCADE on DELETE action is automatically propagated to the
rows of the referenced relation BOOK_AUTHORS.
• If the BOOK is updated the CASCADE on UPDATE action is automatically propagated to the
rows of the referenced relation BOOK_AUTHORS.
Therefore, the CASCADE on DELETE and CASCADE on UPDATE actions are chosen for the
above referential integrity.
Step 2 of 7
• It is not possible to delete rows in the PUBLISHER relation while they are referenced by rows
in the BOOK table.
Therefore, the ON DELETE REJECT and CASCADE on UPDATE actions are chosen for the
above referential integrity.
Step 3 of 7
• If the BOOK is deleted the CASCADE on DELETE action is automatically propagated to the
rows of the referenced relation BOOK_LOANS.
• If the BOOK is updated the CASCADE on UPDATE action is automatically propagated to the
rows of the referenced relation BOOK_LOANS.
• It is not possible to delete rows in the BOOK relation while they are referenced by rows in the
BOOK_LOANS table.
Step 4 of 7
• If a BOOK is deleted, then delete all its associated rows in the relation BOOK_COPIES.
• If the BOOK is deleted the CASCADE on DELETE action is automatically propagated to the
rows of the referenced relation BOOK_COPIES.
• If the BOOK is updated the CASCADE on UPDATE action is automatically propagated to the
rows of the referenced relation BOOK_COPIES.
Step 5 of 7
• If the rows deleted in a BORROWER table, the CASCADE on DELETE action is automatically
propagated to the rows of the referenced relation BOOK_LOANS.
• If the CardNo is updated in the BORROWER table, the CASCADE on UPDATE action is
automatically propagated to the rows of the referenced relation BOOK_LOANS.
• It is not possible to delete rows in the BORROWER relation while they are referenced by rows
in the BOOK_LOANS table.
Step 6 of 7
• If the Branch_id is updated in the LIBRARY_BRANCH table, the CASCADE on UPDATE action
is automatically propagated to the rows of the referenced relation BOOK_COPIES.
• It is not possible to delete rows in the LIBRARY_BRANCH relation while they are referenced
by rows in the BOOK_COPIES table.
Step 7 of 7
• If the Branch_id is updated in the LIBRARY_BRANCH table, the CASCADE on UPDATE action
is automatically propagated to the rows of the referenced relation BOOK_LOANS.
• It is not possible to delete rows in the LIBRARY_BRANCH relation while they are referenced
by rows in the BOOK_LOANS table.
Chapter 6, Problem 8E
Problem
Write appropriate SQL DDL statements for declaring the LIBRARY relational database schema of
Figure. Specify the keys and referential triggered actions.
Step-by-step solution
Step 1 of 7
Set of statements for the LIBRARY relational schema from the figure 6.14 in the text book. The
CREATE TABLE is like this:
CREATE TABLE BOOK ( BookId CHAR(20) NOT NULL, Title VARCHAR(30) NOT NULL,
PublisherName VARCHAR(20), PRIMARY KEY (BookId), FOREIGN KEY (PublisherName)
REFERENCES PUBLISHER (Name) ON UPDATE CASCADE );
Step 3 of 7
CREATE TABLE PUBLISHER ( Name VARCHAR(20) NOT NULL, Address VARCHAR(40) NOT
NULL, Phone CHAR(12), PRIMARY KEY (Name) );
Step 4 of 7
CREATE TABLE BOOK_COPIES ( BookId CHAR(20) NOT NULL, BranchId INTEGER NOT
NULL, No_Of_Copies INTEGER NOT NULL, PRIMARY KEY (BookId, BranchId), FOREIGN KEY
(BookId) REFERENCES BOOK (BookId)
Step 5 of 7
CREATE TABLE BORROWER ( CardNo INTEGER NOT NULL, Name VARCHAR(30) NOT
NULL, Address VARCHAR(40) NOT NULL, Phone CHAR(12),
Step 6 of 7
CREATE TABLE BOOK_LOANS ( CardNo INTEGER NOT NULL, BookId CHAR(20) NOT NULL,
BranchId INTEGER NOT NULL, DateOut DATE NOT NULL,
Chapter 6, Problem 9E
Problem
How can the key and foreign key constraints be enforced by the DBMS? Is the enforcement
technique you suggest difficult to implement? Can the constraint checks be executed efficiently
when updates are applied to the database?
Step-by-step solution
Step 1 of 3
Key constraint:
The technique that is often used to check efficiently for the key constraint is to create an index on
the combination of attributes that form each key (primary or secondary).
• Before inserting a new record (tuple), each index is searched to check that no value currently
exists in the index that matches the key value in the new record.
The technique to check the foreign key constraint is that using the index on the primary key of
each referenced relation will make the check relatively efficient.
Whenever a new record is inserted in a referencing relation, its foreign key value is used to
search the index for the primary key of the referenced relation, and if the referenced record
exists, then the new record can be successfully inserted in the referencing relation.
For deletion of a referenced record, it is useful to have an index on the foreign key of each
referencing relation so as to be able to determine efficiently whether any records reference the
record being deleted.
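The index-based key check described above can be sketched as follows (SQLite via Python's sqlite3; the COURSE table is illustrative). A unique index lets the DBMS reject a duplicate key with an index lookup rather than a linear scan:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE COURSE (CourseNumber TEXT, Title TEXT)")
# A unique index on the key attributes: inserts are checked against it.
conn.execute("CREATE UNIQUE INDEX idx_course ON COURSE (CourseNumber)")
conn.execute("INSERT INTO COURSE VALUES ('CS101', 'Intro')")
try:
    conn.execute("INSERT INTO COURSE VALUES ('CS101', 'Duplicate')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True  # key constraint enforced via the index
```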
Step 2 of 3
Yes, the enforcement technique of using an index makes it easy to identify duplicate data
records.
• If no access structure such as an index (or hashing) is available on the key attributes, checking
a constraint requires a linear search through the records, which makes the checks quite
inefficient.
Step 3 of 3
, the constraint checks are executed efficiently while inserting or deleting the record from
the database.
• Using the index to enforce the key constraint avoids the duplication of data records and this
helps the product vendors to achieve the greater data storage and management.
Chapter 6, Problem 10E
Problem
Specify the following queries in SQL on the COMPANY relational database schema shown in
Figure 5.5. Show the result of each query if it is applied to the COMPANY database in Figure 5.6.
a. Retrieve the names of all employees in department 5 who work more than 10 hours per week
on the ProductX project.
b. List the names of all employees who have a dependent with the same first name as
themselves.
c. Find the names of all employees who are directly supervised by ‘Franklin Wong’.
Step-by-step solution
Step 1 of 9
a)
Query:
SELECT emp.Fname, emp.Lname
FROM EMPLOYEE emp, WORKS_ON w, PROJECT p
WHERE emp.Dno = 5 AND emp.Ssn = w.Essn AND w.Pno = p.Pnumber
AND p.Pname = 'ProductX' AND w.Hours > 10;
Step 2 of 9
Result:
Fname Lname
John Smith
Joyce English
Step 3 of 9
Explanation:
The above query will display the names of all employees of department “5” and who works more
than 10 hours per week on the project “Product X”.
Step 4 of 9
b)
Query:
Step 5 of 9
Result: (empty)
Fname Lname
Step 6 of 9
Explanation:
The above query displays the names of all employees who have a dependent with the same first
name as themselves.
• Here the result is empty, because no dependent in the DEPENDENT table has the same first
name as the corresponding employee.
Step 7 of 9
c)
Query:
Step 8 of 9
Fname Lname
John Smith
Ramesh Narayan
Joyce English
Step 9 of 9
Explanation:
The above query uses self-join to display the names of all the employees who are under the
supervision of Franklin Wong.
Chapter 6, Problem 11E
If the same entity type participates more than once in a relationship type in different roles, the
relationship type is called a recursive relationship. It occurs within unary relationships. The
relationship may be one-to-one, one-to-many, or many-to-many; that is, the degree of the
relationship is unary and the connectivity may be 1:1, 1:M, or M:N.
For example, in the figure below, REPORTS_TO is a recursive relationship in which the Employee
entity type plays two roles: 1) Supervisor and 2) Subordinate.
The relationship can also be described as the relationship between a manager and an employee:
a manager is an employee as well.
To implement a recursive relationship, a foreign key holding the employee's manager number is
stored in each employee record.
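The manager-number foreign key and the self-join that follows from it can be sketched like this (SQLite via Python's sqlite3; the EMPLOYEE table and its rows are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE EMPLOYEE (
    Empno INTEGER PRIMARY KEY,
    Name  TEXT,
    MgrNo INTEGER REFERENCES EMPLOYEE (Empno))""")  # self-referencing foreign key
conn.executemany("INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
                 [(1, "Wong", None), (2, "Smith", 1), (3, "Narayan", 1)])
# Self-join: the same table plays the roles of subordinate (E) and supervisor (S).
rows = conn.execute("""
    SELECT E.Name, S.Name
    FROM EMPLOYEE E JOIN EMPLOYEE S ON E.MgrNo = S.Empno
    ORDER BY E.Name""").fetchall()
```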
Problem
Specify the following queries in SQL on the database schema of Figure 1.2.
a. Retrieve the names of all senior students majoring in ‘cs’ (computer science).
b. Retrieve the names of all courses taught by Professor King in 2007 and 2008.
c. For each section taught by Professor King, retrieve the course number, semester, year, and
number of students who took the section.
d. Retrieve the name and transcript of each senior student (Class = 4) majoring in CS. A
transcript includes course name, course number, credit hours, semester, year, and grade for
each course completed by the student.
Step-by-step solution
Step 1 of 4
a.
Query:
Output:
Explanation:
• There are no rows in the database where Class is Senior, and Major is CS.
• SELECT is used to query the database and get back the specified fields.
• FROM is used to query the database and get back the preferred information by specifying the
table name.
• WHERE is used to specify a condition based on which the data is to be retrieved. In the
database, Seniors are represented by Class 4. The condition is as follows:
o Major = 'CS' AND Class = 4
Step 2 of 4
b.
The query to get the course name that are taught by professor King in year 2007 and 2008 is as
follows:
Query:
SELECT Course_name
Output :
Explanation:
• SELECT is used to query the database and get back the specified fields.
• FROM is used to query the database and get back the preferred information by specifying the
table name.
• WHERE is used to specify a condition based on which the data is to be retrieved. The
conditions are as follows:
o COURSE.Course_number = SECTION.Course_number
o Instructor = 'King'
o (Year='07' or Year='08')
• The conditions are concatenated with AND operator. All the conditions must be satisfied.
Step 3 of 4
c.
The query to retrieve the course number, Semester, Year and number of students who took the
section taught by professor King is as follows:
Query:
AND S.Section_identifier=G.Section_identifier;
Output :
Explanation:
• SELECT is used to query the database and get back the specified fields.
• FROM is used to query the database and get back the preferred information by specifying the
table name.
o S.Instructor= 'King'
o S.Section_identifier=G.Section_identifier
Step 4 of 4
d.
The query to display the name and transcript of each senior students majoring in CS is as
follows:
Query:
Output :
No rows selected.
Explanation:
• SELECT is used to query the database and get back the specified fields.
• FROM is used to query the database and get back the preferred information by specifying the
table name.
• WHERE is used to specify a condition based on which the data is to be retrieved. The
conditions are as follows:
o Class = 4
o Major='CS'
o ST.Student_number= G.Student_number
o G.Section_identifier= S.Section_identifier
o S.Course_number= C.Course_number
Chapter 6, Problem 13E
Problem
Write SQL update statements to do the following on the database schema shown in Figure 1.2.
d. Delete the record for the student whose name is ‘Smith’ and whose student number is 17.
Step-by-step solution
Step 1 of 4
a.
Query:
Explanation:
Output:
Step 2 of 4
b.
The query to update the class of a student with name Smith to 2 is as follows:
Query:
UPDATE STUDENT
SET CLASS = 2
WHERE Name='Smith';
Explanation:
Output:
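The UPDATE above can be run as a minimal sketch (SQLite via Python's sqlite3; the STUDENT rows are illustrative sample data):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE STUDENT (Name TEXT, StudentNumber INTEGER, Class INTEGER)")
conn.executemany("INSERT INTO STUDENT VALUES (?, ?, ?)",
                 [("Smith", 17, 1), ("Brown", 8, 2)])
# Only rows matching the WHERE condition are updated.
conn.execute("UPDATE STUDENT SET Class = 2 WHERE Name = 'Smith'")
new_class = conn.execute(
    "SELECT Class FROM STUDENT WHERE Name = 'Smith'").fetchone()[0]
```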
Step 3 of 4
c.
Query:
Explanation:
Output:
Step 4 of 4
d.
Query:
Explanation:
Output:
Chapter 6, Problem 14E
Problem
b. Specify a number of queries in SQL that are needed by your database application.
c. Based on your expected use of the database, choose some attributes that should have
indexes specified on them.
Step-by-step solution
Step 1 of 6
Consider a student database that stores the information about students, courses and faculty.
a.
DOB date,
Gender char
);
The DDL statement to add a primary key to the relation STUDENT is as follows:
);
);
);
DateQualified varchar(12),
);
The DDL statement to add a column GradePoints to the relation COURSE is as follows:
Step 2 of 6
b.
A wide number of queries can be written using the five relations based on the requirement of the
user. So, the number of queries is not fixed and will vary.
Some of the possible queries that are needed by the database application are as follows:
SELECT *
FROM STUDENT;
SELECT *
FROM FACULTY;
SELECT *
FROM COURSE;
SELECT *
FROM TEACHES;
The query to retrieve the names of the students who have registered for a course is as follows:
WHERE STUDENT.StudentID=REGISTRATION.StudentID;
Step 3 of 6
The query to retrieve the courses with grade point 3 and above is as follows:
Step 4 of 6
c.
Indexes are used for faster retrieval of data. Some of the attributes that can used as indexes are
as follows:
Step 5 of 6
d.
Step 6 of 6
Chapter 6, Problem 15E
Problem
Consider that the EMPLOYEE table’s constraint EMPSUPERFK as specified in Figure 6.2 is
changed to read as follows:
a. What happens when the following command is run on the database state shown in Figure 5.6?
Step-by-step solution
Step 1 of 2
a)
From Figure 6.2 in the textbook, with the EMPLOYEE table constraint EMPSUPERFK changed as
specified, and the database state of Figure 5.6, the result is as follows:
The James E. Borg row is deleted, and every employee who has him as supervisor is deleted as
well (and their supervisees, and so on). In total, 8 rows are deleted and the table is empty.
Step 2 of 2
b)
Yes, it is better to SET NULL, since an employee is not fired (deleted) when their supervisor is
deleted. Instead, their Super_ssn should be set to NULL so that a new supervisor can be
assigned later.
Chapter 6, Problem 16E
Problem
Write SQL statements to create a table EMPLOYEE_BACKUP to back up the EMPLOYEE table
shown in Figure 5.6.
Step-by-step solution
Step 1 of 4
Step 1:
Create the EMPLOYEE table using the CREATE TABLE command.
Step2:
Insert the data into the EMPLOYEE table using INSERT command.
INSERT INTO EMPLOYEE VALUES ('James', 'E', 'Borg', '888665555', DATE '1937-11-10', '450
Stone, Houston, TX', 'M', 55000, NULL, 1);
INSERT INTO EMPLOYEE VALUES ('Jennifer', 'S', 'Wallace', '987654321', DATE '1941-06-20',
'291 Berry, Bellaire, Tx', 'F', 37000, '888665555', 4);
INSERT INTO EMPLOYEE VALUES ('Franklin', 'T', 'Wong', '333445555', DATE '1955-12-08',
'638 Voss, Houston, TX', 'M', 40000, '888665555', 5);
INSERT INTO EMPLOYEE VALUES ('John', 'B', 'Smith', '123456789', DATE '1965-01-09', '731
Fondren, Houston, TX', 'M', 30000, '333445555', 5);
INSERT INTO EMPLOYEE VALUES ('Alicia', 'J', 'Zelaya', '999887777', DATE '1968-01-19', '3321
castle, Spring, TX', 'F', 25000, '987654321', 4);
INSERT INTO EMPLOYEE VALUES ('Ramesh', 'K', 'Narayan', '666884444', DATE '1920-09-15',
'975 Fire Oak, Humble, TX', 'M', 38000, '333445555', 5);
INSERT INTO EMPLOYEE VALUES ('Joyce', 'A', 'English', '453453453', DATE '1972-07-31',
'5631 Rice, Houston, TX', 'F', 25000, '333445555', 5);
INSERT INTO EMPLOYEE VALUES ('Ahmad', 'V', 'Jabbar', '987987987', DATE '1969-03-29',
'980 Dallas, Houston, TX', 'M', 22000, '987654321', 4);
INSERT INTO EMPLOYEE VALUES ('Melissa', 'M', 'Jones', '808080808', DATE '1970-07-10',
'1001 Western, Houston, TX', 'F', 27500, '333445555', 5);
Step3:
Sample Output:
Step 2 of 4
The SQL statements to create a table EMPLOYEE_BACKUP to store the backup data of
EMPLOYEE table is as follows:
Explanation:
• The SQL statement will create the table EMPLOYEE_BACKUP with the same structure as the
table EMPLOYEE.
• LIKE is the keyword used to copy the structure of the table EMPLOYEE.
Step 3 of 4
Explanation:
• The SQL statement inserts the data from the table EMPLOYEE into the table
EMPLOYEE_BACKUP.
Step 4 of 4
SELECT * FROM EMPLOYEE will fetch all the records from the table EMPLOYEE.
Sample Output:
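The backup can be sketched end to end. Note that not every DBMS supports CREATE TABLE ... LIKE; the sketch below (SQLite via Python's sqlite3, with illustrative EMPLOYEE rows) uses CREATE TABLE ... AS SELECT, which copies both structure and data in one statement:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (Fname TEXT, Ssn TEXT, Salary INTEGER)")
conn.executemany("INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
                 [("James", "888665555", 55000), ("Franklin", "333445555", 40000)])
# CREATE TABLE ... AS SELECT copies the structure and all rows at once.
conn.execute("CREATE TABLE EMPLOYEE_BACKUP AS SELECT * FROM EMPLOYEE")
backup_count = conn.execute("SELECT COUNT(*) FROM EMPLOYEE_BACKUP").fetchone()[0]
```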
Chapter 7, Problem 1RQ
Problem
Describe the six clauses in the syntax of an SQL retrieval query. Show what type of constructs
can be specified in each of the six clauses. Which of the six clauses are required and which are
optional?
Step-by-step solution
Step 1 of 3
A query in SQL consists of up to six clauses. The clauses are specified in following order.
Step 2 of 3
The definition of the types of the values returned by the query is made with the help of the
SELECT clause.
The FROM clause is used to retrieve the desired data from the table for the provided query.
The WHERE clause is a conditional clause. It is used to retrieve the values with restriction.
The GROUP BY clause is used to group the results for the provided query according to the
properties.
The HAVING clause is used to retrieve the results of the GROUP BY clause with some
restriction.
The ORDER BY clause is used to sort the values returned by the query in a specific order.
Step 3 of 3
The SELECT and FROM clauses are the required clauses and the clauses like WHERE,
GROUP BY, HAVING and ORDER BY are optional clauses.
Chapter 7, Problem 2RQ
Problem
Describe conceptually how an SQL retrieval query will be executed by specifying the conceptual
order of executing each of the six clauses.
Step-by-step solution
Step 1 of 1
A retrieval query in SQL can consist of up to six clauses, but only the first two-SELECT and
FROM- are mandatory. The clauses are specified in the following order, with the clauses
between square brackets […] being optional:
SELECT
FROM
[WHERE]
[GROUP BY]
[HAVING]
[ORDER BY ]
The SELECT clause lists the attributes or functions to be retrieved. The FROM clause specifies
all relations needed in the query, including joined relations, but not those in nested queries. The
WHERE clause specifies the conditions for selecting tuples from these relations, including join
conditions if needed. GROUP BY specifies grouping attributes, and HAVING specifies a condition
on the groups being selected rather than on individual tuples. ORDER BY specifies an order for
displaying the result of the query.
A query is evaluated conceptually by first applying the FROM clause, followed by the WHERE
clause, and then GROUP BY and HAVING. ORDER BY is applied at the end to sort the query
result. The values of the attributes specified in the SELECT clause are shown in the result.
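The conceptual order can be traced through one query. A sketch (SQLite via Python's sqlite3; the EMP table and its rows are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMP (Dno INTEGER, Salary INTEGER)")
conn.executemany("INSERT INTO EMP VALUES (?, ?)",
                 [(5, 30000), (5, 40000), (4, 25000), (4, 43000), (1, 55000)])
rows = conn.execute("""
    SELECT Dno, COUNT(*), AVG(Salary)  -- projected last
    FROM EMP                           -- evaluated first
    WHERE Salary > 20000               -- then tuples are filtered
    GROUP BY Dno                       -- then grouped
    HAVING COUNT(*) > 1                -- then groups are filtered
    ORDER BY Dno                       -- the result is sorted at the end
""").fetchall()
```

Department 1 has only one qualifying tuple, so HAVING removes its group.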
Chapter 7, Problem 3RQ
Problem
Discuss how NULLs are treated in comparison operators in SQL. How are NULLs treated when
aggregate functions are applied in an SQL query? How are NULLs treated if they exist in
grouping attributes?
Step-by-step solution
Step 1 of 1
In SQL, NULL is treated as an UNKNOWN value, and SQL uses a three-valued logic with the
truth values TRUE, FALSE, and UNKNOWN.
For comparison operators in SQL, NULL must be tested with the IS or IS NOT operator. SQL
treats each NULL as a distinct value, so =, <, > cannot be used to compare NULLs.
In general, NULL values are discarded when aggregate functions are applied to a particular
column.
If NULLs exist in the grouping attribute, then a separate group is created for all tuples with a
NULL value in the grouping attribute.
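All three behaviors can be observed in one sketch (SQLite via Python's sqlite3; the table is illustrative). Note that `= NULL` evaluates to UNKNOWN and so matches no rows, while `IS NULL` works; AVG skips NULLs; and GROUP BY puts the NULL tuples in their own group:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (Dept TEXT, Bonus INTEGER)")
conn.executemany("INSERT INTO T VALUES (?, ?)",
                 [("A", 10), ("A", None), (None, 30)])
eq_null = conn.execute("SELECT COUNT(*) FROM T WHERE Bonus = NULL").fetchone()[0]
is_null = conn.execute("SELECT COUNT(*) FROM T WHERE Bonus IS NULL").fetchone()[0]
avg_bonus = conn.execute("SELECT AVG(Bonus) FROM T").fetchone()[0]  # NULLs discarded
groups = conn.execute(
    "SELECT Dept, COUNT(*) FROM T GROUP BY Dept ORDER BY Dept").fetchall()
```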
Chapter 7, Problem 4RQ
Problem
Discuss how each of the following constructs is used in SQL, and discuss the various options for
each construct. Specify what each construct is useful for.
a. Nested queries
d. Triggers
Step-by-step solution
Step 1 of 11
a.
Nested Queries:
A nested query is an SQL query used within another SQL query, typically in the WHERE clause.
It is also known as a subquery or inner query.
Options:
It can be used with SELECT, INSERT, UPDATE, and DELETE statements, together with
operators such as <, >, <=, >=, =, IN, and BETWEEN.
Syntax:
SELECT column_name FROM table_name
WHERE column_name IN (SELECT column_name FROM table_name WHERE condition);
Example: get the details of all employees who work in the same department as an employee
whose salary is 35000:
SELECT * FROM EMPLOYEE
WHERE Dno IN (SELECT Dno FROM EMPLOYEE WHERE Salary = 35000);
Use:
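A nested query lets the outer query filter on a set computed by the inner query. A runnable sketch (SQLite via Python's sqlite3; the EMPLOYEE table and its rows are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (Eid INTEGER, Dno INTEGER, Salary INTEGER)")
conn.executemany("INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
                 [(1, 5, 35000), (2, 5, 40000), (3, 4, 30000)])
# Inner query: departments of employees earning 35000.
# Outer query: everyone in those departments.
rows = conn.execute("""
    SELECT Eid FROM EMPLOYEE
    WHERE Dno IN (SELECT Dno FROM EMPLOYEE WHERE Salary = 35000)
    ORDER BY Eid""").fetchall()
ids = [r[0] for r in rows]
```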
Step 2 of 11
b.
Joined Tables:
A joined-table is the resultant table that is the generated by an inner join, or an outer join, or a
cross join.
A joined table can be used in any context where the SELECT statement is used.
Outer Join:
Types of outer join:
1) Left outer join: when a left outer join is applied, it returns all the rows from the left table,
together with the matching rows from the right table. It is denoted by the symbol ⟕.
Syntax:
SELECT columns FROM table_A LEFT OUTER JOIN table_B ON table_A.column1 = table_B.column2;
2) Right outer join: when a right outer join is applied, it returns all the rows from the right table,
together with the matching rows from the left table. It is denoted by the symbol ⟖.
Syntax:
SELECT columns FROM table_A RIGHT OUTER JOIN table_B ON table_A.column1 = table_B.column2;
3) Full outer join: when a full outer join is applied, it returns all the rows from both the left and the
right table. It is denoted by the symbol ⟗.
Syntax:
SELECT columns FROM table_A FULL OUTER JOIN table_B ON table_A.column1 = table_B.column2;
Options:
Use:
A join can be used to obtain a result that combines columns from two different tables.
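A left outer join can be demonstrated directly (SQLite via Python's sqlite3; the A/B tables are illustrative — note that some engines, including older SQLite versions, do not support RIGHT or FULL OUTER JOIN):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE B (id INTEGER, dept TEXT)")
conn.executemany("INSERT INTO A VALUES (?, ?)", [(1, "x"), (2, "y")])
conn.execute("INSERT INTO B VALUES (1, 'sales')")
# LEFT OUTER JOIN keeps every row of A; unmatched B columns become NULL.
rows = conn.execute("""
    SELECT A.name, B.dept
    FROM A LEFT OUTER JOIN B ON A.id = B.id
    ORDER BY A.id""").fetchall()
```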
Step 3 of 11
c.
Aggregate Functions:
An aggregate function takes multiple input values from a column and produces a single value as
output.
Common aggregate functions are AVG, COUNT, MAX, MIN, and SUM (some systems also
provide FIRST and LAST).
Option:
Use:
Grouping:
In many cases, an aggregate function is applied to subgroups of tuples in a relation, where the
subgroups are based on some attribute values. Applying the GROUP BY clause divides the table
into such groups.
Options:
Use:
The GROUP BY clause is applied when the table needs to be divided into groups according to
attribute values.
Step 4 of 11
d.
Triggers:
A database trigger is procedural code that executes ("fires") automatically when an event
(INSERT, DELETE, or UPDATE) occurs.
Options:
Use:
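A trigger's automatic firing can be sketched as follows (SQLite via Python's sqlite3; the EMP and AUDIT_LOG tables are illustrative). The trigger runs on every salary update without the application asking for it:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMP (Ssn TEXT, Salary INTEGER)")
conn.execute("CREATE TABLE AUDIT_LOG (Ssn TEXT, OldSalary INTEGER, NewSalary INTEGER)")
# The trigger fires automatically on every UPDATE of Salary.
conn.execute("""
    CREATE TRIGGER log_salary AFTER UPDATE OF Salary ON EMP
    BEGIN
        INSERT INTO AUDIT_LOG VALUES (OLD.Ssn, OLD.Salary, NEW.Salary);
    END""")
conn.execute("INSERT INTO EMP VALUES ('123', 30000)")
conn.execute("UPDATE EMP SET Salary = 32000 WHERE Ssn = '123'")
log = conn.execute("SELECT * FROM AUDIT_LOG").fetchall()
```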
Step 5 of 11
e.
Assertions:
An assertion is an expression that must always be true. When an assertion is created, the
expression must hold, and the DBMS checks the assertion after any change that might violate it.
Option:
Use:
ASSERTIONS vs. TRIGGERS:
• An assertion only checks conditions; it does not modify the data. A trigger checks a condition
and, if required, also changes the data.
• An assertion is linked neither to a particular table nor to particular events in the database. A
trigger is linked to both a particular table and particular events in the database.
• The Oracle database does not implement assertions. The Oracle database implements
triggers.
Step 6 of 11
f.
The WITH clause was introduced as a convenience in SQL:99 and was added to the Oracle SQL syntax in Oracle 9.2; it may not be available in all SQL-based DBMSs. It allows the user to define a table in such a way that it is only used in a particular query. It is somewhat like creating a view that is used in a single query and then dropped.
Use:
It can be used to build one complex statement from simpler statements, and to break down complex SQL queries into parts, which makes debugging and processing the complex queries easier.
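A minimal sketch of the WITH clause (the table and the dept_avg name are invented; SQLite supports WITH as a common table expression):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE employee (name TEXT, dno INTEGER, salary REAL)")
con.executemany("INSERT INTO employee VALUES (?, ?, ?)",
                [("Smith", 5, 30000), ("Wong", 5, 40000), ("Wallace", 4, 43000)])
# WITH defines dept_avg only for the duration of this one query,
# much like a view that is created, used once, and dropped.
rows = con.execute("""
    WITH dept_avg AS (
        SELECT dno, AVG(salary) AS avg_sal
        FROM employee GROUP BY dno
    )
    SELECT dno FROM dept_avg WHERE avg_sal > 30000 ORDER BY dno
""").fetchall()
print(rows)  # [(4,), (5,)]
```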
Step 7 of 11
g.
The SQL CASE construct is used the way the if-then-else construct is used in Java. It can be used when a value is computed differently depending on a particular condition, and it may appear in any SQL query where conditional values have to be extracted. Its general form is:
CASE expression
WHEN value THEN result
ELSE result
END;
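A minimal runnable sketch of CASE (hypothetical table, SQLite as the engine):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE employee (name TEXT, salary REAL)")
con.executemany("INSERT INTO employee VALUES (?, ?)",
                [("Smith", 30000), ("Borg", 55000)])
# CASE plays the role of if/else inside the SELECT list: each row
# gets a 'band' value computed from its own salary.
rows = con.execute("""
    SELECT name,
           CASE WHEN salary >= 50000 THEN 'high'
                ELSE 'normal'
           END AS band
    FROM employee ORDER BY name
""").fetchall()
print(rows)  # [('Borg', 'high'), ('Smith', 'normal')]
```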
Step 10 of 11
h.
A view is a virtual table that is derived from other tables, called base tables. The base tables physically exist, and their tuples are stored in the database.
CREATE VIEW view_name
AS SELECT attributes
FROM tables
WHERE conditions;
CREATE VIEW gives the name of the view; AS SELECT defines the attributes that appear in the virtual table; the FROM clause names the tables from which the attributes are extracted for the virtual table; and the WHERE clause gives the particular condition that the tuples of the virtual table must satisfy.
Use:
A view is created when a table needs to be referenced frequently.
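A minimal sketch of defining and querying a view (names invented; SQLite as the engine):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE employee (name TEXT, dno INTEGER, salary REAL)")
con.executemany("INSERT INTO employee VALUES (?, ?, ?)",
                [("Smith", 5, 30000), ("Wallace", 4, 43000)])
# The view stores no tuples of its own; it is re-evaluated from the
# base table 'employee' each time it is queried.
con.execute("""
    CREATE VIEW dept5_emps AS
    SELECT name, salary FROM employee WHERE dno = 5
""")
rows = con.execute("SELECT * FROM dept5_emps").fetchall()
print(rows)  # [('Smith', 30000.0)]
```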
Step 11 of 11
i.
DROP command:
The DROP command can be used to drop schema elements such as tables, attributes, and constraints. A whole schema can be dropped with the command DROP SCHEMA.
ALTER command:
The schema can be changed with the help of the ALTER command, for example by changing a column name or by adding or dropping attributes.
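A minimal sketch of ALTER and DROP (SQLite supports ALTER TABLE ... ADD COLUMN and DROP TABLE; DROP SCHEMA is not part of SQLite, so it is not shown):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE employee (name TEXT)")
# ALTER TABLE changes the schema of an existing table, here by
# adding an attribute; DROP TABLE then removes the table entirely,
# both its schema and its data.
con.execute("ALTER TABLE employee ADD COLUMN salary REAL")
cols = [row[1] for row in con.execute("PRAGMA table_info(employee)")]
print(cols)  # ['name', 'salary']
con.execute("DROP TABLE employee")
```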
Chapter 7, Problem 5E
Problem
Specify the following queries on the database in Figure 5.5 in SQL. Show the query results if
each query is applied to the database state in Figure 5.6.
a. For each department whose average employee salary is more than $30,000, retrieve the
department name and the number of employees working for that department.
b. Suppose that we want the number of male employees in each department making more than
$30,000, rather than all employees (as in Exercise a). Can we specify this query in SQL? Why or
why not?
Step-by-step solution
Step 1 of 2
a)
The query to retrieve the department name and the count of employees working in each department whose average salary is greater than $30,000 is as follows:
Query:
GROUP BY Dname
Output:
Explanation:
• SELECT is used to query the database and get back the specified fields.
• FROM is used to query the database and get back the preferred information by specifying the
table name.
• WHERE is used to specify a condition based on which the data is to be retrieved. The condition specified in the query is
o DEPARTMENT.Dnumber=EMPLOYEE.DNo
• GROUP BY is used to group the result of a SELECT statement done on a table where the tuple
values are similar for more than one column.
• COUNT(*) is used to count the number of tuples that satisfy the conditions.
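Since the textbook's query appears only as an image here, the following is one possible formulation of part (a), run against a small invented sample of the COMPANY data (SQLite as the engine):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE department (dname TEXT, dnumber INTEGER);
CREATE TABLE employee (lname TEXT, salary REAL, dno INTEGER);
INSERT INTO department VALUES ('Research', 5), ('Administration', 4);
INSERT INTO employee VALUES ('Smith', 30000, 5), ('Wong', 40000, 5),
                            ('Wallace', 43000, 4), ('Jabbar', 15000, 4);
""")
# HAVING filters whole groups after GROUP BY, so the average-salary
# condition applies per department, not per employee.
rows = con.execute("""
    SELECT dname, COUNT(*)
    FROM department, employee
    WHERE department.dnumber = employee.dno
    GROUP BY dname
    HAVING AVG(salary) > 30000
""").fetchall()
print(rows)  # [('Research', 2)]
```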
Step 2 of 2
(b)
The query to retrieve the department name and the number of male employees in each department making more than $30,000 is as follows:
Query:
WHERE DEPARTMENT.Dnumber=EMPLOYEE.DNo
AND Sex='M'
GROUP BY Dname;
Output:
Explanation:
• SELECT is used to query the database and get back the specified fields.
• FROM is used to query the database and get back the preferred information by specifying the
table name.
• WHERE is used to specify a condition based on which the data is to be retrieved. The conditions specified in the query are
o DEPARTMENT.Dnumber=EMPLOYEE.DNo
o Sex='M'
• GROUP BY is used to group the result of a SELECT statement done on a table where the tuple
values are similar for more than one column.
Chapter 7, Problem 6E
Problem
Specify the following queries in SQL on the database schema in Figure 1.2.
a. Retrieve the names and major departments of all straight-A students (students who have a
grade of A in all their courses).
b. Retrieve the names and major departments of all students who do not have a grade of A in
any of their courses.
Step-by-step solution
Step 1 of 2
a.
The query to retrieve the names and major departments of the students who got A grade in all
the courses is as follows:
Query:
Explanation:
• SELECT is used to query the database and get back the specified fields.
• FROM is used to query the database and get back the preferred information by specifying the
table name.
• The inner query retrieves the details of the student who got other than A grade for any courses.
• The outer query retrieves the name and major of the student who got A grade for all courses.
• NOT EXISTS is used to retrieve only those students which are not retrieved by inner query.
Output:
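Since the textbook's query is shown only as an image, here is one possible NOT EXISTS formulation of part (a), using a simplified, invented version of the Figure 1.2 tables (SQLite as the engine):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE student (name TEXT, snum INTEGER, major TEXT);
CREATE TABLE grade_report (snum INTEGER, section INTEGER, grade TEXT);
INSERT INTO student VALUES ('Smith', 17, 'CS'), ('Brown', 8, 'CS');
INSERT INTO grade_report VALUES (17, 112, 'B'), (17, 119, 'A'),
                                (8, 85, 'A'), (8, 92, 'A');
""")
# A straight-A student is one for whom no grade row other than 'A'
# exists; the inner query finds the non-A rows, NOT EXISTS rejects
# any student who has one.
rows = con.execute("""
    SELECT name, major FROM student AS s
    WHERE NOT EXISTS (SELECT * FROM grade_report AS g
                      WHERE g.snum = s.snum AND g.grade <> 'A')
""").fetchall()
print(rows)  # [('Brown', 'CS')]
```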
Step 2 of 2
b.
The query to retrieve the names and major departments of the students who did not get an A grade in any of their courses is as follows:
Query:
Explanation:
• SELECT is used to query the database and get back the specified fields.
• FROM is used to query the database and get back the preferred information by specifying the
table name.
• The inner query retrieves the details of the student who got A grade for any courses.
• The outer query retrieves the name and major of the student who did not get A grade for any
courses.
• NOT EXISTS is used to retrieve only those students which are not retrieved by inner query.
Output:
Chapter 7, Problem 7E
Problem
In SQL, specify the following queries on the database in Figure 5.5 using the concept of nested
queries and other concepts described in this chapter.
a. Retrieve the names of all employees who work in the department that has the employee with
the highest salary among all employees.
b. Retrieve the names of all employees whose supervisor’s supervisor has ‘888665555’ for Ssn.
c. Retrieve the names of employees who make at least $10,000 more than the employee who is
paid the least in the company.
Step-by-step solution
Step 1 of 4
SQL:
Structured Query Language (SQL) is a database language for managing and accessing the data
in a relational database.
• SQL consists of queries to insert, update, delete, and retrieve records from a database. It even
creates a new database and database table.
Nested query:
Some queries require existing values to be fetched and then used in a comparison condition. Such a query is referred to as a nested query. In it, a complete SELECT-FROM-WHERE block exists inside the WHERE clause of another query, which is referred to as the outer query.
• To retrieve all the attributes of a table, instead of listing every attribute individually, an asterisk (*) can be used.
o Condition is optional.
Step 2 of 4
a)
Query:
Explanation:
The outer query selects the employee names, while the inner query selects the department number of the employee with the highest salary among all employees.
Step 3 of 4
b)
Query:
Explanation:
The outer query selects the names of employees whose supervisor’s supervisor, as found by the inner query, has the Ssn ‘888665555’.
Step 4 of 4
c)
Query:
SELECT LNAME FROM EMPLOYEE
WHERE SALARY >= 10000 + (SELECT MIN(SALARY) FROM EMPLOYEE);
Explanation:
The outer query selects the names of employees whose salary is at least $10,000 more than the minimum salary, which the inner query retrieves.
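The same nested-query pattern can be run against a small invented sample (SQLite as the engine); the inner query computes the company minimum once, and every salary is compared against that single value:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE employee (lname TEXT, salary REAL)")
con.executemany("INSERT INTO employee VALUES (?, ?)",
                [("Smith", 30000), ("Jabbar", 25000), ("Borg", 55000)])
# Minimum salary is 25000, so the threshold is 35000; only Borg
# earns at least $10,000 more than the least-paid employee.
rows = con.execute("""
    SELECT lname FROM employee
    WHERE salary >= 10000 + (SELECT MIN(salary) FROM employee)
""").fetchall()
print(rows)  # [('Borg',)]
```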
Chapter 7, Problem 8E
Problem
Specify the following views in SQL on the COMPANY database schema shown in Figure 5.5.
a. A view that has the department name, manager name, and manager salary for every
department
b. A view that has the employee name, supervisor name, and employee salary for each
employee who works in the ‘Research’ department
c. A view that has the project name, controlling department name, number of employees, and
total hours worked per week on the project for each project
d. A view that has the project name, controlling department name, number of employees, and
total hours worked per week on the project for each project with more than one employee
working on it
Step-by-step solution
Step 1 of 4
a.
A view that has the department name along with the name and salary of the manager for every
department is as follows:
Explanation:
• SELECT is used to query the database and get back the specified fields.
• FROM is used to query the database and get back the preferred information by specifying the
table name.
Step 2 of 4
b.
A view that has the employee name, supervisor name and employee salary for each employee
who works in the Research department is as follows:
e.Minit AS Employee_middle_init,
e.Lname AS Employee_last_name,
s.Fname AS Manager_fname,
s.Minit AS Manager_minit,
DEPARTMENT AS d
Explanation:
• SELECT is used to query the database and get back the specified fields.
• FROM is used to query the database and get back the preferred information by specifying the
table name.
• WHERE is used to specify a condition based on which the data is to be retrieved. The
conditions specified in the query are
o e.Dno = d.Dnumber
o d.Dname = 'Research'
Step 3 of 4
c.
A view that has the project name, controlling department name, number of employees, and total
hours worked per week on the project is as follows:
DEPARTMENT AS D
GROUP BY Pno;
Explanation:
• SELECT is used to query the database and get back the specified fields.
• FROM is used to query the database and get back the preferred information by specifying the
table name.
• WHERE is used to specify a condition based on which the data is to be retrieved. The
conditions specified in the query are
o P.Dnum = D.Dnumber
o P.Pnumber = WO.Pno
• GROUP BY is used to group the result of a SELECT statement done on a table where the tuple
values are similar for more than one column.
Step 4 of 4
d.
The following is the view that has the project name, controlling department name, number of
employees, and total hours worked per week on the project for each project with more than one
employee working on it.
DEPARTMENT AS D
WHERE P.Dnum = D.Dnumber
GROUP BY Pno
Explanation:
• SELECT is used to query the database and get back the specified fields.
• FROM is used to query the database and get back the preferred information by specifying the
table name.
• WHERE is used to specify a condition based on which the data is to be retrieved. The
conditions specified in the query are
o P.Dnum = D.Dnumber
o P.Pnumber = WO.Pno
• GROUP BY is used to group the result of a SELECT statement done on a table where the tuple
values are similar for more than one column.
Chapter 7, Problem 9E
Problem
Consider the following view, DEPT_SUMMARY, defined on the COMPANY database in Figure
5.6:
State which of the following queries and updates would be allowed on the view. If a query or
update would be allowed, show what the corresponding query or update on the base relations
would look like, and give its result when applied to the database in Figure 5.6.
a.
b.
c.
d.
e.
Step-by-step solution
Step 1 of 5
a) Allowed
D C Total_s Average_s
5 4 133000 33250
4 3 93000 31000
1 1 55000 55000
Step 2 of 5
b) Allowed
D C
5 4
Step 3 of 5
c) Allowed
D Average_s
5 33250
Step 4 of 5
Step 5 of 5
Chapter 8, Problem 1RQ
Problem
Step-by-step solution
Step 1 of 6
• SELECT
• PROJECT
• THETA JOIN
• EQUI JOIN
• NATURAL JOIN
• UNION
• INTERSECTION
• MINUS or DIFFERENCE
• CARTESIAN PRODUCT
• DIVISION
Step 2 of 6
SELECT operation:
PROJECT operation:
Step 3 of 6
• The THETA JOIN operation combines related tuples from two relations into single tuples in the result.
• An EQUIJOIN operation combines all the tuples of relations R and S that satisfy the condition.
The comparison operator must be =.
Step 4 of 6
• It is similar to EQUIJOIN. The only difference is the join attributes of relation S are not included
in the resultant relation.
• When UNION operation is applied on relations R and S, the resultant relation consists of all the
tuples in relation R or S or both R and S.
• If similar tuples are in both R and S relations, then only one tuple will be in the resultant relation.
• The UNION operation can be applied on relations R and S only if the relations are union
compatible.
Step 5 of 6
INTERSECTION operation:
• When INTERSECTION operation is applied on relations R and S, the resultant relation consists
of only the tuples that are in both R and S.
• When the DIFFERENCE operation is applied on relations R and S, the resultant relation consists of only the tuples that are in R but not in S.
Step 6 of 6
• When CARTESIAN PRODUCT operation is applied on relations R and S, the resultant relation
consists of all the attributes of relation R and S along with all possible combination of the tuples
of R and S.
DIVISION operation:
• DIVISION is applied to two relations R(Z) and S(X), where X ⊆ Z. It forms a new relation T(Y), with Y = Z − X, containing every tuple t that appears in R in combination with every tuple of S.
Chapter 8, Problem 2RQ
Problem
Step-by-step solution
Step 1 of 2
Union compatibility: Two relations are said to be union compatible if both relations have the same number of attributes and the corresponding attributes have the same domains.
Step 2 of 2
The UNION, INTERSECTION, and DIFFERENCE operations require that the relations on which they are applied be union compatible because all these operations are binary set operations. The tuples of the relations are compared directly under these operations, so the tuples must have the same number of attributes and the corresponding attributes must have the same domains.
Chapter 8, Problem 3RQ
Problem
Discuss some types of queries for which renaming of attributes is necessary in order to specify
the query unambiguously.
Step-by-step solution
Step 1 of 1
When a query uses the NATURAL JOIN operation, renaming a foreign key attribute is necessary if its name is not already the same in both relations, so that the operation can be executed. In an EQUIJOIN, after the operation is performed there are two attributes that have the same values in every tuple: the attributes that were compared in the join condition. In a NATURAL JOIN, one of them is removed, so only a single attribute remains.
The DIVISION operation is another such operation. Division takes place on the basis of common attributes, so their names must be the same.
Chapter 8, Problem 4RQ
Problem
Discuss the various types of inner join operations. Why is theta join required?
Step-by-step solution
Step 1 of 6
When data from multiple relations is combined, the related information can be presented in a single table.
Step 2 of 6
• This operation uses join conditions with equality comparisons only.
• A JOIN in which = is the only comparison operator used in the join condition is called an EQUIJOIN.
• The result of an EQUIJOIN always has one or more pairs of attributes with identical values in every tuple.
Example syntax:
Or,
Step 3 of 6
NATURAL JOIN operation:
• It was created to get rid of the second (superfluous) attribute in an EQUIJOIN condition.
Definition:
• The standard definition of the NATURAL JOIN operation requires that the two join attributes (or each pair of corresponding join attributes) have the same name in both relations.
Step 4 of 6
• If this is not the case, a renaming operation is applied first.
Example syntax:
Step 5 of 6
THETA JOIN operation:
• When tuples from two relations must be combined and the combining condition is not simply the equality of shared attributes, it is convenient for the JOIN operation to have this more general form.
• Tuples whose join attributes are NULL, or for which the join condition evaluates to FALSE, do not appear in the result.
• Thus, joining the two relations results in a subset of the Cartesian product, the subset determined by the join condition.
Example syntax:
The result of
Step 6 of 6
Chapter 8, Problem 5RQ
Problem
What role does the concept of foreign key play when specifying the most common types of
meaningful join operations?
Step-by-step solution
Step 1 of 3
A foreign key is a column or composite of columns which is/are a primary key of other table that
is used to maintain relationship between two tables.
• A foreign key is mainly used for establishing relationship between two tables.
Step 2 of 3
The JOIN operation is used to combine related tuples from two relations into a single tuple.
• In order to perform JOIN operation, there should exist relationship between two tables.
• If there is no foreign key, then JOIN operation may not lead to meaningful results.
Hence, a foreign key concept is needed to establish relationship between two tables.
Step 3 of 3
Example:
DEPARTMENT(Dno,Dname, Mgr_ssn)
DeptNum is a foreign key in relation EMPLOYEE. The JOIN operation can be performed on two
relations based on the foreign key.
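The example above can be sketched concretely (table and column names follow the fragment shown here, with an invented EMPLOYEE relation; SQLite as the engine):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")
con.executescript("""
CREATE TABLE department (dno INTEGER PRIMARY KEY, dname TEXT);
CREATE TABLE employee (name TEXT,
                       deptnum INTEGER REFERENCES department(dno));
INSERT INTO department VALUES (5, 'Research');
INSERT INTO employee VALUES ('Smith', 5);
""")
# The join condition matches the foreign key to the primary key it
# references, which is what makes the combined tuples meaningful.
rows = con.execute("""
    SELECT name, dname
    FROM employee JOIN department ON deptnum = dno
""").fetchall()
print(rows)  # [('Smith', 'Research')]
```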
Chapter 8, Problem 6RQ
Problem
Step-by-step solution
Step 1 of 3
FUNCTION operation:
• The aggregate functions are SUM, AVERAGE, MAXIMUM, MINIMUM and COUNT.
Step 2 of 3
<grouping attributes> ℱ <function list> (R)
where <grouping attributes> is a list of attributes of the relation R, and <function list> is a list of (function, attribute) pairs naming the aggregate functions to apply.
Step 3 of 3
FUNCTION operation is used for obtaining the summarized data from the relations.
The above query will find the maximum and minimum salary in the EMPLOYEE relation.
Chapter 8, Problem 7RQ
Problem
How are the OUTER JOIN operations different from the INNER JOIN operations? How is the
OUTER UNION operation different from UNION?
Step-by-step solution
Step 1 of 2
OUTER JOIN and INNER JOIN: Consider two relations R and S. When the user wants to keep all the tuples in R, or all those in S, or all those in both relations in the result of the JOIN, regardless of whether or not they have matching tuples in the other relation, a set of operations called outer joins can do so. This satisfies the need of queries in which tuples from two tables are to be combined by matching corresponding rows, but without losing any tuples for lack of matching values.
When only matching tuples (based on the join condition), and not all tuples, are contained in the resultant relation, the join is an INNER JOIN (EQUIJOIN and NATURAL JOIN).
In an OUTER JOIN, if matching values from the other relation are not present, the fields are padded with NULL values.
Step 2 of 2
OUTER UNION and UNION: For the UNION operation, the relations have to be union compatible, i.e., they must have the same number of attributes and each corresponding pair of attributes must have the same domain.
OUTER UNION operation was developed to take the union of tuples from two relations if the
relations are not union compatible. This operation will take UNION of tuples in two relations R(X,
Y) and S(X,Z) that are partial compatible, meaning that only some attributes, say X, are union
compatible. Resultant relation is of form RESULT(X, Y, Z).
Two tuples t1 in R and t2 in S are said to match if t1[X] = t2[X]; they are considered to represent the same entity instance and are combined into a single tuple.
Chapter 8, Problem 8RQ
Problem
In what sense does relational calculus differ from relational algebra, and in what sense are they
similar?
Step-by-step solution
Step 1 of 2
• In relational calculus, new relations are not created by performing operations on the existing relations; formulas are applied directly to the existing relations. In relational algebra, new relations can be obtained by performing operations on the existing relations.
Step 2 of 2
• Relational algebra and relational calculus are formal query languages for the relational model.
Chapter 8, Problem 9RQ
Problem
How does tuple relational calculus differ from domain relational calculus?
Step-by-step solution
Step 1 of 2
• The order of the operations to be followed for getting the result is not specified.
• In other words, the evaluation of the query does not depend on the order of the operations.
Step 2 of 2
The differences between tuple relational calculus and domain relational calculus are as follows:
Chapter 8, Problem 10RQ
Problem
Discuss the meanings of the existential quantifier (∃) and the universal quantifier (∀).
Step-by-step solution
Step 1 of 2
Existential quantifier (∃):
The statement is (∃t)(F), where F is a formula and t is a tuple variable.
If the formula F evaluates to TRUE for some tuple assigned to free occurrences of t in F, then the formula is TRUE. Otherwise, it is FALSE.
Step 2 of 2
Universal quantifier (∀):
The statement is (∀t)(F).
If F evaluates to TRUE for every tuple assigned to free occurrences of t in F, then (∀t)(F) is TRUE; otherwise it is FALSE.
Chapter 8, Problem 11RQ
Problem
Define the following terms with respect to the tuple calculus: tuple variable, range relation,atom,
formula, and expression.
Step-by-step solution
Step 1 of 3
Tuple relational calculus: The tuple relational calculus is a non-procedural language. It contains
a declarative expression that specifies what is to be retrieved.
Step 2 of 3
Range Relation: In the tuple relational calculus, every tuple ranges over a relation. The variable
takes any tuple as its value from the relation.
Atom: The atom in the tuple relational calculus identifies the range of the tuple variable. The
condition in the tuple relational calculus is made of atoms.
Step 3 of 3
Formula: A formula or condition is made of atoms. These atoms in the formula are connected
via the logical operators like AND, OR, NOT. Every atom in the formula is treated as a formula
i.e., the formula may or may not have multiple atoms.
Expression: The tuple relational calculus contains a declarative expression that specifies what is
to be retrieved.
Example:
Chapter 8, Problem 12RQ
Problem
Define the following terms with respect to the domain calculus: domain variable, range relation,
atom, formula, and expression.
Step-by-step solution
Step 1 of 3
Domain variable:-
To form a relation of degree ‘n’ for a query result, domain variables are used.
Ex:
The domain of domain variable Crs might be the set of possible values of the Crs code attribute
of the relation teaching.
Step 2 of 3
Range relation:-
In the domain calculus, the variables used in formulas range over single values from the domains of attributes, rather than ranging over tuples as in the tuple calculus.
ATOM:-
An atom has the form R(x1, x2, …, xj), where R is the name of a relation of degree j and each xi, 1 ≤ i ≤ j, is a domain variable.
Step 3 of 3
Formula:-
In the domain relational calculus, a formula is defined recursively, starting with simple atomic formulas and building larger formulas using the logical connectives.
Expression:-
An expression has the form {x1, x2, …, xn | COND(x1, x2, …, xn)}, where the xi are domain variables and COND is a formula.
Chapter 8, Problem 13RQ
Problem
Step-by-step solution
Step 1 of 3
An expression in relational calculus is said to be safe expression if it ensures to output a finite set
of tuples.
Step 2 of 3
The relational calculus expression that generates all the tuples from the universe that are not
student tuples is as follows:
It generates an infinite number of tuples, since there are infinitely many tuples that are not student tuples. Expressions in relational calculus that do not guarantee a finite set of tuples are known as unsafe expressions.
Step 3 of 3
The generated tuples of the safe expression must be from the domain of an expression.
Otherwise it is considered as unsafe.
Chapter 8, Problem 14RQ
Problem
Step-by-step solution
Step 1 of 2
Step 2 of 2
• Almost all relational query languages (for example SQL) are relationally complete. They are
more expressive than relational algebra or relational calculus.
Chapter 8, Problem 15E
Problem
Show the result of each of the sample queries in Section 8.5 as it would apply to the database
state in Figure 5.6.
Step-by-step solution
Step 1 of 6
Query 1:-
Result
Step 2 of 6
Query 2:-
Step 3 of 6
Query 3:-
Result :-
Lname Fname
Query 4:
Result:
Pno
Step 4 of 6
Query 5:-
Result:
Lname Fname
Smith John
Wong Franklin
Step 5 of 6
Query 6:-
Result :-
Lname Fname
Zelaya Alicia
Narayan Ramesh
English Joyce
Jabbar Ahmad
Borg James
Step 6 of 6
Query 7:-
Result:
Lname Fname
Wallace Jennifer
Wong Franklin
Chapter 8, Problem 16E
Problem
Specify the following queries on the COMPANY relational database schema shown in Figure 5.5
using the relational operators discussed in this chapter. Also show the result of each query as it
would apply to the database state in Figure 5.6.
a. Retrieve the names of all employees in department 5 who work more than 10 hours per week
on the ProductX project.
b. List the names of all employees who have a dependent with the same first name as
themselves.
c. Find the names of all employees who are directly supervised by ‘Franklin Wong’.
d. For each project, list the project name and the total hours per week (by all employees) spent
on that project.
f. Retrieve the names of all employees who do not work on any project.
g. For each department, retrieve the department name and the average salary of all employees
working in that department.
i. Find the names and addresses of all employees who work on at least one project located in
Houston but whose department has no location in Houston.
j. List the last names of all department managers who have no dependents.
Step-by-step solution
Step 1 of 10
Step 2 of 10
Step 3 of 10
Step 4 of 10
Step 5 of 10
Step 6 of 10
Step 7 of 10
Step 8 of 10
Step 9 of 10
Step 10 of 10
Chapter 8, Problem 17E
Problem
Consider the AIRLINE relational database schema shown in Figure, which was described in
Exercise. Specify the following queries in relational algebra:
a. For each flight, list the flight number, the departure airport for the first leg of the flight, and the
arrival airport for the last leg of the flight.
b. List the flight numbers and weekdays of all flights or flight legs that depart from Houston
Intercontinental Airport (airport code ‘iah’) and arrive in Los Angeles International Airport (airport
code ‘lax’).
c. List the flight number, departure airport code, scheduled departure time, arrival airport code,
scheduled arrival time, and weekdays of all flights or flight legs that depart from some airport in
the city of Houston and arrive at some airport in the city of Los Angeles.
e. Retrieve the number of available seats for flight number ‘col97’ on ‘2009-10-09’.
Exercise
Consider the AIRLINE relational database schema shown in Figure, which describes a database
for airline flight information. Each FLIGHT is identified by a Flight_number, and consists of one or
more FLIGHT_LEGs with Leg_numbers 1, 2, 3, and so on. Each FLIGHT_LEG has scheduled
arrival and departure times, airports, and one or more LEG_INSTANCEs— one for each Date on
which the flight travels. FAREs are kept for each FLIGHT. For each FLIGHT_LEG instance,
SEAT_RESERVATIONs are kept, as are the AIRPLANE used on the leg and the actual arrival
and departure times and airports. An AIRPLANE is identified by an Airplane_id and is of a
particular AIRPLANE_TYPE. CAN_LAND relates AIRPLANE_TYPEs to the AIRPORTs at which
they can land. An AIRPORT is identified by an Airport_code. Consider an update for the AIRLINE
database to enter a reservation on a particular flight or flight leg on a given date.
c. Which of these constraints are key, entity integrity, and referential integrity constraints, and
which are not?
d. Specify all the referential integrity constraints that hold on the schema shown in Figure.
Step-by-step solution
Step 1 of 4
Step 2 of 4
a.
Following is the query to list the flight number, the first leg of flight’s departure airport, and the
last leg of flight’s arrival airport from each flight:
Explanation:
• FLIGHT_LEG_IN holds the data about the combinations of FLIGHT and FLIGHT_LEG tuples in which FLIGHT’s Flight_number is equal to FLIGHT_LEG’s Flight_number.
• MIN_FLIGHT_LEG holds the data about Flight_numbers whose Leg_number is minimum in the
FLIGHT_LEG_IN.
• RESULT will display the tuples obtained by taking the set UNION of RESULT1 and RESULT2.
Step 3 of 4
b.
Following is the query to retrieve the flight numbers and weekdays of all flights or flight legs that depart from Houston Intercontinental Airport (airport code ‘iah’) and arrive at Los Angeles International Airport (airport code ‘lax’):
Explanation:
• FLIGHT_LEG_IN holds the data about the combinations of FLIGHT and FLIGHT_LEG tuples in which FLIGHT’s Flight_number is equal to FLIGHT_LEG’s Flight_number.
c.
Following is the query to retrieve the flight number, departure airport code, scheduled departure time, arrival airport code, scheduled arrival time, and weekdays of all flights or flight legs that depart from some airport in the city of Houston and arrive at some airport in the city of Los Angeles:
Explanation:
• FLIGHT_LEG_IN holds the data about the combinations of FLIGHT and FLIGHT_LEG tuples in which FLIGHT’s Flight_number is equal to FLIGHT_LEG’s Flight_number.
• The DEPART_CODE will hold the data about the Airport_code of AIRPORT whose City =
‘Houston’.
• The ARRIVE_CODE will hold the data about the Airport_code of AIRPORT whose City = ‘Los
Angeles’.
• The HOUST_DEPART holds the resultant of the relation obtained when the JOIN operation is
applied between the relations DEPART_CODE and FLIGHT_LEG_IN which satisfies condition
that Airport_Code = Departure_airport_code.
• The HOUST_TO_LA holds the resultant of the relation obtained when the JOIN operation is
applied between the relations ARRIVE_CODE and HOUST_DEPART which satisfies condition
that Airport_Code = Arrival_airport_code.
d.
Following is the query to retrieve the fare information of the flight whose flight number is ‘col97’:
Explanation:
RESULT will hold the data about the all the FARE’s whose Flight_number is ‘col97’.
Step 4 of 4
e.
Following is the query to get the number of available seats for the flight whose flight number is ‘col97’ on the date ‘2009-10-09’:
Explanation:
• LEG_INST_INFO holds the data about LEG_INSTANCE whose Flight_number is ‘col97’ and
Date is ‘2009-10-09’.
Chapter 8, Problem 18E
Problem
Consider the LIBRARY relational database schema shown in Figure, which is used to keep track
of books, borrowers, and book loans. Referential integrity constraints are shown as directed arcs
in Figure, as in the notation of Figure 5.7. Write down relational expressions for the following
queries:
a. How many copies of the book titled The Lost Tribe are owned by the library branch whose
name is ‘Sharpstown’?
b. How many copies of the book titled The Lost Tribe are owned by each library branch?
c. Retrieve the names of all borrowers who do not have any books checked out.
d. For each book that is loaned out from the Sharpstown branch and whose Due_date is today,
retrieve the book title, the borrower’s name, and the borrower’s address.
e. For each library branch, retrieve the branch name and the total number of books loaned out
from that branch.
f. Retrieve the names, addresses, and number of books checked out for all borrowers who have
more than five books checked out.
g. For each book authored (or coauthored) by Stephen King, retrieve the title and the number of
copies owned by the library branch whose name is Central.
Step-by-step solution
Step 1 of 7
a.
Following is the relational expression to find the number of copies of the book whose title is ‘The
Lost Tribe’ in the library branch whose name is ‘Sharpstown’:
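The expression did not survive extraction; a plausible reconstruction, assuming the standard LIBRARY schema (BOOK(Book_id, Title, …), BOOK_COPIES(Book_id, Branch_id, No_of_copies), LIBRARY_BRANCH(Branch_id, Branch_name, Address)), is:

```latex
\text{BOOK\_T} \leftarrow \sigma_{\text{Title}='The\ Lost\ Tribe'}(\text{BOOK})\\
\text{SHARP} \leftarrow \sigma_{\text{Branch\_name}='Sharpstown'}(\text{LIBRARY\_BRANCH})\\
\text{RESULT} \leftarrow \pi_{\text{No\_of\_copies}}(\text{BOOK\_T} \bowtie \text{BOOK\_COPIES} \bowtie \text{SHARP})
```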
Step 2 of 7
b.
Following is the relational expression to find the number of copies of the book whose title is ‘The
Lost Tribe’ is available at each branch of the library:
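A plausible reconstruction of the missing expression, under the same LIBRARY schema assumptions:

```latex
\text{BOOK\_T} \leftarrow \sigma_{\text{Title}='The\ Lost\ Tribe'}(\text{BOOK})\\
\text{RESULT} \leftarrow \pi_{\text{Branch\_name},\,\text{No\_of\_copies}}(\text{BOOK\_T} \bowtie \text{BOOK\_COPIES} \bowtie \text{LIBRARY\_BRANCH})
```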
Step 3 of 7
c.
Following is the relational expression to retrieve the names of the borrowers who have no books
checked out:
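A plausible reconstruction of the missing expression, assuming BORROWER(Card_no, Name, Address, Phone) and BOOK_LOANS(Book_id, Branch_id, Card_no, Date_out, Due_date):

```latex
\text{LOANED} \leftarrow \pi_{\text{Card\_no}}(\text{BOOK\_LOANS})\\
\text{RESULT} \leftarrow \pi_{\text{Name}}\big((\pi_{\text{Card\_no}}(\text{BORROWER}) - \text{LOANED}) \bowtie \text{BORROWER}\big)
```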
Step 4 of 7
d.
Following is the relational expression to retrieve the book title and the borrower’s name and
address for each book that is loaned out from the branch whose name is ‘Sharpstown’ and
whose due date is today:
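A plausible reconstruction of the missing expression, under the same LIBRARY schema assumptions (the branch is projected to Branch_id so the natural joins do not collide on the Address attribute):

```latex
\text{SHARP} \leftarrow \pi_{\text{Branch\_id}}(\sigma_{\text{Branch\_name}='Sharpstown'}(\text{LIBRARY\_BRANCH}))\\
\text{DUE} \leftarrow \sigma_{\text{Due\_date}=\text{today}}(\text{BOOK\_LOANS} \bowtie \text{SHARP})\\
\text{RESULT} \leftarrow \pi_{\text{Title},\,\text{Name},\,\text{Address}}(\text{DUE} \bowtie \text{BOOK} \bowtie \text{BORROWER})
```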
Step 5 of 7
e.
Following is the relational expression to retrieve the branch name and the total number of books
loaned out from that branch:
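A plausible reconstruction of the missing expression, using the grouping/aggregate notation of the chapter:

```latex
\text{COUNTS}(\text{Branch\_id},\,\text{Total}) \leftarrow\; {}_{\text{Branch\_id}}\mathfrak{F}_{\,\text{COUNT Book\_id}}(\text{BOOK\_LOANS})\\
\text{RESULT} \leftarrow \pi_{\text{Branch\_name},\,\text{Total}}(\text{COUNTS} \bowtie \text{LIBRARY\_BRANCH})
```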
Step 6 of 7
f.
Following is the relational expression to retrieve the name, address and total number of books for
all borrowers who have more than five books checked out:
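A plausible reconstruction of the missing expression:

```latex
\text{CNT}(\text{Card\_no},\,\text{Total}) \leftarrow\; {}_{\text{Card\_no}}\mathfrak{F}_{\,\text{COUNT Book\_id}}(\text{BOOK\_LOANS})\\
\text{RESULT} \leftarrow \pi_{\text{Name},\,\text{Address},\,\text{Total}}(\sigma_{\text{Total}>5}(\text{CNT}) \bowtie \text{BORROWER})
```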
Step 7 of 7
g.
Following is the relational expression to retrieve the title and number of copies of each book
authored or coauthored by Stephen King in library branch whose name is Central:
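A plausible reconstruction of the missing expression, assuming BOOK_AUTHORS(Book_id, Author_name):

```latex
\text{KING} \leftarrow \sigma_{\text{Author\_name}='Stephen\ King'}(\text{BOOK\_AUTHORS})\\
\text{CENTRAL} \leftarrow \pi_{\text{Branch\_id}}(\sigma_{\text{Branch\_name}='Central'}(\text{LIBRARY\_BRANCH}))\\
\text{RESULT} \leftarrow \pi_{\text{Title},\,\text{No\_of\_copies}}(\text{KING} \bowtie \text{BOOK} \bowtie \text{BOOK\_COPIES} \bowtie \text{CENTRAL})
```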
Chapter 8, Problem 19E
Problem
Specify the following queries in relational algebra on the database schema given in Exercise:
a. List the Order# and Ship_date for all orders shipped from Warehouse# W2.
b. List the WAREHOUSE information from which the CUSTOMER named Jose Lopez was
supplied his orders. Produce a listing: Order#, Warehouse#.
c. Produce a listing Cname, No_of_orders, Avg_order_amt, where the middle column is the total
number of orders by the customer and the last column is the average order amount for that
customer.
d. List the orders that were not shipped within 30 days of ordering.
e. List the Order# for orders that were shipped from all warehouses that the company has in New
York.
Exercise
Consider the following six relations for an order-processing database application in a company:
ITEM(Item#, Unit_price)
WAREHOUSE(Warehouse#, City)
Here, Ord_amt refers to total dollar amount of an order; Odate is the date the order was placed;
and Ship_date is the date an order (or part of an order) is shipped from the warehouse. Assume
that an order can be shipped from several warehouses. Specify the foreign keys for this schema,
stating any assumptions you make. What other constraints can you think of for this database?
Step-by-step solution
Step 1 of 6
Relational Algebra
• Select: selects tuples; represented by the symbol σ.
• Project: projects columns; represented by ∏.
• Union: represented by ∪.
• Set difference: represented by –.
• Cartesian product: represented by Χ.
• Rename: represented by ρ.
Step 2 of 6
a.
Query to retrieve the order number and shipping date for all the orders that are shipped from
Warehouse "W2":
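The expression did not survive extraction; a plausible reconstruction, assuming SHIPMENT(Order#, Warehouse#, Ship_date), is:

```latex
\pi_{\text{Order\#},\,\text{Ship\_date}}\big(\sigma_{\text{Warehouse\#}='W2'}(\text{SHIPMENT})\big)
```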
Explanation:
• First select the SHIPMENT tuples whose Warehouse# = "W2", and then project the Order# and Ship_date fields.
• The query therefore returns the Order# and Ship_date of every order shipped from warehouse "W2".
Step 3 of 6
b.
Query to retrieve the order number and warehouse number for all the orders of customer named
"Jose Lopez":
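A plausible reconstruction of the missing expression, assuming CUSTOMER(Cust#, Cname, City) and ORDER(Order#, Odate, Cust#, Ord_amt):

```latex
\text{TEMP} \leftarrow \sigma_{\text{Cname}='Jose\ Lopez'}(\text{CUSTOMER}) \bowtie \text{ORDER}\\
\text{RESULT} \leftarrow \pi_{\text{Order\#},\,\text{Warehouse\#}}(\text{TEMP} \bowtie \text{SHIPMENT})
```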
Explanation:
• First select the orders supplied to the customer named "Jose Lopez", and then project the
Order# and Warehouse# columns.
• TEMP will give the details of the ORDER and the CUSTOMER table whose Cname is ‘Jose
Lopez’. The details of Jose Lopez will be the output.
• The above query will display only the Order# and Warehouse# and perform natural join on
SHIPMENT and the TEMP table whose Order# is same as the Order# number of TEMP.
Step 4 of 6
c.
Query to retrieve the Cname and total number of orders and average order amount of each
customer:
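A plausible reconstruction of the missing expression, using the rename and grouping notation of the chapter:

```latex
\text{TEMP}(\text{Cust\#},\,\text{No\_of\_orders},\,\text{Avg\_order\_amt}) \leftarrow\; {}_{\text{Cust\#}}\mathfrak{F}_{\,\text{COUNT Order\#},\ \text{AVG Ord\_amt}}(\text{ORDER})\\
\text{RESULT} \leftarrow \pi_{\text{Cname},\,\text{No\_of\_orders},\,\text{Avg\_order\_amt}}(\text{TEMP} \bowtie \text{CUSTOMER})
```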
Explanation:
• The RENAME operation produces the relation TEMP, with the list of attributes specified between parentheses.
• The aggregate functions COUNT and AVG compute the number of orders and the average order amount, grouped by customer.
• The above query displays only the customer name, number of orders, and average order amount, performing a natural join between CUSTOMER and TEMP on the common Cust# attribute.
Step 5 of 6
d.
Query to list the orders that are not shipped within 30 days of ordering:
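A plausible reconstruction of the missing expression:

```latex
\pi_{\text{Order\#},\,\text{Odate},\,\text{Cust\#},\,\text{Ord\_amt}}\big(\sigma_{\text{Ship\_date}-\text{Odate}>30}(\text{ORDER} \bowtie \text{SHIPMENT})\big)
```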
Explanation:
• First select the orders that were not shipped within thirty days, and then project Order#, Odate, Cust#, and Ord_amt.
• The number of days is calculated by subtracting the order date from the shipping date, after a natural join between ORDER and SHIPMENT on the common Order# attribute.
Step 6 of 6
e.
Query to list the order# of the orders shipped from the warehouses located in New York:
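A plausible reconstruction of the missing expression, assuming WAREHOUSE(Warehouse#, City):

```latex
\text{TEMP} \leftarrow \pi_{\text{Warehouse\#}}\big(\sigma_{\text{City}='New\ York'}(\text{WAREHOUSE})\big)\\
\text{RESULT} \leftarrow \pi_{\text{Order\#},\,\text{Warehouse\#}}(\text{SHIPMENT}) \div \text{TEMP}
```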
Explanation:
• TEMP will give the details of the WAREHOUSE whose City is ‘NEW YORK’. The details of
‘NEW YORK’ will be the output.
• The Order# and Warehouse# pairs projected from SHIPMENT are divided by TEMP.
• The DIVISION operator keeps only those orders that appear in SHIPMENT in combination with every warehouse in TEMP, that is, the orders shipped from all of the company’s New York warehouses.
Chapter 8, Problem 20E
Problem
Specify the following queries in relational algebra on the database schema given in Exercise:
a. Give the details (all attributes of trip relation) for trips that exceeded $2,000 in expenses.
c. Print the total trip expenses incurred by the salesperson with SSN = ‘234-56-7890’.
Exercise
Consider the following relations for a database that keeps track of business trips of salespersons
in a sales office:
A trip can be charged to one or more accounts. Specify the foreign keys for this schema, stating
any assumptions you make.
Step-by-step solution
Step 1 of 4
Step 2 of 4
Step 3 of 4
Step 4 of 4
c) Print the total trip expenses incurred by the salesman with SSN= ‘234-56-7890’.
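The expression is missing; a plausible reconstruction, assuming TRIP(SSN, …, Trip_id) and EXPENSE(Trip_id, Account#, Amount) as in the textbook schema, is:

```latex
\mathfrak{F}_{\,\text{SUM Amount}}\big(\sigma_{\text{SSN}='234\text{-}56\text{-}7890'}(\text{TRIP}) \bowtie \text{EXPENSE}\big)
```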
Chapter 8, Problem 21E
Problem
Specify the following queries in relational algebra on the database schema given in Exercise:
a. List the number of courses taken by all students named John Smith in Winter 2009 (i.e.,
Quarter=W09).
b. Produce a list of textbooks (include Course#, Book_isbn, Book_title) for courses offered by the
‘CS’ department that have used more than two books.
c. List any department that has all its adopted books published by ‘Pearson Publishing’.
Exercise
Consider the following relations for a database that keeps track of student enrollment in courses
and the books adopted for each course:
Specify the foreign keys for this schema, stating any assumptions you make.
Step-by-step solution
Step 1 of 3
a.
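The expression did not survive extraction; a plausible reconstruction, assuming STUDENT(ssn, Name, Major, Bdate) and ENROLL(ssn, Course#, Quarter, Grade), is:

```latex
\text{JS} \leftarrow \sigma_{\text{Name}='John\ Smith'}(\text{STUDENT})\\
\text{RESULT} \leftarrow \mathfrak{F}_{\,\text{COUNT Course\#}}\big(\text{JS} \bowtie \sigma_{\text{Quarter}='W09'}(\text{ENROLL})\big)
```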
Explanation:
• This query gives the number of courses taken by the student named ‘John Smith’ in Winter
2009 (Quarter = ‘W09’).
• Here, ‘Π’ denotes the projection operation and ‘σ’ denotes the selection operation.
Step 2 of 3
b.
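A plausible reconstruction of the missing expression, assuming COURSE(Course#, Cname, Dept), BOOK_ADOPTION(Course#, Quarter, Book_isbn), and TEXT(Book_isbn, Book_title, Publisher, Author):

```latex
\text{CS} \leftarrow \sigma_{\text{Dept}='CS'}(\text{COURSE}) \bowtie \text{BOOK\_ADOPTION}\\
\text{CNT}(\text{Course\#},\,\text{Books}) \leftarrow\; {}_{\text{Course\#}}\mathfrak{F}_{\,\text{COUNT Book\_isbn}}(\text{CS})\\
\text{RESULT} \leftarrow \pi_{\text{Course\#},\,\text{Book\_isbn},\,\text{Book\_title}}\big(\sigma_{\text{Books}>2}(\text{CNT}) \bowtie \text{CS} \bowtie \text{TEXT}\big)
```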
Explanation:
• The above query will retrieve the list of textbooks for CS course with the use of natural join.
• The set operator in this query combines the rows produced by the two sub-queries into a single result.
Step 3 of 3
c.
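A plausible reconstruction of the missing expression (departments none of whose adopted books come from another publisher):

```latex
\text{ALL\_DEPTS} \leftarrow \pi_{\text{Dept}}(\text{COURSE})\\
\text{OTHER} \leftarrow \pi_{\text{Dept}}\big(\sigma_{\text{Publisher}\,<>\,'Pearson\ Publishing'}(\text{COURSE} \bowtie \text{BOOK\_ADOPTION} \bowtie \text{TEXT})\big)\\
\text{RESULT} \leftarrow \text{ALL\_DEPTS} - \text{OTHER}
```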
Explanation:
• The above query will list the departments which have all the adopted books published by
“Pearson publishing”.
• In this query ‘<>’ operator is used for “not equal to” operation.
Chapter 8, Problem 22E
Problem
Consider the two tables T1 and T2 shown in Figure 8.15. Show the results of the following
operations:
Step-by-step solution
Step 1 of 7
TABLE T1 TABLE T2
P Q R A B C
10 a 5 10 b 6
15 b 8 25 c 3
25 a 6 10 b 5
Step 2 of 7
P Q R A B C
10 a 5 10 b 6
10 a 5 10 b 5
25 a 6 25 c 3
Step 3 of 7
P Q R A B C
15 b 8 10 b 6
15 b 8 10 b 5
Step 4 of 7
c) The operation is “LEFT OUTER JOIN”. It produces the tuples that are in
the first (left) relation T1, using the join condition T1.P = T2.A. If no matching tuple is found in
T2, then the T2 attributes are filled with NULL values. The following table is the result of the “LEFT
OUTER JOIN” operation.
P Q R A B C
10 a 5 10 b 6
10 a 5 10 b 5
15 b 8 NULL NULL NULL
25 a 6 25 c 3
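As a cross-check (not part of the textbook solution), a minimal Python sketch of a left outer join with the condition T1.P = T2.A reproduces the NULL-padded row for (15, b, 8):

```python
# Sample tables from Figure 8.15
T1 = [(10, 'a', 5), (15, 'b', 8), (25, 'a', 6)]
T2 = [(10, 'b', 6), (25, 'c', 3), (10, 'b', 5)]

def left_outer_join(left, right, cond):
    """Concatenate each left tuple with every matching right tuple;
    keep unmatched left tuples, padding the right side with None (SQL NULL)."""
    result = []
    for l in left:
        matches = [l + r for r in right if cond(l, r)]
        result.extend(matches if matches else [l + (None, None, None)])
    return result

# Join condition: T1.P = T2.A (first column of each tuple)
rows = left_outer_join(T1, T2, lambda l, r: l[0] == r[0])
```

Running this yields four rows: two matches for (10, a, 5), one NULL-padded row for (15, b, 8), and one match for (25, a, 6).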
Step 5 of 7
d) The operation is “RIGHT OUTER JOIN”. It produces the tuples that are in
the second (right) relation T2, using the join condition T1.Q = T2.B. If no matching tuple is found
in T1, then the T1 attributes are filled with NULL values. The following table is the result of the “RIGHT
OUTER JOIN” operation.
P Q R A B C
15 b 8 10 b 6
NULL NULL NULL 25 c 3
15 b 8 10 b 5
Step 6 of 7
e) The operation is “UNION”. It produces a relation that includes all the tuples that are
in T1 or T2 or both T1 and T2. The operation is possible since T1 and T2 are union compatible.
Following table is the result of the “UNION” operation.
P Q R
10 a 5
15 b 8
25 a 6
10 b 6
25 c 3
10 b 5
Step 7 of 7
P Q R A B C
10 a 5 10 b 5
Chapter 8, Problem 23E
Problem
Specify the following queries in relational algebra on the database schema in Exercise:
a. For the salesperson named ‘Jane Doe’, list the following information for all the cars she sold:
Serial#, Manufacturer, Sale_price.
c. Consider the NATURAL JOIN operation between SALESPERSON and SALE. What is the
meaning of a left outer join for these tables (do not change the order of relations)? Explain with
an example.
d. Write a query in relational algebra involving selection and one set operation and say in words
what the query does.
Exercise
Consider the following relations for a database that keeps track of automobile sales in a car
dealership (OPTION refers to some optional equipment installed on an automobile):
First, specify the foreign keys for this schema, stating any assumptions you make. Next, populate
the relations with a few sample tuples, and then give an example of an insertion in the SALE and
SALESPERSON relations that violates the referential integrity constraints and of another
insertion that does not.
Step-by-step solution
Step 1 of 4
(a)
Step 2 of 4
(b)
Step 3 of 4
(c)
The meaning of the LEFT OUTER JOIN operation between SALESPERSON and SALE is that all
records for which the join condition evaluates to true are displayed, together with all records from
SALESPERSON that match no SALE record; for those unmatched records, the attribute values
corresponding to the SALE table are set to NULL.
For example, with the sample tuples:
a. ID_1, ABC, 9999999
b. ID_2, DEF, 8888888
a) ID_1, 111, 2-08-2008, 500000
Step 4 of 4
(d)
This query gives information about the Doe couple, who happen to work at the same place.
Chapter 8, Problem 24E
Problem
Specify queries a, b, c, e, f, i, and j of Exercise 8.16 in both tuple and domain relational calculus.
Specify the following queries on the COMPANY relational database schema shown in Figure 5.5
using the relational operators discussed in this chapter. Also show the result of each query as it
would apply to the database state in Figure 5.6.
a. Retrieve the names of all employees in department 5 who work more than 10 hours per week
on the ProductX project.
b. List the names of all employees who have a dependent with the same first name as
themselves.
c. Find the names of all employees who are directly supervised by ‘Franklin Wong’.
d. For each project, list the project name and the total hours per week (by all employees) spent
on that project.
f. Retrieve the names of all employees who do not work on any project.
g. For each department, retrieve the department name and the average salary of all employees
working in that department.
i. Find the names and addresses of all employees who work on at least one project located in
Houston but whose department has no location in Houston.
j. List the last names of all department managers who have no dependents.
Step-by-step solution
Step 1 of 10
The tuple relational calculus is based on the use of tuple variables. A tuple variable usually
“ranges over” a particular database relation.
In the domain relational calculus, by contrast, variables take their values from domains of
attributes rather than from tuples of relations.
Step 2 of 10
a.
• Select the LNAME and FNAME attributes of the EMPLOYEE relation where DNO = 5 and the
employee works more than 10 hours per week (HOURS > 10) on the ProductX project.
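The calculus expression itself is missing; a plausible tuple-relational-calculus sketch, assuming the COMPANY schema of Figure 5.5, is:

```latex
\{a.\text{Lname},\,a.\text{Fname} \mid \text{EMPLOYEE}(a) \wedge a.\text{Dno}=5 \wedge (\exists b)(\exists c)(\text{PROJECT}(b) \wedge \text{WORKS\_ON}(c)\\
\quad\wedge\; b.\text{Pname}='ProductX' \wedge a.\text{Ssn}=c.\text{Essn} \wedge c.\text{Pno}=b.\text{Pnumber} \wedge c.\text{Hours}>10)\}
```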
Explanation:
• In the provided tuple relational calculus, the EMPLOYEE relation ranges over the variable a, PROJECT over b, and WORKS_ON over c.
• There is one free variable, a, which appears to the left of the bar (|).
• The attributes listed before the bar (|) are retrieved for all tuples that satisfy the conditions given after the bar.
• The conditions EMPLOYEE(a) and WORKS_ON(c) specify the range relations for a and c. The condition a.Ssn = c.Essn is a join condition.
• The domain relational calculus version needs ten variables for the EMPLOYEE relation, one per attribute (q, r, s, …, z). Only q and s are free, because they appear to the left of the bar.
• First, the requested attributes, the employee’s first and last name, are specified by the free domain variables q and s for the name fields.
• A condition relating two domain variables from different relations, such as t = e, is a join condition.
Step 3 of 10
b.
• Select the LNAME, FNAME attributes of the EMPLOYEE relation who have a dependent with
the same first name as themselves.
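A plausible reconstruction of the missing tuple-calculus expression:

```latex
\{a.\text{Lname},\,a.\text{Fname} \mid \text{EMPLOYEE}(a) \wedge (\exists b)(\text{DEPENDENT}(b) \wedge a.\text{Ssn}=b.\text{Essn} \wedge a.\text{Fname}=b.\text{Dependent\_name})\}
```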
Explanation:
• In the provided tuple relational calculus, the EMPLOYEE relation ranges over a and DEPENDENT over b.
• There is one free variable, a, which appears to the left of the bar (|).
• The attributes listed before the bar (|) are retrieved for all tuples that satisfy the conditions given after the bar.
• The conditions EMPLOYEE(a) and DEPENDENT(b) specify the range relations for a and b. The condition a.Ssn = b.Essn is a join condition.
• The domain relational calculus version needs ten variables for the EMPLOYEE relation (q, r, s, …, z). Only q and s are free, because they appear to the left of the bar.
• First, the requested attributes, the employee’s first and last name, are specified by the free domain variables q and s for the name fields.
• Conditions relating two domain variables, such as a = t and b = q, are join conditions.
Step 4 of 10
c.
• Select the LNAME, FNAME attributes of the EMPLOYEE relation to find the names of
employees that are directly supervised by 'Franklin Wong'.
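A plausible reconstruction of the missing expression:

```latex
\{a.\text{Lname},\,a.\text{Fname} \mid \text{EMPLOYEE}(a) \wedge (\exists b)(\text{EMPLOYEE}(b) \wedge b.\text{Fname}='Franklin' \wedge b.\text{Lname}='Wong' \wedge a.\text{Super\_ssn}=b.\text{Ssn})\}
```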
Explanation:
• In the provided tuple relational calculus, the EMPLOYEE relation ranges over both a and b, using a self-join.
• There is one free variable, a, which appears to the left of the bar (|).
• The attributes listed before the bar (|) are retrieved for all tuples that satisfy the conditions given after the bar.
• The conditions EMPLOYEE(a) and EMPLOYEE(b) specify the range relations for a and b. The condition a.Super_ssn = b.Ssn is the self-join condition.
• The domain relational calculus version needs ten variables for the EMPLOYEE relation (q, r, s, …, z). Only q and s are free, because they appear to the left of the bar.
• First, the requested attributes, the employee’s first and last name, are specified by the free domain variables q and s for the name fields.
• A condition relating two domain variables, such as y = d, is a join condition, while FNAME = 'Franklin' AND LNAME = 'Wong' is a selection condition.
Step 5 of 10
e.
• Select the LNAME, FNAME attributes of the EMPLOYEE relation to retrieve the names of
employees who work on every project.
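A plausible reconstruction of the missing expression, using a universal quantifier:

```latex
\{a.\text{Lname},\,a.\text{Fname} \mid \text{EMPLOYEE}(a) \wedge (\forall b)\big(\neg\text{PROJECT}(b)\\
\quad\vee\;(\exists c)(\text{WORKS\_ON}(c) \wedge c.\text{Essn}=a.\text{Ssn} \wedge c.\text{Pno}=b.\text{Pnumber})\big)\}
```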
Explanation:
• In the provided tuple relational calculus, the EMPLOYEE relation ranges over a, and b ranges over PROJECT under a universal (FORALL) quantifier.
• There is one free variable, a, which appears to the left of the bar (|).
• The attributes listed before the bar (|) are retrieved for all tuples that satisfy the conditions given after the bar.
• The conditions EMPLOYEE(a) and PROJECT(b) specify the range relations for a and b, connected by the conditions PNUMBER = PNO and ESSN = SSN.
• The domain relational calculus version needs ten variables for the EMPLOYEE relation (q, r, s, …, z). Only q and s are free, because they appear to the left of the bar.
• First, the requested attributes, the employee’s first and last name, are specified by the free domain variables q and s for the name fields.
• A condition relating two domain variables, such as e = t with PNUMBER = PNO and ESSN = SSN, is a join condition.
Step 6 of 10
f.
• Select the LNAME, FNAME attributes of the EMPLOYEE relation to retrieve the names of
employees who do not work on any project.
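A plausible reconstruction of the missing expression, using a negated existential quantifier:

```latex
\{a.\text{Lname},\,a.\text{Fname} \mid \text{EMPLOYEE}(a) \wedge \neg(\exists b)(\text{WORKS\_ON}(b) \wedge a.\text{Ssn}=b.\text{Essn})\}
```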
Step 7 of 10
Explanation:
• In the provided tuple relational calculus, the EMPLOYEE relation ranges over a and WORKS_ON over b.
• There is one free variable, a, which appears to the left of the bar (|).
• The attributes listed before the bar (|) are retrieved for all tuples that satisfy the conditions given after the bar.
• The conditions EMPLOYEE(a) and WORKS_ON(b) specify the range relations for a and b, with the condition ESSN = SSN.
• The domain relational calculus version needs ten variables for the EMPLOYEE relation (q, r, s, …, z). Only q and s are free, because they appear to the left of the bar.
• First, the requested attributes, the employee’s first and last name, are specified by the free domain variables q and s for the name fields.
• A condition relating two domain variables, such as a = t with ESSN = SSN, is a join condition.
Step 8 of 10
i.
• Select the LNAME, FNAME, and ADDRESS attributes of the EMPLOYEE relation for employees
who work on at least one project located in Houston but whose department has no location in Houston.
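A plausible reconstruction of the missing expression:

```latex
\{a.\text{Lname},\,a.\text{Fname},\,a.\text{Address} \mid \text{EMPLOYEE}(a)\\
\quad\wedge\;(\exists b)(\exists c)(\text{PROJECT}(b) \wedge \text{WORKS\_ON}(c) \wedge b.\text{Plocation}='Houston' \wedge a.\text{Ssn}=c.\text{Essn} \wedge c.\text{Pno}=b.\text{Pnumber})\\
\quad\wedge\;\neg(\exists d)(\text{DEPT\_LOCATIONS}(d) \wedge a.\text{Dno}=d.\text{Dnumber} \wedge d.\text{Dlocation}='Houston')\}
```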
Explanation:
• In the provided tuple relational calculus, the EMPLOYEE relation ranges over a, PROJECT over b, and WORKS_ON over c.
• There is one free variable, a, which appears to the left of the bar (|).
• The attributes listed before the bar (|) are retrieved for all tuples that satisfy the conditions given after the bar.
• The conditions EMPLOYEE(a) and WORKS_ON(c) specify the range relations for a and c. The conditions a.Ssn = c.Essn and PNO = PNUMBER are join conditions, and PLOCATION = 'Houston' is a selection condition.
• The domain relational calculus version needs ten variables for the EMPLOYEE relation (q, r, s, …, z). Only q, s, and v are free, because they appear to the left of the bar.
• First, the requested attributes, the employee’s name and address, are specified by the free domain variables q, s, and v for the name and address fields.
• Conditions relating domain variables, such as t = e with e.Ssn = w.Essn, PNO = PNUMBER, and PLOCATION = 'Houston', are join conditions.
Step 9 of 10
j.
• Select the LNAME attribute of the EMPLOYEE relation of department managers who have no
dependents.
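A plausible reconstruction of the missing expression:

```latex
\{a.\text{Lname} \mid \text{EMPLOYEE}(a) \wedge (\exists b)(\text{DEPARTMENT}(b) \wedge a.\text{Ssn}=b.\text{Mgr\_ssn})\\
\quad\wedge\;\neg(\exists c)(\text{DEPENDENT}(c) \wedge a.\text{Ssn}=c.\text{Essn})\}
```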
Explanation:
• In the provided Tuple Relational calculus, the EMPLOYEE considers as the a, the
DEPARTMENT considers as the b and the DEPENDENT considers as the c.
Step 10 of 10
In the above tuple relational calculus, there is one free variable, a, which appears to the left of the bar (|).
• The attributes listed before the bar (|) are retrieved for all tuples that satisfy the conditions given after the bar.
• The conditions EMPLOYEE(a) and DEPARTMENT(b) specify the range relations for a and b. The conditions a.Ssn = b.Mgrssn and SSN = ESSN are join conditions.
• The domain relational calculus version needs ten variables for the EMPLOYEE relation (q, r, s, …, z). Only s is free, because it appears to the left of the bar.
• First, the requested attribute, the manager’s last name, is specified by the free domain variable s.
• Conditions relating domain variables, such as e = t with e.Ssn = d.Mgrssn and SSN = ESSN, are join conditions.
Chapter 8, Problem 25E
Problem
Specify queries a, b, c, and d of Exercise 1 in both tuple and domain relational calculus.
Exercise 1
Consider the AIRLINE relational database schema shown in Figure, which was described in
Exercise 2. Specify the following queries in relational algebra:
a. For each flight, list the flight number, the departure airport for the first leg of the flight, and the
arrival airport for the last leg of the flight.
b. List the flight numbers and weekdays of all flights or flight legs that depart from Houston
Intercontinental Airport (airport code ‘iah’) and arrive in Los Angeles International Airport (airport
code ‘lax’).
c. List the flight number, departure airport code, scheduled departure time, arrival airport code,
scheduled arrival time, and weekdays of all flights or flight legs that depart from some airport in
the city of Houston and arrive at some airport in the city of Los Angeles.
e. Retrieve the number of available seats for flight number ‘col97’ on ‘2009-10-09’.
Exercise 2
Consider the AIRLINE relational database schema shown in Figure, which describes a database
for airline flight information. Each FLIGHT is identified by a Flight_number, and consists of one or
more FLIGHT_LEGs with Leg_numbers 1, 2, 3, and so on. Each FLIGHT_LEG has scheduled
arrival and departure times, airports, and one or more LEG_INSTANCEs— one for each Date on
which the flight travels. FAREs are kept for each FLIGHT. For each FLIGHT_LEG instance,
SEAT_RESERVATIONs are kept, as are the AIRPLANE used on the leg and the actual arrival
and departure times and airports. An AIRPLANE is identified by an Airplane_id and is of a
particular AIRPLANE_TYPE. CAN_LAND relates AIRPLANE_TYPEs to the AIRPORTs at which
they can land. An AIRPORT is identified by an Airport_code. Consider an update for the AIRLINE
database to enter a reservation on a particular flight or flight leg on a given date.
c. Which of these constraints are key, entity integrity, and referential integrity constraints, and
which are not?
d. Specify all the referential integrity constraints that hold on the schema shown in Figure.
Step-by-step solution
Step 1 of 5
a.
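The expression itself is missing; a plausible tuple-calculus sketch (the last leg is characterized as the one with no higher leg number) is:

```latex
\{f.\text{Flight\_number},\,d.\text{Departure\_airport\_code},\,a.\text{Arrival\_airport\_code} \mid \text{FLIGHT}(f) \wedge \text{FLIGHT\_LEG}(d) \wedge \text{FLIGHT\_LEG}(a)\\
\quad\wedge\; d.\text{Flight\_number}=f.\text{Flight\_number} \wedge a.\text{Flight\_number}=f.\text{Flight\_number} \wedge d.\text{Leg\_number}=1\\
\quad\wedge\;\neg(\exists x)(\text{FLIGHT\_LEG}(x) \wedge x.\text{Flight\_number}=f.\text{Flight\_number} \wedge x.\text{Leg\_number}>a.\text{Leg\_number})\}
```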
• In the above tuple relational calculus there are two free variables, f and l, which appear to the left of the bar (|).
• The attributes listed before the bar (|) are retrieved for all tuples that satisfy the conditions given after the bar.
• The conditions FLIGHT(f) and FLIGHT_LEG(l) specify the range relations for f and l. The condition f.Fnumber = l.Flight_number is a join condition, whose purpose is similar to the INNER JOIN operation.
• The domain relational calculus version needs ten variables for the FLIGHT relation (q, r, s, …, z). Only q and v are free, because they appear to the left of the bar.
• First, the requested attributes are specified: the flight number, the departure airport for the first leg of the flight, and the arrival airport for the last leg of the flight.
• A condition relating two domain variables from different relations, such as m = z, is a join condition.
Step 2 of 5
b.
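A plausible reconstruction of the missing expression:

```latex
\{l.\text{Flight\_number},\,f.\text{Weekdays} \mid \text{FLIGHT\_LEG}(l) \wedge \text{FLIGHT}(f) \wedge f.\text{Flight\_number}=l.\text{Flight\_number}\\
\quad\wedge\; l.\text{Departure\_airport\_code}='iah' \wedge l.\text{Arrival\_airport\_code}='lax'\}
```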
In the provided tuple relational calculus, the FLIGHT relation ranges over f and FLIGHT_LEG over l.
• In the created tuple relational calculus there is a single free variable, f, which appears to the left of the bar (|).
• The attributes listed before the bar (|) are retrieved for all tuples that satisfy the conditions given after the bar.
• The conditions FLIGHT(f) and FLIGHT_LEG(l) specify the range relations for f and l. The condition f.Fnumber = l.Flight_number is a join condition, whose purpose is similar to the INNER JOIN operation.
• The domain relational calculus version needs ten variables (q, r, s, …, z); only u and v are free, because they appear to the left of the bar.
• First, the requested attributes are specified: the flight numbers and weekdays of all flights or flight legs that depart from Houston Intercontinental Airport and arrive at Los Angeles International Airport.
• The values assigned to the variables q, r, s, …, z become a tuple of the FLIGHT_LEG relation, with q (Departure_airport_code) and r (Arrival_airport_code) equal to ‘iah’ and ‘lax’, respectively.
• Then there is a condition for selecting a tuple after the bar (|).
• A condition relating two domain variables from different relations, such as m = z, is a join condition.
Step 3 of 5
c.
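A plausible reconstruction of the missing expression, using the AIRPORT relation to test the cities:

```latex
\{l.\text{Flight\_number},\,l.\text{Departure\_airport\_code},\,l.\text{Scheduled\_departure\_time},\,l.\text{Arrival\_airport\_code},\,l.\text{Scheduled\_arrival\_time},\,f.\text{Weekdays}\\
\quad\mid \text{FLIGHT\_LEG}(l) \wedge \text{FLIGHT}(f) \wedge f.\text{Flight\_number}=l.\text{Flight\_number}\\
\quad\wedge\;(\exists p)(\exists q)(\text{AIRPORT}(p) \wedge \text{AIRPORT}(q) \wedge p.\text{Airport\_code}=l.\text{Departure\_airport\_code}\\
\quad\wedge\; q.\text{Airport\_code}=l.\text{Arrival\_airport\_code} \wedge p.\text{City}='Houston' \wedge q.\text{City}='Los\ Angeles')\}
```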
In the provided tuple relational calculus, the FLIGHT relation ranges over f and FLIGHT_LEG over l.
• In the created tuple relational calculus there are two free variables, f and l, which appear to the left of the bar (|).
• The attributes listed before the bar (|) are retrieved for all tuples that satisfy the conditions given after the bar.
• A condition such as l.Departure_airport_code = ‘iah’ and l.Arrival_airport_code = ‘lax’ is a selection condition, similar to the SELECT operation in relational algebra.
Step 4 of 5
The conditions FLIGHT(f) and FLIGHT_LEG(l) specify the range relations for f and l. The
condition f.Fnumber = l.Flight_number is a join condition, whose purpose is similar to the INNER
JOIN operation.
• The domain relational calculus version needs ten variables for the FLIGHT relation and five for FLIGHT_LEG; of the fifteen variables (k, l, …, q, r, s, …, z), only u, l, m, n, o, and v are free, because they appear to the left of the bar.
• The values assigned to the variables become tuples of the FLIGHT and FLIGHT_LEG relations, with q (Departure_airport_code) and r (Arrival_airport_code) equal to ‘iah’ and ‘lax’, respectively.
• Then there is a condition for selecting a tuple after the bar (|).
• A condition relating two domain variables from different relations, such as m = z, is a join condition.
Step 5 of 5
d.
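A plausible reconstruction of the missing expression:

```latex
\{f.\text{Flight\_number},\,r.\text{Fare\_code},\,r.\text{Amount},\,r.\text{Restrictions} \mid \text{FLIGHT}(f) \wedge \text{FARE}(r)\\
\quad\wedge\; r.\text{Flight\_number}=f.\text{Flight\_number} \wedge f.\text{Flight\_number}='col197'\}
```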
• In the created tuple relational calculus there are two free variables, f and r, which appear to the left of the bar (|).
• The attributes listed before the bar (|) are retrieved for all tuples that satisfy the conditions given after the bar.
• The conditions FLIGHT(f) and FARE(r) specify the range relations for f and r. The condition r.Fnumber = f.Flight_number is a join condition, whose purpose is similar to the INNER JOIN operation.
• The domain relational calculus version needs ten variables for the FLIGHT relation and five for FARE; of the fifteen variables, only s, t, u, v, and m are free, because they appear to the left of the bar.
• First, the requested attributes are specified: the flight number, Fare_code, Amount, Restrictions, and Airline for all fare information for flight number ‘col197’.
• The values assigned to the variables become tuples of the FARE and FLIGHT relations, with q (Flight_number) equal to ‘col197’.
• Then there is a condition for selecting a tuple after the bar (|).
• A condition relating two domain variables from different relations, such as m = z, is a join condition.
Chapter 8, Problem 26E
Problem
Specify queries c, d, and f of Exercise in both tuple and domain relational calculus.
Exercise
Consider the LIBRARY relational database schema shown in Figure, which is used to keep track
of books, borrowers, and book loans. Referential integrity constraints are shown as directed arcs
in Figure, as in the notation of Figure 5.7. Write down relational expressions for the following
queries:
a. How many copies of the book titled The Lost Tribe are owned by the library branch whose
name is ‘Sharpstown’?
b. How many copies of the book titled The Lost Tribe are owned by each library branch?
c. Retrieve the names of all borrowers who do not have any books checked out.
d. For each book that is loaned out from the Sharpstown branch and whose Due_date is today,
retrieve the book title, the borrower’s name, and the borrower’s address.
e. For each library branch, retrieve the branch name and the total number of books loaned out
from that branch.
f. Retrieve the names, addresses, and number of books checked out for all borrowers who have
more than five books checked out.
g. For each book authored (or coauthored) by Stephen King, retrieve the title and the number of
copies owned by the library branch whose name is Central.
Step-by-step solution
Step 1 of 3
c.
Following is the relational expression to retrieve the names of the borrowers who have no books
checked out:
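A plausible reconstruction of the missing tuple-calculus expression:

```latex
\{b.\text{Name} \mid \text{BORROWER}(b) \wedge \neg(\exists l)(\text{BOOK\_LOANS}(l) \wedge b.\text{Card\_no}=l.\text{Card\_no})\}
```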
Explanation:
• In the provided tuple relational calculus, the BORROWER relation ranges over b and BOOK_LOANS over l.
• The free variables appear to the left of the bar (|).
• The attributes listed before the bar (|) are retrieved for all tuples that satisfy the conditions given after the bar.
• The conditions Borrower(b) and Book_Loans(l) specify the range relations for b and l. The condition b.Card_No = l.Card_No is a join condition.
• The domain relational calculus version needs ten variables for the BORROWER relation (q, r, s, …, z). Only q is free, because it appears to the left of the bar.
• First, the requested attribute, the name of the borrower, is specified by the free domain variable q for the Name field.
• A condition relating two domain variables from different relations, such as m = z, is a join condition.
Step 2 of 3
d.
Following is the relational expression to retrieve the book title and the borrower’s name and
address for each book that is loaned out from the branch whose name is ‘Sharpstown’ and
whose due date is today:
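A plausible reconstruction of the missing tuple-calculus expression:

```latex
\{x.\text{Title},\,b.\text{Name},\,b.\text{Address} \mid \text{BOOK}(x) \wedge \text{BORROWER}(b) \wedge (\exists l)(\exists r)(\text{BOOK\_LOANS}(l) \wedge \text{LIBRARY\_BRANCH}(r)\\
\quad\wedge\; l.\text{Book\_id}=x.\text{Book\_id} \wedge l.\text{Card\_no}=b.\text{Card\_no} \wedge l.\text{Branch\_id}=r.\text{Branch\_id}\\
\quad\wedge\; r.\text{Branch\_name}='Sharpstown' \wedge l.\text{Due\_date}=\text{today})\}
```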
Explanation:
• In the provided tuple relational calculus, the BORROWER relation ranges over b and BOOK_LOANS over c.
• The free variables appear to the left of the bar (|).
• The attributes listed before the bar (|) are retrieved for all tuples that satisfy the conditions given after the bar.
• The conditions Borrower(b) and Book_Loans(c) specify the range relations for b and c. The condition Branch_name = “Sharpstown” is a selection condition, and c.Card_No = b.Card_No is a join condition.
• The domain relational calculus version needs sixteen variables; only a, e, and f are free, because they appear to the left of the bar.
• First, the requested attributes are specified: the title from BOOK, and the name and address fields from BORROWER.
• The values assigned to the variables i, j, k, l, m become a tuple of the Book_Loans relation, with i (Card_no) equal to o (Card_no) and Branch_name = “Sharpstown”.
• Then there is a condition for selecting a tuple after the bar (|).
• Conditions relating two domain variables, such as i = o and j = f, are join conditions.
Step 3 of 3
f.
Following is the relational expression to retrieve the name, address and the total number of
books for all borrowers who have more than five books checked out:
Explanation:
• In the provided Tuple Relational calculus, the Borrower considers as the b and the Book_Loans
consider as the a.
• In the above tuple relational calculus, there are two free variable b and a and these appear to
the left of the bar (|).
• The variables are retrieved which come before the bar (|), for all those tuples which satisfy the
conditions provided after the bar.
• The conditions Borrower (b) and Book_Loans (a) specify the range relations for b and a. The
condition b.Card_No = l.Card_No is a join condition and retrieve the total number of books for all
borrowers using count() function.
Explanation:
• Ten variables are needed for the BORROWER relation: q, r, s, …, z. Only q, s, and v are
free because they appear to the left of the bar.
• First, the requested attributes are specified: the free domain variable q for the Name field
of the borrower, s for the Address field, and v for the total number of books.
• A condition for selecting a tuple appears after the bar (|).
• A condition relating two domain variables from different relations, m = z, is a join condition,
and the count must be greater than 5.
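Relational calculus as defined has no aggregate functions, so the COUNT comparison above is easiest to check in SQL. Below is a minimal sqlite3 sketch; the simplified BORROWER and BOOK_LOANS tables and their sample rows are illustrative assumptions, not data from the book.

```python
import sqlite3

# Hypothetical miniature of the library tables; only the columns the
# query touches are modeled.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE BORROWER(Card_No TEXT, Name TEXT, Address TEXT)")
con.execute("CREATE TABLE BOOK_LOANS(Card_No TEXT, Book_Id TEXT)")
con.executemany("INSERT INTO BORROWER VALUES (?, ?, ?)",
                [("C1", "Ann", "1 Elm St"), ("C2", "Bob", "2 Oak St")])
# Ann has six loans, Bob has one.
con.executemany("INSERT INTO BOOK_LOANS VALUES (?, ?)",
                [("C1", f"B{i}") for i in range(6)] + [("C2", "B9")])

# Name, address, and loan count for borrowers with more than five loans.
rows = con.execute("""
    SELECT b.Name, b.Address, COUNT(*) AS total
    FROM BORROWER b JOIN BOOK_LOANS l ON l.Card_No = b.Card_No
    GROUP BY b.Card_No
    HAVING COUNT(*) > 5
""").fetchall()
print(rows)  # [('Ann', '1 Elm St', 6)]
```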
Chapter 8, Problem 27E
Problem
In a tuple relational calculus query with n tuple variables, what would be the typical minimum
number of join conditions? Why? What is the effect of having a smaller number of join
conditions?
Step-by-step solution
Step 1 of 1
In a tuple relational calculus query with n tuple variables, there should be at least (n − 1) join
conditions; otherwise, the Cartesian product of one range relation with the others would be
taken. With fewer join conditions, the result contains such a Cartesian product, which usually
does not make sense.
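The effect of dropping a join condition can be seen with tiny in-memory relations: with n = 3 range relations, n − 1 = 2 join conditions link every tuple, while only one condition leaves the third relation combined by Cartesian product. The relations below are toy examples, not data from the book.

```python
from itertools import product

borrower = [("B1", "Ann"), ("B2", "Bob")]         # (Card_No, Name)
book_loans = [("B1", "T1"), ("B2", "T2")]         # (Card_No, Title_id)
book = [("T1", "DB Basics"), ("T2", "Calculus")]  # (Title_id, Title)

# n = 3 relations with n - 1 = 2 join conditions: a proper join.
joined = [(b, l, k) for b, l, k in product(borrower, book_loans, book)
          if b[0] == l[0] and l[1] == k[0]]

# Only one join condition: BOOK stays unjoined, so the result is a
# Cartesian product with BOOK.
partial = [(b, l, k) for b, l, k in product(borrower, book_loans, book)
           if b[0] == l[0]]

print(len(joined))   # 2 rows: one per loan
print(len(partial))  # 4 rows: 2 loans x 2 books
```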
Chapter 8, Problem 28E
Problem
Rewrite the domain relational calculus queries that followed Q0 in Section 8.7 in the style of the
abbreviated notation of Q0A, where the objective is to minimize the number of domain variables
by writing constants in place of variables wherever possible.
Step-by-step solution
Step 1 of 5
Step 2 of 5
This condition relates two domain variables, each ranging over an attribute from a different
relation: m = 2 in Q1.
Step 3 of 5
Step 4 of 5
Step 5 of 5
The remaining queries, Q6 and Q7, are unchanged, since they contain no constants.
Chapter 8, Problem 29E
Problem
Consider this query: Retrieve the Ssns of employees who work on at least those projects on
which the employee with Ssn = 123456789 works. This may be stated as (FORALL x) (IF P
THEN Q), where
Step-by-step solution
Step 1 of 1
Chapter 8, Problem 30E
Problem
Show how you can specify the following relational algebra operations in both tuple and domain
relational calculus.
a. σA=C(R(A, B, C))
c. R(A, B, C) * S(C, D, E)
d. R(A, B, C) ⋃ S(A, B, C)
e. P(A, B, C) ⋂ S(A, B, C)
f. P(A, B, C) − S(A, B, C)
g. R(A, B, C) ×S(D, E, F)
h. P(A, B) ÷ S(A)
Step-by-step solution
Step 1 of 7
(a)
Step 2 of 7
(b)
Step 3 of 7
(c)
Step 4 of 7
(d)
Step 5 of 7
(e)
(f)
Step 6 of 7
(g)
Step 7 of 7
(h)
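Since the calculus expressions for these operations appear only as images here, a small Python sketch of the trickiest one, DIVISION (part h), may help. The helper below is an illustrative implementation, not the book's notation: it returns the B values of P(A, B) that are paired with every A value appearing in S(A).

```python
def divide(dividend, divisor):
    """Relational division P(A, B) / S(A): return the set of B values
    that occur in P together with EVERY A value listed in S."""
    s_values = {a for (a,) in divisor}
    result = set()
    for _, b in dividend:
        # b qualifies only if (a, b) is in P for all a in S.
        if all((a, b) in dividend for a in s_values):
            result.add(b)
    return result

# Invented sample tuples: b1 is paired with both a1 and a2, b2 only with a1.
P = {("a1", "b1"), ("a2", "b1"), ("a1", "b2")}
S = {("a1",), ("a2",)}
print(divide(P, S))  # {'b1'}
```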
Chapter 8, Problem 31E
Problem
Suggest extensions to the relational calculus so that it may express the following types of
operations that were discussed in Section 8.4: (a) aggregate functions and grouping; (b) OUTER
JOIN operations; (c) recursive closure queries.
Step-by-step solution
Step 1 of 3
1. We can define a relation AGGREGATE with attributes Sum, Minimum, Maximum, Average,
Count, etc. Using such a relation in a query, we can, for example, get the sum of the salaries of
all employees, and we can include similar functions for the other aggregate operations.
Step 2 of 3
2. For OUTER JOIN, a special operation, say with the symbol δ, can be used.
Step 3 of 3
So, by specifying that an operation is a recursive closure, we may instruct the system to
compute the result of the query iteratively.
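As a concrete sketch of what such a recursive-closure extension must compute, the following function finds the transitive closure of a binary relation by joining the relation with itself until a fixpoint is reached. The sample supervisor pairs are invented for illustration.

```python
def transitive_closure(edges):
    """Naive recursive-closure computation: repeatedly join the relation
    with itself and add the new pairs until nothing changes (a fixpoint)."""
    closure = set(edges)
    while True:
        new = {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
        if new <= closure:          # no new pairs: fixpoint reached
            return closure
        closure |= new

# Hypothetical SUPERVISION pairs (supervisor, supervisee).
print(transitive_closure({(1, 2), (2, 3), (3, 4)}))
```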
Chapter 8, Problem 32E
Problem
A nested query is a query within a query. More specifically, a nested query is a parenthesized
query whose result can be used as a value in a number of places, such as instead of a relation.
Specify the following queries on the database specified in Figure 5.5 using the concept of nested
queries and the relational operators discussed in this chapter. Also show the result of each query
as it would apply to the database state in Figure 5.6.
a. List the names of all employees who work in the department that has the employee with the
highest salary among all employees.
b. List the names of all employees whose supervisor’s supervisor has ‘888665555’ for Ssn.
c. List the names of employees who make at least $10,000 more than the employee who is paid
the least in the company.
Step-by-step solution
Step 1 of 3
Result:
Step 2 of 3
b. List the names of all employees whose supervisor’s supervisor has '888665555' for SSN.
Result:
Step 3 of 3
c. List the names of employees who make at least $10,000 more than the employee who is paid
the least in the company.
Result:
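The SQL for part (c) is shown only as an image above, so here is a hedged, runnable sketch of the same nested query on a minimal EMPLOYEE table. The columns and sample salaries are assumptions for illustration, not the Figure 5.6 data.

```python
import sqlite3

# Minimal stand-in for the COMPANY EMPLOYEE table.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE EMPLOYEE(Fname TEXT, Lname TEXT, Ssn TEXT, Salary INTEGER)")
con.executemany("INSERT INTO EMPLOYEE VALUES (?, ?, ?, ?)", [
    ("John", "Smith", "123456789", 30000),
    ("Joyce", "English", "453453453", 25000),
    ("Ahmad", "Jabbar", "987987987", 36000),
])

# Nested query: employees earning at least $10,000 more than the minimum salary.
rows = con.execute("""
    SELECT Fname, Lname FROM EMPLOYEE
    WHERE Salary >= 10000 + (SELECT MIN(Salary) FROM EMPLOYEE)
""").fetchall()
print(rows)  # [('Ahmad', 'Jabbar')]
```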
Chapter 8, Problem 33E
Problem
c. (∃x)(P(x)) → (∀x)(P(x))
Step-by-step solution
Step 1 of 3
(a) TRUE
Step 2 of 3
(b) TRUE
Step 3 of 3
(c) FALSE
Chapter 8, Problem 34LE
Problem
Specify and execute the following queries in relational algebra (RA) using the RA interpreter on
the COMPANY database schema in Figure 5.5.
a. List the names of all employees in department 5 who work more than 10 hours per week on
the ProductX project.
b. List the names of all employees who have a dependent with the same first name as
themselves.
c. List the names of employees who are directly supervised by Franklin Wong.
f. List the names and addresses of employees who work on at least one project located in
Houston but whose department has no location in Houston.
Step-by-step solution
Step 1 of 7
a)
EMP_WORK_PRODUCT ← (σ Pname='ProductX' (PROJECT)) ⋈ Pnumber=Pno (WORKS_ON)
EMP_W_10 ← (EMPLOYEE) ⋈ Ssn=Essn (σ Hours>10 (EMP_WORK_PRODUCT))
Explanation: The above query displays the names of all the employees of department 5 who
work more than 10 hours per week on the ProductX project. The query uses a join (⋈); 'σ' is
the selection operator and 'π' is the projection operator, which eliminates duplicates.
Step 2 of 7
b)
Explanation: The above query displays the names of all the employees who have a
dependent with the same first name as themselves.
Step 3 of 7
c)
Wong_S ← π Ssn (σ Fname='Franklin' AND Lname='Wong' (EMPLOYEE))
π Lname, Fname, Minit (Emp_wong)
Explanation: This query uses a self-join to display the names of all the
employees who are directly supervised by Franklin Wong.
Step 4 of 7
D)
Explanation: The above query gives the names of employees who work on every project, using
the MINUS operator, which removes from the left-side table every row that also appears in the
right-side table.
Step 5 of 7
e)
Explanation: The above query gives the names of employees who do not work on any
project, again using the MINUS operator to remove the rows that appear in the right-side table.
Step 6 of 7
f)
Emp_proj_Hou(Ssn) ← π Essn (WORKS_ON ⋈ Pno=Pnumber (σ Plocation='Houston' (PROJECT)))
Dept_NOLOC_HOU ← π Dno (DEPARTMENT) − π Dno (σ Dlocation='Houston' (DEPARTMENT))
Emp_Dept_No_Hou ←
Explanation: The above query gives the names and addresses of employees who work on at
least one project located in 'Houston' but whose department has no location in 'Houston'. The
MINUS operator removes the department numbers that do have a Houston location.
Step 7 of 7
g)
Explanation: The above query gives the names of department managers who have no
dependents, using the MINUS operator to remove the managers who do have dependents.
Chapter 8, Problem 35LE
Problem
Consider the following MAILORDER relational schema describing the data for a mail order
company.
ZIP_CODES(Zip, City)
Qoh stands for quantity on hand; the other attribute names are self-explanatory. Specify and
execute the following queries using the RA interpreter on the MAILORDER database schema.
b. Retrieve the names and cities of employees who have taken orders for parts costing more
than $50.00.
c. Retrieve the pairs of customer number values of customers who live in the same ZIP Code.
d. Retrieve the names of customers who have ordered parts from employees living in Wichita.
e. Retrieve the names of customers who have ordered parts costing less than $20.00.
g. Retrieve the names of customers who have placed exactly two orders.
Step-by-step solution
Step 1 of 7
a)
The following command is used to retrieve the names of PARTS that cost less than $20.00.
Step 2 of 7
b)
The following command is used to retrieve the names and cities of employees who have
taken orders for parts costing more than $50.00.
Step 3 of 7
c)
The following command is used to retrieve the pairs of customer number values of customers
who live in the same ZIP Code:
Step 4 of 7
d)
The following command is used to retrieve the names of customers who have ordered parts
from employees living in Wichita.
Step 5 of 7
e)
The following command is used to retrieve the names of customers who have ordered parts
costing less than $20.00.
EXISTS (Select * from ORDERS O, Odetails OT where O.Ono = OT.Ono and O.Cno = C.Cno and
OT.Pno = P.Pno));
Step 6 of 7
f)
The following command is used to retrieve the names of customers who have not placed an
order.
SELECT C.Cname from CUSTOMERS C WHERE NOT EXISTS (Select * from ORDERS O
WHERE O.Cno = C.Cno);
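The NOT EXISTS pattern only works when the subquery is correlated with the outer customer row on the customer number (O.Cno = C.Cno). Here is a minimal runnable sqlite3 check with invented rows; the two-column CUSTOMERS and ORDERS tables are simplified stand-ins for the MAILORDER schema.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE CUSTOMERS(Cno INTEGER, Cname TEXT)")
con.execute("CREATE TABLE ORDERS(Ono INTEGER, Cno INTEGER)")
con.executemany("INSERT INTO CUSTOMERS VALUES (?, ?)", [(1, "Lee"), (2, "Kim")])
con.execute("INSERT INTO ORDERS VALUES (10, 1)")  # only Lee has an order

# Customers with no order: the subquery is correlated on the customer number.
rows = con.execute("""
    SELECT C.Cname FROM CUSTOMERS C
    WHERE NOT EXISTS (SELECT * FROM ORDERS O WHERE O.Cno = C.Cno)
""").fetchall()
print(rows)  # [('Kim',)]
```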
Step 7 of 7
g)
The following command is used to retrieve the names of customers who have placed exactly
two orders.
Chapter 8, Problem 36LE
Problem
Consider the following GRADEBOOK relational schema describing the data for a grade book of a
particular instructor. ( Note : The attributes A, B, C, and D of COURSES store grade cutoffs.)
CATALOG(Cno, Ctitle)
Specify and execute the following queries using the RA interpreter on the GRADEBOOK
database schema.
a. Retrieve the names of students enrolled in the Automata class during the fall 2009 term.
b. Retrieve the Sid values of students who have enrolled in CSc226 and CSc227.
c. Retrieve the Sid values of students who have enrolled in CSc226 or CSc227.
d. Retrieve the names of students who have not enrolled in any class.
e. Retrieve the names of students who have enrolled in all courses in the CATALOG table.
Step-by-step solution
Step 1 of 5
GRADEBOOK Database
a)
The following command is used to retrieve the names of students enrolled in the Automata class
during the fall 2009 term.
• Select Fname, Minit, Lname FROM STUDENTS, ENROLLS, COURSES, CATALOG WHERE
STUDENTS.Sid = ENROLLS.Sid And COURSES.Cno = CATALOG.Cno And
COURSES.Term = ENROLLS.Term And CATALOG.Ctitle = 'Automata' And ENROLLS.Term = 'Fall 2009';
Step 2 of 5
b)
The following command is used to retrieve the Sid values of students who have enrolled in
CSc226 and CSc227.
• Select Sid From STUDENTS WHERE Sid IN (Select Sid from ENROLLS, COURSES WHERE
COURSES.Term = ENROLLS.Term And COURSES.Cno = 'CSc226') And Sid IN (Select Sid from
ENROLLS, COURSES WHERE COURSES.Term = ENROLLS.Term And
COURSES.Cno = 'CSc227');
Step 3 of 5
c)
The following command is used to retrieve the Sid values of students who have enrolled in
CSc226 or CSc227.
• Select Sid From STUDENTS WHERE Sid IN (Select Sid from ENROLLS, COURSES WHERE
COURSES.Term = ENROLLS.Term And COURSES.Cno = 'CSc226') OR Sid IN (Select Sid from
ENROLLS, COURSES WHERE COURSES.Term = ENROLLS.Term And COURSES.Cno =
'CSc227');
Step 4 of 5
d)
The following command is used to retrieve the names of students who have not enrolled in any
class.
• Select Fname, Minit, Lname FROM STUDENTS WHERE NOT EXISTS (Select * from
ENROLLS WHERE ENROLLS.Sid = STUDENTS.Sid);
Step 5 of 5
e)
The following command is used to retrieve the names of students who have enrolled in all
courses in the CATALOG table.
(Select Cno from CATALOG) MINUS (Select Cno from COURSES, ENROLLS WHERE
COURSES.Term = ENROLLS.Term And STUDENTS.Sid = ENROLLS.Sid));
Chapter 8, Problem 37LE
Problem
(a) Discuss the correspondences between the ER model constructs and the relational model
constructs. Show how each ER model construct can be mapped to the relational model and
discuss any alternative mappings.
(b) Discuss the options for mapping EER model constructs to relations, and the conditions under
which each option could be used.
Step-by-step solution
Step 1 of 3
A model that represents data in a conceptual and abstract way is called an ER model. It is
used in database modeling to reduce the complexity of the database schema and to produce a
semantic data model of a system.
In a relational schema, relationship types are represented implicitly by two attributes, one as a
primary key and the other as a foreign key, instead of being represented explicitly.
a.
Some of the correspondences between the ER model and the relational model are as follows:
• A binary 1:1 or 1:N relationship type, represented in the ER diagram by a connecting line,
corresponds in the relational model to a foreign key, or to a relationship relation having two
foreign keys, each referencing the relation of the corresponding entity.
Follow these steps to map an ER model into a relational model efficiently:
Derived attributes are attributes that can be derived from other attributes, such as age or full
name. If the ER diagram has any derived attributes, remove them to make the schema
simpler. A full name can be calculated by concatenating the first, middle, and last names of
the candidate, so it is not required to store the full name separately.
• Map all strong entities into tables. Create a separate relation for each strong entity, including
all of its simple attributes from the ER diagram, and choose the key attribute of the ER diagram
as the primary key of the relation.
• For an entity type T in the ER model E, create a relation R that includes all simple attributes of
T, and choose a unique attribute as the primary key of R.
• If multiple keys exist for T in E during the analysis of the design, keep all of them to
describe specific information about the attributes. Keys can also be used for indexing the
database and for other analysis.
• Map all weak entities into tables. Create a separate relation for each weak entity, including all
of its simple attributes. Include the primary keys of the relations to which the weak entity is
related as foreign keys, to establish the connection among the relations.
• A weak entity does not have its own candidate key. Here the candidate key of the relation R is
composed of the primary key(s) of the participating entity(s) and the partial key of the weak
entity.
• For each binary 1:1 relationship in the ER schema, identify the relation between the two
entities. The relationship may be realized as a foreign key or by merging the two relations into
one.
• Also add the attributes that belong to the relationship. Alternatively, this can be done by
creating a new relation R that includes the primary keys of both participating relations as
foreign keys.
5. Binary 1:N mapping.
• Identify all 1:N relationships in the ER diagram. For each binary 1:N relationship, the
primary key of the relation on the 1-side of the relationship becomes a foreign key in the
N-side relation.
• Another approach is to create a new relation S that includes the primary keys of both
participating entities; both primary keys work as foreign keys in S.
6. Binary M:N mapping.
• Identify all M:N relationships in the ER diagram. For each binary M:N relationship, create a
new relation S that includes the primary keys of both participating relations as foreign keys;
together they form the primary key of S. Also include any simple attributes of the relationship.
7. Multivalued attribute mapping.
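The 1:N and M:N steps above can be sketched as DDL. This sqlite3 fragment uses illustrative table names (a STUDENT/SECTION pair with a TAKES relationship table), not any particular figure: the M:N relationship becomes its own table whose composite primary key is the pair of foreign keys.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE STUDENT(Sid INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE SECTION(SecId INTEGER PRIMARY KEY, CCode TEXT);
CREATE TABLE TAKES(                 -- the M:N relationship relation
  Sid   INTEGER REFERENCES STUDENT(Sid),
  SecId INTEGER REFERENCES SECTION(SecId),
  Grade TEXT,
  PRIMARY KEY (Sid, SecId)          -- composite key from both foreign keys
);
""")
tables = [r[0] for r in con.execute(
    "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]
print(tables)  # ['SECTION', 'STUDENT', 'TAKES']
```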
Step 2 of 3
• For a multivalued attribute A, the new relation will comprise A together with the primary key
attribute K of the relation that represents the entity type or relationship type containing A as a
multivalued attribute. The primary key of R is the combination of A and K.
• For each n-ary relationship with n > 2, represent the relationship R through a new relation.
Include the primary key attributes of all participating relations as foreign key attributes, and also
include any simple attributes of the n-ary relationship.
• Since more than two entities participate, this cannot be mapped without creating a new
relation. The combination of all the foreign keys is generally used as the primary key of R.
Step 3 of 3
b.
Mapping the Enhanced Entity Relationship (EER) model to relations includes all eight steps
followed in part (a). The EER model is an extended model used to map the extended elements
of the ER model: specialization/generalization and shared subclasses.
The following options can also be used for EER-to-relation mapping:
The first option is to map the whole specialization into a single table; the second option is to
map it into multiple tables. In each option, variations may occur depending on the constraints
on the specialization or generalization.
• Each specialization containing m subclasses and a generalized superclass C, having
primary key k, is converted into relational schemas using one of the following options:
• Create a relation R for the superclass that includes all the attributes of C, with primary key k.
Create a separate relation Ri, having primary key k and the attributes of subclass Si, for each
subclass Si, 1 ≤ i ≤ m. Here k works as the primary key for each relation Ri.
• A specialization with a disjointness constraint can be mapped through this option. In the case
of an overlapping specialization, the same entity may be replicated in several relations, which
causes redundancy in the relational schema.
• This option is applicable to a specialization whose subclasses are disjoint. It generates
many NULL values if the subclasses have independent attributes.
• Each flag attribute is a Boolean-type attribute indicating whether a record is contained in
a subclass or not. This option can be used for a specialization having overlapping subclasses.
Chapter 9, Problem 2E
Problem
Map the UNIVERSITY database schema shown in Figure 3.20 into a relational database
schema.
Step-by-step solution
Step 1 of 3
Refer to Fig. 3.20 of Chapter 3 for the UNIVERSITY database schema from the textbook.
Step 2 of 3
The basic steps to map an ER diagram into a relational database schema are as follows:
If the ER diagram has any derived attributes, remove them to make the schema simpler.
Derived attributes are attributes that can be derived from other attributes, such as age or full
name; age can be calculated as the difference between the current date and the date of birth.
Map all strong entities into tables. Create a relation R that includes all simple attributes in the
ER diagram, and choose the key attribute of the ER diagram as the primary key of R.
COLLEGE
INSTRUCTOR
DEPT
STUDENT
COURSE
For each weak entity, create a separate relation R. Add all the simple attributes of the weak
entity to R. Include the primary keys of the relations to which the weak entity is related as
foreign keys, to establish the connection among the relations. Since the provided ER diagram
has no weak entities, there is no need to map weak entities.
4. 1:1 Mapping.
For each binary 1:1 relationship in the ER schema, identify the relation between the two
entities. The relationship may occur in the form of a foreign key, or by merging the two
relations into one (appropriate when both participations are total). Also add the attributes that
belong to the relationship.
COLLEGE
INSTRUCTOR
5. 1: N Mapping.
Identify all 1:N relationships in the ER diagram. For each regular binary 1:N relationship, add
the primary key of the participating relation on the 1-side as a foreign key in the N-side relation.
COLLEGE
DEPT
INSTRUCTOR
COURSE
6. M: N Mapping.
Identify all M:N relationships in the ER diagram. For each M:N relationship, create a new
relation S to represent the relationship. Include all the primary key attributes of the
participating relations as foreign keys in S.
TAKES
For each multivalued attribute in the ER diagram, create a new relation R. R includes all
attributes corresponding to the multivalued attribute, plus the primary key attribute of the
owning relation as a foreign key. Since the provided ER diagram has no multivalued attributes,
there is no need to map multivalued attributes.
For each n-ary relationship, where n > 2, create a new relation R to represent the relationship.
Include the primary key attributes of all participating relations as foreign key attributes, and also
include any simple attributes of the n-ary relationship.
Since the maximum value of n is 2 in the provided ER diagram, there is no n-ary relationship.
Step 3 of 3
Final relational schema, for ER diagram provided in Fig-3.20, can be generated as follows:
The final schema has seven relations: six from the strong entities and one from the binary M:N
relationship. Each relational table has primary and foreign keys. The TAKES table represents
the relationship between the STUDENT and SECTION tables.
Also, Grade can be retrieved with the help of Sid and SecId for the corresponding semester,
year, or particular section.
• In the COLLEGE table, CName is the primary key, and DeanId and DCode are foreign keys
referencing the INSTRUCTOR and DEPT tables respectively. DeanId is a projection of the Id
attribute of the INSTRUCTOR table.
• In the INSTRUCTOR table, Id works as the primary key. DCode and SecId work as foreign
keys for the DEPT and SECTION tables respectively.
• In the DEPT table, DCode is unique for each department and works as the primary key. To
establish connections with COURSE, INSTRUCTOR, and STUDENT, their primary keys can be
used as foreign keys. InstId in the DEPT table is the primary key (Id) attribute of the
INSTRUCTOR table and works as a foreign key here.
• The STUDENT table has a primary key only. SId is used to get the personal information of a
student, but to retrieve academic information, connections with the DEPT and TAKES tables
are required.
• Each course has a unique CCode in the COURSE table. The COURSE table is logically
connected with the SECTION and DEPT tables to place each course in a department and
section.
• The TAKES table is created from the binary M:N relationship between STUDENT and
SECTION. This is the normalized form of both tables.
Chapter 9, Problem 3E
Problem
Try to map the relational schema in Figure 6.14 into an ER schema. This is part of a process
known as reverse engineering, where a conceptual schema is created for an existing
implemented database. State any assumptions you make.
Step-by-step solution
Step 1 of 3
Take the relational schema from the textbook, Figure 6.14; it shows the relations resulting from
mapping the EER categories. Based on this, we may construct the ER schema.
Step 2 of 3
Step 3 of 3
Chapter 9, Problem 4E
Problem
Figure shows an ER schema for a database that can be used to keep track of transport ships
and their locations for maritime authorities. Map this schema into a relational schema and specify
all primary keys and foreign keys.
Figure
Step-by-step solution
Step 1 of 6
Following are the steps to convert the given ER schema into a relational schema:
Identify the regular entities in the given ER schema and create a relation for each regular entity.
Include all the simple attributes of the regular entities in the relations.
Step 2 of 6
The weak entities in the given ER schema are SHIP_MOVEMENT, PORT, and PORT_VISIT.
Create a relation for each weak entity. Include all the simple attributes of the weak entities in
the relations, and include the primary key of the strong entity that corresponds to the owner
entity type as a foreign key.
Step 3 of 6
There exists one binary 1:1 relationship, SHIP_AT_PORT, in the given ER schema.
The 1:N relationship types in the given ER schema are HISTORY, TYPE, IN, ON, and HOME_PORT.
For the HISTORY 1:N relationship type, include the primary key of SHIP in SHIP_MOVEMENT; that
is handled in step 2.
For the TYPE 1:N relationship type, include the primary key of SHIP_TYPE in SHIP.
For the HOME_PORT 1:N relationship type, include the primary key of PORT in SHIP.
Step 4 of 6
Step 5 of 6
Step 6 of 6
Chapter 9, Problem 5E
Problem
Map the BANK ER schema of Exercise 1 (shown in Figure 1) into a relational schema. Specify all
primary keys and foreign keys. Repeat for the AIRLINE schema (Figure 2) of Exercise 2 and
for the other schemas from Exercises 1 through 9.
Exercise 1
Consider the ER diagram shown in Figure 1 for part of a BANK database. Each bank can have
multiple branches, and each branch can have multiple accounts and loans.
b. Is there a weak entity type? If so, give its name, partial key, and identifying relationship.
c. What constraints do the partial key and the identifying relationship of the weak entity type
specify in this diagram?
d. List the names of all relationship types, and specify the (min, max) constraint on each
participation of an entity type in a relationship type. Justify your choices.
Figure 1
Exercise 2
Consider the ER diagram in Figure 2, which shows a simplified schema for an airline
reservations system. Extract from the ER diagram the requirements and constraints that
produced this schema. Try to be as precise as possible in your requirements and constraints
specification.
Figure 2
Exercise 3
Which combinations of attributes have to be unique for each individual SECTION entity in the
UNIVERSITY database shown in Figure 3.20 to enforce each of the following miniworld
constraints:
a. During a particular semester and year, only one section can use a particular classroom at a
particular DaysTime value.
b. During a particular semester and year, an instructor can teach only one section at a particular
DaysTime value.
c. During a particular semester and year, the section numbers for sections offered for the same
course must all be different.
Exercise 4
Composite and multivalued attributes can be nested to any number of levels. Suppose we want
to design an attribute for a STUDENT entity type to keep track of previous college education.
Such an attribute will have one entry for each college previously attended, and each such entry
will be composed of college name, start and end dates, degree entries (degrees awarded at that
college, if any), and transcript entries (courses completed at that college, if any). Each degree
entry contains the degree name and the month and year the degree was awarded, and each
transcript entry contains a course name, semester, year, and grade. Design an attribute to hold
this information. Use the conventions in Figure 3.5.
Exercise 5
Show an alternative design for the attribute described in Exercise 4 that uses only entity types
(including weak entity types, if needed) and relationship types.
Exercise 6
In Chapters 1 and 2, we discussed the database environment and database users. We can
consider many entity types to describe such an environment, such as DBMS, stored database,
DBA, and catalog/data dictionary. Try to specify all the entity types that can fully describe a
database system and its environment; then specify the relationship types among them, and draw
an ER diagram to describe such a general database environment.
Exercise 7
Design an ER schema for keeping track of information about votes taken in the U.S. House of
Representatives during the current two-year congressional session. The database needs to keep
track of each U.S. STATE's Name (e.g., 'Texas', 'New York', 'California') and include the
Region of the state (whose domain is {'Northeast', 'Midwest', 'Southeast', 'Southwest',
'West'}). Each CONGRESS_PERSON in the House of Representatives is described by his or her
Name, plus the District represented, the Start_date when the congressperson was first elected,
and the political Party to which he or she belongs (whose domain is {'Republican', 'Democrat',
'Independent', 'Other'}). The database keeps track of each BILL (i.e., proposed law), including
the Bill_name, the Date_of_vote on the bill, whether the bill Passed_or_failed (whose domain is
{'Yes', 'No'}), and the Sponsor (the congressperson(s) who sponsored, that is, proposed, the
bill). The database also keeps track of how each congressperson voted on each bill (the domain
of the Vote attribute is {'Yes', 'No', 'Abstain', 'Absent'}). Draw an ER schema diagram for this
application. State clearly any assumptions you make.
Exercise 8
A database is being constructed to keep track of the teams and games of a sports league. A
team has a number of players, not all of whom participate in each game. It is desired to keep
track of the players participating in each game for each team, the positions they played in that
game, and the result of the game. Design an ER schema diagram for this application, stating any
assumptions you make. Choose your favorite sport (e.g., soccer, baseball, football).
Exercise 9
Consider the ER diagram in Figure 3. Assume that an employee may work in up to two
departments or may not be assigned to any department. Assume that each department must
have one and may have up to three phone numbers. Supply (min, max) constraints on this
diagram. State clearly any additional assumptions you make. Under what conditions would the
relationship HAS_PHONE be redundant in this example?
Figure 3
Figure 3.20
Step-by-step solution
Chapter 9, Problem 6E
Problem
Map the EER diagrams in Figures 4.9 and 4.12 into relational schemas. Justify your choice of
mapping options.
Step-by-step solution
Step 1 of 7
The relational schema diagram for the EER diagram in figure 4.9 is as shown below:
Step 2 of 7
Explanation:
• The regular entity types are PERSON, DEPARTMENT, COLLEGE, COURSE and SECTION.
So, create a relation for each entity with their respective attributes.
• The FACULTY and STUDENT are subclasses of the entity PERSON. So, two relations, one for
FACULTY and one for STUDENT, are created, and the primary key of PERSON is included in
both relations along with their respective attributes.
• There exists a binary 1:1 relationship CHAIRS between FACULTY and DEPARTMENT. So,
include the primary key of Faculty as a foreign key in relation DEPARTMENT.
• There exists a binary 1:N relationship CD between COLLEGE and DEPARTMENT. So, include
the primary key of COLLEGE as a foreign key in relation DEPARTMENT.
Step 3 of 7
• There exists a binary 1:N relationship DC between DEPARTMENT and COURSE. So, include
the primary key of DEPARTMENT as a foreign key in relation COURSE.
• There exists a binary 1:N relationship CS between COURSE and SECTION. So, include the
primary key of COURSE as a foreign key in relation SECTION.
• There exists a binary 1:N relationship ADVISOR between FACULTY and GRAD_STUDENT.
So, include the primary key of FACULTY as a foreign key in relation GRAD_STUDENT.
• There exists a binary 1:N relationship PI between FACULTY and GRANT. So, include the
primary key of FACULTY as a foreign key in relation GRANT.
Step 4 of 7
There exists a binary 1:N relationship MINOR between STUDENT and DEPARTMENT. Create a
relation MINOR and include the primary keys of STUDENT and DEPARTMENT as attributes of
MINOR.
• There exists a binary M:N relationship BELONGS between FACULTY and DEPARTMENT.
Create a relation BELONGS and include the primary keys of FACULTY and DEPARTMENT as
attributes of BELONGS.
• There exists a binary M:N relationship TRANSCRIPT between SECTION and STUDENT.
Create a relation TRANSCRIPT and include the primary keys of SECTION and STUDENT as
attributes of TRANSCRIPT along with additional attributes of relation TRANSCRIPT.
Step 5 of 7
The relational schema diagram for the EER diagram in figure 4.12 is as shown below:
Step 6 of 7
Explanation:
• The regular entity types are PLANE_TYPE, AIRPLANE and HANGAR. So, create a relation for
each entity with their respective attributes.
• Create two relations CORPORATION and PERSON and include their respective attributes.
• Owner category is a subset of the union of two entities CORPORATION and PERSON. So, a
relation OWNER is created with Owner_id as an attribute. This attribute is included as a foreign
key in the relations CORPORATION and PERSON.
• The EMPLOYEE and PILOT are subclasses of the entity PERSON. So, two relations, one for
EMPLOYEE and one for PILOT, are created, and the primary key of PERSON is included as the
primary key in both relations along with their respective attributes.
• An entity SERVICE is a weak entity. So, create a relation SERVICE and include as attributes
the primary key of AIRPLANE along with the attributes of SERVICE.
• There exists a binary 1:N relationship OF_TYPE between AIRPLANE and PLANE_TYPE. So,
include the primary key of PLANE_TYPE as a foreign key in the relation AIRPLANE (the N-side).
• There exists a binary 1:N relationship STORED_IN between AIRPLANE and HANGAR. So,
include the primary key of HANGAR as a foreign key in the relation AIRPLANE (the N-side).
• There exists a binary M:N relationship WORKS_ON between PLANE_TYPE and EMPLOYEE.
Create a relation WORKS_ON and include the primary keys of PLANE_TYPE and EMPLOYEE
as attributes of WORKS_ON.
Step 7 of 7
There exists a binary M:N relationship FLIES between PLANE_TYPE and PILOT. Create a
relation FLIES and include the primary keys of PLANE_TYPE and PILOT as attributes of FLIES.
• There exists a binary M:N relationship OWNS between AIRPLANE and OWNER. Create a
relation OWNS and include the primary keys of AIRPLANE and OWNER as attributes of OWNS
along with the attribute Pdate.
• There exists a binary M:N relationship MAINTAIN between SERVICE and EMPLOYEE. Create
a relation MAINTAIN and include the primary keys of SERVICE and EMPLOYEE as attributes of
MAINTAIN.
Chapter 9, Problem 7E
Problem
Step-by-step solution
Step 1 of 3
A binary M:N relationship type is a relationship type between two entity types in which each
entity of one type can be related to many entities of the other type, and vice versa.
Step 2 of 3
• Include the primary keys of the two participating entities as foreign keys in new relation R1.
• The primary keys of the two participating entities also become the composite primary key of
relation R1.
Step 3 of 3
Hence, it is not possible to map a binary M: N relationship type without requiring a new relation.
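The mapping rule above can be made concrete with a small sketch. The following illustrative Python/SQLite fragment maps the TRANSCRIPT relationship from the earlier steps; the attribute names and sample values are assumptions for illustration, not taken from the textbook figure:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (Ssn TEXT PRIMARY KEY, Name TEXT);
CREATE TABLE SECTION (Section_id INTEGER PRIMARY KEY, Course TEXT);
-- New relation for the M:N relationship TRANSCRIPT: the primary keys of
-- both participating entities become foreign keys, and together they
-- form the composite primary key of the new relation.
CREATE TABLE TRANSCRIPT (
    Ssn        TEXT    REFERENCES STUDENT(Ssn),
    Section_id INTEGER REFERENCES SECTION(Section_id),
    Grade      TEXT,   -- additional attribute of the relationship
    PRIMARY KEY (Ssn, Section_id)
);
""")
conn.execute("INSERT INTO STUDENT VALUES ('123456789', 'Smith')")
conn.execute("INSERT INTO SECTION VALUES (85, 'MATH2410')")
conn.execute("INSERT INTO TRANSCRIPT VALUES ('123456789', 85, 'A')")
rows = conn.execute("SELECT Ssn, Section_id, Grade FROM TRANSCRIPT").fetchall()
```

The composite key (Ssn, Section_id) lets the same student appear in many sections and the same section hold many students, while each pair occurs at most once.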
Chapter 9, Problem 8E
Problem
Map the EER schema into a set of relations. For the VEHICLE to CAR/TRUCK/SUV
generalization, consider the four options presented in Section 9.2.1 and show the relational
schema design under each of those options.
Figure
Step-by-step solution
Step 1 of 8
Following are the set of relations for the VEHICLE to CAR/TRUCK/SUV generalization using the
option multiple relations – superclass and subclasses:
Step 2 of 8
Using the option multiple relations – superclass and subclasses, a separate relation is created for
super class and each sub class in the generalization.
Step 3 of 8
The relational schema for a car dealer EER diagram (refer figure 9.9) using the option multiple
relations – superclass and subclasses is as shown below:
Step 4 of 8
Following are the set of relations for the VEHICLE to CAR/TRUCK/SUV generalization using the
option multiple relations, subclass relations only:
Using this option, a separate relation is created for each subclass in the generalization, and the
superclass attributes are repeated in each subclass relation.
• A relation CAR is created with attributes Vin, Model, Price, and Engine_size.
• A relation TRUCK is created with attributes Vin, Model, Price, and Tonnage.
• A relation SUV is created with attributes Vin, Model, Price, and No_seats.
Step 5 of 8
Following are the set of relations for the VEHICLE to CAR/TRUCK/SUV generalization using the
option single relation with one type attribute:
Using the option single relation with one type attribute, a single relation is created for the
superclass and all its subclasses.
• The attributes of the relation are the union of the attributes of the superclass and its subclasses.
• An attribute Vehicle_Type is added to specify the type of the vehicle.
• A relation VEHICLE is created with attributes Vin, Model, Price, Engine_size, Tonnage,
No_seats, and Vehicle_Type.
Step 6 of 8
The relational schema for a car dealer EER diagram (refer figure 9.9) using the option single
relation with one type attribute is as shown below:
Step 7 of 8
Following are the set of relations for the VEHICLE to CAR/TRUCK/SUV generalization using the
option single relation with multiple type attributes:
Using the option single relation with multiple type attributes, a single relation is created for the
superclass and all its subclasses.
• The attributes of the relation are the union of the attributes of the superclass and its subclasses.
• A Boolean attribute Car_Type is added to indicate whether the vehicle is a car.
• A Boolean attribute Truck_Type is added to indicate whether the vehicle is a truck.
• A Boolean attribute SUV_Type is added to indicate whether the vehicle is an SUV.
• A relation VEHICLE is created with attributes Vin, Model, Price, Car_Type, Engine_size,
Truck_Type, Tonnage, SUV_Type, and No_seats.
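The two single-relation options can be sketched as DDL. The following illustrative Python/SQLite fragment uses the attributes listed above; the CHECK constraint and the flag encoding are assumptions about how the type attributes might be enforced:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Option 8C: one relation with a single type attribute.
conn.execute("""
CREATE TABLE VEHICLE_8C (
    Vin TEXT PRIMARY KEY, Model TEXT, Price REAL,
    Engine_size REAL, Tonnage REAL, No_seats INTEGER,
    Vehicle_type TEXT CHECK (Vehicle_type IN ('CAR', 'TRUCK', 'SUV'))
)""")
# Option 8D: one relation with one Boolean flag per subclass; this form
# would also allow overlapping subclasses, unlike a single type attribute.
conn.execute("""
CREATE TABLE VEHICLE_8D (
    Vin TEXT PRIMARY KEY, Model TEXT, Price REAL,
    Car_flag INTEGER, Engine_size REAL,
    Truck_flag INTEGER, Tonnage REAL,
    Suv_flag INTEGER, No_seats INTEGER
)""")
# A car under each option: subclass attributes that do not apply stay NULL.
conn.execute("INSERT INTO VEHICLE_8C VALUES ('V1', 'Sedan', 20000, 1.8, NULL, NULL, 'CAR')")
conn.execute("INSERT INTO VEHICLE_8D VALUES ('V1', 'Sedan', 20000, 1, 1.8, 0, NULL, 0, NULL)")
vtype = conn.execute("SELECT Vehicle_type FROM VEHICLE_8C").fetchone()[0]
flags = conn.execute("SELECT Car_flag, Truck_flag, Suv_flag FROM VEHICLE_8D").fetchone()
```

Both options trade NULL-heavy rows for the absence of joins; the NULLs visible in the inserted row are exactly the cost discussed in the options.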
Step 8 of 8
The relational schema for a car dealer EER diagram (refer figure 9.9) using the option single
relation with multiple type attributes is as shown below:
Chapter 9, Problem 9E
Problem
Using the attributes you provided for the EER diagram in Exercise, map the complete schema
into a set of relations. Choose an appropriate option out of 8A thru 8D from Section 9.2.1 in doing
the mapping of generalizations and defend your choice.
Exercise
Consider the following EER diagram that describes the computer systems at a company. Provide
your own attributes and key for each entity type. Supply max cardinality constraints justifying
your choice. Write a complete narrative description of what this EER diagram represents.
Step-by-step solution
Step 1 of 2
• The relation COMPUTER has the attributes RAM, ROM, Processor, S_no, Manufacturer, and
Cost.
• The EER diagram starts with the entity COMPUTER, which participates in several
relationships: ACCESSORY, INSTALLED, and a specialization d.
• The ACCESSORY relationship has one-to-many cardinality and connects COMPUTER to
KEYBOARD, MONITOR, and MOUSE.
• The INSTALLED and INSTALLED_OS relationships connect COMPUTER to SOFTWARE and
OPERATING_SYSTEM, recording what is installed on each computer system.
• The disjoint specialization d partitions COMPUTER into the subclasses LAPTOP and
DESKTOP, each with its own components.
• The other related components are MEMORY, VIDEO_CARD, and SOUND_CARD.
Cardinality:
• One-to-one cardinality relates one occurrence of an entity to exactly one occurrence of
another entity.
• One-to-many cardinality relates one occurrence of an entity to many occurrences of another
entity.
• Many-to-many cardinality relates many occurrences of an entity to many occurrences of
another entity.
Step 2 of 2
The following table describes the attributes, primary key, and cardinality of each relation:
Chapter 9, Problem 10LE
Problem
Consider the ER design for the UNIVERSITY database that was modeled using a tool like ERwin
or Rational Rose in Laboratory Exercise 3.31. Using the SQL schema generation feature of the
modeling tool, generate the SQL schema for an Oracle database.
Consider the UNIVERSITY database described in Exercise 16. Build the ER schema for this
database using a data modeling tool such as ERwin or Rational Rose.
Reference Exercise 16
Which combinations of attributes have to be unique for each individual SECTION entity in the
UNIVERSITY database shown in Figure 3.20 to enforce each of the following miniworld
constraints:
a. During a particular semester and year, only one section can use a particular classroom at a
particular DaysTime value.
b. During a particular semester and year, an instructor can teach only one section at a particular
DaysTime value.
c. During a particular semester and year, the section numbers for sections offered for the same
course must all be different.
Step-by-step solution
Step 1 of 1
Refer to the ER schema for UNIVERSITY database, generated using Rational Rose tool in
Laboratory Exercise 3.31. Use Rational Rose tool to create the SQL schema for an Oracle
database as follows:
• Open the ER schema generated using Rational Rose tool in Laboratory Exercise 3.31. In the
options available on left, right click on the option Component view, go to Data Modeler, then go
to New and select the option Database.
• Name the database as Oracle Database.
• Right click on Oracle Database and select the option Open Specification. In the field Target
select Oracle 7.x and click on OK.
• Import the ER schema, generated using Rational Rose tool in Laboratory Exercise 3.31, to the
Oracle Database as follows:
• Right click on the Oracle Database, then go to New and select the option File.
• Now browse and select the ER schema generated using Rational Rose tool in Laboratory
Exercise 3.31. Selecting the file would import the ER schema for the UNIVERSITY database,
generated using Rational Rose tool in Laboratory Exercise 3.31.
• Click on File option in menu bar, followed by clicking on Save as option. Save the ER schema
by the file name 714374-9-10LE.
• This will generate the SQL schema of the UNIVERSITY database for the Oracle database.
Chapter 9, Problem 11LE
Problem
Consider the ER design for the MAIL_ORDER database that was modeled using a tool like
ERwin or Rational Rose in Laboratory Exercise. Using the SQL schema generation feature of the
modeling tool, generate the SQL schema for an Oracle database.
Exercise
Consider a MAIL_ORDER database in which employees take orders for parts from customers.
The data requirements are summarized as follows:
■ The mail order company has employees, each identified by a unique employee number, first
and last name, and Zip Code.
■ Each customer of the company is identified by a unique customer number, first and last name,
and Zip Code.
■ Each part sold by the company is identified by a unique part number, a part name, price, and
quantity in stock.
■ Each order placed by a customer is taken by an employee and is given a unique order number.
Each order contains specified quantities of one or more parts. Each order has a date of receipt
as well as an expected ship date. The actual ship date is also recorded.
Design an entity-relationship diagram for the mail order database and build the design using a
data modeling tool such as ERwin or Rational Rose.
Step-by-step solution
Step 1 of 1
Refer to the ER schema for MAIL_ORDER database, generated using Rational Rose tool in
Laboratory Exercise 3.32. Use Rational Rose tool to create the SQL schema for an Oracle
database as follows:
• Open the ER schema generated using Rational Rose tool in Laboratory Exercise 3.32. In the
options available on left, right click on the option Component view, go to Data Modeler, then go
to New and select the option Database.
• Right click on Oracle Database and select the option Open Specification. In the field Target
select Oracle 7.x and click on OK.
• Import the ER schema, generated using Rational Rose tool in Laboratory Exercise 3.32, to the
Oracle Database as follows:
• Right click on the Oracle Database, then go to New and select the option File.
• Now browse and select the ER schema generated using Rational Rose tool in Laboratory
Exercise 3.32. Selecting the file would import the ER schema for the MAIL_ORDER database.
• Click on File option in menu bar, followed by clicking on Save as option. Save the ER schema
by the file name 714374-9-11LE.
• This will generate the SQL schema of the MAIL_ORDER database for the Oracle database.
Chapter 10, Problem 1RQ
Problem
Step-by-step solution
Step 1 of 1
ODBC:
Open Database Connectivity (ODBC) is a standardized application programming interface (API)
for accessing a database.
Programs access the database through ODBC driver software; ODBC was developed and is
supported by Microsoft.
SQL/CLI:
SQL/CLI stands for Call Level Interface and is part of the SQL standard. It was developed as a
follow-up to the technique known as ODBC.
Chapter 10, Problem 2RQ
Problem
Step-by-step solution
Step 1 of 2
JDBC:
JDBC stands for Java Database Connectivity; it is a registered trademark of Sun Microsystems.
JDBC is a call function interface for accessing databases from Java.
A JDBC driver is basically an implementation of the function calls specified in the JDBC
application programming interface (API). It is designed to allow a single Java program to
connect to several different databases.
Step 2 of 2
JDBC is not the example of embedded SQL. It is a function call. That is specified in JDBC API.
JDBC function calls can access any RDBMS where the JDBC driver can available. So the
function libraries for this access are known as JDBC.
Comment
Chapter 10, Problem 3RQ
Problem
List the three main approaches to database programming. What are the advantages and
disadvantages of each approach?
Step-by-step solution
Step 1 of 3
Here database statements are embedded into the host programming language. But they are
identified by a special prefix and precompiled or preprocessor scans the source program code to
identify database statements and extract them for processing by the DBMS.
Comment
Step 2 of 3
Library of database functions: a library of functions (an application programming interface, or
API) is made available to the host programming language for database calls.
Step 3 of 3
Designing a new language: a database programming language is designed from scratch to be
compatible with the database model and query language. Loops and conditional statements are
added to the database language to convert it into a full-fledged programming language.
The first two approaches are the most common in practice, since many applications written in a
general-purpose host language also require database access; their main disadvantage is the
impedance mismatch between the language and the database model.
The third approach is more appropriate for applications that have intensive database
interaction; with this approach the impedance mismatch does not occur.
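The second approach, a library of database functions, can be illustrated with Python's built-in DB-API module sqlite3, which plays the same role for Python that SQL/CLI or JDBC plays for C or Java; the table and data below are made up for the sketch:

```python
import sqlite3  # the library of database access functions

conn = sqlite3.connect(":memory:")         # connect to a database
cur = conn.cursor()
cur.execute("CREATE TABLE T (x INTEGER)")  # SQL is passed as a string argument
cur.executemany("INSERT INTO T VALUES (?)", [(1,), (2,), (3,)])
cur.execute("SELECT SUM(x) FROM T")        # prepare and execute a query
total = cur.fetchone()[0]                  # fetch the single result value
```

The host language gains no new syntax: every database action is an ordinary function call on strings and tuples, which is exactly what makes the impedance mismatch of the first two approaches visible.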
Chapter 10, Problem 4RQ
Problem
What is the impedance mismatch problem? Which of the three programming approaches
minimizes this problem?
Step-by-step solution
Step 1 of 2
Impedance mismatch:
Impedance mismatch is the term used to refer to the problems that occur because of the
differences between the database model and the programming language model.
It is less of a problem when a special database programming language is designed that uses the
same data model and data types as the database model, such as attributes, tuples, and tables.
Step 2 of 2
First problem:
The data types of the programming language differ from the attribute data types in the data
model. Hence, a binding is necessary for each programming language, because different
languages have different data types.
Second problem:
The results of most queries are sets or multisets of tuples, and each tuple is formed of a
sequence of attribute values. A binding is needed to map the query result data structure, which
is a table, to an appropriate data structure in the programming language.
The third database programming approach, designing a brand-new language, minimizes this
impedance mismatch problem.
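The second binding problem can be seen concretely in any function-library API: the query result arrives as generic tuples, and the program must map them onto its own data structures. An illustrative Python sketch, in which the Employee class and the data are assumptions:

```python
import sqlite3
from dataclasses import dataclass

@dataclass
class Employee:      # the programming-language side of the mismatch
    name: str
    salary: float

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (Name TEXT, Salary REAL)")
conn.execute("INSERT INTO EMPLOYEE VALUES ('Smith', 30000.0)")
# The database side returns plain tuples; an explicit binding step
# converts each row of the result table into an Employee object.
employees = [Employee(name, salary) for (name, salary)
             in conn.execute("SELECT Name, Salary FROM EMPLOYEE")]
```

The list comprehension is the binding: it is code the programmer must write and keep in sync with both the SELECT list and the class definition.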
Chapter 10, Problem 5RQ
Problem
Step-by-step solution
Step 1 of 2
A cursor is a pointer that points to a single tuple (row) from the result of a query that retrieves
multiple tuples.
The FETCH command advances the cursor to the next tuple in the query result.
Step 2 of 2
In embedded SQL, UPDATE and DELETE commands can use the condition WHERE CURRENT
OF <cursor name> to specify that the current tuple, the one the cursor points to, is the tuple to
be modified. After a cursor is declared, it is manipulated with the OPEN, FETCH, and CLOSE
commands.
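The OPEN, FETCH, CLOSE cycle maps directly onto the cursor objects of function-library APIs. An illustrative Python/SQLite sketch with made-up data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE GRADE_REPORT (Student_number INTEGER, Grade TEXT);
INSERT INTO GRADE_REPORT VALUES (17, 'A'), (17, 'B'), (8, 'C');
""")
cur = conn.cursor()                    # declare the cursor
cur.execute("SELECT Grade FROM GRADE_REPORT WHERE Student_number = 17")  # OPEN
grades = []
row = cur.fetchone()                   # FETCH the first tuple
while row is not None:                 # analogue of checking SQLCODE == 0
    grades.append(row[0])
    row = cur.fetchone()               # FETCH the next tuple
cur.close()                            # CLOSE
```

Each fetchone() call corresponds to one FETCH, and exhausting the result (fetchone() returning None) corresponds to SQLCODE becoming nonzero.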
Chapter 10, Problem 6RQ
Problem
What is SQLJ used for? Describe the two types of iterators available in SQLJ.
Step-by-step solution
Step 1 of 2
SQLJ:
SQLJ is a standard adopted by several vendors for embedding SQL in Java; it is used, for
example, in the Oracle DBMS. SQLJ is used to access SQL databases from Java, and its
statements are translated into calls through the JDBC interface.
In SQLJ, an iterator is associated with the tuples and attributes in a query result. There are two
types of iterators:
Step 2 of 2
A named iterator is associated with a query result by listing the attribute names and types that
appear in the query result.
A positional iterator lists only the attribute types that appear in the query result.
In both cases, the list must be in the same order as the attributes listed in the SELECT clause of
the query. Looping over a query result differs for the two types of iterators: a named iterator
provides accessor methods named after the attributes and is advanced with next(), while a
positional iterator retrieves the values into host variables with a FETCH ... INTO statement.
In the positional iterator there are no attribute names; only the attribute types are specified.
Chapter 10, Problem 7E
Problem
Consider the database shown in Figure 1.2, whose schema is shown in Figure 2.1. Write a
program segment to read a student’s name and print his or her grade point average, assuming
that A = 4, B = 3, C = 2, and D = 1 points. Use embedded SQL with C as the host language.
Step-by-step solution
Step 1 of 1
Assuming all required variables have been declared already and assuming that Name of
STUDENT is unique , code will look like:
EXEC SQL BEGIN DECLARE SECTION;
int stno, total_course_count = 0;
char name[30], grade;
float total_grade_avg = 0;
EXEC SQL END DECLARE SECTION;
prompt("Enter the student name: ", name);
EXEC SQL SELECT Student_number INTO :stno
FROM STUDENT WHERE Name = :name;
EXEC SQL DECLARE GRD CURSOR FOR
SELECT Grade FROM GRADE_REPORT
WHERE Student_number = :stno;
EXEC SQL OPEN GRD;
EXEC SQL FETCH FROM GRD INTO :grade;
while (SQLCODE == 0) {
switch (grade) {
case 'A': total_grade_avg += 4; break;
case 'B': total_grade_avg += 3; break;
case 'C': total_grade_avg += 2; break;
case 'D': total_grade_avg += 1; break;
}
total_course_count++;
EXEC SQL FETCH FROM GRD INTO :grade;
}
EXEC SQL CLOSE GRD;
if (total_course_count != 0)
total_grade_avg /= total_course_count;
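The same logic can be exercised end to end in plain Python with SQLite, which makes the control flow easy to verify; the schema is reduced to the columns the segment touches, and the sample data is made up:

```python
import sqlite3

POINTS = {'A': 4, 'B': 3, 'C': 2, 'D': 1}

def gpa(conn, student_name):
    """Average grade points over all of a student's grade records."""
    (stno,) = conn.execute(
        "SELECT Student_number FROM STUDENT WHERE Name = ?",
        (student_name,)).fetchone()
    total = count = 0
    for (grade,) in conn.execute(
            "SELECT Grade FROM GRADE_REPORT WHERE Student_number = ?", (stno,)):
        total += POINTS[grade]
        count += 1
    return total / count if count else 0.0

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (Student_number INTEGER, Name TEXT);
CREATE TABLE GRADE_REPORT (Student_number INTEGER, Grade TEXT);
INSERT INTO STUDENT VALUES (17, 'Smith');
INSERT INTO GRADE_REPORT VALUES (17, 'A'), (17, 'B'), (17, 'B');
""")
result = gpa(conn, 'Smith')  # (4 + 3 + 3) / 3
```

The loop over the second query is the Python counterpart of the FETCH loop in the embedded version.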
Chapter 10, Problem 8E
Problem
Repeat Exercise 10.7, but use SQLJ with Java as the host language.
Reference 10.7
Consider the database shown in Figure 1.2, whose schema is shown in Figure 2.1. Write a
program segment to read a student’s name and print his or her grade point average, assuming
that A = 4, B = 3, C = 2, and D = 1 points. Use embedded SQL with C as the host language.
Step-by-step solution
Step 1 of 1
Assuming all required variables have been declared already, headers have been included, and
assuming that Name of STUDENT is unique, code will look like:
try {
#sql { SELECT Student_number INTO :stno
FROM STUDENT WHERE Name = :name };
} catch (SQLException se) {
System.out.println("Student " + name + " does not exist");
return;
}
#sql iterator Grades(char grade);
Grades s = null;
#sql s = { SELECT Grade AS "grade"
FROM GRADE_REPORT
WHERE Student_number = :stno };
while (s.next()) {
switch (s.grade()) {
case 'A': total_grade_avg += 4; break;
case 'B': total_grade_avg += 3; break;
case 'C': total_grade_avg += 2; break;
case 'D': total_grade_avg += 1; break;
}
total_course_count++;
}
if (total_course_count != 0)
total_grade_avg /= total_course_count;
s.close();
Chapter 10, Problem 9E
Problem
Consider the library relational database schema in Figure. Write a program segment that
retrieves the list of books that became overdue yesterday and that prints the book title and
borrower name for each. Use embedded SQL with C as the host language.
Figure
Step-by-step solution
Step 1 of 1
/* cursor OVERDUE assumed declared over the join of BOOK, BOOK_LOANS, and BORROWER */
EXEC SQL OPEN OVERDUE;
EXEC SQL FETCH FROM OVERDUE INTO :bookTitle, :borrowerName;
while (SQLCODE == 0) {
printf("Book Title: %s, Borrower Name: %s\n", bookTitle, borrowerName);
EXEC SQL FETCH FROM OVERDUE INTO :bookTitle, :borrowerName;
}
EXEC SQL CLOSE OVERDUE;
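A runnable analogue of the segment, as an illustrative Python/SQLite sketch over the library relations BOOK, BOOK_LOANS, and BORROWER (the sample rows are made up):

```python
import sqlite3
from datetime import date, timedelta

yesterday = (date.today() - timedelta(days=1)).isoformat()

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE BOOK (Book_id INTEGER, Title TEXT);
CREATE TABLE BORROWER (Card_no INTEGER, Name TEXT);
CREATE TABLE BOOK_LOANS (Book_id INTEGER, Card_no INTEGER, Due_date TEXT);
""")
conn.execute("INSERT INTO BOOK VALUES (1, 'SQL Primer'), (2, 'OS Concepts')")
conn.execute("INSERT INTO BORROWER VALUES (10, 'Jones'), (11, 'Lee')")
conn.execute("INSERT INTO BOOK_LOANS VALUES (1, 10, ?), (2, 11, '2099-01-01')",
             (yesterday,))
# Books that became overdue yesterday, with title and borrower name.
overdue = conn.execute("""
    SELECT B.Title, R.Name
    FROM BOOK B
    JOIN BOOK_LOANS L ON B.Book_id = L.Book_id
    JOIN BORROWER R ON L.Card_no = R.Card_no
    WHERE L.Due_date = ?
""", (yesterday,)).fetchall()
```

The three-way join plays the role of the cursor's query; only the loan whose due date was yesterday survives the WHERE condition.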
Chapter 10, Problem 10E
Problem
Repeat Exercise, but use SQLJ with Java as the host language.
Exercise
Consider the library relational database schema in Figure. Write a program segment that
retrieves the list of books that became overdue yesterday and that prints the book title and
borrower name for each. Use embedded SQL with C as the host language.
Figure
Step-by-step solution
Step 1 of 1
Assuming all required variables have been declared already, headers have been included.
#sql iterator Due(String bookTitle, String borrowerName);
Due d = null;
#sql d = { SELECT B.Title AS "bookTitle", R.Name AS "borrowerName"
FROM BOOK B, BOOK_LOANS L, BORROWER R
WHERE B.Book_id = L.Book_id AND L.Card_no = R.Card_no
AND L.Due_date = :yesterday };
while (d.next())
System.out.println(d.bookTitle() + " " + d.borrowerName());
d.close();
Chapter 10, Problem 11E
Problem
Repeat Exercise 10.7 and 10.9, but use SQL/CLI with C as the host language.
Reference 10.7
Consider the database shown in Figure 1.2, whose schema is shown in Figure 2.1. Write a
program segment to read a student’s name and print his or her grade point average, assuming
that A = 4, B = 3, C = 2, and D = 1 points. Use embedded SQL with C as the host language.
Reference 10.9
Consider the library relational database schema in Figure 6.6. Write a program segment that
retrieves the list of books that became overdue yesterday and that prints the book title and
borrower name for each. Use embedded SQL with C as the host language.
Step-by-step solution
Step 1 of 4
#include <sqlcli.h>
void printGPA() {
SQLHSTMT stmt1;
SQLHDBC con1;
SQLHENV env1;
SQLRETURN ret1, ret2, ret3, ret4;
ret1 = SQLAllocHandle(SQL_HANDLE_ENV, SQL_NULL_HANDLE, &env1);
if (!ret1) ret2 = SQLAllocHandle(SQL_HANDLE_DBC, env1, &con1); else exit(1);
if (!ret2) ret3 = SQLConnect(con1, "dbs", SQL_NTS, "js", SQL_NTS, "xyz", SQL_NTS); else exit(1);
if (!ret3) ret4 = SQLAllocHandle(SQL_HANDLE_STMT, con1, &stmt1); else exit(1);
/* prepare SELECT Student_number FROM STUDENT WHERE Name = ? on stmt1,
bind the user-supplied name and the result column :stno */
ret1 = SQLExecute(stmt1);
if (!ret1)
ret2 = SQLFetch(stmt1);
/* prepare SELECT Grade FROM GRADE_REPORT WHERE Student_number = ? on stmt1,
bind :stno and the result column :grade */
ret1 = SQLExecute(stmt1);
if (!ret1) {
ret2 = SQLFetch(stmt1);
while (!ret2) {
switch (grade) {
case 'A': total_grade_avg += 4; break;
case 'B': total_grade_avg += 3; break;
case 'C': total_grade_avg += 2; break;
case 'D': total_grade_avg += 1; break;
}
total_course_count++;
ret2 = SQLFetch(stmt1);
}
}
Step 2 of 4
if (total_course_count != 0) total_grade_avg /= total_course_count;
Step 3 of 4
}
#include <sqlcli.h>
void printDueBookRecord() {
SQLHSTMT stmt1;
SQLHDBC con1;
SQLHENV env1;
SQLRETURN ret1, ret2, ret3, ret4;
ret1 = SQLAllocHandle(SQL_HANDLE_ENV, SQL_NULL_HANDLE, &env1);
if (!ret1) ret2 = SQLAllocHandle(SQL_HANDLE_DBC, env1, &con1); else exit(1);
if (!ret2) ret3 = SQLConnect(con1, "dbs", SQL_NTS, "js", SQL_NTS, "xyz", SQL_NTS); else exit(1);
if (!ret3) ret4 = SQLAllocHandle(SQL_HANDLE_STMT, con1, &stmt1); else exit(1);
/* prepare the overdue-books join over BOOK, BOOK_LOANS, and BORROWER on stmt1
and bind the result columns :bookTitle and :borrowerName */
ret1 = SQLExecute(stmt1);
if (!ret1) {
Step 4 of 4
ret2 = SQLFetch(stmt1);
while (!ret2) {
printf("Book Title: %s, Borrower Name: %s\n", bookTitle, borrowerName);
ret2 = SQLFetch(stmt1);
}
}
}
Chapter 10, Problem 12E
Problem
Repeat Exercise 10.7 and 10.9, but use JDBC with Java as the host language.
Reference 10.7
Consider the database shown in Figure 1.2, whose schema is shown in Figure 2.1. Write a
program segment to read a student’s name and print his or her grade point average, assuming
that A = 4, B = 3, C = 2, and D = 1 points. Use embedded SQL with C as the host language.
Reference 10.9
Consider the library relational database schema in Figure 6.6. Write a program segment that
retrieves the list of books that became overdue yesterday and that prints the book title and
borrower name for each. Use embedded SQL with C as the host language.
Step-by-step solution
Step 1 of 2
import java.io.*;
import java.sql.*;
.....
class PrintGPAAverage {
try {
Class.forName("oracle.jdbc.driver.OracleDriver");
} catch (ClassNotFoundException x) {
System.out.println("Driver could not be loaded");
}
// connection details (account, password, URL) are assumed available
Connection conn = DriverManager.getConnection("jdbc:oracle:oci8:", dbacct, passwd);
String name = readEntry("Enter a student name: "); // readEntry: console-input helper
int number;
double total_grade_avg = 0;
int total_course_count = 0;
String q = "SELECT Student_number, Name FROM STUDENT WHERE Name = '" + name + "'";
Statement s = conn.createStatement();
ResultSet r = s.executeQuery(q);
while (r.next()) {
number = r.getInt(1);
name = r.getString(2);
String t = "SELECT Grade FROM GRADE_REPORT WHERE Student_number = " + number;
Statement g = conn.createStatement();
ResultSet rs = g.executeQuery(t);
while (rs.next()) {
switch (rs.getString(1).charAt(0)) {
case 'A': total_grade_avg += 4; break;
case 'B': total_grade_avg += 3; break;
case 'C': total_grade_avg += 2; break;
case 'D': total_grade_avg += 1; break;
}
total_course_count++;
}
}
if (total_course_count != 0)
total_grade_avg /= total_course_count;
Step 2 of 2
import java.io.*;
import java.sql.*;
.....
class PrintOverdueBooks {
try {
Class.forName("oracle.jdbc.driver.OracleDriver");
} catch (ClassNotFoundException x) {
System.out.println("Driver could not be loaded");
}
// connection details (account, password) are assumed available
Connection conn = DriverManager.getConnection("jdbc:oracle:oci8:", dbacct, password);
String q = "SELECT B.Book_id, B.Title, R.Name "
+ "FROM BOOK B, BOOK_LOANS L, BORROWER R "
+ "WHERE B.Book_id = L.Book_id AND L.Card_no = R.Card_no "
+ "AND L.Due_date = CURRENT_DATE - 1";
Statement s = conn.createStatement();
ResultSet r = s.executeQuery(q);
while (r.next()) {
String Book_Id = r.getString(1);
String Book_title = r.getString(2);
String Borrower_name = r.getString(3);
System.out.println(Book_Id + " " + Book_title + " " + Borrower_name);
}
}
Chapter 10, Problem 13E
Problem
Reference 10.7
Consider the database shown in Figure 1.2, whose schema is shown in Figure 2.1. Write a
program segment to read a student’s name and print his or her grade point average, assuming
that A = 4, B = 3, C = 2, and D = 1 points. Use embedded SQL with C as the host language.
Step-by-step solution
Step 1 of 2
Consider the following SQL/PSM function to determine the average grade point of student.
//Function PSM2 (fragment; lines 1-5 create the function with parameter in_name and declare local variables):
6. SELECT Student_number INTO std_no FROM STUDENT WHERE Name = in_name;
7. DECLARE grd CURSOR FOR SELECT Grade FROM GRADE_REPORT WHERE Student_number = std_no;
8. OPEN grd;
9. LOOP
... (lines 10-18: FETCH grd INTO a grade variable, add its points to total_avg, increment count) ...
19. final_avg := total_avg / count;
Step 2 of 2
• Lines 2 to 5 declare variables to store intermediate values.
• The query in line 6 finds the student number of the user-entered student name.
• In line 7, a cursor is declared to process the multiple rows returned by the grade query.
• Lines 8 to 18 loop over the cursor to count the number of rows; a conditional statement inside
the loop accumulates the student's total grade points.
Chapter 10, Problem 14E
Problem
Create a function in PSM that computes the median salary for the EMPLOYEE table shown in
Figure 5.5.
Step-by-step solution
Step 1 of 2
Following is the function in Persistent Stored Module (PSM ) to calculate the median salary for
the EMPLOYEE table:
//Function PSM1:
0) CREATE FUNCTION Median_Salary ()
1) RETURNS INTEGER
2) DECLARE median_salary INTEGER;
3) SELECT MEDIAN(Salary) INTO median_salary
4) FROM EMPLOYEE;
5) RETURN median_salary;
Step 2 of 2
Explanation:
Line 0: CREATE FUNCTION is used to create the function and give it a name; this function takes
no input.
Line 1: RETURNS specifies that the function returns an INTEGER, the median salary.
Line 3: MEDIAN(Salary) computes the median value among the salaries and stores it INTO the
variable median_salary (MEDIAN is an aggregate available in some DBMSs, such as Oracle,
rather than in standard SQL).
Line 4: FROM specifies the table from which the data is to be considered.
Line 5: RETURN passes the computed median salary back to the caller.
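Since MEDIAN is not part of every DBMS, the same computation can be checked with an illustrative Python/SQLite sketch; the salary figures are made up:

```python
import sqlite3

def median_salary(conn):
    salaries = [s for (s,) in conn.execute(
        "SELECT Salary FROM EMPLOYEE ORDER BY Salary")]
    n = len(salaries)
    mid = n // 2
    # Odd count: middle value; even count: average of the two middle values.
    return salaries[mid] if n % 2 else (salaries[mid - 1] + salaries[mid]) / 2

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (Salary INTEGER)")
conn.executemany("INSERT INTO EMPLOYEE VALUES (?)",
                 [(25000,), (30000,), (38000,), (43000,), (55000,)])
result = median_salary(conn)
```

Sorting in SQL with ORDER BY and picking the middle element in the host language mirrors what the MEDIAN aggregate does internally.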
Chapter 14, Problem 1RQ
Problem
Step-by-step solution
Step 1 of 2
The semantics of a relation refers to the way of interpreting the meaning of an attribute value in
a tuple.
Step 2 of 2
• The semantics of an attribute should be considered in such a way that they can be interpreted
easily.
• Once the semantics of an attribute are clear, it will be easy to interpret a relation.
• The relation that is easy to interpret will indeed result in a good schema design.
Thus, the semantics of the attributes serve as an informal measure for designing a good relation
schema.
Chapter 14, Problem 2RQ
Problem
Discuss insertion, deletion, and modification anomalies. Why are they considered bad? Illustrate
with examples.
Step-by-step solution
Step 1 of 6
Insertion anomaly refers to the situation where it is not possible to enter data of certain attributes
into the database without entering data of other attributes.
Deletion anomaly refers to the situation where data of certain attributes are lost as the result of
deletion of some of the attributes.
Modification anomaly refers to the situation where partial update of redundant data leads to
inconsistency of data.
Step 2 of 6
Insertion, deletion and modification anomalies are considered bad due to the following reasons:
Step 3 of 6
Insertion Anomalies:
• Assume that there is an employee E11 who is not yet working in a project. Then it is not
possible to enter details of employee E11 into the relation Emp_Proj.
• Similarly assume there is a project P7 with no employees assigned to it. Then it is not possible
to enter details of project P7 into the relation Emp_Proj.
• Similarly, it is possible to enter details of a project into relation Emp_Proj only if an employee is
assigned to a project.
Step 4 of 6
Deletion Anomalies:
• Assume that an employee E07 has left the company. So, it is necessary to delete employee
E07 details from the relation Emp_Pro.
• If employee E07 details are deleted from the relation Emp_Pro, then the details of project P5
will also be lost.
Update anomalies:
• Assume that the location of project P1 is changed from Atlanta to New Jersey. Then the update
should be done at three places.
• If the update is reflected for two tuples and is not done for the third tuple, then inconsistency of
data occurs.
Step 5 of 6
In order to remove insertion, deletion and modification anomalies, decompose the relation
Emp_Proj into three relations as shown below:
Step 6 of 6
Insertion Anomalies:
• It is possible to enter the details of employee E11 into relation Employee even though he is not
yet working in a project.
• It is possible to enter the details of project P7 into relation Project even though there are no
employees assigned to it.
Deletion Anomalies:
• If employee E07 details are deleted from the relation Employee, still the details of project P5 will
not be lost.
Update anomalies:
• If the location of project P1 is changed from Atlanta to New Jersey, then the update should be
done in relation Project at only one place.
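The difference between the two designs is mechanical and can be demonstrated directly. In the following illustrative Python/SQLite sketch (with simplified relation and attribute names), the location update touches three rows in the undecomposed design but only one row after decomposition:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Undecomposed design: the project location is repeated per assignment.
CREATE TABLE EMP_PROJ (Emp TEXT, Proj TEXT, Plocation TEXT);
INSERT INTO EMP_PROJ VALUES
  ('E1', 'P1', 'Atlanta'), ('E2', 'P1', 'Atlanta'), ('E3', 'P1', 'Atlanta');
-- Decomposed design: the location is stored exactly once.
CREATE TABLE PROJECT (Proj TEXT PRIMARY KEY, Plocation TEXT);
CREATE TABLE WORKS_ON (Emp TEXT, Proj TEXT REFERENCES PROJECT(Proj));
INSERT INTO PROJECT VALUES ('P1', 'Atlanta');
INSERT INTO WORKS_ON VALUES ('E1', 'P1'), ('E2', 'P1'), ('E3', 'P1');
""")
redundant = conn.execute(
    "UPDATE EMP_PROJ SET Plocation = 'New Jersey' WHERE Proj = 'P1'").rowcount
single = conn.execute(
    "UPDATE PROJECT SET Plocation = 'New Jersey' WHERE Proj = 'P1'").rowcount
```

Missing one of the three redundant rows is exactly the modification anomaly; the decomposed design makes that inconsistency impossible.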
Chapter 14, Problem 3RQ
Problem
Why should NULLs in a relation be avoided as much as possible? Discuss the problem of
spurious tuples and how we may prevent it.
Step-by-step solution
Step 1 of 4
Nulls values should be avoided in a relation as much as possible for the following reasons:
Step 2 of 4
• When aggregate operations such as SUM, AVG, etc. are performed on an attribute that has
null values, the result will be incorrect.
• When a JOIN operation involves an attribute with null values, the result may be unpredictable.
• The NULL value has different meanings: it may mean unknown, not applicable, or absent.
Step 3 of 4
Spurious tuples are generated as the result of bad design or improper decomposition of the base
table.
• Spurious tuples are the tuples generated when a JOIN operation is performed on badly
designed relations. The resultant will have more tuples than the original set of tuples.
• The main problem with spurious tuples is that they are considered invalid as they do not appear
in the base tables.
Step 4 of 4
Spurious tuples can be avoided by taking care while designing relational schemas.
• The relations should be designed in such a way that when a JOIN operation is performed, the
attributes involved in the JOIN operation must be a primary key in one table and foreign key in
another table.
• While decomposing a base table into two tables, the tables must have a common attribute. The
common attribute must be primary key in one table and foreign key in another table.
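Spurious tuples can be produced on demand. In the following illustrative Python/SQLite sketch, a relation is decomposed on the shared attribute Plocation, which is not a key of either part, and joining the parts back yields more tuples than the original:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMP_PROJ (Emp TEXT, Proj TEXT, Plocation TEXT);
INSERT INTO EMP_PROJ VALUES ('Smith', 'P1', 'Atlanta'), ('Wong', 'P2', 'Atlanta');
-- Bad decomposition: the only common attribute is the non-key Plocation.
CREATE TABLE EMP_LOCS  AS SELECT DISTINCT Emp,  Plocation FROM EMP_PROJ;
CREATE TABLE PROJ_LOCS AS SELECT DISTINCT Proj, Plocation FROM EMP_PROJ;
""")
joined = conn.execute("""
    SELECT E.Emp, P.Proj, P.Plocation
    FROM EMP_LOCS E JOIN PROJ_LOCS P ON E.Plocation = P.Plocation
""").fetchall()
original_count = conn.execute("SELECT COUNT(*) FROM EMP_PROJ").fetchone()[0]
# The join pairs every employee in Atlanta with every project in Atlanta,
# producing the spurious tuples (Smith, P2) and (Wong, P1).
```

Two original tuples become four after the join; the two extra tuples state facts that were never in the base table.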
Chapter 14, Problem 4RQ
Problem
State the informal guidelines for relation schema design that we discussed. Illustrate how
violation of these guidelines may be harmful.
Step-by-step solution
Step 1 of 1
There are four informal guidelines that may be used as measures of the quality of a relation
schema design:
(1) Making sure that the semantics of the attributes are clear in the schema; unclear semantics
make the relation difficult to interpret and explain.
(2) Reducing redundant information in tuples; redundancy causes anomalies that require
redundant work during insertion into and modification of a relation, and may cause accidental
loss of information during a deletion from a relation.
(3) Reducing the NULL values in tuples; NULLs waste storage space and make selections,
aggregation operations, and joins difficult to perform.
(4) Disallowing the possibility of generating spurious tuples; joins on improperly related base
relations generate invalid and spurious data.
Violations of these guidelines produce schema problems that can be detected without any
additional tools of analysis.
Chapter 14, Problem 5RQ
Problem
What is a functional dependency? What are the possible sources of the information that defines
the functional dependencies that hold among the attributes of a relation schema?
Step-by-step solution
Step 1 of 3
Functional dependency: A functional dependency describes a relationship between the
attributes in a table. A functional dependency between two sets of attributes X and Y in a
relation R is said to exist if the value of X uniquely determines the value of Y.
Step 2 of 3
The functional dependency is a property of the semantics i.e., the functional dependency
represents the semantic association between the attributes of the relation schema R. The main
use of the functional dependency is that it describes the relation schema R. It is done by
specifying the constraints on a relation R. These constraints are called legal extensions.
Step 3 of 3
Full functional dependency indicates that if A and B are attributes of the relation R then B is fully
functionally dependent on A, but not any proper subset of A.
Partial functional dependency indicates that if A and B are attributes of the relation R then B is
partially dependent on A if there is some attribute that can be removed from A and yet the
dependency still holds among the attributes of a relational schema.
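Whether a supposed FD is violated by a given relation state can be checked mechanically: X → Y is violated as soon as two tuples agree on X but differ on Y. A small illustrative Python sketch, with made-up attribute names and rows:

```python
def fd_holds(rows, lhs, rhs):
    """Return True iff the FD lhs -> rhs is satisfied by this relation state.
    rows is a list of dicts; lhs and rhs are tuples of attribute names."""
    seen = {}
    for t in rows:
        x = tuple(t[a] for a in lhs)
        y = tuple(t[a] for a in rhs)
        if x in seen and seen[x] != y:
            return False   # two tuples agree on X but differ on Y
        seen[x] = y
    return True

rows = [
    {"Ssn": "1", "Pnumber": "10", "Hours": 32},
    {"Ssn": "1", "Pnumber": "20", "Hours": 8},
    {"Ssn": "2", "Pnumber": "10", "Hours": 40},
]
full_fd = fd_holds(rows, ("Ssn", "Pnumber"), ("Hours",))  # satisfied in this state
partial_fd = fd_holds(rows, ("Ssn",), ("Hours",))         # violated in this state
```

Note that a state satisfying the check does not prove the FD holds for the schema; only a violation is conclusive.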
Chapter 14, Problem 6RQ
Problem
Why can we not infer a functional dependency automatically from a particular relation state?
Step-by-step solution
Step 1 of 1
Certain FDs can be specified without referring to a specific relation state, as a property of the
attributes given their generally understood meaning. It is also possible that certain functional
dependencies may cease to exist in the real world if the relationship changes. Some tuples in a
given state may agree with a supposed FD while a newly inserted tuple does not. Since a
functional dependency is a property of the relation schema R, and not of a particular legal
relation state r of R, it cannot be inferred automatically from a particular relation state.
Chapter 14, Problem 7RQ
Problem
What does the term unnormalized relation refer to? How did the normal forms develop historically
from first normal form up to Boyce-Codd normal form?
Step-by-step solution
Step 1 of 1
An unnormalized relation refers to a relation that does not meet any normal form condition.
The normalization process, first proposed by Codd (1972), takes a relation schema through a
series of tests to certify whether it satisfies a certain normal form. The process, which proceeds
in a top-down fashion by evaluating each relation against the criteria for the normal forms and
decomposing relations as necessary, can thus be considered relational design by analysis.
Initially, Codd proposed three normal forms: 1NF, 2NF, and 3NF. A stronger definition of 3NF, called
Boyce-Codd normal form (BCNF), was proposed later by Boyce and Codd. All these normal forms
are based on a single analytical tool: the functional dependencies among the attributes of a relation.
1NF splits a relation schema into schemas in which every attribute has atomic values as its domain,
so that no attribute value is a set of values. 2NF removes all partial dependencies of nonprime
attributes on the key and ensures that all nonprime attributes are fully functionally dependent
on the key of R. 3NF removes all transitive dependencies on the key of R and ensures that no
nonprime attribute is transitively dependent on the key.
Chapter 14, Problem 8RQ
Problem
Define first, second, and third normal forms when only primary keys are considered. How do the
general definitions of 2NF and 3NF, which consider all keys of a relation, differ from those that
consider only primary keys?
Step-by-step solution
Step 1 of 2
First Normal Form: It states that the domain of an attribute must include only atomic values and
that the value of any attribute in a tuple must be a single value from the domain of that attribute.
In other words, first normal form does not allow relations within relations as attribute values within
tuples.
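As a small illustrative sketch (the department data below is made up for this example), a non-1NF attribute that holds a set of values can be flattened into atomic rows:

```python
# Hypothetical non-1NF data: the second component holds a *set* of locations
# per department, violating the atomic-value requirement of 1NF.
nested = [
    ("D5", ["Houston", "Stafford", "Bellaire"]),
    ("D4", ["Stafford"]),
]

# 1NF repair: one atomic location value per tuple.
flat = [(dnum, loc) for dnum, locs in nested for loc in locs]
print(flat[:2])  # [('D5', 'Houston'), ('D5', 'Stafford')]
```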
Second Normal Form: It is based on the concept of full functional dependency. A dependency X -> Y
is a full functional dependency if, after removing any attribute A from X, the dependency no longer
holds; otherwise, it is a partial dependency. A relation schema is in second normal form if every
nonprime attribute is fully functionally dependent on the primary key.
Third Normal Form: A relation schema is in third normal form if it satisfies second normal form and
no nonprime attribute of R is transitively dependent on the primary key.
Step 2 of 2
The general definitions of 2NF and 3NF differ from the primary-key-based definitions because the
general definitions take the candidate keys into account as well. Under the general definition of a
prime attribute, an attribute that is part of any candidate key is considered prime. Partial and full
functional dependencies and transitive dependencies are then considered with respect to all
candidate keys of a relation.
General definition of 2NF: A relation schema R is in second normal form if no nonprime
attribute A in R is partially dependent on any key of R.
General definition of 3NF: A relation schema R is in 3NF if, whenever a nontrivial
functional dependency X -> A holds in R, either (a) X is a superkey of R, or (b) A is a prime
attribute of R.
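The general 3NF test can be sketched mechanically. This is an illustrative sketch, not from the text; the relation R(A, B, C), its FDs, and its candidate key below are made-up examples.

```python
def closure(attrs, fds):
    """Attribute closure under a set of FDs, each given as (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def is_3nf(all_attrs, fds, candidate_keys):
    """General 3NF: for every nontrivial X -> A, X is a superkey or A is prime."""
    prime = {a for key in candidate_keys for a in key}
    for lhs, rhs in fds:
        if closure(lhs, fds) >= all_attrs:   # condition (a): X is a superkey
            continue
        for a in rhs - lhs:                  # nontrivial part of the FD only
            if a not in prime:               # condition (b) fails
                return False
    return True

# Made-up example: R(A, B, C) with A -> B and B -> C; the only candidate key is {A}.
fds = [(frozenset("A"), frozenset("B")), (frozenset("B"), frozenset("C"))]
print(is_3nf(set("ABC"), fds, [frozenset("A")]))  # False: B -> C violates 3NF
```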
Chapter 14, Problem 9RQ
Problem
Step-by-step solution
Step 1 of 1
2NF removes all partial dependencies of nonprime attributes on the key and ensures that all
nonprime attributes are fully functionally dependent on the key of R.
Chapter 14, Problem 10RQ
Problem
Step-by-step solution
Step 1 of 1
3NF removes all transitive dependencies on the key of R and ensures that no nonprime attribute is
transitively dependent on the key.
Chapter 14, Problem 11RQ
Problem
In what way do the generalized definitions of 2NF and 3NF extend the definitions beyond primary
keys?
Step-by-step solution
Step 1 of 1
The generalized definitions of second normal form and third normal form extend beyond primary
key by taking into consideration all the candidate keys of a relation.
• These definitions do not revolve around only the primary key of a relation.
• These definitions take into consideration all the attribute sets that can serve as a key for the
relation.
• These definitions also consider the partial and transitive dependencies on the candidate keys.
Chapter 14, Problem 12RQ
Problem
Define Boyce-Codd normal form. How does it differ from 3NF? Why is it considered a stronger
form of 3NF?
Step-by-step solution
Step 1 of 3
• In a functional dependency X -> Y, if the attribute Y is fully functionally dependent on X, then X is
said to be a determinant.
Step 2 of 3
BCNF vs. 3NF:
• BCNF is a stronger normal form than 3NF; 3NF is a weaker normal form than BCNF.
• In BCNF, for every nontrivial functional dependency X -> Y, X must be a superkey; Y need not be
a prime attribute.
• In 3NF, a nontrivial functional dependency X -> Y whose left-hand side X is not a superkey is
permitted only if Y is a prime attribute.
Step 3 of 3
• BCNF does not allow some dependencies which are allowed in 3NF.
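A minimal sketch of this distinction, using the classic student/course/instructor relation (the example relation is assumed here, not taken from the text): BCNF demands that the left-hand side of every nontrivial FD be a superkey.

```python
def closure(attrs, fds):
    """Attribute closure under a set of FDs, each given as (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def is_bcnf(all_attrs, fds):
    """BCNF: the left-hand side of every nontrivial FD is a superkey."""
    return all(closure(lhs, fds) >= all_attrs
               for lhs, rhs in fds if not rhs <= lhs)

# TEACH(Student, Course, Instructor) with:
#   {Student, Course} -> Instructor  and  Instructor -> Course.
fds = [(frozenset({"Student", "Course"}), frozenset({"Instructor"})),
       (frozenset({"Instructor"}), frozenset({"Course"}))]
attrs = {"Student", "Course", "Instructor"}

print(is_bcnf(attrs, fds))  # False: Instructor is not a superkey
# Yet the relation is in 3NF, because Course is a prime attribute
# (it belongs to the candidate key {Student, Course}).
```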
Chapter 14, Problem 13RQ
Problem
Step-by-step solution
Step 1 of 3
Multivalued Dependency:
Step 2 of 3
• A multivalued dependency arises when a relation has a constraint that cannot be specified as a
functional dependency.
• It occurs when the presence of one or more tuples in a table implies the presence of one or
more other tuples in the same table.
Step 3 of 3
Here Ename indicates employee name, Pname indicates project name, and Dname indicates
dependent’s name.
This is a multivalued dependency because an employee can work on more than one project and
can have more than one dependent.
Chapter 14, Problem 14RQ
Problem
Does a relation with two or more columns always have an MVD? Show with an example.
Step-by-step solution
Step 1 of 2
In a relation, when one attribute has multiple values referring to another attribute, then it indicates
that there is a multivalued dependency (MVD) in a relation.
In order to remove the MVDs, decompose the relation into two relations as shown below:
Step 2 of 2
A relation with two or more columns will not always have a multivalued dependency (MVD).
An example of a relation with two attributes that does not have an MVD is as follows:
An example of a relation with three attributes that does not have an MVD is as follows:
Chapter 14, Problem 15RQ
Problem
Step-by-step solution
Step 1 of 2
• The fourth normal form is violated if the relation has nontrivial multivalued dependencies whose
left-hand sides are not superkeys; such dependencies are used to identify and decompose the
relations in the relational schema R.
Step 2 of 2
• A relation in fourth normal form is also in Boyce-Codd normal form, and therefore in third normal form.
Chapter 14, Problem 16RQ
Problem
Step-by-step solution
Step 1 of 2
Join dependency:
• It is a constraint specified on a relation schema R, denoted by JD (R1, R2, R3, ..., Rn).
• A join dependency is said to be a trivial join dependency if one of the relation schemas Ri in
JD (R1, R2, ..., Rn) is equal to R itself.
Step 2 of 2
• A relation must satisfy fourth normal form before the fifth normal form is considered.
• Fifth normal form is also called project-join normal form because every join dependency specified
on the relational schema R must correspond to a lossless decomposition of R.
Chapter 14, Problem 17RQ
Problem
Step-by-step solution
Step 1 of 2
• A relation schema R is said to be in fifth normal form with respect to a set of functional,
multivalued, and join dependencies if, for every nontrivial join dependency JD (R1, R2, ..., Rn)
implied by that set, every Ri is a superkey of R.
Step 2 of 2
Consider the case when a supplier (S) supplies parts (P) to projects (J).
This shows a join dependency in the relation, which is decomposed into the three relations shown
above; each resulting relation is in 5NF.
Chapter 14, Problem 18RQ
Problem
Why do practical database designs typically aim for BCNF and not aim for higher normal forms?
Step-by-step solution
Step 1 of 1
Boyce-Codd normal form (BCNF): A relation schema R is in BCNF if, whenever a nontrivial
functional dependency X -> A holds in R, X is a superkey of the relational schema R.
Practical database designers prefer BCNF over the higher normal forms for the following
reasons:
• It reduces the redundancy (duplication) of information across thousands of tuples.
• The data model remains easy to understand with BCNF normalization.
• It generally yields better performance than decomposing further into the higher normal forms.
• It is stronger than 3NF: a relation in BCNF is also in 3NF, but not vice versa.
• In most practical cases, the multivalued and join dependencies that would violate the normal
forms beyond BCNF are not present.
These points explain why, in practice, database designers aim for BCNF rather than the higher
normal forms; doing so preserves the consistency, performance, and quality of the database.
Chapter 14, Problem 19E
Problem
Suppose that we have the following requirements for a university database that is used to keep
track of students’ transcripts:
a. The university keeps track of each student’s name (Sname), student number (Snum), Social
Security number (Ssn), current address (Sc_addr) and phone (Sc_phone), permanent address
(Sp_addr) and phone (Sp_phone), birth date (Bdate), sex (Sex), class (Class) (‘freshman’,
‘sophomore’, …, ‘graduate’), major department (Major_code), minor department (Minor_code) (if
any), and degree program (Prog) (‘b.a.’, ‘b.s.’, ..., ‘ph.d.’). Both Ssn and student number have
unique values for each student.
b. Each department is described by a name (Dname), department code (Dcode), office number
(Doffice), office phone (Dphone), and college (Dcollege). Both name and code have unique
values for each department.
c. Each course has a course name (Cname), description (Cdesc), course number (Cnum),
number of semester hours (Credit), level (Level), and offering department (Cdept). The course
number is unique for each course.
d. Each section has an instructor (Iname), semester (Semester), year (Year), course
(Sec_course), and section number (Sec_num). The section number distinguishes different
sections of the same course that are taught during the same semester/year; its values are 1,2, 3,
..., up to the total number of sections taught during each semester.
e. A grade record refers to a student (Ssn), a particular section, and a grade (Grade).
Design a relational database schema for this database application. First show all the functional
dependencies that should hold among the attributes. Then design relation schemas for the
database that are each in 3NF or BCNF. Specify the key attributes of each relation. Note any
unspecified requirements, and make appropriate assumptions to render the specification
complete.
Step-by-step solution
Step 1 of 4
Functional Dependency:
A functional dependency exists when one set of attributes in a relation uniquely determines another
attribute. It is written X -> Y; both X and Y can be composite.
Step 2 of 4
From the functional dependencies FD 1 and FD 2, the relation STUDENT can be defined. Either
Ssn or Snum can be primary key.
From the functional dependencies FD 3 and FD 4, the relation DEPARTMENT can be defined.
Either Dname or Dcode can be primary key.
From the functional dependencies FD 5, the relation COURSE can be defined. Cnum is the
primary key.
From the functional dependencies FD 6, the relation SECTION can be defined. Sec_num,
Sec_course, Semester, Year will be the composite primary key.
From the functional dependencies FD 7 and FD 8, the relation GRADE can be defined. {Ssn,
Sec_course, Semester, Year} will be the composite primary key.
Step 3 of 4
Explanation:
• In the STUDENT relation, either Ssn or Snum can be the primary key. Either key can be used to
retrieve the data from the STUDENT table.
• In the DEPARTMENT relation, either Dname or Dcode can be the primary key. Either key can be
used to retrieve the data from the DEPARTMENT table.
• The primary key for the SECTION table is {Sec_num, Sec_course, Semester, Year} which is a
composite primary key.
• The primary key for the GRADE table is {Ssn, Sec_course, Semester, Year} which is a
composite primary key.
Step 4 of 4
Chapter 14, Problem 20E
Problem
What update anomalies occur in the EMP_PROJ and EMP_DEPT relations of Figures 14.3 and
14.4?
Step-by-step solution
Step 1 of 2
When the last EMPLOYEE working on a PROJECT is removed, the project information (PNAME,
PNUMBER, PLOCATION) is no longer represented in the database. Likewise, a new PROJECT
cannot be added unless at least one EMPLOYEE is assigned to work on it.
For example, if a different value is entered for PLOCATION than the values in other tuples
with the same value for PNUMBER, we get an update anomaly. Similar comments apply to the
EMPLOYEE information. The reason is that EMP_PROJ represents the relationship between
EMPLOYEEs and PROJECTs and, at the same time, represents information concerning the
EMPLOYEE and PROJECT entities themselves.
Step 2 of 2
{SSN} -> {DNUMBER} -> {DNAME, DMGRSSN}
For example, if a DEPARTMENT temporarily has no EMPLOYEEs working for it, its
information (DNAME, DNUMBER, DMGRSSN) will no longer be represented in the database once
the last EMPLOYEE working in it is removed. A new DEPARTMENT cannot be added unless at
least one EMPLOYEE is assigned to work in it.
Inserting a new tuple relating a new EMPLOYEE to an existing DEPARTMENT requires checking
the transitive dependencies; for example, if a different value is entered for DMGRSSN than the
values in other tuples with the same value for DNUMBER, we get an update anomaly. The
reason is that EMP_DEPT represents the relationship between EMPLOYEEs and
DEPARTMENTs and, at the same time, represents information concerning the EMPLOYEE and
DEPARTMENT entities themselves.
Chapter 14, Problem 21E
Problem
In what normal form is the LOTS relation schema in Figure 14.12(a) with respect to the restrictive
interpretations of normal form that take only the primary key into account? Would it be in the
same normal form if the general definitions of normal form were used?
Step-by-step solution
Step 1 of 1
With respect to the restrictive interpretation of normal form, the LOTS relational schema is in 2NF,
since there are no partial dependencies on the primary key. However, it is not in 3NF, since the
following two transitive dependencies on the primary key exist:
Now, if we take all keys into account and use the general definitions of 2NF and 3NF, the
LOTS relation schema will only be in 1NF, because there is a partial dependency
COUNTY_NAME -> TAX_RATE on the candidate key {COUNTY_NAME, #}, which violates 2NF.
Chapter 14, Problem 22E
Problem
Step-by-step solution
Step 1 of 2
BCNF:
Step 2 of 2
Take the relation schema R = {a, b} with two attributes. The only possible nontrivial FDs are
{a} -> {b} and {b} -> {a}.
Case 1: No FD holds in R. In this case, the key is {a, b} and the relation satisfies BCNF.
Case 2: Only {a} -> {b} holds. In this case, the key is {a} and the relation satisfies BCNF.
Case 3: Only {b} -> {a} holds. In this case, the key is {b} and the relation satisfies BCNF.
Case 4: Both {a} -> {b} and {b} -> {a} hold. In this case, there are two keys, {a} and {b}, and the
relation satisfies BCNF. Hence, a relation with only two attributes is always in BCNF.
Chapter 14, Problem 23E
Problem
Why do spurious tuples occur in the result of joining the EMP_PROJ1 and EMP_ LOCS relations
in Figure 14.5 (result shown in Figure 14.6)?
Step-by-step solution
Step 1 of 1
Spurious tuples are tuples that do not represent valid information. Spurious tuples occur in the
result of joining the EMP_PROJ1 and EMP_LOCS relations because the natural join is performed
on the common attribute Plocation.
• The attribute Plocation is not a primary key or a foreign key in the relations EMP_PROJ1 and
EMP_LOCS.
• As Plocation is not a primary key or a foreign key in the relations EMP_PROJ1 and
EMP_LOCS, it resulted in spurious tuples.
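A small sketch with made-up data shows how joining on a non-key attribute manufactures spurious tuples (the employee and project values below are invented for illustration):

```python
# Hypothetical fragments of EMP_LOCS(Ename, Plocation) and
# EMP_PROJ1(Pnumber, Plocation): two employees and two projects, all in Houston.
emp_locs = [("Smith", "Houston"), ("Wong", "Houston")]
emp_proj1 = [(1, "Houston"), (2, "Houston")]

# Natural join on Plocation, which is neither a primary key nor a foreign key.
joined = [(ename, pnum, loc) for ename, loc in emp_locs
          for pnum, loc2 in emp_proj1 if loc == loc2]

# Even if each employee really works on only one of the projects, the join
# yields all four employee/project combinations, so two tuples are spurious.
print(len(joined))  # 4
```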
Chapter 14, Problem 24E
Problem
Step-by-step solution
Step 1 of 1
A minimal set of attributes whose closure includes all the attributes in R is a key. Since the
closure of {A, B} is {A, B}+ = R, {A, B} is a key of R.
To normalize R intuitively into 2NF and then 3NF, we may follow the steps below.
Step 1:
Identify the partial dependencies that violate 2NF. These are attributes that are functionally
dependent on a proper part of the key.
We can calculate the closures {A}+ and {B}+ to determine the partially dependent attributes:
{A}+ = {A, D, E, I, J}; hence {A} -> {D, E, I, J} ({A} -> {A} is a trivial dependency).
{B}+ = {B, F, G, H}; hence {B} -> {F, G, H} ({B} -> {B} is a trivial dependency).
To normalize into 2NF, we remove the attributes that are functionally dependent on part of
the key (A or B) from R and place them in separate relations R1 and R2, along with the part of
the key they depend on (A or B), which is copied into each of these relations but also remains
in the original relation, which we call R3 below:
The new keys for R1, R2, R3 are underlined. Next, we look for transitive dependencies
The relation R1 has the transitive dependency {A} -> {D} -> {I, J}, so we remove the transitively
dependent attributes {I, J} from R1 into a relation R11 and copy the attribute D they are
dependent on into R11. The remaining attributes are kept in a relation R12. Hence, R1 is
decomposed into R11 and R12 as follows:
The relation R2 is similarly decomposed into R21 and R22 based on the transitive dependency
{B} -> {F} -> {G, H}:
The final set of relations in 3NF are {R11, R12, R21, R22, R3}
Chapter 14, Problem 25E
Problem
Repeat Exercise for the following different set of functional dependencies G = {{A, B} → {C},
{B, D} → {E, F}, {A, D} → {G, H}, {A} → {I}, {H} → {J}}.
Step-by-step solution
Step 1 of 6
The given functional dependencies are:
{A, B} -> {C}
{B, D} -> {E, F}
{A, D} -> {G, H}
{A} -> {I}
{H} -> {J}
Step 1: Find the closure of each single attribute.
{A}+ = {A, I}
{B}+ = {B}
{C}+ = {C}
{D}+ = {D}
{E}+ = {E}
{F}+ = {F}
{G}+ = {G}
{H}+ = {H, J}
{I}+ = {I}
{J}+ = {J}
From the above closures of single attributes, it is clear that the closure of no single attribute
covers all the attributes of relation R. So, no single attribute forms a key for the relation R.
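The closures above can be computed mechanically with the standard attribute-closure algorithm; this sketch (variable names chosen here for illustration) applies it to the set G:

```python
def closure(attrs, fds):
    """Repeatedly apply every FD whose left-hand side is already in the result."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

G = [(frozenset("AB"), frozenset("C")),
     (frozenset("BD"), frozenset("EF")),
     (frozenset("AD"), frozenset("GH")),
     (frozenset("A"),  frozenset("I")),
     (frozenset("H"),  frozenset("J"))]

print(sorted(closure({"A"}, G)))       # ['A', 'I']
print(sorted(closure({"A", "D"}, G)))  # ['A', 'D', 'G', 'H', 'I', 'J']
print(closure({"A", "B", "D"}, G) == set("ABCDEFGHIJ"))  # True: {A, B, D} is a key
```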
Step 2 of 6
Step 2: Find the closure of the pairs of attributes that appear in the set of functional
dependencies.
{A, B}+ = {A, B, C, I}
{B, D}+ = {B, D, E, F}
From the functional dependencies {A, D} -> {G, H}, {A} -> {I}, and {H} -> {J}:
{A, D}+ = {A, D, G, H, I, J}
From the above closures of pairs of attributes, it is clear that the closure of no pair of attributes
covers all the attributes of relation R. So, no pair of attributes forms a key for the relation.
Step 3 of 6
Step 3: Find the closure of the union of the three pairs of attributes that appear in the set of
functional dependencies.
From the functional dependencies {A, B} -> {C}, {B, D} -> {E, F}, {A, D} -> {G, H}, {A} -> {I}, and
{H} -> {J}:
{A, B, D}+ = {A, B, C, D, E, F, G, H, I, J} = R, so {A, B, D} is a key of R.
Step 4 of 6
According to the second normal form, each nonkey attribute must be fully functionally dependent
on the key; partial dependencies must be removed. Decomposing R accordingly gives:
R1 {A, I}
R2 {A, B, C}
R3 {B, D, E, F}
R4 {A, B, D}
R5 {A, D, G, H, J}
The relations R1, R2, R3, R4, and R5 are in second normal form.
Step 5 of 6
According to the third normal form, the relation must be in second normal form, and no nonkey
attribute may describe another nonkey attribute. The relation R5 contains the transitive
dependency through {H} -> {J}, so it is decomposed into:
R6 {A, D, G, H}
R7 {H, J}
Step 6 of 6
The final set of relations that are in third normal form is as follows:
R1 {A, I}
R2 {A, B, C}
R3 {B, D, E, F}
R4 {A, B, D}
R6 {A, D, G, H}
R7 {H, J}
Chapter 14, Problem 26E
Problem
A B C TUPLE#
10 b1 c1 1
10 b2 c2 2
11 b4 c1 3
12 b3 c4 4
13 b1 c1 5
14 b3 c4 6
a. Given the previous extension (state), which of the following dependencies may hold in the
above relation? If the dependency cannot hold, explain why by specifying the tuples that cause
the violation.
i. A → B,
ii. B → C,
iii. C → B,
iv. B → A,
v. C → A
b. Does the above relation have a potential candidate key? If it does, what is it? If it does not,
why not?
Step-by-step solution
Step 1 of 2
a)
i. A -> B does not hold in the current state of the relation, because attribute B has two values
(b1 and b2, in tuples 1 and 2) corresponding to the value 10 of attribute A.
ii. B -> C may hold in the current relation state: no two tuples agree on B while differing on C.
iii. C -> B does not hold in the current state, because attribute B has two values (b1 and b4, in
tuples 1 and 3) corresponding to the value c1 of attribute C.
iv. B -> A does not hold in the current state, because attribute A has two values corresponding to
the value b1 (tuples 1 and 5) and two values corresponding to the value b3 (tuples 4 and 6) of
attribute B.
v. C -> A does not hold in the current state, because attribute A has multiple values
corresponding to the values c1 and c4 of attribute C.
Step 2 of 2
b) If the value of the attribute TUPLE# remains different for all tuples in the relation, it can act as a
candidate key.
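The checks in part (a) can be automated; a sketch (function name chosen here for illustration) applied to the table's extension:

```python
def fd_may_hold(rows, lhs_idx, rhs_idx):
    """True iff no two tuples agree on lhs but differ on rhs (a state-level check only)."""
    seen = {}
    for row in rows:
        key = tuple(row[i] for i in lhs_idx)
        val = tuple(row[i] for i in rhs_idx)
        if key in seen and seen[key] != val:
            return False
        seen[key] = val
    return True

# Columns A, B, C of the extension above (TUPLE# omitted).
rows = [(10, "b1", "c1"), (10, "b2", "c2"), (11, "b4", "c1"),
        (12, "b3", "c4"), (13, "b1", "c1"), (14, "b3", "c4")]
A, B, C = 0, 1, 2

print(fd_may_hold(rows, [A], [B]))  # False: tuples 1 and 2 agree on A, differ on B
print(fd_may_hold(rows, [B], [C]))  # True: B -> C may hold in this state
print(fd_may_hold(rows, [C], [A]))  # False: tuples 1 and 3 agree on C, differ on A
```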
Chapter 14, Problem 27E
Problem
AB → C, CD → E, DE → B
Step-by-step solution
Step 1 of 3
A candidate key is a minimal attribute, or minimal combination of attributes, that can be used
to uniquely identify all the other attributes of the given relation.
A candidate key is checked using the closure of the attribute set under the functional
dependencies of the given relation.
Step 2 of 3
Consider the given relation R (A, B, C, D, E) and the following functional dependencies:
AB -> C, CD -> E, DE -> B
To check whether the key AB is a candidate key of the given relation R, find the closure of AB:
{A, B}+ = {A, B, C}
Since all the attributes of the relation R cannot be identified using the key AB, AB is not a
candidate key for the given relation R.
Step 3 of 3
To check whether the key ABD is a candidate key of the given relation R, find the closure of ABD:
{A, B, D}+ = {A, B, C, D, E}
Since all the attributes of the relation R can be identified using the key ABD, and no proper
subset of ABD identifies them all, ABD is a candidate key for the given relation R.
Hence, proved.
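The two closure computations can be verified with a short sketch of the attribute-closure algorithm:

```python
def closure(attrs, fds):
    """Attribute closure under a set of FDs, each given as (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

R = set("ABCDE")
fds = [(frozenset("AB"), frozenset("C")),
       (frozenset("CD"), frozenset("E")),
       (frozenset("DE"), frozenset("B"))]

print(sorted(closure(set("AB"), fds)))  # ['A', 'B', 'C']: AB is not a superkey
print(closure(set("ABD"), fds) == R)    # True: ABD reaches every attribute of R
# ABD is also minimal: dropping A, B, or D makes the closure fall short,
# so ABD is a candidate key of R.
```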
Chapter 14, Problem 28E
Problem
Consider the relation R, which has attributes that hold schedules of courses and sections at a
university; R = {Course_no, Sec_no, Offering_dept, Credit_hours, Course_level, lnstructor_ssn,
Semester, Year, Days_hours, Room_no, No_of_students}. Suppose that the following functional
dependencies hold on R:
Try to determine which sets of attributes form keys of R. How would you normalize this relation?
Step-by-step solution
Step 1 of 5
Relation
Functional dependencies:
Step 2 of 5
The attributes Offering_dept, Credit_hours, Course_level are added to the closure of Course_no
as Course_no functionally determines Offering_dept, Credit_hours, Course_level.
Step 3 of 5
Step 4 of 5
Step 5 of 5
Chapter 14, Problem 29E
Problem
Consider the following relations for an order-processing application database at ABC, Inc.
Assume that each item has a different discount. The Total_price refers to one item, Odate is the
date on which the order was placed, and the Total_amount is the amount of the order. If we apply
a natural join on the relations ORDER_ITEM and ORDER in this database, what does the
resulting relation schema RES look like? What will be its key? Show the FDs in this resulting
relation. Is RES in 2NF? Is it in 3NF? Why or why not? (State assumptions, if you make any.)
Step-by-step solution
Step 1 of 4
The natural join of two relations can be performed only when the relations have a common
attribute with the same name.
The relations ORDER and ORDER_ITEM have O# as a common attribute. So, based on the
attribute O#, the natural join of two relations ORDER and ORDER_ITEM can be performed.
The resulting relation RES when natural join is applied on relations ORDER and ORDER_ITEM
is as follows:
Step 2 of 4
Step 3 of 4
The relation RES is not in second normal form, as partial dependencies exist in the relation.
• O# is a part of the primary key, and it alone functionally determines Odate, Cust#, and
Total_amount.
Step 4 of 4
According to the third normal form, the relation must be in second normal form and any non-key
attribute should not describe any non-key attribute.
The relation RES is not in third normal form as it is not in second normal form.
Chapter 14, Problem 30E
Problem
Assume that a car may be sold by multiple salespeople, and hence {Car#, Salesperson#} is the
primary key. Additional dependencies are
Salesperson# → Commission%
Based on the given primary key, is this relation in 1NF, 2NF, or 3NF? Why or why not? How would
you successively normalize it completely?
Step-by-step solution
Step 1 of 4
The relation CAR_SALE is in first normal form (1NF) but not in second normal form.
• According to the first normal form, the relation should contain only atomic values.
• As the relation CAR_SALE contains only atomic values, the relation CAR_SALE is in the first
normal form.
Step 2 of 4
The relation CAR_SALE is not in second normal form as partial dependencies exist in the
relation.
• According to the second normal form, each non-key attribute must depend only on primary key.
• As partial dependency exists in the relation, the relation CAR_SALE is not in second normal
form.
• In order to satisfy second normal form, remove the partial dependencies by decomposing the
relation as shown below:
Step 3 of 4
The relation CAR_SALE2 is in third normal form but the relation CAR_SALE1 is not in third
normal form as transitive dependencies exist in the relation.
• According to the third normal form, the relation must be in second normal form and any non-key
attribute should not describe any non-key attribute.
• As transitive dependency exists in the relation, the relation CAR_SALE1 is not in third normal
form.
• In order to satisfy third normal form, remove the transitive dependencies by decomposing the
relation CAR_SALE1 as shown below:
• The relations CAR_SALE3 and CAR_SALE4 are now in third normal form.
Step 4 of 4
The final set of relations that are in third normal form is as follows:
Chapter 14, Problem 31E
Problem
Author_affil refers to the affiliation of author. Suppose the following dependencies exist:
Book_type → List_price
Author_name → Author_affil
b. Apply normalization until you cannot decompose the relations further. State the reasons
behind each decomposition.
Step-by-step solution
Step 1 of 4
a.
The relation Book is in first normal form (1NF) but not in second normal form.
Explanation:
• According to the first normal form, the relation should contain only atomic values.
• As the relation Book contains only atomic values, the relation Book is in the first normal form.
• According to the second normal form, each non-key attribute must depend only on primary key.
• Book_title is a partial primary key and it functionally determines Publisher and Book_type.
• As partial dependency exists in the relation, the relation Book is not in second normal form.
Step 2 of 4
b.
The relation Book is in first normal form. It is not in second normal form as partial dependencies
exist in the relation.
In order to satisfy second normal form, remove the partial dependencies by decomposing the
relation as shown below:
List_price)
Author(Author_name, Author_affil)
The relations Book_author, Book_publisher and Author are in second normal form.
Step 3 of 4
According to the third normal form, the relation must be in second normal form and any non-key
attribute should not describe any non-key attribute.
• The relation Book_publisher is not in third normal form, as a transitive dependency exists in the
relation.
• In order to satisfy third normal form, remove the transitive dependencies by decomposing the
relation Book_publisher as shown below:
Book_details(Book_title, Publisher,Book_type)
The relations Book_author, Book_details, Book_price and Author are in third normal form.
Step 4 of 4
The final set of relations that are in third normal form is as follows:
Book_author (Book_title, Author_name)
Author(Author_name, Author_affil)
Chapter 14, Problem 32E
Problem
This exercise asks you to convert business statements into dependencies. Consider the relation
DISK_DRIVE (Serial_number, Manufacturer, Model, Batch, Capacity, Retailer). Each tuple in the
relation DISK_DRIVE contains information about a disk drive with a unique Serial_number, made
by a manufacturer, with a particular model number, released in a certain batch, which has a
certain storage capacity and is sold by a certain retailer. For example, the tuple Disk_drive
(‘1978619’, ‘WesternDigital’, ‘A2235X’, ‘765234’, 500, ‘CompUSA’) specifies that WesternDigital
made a disk drive with serial number 1978619 and model number A2235X, released in batch
765234; it is 500GB and sold by CompUSA.
d. All disk drives of a certain model of a particular manufacturer have exactly the same capacity.
Step-by-step solution
Step 1 of 1
a)
b)
model → manufacturer
c)
d)
model → capacity
Chapter 14, Problem 33E
Problem
In the above relation, a tuple describes a visit of a patient to a doctor along with a treatment code
and daily charge. Assume that diagnosis is determined (uniquely) for each patient by a doctor.
Assume that each treatment code has a fixed charge (regardless of patient). Is this relation in
2NF? Justify your answer and decompose if necessary. Then argue whether further
normalization to 3NF is necessary, and if so, perform it.
Step-by-step solution
Step 1 of 1
{Treat_code}→{Charge}
Since there are no partial dependencies, the given relation is in 2NF. It is not in 3NF, because
Charge is a nonkey attribute that is determined by another nonkey attribute, Treat_code.
Decomposing gives:
R1 (Treat_code, Charge)
We could further infer that the treatment for a given diagnosis is functionally dependent, but we
should be sure to allow the doctor some flexibility when prescribing cures.
Chapter 14, Problem 34E
Problem
This relation refers to options installed in cars (e.g., cruise control) that were sold at a dealership,
and the list and discounted prices of the options.
Step-by-step solution
Step 1 of 3
Sale_date, Option_discountedprice)
Car_id -> Sale_date
Option_type -> Option_listprice
Step 2 of 3
In order for a relation to be in third normal form, all nontrivial functional dependencies must be
fully dependent on the primary key and any non-key attribute should not describe any non-key
attribute. In other words, there should not be any partial dependency and transitive dependency.
• In the functional dependency Car_id -> Sale_date, Car_id is a partial key that determines
Sale_date. Hence, a partial dependency exists in the relation.
Step 3 of 3
According to the second normal form, the relation must be in first normal form and each non-key
attribute must depend only on primary key. In other words, there should not be any partial
dependency.
• In the functional dependency Car_id -> Sale_date, Car_id is a partial key that determines
Sale_date. Hence, a partial dependency exists in the relation.
Chapter 14, Problem 35E
Problem
a. Based on a common-sense understanding of the above data, what are the possible candidate
keys of this relation?
b. Justify that this relation has the MVD {Book} ↠ {Author} | {Edition, Year}.
c. What would be the decomposition of this relation based on the above MVD? Evaluate each
resulting relation for the highest normal form it possesses.
Step-by-step solution
Step 1 of 3
Candidate Key
A candidate key may be a single attribute or a set of attributes that uniquely identifies tuples or
records in a database. Attributes that belong to some candidate key are called prime attributes,
and the rest of the attributes in the table are called nonprime attributes.
Book_Name is the same in all rows, so it cannot be considered part of a candidate key.
a.
All above sets are candidate keys. Any one candidate key can be implemented. (Author, Edition),
(Author, Copyright_Year) will be a better choice to implement.
Step 2 of 3
b.
An MVD occurs when the presence of one or more tuples in a table implies the presence of one
or more other tuples in the same table. If at least two rows of the table agree on all the implying
attributes, then their remaining components may be swapped, and the resulting tuples must also
be present in the table. MVDs play a very important role in 4NF.
By the definition of an MVD, Book_Name implies more than one Author and more than one
(Edition, Copyright_Year) pair. If the components of Author and of (Edition, Copyright_Year) are
swapped, the resulting rows are still present in the table. Therefore, the relation has the MVD
{Book} ↠ {Author} | {Edition, Year}.
Step 3 of 3
c.
If a relation has an MVD, redundant values appear in its tuples; since no nontrivial functional
dependencies exist in the relation, it is in BCNF. The relation can therefore be decomposed into
the following relations:
Again, BOOK1 still has an MVD. Decompose it further, and the final schema will be in the
highest normal form.
Chapter 14, Problem 36E
Problem
This relation refers to business trips made by company salespeople. Suppose the TRIP has a
single Start_date but involves many Cities and salespeople may use multiple credit cards on the
trip. Make up a mock-up population of the table.
Step-by-step solution
Step 1 of 2
The relation TRIP has the unique attribute Trip_id, and a particular Trip_id has a single
Start_date for the trip. So Start_date is fully functionally dependent on Trip_id.
a.
FD1: ( )
Cities_visited and Cards_used may repeat for a particular Start_date or Trip_id. Cities_visited and
Cards_used are independent of each other, and each can have multiple values. Both
Cities_visited and Cards_used depend on Trip_id and Start_date, so the MVDs present in
the relation are as follows:
MVD1: ( )
MVD2: ( )
Step 2 of 2
b.
Normalizing the relation:
The relation has one FD and two MVDs, so first split the relation to remove the functional
dependency FD1.
Now split the relation to remove the multivalued dependencies. Cities_visited and Cards_used
are independent of each other; if their components are swapped, the relation remains
unchanged. On the basis of Start_date, the relation can be decomposed as follows:
Comment
Chapter 15, Problem 1RQ
Problem
What is the role of Armstrong’s inference rules (inference rules IR1 through IR3) in the
development of the theory of relational design?
Step-by-step solution
Step 1 of 1
There are six inference rules (IR) for functional dependencies (FDs), of which the first three —
reflexive, augmentation, and transitive — are referred to as Armstrong's axioms:
IR1 (reflexive): if Y ⊆ X, then X → Y; any set of attributes functionally determines itself and every
subset of itself.
IR2 (augmentation): if X → Y, then XZ → YZ; extending both sides of an FD with the same
attributes yields another valid FD.
IR3 (transitive): if X → Y and Y → Z, then X → Z.
Database designers specify the set of functional dependencies F that hold on the attributes of a
relation R, and then IR1, IR2, and IR3 are used to infer the additional functional dependencies
that hold on R. These three inference rules suffice to infer all new functional dependencies (the
remaining rules can themselves be derived from them). Hence they define new facts and are
preferred by database designers in relational database design.
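Applying IR1 through IR3 repeatedly is exactly what the standard attribute-closure algorithm does. The following Java sketch (class and identifier names are our own, not from the text) computes the closure X+ of an attribute set X under a set of FDs:

```java
import java.util.*;

public class FDClosure {
    // One functional dependency lhs -> rhs, each side a set of attribute names.
    public record FD(Set<String> lhs, Set<String> rhs) {}

    // Compute X+ by a fixpoint loop: reflexivity seeds the result with X
    // itself; the loop captures the effect of augmentation and transitivity.
    public static Set<String> closure(Set<String> attrs, List<FD> fds) {
        Set<String> result = new HashSet<>(attrs);
        boolean changed = true;
        while (changed) {
            changed = false;
            for (FD fd : fds) {
                if (result.containsAll(fd.lhs()) && !result.containsAll(fd.rhs())) {
                    result.addAll(fd.rhs());
                    changed = true;
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<FD> f = List.of(
            new FD(Set.of("A"), Set.of("B")),
            new FD(Set.of("B"), Set.of("C")));
        // A -> C is inferred exactly as IR3 predicts: closure contains A, B, C.
        System.out.println(closure(Set.of("A"), f));
    }
}
```

Running closure on {A} with F = {A → B, B → C} returns {A, B, C}; the dependency A → C is derived exactly as IR3 predicts.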
Chapter 15, Problem 2RQ
Problem
What is meant by the completeness and soundness of Armstrong's inference rules?
Step-by-step solution
Step 1 of 1
The first three inference rules (IR) for functional dependencies (FDs) — reflexive, augmentation,
and transitive — are referred to as Armstrong's inference rules:
IR1 (reflexive): if Y ⊆ X, then X → Y; any set of attributes functionally determines itself.
IR2 (augmentation): if X → Y, then XZ → YZ; extending both sides of an FD with the same
attributes yields another valid FD.
IR3 (transitive): if X → Y and Y → Z, then X → Z.
As proved by Armstrong, the inference rules IR1, IR2, and IR3 are sound and complete.
Sound
Soundness means that, for any set of functional dependencies F specified on a relation schema R,
any dependency derived from F by using IR1 through IR3 holds in every relation state of R that
satisfies the dependencies in F.
Complete
Completeness means that applying IR1 through IR3 repeatedly, until no more dependencies can
be derived, results in the complete set of all possible dependencies that can be inferred from F
(the closure F+).
Chapter 15, Problem 3RQ
Problem
What is meant by the closure of a set of functional dependencies? Illustrate with an example.
Step-by-step solution
Step 1 of 2
The closure of a set F of functional dependencies, denoted F+, is the set consisting of the
functional dependencies in F together with all the functional dependencies that can be inferred
from (implied by) F.
Step 2 of 2
Example:
Consider a relation Student with attributes StudentNo, Sname, address, DOB, CourseNo,
CourseName, Credits, and Duration.
Suppose, for instance, F = {StudentNo → {Sname, address, DOB}, CourseNo → {CourseName,
Credits, Duration}}.
Hence F+ additionally contains inferred dependencies such as {StudentNo, CourseNo} →
{Sname, CourseName}, as well as the trivial dependencies given by reflexivity, e.g. StudentNo →
StudentNo.
Chapter 15, Problem 4RQ
Problem
When are two sets of functional dependencies equivalent? How can we determine their
equivalence?
Step-by-step solution
Step 1 of 1
• Two sets of functional dependencies A and B are equivalent if A+ = B+, that is, if A covers B
and B covers A.
• Whether A covers B is determined by computing the closure X+ with respect to A for each FD
X → Y in B, then checking whether X+ includes the attributes in Y. If this holds true for
every FD in B, then A covers B. The same test determines whether B covers A; if both tests
succeed, A and B are equivalent.
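The covering test just described can be written down directly. In this Java sketch (identifier names are ours, not from the text), covers(a, b) computes X+ under a for each X → Y in b and checks that Y is included:

```java
import java.util.*;

public class FDEquivalence {
    public record FD(Set<String> lhs, Set<String> rhs) {}

    // Attribute closure of attrs under fds (fixpoint loop).
    static Set<String> closure(Set<String> attrs, List<FD> fds) {
        Set<String> result = new HashSet<>(attrs);
        boolean changed = true;
        while (changed) {
            changed = false;
            for (FD fd : fds)
                if (result.containsAll(fd.lhs()) && result.addAll(fd.rhs()))
                    changed = true;
        }
        return result;
    }

    // a covers b if, for every X -> Y in b, X+ computed under a contains Y.
    static boolean covers(List<FD> a, List<FD> b) {
        for (FD fd : b)
            if (!closure(fd.lhs(), a).containsAll(fd.rhs())) return false;
        return true;
    }

    // a and b are equivalent when each covers the other.
    public static boolean equivalent(List<FD> a, List<FD> b) {
        return covers(a, b) && covers(b, a);
    }

    public static void main(String[] args) {
        List<FD> a = List.of(new FD(Set.of("A"), Set.of("B", "C")));
        List<FD> b = List.of(new FD(Set.of("A"), Set.of("B")),
                             new FD(Set.of("A"), Set.of("C")));
        System.out.println(equivalent(a, b)); // true: both have the same closure
    }
}
```

Note that {A → BC} and {A → B, A → C} come out equivalent, while dropping A → C from the second set breaks the equivalence.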
Chapter 15, Problem 5RQ
Problem
What is a minimal set of functional dependencies? Does every set of dependencies have a
minimal equivalent set? Is it always unique?
Step-by-step solution
Step 1 of 1
A set of functional dependencies F is minimal if it satisfies the following conditions:
1. Every dependency in F has a single attribute on its right-hand side.
2. We cannot replace any dependency X → A in F with a dependency Y → A, where Y is a
proper subset of X, and still have a set of dependencies equivalent to F.
3. We cannot remove any dependency from F and still have a set of dependencies equivalent
to F.
Condition 1 states that every dependency is in a canonical form, with a single attribute on the
right-hand side.
Conditions 2 and 3 ensure that no dependency is redundant, either by having extraneous
attributes on the left-hand side of a dependency or by being derivable from the remaining FDs
in F, respectively.
Every set of dependencies has at least one minimal equivalent set (a minimal cover), but it is
not always unique: a set of FDs may have several different minimal covers.
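As a worked illustration of the three conditions above (the dependencies here are our own example, not from the text), consider:

F = {A → BC, B → C, A → B, AB → C}

Step 1 (single right-hand sides): rewrite F as {A → B, A → C, B → C, AB → C}.
Step 2 (no extraneous left-hand attributes): in AB → C, the attribute B is extraneous, because A+ under F already contains C (via A → B and B → C); AB → C therefore reduces to A → C.
Step 3 (no redundant dependencies): A → C is itself implied by A → B and B → C, so it is removed.

The minimal cover is therefore {A → B, B → C}.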
Chapter 15, Problem 6RQ
Problem
What is meant by the attribute preservation condition on a decomposition?
Step-by-step solution
Step 1 of 1
Decomposition:
Attribute preservation
Every attribute must appear in some relation; all attributes must be preserved through the
process of normalization.
Using the functional dependencies, the algorithms decompose the universal relation schema R
into a set of relation schemas D = {R1, R2, ..., Rm} that will become the relational database
schema; D is called a decomposition of R.
Each attribute in R will appear in at least one relation schema Ri in the decomposition, so that
no attributes are lost.
Chapter 15, Problem 7RQ
Problem
Why are normal forms alone insufficient as a condition for a good schema design?
Step-by-step solution
Step 1 of 1
Normal forms alone are insufficient as a condition for good schema design; a good design must
also exhibit the desirable properties of decompositions, both of which are used by the design
algorithms to achieve desirable decompositions.
It is insufficient to test the relation schemas independently of one another for compliance with
higher normal forms like 2NF, 3NF, and BCNF. The resulting relations must collectively satisfy
two additional properties — dependency preservation and the lossless (nonadditive) join
property — to qualify as a good design.
Chapter 15, Problem 8RQ
Problem
What is the dependency preservation property for a decomposition? Why is it important?
Step-by-step solution
Step 1 of 2
The projection of F on Ri, denoted πRi(F), where Ri is a subset of R, is the set of all functional
dependencies X → Y in F+ such that the attributes in X ∪ Y are all contained in Ri. Hence the
projection of F on each relation schema Ri in the decomposition is the set of functional
dependencies in F+ such that all their LHS and RHS attributes are in Ri.
Step 2 of 2
Important:-
1) With this property, we can easily check that updates to the database do not result in
illegal relations being created.
2) It also allows us to verify updates without having to compute natural joins of the decomposed
relations just to know whether a constraint still holds.
Chapter 15, Problem 9RQ
Problem
Why can we not guarantee that BCNF relation schemas will be produced by dependency-
preserving decompositions of non-BCNF relation schemas? Give a counterexample to illustrate
this point.
Step-by-step solution
Step 1 of 3
Consider the relation R(A, B, C) (a standard counterexample) with the functional dependencies
FD1: {A, B} → C
FD2: C → B
Step 2 of 3
The candidate keys of R are {A, B} and {A, C}, so B is a prime attribute. FD2 does not violate
3NF, because its right-hand side is prime, but it violates BCNF, because C is not a superkey.
Step 3 of 3
A relation that is not in BCNF should be decomposed so as to meet this property. Here the
BCNF decomposition {R1(C, B), R2(A, C)} is lossless, but the dependency {A, B} → C is
preserved in neither schema. Thus reaching BCNF may require forgoing the preservation of
some functional dependencies in the decomposed relations.
Chapter 15, Problem 10RQ
Problem
What is the lossless (or nonadditive) join property of a decomposition? Why is it important?
Step-by-step solution
Step 1 of 1
This is one of the properties of decomposition. The word loss in lossless refers to loss of
information, not to loss of tuples.
A decomposition D = {R1, R2, ..., Rm} of R has the lossless (nonadditive) join property with
respect to the set of dependencies F on R if, for every relation state r of R that satisfies F, the
natural join of the projections of r onto the Ri equals r:
*(πR1(r), ..., πRm(r)) = r
Importance:
The important feature of a lossless decomposition is that no spurious tuples appear when the
projections are joined back together. If the relations chosen do not each have total information
about the entity or relationship, then when we join the relations we obtain tuples that do not
actually belong in the original relation — spurious tuples. The text illustrates this with the
EMP_PROJ relation and projections involving SSN and ENAME.
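For binary decompositions there is a simple FD-based test for this property (Property NJB in the text): {R1, R2} is lossless if and only if the closure of R1 ∩ R2 contains R1 − R2 or R2 − R1. A Java sketch (identifier names are ours):

```java
import java.util.*;

public class NJBTest {
    public record FD(Set<String> lhs, Set<String> rhs) {}

    // Attribute closure of attrs under fds (fixpoint loop).
    static Set<String> closure(Set<String> attrs, List<FD> fds) {
        Set<String> result = new HashSet<>(attrs);
        boolean changed = true;
        while (changed) {
            changed = false;
            for (FD fd : fds)
                if (result.containsAll(fd.lhs()) && result.addAll(fd.rhs()))
                    changed = true;
        }
        return result;
    }

    // Property NJB: {r1, r2} is lossless iff the common attributes
    // functionally determine r1 - r2 or r2 - r1.
    public static boolean lossless(Set<String> r1, Set<String> r2, List<FD> fds) {
        Set<String> common = new HashSet<>(r1);
        common.retainAll(r2);
        Set<String> commonClosure = closure(common, fds);
        Set<String> only1 = new HashSet<>(r1); only1.removeAll(r2);
        Set<String> only2 = new HashSet<>(r2); only2.removeAll(r1);
        return commonClosure.containsAll(only1) || commonClosure.containsAll(only2);
    }

    public static void main(String[] args) {
        // REFRIG-style example: F = {M -> MP, MP -> C}; decomposition
        // {M,Y,P} and {M,MP,C}; the common attribute M determines MP and C.
        List<FD> f = List.of(new FD(Set.of("M"), Set.of("MP")),
                             new FD(Set.of("MP"), Set.of("C")));
        System.out.println(
            lossless(Set.of("M", "Y", "P"), Set.of("M", "MP", "C"), f)); // true
    }
}
```

By contrast, decomposing R(A, B, C) with B → C into {A, B} and {A, C} fails the test, since the common attribute A determines neither side's difference.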
Chapter 15, Problem 11RQ
Problem
Between the properties of dependency preservation and losslessness, which one must definitely
be satisfied? Why?
Step-by-step solution
Step 1 of 1
Dependency preservation and losslessness are both properties of decompositions, and both are
used by the design algorithms to achieve desirable decompositions. Of the two, the lossless
(nonadditive) join property must definitely be satisfied.
Dependency preservation makes it possible to enforce a constraint of the original relation on the
corresponding instances of the smaller relations.
Losslessness ensures that any instance of the original relation can be recovered from the
corresponding instances of the smaller relations: no spurious rows are generated when the
relations are reunited through a natural join operation. Because violating it loses information
irretrievably, it can never be sacrificed.
Dependency preservation alone is not sufficient: testing the relation schemas independently of
one another for compliance with higher normal forms like 2NF, 3NF, and BCNF does not by itself
guarantee a good design.
Chapter 15, Problem 12RQ
Problem
Discuss the problems of NULL values and dangling tuples in relational database design.
Step-by-step solution
Step 1 of 2
When designing a relational database schema, we must consider the problems with NULLs. A
NULL can mean one of the following:
1) The attribute does not apply to this tuple.
2) The attribute value for this tuple is unknown.
3) The value is known but absent; that is, it has not been recorded yet.
Step 2 of 2
Dangling tuples:
Consider a pair of relations r1 and r2 and their natural join r1 ⋈ r2. A tuple in r1 that does not
join with any tuple in r2 is called a dangling tuple.
Example:
Suppose there is a tuple in the account relation with branch name "Town 1", but no matching
tuple in the branch relation for the Town 1 branch. This is undesirable, as the account should
refer to a branch that exists.
Conversely, suppose there is a tuple in the branch relation with no matching tuple in the account
relation. This means that a branch exists for which no accounts exist, which can legitimately
occur, for example, when a branch is first being opened.
Chapter 15, Problem 13RQ
Problem
Illustrate how the process of creating first normal form relations may lead to multivalued
dependencies. How should the first normalization be done properly so that MVDs are avoided?
Step-by-step solution
Step 1 of 2
Multivalued dependencies are a consequence of first normal form, which disallows an attribute in
a tuple from having a set of values. If we have two or more multivalued independent attributes in
the same relation schema, we get into the problem of having to repeat every value of one of the
attributes with every value of the other attribute to keep the relation state consistent and to
maintain the independence among the attributes involved. This constraint is specified by a
multivalued dependency.
For example, consider an EMP relation with attributes Ename, Project_name, and
Dependent_name, containing the tuples:
1.) ('a','x','n')
2.) ('a','x','m')
3.) ('a','y','n')
4.) ('a','y','m')
Step 2 of 2
Here the employee named 'a' has two dependents and works on two projects. Since each
attribute value must be atomic, the problem of multivalued dependency arises in the relation: all
four combinations above must be stored. Informally, whenever two independent 1:N relationships
A:B and A:C are mixed in the same relation R(A, B, C), an MVD may arise.
To do the first normalization properly, each independent multivalued attribute should be placed
in a relation of its own. The property NJB' — the relation schemas R1 and R2 form a nonadditive
join decomposition of R with respect to a set of functional and multivalued dependencies if and
only if (R1 ∩ R2) ↠ (R1 − R2) (equivalently, (R1 ∩ R2) ↠ (R2 − R1)) — deals with the problem
of MVDs; using this property we obtain relations that are in 1NF and do not have MVDs, here
EMP_PROJECTS(Ename, Project_name) and EMP_DEPENDENTS(Ename, Dependent_name).
Chapter 15, Problem 14RQ
Problem
Step-by-step solution
Step 1 of 1
A foreign key (referential integrity) constraint relates attributes across relations, so it cannot be
specified as a functional or multivalued dependency, both of which are defined within a single
relation; it is instead represented as an inclusion dependency.
Class/subclass relationship:
The class/subclass relationship also represents a relationship between two relations, and it
likewise has no formal definition in terms of the functional, multivalued, and join dependencies;
it too can be represented by inclusion dependencies.
Chapter 15, Problem 15RQ
Problem
How do template dependencies differ from the other types of dependencies we discussed?
Step-by-step solution
Step 1 of 2
Template dependencies:
A template dependency specifies a constraint by exhibiting a template, or pattern, of tuples rather
than by a fixed algebraic rule; this generality is what distinguishes template dependencies from
the other types of dependencies discussed. One part of the template consists of a number of
hypothesis tuples that are assumed to appear in one or more relations.
Step 2 of 2
The other part of the template is the template conclusion. The conclusion is a set of tuples that
must also exist in the relations if the hypothesis tuples are there.
Take a relation R(X, Y). Applying a template dependency to this relation, the template for the
functional dependency X → Y is:
Hypothesis: two tuples t1 and t2 of R with t1[X] = t2[X]
Conclusion: t1[Y] = t2[Y]
Chapter 15, Problem 16RQ
Problem
Why is the domain-key normal form (DKNF) known as the ultimate normal form?
Step-by-step solution
Step 1 of 1
The idea behind domain-key normal form (DKNF) is to specify the ultimate normal form: one that
takes into account all possible types of dependencies and constraints, such that every constraint
that should hold on valid relation states can be enforced simply by domain constraints and key
constraints.
- DKNF is the ultimate normal form in the sense that a relation in DKNF has no modification
anomalies, so no higher normal form is needed.
- In domain-key normal form, every constraint on the relation is a logical consequence of
the definition of its keys and domains.
Chapter 15, Problem 17E
Problem
Show that the relation schemas produced by Algorithm 15.4 are in 3NF.
Step-by-step solution
Step 1 of 1
Assume that one of the relation schemas Ri formed by Algorithm 15.4 is not in 3NF.
Then a functional dependency M → A holds in the relation schema Ri, where A is not prime and
M is not a superkey of Ri. Since Ri was created for a dependency X → A of the minimal cover, M
must be a proper subset of X, or else M would contain X and therefore would be a superkey.
If both X → A and M → A hold with M a proper subset of X, then this contradicts the condition
that X → A is a functional dependency in a minimal cover of functional dependencies, since
removing an extraneous attribute from the left-hand side X would still leave a valid functional
dependency.
This infringes one of the minimality conditions; hence no such M → A can exist, and the
relation schemas Ri must be in 3NF.
Chapter 15, Problem 18E
Problem
Show that, if the matrix S resulting from Algorithm 15.3 does not have a row that is all a symbols,
projecting Son the decomposition and joining it back will always produce at least one spurious
tuple.
Step-by-step solution
Step 1 of 2
Consider the matrix S to be some relation state r of R (from Step 1 of the algorithm). Row i in S
represents a tuple ti corresponding to relation Ri: it has 'a' symbols in the columns that
correspond to the attributes of Ri and 'b' symbols in the remaining columns.
During the loop, the algorithm transforms the rows of this matrix so that they represent tuples
that satisfy all the functional dependencies in F. Any two rows in S representing two tuples of r
that agree in their values for the left-hand-side attributes X of a functional dependency X → Y
in F are made to agree in their values for the right-hand-side attributes Y as well.
If any row in S ends up with all 'a' symbols, then the decomposition has the nonadditive join
property with respect to F.
On the other hand, if no row ends up being all 'a' symbols, decomposition D does not satisfy the
lossless-join property.
Step 2 of 2
At this point the relation state r represented by S satisfies the dependencies in F but does not
satisfy the nonadditive join condition: every row still contains some 'b' symbols. Projecting S on
the decomposition and joining the projections back together therefore yields at least one tuple
that was not in r — a spurious tuple.
So, if the resulting matrix S does not have a row with all 'a' symbols, the decomposition does
not have the lossless-join property.
Chapter 15, Problem 19E
Problem
Show that the relation schemas produced by Algorithm 15.5 are in BCNF.
Step-by-step solution
Step 1 of 2
In Algorithm 15.5, the loop continues until all relation schemas in D are in BCNF:
Step 1: Set D := {R}.
Step 2: While there is a relation schema Q in D that is not in BCNF, find a functional
dependency X → Y in Q that violates BCNF and replace Q in D by the two relation
schemas (Q − Y) and (X ∪ Y).
Step 2 of 2
According to this algorithm, each iteration decomposes one relation schema Q that is not in
BCNF into two relation schemas. By the lossless join property for binary decompositions
(Property NJB) and Claim 2 (preservation of nonadditivity in successive decompositions), both
mentioned in the textbook, the decomposition D has the nonadditive join property. Each schema
produced for a violating dependency satisfies that dependency as a key constraint, and the loop
terminates only when every schema in D is in BCNF.
Example:
With the dependencies Company_name → Tax_rate and Area → Company_name on a schema
(Area, Price, Company_name, Tax_rate), the violating dependency Company_name → Tax_rate
first splits the schema into (Company_name, Tax_rate) and (Area, Price, Company_name); the
remaining violating dependency Area → Company_name then splits the latter into
(Area, Company_name) and (Area, Price), at which point every schema is in BCNF.
Chapter 15, Problem 20E
Problem
Step-by-step solution
Step 1 of 6
The following program converts a relational schema into 3NF. SynthesisAlgorithm is a public
class with a main method that starts execution. The program first reads the attribute names and
functional dependencies of the relation from the keyboard and stores them in several lists.
The first step computes a minimal cover of the functional dependencies. The second step
determines the attributes of each resulting relation. The third step checks whether or not the
primary key is contained in any of the relations. The fourth step finds any redundant relation
and removes it from the schema.
Following is the java code to implement Synthesis algorithm to convert a relation into 3NF.
import java.util.*;
import java.io.*;
everywhere.");
String relationName=br.readLine();
the Relation?");
int n=Integer.parseInt(br.readLine());
line:");
LinkedList<String> attributeList=new
LinkedList<String>();
for(int i=0;i<n;i++)
attributeList.add(br.readLine());
int numOfFuncDep=Integer.parseInt(br.readLine());
Functional Dependencies.
LinkedList<String>[] fucDepLHSattr=new
LinkedList[numOfFuncDep];
Functional Dependencies.
LinkedList<String>[] fucDepRHSattr=new
LinkedList[numOfFuncDep];
for(int i=0;i<numOfFuncDep;i++)
{
// Left Hand side of functional dependency might // have more than one determinants.
fucDepLHSattr[i]=new LinkedList<String>();
functional dependency["+i+"]");
int temp1=Integer.parseInt(br.readLine());
LHS["+i+"]");
for(int j=0;j<temp1;j++)
fucDepLHSattr[i].add(br.readLine());
// Right Hand side of functional dependency might // have more than one determinants.
functional dependency["+i+"]");
int temp2=Integer.parseInt(br.readLine());
RHS["+i+"]");
for(int j=0;j<temp2;j++)
fucDepRHSattr[i].add(br.readLine());
cover...");
HashMap<String,String> canonicalFDs=new
HashMap<String,String>();
// calling the minimal cover to calculate minimum FDs // required for the relation.
canonicalFDs=minimalCover(fucDepLHSattr,fucDepRHSattr)
Step 2 of 6
for(int i=0;i<numOfFuncDep;i++)
// Since, HashMap has unique key, value pair, it // will remove redundant FDs.
canonicalFDs.get(i).containsKey(canonicalFDs.get
(j));
canonicalFDs=minusFD(canonicalFDs.get(i),
canonicalFDs.get(i));
for(int i=0;i<canonicalFDs.size();i++)
System.out.print("Relation"+i+": ");
System.out.print(relationName+"("+canonicalFDs.get(
i)+","+canonicalFDs.get(i)+")");
System.out.print("\n");
ttr))
exist:");
for(int i=0;i<canonicalFDs.size();i++)
i)+","+canonicalFDs.get(i)+")");
map.remove(pair);
minimalCover(LinkedList[] LHSlist,LinkedList[]
RHSlist)
//if the set of FDs are null this will throw //exception.
if(LHSlist==null || RHSlist==null)
else
HashMap<String,String> canonicalFDs=new
HashMap<String,String>();
canonicalFDs.put(convertIntoCanonical(LHSlist[i],RHSl
ist[i]));
return canonicalFDs;
convertIntoCanonical(LinkedList<String>
list1,LinkedList<String> list2)
HashMap<String,String> map=new
HashMap<String,String>();
// both loop will insert FDs into map, that hold only // unique pair.
for(int j=0;j<list1.size();j++)
for(int i=0;i<list2.size();i++)
map.put(list1.get(i),list2.get(i));
return map;
Step 3 of 6
The following program converts a relation into BCNF using the relational decomposition
algorithm. In the first step, it places all attributes in a single relation.
In the second step it enters a loop over the functional dependencies and checks whether or not
any functional dependency violates BCNF. If an FD violates BCNF, a new relation is created
containing all the attributes that participate in that functional dependency, and at the same time
the dependent attributes are removed from the parent relation.
Following is the Java code implementing the decomposition algorithm to convert a relation into BCNF.
import java.util.*;
import java.io.*;
public class DecompositionIntoBCNF
InputStreamReader(System.in));
everywhere.");
String relationName=br.readLine();
the Relation?");
int n=Integer.parseInt(br.readLine());
line:");
LinkedList<String> attributeList=new
LinkedList<String>();
for(int i=0;i<n;i++)
attributeList.add(br.readLine());
int numOfFuncDep=Integer.parseInt(br.readLine());
LinkedList<String>[] fucDepLHSattr=new
LinkedList[numOfFuncDep];
LinkedList<String>[] fucDepRHSattr=new
LinkedList[numOfFuncDep];
for(int i=0;i<numOfFuncDep;i++)
fucDepLHSattr[i]=new LinkedList<String>();
functional dependency["+i+"]");
// functional dependency.
// functional dependency.
int temp1=Integer.parseInt(br.readLine());
LHS["+i+"]");
for(int j=0;j<temp1;j++)
fucDepLHSattr[i].add(br.readLine());
functional dependency["+i+"]");
// functional dependency.
// functional dependency.
int temp2=Integer.parseInt(br.readLine());
RHS["+i+"]");
for(int j=0;j<temp2;j++)
fucDepRHSattr[i].add(br.readLine());
LinkedList[] decomposition=new
LinkedList[numOfFuncDep];
output=attributeList;
int d=0;
// BCNF. while(!inBCNF(output,fucDepLHSattr[d],fucDepRHSattr[d]
,d))
decomposition[d]=new LinkedList<String>();
for(int j=0;j<fucDepLHSattr[d].size();j++)
Step 4 of 6
decomposition[d].add(fucDepLHSattr[d].get(j));
for(int j=0;j<fucDepRHSattr[d].size();j++)
decomposition[d].add(fucDepRHSattr[d].get(j));
// relation.
for(int j=0;j<fucDepRHSattr[d].size();j++)
output.remove(fucDepRHSattr[d].get(j));
d++;
// dependencies.
if(d>=numOfFuncDep)
break;
relations:");
for(int k=0;k<d;k++)
System.out.print(relationName+""+(k+1)+"(");
// relation.
for(int q=0;q<decomposition[k].size();q++)
hs.add(decomposition[k].get(q));
Iterator it=hs.iterator();
while(it.hasNext())
System.out.print(it.next());
System.out.print(")\n");
// in BCNF.
relation,LinkedList<String> list1,LinkedList<String>
list2,int index)
// and RHS.
for(int i=0;i<list2.size();i++)
list1.add(list2.get(i));
if(list1.size()< relation.size())
return false;
else
Collections.sort(list1);
Collections.sort(relation);
j++)
if(list1.get(j)==relation.get(j))
continue;
return true;
Step 5 of 6
Note: Everything is case sensitive, please enter values in the same case everywhere.
MyRelation
Step 6 of 6
2
Enter the attribute names of LHS[2]
MyRelation1(ABC)
MyRelation2(CDE)
MyRelation3(BDE)
Chapter 15, Problem 21E
Problem
Consider the relation REFRIG(Model#, Year, Price, Manuf_plant, Color), which is abbreviated as
REFRIG(M, Y, P, MP, C), and the following set F of functional dependencies: F = {M → MP, {M,
Y}→ P, MP → C}
a. Evaluate each of the following as a candidate key for REFRIG, giving reasons why it can or
cannot be a key: {M}, {M, Y}, {M, C}.
b. Based on the above key determination, state whether the relation REFRIG is in 3NF and in
BCNF, and provide proper reasons.
c. Consider the decomposition of REFRIG into D = {R1 (M, Y, P), R2(M, MP, C)}. Is this
decomposition lossless? Show why. (You may consult the test under Property NJB in Section
14.5.1.)
Step-by-step solution
Step 1 of 3
Consider the relation schema REFRIG and the functional dependencies F provided in the
question.
a.
It is provided that F = {M → MP, {M, Y} → P, MP → C}.
• {M}: {M}+ = {M, MP, C}, which does not include Y or P, so {M} cannot be a candidate key.
• {M, Y}: {M, Y}+ = {M, Y, MP, P, C}, which includes all the attributes of REFRIG, and no
proper subset of {M, Y} does so; hence {M, Y} is a candidate key.
• {M, C}: {M, C}+ = {M, C, MP}, which does not include Y or P, so {M, C} cannot be a key.
Step 2 of 3
b.
The candidate key is {M, Y}, so M and Y are the prime attributes. In M → MP the left-hand side
is a proper subset of the key (a partial dependency), and in MP → C the left-hand side is not a
superkey (a transitive dependency); MP and C are nonprime. Therefore REFRIG is not even in
2NF, and hence it is in neither 3NF nor BCNF.
Step 3 of 3
c.
R1 ∩ R2 = {M} and R2 − R1 = {MP, C}. Since M → MP and MP → C, the dependency
M → {MP, C} is in F+, so (R1 ∩ R2) → (R2 − R1) holds.
Hence, by the test under Property NJB, the decomposition D = {R1(M, Y, P), R2(M, MP, C)} is
lossless.
Chapter 15, Problem 22E
Problem
Specify all the inclusion dependencies for the relational schema in Figure 5.5.
Step-by-step solution
Step 1 of 1
Inclusion dependencies represent referential integrity (foreign key) constraints and
class/subclass relationships.
From Figure 5.5 in the textbook, we can specify, among others, the following inclusion
dependencies on the relational schema:
WORKS_ON.Pnumber < PROJECT.Pnumber
DEPT_LOCATIONS.Dnumber < DEPARTMENT.Dnumber
Chapter 15, Problem 23E
Problem
Prove that a functional dependency satisfies the formal definition of multivalued dependency.
Step-by-step solution
Step 1 of 2
A functional dependency X → Y states that any two tuples of a relation that agree on the
attributes X must also agree on the attributes Y.
A multivalued dependency X ↠ Y (where Z denotes the remaining attributes, Z = R − (X ∪ Y))
states that if two tuples agree on X, then the two tuples obtained by swapping their
Y-components must also exist in the relation.
Step 2 of 2
As with functional dependencies (FDs), inference rules for multivalued dependencies (MVDs)
have been developed. That every functional dependency is a multivalued dependency is stated
by the replication rule:
If X → Y, then X ↠ Y.
To verify this against the formal definition, assume all attributes are included in the universal
relation schema R, and let t1 and t2 be any two tuples with t1[X] = t2[X]. Because X → Y holds,
t1[Y] = t2[Y], so swapping the Y-components of t1 and t2 simply reproduces t1 and t2
themselves, and these tuples are certainly in the relation. The swapped tuples required by the
definition of X ↠ Y therefore always exist.
So, by the above argument, every functional dependency is also a multivalued dependency,
because it satisfies the formal definition of an MVD.
Chapter 15, Problem 24E
Problem
Consider the example of normalizing the LOTS relation in Sections 14.4 and 14.5. Determine
whether the decomposition of LOTS into {LOTS1AX, LOTS1AY, LOTS1B, LOTS2} has the
lossless join property by applying Algorithm 15.3 and also by using the test under property NJB
from Section 14.5.1.
Step-by-step solution
Step 1 of 8
(Figures: the schemas LOTS1AX, LOTS1AY, LOTS1B, and LOTS2, each shown with a sample
relation instance.)
There is a problem with this decomposition, but we wish to focus on one aspect at the moment.
Let an instance of the relation LOTS be given, together with its projections onto the
decomposed schemas.
Step 7 of 8
Step 7 of 8
All the information that was in the relation LOTS appears to still be available in LOTS1AX and
LOTS1AY, but this must be verified.
Suppose we construct the decomposition by removing the attribute Tax_rate, which causes the
2NF violation, from LOTS and placing it with County_name in a separate relation; similar steps
produce LOTS1B, LOTS1AX, and LOTS1AY.
Step 8 of 8
Now, to retrieve the information for a given Lot#, we would need to join the decomposed
relations, and the join must not contain spurious tuples.
By the test under property NJB, a binary decomposition D = {R1, R2} of R has the nonadditive
join property with respect to a set of functional dependencies F if and only if either
(R1 ∩ R2) → (R1 − R2) or (R1 ∩ R2) → (R2 − R1) is in F+.
At each binary step of this decomposition, the common attributes functionally determine one of
the difference sides, so each step passes the test; applying Algorithm 15.3 to
{LOTS1AX, LOTS1AY, LOTS1B, LOTS2} likewise produces a row of all 'a' symbols. Hence the
decomposition has the lossless join property.
Chapter 15, Problem 25E
Problem
Show how the MVDs Ename ↠ Pname and Ename ↠ Dname in Figure 14.15(a) may arise during
normalization into 1NF of a relation, where the attributes Pname and Dname are multivalued.
Step-by-step solution
Step 1 of 2
EMP
Now, we need to show how the multivalued attributes Pname and Dname give rise to the MVDs.
Consider the relation state of EMP:
Ename  Pname  Dname
Smith  X      John
Smith  Y      Anna
Smith  X      Anna
Smith  Y      John
Step 2 of 2
The EMP relation above shows an employee whose name is Ename, who works on the project
named Pname, and who has a dependent whose name is Dname.
An employee may work on several projects and may have several dependents. To keep the
relation state consistent, we must have a separate tuple to represent every combination of the
employee's projects and dependents; this is precisely the pair of MVDs Ename ↠ Pname and
Ename ↠ Dname. Based on this, the EMP relation is decomposed into the two 4NF relations
EMP_PROJECTS and EMP_DEPENDENTS:
EMP_PROJECTS        EMP_DEPENDENTS
Ename  Pname        Ename  Dname
Smith  X            Smith  John
Smith  Y            Smith  Anna
Chapter 15, Problem 26E
Problem
Apply Algorithm 15.2(a) to the relation in Exercise to determine a key for R. Create a minimal set
of dependencies G that is equivalent to F, and apply the synthesis algorithm (Algorithm 15.4) to
decompose R into 3NF relations.
Exercise
Step-by-step solution
Step 1 of 4
Refer to the Exercise 14.24 for the set of functional dependencies F and relation R. The
functional dependencies in F are as follows:
• The combination of all the attributes is always a candidate key for that relation. So
ABCDEFGHIJ will be a candidate key for the relation R.
Step 2 of 4
If functional dependencies of a relation are not in canonical form then first convert them into
canonical form using decomposition rule of inference.
Refer to the Exercise 14.24 for the set of functional dependencies F and convert them into
canonical form as follows:
• Test for minimal set of LHS (only test functional dependencies with ≥2 attributes)
1. Testing for
2. Testing for
Since so is necessary.
1. Testing for
Since so is necessary.
2. Testing for
Since so is necessary.
3.
Step 3 of 4
Testing for
Since so is necessary.
4. Testing for
Since so is necessary.
5. Testing for
Since so is necessary.
6. Testing for
Since so is necessary.
7. Testing for
Since so is necessary.
8. Testing for
Since so is necessary.
Step 4 of 4
Following steps must be used to decompose R into 3NF relations, using synthesis algorithm:
There are five functional dependencies in the relation R. Create five relations
, all having the corresponding attributes as follows:
• AB is the candidate key in relation R. Since attributes A and B already exist in the created
relations, there is no need to create another relation for the key attributes.
• If another relation is created containing the candidate key AB, then it will result in
redundancy, and step 4 can be used for removing the redundant relation.
Therefore, the final 3NF relations obtained after decomposing R are as follows:
and
Chapter 15, Problem 27E
Problem
Exercise 1
Apply Algorithm 15.2(a) to the relation in Exercise to determine a key for R. Create a minimal set
of dependencies G that is equivalent to F, and apply the synthesis algorithm (Algorithm 15.4) to
decompose R into 3NF relations.
Exercise
Exercise 2
Repeat Exercise for the following different set of functional dependencies G = {{A, B} → {C},
{B, D} → {E, F}, {A, D} → {G, H}, {A} → {I}, {H} → {J}}.
Exercise
Step-by-step solution
Step 1 of 5
Refer to the Exercise 14.25 for the set of functional dependencies F and relation R. The
functional dependencies in F are as follows:
• The combination of all attributes is always a candidate key for that relation. So ABCDEFGHIJ
will be candidate key for the relation R. Reduce unnecessary attributes from the key. Since C can
be determined by so remove it from the key.
• Since attributes B and D are determining attributes E and F so both should be removed from
the candidate key.
• Since attributes A and D are determining attributes G and H so both should be removed from
the candidate key.
• Since attribute A is determining attributes I so it should be removed from the candidate key.
• Since attribute H is determining attributes J so it should be removed from the candidate key.
Therefore, the attribute set ABD is a candidate key for the relation R.
Step 2 of 5
If functional dependencies of a relation are not in canonical form then first convert them into
canonical form using decomposition rule of inference.
Refer to the Exercise 14.25 for the set of functional dependencies F and convert them into
canonical form as follows:
If there exist any extraneous functional dependency, remove it.
• Test for minimal set of LHS (only test functional dependencies with ≥2 attributes)
1. Testing for
Since so is necessary.
2. Testing for
Since so is necessary.
3. Testing for
Since so is necessary.
4. Testing for
Since so is necessary.
5. Testing for
Since so is necessary.
6. Testing for
Since so is necessary.
Step 3 of 5
1. Testing for
Since so is necessary.
2. Testing for
Since so is necessary.
3. Testing for
Since so is necessary.
4. Testing for
Since so is necessary.
5. Testing for
Since so is necessary.
6. Testing for
Since so is necessary.
7. Testing for
Since so is necessary.
After applying composition rule of inference to above canonical functional dependencies, the
minimal functional dependencies G (where ) obtained are as follows:
Step 4 of 5
Following steps must be followed to decompose the relation R into 3NF relation using synthesis
algorithm. Refer Exercise 14.25 for the functional dependencies.
There are five functional dependencies in the relation R, create five relations
, all having corresponding attributes.
Step 5 of 5
ABD is the candidate keys in relation R. Create a new relation containing attributes A, B and
D. Therefore, all six relations with their corresponding attributes are as follow:
Remove all relations which are redundant. A relation R is redundant if R is a projection of another
relation S in the same schema . Since there is no redundant relation in the schema, so
there is no need to remove any relation.
Therefore, the final 3NF relations obtained after decomposing R are as follows:
and
Chapter 15, Problem 29E
Problem
Apply Algorithm 15.2(a) to the relations in Exercises 1 and 2 to determine a key for R. Apply the
synthesis algorithm (Algorithm 15.4) to decompose R into 3NF relations and the decomposition
algorithm (Algorithm 15.5) to decompose R into BCNF relations.
Exercise 1
AB → C, CD → E, DE→ B
Exercise 2
Consider the relation R, which has attributes that hold schedules of courses and sections at a
university; R = {Course_no, Sec_no, Offering_dept, Credit_hours, Course_level, lnstructor_ssn,
Semester, Year, Days_hours, Room_no, No_of_students}. Suppose that the following functional
dependencies hold on R:
Try to determine which sets of attributes form keys of R. How would you normalize this relation?
Step-by-step solution
Step 1 of 6
Refer to the Exercise 14.27 for the set of functional dependencies and relation R. The functional
dependencies are as follows:
Each functional dependency has only one attribute on its right-hand side.
• The combination of all attributes is always a candidate key for that relation. So ABCDE will be
candidate key for the relation R. Since all functional dependencies are in canonical form, there
is no need to convert them into canonical form.
Step 2 of 6
Refer to Exercise 14.27 for the set of functional dependencies and relation R. The following steps
must be used to decompose R into 3NF relations using the synthesis algorithm:
There are three functional dependencies, and the relations formed from them, with their corresponding attributes, are as follows:
Step 3: Creating relation for key attributes
• ABD is the candidate key of relation R. Since the attributes A, B, and D already exist in the above
relations, there is no need to create another relation for the key attributes.
• If another relation containing the candidate key ABD were created, it would be
redundant, and step 4 of the algorithm would remove it.
Therefore, the final 3NF relations obtained after decomposing R are as follows:
and .
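The grouping step described above can be sketched as follows (a sketch of Algorithm 15.4, assuming the FDs already form a minimal cover; names are my own):

```python
# A sketch of the grouping step of the 3NF synthesis algorithm (Algorithm 15.4),
# assuming the FDs already form a minimal cover; names are my own.
from collections import defaultdict

def synthesize_3nf(fds, key):
    groups = defaultdict(set)            # one relation per distinct LHS
    for lhs, rhs in fds:
        groups[lhs] |= set(rhs)
    relations = [frozenset(lhs) | rhs for lhs, rhs in groups.items()]
    # add a relation for a candidate key if no relation contains one
    if not any(set(key) <= r for r in relations):
        relations.append(frozenset(key))
    # drop relations that are projections (subsets) of another relation
    return [r for r in relations if not any(r < s for s in relations)]

fds = [(frozenset("AB"), frozenset("C")),
       (frozenset("CD"), frozenset("E")),
       (frozenset("DE"), frozenset("B"))]
for r in synthesize_3nf(fds, "ABD"):
    print(''.join(sorted(r)))   # ABC, CDE, BDE, ABD
```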
Step 3 of 6
Refer to Exercise 14.27 for the set of functional dependencies and relation R. The following steps
must be used to decompose R into BCNF relations using the decomposition algorithm:
S = {A, B, C, D, E}
Step 2: Check whether any functional dependency violates BCNF. If so, decompose
the relation.
Therefore, the final BCNF relations obtained after decomposing R are as follows:
and .
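Step 2 can be sketched as follows (a sketch of Algorithm 15.5; the closure helper and all names are my own). For the FDs of Exercise 1 (AB → C, CD → E, DE → B) it splits R first on AB → C and then on DE → B, giving one possible BCNF decomposition; note that CD → E is not preserved, which is the usual price of BCNF:

```python
# A sketch of BCNF decomposition (Algorithm 15.5): while some relation Q has an
# FD X -> Y where X is not a superkey of Q, split Q into (Q - Y') and (X ∪ Y'),
# with Y' = (X+ ∩ Q) - X. Helper names are my own.
def closure(attrs, fds):
    result, changed = set(attrs), True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_decompose(attrs, fds):
    result, done = [frozenset(attrs)], False
    while not done:
        done = True
        for q in list(result):
            for lhs, _ in fds:
                x = set(lhs)
                y = (closure(x, fds) & q) - x        # what X determines inside Q
                if x < q and y and not (x | y) >= q:  # X is not a superkey of Q
                    result.remove(q)
                    result += [frozenset(q - y), frozenset(x | y)]
                    done = False
                    break
            if not done:
                break
    return result

fds = [(frozenset("AB"), frozenset("C")),
       (frozenset("CD"), frozenset("E")),
       (frozenset("DE"), frozenset("B"))]
print(sorted(''.join(sorted(r)) for r in bcnf_decompose("ABCDE", fds)))
# ['ABC', 'ADE', 'BDE']
```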
Step 4 of 6
Refer to Exercise 14.28 for the set of functional dependencies and relation R. Since the
functional dependencies are not in canonical form, convert them into canonical functional
dependencies as follows:
The entire attribute set of the relation R is a superkey. Since Days_hours, Room_no,
No_of_students, and Instructor_ssn can be determined through the functional dependencies FD4, FD5,
FD6, and FD7 respectively, remove them from the candidate key. The remaining attributes in the
candidate key are as follows:
Since Offering_dept, Credit_hours, and Course_level can be determined through FD1, FD2, and FD3
respectively, remove them as well. The remaining attributes in the candidate key are as
follows:
Step 5 of 6
Refer to Exercise 14.28 for the set of functional dependencies and relation R. The following steps
must be used to decompose R into 3NF relations using the synthesis algorithm:
Since the functional dependencies are not in canonical form, convert them into canonical functional
dependencies as follows:
Since Instructor_ssn, Course_no, and Sec_no have already been determined, they are
extraneous attributes. The minimal cover for the relation R is as follows:
There are two functional dependencies, and their corresponding relations are as follows:
• The relation R has the candidate key (Course_no, Sec_no, Semester, Year). Since these
attributes already exist in the above relations, there is no need to create another relation for
the key attributes.
• If another relation containing the candidate key (Course_no, Sec_no, Semester, Year) were
created, it would be redundant, and step 4 of the algorithm would remove it.
Therefore, the final 3NF relations obtained after decomposing R are as follows:
and
Step 6 of 6
Refer to Exercise 14.28 for the set of functional dependencies and relation R. The following steps
must be used to decompose R into BCNF relations using the decomposition algorithm:
Step 2: Check whether any functional dependency violates BCNF. If so, decompose
the relation.
Therefore, the final BCNF relations obtained after decomposing R are as follows:
and
Chapter 15, Problem 31E
Problem
Consider the following decompositions for the relation schema R of Exercise. Determine whether
each decomposition has (1) the dependency preservation property, and (2) the lossless join
property, with respect to F. Also determine which normal form each relation in the decomposition
is in.
a. D1 = {R1, R2, R3, R4, R5}; R1 = {A, B, C}, R2 = {A, D, E}, R3 = {B, F}, R4 = {F, G, H}, R5 =
{D, I, J}
c. D3 = {R1, R2, R3, R4, R5}; R1= {A, B, C, D}, R2= {D, E}, R3 = {B, F}, R4 = {F, G, H}, R5= {D, I,
J}
Exercise
Step-by-step solution
Step 1 of 10
Step 2 of 10
a.
Step 3 of 10
To determine whether the decomposition satisfies the nonadditive join property, apply
Algorithm 15.3 (testing for the nonadditive join property) given in the textbook.
After the chase, the first row consists of "a" symbols in all the cells. Hence, the decomposition satisfies the
nonadditive join property.
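Algorithm 15.3 can be sketched as a chase over a symbol matrix: fill one row per relation with 'a' symbols for its own attributes, repeatedly equate cells forced by the FDs, and look for an all-'a' row. Since the FDs of the underlying exercise are not reproduced in this text, the demonstration below uses a small stand-alone example (R = ABC with FD B → C); the routine itself is generic:

```python
# A sketch of the matrix-based nonadditive (lossless) join test, Algorithm 15.3.
# Attributes are single letters; an FD is a (lhs, rhs) pair of attribute strings.
from collections import defaultdict

def lossless_join(attrs, decomposition, fds):
    col = {a: j for j, a in enumerate(attrs)}
    # Row i holds ('a', j) if attribute j belongs to R_i, else a distinct ('b', i, j).
    matrix = [[('a', j) if a in ri else ('b', i, j) for j, a in enumerate(attrs)]
              for i, ri in enumerate(decomposition)]
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            groups = defaultdict(list)        # rows agreeing on all LHS columns
            for row in matrix:
                groups[tuple(row[col[a]] for a in lhs)].append(row)
            for rows in groups.values():
                for a in rhs:
                    j = col[a]
                    best = min(r[j] for r in rows)   # 'a' sorts before 'b'
                    for r in rows:
                        if r[j] != best:
                            r[j] = best
                            changed = True
    return any(all(c[0] == 'a' for c in row) for row in matrix)

fds = [("B", "C")]                                  # B -> C
print(lossless_join("ABC", ["AB", "BC"], fds))      # True  (lossless)
print(lossless_join("ABC", ["AB", "AC"], fds))      # False (lossy)
```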
Step 4 of 10
In relation R1, the primary key is also a superkey, so R1 satisfies Boyce-Codd normal form.
In relation R2, the primary key is also a superkey, so R2 satisfies Boyce-Codd normal form.
In relation R3, the primary key is also a superkey, so R3 satisfies Boyce-Codd normal form.
In relation R4, the primary key is also a superkey, so R4 satisfies Boyce-Codd normal form.
In relation R5, the primary key is also a superkey, so R5 satisfies Boyce-Codd normal form.
Step 5 of 10
b.
Step 6 of 10
To determine whether the decomposition satisfies the nonadditive join property, apply
Algorithm 15.3 (testing for the nonadditive join property) given in the textbook.
After the chase, the first row consists of "a" symbols in all the cells. Hence, the decomposition satisfies the
nonadditive join property.
Step 7 of 10
In relation R1, there is a partial dependency, so R1 is only in first normal form: the attribute
A is a partial key that determines the attributes D and E.
In relation R2, there is a transitive dependency, so R2 is only in second normal form: the
attribute F is a non-key attribute that functionally determines the attributes G and H.
In relation R3, the primary key is also a superkey, so R3 satisfies Boyce-Codd normal form.
Step 8 of 10
c.
Hence, the decomposition does not satisfy the dependency preservation property.
Step 9 of 10
To determine whether the decomposition satisfies the nonadditive join property, apply
Algorithm 15.3 (testing for the nonadditive join property) given in the textbook.
After the chase, no row in the matrix consists of "a" symbols in all the cells. Hence, the
decomposition does not satisfy the nonadditive join property.
Step 10 of 10
The normal form of relation R1 cannot be fully determined, as R1 satisfies only one functional
dependency; nothing can be said about the attribute D of relation R1.
The normal form of relation R2 cannot be determined, as it does not satisfy any functional
dependency.
In relation R3, the primary key is also a superkey, so R3 satisfies Boyce-Codd normal form.
In relation R4, the primary key is also a superkey, so R4 satisfies Boyce-Codd normal form.
In relation R5, the primary key is also a superkey, so R5 satisfies Boyce-Codd normal form.
Chapter 16, Problem 1RQ
Problem
Step-by-step solution
Step 1 of 1
• Primary storage devices can be directly accessed by the CPU, and they provide fast access to data.
• Secondary storage devices cannot be directly accessed by the CPU, and they provide slower access to data.
Chapter 16, Problem 2RQ
Problem
Why are disks, not tapes, used to store online database files?
Step-by-step solution
Step 1 of 1
Disks are used to store online database files. A disk is a secondary storage device that is
randomly addressable: any disk block can be read or written directly, and data is stored and
retrieved in units called disk blocks. A tape, in contrast, is a sequential-access device, so
reaching a given block requires reading past all the blocks before it, which makes tapes
unsuitable for online access.
Chapter 16, Problem 3RQ
Problem
Define the following terms: disk, disk pack, track, block, cylinder, sector, interblock gap, and
read/write head.
Step-by-step solution
Step 1 of 3
Disk: The disk is a secondary storage device that is used to store large amounts of data.
The disk stores data in digital form, that is, as 0s and 1s. The most basic unit of data that can
be stored on the disk is the bit.
Disk pack: A disk pack contains several disks stacked in layers to increase the storage capacity, i.e.,
it includes many disks.
Step 2 of 3
Track: On a disk surface, information is stored in concentric circles of various diameters. Each
such circle is called a track.
Block: Each track of the disk is divided into equal-sized slices. One or more such slices are
grouped together to form a disk block; a block may consist of a single slice (sector). The size of
the block is fixed at the time of disk formatting.
Step 3 of 3
Cylinder: In a disk pack, the tracks with the same diameter on all surfaces form a cylinder.
Sector: Each track of the disk is divided into small slices. Each slice is called a sector.
Interblock gap: An interblock gap separates consecutive disk blocks. No data can be stored in an
interblock gap.
Read/write head: The read/write head is used to read a block from, or write a block to, the disk.
Chapter 16, Problem 4RQ
Problem
Step-by-step solution
Step 1 of 1
During disk formatting (initialization), the tracks are divided into equal-sized blocks; the block
size is set by the operating system.
Initialization is
the process of defining the tracks and sectors so that data and programs can be stored and
retrieved.
The block size is fixed during initialization of the disk and cannot be changed dynamically afterward.
Chapter 16, Problem 5RQ
Problem
Discuss the mechanism used to read data from or write data to the disk.
Step-by-step solution
Step 1 of 1
The disk drive begins to rotate the disk whenever a read or write request is initiated. Once the
read/write head is positioned on the right track and the block specified in the block address
moves under the read/write head, the electronic component of the read/write head is activated to
transfer the data.
The following procedure is followed when data is read from or written to the disk:
(4) The data is read from the hard disk and transferred to a buffer in RAM.
Chapter 16, Problem 6RQ
Problem
Step-by-step solution
Step 1 of 1
Address of a block:-
It consists of a combination of the cylinder number, the track number (that is, the surface number
within the cylinder on which the track is located), and the block number (within the track), and it
is supplied to the disk I/O hardware.
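With a hypothetical geometry (the numbers below are my own, not from the text), the mapping from a flat block number to such an address can be sketched as:

```python
# Hypothetical disk geometry, purely to illustrate how a flat block number
# decomposes into (cylinder, surface/track, block-within-track).
CYLINDERS, SURFACES, BLOCKS_PER_TRACK = 400, 12, 50

def block_address(n):
    cylinder, rest = divmod(n, SURFACES * BLOCKS_PER_TRACK)
    surface, block = divmod(rest, BLOCKS_PER_TRACK)
    return cylinder, surface, block

print(block_address(12345))  # (20, 6, 45)
```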
Chapter 16, Problem 7RQ
Problem
Why is accessing a disk block expensive? Discuss the time components involved in accessing a
disk block.
Step-by-step solution
Step 1 of 4
Arranging data in order and storing it in disk blocks is known as blocking. Data is transferred
between the disk and main memory in units of blocks.
Accessing data in main memory is much less expensive than accessing data on the disk.
The cost of a disk access is due to the following time components:
• Seek time.
• Rotational latency.
• Block transfer time.
Step 2 of 4
Accessing data on the disk is expensive because of the following time components:
• Seek time:
o The disk contains a set of tracks, each formed of fixed-size sectors, and there is one
read/write head per surface.
o Each sector can typically store up to 512 bytes of data.
o To read a record from the disk, a mechanical arm must move the read/write head to the track
that holds the requested record.
o Seek time is the total time taken to position the arm so that the head is on the correct
track.
o Seek time is usually the largest of the components. Therefore, it is one of the major reasons
why accessing a disk block is expensive.
Step 3 of 4
• Rotational latency:
o Rotational latency is the time that elapses between the request for the data and the moment
the disk has rotated so that the sector holding the data is under the read/write head.
o It is, in effect, a waiting time: the larger it is, the more expensive
accessing a disk block becomes.
Step 4 of 4
• Block transfer time:
o Transferring the data of a block between the disk and main memory also takes some time.
o This is known as the block transfer time. The larger the amount of data to be transferred,
the larger this component, which further adds to the cost of accessing a
disk block.
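The relative sizes of the three components can be seen with some illustrative drive parameters (the figures are my own assumptions, not from the text):

```python
# Illustrative figures: average seek time 8 ms, a 7200-rpm spindle,
# a 100 MB/s transfer rate, and 8 KB blocks.
seek_ms = 8.0
rotational_latency_ms = 0.5 * (60_000 / 7200)   # half a revolution on average
transfer_ms = 8 / (100 * 1024) * 1000           # 8 KB at 102,400 KB/s

total = seek_ms + rotational_latency_ms + transfer_ms
print(round(total, 2))  # 12.24 ms: seek dominates, transfer is negligible
```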
Chapter 16, Problem 8RQ
Problem
Step-by-step solution
Step 1 of 1
Double buffering is used to read a continuous stream of blocks from disk into memory: while the
CPU processes the data in one buffer, the next block is read into the other.
When the blocks are stored contiguously, this eliminates the seek time and rotational delay for
all but the first block transfer. Moreover, data is kept ready for processing in the program's
buffers, which reduces the waiting time. In the formulas,
n = the number of blocks and
P = the processing time per block.
Chapter 16, Problem 9RQ
Problem
What are the reasons for having variable-length records? What types of separator characters are
needed for each?
Step-by-step solution
Step 1 of 1
Variable-length records (reasons): A file is a sequence of records. Often, all records in a file
are of the same record type and the same size; if different records in the file have different
sizes, the file is said to be made up of variable-length records. This can happen for the
following reasons:
* The file records are of the same record type, but one or more of the fields are of varying size.
* The file records are of the same record type, but one or more of the fields are optional;
that is, some records have a value for the field and others do not.
* The file contains records of different record types, and hence of varying size. This occurs
if related records of different types are placed together in the same disk blocks.
With variable-length fields, each record has a value for each field, but the exact length of some
field values is not known in advance. To determine the bytes that each field occupies within a
record, separator characters such as
? or % or $
can be used: one character separates the field name from the field value, and another separates
one field from the next.
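A toy record codec built on separator characters can illustrate the idea (the '%' and '$' choices echo the characters mentioned above; the record layout itself is my own invention):

```python
# A toy encoding for variable-length records: '%' separates a field name from
# its value, '$' separates one field from the next.
def encode(record):
    return '$'.join(f'{name}%{value}' for name, value in record.items())

def decode(text):
    return dict(field.split('%', 1) for field in text.split('$'))

rec = {'Name': 'Smith', 'Ssn': '123456789', 'Dept': 'Research'}
s = encode(rec)
print(s)            # Name%Smith$Ssn%123456789$Dept%Research
assert decode(s) == rec
```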
Chapter 16, Problem 10RQ
Problem
Step-by-step solution
Step 1 of 1
There are several techniques for allocating the blocks of a file on disk:
Contiguous allocation
Linked allocation
Clusters
Indexed allocation
Contiguous allocation:-
File blocks are allocated to consecutive disk blocks. This makes reading the whole file very fast
using double buffering.
Linked allocation:-
Each file block contains a pointer to the next file block. It is easy to expand the file but makes it
slow to read the whole file.
Clusters:-
A combination of the two: clusters of consecutive disk blocks are allocated and the clusters are
linked together. Clusters are sometimes called file segments or extents.
Indexed allocation:-
One or more index blocks contain pointers to the actual file blocks.
Chapter 16, Problem 11RQ
Problem
Step-by-step solution
Step 1 of 1
File organization:-
- It describes how the physical records in a file are arranged on the disk.
- A file organization refers to the organization of the data of a file into records, blocks, and
access structures.
- In this, records and blocks are placed on the storage medium and interlinked.
Access methods:-
- An access method is a group of operations that can be applied to a file. Some access methods
can be applied only to files organized in a certain way.
Chapter 16, Problem 12RQ
Problem
Step-by-step solution
Step 1 of 1
Static file:- a file organization in which update operations are rarely performed. A dynamic
file, in contrast, is one in which records are updated frequently.
Chapter 16, Problem 13RQ
Problem
What are the typical record-at-a-time operations for accessing a file? Which of these depend on
the current file record?
Step-by-step solution
Step 1 of 1
2.) Find (locate): Searches for the first record that satisfies a search condition. Transfer the block
containing that record into memory buffer. The file pointer points to the record in buffer and it
becomes the current record.
3.) Read (Get): Copies current record from the buffer to the program variable in the user
program. This command may also advance the current record pointer to the next record in the
file, which may necessitate reading the next file block from disk.
4.) FindNext: Searches for next record in file that satisfies the search condition. Transfer the
block containing that record into main memory buffer. The record is located in the buffer and
becomes current record.
5.) Delete: Delete current record and updates file on disk to reflect the deletion.
6.) Modify: Modifies some field values for current record and eventually update file on disk to
reflect the modification.
7.) Insert: Insert new record in the file by locating the block where record is to be inserted,
transferring the block into main memory buffer, writing the record into the buffer, and eventually
writing buffer to disk to reflect insertion.
The operations that depend on the current file record are:
1.) Read
2.) FindNext
3.) Delete
4.) Modify
Chapter 16, Problem 14RQ
Problem
Step-by-step solution
Step 1 of 1
(1) One technique for record deletion is to remove the record and leave its space in the disk
block unused. When a large number of records are deleted this way, the result is wasted storage
space.
(2) Another technique for record deletion is the deletion marker (an extra byte or
bit stored with each record): a record is deleted by setting the marker.
Both techniques require periodic reorganization of the file. During this reorganization, the file
blocks are accessed consecutively and records are packed by removing deleted records.
For an unordered file, either spanned or unspanned organization can be used, with either
fixed-length or variable-length records.
Chapter 16, Problem 15RQ
Problem
Discuss the advantages and disadvantages of using (a) an unordered file, (b) an ordered file,
and (c) a static hash file with buckets and chaining. Which operations can be performed
efficiently on each of these organizations, and which operations are expensive?
Step-by-step solution
Step 1 of 4
a) An unordered file:
It can be defined as a collection of records that are placed in the file in the same order as they
are inserted.
Advantages:
• Insertion is fast and simple: new records are added at the end of the last page of the file.
Disadvantages:
• Searching for a record requires a linear scan of the file, which is expensive.
Step 2 of 4
b) An ordered file:
An ordered file stores its records in sorted order of an ordering field, and the file must be
kept in that order as records are inserted.
Advantages:
• Reading the records in order of the ordering field is very efficient, as the records are stored in that order.
Disadvantages:
• Rearranging the file is needed when inserting, modifying, or deleting records.
Step 3 of 4
c) A static hash file with buckets and chaining:
Advantages:
• Speed is the biggest advantage; hashing is efficient for retrieval on the hash key even when a
huge volume of data is present.
Disadvantages:
Step 4 of 4
The hashing technique is the most efficient for retrieval on the hash key, but it can be an
expensive organization to maintain because of its more sophisticated structure.
• Extendible hashing is a type of dynamic hashing, which splits and coalesces buckets as the
database changes in size, because the hash function is adjusted on a dynamic
basis.
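A static hash file with buckets and chaining, as in option (c), can be sketched in a few lines (the bucket capacity, hash function, and record format are arbitrary choices of mine):

```python
# A minimal static hash file: M primary buckets of fixed capacity, with an
# overflow chain per bucket.
class HashFile:
    def __init__(self, m=4, bucket_capacity=2):
        self.m = m
        self.cap = bucket_capacity
        self.buckets = [[] for _ in range(m)]    # primary buckets
        self.overflow = [[] for _ in range(m)]   # chained overflow area

    def insert(self, key, rec):
        b = hash(key) % self.m
        target = self.buckets[b] if len(self.buckets[b]) < self.cap else self.overflow[b]
        target.append((key, rec))

    def find(self, key):
        b = hash(key) % self.m
        for k, rec in self.buckets[b] + self.overflow[b]:
            if k == key:
                return rec
        return None

f = HashFile()
for i in range(10):
    f.insert(i, f'record-{i}')
print(f.find(7))  # record-7
```

Equality search touches only one bucket and its chain, which is why retrieval on the hash key is fast; a range search would have to scan every bucket, which is why such operations are expensive.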
Chapter 16, Problem 16RQ
Problem
Discuss the techniques for allowing a hash file to expand and shrink dynamically. What are the
advantages and disadvantages of each?
Step-by-step solution
Step 1 of 5
Several hashing techniques allow the hash file to grow and shrink dynamically with the number of
file records.
These techniques include dynamic hashing, extendible hashing, and linear hashing.
In static hashing, the primary pages are fixed and allocated sequentially; pages are never de-
allocated, and overflow pages are used when needed.
In dynamic hashing, the directory is a binary tree; the directories can be stored on disk, and
they expand or shrink dynamically. Directory entries point to the disk blocks that contain the
stored records.
Dynamic hashing is good for a database that grows and shrinks in size, because the hash
structure adapts dynamically.
Step 2 of 5
Extendible hashing is one form of dynamic hashing. It uses a hash function that generates values
over a large range, typically b-bit integers, and only a prefix of the hash value is used to
index into a table of bucket addresses.
Step 3 of 5
The number of buckets also changes dynamically because of the coalescing and splitting of
buckets.
Static hashing uses a fixed address space and performs its computation on the internal binary
representation of the search key.
With overflow handling, bucket overflow in static hashing is reduced, but it cannot be eliminated.
Disadvantages:-
Databases grow over time, and if the initial number of buckets is too small, performance
will degrade.
Step 4 of 5
Extendible hashing:-
Advantages:-
It uses a directory of bucket addresses, so the file can expand and shrink without
reorganizing the whole file.
Disadvantages:-
Step 5 of 5
Linear hashing:-
Advantages:-
It allows a hash file to expand and shrink its number of buckets dynamically without a
directory file, and it handles the problem of long overflow chains without using a directory.
Disadvantages:-
Overflow chains can still appear temporarily, because the bucket that overflows is not
necessarily the next bucket to be split.
Chapter 16, Problem 17RQ
Problem
What is the difference between the directories of extendible and dynamic hashing?
Step-by-step solution
Step 1 of 1
The differences between directories of extendible and dynamic hashing are as follows:
Chapter 16, Problem 18RQ
Problem
What are mixed files used for? What are other types of primary file organizations?
Step-by-step solution
Step 1 of 3
A mixed file refers to a file that contains records of different record types.
• An additional field, known as the record type field, is added as the first field of each
record to indicate the record type to which it belongs.
Step 2 of 3
Step 3 of 3
Other types of primary file organizations include:
• Heap (unordered) file organization
• Sorted (sequential) file organization
• Hashed file organization
• B-tree file organization
Chapter 16, Problem 19RQ
Problem
Step-by-step solution
Step 1 of 2
In computer systems, the collection of data can be stored physically in the storage
medium.
• From the DBMS (DataBase Management System), the data can be processed, retrieved, and
updated whenever it is needed.
• The storage medium structure of a computer follows a storage hierarchy for managing
collections of data.
There are two main divisions in the storage hierarchy of a computer system:
• Primary storage
• Secondary storage
Primary storage:
• This storage medium can be directly accessed by the CPU (Central
Processing Unit); it holds data only temporarily.
• Main memory provides fast access to data, with even faster cache memories above it, but it has
less storage capacity and a higher cost per byte.
• Please note that in case of a power failure or system crash, the contents of the main
memory are erased automatically.
Secondary storage:
• This storage medium stores data permanently, in the form of disks, tapes,
CD-ROMs, or DVDs.
• Secondary storage is also called secondary memory; magnetic disks are commonly packaged as
Hard Disk Drives (HDDs).
• In today's world, data can also be stored offline on removable media; this is called
tertiary storage.
• The data cannot be accessed directly in this type of storage: it must first be copied to
primary storage before the CPU can process it.
Step 2 of 2
In computer systems, processing works out of RAM, which consists of a series of
memory chips.
• Also, the processor has the support of cache memory to retrieve information faster, which
is an added advantage.
Disk technologies provide the space needed to accumulate large volumes of data.
• Data on a disk cannot be accessed directly: it must first be copied to
primary storage before the CPU can process it.
• Compared to the processor and main memory, disk access consumes much more time.
Hence, the processor and main memory provide far better performance than the disk technologies.
Chapter 16, Problem 20RQ
Problem
What are the main goals of the RAID technology? How does it achieve them?
Step-by-step solution
Step 1 of 2
The main goals of RAID (redundant array of independent disks) technology are to increase the
reliability of the database by introducing redundancy and to improve performance through
parallel access to multiple disks.
Disk mirroring:-
In the case of mirrored data, a data item can be read from either disk, but for writing, the data
item must be written on both disks. When data is read, it can be retrieved from the disk with the
shorter queuing, seek, and rotational delays.
If one disk fails, the other disk is still there to continue providing the data. This improves
reliability.
Step 2 of 2
The mean time to failure of a mirrored disk depends on the mean time to failure of the individual
disks, as well as on the mean time to repair, which is the time it takes (on average) to replace a
failed disk and to restore the data on it. Suppose that the failures of the two disks are
independent; that is, there is no connection between the failure of one disk and the failure of the
other.
Suppose the system has 100 disks in an array, the mean time to repair is 24 hours, and the MTTF
is 200,000 hours for each disk.
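Using the standard approximation for a mirrored pair, MTTF² / (2 · MTTR), and treating the 100-disk array as 50 independent mirrored pairs (an assumption about the configuration), these numbers give:

```python
# Back-of-the-envelope mean time to data loss (MTTDL) for a mirrored array:
# a pair loses data only if the second disk fails within the repair window.
mttf_hours = 200_000
mttr_hours = 24
pairs = 50                                   # 100 disks = 50 mirrored pairs

mttdl_pair = mttf_hours ** 2 / (2 * mttr_hours)
mttdl_array = mttdl_pair / pairs             # any pair failing loses data
print(round(mttdl_array / 8760))  # ≈ 1903 years
```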
Chapter 16, Problem 21RQ
Problem
How does disk mirroring help improve reliability? Give a quantitative example.
Step-by-step solution
Step 1 of 1
The technique of data striping is used to achieve higher transfer rates and to improve disk
performance in RAID. It has two levels: (i) bit-level data striping and (ii) block-level data striping.
Chapter 16, Problem 22RQ
Problem
Step-by-step solution
Step 1 of 2
Raid Levels:-
In the RAID organization, one solution that presents itself because of the increased capacity and
reduced cost of hard drives is to build in redundancy. RAID can be implemented in hardware or
software, and it is a set of physical disk drives viewed by the operating system as a single
logical drive.
Levels:-
The RAID levels depend on the data redundancy introduced and the correctness-checking technique
used in the scheme.
Level 0:-
Uses data striping across the disks but introduces no redundant data.
Level 1:-
Uses mirrored disks.
Level 2:-
Uses memory-style correctness checking based on error-correcting (Hamming) codes.
Step 2 of 2
Level 3:-
Level 3 is similar to level 2, but it uses a single disk for parity. Level 3 is sometimes called
bit-interleaved parity. The disk controller can detect whether a sector has been read correctly,
so a single parity bit can be used for error correction as well as detection.
Level 4:-
Uses block-level data striping with a parity disk, as in level 3, but stripes blocks rather than bits.
Level 5:-
Uses block-level data striping, but the data and parity are distributed across all the disks.
Level 6:-
Uses the P + Q redundancy scheme, employing Reed-Solomon codes, to recover
from up to two simultaneous disk failures.
Chapter 16, Problem 23RQ
Problem
Step-by-step solution
Step 1 of 1
Different RAID (Redundant Array of Inexpensive Disks) organizations were defined based on
different combinations of two factors: the granularity of data interleaving (striping) and the
pattern used for redundancy.
There are various levels of RAID, from 0 to 6. The popularly used RAID organizations are level 0
with striping, level 1 with mirroring, and level 5 with an extra drive for parity.
RAID level 0
• It has no redundant data, and hence it provides the best write performance, as updates do not
have to be duplicated.
RAID level 1
• Performance improvement is possible by scheduling a read request to the disk with the shorter
expected seek and rotational delay.
RAID level 5
• Data and parity information are distributed across all the disks. If any one disk fails, the
lost data is reconstructed by using the parity information available on the
remaining disks.
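The parity-based reconstruction in level 5 rebuilds a lost block as the XOR of the surviving blocks and the parity block; a minimal sketch:

```python
# RAID-5-style recovery: parity is the byte-wise XOR of the data blocks, and a
# lost block is the XOR of the survivors plus the parity block.
from functools import reduce

def xor_blocks(blocks):
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

data = [b'\x01\x02', b'\x10\x20', b'\xff\x00']
parity = xor_blocks(data)

# disk holding data[1] fails; rebuild it from the survivors plus parity
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])  # True
```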
Chapter 16, Problem 24RQ
Problem
What are storage area networks? What flexibility and advantages do they offer?
Step-by-step solution
Step 1 of 1
As data are integrated across organizations and the demand grows for storing and managing all
data at reasonable cost, it has become necessary to move from a static, server-centered operation
to a more flexible and dynamic infrastructure for information-processing requirements. For this
reason, many organizations have moved to storage area
networks (SANs).
• In a SAN, online storage peripherals are configured as nodes on a high-speed network and can
be attached to and removed from servers in a very flexible manner.
• SANs allow storage systems to be placed at longer distances from the servers and provide good
performance and different connectivity options.
• A SAN provides point-to-point connections between servers and storage systems through fiber
channel; it allows multiple RAID systems and tape libraries to be connected to servers.
Advantages
1. It is more flexible, providing many-to-many connectivity among servers and storage devices
using fiber channel hubs and switches.
2. Fiber optic cables allow a distance of up to 10 km between a server and a storage
system.
Chapter 16, Problem 25RQ
Problem
Step-by-step solution
Step 1 of 1
In enterprise applications it is necessary to provide high performance at a reasonably low cost,
and network-attached storage (NAS) devices are used for this purpose. A NAS device does not
provide the services common to a general-purpose server; instead, it allows the addition of
storage for file sharing.
Features
• It provides a very large amount of hard-disk storage space attached to a network, and multiple
servers can make use of that space without being shut down, which ensures better maintenance
and improves performance.
• It can be located anywhere on the local area network (LAN) and used in different
configurations.
• A hardware device, called a NAS box or NAS head, acts as a gateway between the NAS
system and the clients connected to the network.
• A NAS system does not need a monitor, keyboard, or mouse; one or more disk drives can be
attached to a NAS system to increase its total capacity.
• It can store any data that appears in the form of files, such as e-mail, web content including
text, images, or videos, and remote system backups.
• It includes built-in features such as security (authenticating access) and the automatic
sending of alerts through mail in case an error occurs on a connected device.
Chapter 16, Problem 26RQ
Problem
How have new iSCSI systems improved the applicability of storage area networks?
Step-by-step solution
Step 1 of 1
Internet SCSI (iSCSI) is a protocol that allows clients (initiators) to send SCSI commands to
SCSI storage devices on remote channels.
• The main feature is that, it does not require any special cabling connections as needed by Fiber
Channel and it can run for longer distances using existing network infrastructure.
• iSCSI allows data transfers over intranets and manages storage over long distances.
• It can transfer data over variety of networks includes local area networks (LANs), wide area
networks (WANs) or the Internet.
• It is bidirectional; when the request is given, it is processed and the resultant data is sent in
response to the original request.
• It combines different features such as simplicity, low cost, and the functionality of iSCSI devices
provides good upgrades and hence applied in small and medium sized business applications.
Chapter 16, Problem 27RQ
Problem
Step-by-step solution
Step 1 of 3
SATA Protocol:
SATA stands for Serial ATA, where ATA stands for AT Attachment; therefore SATA is Serial
AT Attachment.
SATA is a modern storage protocol that has largely replaced the older SCSI (small
computer system interface) and parallel ATA interfaces in laptops and small personal computers.
SATA overcomes design limitations of the previous storage protocols.
Step 2 of 3
SAS Protocol:
SAS stands for Serial Attached SCSI. SAS overcomes design limitations of the previous storage
protocols and is also considered superior to SATA.
• SAS was designed to replace parallel SCSI interfaces in storage area networks (SANs).
• SAS drives are faster than SATA drives and have dual-port capability.
Step 3 of 3
FC Protocol:
FC stands for the Fiber Channel protocol. Fiber channel is used to connect multiple RAID
systems and tape libraries, which may have different configurations.
• Fiber channel supports point-to-point connections between servers and storage systems. It also
provides the flexibility of many-to-many connections between servers and storage devices.
• Fiber channel has almost the same performance as SAS. Because it uses fiber optic cables,
high-speed data transfer is supported.
Chapter 16, Problem 28RQ
Problem
What are solid-state drives (SSDs) and what advantage do they offer over HDDs?
Step-by-step solution
Step 1 of 2
SSD is the abbreviation for solid-state drive, which uses integrated circuit assemblies as storage
to store data persistently. It is nonvolatile memory, meaning it does not lose its data when the
system is turned off.
An SSD is based on flash memory technology, which is why it is sometimes referred to as flash
storage; because SSDs do not require a continuous power supply to retain data on secondary
storage, they are known as solid-state disks or solid-state drives.
An SSD does not have a read/write head like a traditional electromagnetic disk; instead it has a
controller (an embedded processor) that carries out the various operations. This makes data
retrieval faster than on magnetic disks. SSDs commonly use interconnected NAND flash memory
chips.
An SSD uses a wear-leveling technique when storing data: data is written to a different NAND
cell instead of overwriting the same one, which extends the life of the SSD.
Comment
Step 2 of 2
• Faster access:
In an SSD, data can be accessed directly from any location in flash memory, so access time can be up to 100 times faster than on an HDD and latency is low; consequently the data transfer rate is high and system boot time is short.
• More reliable:
An SSD has no moving mechanical arm for read and write operations; data is stored on integrated-circuit chips. The SSD's controller manages all operations on the flash cells. A flash cell can be written and erased only a limited number of times before it fails, and the controller manages these activities so that the SSD can work for many years under normal use.
Because an SSD has no moving components, its data is safer even when the equipment is handled roughly.
• Lower power consumption:
Since no head movement is needed to read and write data, power consumption is lower than for an HDD, which saves battery life: an SSD uses only 2-3 watts, whereas an HDD uses 6-7 watts.
With no moving parts, an SSD also generates less heat and makes no noise, which helps increase the life and reliability of the drive.
• Light weight:
Because SSDs are mounted on a circuit board and have no moving head or spindle, they are light and small.
Chapter 16, Problem 29RQ
Problem
What is the function of a buffer manager? What does it do to serve a request for data?
Step-by-step solution
Step 1 of 2
The buffer manager is a software module of the DBMS that serves all data requests, decides which buffer to allocate, and manages page replacement. Its duties include:
• To increase the probability that a requested page is found in main memory.
• To find an appropriate replacement page when a new disk block must be read in, choosing a page that will not be required soon.
• To ensure that the number of buffers fits in main memory.
• To apply the buffer replacement policy and select which buffers to empty when the requested amount of data surpasses the available buffer space.
Step 2 of 2
To fulfill its functions, the buffer manager maintains two pieces of bookkeeping information for each page in the buffer pool:
1. Pin count: a counter tracking the number of outstanding requests for the page, i.e., the number of users currently using it. Initially the counter is set to zero. While the counter is zero the page is unpinned, and only unpinned pages are allowed to be written back to disk or replaced; when the counter is incremented, the page is pinned.
2. Dirty bit: initially set to zero for every page; when the page is updated in the buffer, the bit is set to 1.
To serve a request for data, the buffer manager proceeds as follows:
• It checks whether the page is already in the buffer. If it is, it increments the page's pin count and returns the page.
• If the page is not in the buffer, the buffer manager takes the following steps:
• It chooses a replacement page according to the replacement policy and increments the requested page's pin count.
• If the dirty bit of the replacement page is set, the buffer manager writes that page back to disk before replacing the old copy.
• If the dirty bit is not set, the replacement page need not be written back to disk.
• The buffer manager reads the new page from disk and conveys the memory location of the page to the requesting application.
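The request-handling steps above can be sketched in code. This is a minimal illustration, not DBMS source: the dictionary-based frames, page ids, and the pick-the-first-unpinned-page replacement choice are all assumptions made for the sketch.

```python
# Toy buffer manager with pin counts and dirty bits (illustrative only).
class BufferManager:
    def __init__(self, capacity, disk):
        self.capacity = capacity      # max pages held in the pool
        self.disk = disk              # stand-in for disk: page_id -> contents
        self.pool = {}                # page_id -> {"data", "pin_count", "dirty"}

    def request_page(self, page_id):
        # Case 1: page already buffered -- pin it and return it.
        if page_id in self.pool:
            frame = self.pool[page_id]
            frame["pin_count"] += 1
            return frame["data"]
        # Case 2: pool full -- evict an unpinned page first.
        if len(self.pool) >= self.capacity:
            victim = next(pid for pid, f in self.pool.items()
                          if f["pin_count"] == 0)   # replacement policy goes here
            if self.pool[victim]["dirty"]:          # write back only if modified
                self.disk[victim] = self.pool[victim]["data"]
            del self.pool[victim]
        # Read the page from "disk" and pin it for the requester.
        self.pool[page_id] = {"data": self.disk[page_id],
                              "pin_count": 1, "dirty": False}
        return self.pool[page_id]["data"]

    def release_page(self, page_id, modified=False):
        frame = self.pool[page_id]
        frame["pin_count"] -= 1       # page is unpinned when this reaches zero
        frame["dirty"] = frame["dirty"] or modified
```

A real buffer manager would, of course, implement a proper replacement policy (see the next question) instead of taking the first unpinned frame.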
Chapter 16, Problem 30RQ
Problem
Step-by-step solution
Step 1 of 2
In a large DBMS, files contain many pages, and it is not possible to keep all of the data in memory at the same time. To overcome this storage problem and improve the efficiency of DBMS transactions, the buffer manager (software) uses buffer replacement strategies that decide which buffer to use and which pages to replace in the buffer to make room for newly requested pages.
Step 2 of 2
• LRU (least recently used) policy:
The LRU strategy keeps track of when each page was last used and removes the page that has gone unused the longest.
LRU works on the principle that pages used recently are most likely to be used again in further processing. To maintain the strategy, the buffer manager keeps a table recording, for every page, when it was last used. This is a very common and simple policy.
It has the problem of sequential flooding: a repeated sequential scan touches every page once, forcing repeated I/O for each page.
• Clock policy:
This is an approximate LRU technique, similar to a round-robin strategy. In the clock replacement policy, buffers are arranged in a circle like a clock with a single clock hand. The buffer manager sets a "use bit" on each reference. If the use bit is not set (flag 0) for a buffer, that buffer has not been used in a long time and is a candidate for replacement. Clock thus replaces an old page, though not necessarily the oldest.
• FIFO (first in, first out) policy:
This is the simplest buffer replacement technique. When a buffer is required to store a new page, the page that arrived earliest is swapped out. Pages are arranged in the buffer as a queue, such that the most recent arrival is at the tail and the oldest arrival at the head.
During replacement, the page at the head of the queue is replaced first. This strategy is simple and easy to implement but not always desirable, because the oldest page may be a frequently used one that will be needed again and must then be swapped back in, creating processing overhead.
• MRU (most recently used) policy:
This policy removes the most recently used page first and is also called fetch-and-discard. It is useful in sequential scanning, where the most recently used page will not be needed again for some time. In sequential-scan situations, the LRU and Clock strategies do not perform well. To enhance the performance of FIFO, it can be modified by pinning certain blocks, such as a root index block, ensuring that they can never be replaced and always remain in the buffer.
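The LRU policy above can be illustrated with a toy sketch; the frame count and the access pattern are made up for the example.

```python
# Minimal LRU page replacement using an ordered dictionary:
# the least recently used page sits at the front and is evicted first.
from collections import OrderedDict

class LRUBuffer:
    def __init__(self, frames):
        self.frames = frames
        self.pages = OrderedDict()   # least recently used first

    def access(self, page_id):
        if page_id in self.pages:
            self.pages.move_to_end(page_id)   # mark as most recently used
            return "hit"
        if len(self.pages) >= self.frames:
            self.pages.popitem(last=False)    # evict least recently used
        self.pages[page_id] = True
        return "miss"

buf = LRUBuffer(frames=3)
for p in [1, 2, 3, 1, 4]:   # accessing 4 evicts page 2, the LRU page
    buf.access(p)
```

The same loop run over a long sequential scan shows the sequential-flooding problem: every access is a miss.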
Chapter 16, Problem 31RQ
Problem
What are optical and tape jukeboxes? What are the different types of optical media served by
optical drives?
Step-by-step solution
Step 1 of 2
Optical jukeboxes:
An optical jukebox is an intelligent data storage device that uses an array of optical disk platters and automatically loads and unloads these disks according to storage needs. Jukeboxes have high storage capacity, supporting terabytes and even petabytes of tertiary storage.
• Optical jukeboxes can have up to 2,000 disk slots. Because an optical jukebox keeps traversing different disks according to data requirements, it creates time overhead and affects processing.
• The process of dynamically loading and unloading disks on the drives is called migration.
Tape jukeboxes:
A magnetic tape jukebox uses a number of tapes as storage and automatically loads and unloads the tapes on tape drives. This is a popular tertiary storage medium that can handle terabytes of data.
Step 2 of 2
Optical media store data in digital form and can hold all types of data: audio, video, software, images, and text.
An optical drive is used to read and write data on optical media. The drive reads and writes using a laser, an electromagnetic wave whose specific wavelength is chosen to suit the type of media.
• CD (compact disc): by use and recording type, there are three types of CDs:
Read-only: CD-ROM
Writable: CD-R
Re-writable: CD-RW
Chapter 16, Problem 32RQ
Problem
Step-by-step solution
Step 1 of 1
AST (automated storage tiering) is a storage technique that dynamically moves data among different types of storage, such as SATA, SAS, and SSDs, based on storage requirements.
The automated tiering mechanism is managed by the storage administrator. According to the tiering policy, less-used data is transferred to SATA drives, which are slower but less expensive, while frequently used data is transferred to high-speed SAS or solid-state drives.
EMC implements FAST (fully automated storage tiering), which automatically monitors data activity and moves active data to high-performance storage such as SSDs and inactive data to inexpensive, slower storage such as SATA. AST is therefore useful because it yields high performance at low cost.
Chapter 16, Problem 33RQ
Problem
Step-by-step solution
Step 1 of 2
In an object-based storage system, data is organized in units called objects rather than in file blocks. Data is not stored in a hierarchy; instead, all data is stored in the form of objects, and a required object can be located directly through its unique global identifier, without search overhead. Along with its data, each object carries:
• Variable metadata: information about the main data, such as its location, usability, confidentiality, and other information required to manage it.
• Unique global identifier: an identifier holding the address information of the data so that the data can be located easily.
Step 2 of 2
An object storage system is better than a conventional storage system in the following ways:
• As organizations expand, their data grows day by day. If a file system is used as the data store and data is kept in blocks, managing huge amounts of data becomes very difficult: in conventional file systems, data is stored hierarchically, and all of it sits in blocks with their own unique addresses. Storing data as objects with additional metadata removes this management overhead.
• Object-based storage provides better protection of and access to data. In object-based systems, applications can access an object directly through its unique global identifier, whereas in a file storage system data must be searched in linear or binary fashion, which generates processing overhead and is time consuming.
• An object-based storage system supports features such as replication, encapsulation, and distribution of objects, which make data secure, manageable, and easily accessible. A conventional file-based storage system does not support replication and distribution of objects.
Chapter 16, Problem 34E
Problem
Consider a disk with the following characteristics (these are not parameters of any particular disk
unit): block size B = 512 bytes; interblock gap size G = 128 bytes; number of blocks per track =
20; number of tracks per surface = 400. A disk pack consists of 15 double-sided disks.
a. What is the total capacity of a track, and what is its useful capacity (excluding interblock
gaps)?
c. What are the total capacity and the useful capacity of a cylinder?
d. What are the total capacity and the useful capacity of a disk pack?
e. Suppose that the disk drive rotates the disk pack at a speed of 2,400 rpm (revolutions per
minute); what are the transfer rate (tr) in bytes/msec and the block transfer time (btt) in msec?
What is the average rotational delay (rd) in msec? What is the bulk transfer rate? (See Appendix
B.)
f. Suppose that the average seek time is 30 msec. How much time does it take (on the average)
in msec to locate and transfer a single block, given its block address?
g. Calculate the average time it would take to transfer 20 random blocks, and compare this with
the time it would take to transfer 20 consecutive blocks using double buffering to save seek time
and rotational delay.
Step-by-step solution
Step 1 of 8
Given data: block size B = 512 bytes; interblock gap size G = 128 bytes; 20 blocks per track; 400 tracks per surface; 15 double-sided disks, giving 30 recording surfaces.
Step 2 of 8
a. Total track capacity = (B + G) × (blocks per track) = (512 + 128) × 20 = 12,800 bytes.
Useful track capacity = B × (blocks per track) = 512 × 20 = 10,240 bytes.
Step 3 of 8
The number of cylinders equals the number of tracks per surface, that is, 400.
Step 4 of 8
c. A cylinder consists of one track on each of the 30 surfaces.
Total cylinder capacity = 12,800 × 30 = 384,000 bytes.
Useful cylinder capacity = 10,240 × 30 = 307,200 bytes.
Step 5 of 8
d. The disk pack has 400 cylinders.
Total pack capacity = 384,000 × 400 = 153,600,000 bytes (about 153.6 Mbytes).
Useful pack capacity = 307,200 × 400 = 122,880,000 bytes (about 122.9 Mbytes).
Step 6 of 8
e. At 2,400 rpm, one revolution takes 60,000 / 2,400 = 25 msec.
Transfer rate tr = (total track capacity) / (revolution time) = 12,800 / 25 = 512 bytes/msec.
Block transfer time btt = B / tr = 512 / 512 = 1 msec.
Average rotational delay rd = half a revolution = 25 / 2 = 12.5 msec.
Bulk transfer rate btr = tr × (B / (B + G)) = 512 × (512 / 640) = 409.6 bytes/msec.
Step 7 of 8
f. The average time to locate and transfer a single block is s + rd + btt = 30 + 12.5 + 1 = 43.5 msec.
Step 8 of 8
g. Transferring 20 random blocks takes 20 × (s + rd + btt) = 20 × 43.5 = 870 msec.
Transferring 20 consecutive blocks with double buffering requires only one seek and one rotational delay:
s + rd + 20 × (B / btr) = 30 + 12.5 + 20 × 1.25 = 67.5 msec.
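The calculations above can be checked numerically; the variable names simply mirror the exercise's symbols.

```python
# Numeric check of Exercise 34: parameters are taken from the problem statement.
B, G = 512, 128            # block and interblock gap size (bytes)
blocks_per_track = 20
tracks_per_surface = 400
surfaces = 15 * 2          # 15 double-sided disks

track_total = (B + G) * blocks_per_track      # 12,800 bytes per track
track_useful = B * blocks_per_track           # 10,240 bytes per track
cyl_total = track_total * surfaces            # 384,000 bytes per cylinder
cyl_useful = track_useful * surfaces          # 307,200 bytes per cylinder
pack_total = cyl_total * tracks_per_surface   # 153,600,000 bytes per pack
pack_useful = cyl_useful * tracks_per_surface # 122,880,000 bytes per pack

rev = 60_000 / 2_400       # one revolution in msec (25 msec at 2,400 rpm)
tr = track_total / rev     # transfer rate: 512 bytes/msec
btt = B / tr               # block transfer time: 1 msec
rd = rev / 2               # average rotational delay: 12.5 msec
btr = tr * B / (B + G)     # bulk transfer rate: 409.6 bytes/msec

s = 30                     # average seek time (msec)
one_block = s + rd + btt                   # 43.5 msec per random block
random_20 = 20 * one_block                 # 870 msec for 20 random blocks
consecutive_20 = s + rd + 20 * (B / btr)   # 67.5 msec with double buffering
```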
Chapter 16, Problem 35E
Problem
A file has r = 20,000 STUDENT records of fixed length. Each record has the following fields:
Name (30 bytes), Ssn (9 bytes), Address (40 bytes), PHONE (10 bytes), Birth_date (8 bytes),
Sex (1 byte), Major_dept_code (4 bytes), Minor_dept_code (4 bytes), Class_code (4 bytes,
integer), and Degree_program (3 bytes). An additional byte is used as a deletion marker. The file
is stored on the disk whose parameters are given in Exercise.
b. Calculate the blocking factor bfr and the number of file blocks b, assuming an unspanned
organization.
c. Calculate the average time it takes to find a record by doing a linear search on the file if (i) the
file blocks are stored contiguously, and double buffering is used; (ii) the file blocks are not stored
contiguously.
d. Assume that the file is ordered by Ssn; by doing a binary search, calculate the time it takes to
search for a record given its Ssn value.
Exercise
Step-by-step solution
Step 1 of 6
The record size R is the sum of the field sizes plus the deletion marker:
R = 30 + 9 + 40 + 10 + 8 + 1 + 4 + 4 + 4 + 3 + 1 = 114 bytes.
Step 2 of 6
b. Using the disk of Exercise 34 (B = 512 bytes), the blocking factor for an unspanned organization is
bfr = floor(B / R) = floor(512 / 114) = 4 records per block.
The number of file blocks is b = ceiling(r / bfr) = ceiling(20,000 / 4) = 5,000 blocks.
Step 3 of 6
c. (i) A linear search reads b/2 = 2,500 blocks on average. If the file blocks are stored contiguously and double buffering is used (with s = 30 msec, rd = 12.5 msec, and btr = 409.6 bytes/msec from Exercise 34):
time = s + rd + (b/2) × (B / btr) = 30 + 12.5 + 2,500 × 1.25 = 3,167.5 msec.
Step 4 of 6
(ii) If the file blocks are not stored contiguously, each block access costs s + rd + btt = 43.5 msec:
time = (b/2) × 43.5 = 2,500 × 43.5 = 108,750 msec.
Step 5 of 6
d. With the file ordered by Ssn, a binary search needs about ceiling(log2 b) = ceiling(log2 5,000) = 13 block accesses.
Step 6 of 6
The binary-search time is therefore 13 × (s + rd + btt) = 13 × 43.5 = 565.5 msec.
Chapter 16, Problem 36E
Problem
Suppose that only 80% of the STUDENT records from Exercise have a value for Phone, 85% for
Major_dept_code, 15% for Minor_dept_code, and 90% for Degree_program; and suppose that
we use a variable-length record file. Each record has a 1-byte field type for each field in the
record, plus the 1-byte deletion marker and a 1-byte end-of-record marker. Suppose that we use
a spanned record organization, where each block has a 5-byte pointer to the next block (this
space is not used for record storage).
Exercise
Step-by-step solution
Step 1 of 3
It is provided that each record has a 1-byte field type per field, along with the 1-byte deletion marker and the 1-byte end-of-record marker.
The fields that are always present are Name (30), Ssn (9), Address (40), Birth_date (8), Sex (1), and Class_code (4):
fixed data = 30 + 9 + 40 + 8 + 1 + 4 = 92 bytes, plus 6 field-type bytes = 98 bytes.
For the remaining variable (optional) fields, Phone (10 bytes, 80%), Major_dept_code (4 bytes, 85%), Minor_dept_code (4 bytes, 15%), and Degree_program (3 bytes, 90%), the expected number of data bytes per record is
(10 × 0.8) + (4 × 0.85) + (4 × 0.15) + (3 × 0.9) = 8 + 3.4 + 0.6 + 2.7 = 14.7 bytes,
plus the expected field-type bytes 0.8 + 0.85 + 0.15 + 0.9 = 2.7 bytes.
Step 2 of 3
a.
The average record length is
R = 98 + 14.7 + 2.7 + 1 (deletion marker) + 1 (end-of-record marker) = 117.4 bytes,
so the total bytes in the file are r × R = 20,000 × 117.4 = 2,348,000 bytes.
Step 3 of 3
b.
Since a spanned record organization is used and each block has an unused 5-byte pointer to the next block, the usable bytes in each block are 512 − 5 = 507.
The number of blocks required for the file is
b = ceiling(2,348,000 / 507) = 4,632 blocks.
Chapter 16, Problem 37E
Problem
Suppose that a disk unit has the following parameters; seek time s = 20 msec; rotational delay rd
= 10 msec; block transfer time btt= 1 msec; block size B = 2400 bytes; interblock gap size G =
600 bytes. An EMPLOYEE file has the following fields: Ssn, 9 bytes; Last_name, 20 bytes;
First_name, 20 bytes; Middle_init, 1 byte; Birth_date, 10 bytes; Address, 35 bytes; Phone, 12
bytes; Supervisor_ssn, 9 bytes; Department, 4 bytes; Job_code, 4 bytes; deletion marker, 1 byte.
The EMPLOYEE file has r = 30,000 records, fixed-length format, and unspanned blocking. Write
appropriate formulas and calculate the following values for the above EMPLOYEE file:
a. Calculate the record size R (including the deletion marker), the blocking factor bfr, and the
number of disk blocks b.
b. Calculate the wasted space in each disk block because of the unspanned organization.
c. Calculate the transfer rate tr and the bulk transfer rate btr for this disk unit (see Appendix B for
definitions of tr and btr).
d. Calculate the average number of block accesses needed to search for an arbitrary record in
the file, using linear search.
e. Calculate in msec the average time needed to search for an arbitrary record in the file, using
linear search, if the file blocks are stored on consecutive disk blocks and double buffering is
used.
f. Calculate in msec the average time needed to search for an arbitrary record in the file, using
linear search, if the file blocks are not stored on consecutive disk blocks.
g. Assume that the records are ordered via some key field. Calculate the average number of
block accesses and the average time needed to search for an arbitrary record in the file, using
binary search.
Step-by-step solution
Step 1 of 7
a. The field sizes are: Ssn 9, Last_name 20, First_name 20, Middle_init 1, Birth_date 10, Address 35, Phone 12, Supervisor_ssn 9, Department 4, Job_code 4, and the deletion marker 1 byte.
R = 9 + 20 + 20 + 1 + 10 + 35 + 12 + 9 + 4 + 4 + 1 = 125 bytes.
Step 2 of 7
Since the file is unspanned, the blocking factor can be calculated as
bfr = floor(B / R) = floor(2,400 / 125) = 19 records per block.
In an unspanned organization of records, the number of file blocks is
b = ceiling(r / bfr) = ceiling(30,000 / 19) = 1,579 blocks.
Step 3 of 7
b. As the file has an unspanned organization, the wasted space in each block is
B − (bfr × R) = 2,400 − (19 × 125) = 25 bytes.
Step 4 of 7
c. The transfer rate is tr = B / btt = 2,400 / 1 = 2,400 bytes/msec.
The bulk transfer rate is btr = tr × (B / (B + G)) = 2,400 × (2,400 / 3,000) = 1,920 bytes/msec.
Step 5 of 7
d. When searching for an arbitrary record in the file using linear search, the average number of block accesses is found as follows:
If one record satisfies the search condition, on average half of the blocks are searched, that is, b/2 = 1,579/2 = 789.5 block accesses.
If no record satisfies the search condition, all b = 1,579 blocks must be searched.
e. To calculate the average time to find a record by linear search, the search is performed over half of the file blocks on average.
If the blocks are stored on consecutive disk blocks and double buffering is used, the average time to read 789.5 blocks is
s + rd + 789.5 × (B / btr) = 20 + 10 + 789.5 × 1.25 = 1,016.875 msec.
So if the file blocks are stored consecutively and double buffering is used, the average time to find a record by linear search is 1,016.875 msec.
Step 6 of 7
f. If the file blocks are not stored on consecutive disk blocks, each block access costs s + rd + btt = 31 msec, so the time to read 789.5 blocks is
789.5 × 31 = 24,474.5 msec.
If the file blocks are not stored consecutively, the average time to find a record by linear search is therefore 24,474.5 msec.
Step 7 of 7
g. When the records are ordered via some key field, a binary search needs at most
ceiling(log2 b) = ceiling(log2 1,579) = 11 block accesses.
Each access costs s + rd + btt = 31 msec, so the average time to search for a record by binary search is
11 × 31 = 341 msec.
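The figures in parts a through g can be verified numerically; the snippet simply re-derives them from the parameters given in the problem statement.

```python
import math

# Numeric check of Exercise 37.
B, G = 2400, 600                # block size and gap size (bytes)
s, rd, btt = 20, 10, 1          # seek, rotational delay, block transfer (msec)
r = 30_000                      # number of records
R = 9 + 20 + 20 + 1 + 10 + 35 + 12 + 9 + 4 + 4 + 1   # 125 bytes incl. marker

bfr = B // R                    # 19 records per block (unspanned)
b = math.ceil(r / bfr)          # 1,579 file blocks
wasted = B - bfr * R            # 25 bytes wasted per block

tr = B / btt                    # 2,400 bytes/msec
btr = tr * B / (B + G)          # 1,920 bytes/msec

linear_avg_blocks = b / 2       # 789.5 block accesses on average
t_consecutive = s + rd + linear_avg_blocks * (B / btr)   # double buffering
t_random = linear_avg_blocks * (s + rd + btt)            # non-consecutive blocks
binary_accesses = math.ceil(math.log2(b))                # 11 accesses
t_binary = binary_accesses * (s + rd + btt)              # 341 msec
```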
Chapter 16, Problem 39E
Problem
Load the records of Exercise into expandable hash files based on extendible hashing. Show the
structure of the directory at each step, and the global and local depths. Use the hash function
h(K) = K mod 128.
Exercise
What are optical and tape jukeboxes? What are the different types of optical media served by
optical drives?
Step-by-step solution
Step 1 of 10
The records to be loaded are: 2369, 3760, 4692, 4871, 5659, 1821, 1074, 7115, 1620, 2428, 3943, 4750, 6975, 4981, and 9208.
Step 2 of 10
Calculate the hash value (bucket number) and binary value to each record as follows:
Step 3 of 10
Now, perform the extendible hashing with local depth 0 and global depth 0. Here, each bucket can hold two records.
The third record, 4692, cannot be inserted because two records are already in the bucket. Increase the global depth to 1 to insert more records; now both the global depth and the local depth are 1.
Check the binary value of each record's hash. Map the record to directory entry 0 if its binary value starts with 0, and to entry 1 if it starts with 1. For example, the binary value of the bucket number for 2369 is 1000001; the first bit is 1, so it maps to 1. The binary value of the bucket number for 3760 is 0110000; the first bit is 0, so it maps to 0.
Step 4 of 10
The next record cannot be inserted because all the blocks are filled.
Step 5 of 10
Now, increase the global depth to 2. Thus, check for the first two bits of the binary value of the
bucket number.
Step 6 of 10
The record 1821 cannot be inserted. Thus, increase the global depth to 3.
Step 7 of 10
Now, insert other records. The record 1074 can be inserted easily because there is a space in
the bucket.
Now, insert 7115.
Step 8 of 10
The record 7115 cannot be inserted. Now, increase the local depth to 3 for the last bucket and
insert the elements.
The records left are 6975, 4981 and 9208. The record 6975 cannot be inserted. Increase the
global depth to 4 and insert the elements.
Step 9 of 10
The last record cannot be inserted. Insert 9208 by increasing the local depth to 4 in the
corresponding block. The final table is as follows:
Step 10 of 10
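The directory doubling and bucket splitting walked through above can be sketched as runnable code. Bucket capacity 2 and h(K) = K mod 128 come from the exercise; the dictionary-based directory and the method names are illustrative assumptions. Note that a split re-inserts the bucket's keys and may cascade into further splits.

```python
class ExtendibleHash:
    # Toy extendible hashing sketch (bucket capacity 2, h(K) = K mod 128).
    def __init__(self, bucket_size=2):
        self.bucket_size = bucket_size
        self.global_depth = 0
        self.directory = [{"depth": 0, "keys": []}]   # 2**global_depth entries

    def _index(self, key):
        # Directory is indexed by the high-order global_depth bits
        # of the 7-bit hash value h(K) = K mod 128.
        h = key % 128
        return h >> (7 - self.global_depth)

    def insert(self, key):
        bucket = self.directory[self._index(key)]
        if len(bucket["keys"]) < self.bucket_size:
            bucket["keys"].append(key)
            return
        # Bucket full: double the directory if the bucket's local depth
        # equals the global depth, then split the bucket.
        if bucket["depth"] == self.global_depth:
            self.global_depth += 1
            self.directory = [b for b in self.directory for _ in (0, 1)]
        bucket["depth"] += 1
        new_bucket = {"depth": bucket["depth"], "keys": []}
        shift = self.global_depth - bucket["depth"]
        for i, b in enumerate(self.directory):
            # Entries whose distinguishing bit is 1 point to the new bucket.
            if b is bucket and (i >> shift) & 1:
                self.directory[i] = new_bucket
        pending, bucket["keys"] = bucket["keys"] + [key], []
        for k in pending:
            self.insert(k)          # redistribute; may split again
```

Loading the fifteen records of the exercise in order reproduces the splitting behaviour described in the steps above.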
Chapter 16, Problem 40E
Problem
Load the records of Exercise into an expandable hash file, using linear hashing. Start with a
single disk block, using the hash function h0 = K mod 20, and show how the file grows and how
the hash functions change as the records are inserted. Assume that blocks are split whenever an
overflow occurs, and show the value of n at each stage.
Exercise
What are optical and tape jukeboxes? What are the different types of optical media served by
optical drives?
Step-by-step solution
Step 1 of 1
When the initial block overflows, its records are redistributed into buckets with the new function K mod 2^2:
B1a:2369, 1821,4981
B1b:1074,4750
B1c:4871,5659,7115,3943,6975,
B1d:3760, 4692,1620,2428,9208
Since some buckets hold more than 2 elements, they are split using the function K mod 2^3:
B1:2369,
B5:1821,4981
B7:4871,3943,6975
B3:5659, 7115
B8:3760,9208
B4:4692,1620,2428
B2: 1074
B6:4750
Since some buckets are still over capacity, the function K mod 2^4 is applied to them:
B1: 2369
B5:4981
B7:4871,3943,
B8:9208
B15:6975
B4:4692,1620
B11:5659,7115
B12:2428
B13:1821
B16:3760
B14:4750
B2:1074
Chapter 16, Problem 41E
Problem
Compare the file commands listed in Section 16.5 to those available on a file access method you
are familiar with.
Step-by-step solution
Step 1 of 1
Records are placed in the file in the order in which they are inserted, with new records added at the end of the file. This record organization is called a heap (or pile) file. The file commands on a file of unordered records include inserting a record, deleting a record, and external sorting.
Inserting a record:
Insertion of a new record is very efficient. When a new record is inserted, the last block of the file is copied into a buffer, the new record is added, and the block is rewritten back to the disk.
Deleting a record:
The program must first find the record's block, copy the block into a buffer, delete the record from the buffer, and finally rewrite the block back to the disk. For record deletion, the technique of deletion markers is commonly used instead.
External sorting:
When we want to read all records in order of the value of some field, a sorted copy of the file must be created. For a large disk file this is expensive, so external sorting is used.
Chapter 16, Problem 42E
Problem
Suppose that we have an unordered file of fixed-length records that uses an unspanned record
organization. Outline algorithms for insertion, deletion, and modification of a file record. State any
assumptions you make.
Step-by-step solution
Step 1 of 1
Comparison of the heap file (unordered file) with file access methods:
Heap file:
- Records are placed in the file in the order in which they are inserted.
- Searching is done by a search procedure that mainly involves a linear search, which is an expensive procedure.
- File organization refers to the organization of the data of a file into records, blocks, and access structures.
- Records and blocks are placed on the storage medium and interlinked. Example: a sorted file.
Access methods:
- An access method refers to the way records are accessed. Some access methods can be applied only to files organized in certain ways: a file with an indexed or relative organization may still have its records accessed sequentially, but the records of a file with a sequential organization cannot be accessed directly.
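The insertion, deletion, and modification algorithms the problem asks for can be outlined in code. This is a toy model built on stated assumptions: records are tuples whose first component is the key, a "block" is a Python list of slots, the blocking factor is made up, and None plays the role of the deletion marker.

```python
BLOCK_SIZE = 3   # records per block (bfr); an assumption for the sketch

class HeapFile:
    def __init__(self):
        self.blocks = [[]]           # unordered file: list of blocks

    def insert(self, record):
        # Insertion: append to the last block; allocate a new block if full.
        if len(self.blocks[-1]) >= BLOCK_SIZE:
            self.blocks.append([])
        self.blocks[-1].append(record)

    def delete(self, key):
        # Deletion: linear search, then set a deletion marker in place.
        for block in self.blocks:
            for i, rec in enumerate(block):
                if rec is not None and rec[0] == key:
                    block[i] = None   # deletion marker
                    return True
        return False

    def modify(self, key, new_fields):
        # Modification: locate the record and overwrite its non-key fields;
        # fixed-length records still fit in place after the update.
        for block in self.blocks:
            for i, rec in enumerate(block):
                if rec is not None and rec[0] == key:
                    block[i] = (key,) + tuple(new_fields)
                    return True
        return False
```

A real implementation would read and rewrite one disk block at a time through a buffer rather than keeping the whole file in memory.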
Chapter 16, Problem 43E
Problem
Suppose that we have an ordered file of fixed-length records and an unordered overflow file to
handle insertion. Both files use unspanned records. Outline algorithms for insertion, deletion, and
modification of a file record and for reorganizing the file. State any assumptions you make.
Step-by-step solution
Step 1 of 2
Assumptions: the main file, abc, is ordered on a numeric Key field in increasing order; an unordered overflow file holds new insertions; both files use unspanned, fixed-length records.
Insertion (new record with Key value n):
1. Open the overflow file.
2. Copy its last block into a buffer.
3. Append the new record to the buffer, allocating a new block if the last block is full.
4. Rewrite the block and save the result.
5. Close the file.
Deletion (record with Key value n):
1. Binary-search the main file on Key for n; if the record is not found there, linearly search the overflow file.
2. Copy the record's block into a buffer.
3. Set the record's deletion marker in the buffer and rewrite the block.
4. Save the result.
5. Close the file.
Modification (record with Key value n, setting Name to xyz):
1. Locate the record as in deletion.
2. Copy its block into a buffer and change the Name field to xyz; with fixed-length records the modified record still fits in place.
3. Rewrite the block.
4. Save the result.
5. Close the file.
Step 2 of 2
Reorganization: periodically, sort the overflow file on Key, merge it with the ordered main file, physically remove the records whose deletion marker is set, and rewrite the result as the new main file, leaving the overflow file empty.
Chapter 16, Problem 44E
Problem
Can you think of techniques other than an unordered overflow file that can be used to make
insertions in an ordered file more efficient?
Step-by-step solution
Step 1 of 1
Yes. One possibility is to use an overflow file in which the records are chained together, in a manner similar to the overflow handling of static hash files. The overflow records that belong after each block of the ordered file are linked together in the overflow file, and a pointer to the first record of that linked list is kept in the corresponding block of the main file.
Chapter 16, Problem 45E
Problem
Suppose that we have a hash file of fixed-length records, and suppose that overflow is handled
by chaining. Outline algorithms for insertion, deletion, and modification of a file record. State any
assumptions you make.
Step-by-step solution
Step 1 of 2
Over flow is handled by chaining. Means, in a bucket. Multiple blocks are chained together and
attached by a number of over flow buckets together.
Step 1:
Each bucket stores a value all the entries that point to the same bucket have the same
values on the first ; bits
Step 2:
Compute
Use the first high order nits of as a displacement in to the bucket address table and
follow the pointer to the appropriate bucket.
Comment
Step 2 of 2
Sept 1.
Step 2.
Step 3.
Coalescing of buckets is possible-can only coalesce with a “buddy” bucket having the same
value of and same prefix, if one such bucket exists
Assumptions:-
Comment
Chapter 16, Problem 46E
Problem
Can you think of techniques other than chaining to handle bucket overflow in external hashing?
Step-by-step solution
Step 1 of 5
To handle bucket overflow in external hashing, besides chaining there are dynamic, trie-based techniques such as extendible hashing, which distributes records among buckets based on the values of the leading bits of their hash values and splits a bucket instead of chaining overflow blocks. The following steps illustrate this technique.
Step 2 of 5
Step 3 of 5
Records are distributed among the buckets (blocks) based on the first binary digit of the hash address.
Step 4 of 5
Step 5 of 5
Here a bucket overflows, so it is split again on the second bit of the hash address.
To show this, suppose a further record is inserted into the previous structure; the affected bucket is then split rather than chained.
Chapter 16, Problem 47E
Problem
Write pseudocode for the insertion algorithms for linear hashing and for extendible hashing.
Step-by-step solution
Step 1 of 2
We assume that the elements in the hash file are keys with no associated information, so the key K is identical to the element containing K, and every record slot either holds a key or is empty (Nil).
Linear hashing insertion (file with split pointer n, level i, and hash functions h_i(K) = K mod (2^i × M), where M is the initial number of buckets):
m := h_i(K);
if m < n then m := h_(i+1)(K);   (buckets below the split pointer were already split)
insert K into bucket m, using an overflow chain if the bucket is full;
if an overflow occurred then
  split bucket n using h_(i+1), moving records to the new bucket n + 2^i × M;
  n := n + 1;
  if n = 2^i × M then begin n := 0; i := i + 1 end
end if.
Step 2 of 2
Extendible hashing insertion (directory of global depth d, each bucket with a local depth d'):
m := the first d bits of h(K); follow directory entry m to bucket B;
if B has free space then insert K into B
else begin
  if d' = d then double the directory and set d := d + 1;
  split B into two buckets of local depth d' + 1, redistributing the records of B (and K) on bit d' + 1 of their hash values;
  update the directory entries to point to the two buckets;
  retry the insertion
end if.
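The linear-hashing pseudocode can also be written as runnable code. This sketch follows the powers-of-two hash functions h_i(K) = K mod 2^i with a single starting block, bucket capacity 2, and a split on every overflow; a bucket list that temporarily exceeds capacity stands in for an overflow chain. All of those choices are assumptions for the sketch.

```python
class LinearHashFile:
    # Toy linear hashing: level i, split pointer n, h_i(K) = K mod 2**i.
    def __init__(self, capacity=2):
        self.capacity = capacity
        self.i = 0            # current hash level
        self.n = 0            # next bucket to be split
        self.buckets = [[]]   # start with a single disk block

    def _bucket(self, key):
        m = key % (2 ** self.i)
        if m < self.n:                        # bucket m was already split,
            m = key % (2 ** (self.i + 1))     # so use the next-level function
        return m

    def insert(self, key):
        b = self._bucket(key)
        self.buckets[b].append(key)
        if len(self.buckets[b]) > self.capacity:   # overflow: split bucket n
            self._split()

    def _split(self):
        # Redistribute bucket n between itself and new bucket n + 2**i
        # (which is always the next free index, so append works).
        old, self.buckets[self.n] = self.buckets[self.n], []
        self.buckets.append([])
        for k in old:
            if k % (2 ** (self.i + 1)) == self.n:
                self.buckets[self.n].append(k)     # record stays
            else:
                self.buckets[-1].append(k)         # record moves
        self.n += 1
        if self.n == 2 ** self.i:    # every level-i bucket has been split
            self.i += 1
            self.n = 0
```

Note that the bucket that overflows is not necessarily the one that gets split; that is the defining property of linear hashing.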
Chapter 16, Problem 48E
Problem
Write program code to access individual fields of records under each of the following
circumstances. For each case, state the assumptions you make concerning pointers, separator
characters, and so on. Determine the type of information needed in the file header in order for
your code to be general in each case.
Step-by-step solution
Step 1 of 6
a.
Consider the following sketch for fixed-length records with unspanned blocking. Assume records are stored contiguously from a known starting address and that the file header records the record size R and the byte offset of each field within a record.
starting_location = 200;   /* address of the first record */
x = 5;                     /* number of the record to access */
y = 2;                     /* byte offset of the field within the record */
R = 25;                    /* record size in bytes */
field_address = starting_location + (R * x) + y;
• Because records are fixed length and unspanned, the address of any field is computed directly from the record number and the field's offset; when the record size is smaller than the block size, each block stores more than one record.
• The file header must contain the record size R and, for each field, its offset and length.
Step 2 of 6
b.
For fixed-length records with spanned blocking, the same address computation is used, but a record may continue in the next block. Assume each block ends with a pointer to the block where the record continues. After computing field_address = starting_location + (R * x) + y, if the address falls past the end of the current block, follow the block pointer and continue at the start of the next block.
• The file header must additionally contain the block size B and the convention used for the continuation pointer.
Step 3 of 6
c.
For variable-length records with variable-length fields and spanned blocking, assume each field is preceded by its length (or followed by a separator character) and each record ends with an end-of-record byte. To access a field, read the record sequentially from its start, skipping fields by their lengths (or scanning for separators) until the desired field is reached; if the record spans blocks, follow the block pointer when the current block ends.
• The file header must record the separator character or length convention and the end-of-record marker.
Step 4 of 6
d.
For variable-length records with repeating groups and spanned blocking, assume each repeating group is preceded by a count of its repetitions (or terminated by a separator). Field access proceeds as in part c, additionally skipping over the repeated occurrences using the count.
• The file header must record, for each repeating group, which fields repeat and the count or separator convention. Since spanned blocking lets records cross block boundaries, no maximum record length is required.
Step 5 of 6
e.
For variable-length records with optional fields and spanned blocking, assume each field occurrence is preceded by a 1-byte field type. To access a field, scan the record's (field type, value) pairs until the requested field type is found or the end-of-record marker is reached, in which case the field is absent from this record.
• The file header must map each field name to its field-type code. As some fields are optional, no fixed record length can be assumed.
Step 6 of 6
f.
For variable-length records that allow all three cases of parts c, d, and e, combine the conventions: each field occurrence carries a field-type byte, variable-length values carry a length (or separator), and repeating groups carry a count. Access scans the record as above, dispatching on the field-type byte.
• The file header must combine all of the information of parts c, d, and e: the field-type codes, the length or separator conventions, and the repeating-group descriptions.
Chapter 16, Problem 49E
Problem
Suppose that a file initially contains r = 120,000 records of R = 200 bytes each in an unsorted
(heap) file. The block size B = 2,400 bytes, the average seek time s = 16 ms, the average
rotational latency rd = 8.3 ms, and the block transfer time btt = 0.8 ms. Assume that 1 record is
deleted for every 2 records added until the total number of active records is 240,000.
Step-by-step solution
Step 1 of 4
Let X be the number of records deleted. One record is deleted for every two records added, so the number of active records is 120,000 − X + 2X = 240,000.
X = 120,000
Step 2 of 4
(a)
= 50K blocks.
Step 3 of 4
(b)
= 12000 ms.
= 12 sec.
Step 4 of 4
(c)
= 10000 * 0.8
= 8 sec.
Chapter 16, Problem 50E
Problem
Suppose we have a sequential (ordered) file of 100,000 records where each record is 240 bytes.
Assume that B = 2,400 bytes, s = 16 ms, rd = 8.3 ms, and btt = 0.8 ms. Suppose we want to
make X independent random record reads from the file. We could make X random block reads or
we could perform one exhaustive read of the entire file looking for those X records. The question
is to decide when it would be more efficient to perform one exhaustive read of the entire file than
to perform X individual random reads. That is, what is the value for X when an exhaustive read of
the file is more efficient than random X reads? Develop this as a function of X.
Step-by-step solution
Step 1 of 3
Calculate the blocking factor and the total number of blocks (TB) in the file:
bfr = ⌊B/R⌋ = ⌊2,400/240⌋ = 10 records per block
TB = r/bfr = 100,000/10 = 10,000 blocks
Step 2 of 3
Calculate the time required for an exhaustive read (er): one seek and one rotational delay, followed by the sequential transfer of all blocks:
er = s + rd + (TB × btt) = 16 + 8.3 + (10,000 × 0.8) = 8,024.3 ms
Hence, the time required for an exhaustive read (er) = 8,024.3 ms.
Step 3 of 3
One exhaustive read of the entire file is more efficient than X individual random reads when:
Time required to perform X individual random reads > time required for the exhaustive read
X × (s + rd + btt) > er
X × 25.1 ms > 8,024.3 ms, so X > 319.7
Therefore, when 320 or more individual random reads are required, it is better to read the file exhaustively.
The function in X that relates individual random reads and the exhaustive read is given by:
25.1X > 8,024.3
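This break-even computation can be checked with a short C++ sketch (the function names are illustrative; the formulas are the ones used in the solution):

```cpp
#include <cmath>

// s = average seek, rd = rotational delay, btt = block transfer time (ms).
double exhaustiveReadMs(int blocks, double s, double rd, double btt) {
    // one seek + one rotational delay, then a sequential transfer of all blocks
    return s + rd + blocks * btt;
}

double randomReadsMs(int x, double s, double rd, double btt) {
    // each random read pays the full seek + rotation + transfer cost
    return x * (s + rd + btt);
}

int breakEvenReads(int blocks, double s, double rd, double btt) {
    // smallest X for which X random reads cost more than one exhaustive read
    return static_cast<int>(std::ceil(exhaustiveReadMs(blocks, s, rd, btt)
                                      / (s + rd + btt)));
}
```

For the parameters of this exercise, breakEvenReads(10000, 16, 8.3, 0.8) returns 320, matching the conclusion above.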
Chapter 16, Problem 51E
Problem
Suppose that a static hash file initially has 600 buckets in the primary area and that records are
inserted that create an overflow area of 600 buckets. If we reorganize the hash file, we can
assume that most of the overflow is eliminated. If the cost of reorganizing the file is the cost of
the bucket transfers (reading and writing all of the buckets) and the only periodic file operation is
the fetch operation, then how many times would we have to perform a fetch (successfully) to
make the reorganization cost effective? That is, the reorganization cost and subsequent search
cost are less than the search cost before reorganization. Support your answer. Assume s = 16
msec, rd = 8.3 msec, and btt = 1 msec.
Step-by-step solution
Step 1 of 1
Total reorganization cost = buckets read + buckets written = (600 + 600) + (600 + 600)
= 2,400 bucket transfers
= 2,400 × (1 ms)
= 2,400 ms
Average search time per fetch before reorganization = time to access 1.5 buckets, since 50% of the time we must also access the overflow bucket. One bucket access costs
s + rd + btt = 16 + 8.3 + 1 = 25.3 ms
so each fetch costs 1.5 × 25.3 = 37.95 ms before reorganization and 25.3 ms after it.
Total cost of reorganizing and then performing X fetches = 2,400 + X(25.3) ms.
The reorganization is cost effective when 2,400 + 25.3X < 37.95X, that is, 12.65X > 2,400, so X > 189.7. Hence about 190 or more successful fetches make the reorganization cost effective.
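The same break-even arithmetic can be sketched in C++, assuming (as in this solution) that each fetch touches 1.5 buckets before reorganization and 1 bucket after it, so half of one bucket access (s + rd + btt) is saved per fetch:

```cpp
#include <cmath>

// bucketTransfers: total buckets read and written during reorganization.
// Returns the number of fetches needed for the reorganization to pay off.
int fetchesToJustifyReorg(int bucketTransfers, double s, double rd, double btt) {
    double reorgCostMs = bucketTransfers * btt;    // e.g., 2,400 transfers -> 2,400 ms
    double bucketAccessMs = s + rd + btt;          // one bucket access
    double savedPerFetchMs = 0.5 * bucketAccessMs; // overflow avoided 50% of the time
    return static_cast<int>(std::ceil(reorgCostMs / savedPerFetchMs));
}
```

With bucketTransfers = 2,400, s = 16, rd = 8.3, and btt = 1, this yields 190 fetches.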
Chapter 16, Problem 52E
Problem
Suppose we want to create a linear hash file with a file load factor of 0.7 and a blocking factor of
20 records per bucket, which is to contain 112,000 records initially.
Step-by-step solution
Step 1 of 2
(a)
Number of buckets needed = r/(load factor × bfr) = 112,000/(0.7 × 20) = 8,000 buckets.
Step 2 of 2
(b)
Let K be the number of bits used for bucket addresses, so that 2^K ≤ 8,000 ≤ 2^(K+1):
2^12 = 4,096
2^13 = 8,192
K = 12
Number of buckets that use K + 1 = 13 bits = 8,000 − 4,096 = 3,904.
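The bucket count and address-bit arithmetic can be sketched as follows (function names are illustrative):

```cpp
// Buckets needed so that the file load factor is met on average.
int numBuckets(int records, double loadFactor, int bfr) {
    // each bucket holds loadFactor * bfr records on average; round to nearest
    return static_cast<int>(records / (loadFactor * bfr) + 0.5);
}

// Largest K with 2^K <= buckets (the base number of address bits).
int addressBits(int buckets) {
    int k = 0;
    while ((1 << (k + 1)) <= buckets) ++k;
    return k;
}
```

For this exercise, numBuckets(112000, 0.7, 20) gives 8,000, addressBits(8000) gives 12, and 8000 − 2^12 = 3,904 buckets use 13-bit addresses.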
Chapter 17, Problem 1RQ
Problem
Define the following terms: indexing field, primary key field, clustering field, secondary key field,
block anchor, dense index, and nondense (sparse) index.
Step-by-step solution
Step 1 of 1
Indexing field:-
A record structure consists of several fields, and an index access structure is usually defined on a single field of a file, called an indexing field. Any field in a file can be used to create an index, and multiple indexes on different fields can be constructed for the same file.
Primary key field:-
A primary key field is the ordering key field of the file; it is a field that uniquely identifies each record.
Clustering field:-
A clustering field is a nonkey ordering field of the file; records with the same value of the clustering field are stored contiguously (clustered) in the file.
Secondary key field:-
A secondary index is also an ordered file with two fields (like a primary index). The first field is of the same data type as some nonordering field of the data file, which is the indexing field. If this secondary access structure uses a key field, which has a distinct value for every record, the field is called a secondary key field.
Block anchor:-
The total number of entries in the index is the same as the number of disk blocks in the ordered data file.
The first record in each block of the data file is called the block anchor.
Dense index:
A dense index has an index entry for every search key value (and hence every record) in the data file. Each index entry contains the search key value and a pointer to the record on disk.
Nondense (sparse) index:-
A nondense (sparse) index has entries for only some of the search values; for example, a primary index with one entry per data block (using block anchors) is nondense.
Chapter 17, Problem 2RQ
Problem
What are the differences among primary, secondary, and clustering indexes? How do these
differences affect the ways in which these indexes are implemented? Which of the indexes are
dense, and which are not?
Step-by-step solution
Step 1 of 1
A primary index is defined on the ordering key field of an ordered file; a clustering index is defined on a nonkey ordering field of an ordered file; a secondary index is defined on any nonordering field. Because primary and clustering indexes rely on the physical order of the file, they are implemented with one entry per block (or per distinct clustering value) and are therefore nondense. A secondary index cannot use the physical order of the file, so a secondary index on a key field must be dense, with one entry per record.
Chapter 17, Problem 3RQ
Problem
Why can we have at most one primary or clustering index on a file, but several secondary
indexes?
Step-by-step solution
Step 1 of 2
An ordered file with fixed-size records and a key ordering field can have a primary index, while a clustering index consists of a block pointer together with a field of the same data type as the (nonkey) clustering field of the file.
Adding or removing records in such a file cannot be done easily, because the data records must remain physically ordered.
To reduce this problem, a whole block (or cluster of blocks) can be reserved for each value of the clustering field.
Step 2 of 2
A secondary index is defined on a field by which the file is not ordered. It can be defined on a single key field with unique values or on a nonkey field with repeated values.
The reason there can be at most one primary or clustering index but several secondary indexes is the following:
• A primary or clustering index relies on the physical ordering of the file, and records can be physically ordered on only one field, so a file can have at most one such index. A secondary index does not depend on the physical order: it can use a unique key field in every record, or a nonkey field with repeated values in which the index pointers point to another block that holds pointers to the repeated values. Hence many secondary indexes can coexist on one file.
Chapter 17, Problem 4RQ
Problem
How does multilevel indexing improve the efficiency of searching an index file?
Step-by-step solution
Step 1 of 4
Solution:
• In multilevel indexing, the main idea is to reduce the number of index blocks that must be searched. A multilevel index treats the first-level index file itself as an ordered file with a distinct value for each key K(i), so a primary index can be built on it.
• Starting from the single-level (first-level) index, we create the second level, then the third level, and so on, each level being a primary index on the level below it.
• In this way the multilevel index is built out of single index blocks.
Step 2 of 4
Multilevel indexing improves the efficiency of searching the index file as follows:
Step 1:
• The first level is an ordered index file with a distinct value for each key K(i); suppose it has r1 entries.
Step 2:
• Because index entries at every level are the same size, the blocking factor bfri (the fan-out fo) is the same at every level. The first level, with r1 entries, needs ⌈r1/fo⌉ blocks, which is also r2, the number of entries of the second level.
Step 3:
• The second level has one entry per first-level block, so it needs ⌈r2/fo⌉ blocks, and so on.
• The process is repeated until all the entries of some index level fit in a single block.
• That single block, at the t-th level, is the top index level; t ≈ ⌈log_fo(r1)⌉.
Step 3 of 4
Searching a multilevel index requires only one block access per level, t accesses in total, instead of the roughly ⌈log2(b1)⌉ accesses of a binary search over the first-level index blocks. This is how searching an index file becomes more efficient.
Step 4 of 4
The ways in which multilevel indexing improves the efficiency of searching an index file are:
• While searching for a record with a given indexing field value, it reduces the number of index blocks accessed.
• Dynamic multilevel indexes reduce the insertion and deletion problems of single-level indexes.
• Leaving some space in the index blocks for new entries when inserting is a further advantage that leads developers to adopt multilevel indexing.
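The level computation described above can be sketched as follows (a minimal illustration; the fan-out values in the example are arbitrary):

```cpp
// Number of levels of a multilevel index: repeatedly divide the entry
// count by the fan-out fo until a single block remains (t = ceil(log_fo r1)).
int indexLevels(long entries, int fanOut) {
    int levels = 0;
    long blocks = entries;
    do {
        blocks = (blocks + fanOut - 1) / fanOut;  // blocks needed at this level
        ++levels;
    } while (blocks > 1);
    return levels;
}
```

For example, 7,500 first-level entries with a fan-out of 34 need 221 blocks at the first level, 7 at the second, and 1 at the third: three levels in all.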
Chapter 17, Problem 5RQ
Problem
Step-by-step solution
Step 1 of 2
Order p of a B-tree:-
In a B-tree of order p, each node contains at most p − 1 search key values and at most p tree pointers. Each node has the form
⟨P1, ⟨K1, Pr1⟩, P2, ⟨K2, Pr2⟩, …, ⟨Kq−1, Prq−1⟩, Pq⟩, where q ≤ p.
Step 2 of 2
Step 1: Each Pi is a tree pointer to another node of the tree, and each Pri is a data pointer to the record whose search key field value is Ki.
Step 2: Within each node, K1 < K2 < … < Kq−1.
Step 3: For all search key field values X in the subtree pointed at by Pi, we have Ki−1 < X < Ki.
Step 4: Each node has at most p tree pointers, and each node except the root and the leaf nodes has at least ⌈p/2⌉ tree pointers.
Step 5: Each node, except the root and leaf nodes, has at least two tree pointers; the root has at least two tree pointers unless it is the only node in the tree.
Step 6: A node with q tree pointers, q ≤ p, has q − 1 search key field values (and hence q − 1 data pointers).
Step 7: All leaf nodes are at the same level. Leaf nodes have the same structure as internal nodes except that all of their tree pointers are null.
Chapter 17, Problem 6RQ
Problem
What is the order p of a B+-tree? Describe the structure of both internal and leaf nodes of a B+-
tree.
Step-by-step solution
Step 1 of 4
Order p of a B+-tree:-
A B+-tree is a variation of the B-tree data structure used to implement a dynamic multilevel index. In a B+-tree, data pointers are stored only at the leaf nodes.
Step 2 of 4
The structure of an internal node of a B+-tree of order p is as follows:
Step 3 of 4
Step 1: Each internal node has the form ⟨P1, K1, P2, K2, …, Pq−1, Kq−1, Pq⟩, where q ≤ p and each Pi is a tree pointer.
Step 2: Within each node, K1 < K2 < … < Kq−1.
Step 3: For all search field values X in the subtree pointed at by Pi, we have Ki−1 < X ≤ Ki for 1 < i < q; X ≤ K1 for i = 1; and Kq−1 < X for i = q.
Step 4: Each internal node has at most p tree pointers.
Step 5: Each internal node, except the root, has at least ⌈p/2⌉ tree pointers. The root node has at least two tree pointers if it is an internal node.
Step 6: An internal node with q pointers, q ≤ p, has q − 1 search field values.
Step 4 of 4
The structure of a leaf node of a B+-tree of order p is as follows:
Step 1: Each leaf node has the form ⟨⟨K1, Pr1⟩, ⟨K2, Pr2⟩, …, ⟨Kq−1, Prq−1⟩, Pnext⟩, where q ≤ p, each Pri is a data pointer, and Pnext points to the next leaf node of the B+-tree.
Step 2: Within each leaf node, K1 < K2 < … < Kq−1.
Step 3: Each Pri is a data pointer that points to the record whose search field value is Ki (or to a block of record pointers if the search field is not a key).
Step 4: Each leaf node has at least ⌈p/2⌉ values.
Step 5: All leaf nodes are at the same level.
Chapter 17, Problem 7RQ
Problem
How does a B-tree differ from a B+-tree? Why is a B+-tree usually preferred as an access
structure to a data file?
Step-by-step solution
Step 1 of 1
A B-tree has data pointers in both internal and leaf nodes, whereas a B+-tree has only tree pointers in internal nodes; all data pointers are in the leaf nodes.
A B+-tree is usually preferred as an access structure to a data file because its internal nodes can hold more entries, leading to fewer levels and an improved search time.
In addition, the entire tree can be traversed in order using the Pnext pointers that link the leaf nodes.
Chapter 17, Problem 8RQ
Problem
Explain what alternative choices exist for accessing a file based on multiple search keys.
Step-by-step solution
Step 1 of 3
1. Ordered Index on Multiple Attributes: Here the index is created on a search key field that is a combination of attributes. If an index is created on n attributes, the search key values are tuples with n values:
⟨v1, v2, …, vn⟩
A lexicographic ordering of these tuple values establishes an order on this composite search key. Lexicographic ordering works similarly to the ordering of character strings. An index on a composite key of n attributes works similarly to primary or secondary indexing.
Step 2 of 3
2. Partitioned Hashing: For example, consider a composite search key where Dno is hashed to 3 bits and Age to 5 bits; together we get 8 bits of bucket address. Suppose that Dno hashes to '100' and Age = 59 hashes to '10101'; to search for this combination, the bucket address is 10010101.
Step 3 of 3
3. Grid Files: We can construct a grid array with one linear scale for each of the search attributes. This method is particularly useful for range queries, which map into a set of cells corresponding to a group of values along the linear scales. Conceptually, the grid file concept may be applied to any number of search keys: for n search keys, the grid array has n dimensions, one per search attribute, and provides access by combinations of values along those dimensions.
Grid files perform well in terms of reducing the time for multiple-key access. However, they impose a space overhead for the grid array structure; moreover, with dynamic files, frequent reorganization of the file adds to the maintenance cost.
Chapter 17, Problem 9RQ
Problem
What is partitioned hashing? How does it work? What are its limitations?
Step-by-step solution
Step 1 of 1
Partitioned hashing:-
Partitioned hashing is an extension of static external hashing that allows access on multiple keys: the hash value is split into segments, one segment per attribute of the composite search key.
For example, for a customer file with the composite search key (customer-street, customer-city), each attribute is hashed to its own bit segment and the segments are concatenated to form the bucket address.
We can then search for the required composite search key by looking up all the buckets that match the parts of the address in which we are interested.
Its main limitations are that it cannot support range queries on any of the component attributes, and that the number of bits allocated to each attribute is fixed in advance, so the distribution of records over the buckets can become nonuniform.
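A minimal sketch of how a partitioned-hash bucket address is assembled, assuming the 3-bit/5-bit Dno/Age split used in the earlier example (the function name is illustrative):

```cpp
// Concatenate two per-attribute hash segments into one bucket address:
// h1 contributes the high bits1 bits, h2 the low bits2 bits.
unsigned compositeAddress(unsigned h1, int bits1, unsigned h2, int bits2) {
    unsigned m1 = (1u << bits1) - 1;   // mask each segment to its width
    unsigned m2 = (1u << bits2) - 1;
    return ((h1 & m1) << bits2) | (h2 & m2);
}
```

With the earlier values, compositeAddress(0b100, 3, 0b10101, 5) yields the bucket address 0b10010101. A multiple-key search fixes only some segments and enumerates the buckets matching the rest.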
Chapter 17, Problem 11RQ
Problem
Step-by-step solution
Step 1 of 2
Take a grid array for the EMPLOYEE file with one linear scale for D no and another for the Age attribute.
Linear scale for D no:
0 → 1, 2
1 → 3, 4
2 → 5
3 → 6, 7
4 → 8
5 → 9, 10
Step 2 of 2
Linear scale for Age: 0 1 2 3 4 5
Through this data we want to show that the linear scale for D no combines some D no values as one value on the scale (for example, values 1 and 2 map to 0), while D no = 5 alone corresponds to the value 2 on the scale; Age is divided into its own scale of 0 to 5 by grouping ages so as to distribute the employees uniformly by age.
The grid array then consists of cells, and each cell points to some bucket address where the records corresponding to that cell are stored.
A request for a given D no and Age maps into one cell of the grid array, and the matching records will be found in the corresponding bucket. For n search keys, the grid array would have n dimensions.
Grid array for the EMPLOYEE file.
Chapter 17, Problem 12RQ
Problem
Step-by-step solution
Step 1 of 2
If the indexes are all secondary and new records are inserted at the end of the file, then the data file itself is an unordered file. A file that has a secondary index on every one of its fields is called a fully inverted file. Usually, the indexes are implemented as B+-trees and are updated dynamically to reflect insertion or deletion of records.
Step 2 of 2
Indexed sequential files are important for applications where data needs to be accessed both sequentially and randomly.
For example, an organization may store the details of its employees as an indexed sequential file, and the file is sometimes accessed:
Sequentially:-
For example, when the whole file is processed to produce pay slips at the end of the month.
Randomly:
For example, when an employee changes address, or a female employee marries and changes her surname. An indexed sequential file can only be stored on a random access device.
Chapter 17, Problem 13RQ
Problem
Step-by-step solution
Step 1 of 1
Hashing is a technique used for searching when fast retrieval of records is necessary. The file organized this way is known as a hash file, and the search condition is evaluated on the hash key, the field whose value is used to locate the record.
Functions of hashing:
• A hash function 'h' (also called a randomizing function) is applied to the hash field value of a record and yields the address of the disk block where the record is stored.
• Hashing is also used as an internal search function within a program whenever a group of records is accessed exclusively by using the value of one field.
• Access structures similar to indexes that are based on hashing can be created; the hash index
is a secondary structure to access the file by using hashing function on a search key.
• The index entries contains the key (K) and the pointer (P) used to point to the record containing
the key or block containing the record for that key.
• The index files that contain these index entries can be organized as a dynamically expandable
hash file, using dynamic or linear or extendible hashing techniques, searching for an entry is
performed by using hash search algorithm on K.
• Once an entry is identified the pointer (P) is used to locate the corresponding record in the data
file.
Chapter 17, Problem 14RQ
Problem
What is bitmap indexing? Create a relation with two columns and sixteen tuples and show an
example of a bitmap index on one or both.
Step-by-step solution
Step 1 of 1
The bitmap index is a data structure that supports querying on multiple keys.
• It is used for relations that contain a large number of rows, to identify the rows that hold a specific key value.
• It creates an index for one or more columns, and each value or value range in those columns is indexed.
• A bitmap index is typically created for columns that contain only a small number of unique values.
Construction
• To create a bitmap index on a set of records in a relation, the records must be numbered from 0 to n − 1 with an id that can be mapped to a physical address made up of a block number and a record offset within the block.
• A bitmap index is built for one particular value of a particular field (column) as an array of bits.
• For example, consider a bitmap index for column F and a value V of that column. For a relation with n rows, the bitmap contains n bits. The jth bit is set to 1 if row j has the value V for column F; otherwise it is set to 0.
Example (Row, Name, Sex):
1 Alan M
2 Clara F
3 John M
4 Benjamin M
5 Marcus M
6 Alice F
7 Joule F
8 Louis M
9 Samuel M
10 Lara F
11 Andy F
12 Martin M
13 Catherine F
14 Fuji F
15 Zarain F
16 Ford M
For M: 1011100110010001; the bits for the rows whose Sex value is M are set to 1, and the others are set to 0.
For F: 0100011001101110; the bits for the rows whose Sex value is F are set to 1, and the others are set to 0.
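The two bitmaps above can be reproduced with a small helper (a sketch; the Sex column is modeled as a vector of 'M'/'F' characters):

```cpp
#include <string>
#include <vector>

// Build the bitmap for one value v of a column: bit j is '1' iff row j
// (numbered from 0) holds the value v.
std::string buildBitmap(const std::vector<char>& column, char v) {
    std::string bits;
    bits.reserve(column.size());
    for (char c : column)
        bits.push_back(c == v ? '1' : '0');
    return bits;
}
```

Applied to the sixteen Sex values of the example relation, buildBitmap yields "1011100110010001" for 'M' and "0100011001101110" for 'F'.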
Chapter 17, Problem 15RQ
Problem
What is the concept of function-based indexing? What additional purpose does it serve?
Step-by-step solution
Step 1 of 1
Function-based indexing is a type of indexing that was developed and is used by Oracle database systems, and is also available in some other commercial products.
In function-based indexing, a function is applied to the value of a field (or a collection of fields), and the result of the function is used as the key of the index.
Its additional purpose is to ensure that the Oracle database system will use the index instead of performing a full table scan, even when a function is applied to a column in the search condition of a query.
Example:
LOWER returns the customer name in lowercase letters; LOWER('MARTIN') results in 'martin'. Assuming an index has been created on LOWER(CustomerName), a query such as the following uses the index:
SELECT CustomerName
FROM Customer
WHERE LOWER(CustomerName) = 'martin';
If function-based indexing is not used, the Oracle database system performs a scan of the entire table: a B+-tree index supports searching directly only on the column value, so any function applied to a column prevents the use of such an index.
Chapter 17, Problem 16RQ
Problem
Step-by-step solution
Step 1 of 1
Physical index
• The index entries consist of the key (K) and a physical pointer (P) that points to the physical address of the record on disk, given as a block number and offset. This is referred to as a physical index.
• For example, a primary file organization is based on extendible or linear hashing, and then at
each time when a bucket is split, some of the records are allocated to a newer bucket and hence
they are provided with new physical addresses.
• If there is a secondary indexing used on the file, the pointers that point to that record must be
determined and updated (pointer must be changed if the record moved to another location) but it
is considered to be a difficult task.
Logical index
• The index entries of logical index are a pair of keys K and Ks.
• Every entry of the records contains one value of K used for primary organization of files and
another key Ks for the secondary indexing field matched with the value K of the field.
• While searching the secondary index on the value of Ks, a program can identify the location of
the corresponding value of K and use this matching key terms to access the record through the
primary organization of the file, thus it introduces an extra search level of indirection between the
data and access structure.
Chapter 17, Problem 17RQ
Problem
Step-by-step solution
Step 1 of 1
Column-based storage of relations is an alternative to the traditional way of storing relations row by row. It provides advantages especially for read-only queries, such as those on read-only databases: it stores each column of a relational table individually, which yields performance advantages for queries that touch only a few columns.
Advantages
• Partitioning the table vertically column by column, so those tables with a two-column are
constructed for each and every attribute of the table and thus only the columns that are needed
can be accessed.
• Column-wise indexes and join indexes are used on multiple tables to provide answer to the
queries without accessing the data tables.
Column-wise storage of data provides an extra feature in the index creation. The same column
present in each table on number of projections creates indexes on each projection. For storing
the values in the same column, various strategies, data compression, null value suppression,
and various encoding techniques are used.
Chapter 17, Problem 18E
Problem
Consider a disk with block size B = 512 bytes. A block pointer is P = 6 bytes long, and a record
pointer is PR = 7 bytes long. A file has r = 30,000 EMPLOYEE records of fixed length. Each
record has the following fields: Name (30 bytes), Ssn (9 bytes), Department_code (9 bytes),
Address (40 bytes), Phone (10 bytes), Birth_date (8 bytes), Sex (1 byte), Job_code (4 bytes),
and Salary (4 bytes, real number). An additional byte is used as a deletion marker.
b. Calculate the blocking factor bfr and the number of file blocks b, assuming an unspanned
organization.
c. Suppose that the file is ordered by the key field Ssn and we want to construct a primary index
on Ssn. Calculate (i) the index blocking factor bfri (which is also the index fan-out fo); (ii) the
number of first-level index entries and the number of first-level index blocks; (iii) the number of
levels needed if we make it into a multilevel index; (iv) the total number of blocks required by the
multilevel index; and (v) the number of block accesses needed to search for and retrieve a record
from the file—given its Ssn value—using the primary index.
d. Suppose that the file is not ordered by the key field Ssn and we want to construct a secondary
index on Ssn. Repeat the previous exercise (part c) for the secondary index and compare with
the primary index.
e. Suppose that the file is not ordered by the nonkey field Department_code and we want to
construct a secondary index on Department_code, using option 3 of Section 17.1.3, with an extra
level of indirection that stores record pointers. Assume there are 1,000 distinct values of
Department_code and that the EMPLOYEE records are evenly distributed among these values.
Calculate (i) the index blocking factor bfri (which is also the index fan-out fo); (ii) the number of
blocks needed by the level of indirection that stores record pointers; (iii) the number of first-level
index entries and the number of first-level index blocks; (iv) the number of levels needed if we
make it into a multilevel index; (v) the total number of blocks required by the multilevel index and
the blocks used in the extra level of indirection; and (vi) the approximate number of block
accesses needed to search for and retrieve all records in the file that have a specific
Department_code value, using the index.
f. Suppose that the file is ordered by the nonkey field Department_code and we want to construct
a clustering index on Department_code that uses block anchors (every new value of
Department_code starts at the beginning of a new block). Assume there are 1,000 distinct values
of Department_code and that the EMPLOYEE records are evenly distributed among these
values. Calculate (i) the index blocking factor bfri (which is also the index fan-out fo); (ii) the
number of first-level index entries and the number of first-level index blocks; (iii) the number of
levels needed if we make it into a multilevel index; (iv) the total number of blocks required by the
multilevel index; and (v) the number of block accesses needed to search for and retrieve all
records in the file that have a specific Department_code value, using the clustering index
(assume that multiple blocks in a cluster are contiguous).
g. Suppose that the file is not ordered by the key field Ssn and we want to construct a B+-tree
access structure (index) on Ssn. Calculate (i) the orders p and pleaf of the B+-tree; (ii) the
number of leaf-level blocks needed if blocks are approximately 69% full (rounded up for
convenience); (iii) the number of levels needed if internal nodes are also 69% full (rounded up for
convenience); (iv) the total number of blocks required by the B+-tree; and (v) the number of block
accesses needed to search for and retrieve a record from the file (given its Ssn value) using
B+-tree.
h. Repeat part g, but for a B-tree rather than for a B+-tree. Compare your results for the B-tree
and for the B+-tree.
Step-by-step solution
Step 1 of 31
Disk operations on file using primary, secondary, clustering, B+ tree and B-tree methods
Record size R = Name (30) + Ssn (9) + Department_code (9) + Address (40) + Phone (10) + Birth_date (8) + Sex (1) + Job_code (4) + Salary (4) + deletion marker (1) = 116 bytes.
Step 2 of 31
Step 3 of 31
Blocking factor bfr = ⌊B/R⌋ = ⌊512/116⌋ = 4 records per block.
Number of file blocks b = ⌈r/bfr⌉ = ⌈30,000/4⌉ = 7,500 blocks.
Step 4 of 31
(i) An index entry consists of the key Ssn (9 bytes) and a block pointer (6 bytes), so the index blocking factor bfri = fo = ⌊512/15⌋ = 34 entries per block.
(ii) Number of first-level index entries = number of file blocks b = 7,500; number of first-level index blocks = ⌈7,500/34⌉ = 221.
Step 5 of 31
(iii) Number of second-level entries = 221; number of second-level blocks = ⌈221/34⌉ = 7, so the third level has 7 entries in ⌈7/34⌉ = 1 block.
It is the top index level because the third level has only one block.
Hence, the index has x = 3 levels.
(iv) Total number of index blocks = 221 + 7 + 1 = 229.
Step 6 of 31
(v) Calculation of the number of block accesses to search for and retrieve a record using the primary index on the file:
For a primary index, the number of block accesses is x + 1:
one block access at each index level plus one block from the data file.
Since the file is ordered on the single key field Ssn, this is a primary index, so the number of block accesses = 3 + 1 = 4.
Step 7 of 31
(d) Part (c) assumed that the leaf-level index blocks contain block pointers; for a secondary index on the nonordering key field Ssn, it is also possible to assume they contain record pointers. An index entry is then Ssn (9 bytes) + record pointer (7 bytes) = 16 bytes, giving ⌊512/16⌋ = 32 entries per first-level block.
For internal nodes, block pointers are always used, so the fan-out for internal nodes is 34.
Step 8 of 31
(ii) Number of first-level index entries = r = 30,000; number of first-level index blocks = ⌈30,000/32⌉ = 938.
Step 9 of 31
(iii) Number of second-level blocks = ⌈938/34⌉ = 28; third level = ⌈28/34⌉ = 1.
So the third level has one block and it is the top of the levels: x = 3 levels.
Step 10 of 31
(iv) Total number of index blocks = 938 + 28 + 1 = 967, compared with 229 for the primary index. (v) The number of block accesses = x + 1 = 4, the same as for the primary index, but the secondary index needs many more blocks.
Step 11 of 31
(e) A secondary index on the nonkey field Department_code, using an extra level of indirection that stores record pointers:
(i) An index entry consists of Department_code (9 bytes) and a block pointer (6 bytes), so bfri = fo = ⌊512/15⌋ = 34.
(ii) There are 30,000/1,000 = 30 records per Department_code value, and one block holds ⌊512/7⌋ = 73 record pointers, so each value needs ⌈30/73⌉ = 1 block of record pointers: the level of indirection needs 1,000 blocks.
Step 12 of 31
(iii) Number of first-level index entries = number of distinct Department_code values = 1,000; number of first-level index blocks = ⌈1,000/34⌉ = 30.
Step 13 of 31
(iv) We can calculate the number of levels from the number of second-level index entries:
Second-level entries = 30 (one per first-level block); second-level blocks = ⌈30/34⌉ = 1, which is the top index level, so x = 2 levels.
Step 14 of 31
(v) Total number of index blocks = 30 + 1 = 31, plus the 1,000 blocks used in the extra level of indirection.
Step 15 of 31
(vi) Calculation of the number of block accesses to search for and retrieve all records in the file for a given Department_code value:
x accesses for the index levels, plus one access for the block containing the record pointers at the level of indirection, plus about 30 accesses for the records themselves (the file is unordered, so each record is likely to be on a different block).
So the total block accesses needed on average to retrieve all records with a given value for Department_code = 2 + 1 + 30 = 33.
Step 16 of 31
(f) Operations on the file using a clustering index on Department_code with block anchors:
(i) An index entry consists of Department_code (9 bytes) and a block pointer (6 bytes), so bfri = fo = ⌊512/15⌋ = 34.
Step 17 of 31
(ii) Number of first-level index entries = number of distinct Department_code values = 1,000; number of first-level index blocks = ⌈1,000/34⌉ = 30.
Step 18 of 31
(iii) Number of second-level blocks = ⌈30/34⌉ = 1.
The second level has one block and it is the top index level, so x = 2 levels.
Step 19 of 31
(iv) Total number of index blocks = 30 + 1 = 31.
Step 20 of 31
(v) Calculation of the number of block accesses to search for and retrieve all records in the file for a given Department_code value:
There are 30 records per value; with bfr = 4 and each new Department_code value starting a new block, a cluster occupies ⌈30/4⌉ = 8 contiguous blocks.
So the total block accesses needed on average to retrieve all the records with a given Department_code = x + 8 = 2 + 8 = 10 (the index levels locate the first block in the cluster, then the 8 contiguous cluster blocks are read).
Step 21 of 31
(g)(i) For internal nodes of the B+-tree: p(6) + (p − 1)(9) ≤ 512, so 15p ≤ 521 and the order is p = 34.
For leaf nodes, the record pointers are included in the nodes, so pleaf must satisfy pleaf(9 + 7) + 6 ≤ 512, giving pleaf = 31 ⟨key, record pointer⟩ pairs per leaf.
Step 22 of 31
(ii) Nodes are 69% full on average, so the average number of key values in a leaf node is 0.69 × 31 ≈ 21.4.
If we round this up for convenience, we get 22 key values and 22 record pointers per leaf node.
The file has r = 30,000 records, and hence 30,000 values of Ssn, so the number of leaf-level nodes needed is ⌈30,000/22⌉ = 1,364.
Step 23 of 31
(iii) Calculate the number of levels: the average fan-out for the internal nodes is 0.69 × 34 ≈ 23.5, say 23.
Number of second-level blocks = ⌈1,364/23⌉ = 60; third level = ⌈60/23⌉ = 3; fourth level = ⌈3/23⌉ = 1.
So the fourth level has one block and the tree has 4 levels.
Step 24 of 31
(iv) Total number of blocks = 1,364 + 60 + 3 + 1 = 1,428.
Step 25 of 31
(v) Number of block accesses to search for and retrieve a record, given its Ssn value, using the B+-tree = number of levels + 1 = 4 + 1 = 5.
Step 26 of 31
(h)(i) In a B-tree, data pointers are stored in all nodes, so a node with p tree pointers also holds p − 1 keys and p − 1 record pointers: p(6) + (p − 1)(9 + 7) ≤ 512, so 22p ≤ 528 and the order is p = 24.
Step 27 of 31
(ii) Each node of the B-tree is 69% full, so the average number of key values in a node is about 21.39.
Taking the ceiling of 21.39 for convenience, we get 22 key values and 22 record pointers per node.
The file has 30,000 records, and hence 30,000 values of Ssn, so the number of leaf-level nodes needed is ⌈30,000/22⌉ = 1,364.
Step 28 of 31
(iii) Calculate the number of levels from the average fan-out for the internal nodes, as in part (g):
Number of second-level tree blocks = ⌈(number of leaf-level blocks)/fan-out⌉, and so on for the higher levels.
So the fourth level has one block, which is the top of the tree.
Step 29 of 31
Step 30 of 31
Step 31 of 31
At the root level, each B+-tree node will on average have 34 pointers and 33 (p − 1) search field values, whereas each B-tree node will on average have 23 pointers and 22 (p − 1) search field values.
For the given block size, pointer size, and search key field size, a three-level B+-tree holds 1,336,335 entries on average, while a B-tree of the same height holds 256,565 entries on average at the leaf level. Therefore, the average number of entries stored in a B+-tree is greater than the average number stored in a B-tree.
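The multilevel-index arithmetic in parts (c) through (f) can be checked with a small helper (a sketch; the struct and function names are illustrative):

```cpp
// Summary statistics for a static multilevel index.
struct IndexStats { int bfri; int firstLevelBlocks; int levels; int totalBlocks; };

// B = block size, entrySize = key + pointer size, firstLevelEntries =
// number of entries at the first index level.
IndexStats buildIndexStats(int B, int entrySize, int firstLevelEntries) {
    IndexStats s{};
    s.bfri = B / entrySize;                        // fan-out fo
    int blocks = firstLevelEntries, level = 0, total = 0, first = 0;
    do {
        blocks = (blocks + s.bfri - 1) / s.bfri;   // ceiling division
        if (++level == 1) first = blocks;
        total += blocks;
    } while (blocks > 1);
    s.firstLevelBlocks = first;
    s.levels = level;
    s.totalBlocks = total;
    return s;
}
```

For part (c), buildIndexStats(512, 15, 7500) gives a fan-out of 34, 221 first-level blocks, 3 levels, and 229 index blocks in total, matching the figures above.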
Chapter 17, Problem 19E
Problem
A PARTS file with Part# as the key field includes records with the following Part# values: 23, 65,
37, 60, 46, 92, 48, 71, 56, 59, 18, 21, 10, 74, 78, 15, 16, 20, 24, 28, 39,43,47, 50,69, 75, 8,49,
33, 38. Suppose that the search field values are inserted in the given order in a B+-tree of order
p = 4 and pleaf = 3; show how the tree will expand and what the final tree will look like.
Step-by-step solution
Step 1 of 34
B+ Tree Insertion:
• The Order implies that each node in the tree should have at most 4 pointers.
• Means the leaf nodes must have at least 2 keys and at most 3 keys.
• The insertion first start from the root, when root or any node overflows its capacity, it must split.
• When a leaf node is full the first elements will keep in that node and rest elements
should form the right node.
• The element at that rightmost position of the left partition will propagate up to the parent node.
• If the propagation is from the leaf node, a copy of the element should maintain at leaf. Else, just
move that element to its parent node.
• All the elements in the key list should be there in the leaf nodes.
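The leaf-split rule described above (for pleaf = 3, an overflowing leaf holds 4 keys) can be sketched as follows; the struct and container choices are illustrative:

```cpp
#include <vector>

// Split an overflowing sorted leaf: keep the first half in the left node,
// the rest in the right node, and propagate a copy of the last left key.
struct SplitResult { std::vector<int> left, right; int propagated; };

SplitResult splitLeaf(std::vector<int> keys) {
    SplitResult r;
    std::size_t half = (keys.size() + 1) / 2;          // first ceil(n/2) keys stay left
    r.left.assign(keys.begin(), keys.begin() + half);
    r.right.assign(keys.begin() + half, keys.end());
    r.propagated = r.left.back();                      // a copy stays in the leaf
    return r;
}
```

For example, splitting the overflowing leaf {46, 48, 56, 60} leaves {46, 48} and {56, 60} and propagates 48, exactly as in the trace below.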
Step 2 of 34
23, 65, 37, 60, 46, 92, 48, 71, 56, 59, 18, 21, 10, 74, 78, 15, 16, 20,24, 28, 39, 43, 47, 50, 69, 75,
8, 49, 33, 38.
First, insert the first three keys into the root; it will not result in overflow. Since, the capacity of the
node is also 3.
Step 3 of 34
Insert 60:
Inserting 60 into this node results in an overflow, so the node must be split into two and a new level is created, as below:
Step 4 of 34
Insert 46:
Insertion of 46 will not affect the capacity constraint of the second node in level 2.
Step 5 of 34
Insert 92:
Inserting the next key, 92, results in an overflow of the second node in level 2, which would become 46, 60, 65, 92.
• Therefore, we need to split that node at 60, create one new node in level 2, and place a copy of 60 in the parent node, as below:
Step 6 of 34
Step 7 of 34
Insert 48:
Inserting 48 does not cause any overflow; it is inserted into the second node in level 2, as below:
Step 8 of 34
Insert 71:
• It can insert into third node of level 2 without violating order constraints.
Level1: 37, 60
Step 9 of 34
Insert 56:
• Clearly 56 belongs in the second node of level 2, but inserting it results in an overflow, as shown below:
• The first two keys (46, 48) form the first node of the split and (56, 60) form the second; the last element of the first set (48) is propagated up.
Step 10 of 34
• Levels are counted from the root to the leaves; that is, the root has level 1 and the level number increases by 1 downwards.
downwards.
Insert 59:
Level 2: 23, 37: 46, 48: 56, 59, 60: 65, 71, 92
Comment
Step 11 of 34
Insert 18:
Level 2: 18, 23, 37: 46, 48: 56, 59, 60: 65, 71, 92
Comment
Step 12 of 34
Insert 21:
Level 2: 18, 21, 23, 37: 46, 48: 56, 59, 60: 65, 71, 92
Overflow. Split (18, 21, 23, 37) and propagate 21 to above level.
Level 2: 18, 21,: 23, 37: 46, 48: 56, 59, 60: 65, 71, 92
Again, overflow in level 1. Split and propagate 37, since it is not a leaf node so no need to take a
copy of 37. This will results a new level in the tree.
Level 1: 37
Level 3: 18, 21: 23, 37: 46, 48: 56, 59, 60: 65, 71, 92
Comment
Step 13 of 34
Insert 10:
Level 1: 37
Level 3: 10, 18, 21: 23, 37: 46, 48: 56, 59, 60: 65, 71, 92
Comment
Step 14 of 34
Insert 74:
Level 1: 37
Level 3: 10, 18, 21: 23, 37: 46, 48: 56, 59, 60: 65, 71, 74, 92
Level 1: 37
Level 3: 10, 18, 21: 23, 37: 46, 48: 56, 59, 60: 65, 71: 74, 92
Comment
Step 15 of 34
Insert 78:
Level 1: 37
Level 3: 10, 18, 21: 23, 37: 46, 48: 56, 59, 60: 65, 71: 74, 78, 92
Comment
Step 16 of 34
Insert 15:
Level 1: 37
Level 3: 10, 15, 18, 21: 23, 37: 46, 48: 56, 59, 60: 65, 71: 74, 78, 92
Level 1: 37
Comment
Step 17 of 34
Comment
Step 18 of 34
Insert 16:
Level 1: 37
Comment
Step 19 of 34
Insert 20:
Level 1: 37
Level 2: 15, 21: 48, 60, 71
Level 3: 10, 15 16, 18, 20, 21: 23, 37: 46, 48:
Level 1: 37
Level 3: 10, 15 16, 18: 20, 21: 23, 37: 46, 48:
Comment
Step 20 of 34
Insert 24:
Level 1: 37
Level 3: 10, 15 16, 18: 20, 21: 23, 24, 37: 46, 48:
Comment
Step 21 of 34
Insert 28:
Level 1: 37
Level 3: 10, 15 16, 18: 20, 21: 23, 24, 28, 37: 46, 48: 56, 59, 60: 65, 71: 74, 78, 92
Level 1: 37
Level 3: 10, 15 16, 18: 20, 21: 23, 24: 28, 37: 46, 48: 56, 59, 60: 65, 71: 74, 78, 92
Level 1: 18, 37
Level 3: 10, 15 16, 18: 20, 21: 23, 24: 28, 37: 46, 48: 56, 59, 60: 65, 71: 74, 78, 92
Comment
Step 22 of 34
Insert 39:
Level 1: 18, 37
Level 3: 10, 15 16, 18: 20, 21: 23, 24: 28, 37: 39, 46, 48: 56, 59, 60: 65, 71: 74, 78, 92
Comment
Step 23 of 34
Insert 43:
Level 1: 18, 37
Level 3: 10, 15 16, 18: 20, 21: 23, 24: 28, 37: 39, 43, 46, 48: 56, 59, 60: 65, 71: 74, 78, 92
Over flow at the inserted node, so split that node at second element 43 as below.
Level 1: 18, 37
Comment
Step 24 of 34
Level 3: 10, 15 16, 18: 20, 21: 23, 24: 28, 37: 39, 43: , 46, 48: 56, 59, 60: 65, 71: 74, 78, 92
Level 3: 10, 15 16, 18: 20, 21: 23, 24: 28, 37: 39, 43: , 46, 48: 56, 59, 60: 65, 71: 74, 78, 92
Comment
Step 25 of 34
Insert 47:
Level 3: 10, 15 16, 18: 20, 21: 23, 24: 28, 37: 39, 43: 46, 47, 48: 56, 59, 60: 65, 71: 74, 78, 92
Comment
Step 26 of 34
Insert 50:
Level 3: 10, 15 16, 18: 20, 21: 23, 24: 28, 37: 39, 43: 46, 47, 48: 50, 56, 59, 60: 65, 71:
74, 78, 92
Overflow at the inserted node. Split the node at 56, the second element and propagate it up as
below.
Level 3: 10, 15 16, 18: 20, 21: 23, 24: 28, 37: 39, 43: 46, 47, 48: 50, 56: 59, 60: 65, 71: 74, 78,
92
Comment
Step 27 of 34
Insert 69:
Level 3: 10, 15 16, 18: 20, 21: 23, 24: 28, 37: 39, 43: 46, 47, 48: 50, 56: 59, 60: 65, 69, 71: 74,
78, 92
Comment
Step 28 of 34
Insert 75:
Level 3: 10, 15 16, 18: 20, 21: 23, 24: 28, 37: 39, 43: 46, 47, 48: 50, 56: 59, 60: 65, 69, 71: 74,
75 78, 92
Overflow at the inserted node, split and propagate up the node at the second element.
Level 3: 10, 15 16, 18: 20, 21: 23, 24: 28, 37: 39, 43: 46, 47, 48: 50, 56: 59, 60: 65, 69, 71: 74,
75: 78, 92
Level 3: 10, 15 16, 18: 20, 21: 23, 24: 28, 37: 39, 43: 46, 47, 48: 50, 56: 59, 60: 65, 69, 71: 74,
75: 78, 92
Again overflow at the inserted node of 60. Split it at 37 and propagate 37 into a new level.
Level 1: 37
Level 4: 10, 15 16, 18: 20, 21: 23, 24: 28, 37: 39, 43: 46, 47, 48: 50, 56: 59, 60: 65, 69, 71: 74,
75: 78, 92
Comment
Step 29 of 34
Insert 8:
Level 1: 37
Comment
Step 30 of 34
Insert 49:
Level 1: 37
28, 37: 39, 43: 46, 47, 48: 49, 50, 56: 59, 60: 65, 69, 71: 74, 75: 78, 92
Comment
Step 31 of 34
Insert 33:
Level 1: 37
28, 33, 37: 39, 43: 46, 47, 48: 49, 50, 56: 59, 60: 65, 69, 71: 74, 75: 78, 92
Comment
Step 32 of 34
Insert 38:
Level 1: 37
28, 33, 37: 38, 39, 43: 46, 47, 48: 49, 50, 56: 59, 60: 65, 69, 71: 74, 75: 78, 92
Comment
Step 33 of 34
The tree after the insertion of the last key 38 will give us the final B+ tree.
• From each node except the leaf nodes, a left pointer is there to the child nodes in which left
pointer points to node having keys less than that parent node and right pointer points to the node
having key values larger than that parent node.
• Each set in the above tree levels will form a node and set elements are the keys present in that
node.
Comment
Step 34 of 34
Graphically the final tree after the insertion of keys will look as below:
Comment
Chapter 17, Problem 19E
Problem
A PARTS file with Part# as the key field includes records with the following Part# values: 23, 65, 37, 60, 46, 92, 48, 71, 56, 59, 18, 21, 10, 74, 78, 15, 16, 20, 24, 28, 39, 43, 47, 50, 69, 75, 8, 49, 33, 38. Suppose that the search field values are inserted in the given order in a B+-tree of order p = 4 and pleaf = 3; show how the tree will expand and what the final tree will look like.
Step-by-step solution
Step 1 of 34
B+ Tree Insertion:
• The order p = 4 implies that each node in the tree has at most 4 pointers.
• This means each leaf node must have at least 2 and at most 3 keys.
• Insertion starts from the root; when the root or any node overflows its capacity, it must be split.
• When a leaf node overflows, the first half of the keys stay in that node and the remaining keys form a new right node.
• The rightmost key of the left partition is propagated up to the parent node.
• If the propagation is from a leaf node, a copy of the key is kept in the leaf; otherwise, the key is simply moved up to the parent node.
• Every key in the key list appears in the leaf nodes.
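The leaf split rule in the bullets above can be sketched as a small helper (a minimal model, assuming a leaf holds at most pleaf = 3 keys):

```python
import math

P_LEAF = 3  # a leaf holds at most 3 keys (pleaf = 3)

def insert_into_leaf(keys, key):
    """Insert `key` into a sorted leaf; split on overflow.

    Returns (left, right, separator). `right` and `separator` are
    None when no split is needed; on a split, a copy of the separator
    (the rightmost key of the left half) is propagated to the parent.
    """
    keys = sorted(keys + [key])
    if len(keys) <= P_LEAF:
        return keys, None, None
    j = math.ceil((P_LEAF + 1) / 2)      # keys kept in the left node
    return keys[:j], keys[j:], keys[j - 1]

# First split of the walkthrough: inserting 60 into the root leaf
left, right, sep = insert_into_leaf([23, 65, 37], 60)
print(left, right, sep)   # [23, 37] [60, 65] 37
```

The same helper reproduces the later splits, e.g. inserting 56 into (46, 48, 60) yields (46, 48) and (56, 60) with 48 propagated up.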
Step 2 of 34
The keys to insert, in order: 23, 65, 37, 60, 46, 92, 48, 71, 56, 59, 18, 21, 10, 74, 78, 15, 16, 20, 24, 28, 39, 43, 47, 50, 69, 75, 8, 49, 33, 38.
First, insert the first three keys (23, 65, 37) into the root; this does not result in an overflow, since the capacity of the node is 3.
Step 3 of 34
Insert 60:
Inserting 60 into this node results in an overflow, so the node is split into two and a new level is created as below:
Step 4 of 34
Insert 46:
Inserting 46 does not violate the capacity constraint of the second node in level 2.
Step 5 of 34
Insert 92:
Inserting the next key, 92, results in an overflow of the second node in level 2, which would become 46, 60, 65, 92.
• Therefore, we split that node at 60, create a new node in level 2, and duplicate 60 in the parent node as below:
Step 7 of 34
Insert 48:
Inserting 48 does not cause any overflow; it is inserted into the second node in level 2 as below:
Step 8 of 34
Insert 71:
• 71 can be inserted into the third node of level 2 without violating the order constraints.
Level 1: 37, 60
Step 9 of 34
Insert 56:
• Clearly 56 belongs to the second node of level 2, but inserting it results in an overflow as shown below:
• The first two keys (46, 48) form the first node of the split and (56, 60) form the second; the last element of the first set (48) is propagated up.
Step 10 of 34
• Levels are counted from the root to the leaves; that is, the root has level 1, and the level number increases by 1 going downward.
Insert 59:
Level 2: 23, 37: 46, 48: 56, 59, 60: 65, 71, 92
Step 11 of 34
Insert 18:
Level 2: 18, 23, 37: 46, 48: 56, 59, 60: 65, 71, 92
Step 12 of 34
Insert 21:
Level 2: 18, 21, 23, 37: 46, 48: 56, 59, 60: 65, 71, 92
Overflow. Split (18, 21, 23, 37) and propagate 21 to the level above.
Level 2: 18, 21: 23, 37: 46, 48: 56, 59, 60: 65, 71, 92
Again, this causes an overflow in level 1. Split and move 37 up; since it is not a leaf node, there is no need to keep a copy of 37. This results in a new level in the tree.
Level 1: 37
Level 3: 18, 21: 23, 37: 46, 48: 56, 59, 60: 65, 71, 92
Step 13 of 34
Insert 10:
Level 1: 37
Level 3: 10, 18, 21: 23, 37: 46, 48: 56, 59, 60: 65, 71, 92
Step 14 of 34
Insert 74:
Level 1: 37
Level 3: 10, 18, 21: 23, 37: 46, 48: 56, 59, 60: 65, 71, 74, 92
Overflow at the last node; split it and propagate 71 up.
Level 1: 37
Level 3: 10, 18, 21: 23, 37: 46, 48: 56, 59, 60: 65, 71: 74, 92
Step 15 of 34
Insert 78:
Level 1: 37
Level 3: 10, 18, 21: 23, 37: 46, 48: 56, 59, 60: 65, 71: 74, 78, 92
Step 16 of 34
Insert 15:
Level 1: 37
Level 3: 10, 15, 18, 21: 23, 37: 46, 48: 56, 59, 60: 65, 71: 74, 78, 92
Overflow at the first node; split it into (10, 15) and (18, 21) and propagate 15 up.
Level 1: 37
Level 3: 10, 15: 18, 21: 23, 37: 46, 48: 56, 59, 60: 65, 71: 74, 78, 92
Step 18 of 34
Insert 16:
Since 16 is greater than the separator 15, it goes into the node (18, 21) without causing an overflow.
Level 1: 37
Level 3: 10, 15: 16, 18, 21: 23, 37: 46, 48: 56, 59, 60: 65, 71: 74, 78, 92
Step 19 of 34
Insert 20:
Level 1: 37
Level 2: 15, 21: 48, 60, 71
Level 3: 10, 15: 16, 18, 20, 21: 23, 37: 46, 48: 56, 59, 60: 65, 71: 74, 78, 92
Overflow; split (16, 18, 20, 21) into (16, 18) and (20, 21) and propagate 18 up.
Level 1: 37
Level 3: 10, 15: 16, 18: 20, 21: 23, 37: 46, 48: 56, 59, 60: 65, 71: 74, 78, 92
Step 20 of 34
Insert 24:
Level 1: 37
Level 3: 10, 15: 16, 18: 20, 21: 23, 24, 37: 46, 48: 56, 59, 60: 65, 71: 74, 78, 92
Step 21 of 34
Insert 28:
Level 1: 37
Level 3: 10, 15: 16, 18: 20, 21: 23, 24, 28, 37: 46, 48: 56, 59, 60: 65, 71: 74, 78, 92
Overflow; split (23, 24, 28, 37) into (23, 24) and (28, 37) and propagate 24 up.
Level 1: 37
Level 3: 10, 15: 16, 18: 20, 21: 23, 24: 28, 37: 46, 48: 56, 59, 60: 65, 71: 74, 78, 92
The left internal node now overflows in turn; splitting it moves 18 up into the root.
Level 1: 18, 37
Level 3: 10, 15: 16, 18: 20, 21: 23, 24: 28, 37: 46, 48: 56, 59, 60: 65, 71: 74, 78, 92
Step 22 of 34
Insert 39:
Level 1: 18, 37
Level 3: 10, 15: 16, 18: 20, 21: 23, 24: 28, 37: 39, 46, 48: 56, 59, 60: 65, 71: 74, 78, 92
Step 23 of 34
Insert 43:
Level 1: 18, 37
Level 3: 10, 15: 16, 18: 20, 21: 23, 24: 28, 37: 39, 43, 46, 48: 56, 59, 60: 65, 71: 74, 78, 92
Overflow at the inserted node, so split that node at the second element, 43, as below.
Level 1: 18, 37
Step 24 of 34
Level 3: 10, 15: 16, 18: 20, 21: 23, 24: 28, 37: 39, 43: 46, 48: 56, 59, 60: 65, 71: 74, 78, 92
Step 25 of 34
Insert 47:
Level 3: 10, 15: 16, 18: 20, 21: 23, 24: 28, 37: 39, 43: 46, 47, 48: 56, 59, 60: 65, 71: 74, 78, 92
Step 26 of 34
Insert 50:
Level 3: 10, 15: 16, 18: 20, 21: 23, 24: 28, 37: 39, 43: 46, 47, 48: 50, 56, 59, 60: 65, 71: 74, 78, 92
Overflow at the inserted node. Split the node at 56, the second element, and propagate it up as below.
Level 3: 10, 15: 16, 18: 20, 21: 23, 24: 28, 37: 39, 43: 46, 47, 48: 50, 56: 59, 60: 65, 71: 74, 78, 92
Step 27 of 34
Insert 69:
Level 3: 10, 15: 16, 18: 20, 21: 23, 24: 28, 37: 39, 43: 46, 47, 48: 50, 56: 59, 60: 65, 69, 71: 74, 78, 92
Step 28 of 34
Insert 75:
Level 3: 10, 15: 16, 18: 20, 21: 23, 24: 28, 37: 39, 43: 46, 47, 48: 50, 56: 59, 60: 65, 69, 71: 74, 75, 78, 92
Overflow at the inserted node; split it at the second element, 75, and propagate 75 up.
Level 3: 10, 15: 16, 18: 20, 21: 23, 24: 28, 37: 39, 43: 46, 47, 48: 50, 56: 59, 60: 65, 69, 71: 74, 75: 78, 92
The propagation overflows the internal nodes in turn; finally the root splits at 37, and 37 moves up into a new root, adding a new level to the tree. The leaves are now at level 4.
Level 1: 37
Level 4: 10, 15: 16, 18: 20, 21: 23, 24: 28, 37: 39, 43: 46, 47, 48: 50, 56: 59, 60: 65, 69, 71: 74, 75: 78, 92
Step 29 of 34
Insert 8:
Level 1: 37
Level 4: 8, 10, 15: 16, 18: 20, 21: 23, 24: 28, 37: 39, 43: 46, 47, 48: 50, 56: 59, 60: 65, 69, 71: 74, 75: 78, 92
Step 30 of 34
Insert 49:
Level 1: 37
Level 4: 8, 10, 15: 16, 18: 20, 21: 23, 24: 28, 37: 39, 43: 46, 47, 48: 49, 50, 56: 59, 60: 65, 69, 71: 74, 75: 78, 92
Step 31 of 34
Insert 33:
Level 1: 37
Level 4: 8, 10, 15: 16, 18: 20, 21: 23, 24: 28, 33, 37: 39, 43: 46, 47, 48: 49, 50, 56: 59, 60: 65, 69, 71: 74, 75: 78, 92
Step 32 of 34
Insert 38:
Level 1: 37
Level 4: 8, 10, 15: 16, 18: 20, 21: 23, 24: 28, 33, 37: 38, 39, 43: 46, 47, 48: 49, 50, 56: 59, 60: 65, 69, 71: 74, 75: 78, 92
Step 33 of 34
Inserting the last key, 38, gives the final B+-tree.
• Each non-leaf node has child pointers: the pointer to the left of a key leads to the subtree with keys less than or equal to that key, and the pointer to its right leads to the subtree with larger key values.
• Each colon-separated group in the level listings above forms a node, and the group's elements are the keys present in that node.
Step 34 of 34
Graphically the final tree after the insertion of keys will look as below:
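The leaf level of the final tree can be cross-checked with a short simulation (a simplified model that tracks only the leaf nodes; for this insert-only sequence, routing each key to the first leaf whose largest key is greater than or equal to it is equivalent to descending through the internal nodes):

```python
import math

KEYS = [23, 65, 37, 60, 46, 92, 48, 71, 56, 59, 18, 21, 10, 74, 78,
        15, 16, 20, 24, 28, 39, 43, 47, 50, 69, 75, 8, 49, 33, 38]
P_LEAF = 3

def leaf_level(keys):
    """Track only the leaf level of the B+-tree during insertion:
    route each key to the first leaf whose largest key is >= it,
    splitting an overflowing leaf so ceil((pleaf+1)/2) keys stay left."""
    leaves = [[]]
    for k in keys:
        i = next((j for j, leaf in enumerate(leaves)
                  if leaf and leaf[-1] >= k), len(leaves) - 1)
        leaves[i] = sorted(leaves[i] + [k])
        if len(leaves[i]) > P_LEAF:
            j = math.ceil((P_LEAF + 1) / 2)
            leaves[i:i + 1] = [leaves[i][:j], leaves[i][j:]]
    return leaves

print(leaf_level(KEYS))
# [[8, 10, 15], [16, 18], [20, 21], [23, 24], [28, 33, 37], [38, 39, 43],
#  [46, 47, 48], [49, 50, 56], [59, 60], [65, 69, 71], [74, 75], [78, 92]]
```

The simulated leaves agree with the Level 4 listing obtained after inserting 38.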
Chapter 17, Problem 20E
Problem
Exercise
A PARTS file with Part# as the key field includes records with the following Part# values: 23, 65, 37, 60, 46, 92, 48, 71, 56, 59, 18, 21, 10, 74, 78, 15, 16, 20, 24, 28, 39, 43, 47, 50, 69, 75, 8, 49, 33, 38. Suppose that the search field values are inserted in the given order in a B+-tree of order p = 4 and pleaf = 3; show how the tree will expand and what the final tree will look like.
Step-by-step solution
Step 1 of 1
This problem is identical to Problem 19E; the tree expands in the same way, and the final B+-tree is the same as in that solution.
Chapter 17, Problem 21E
Problem
Suppose that the following search field values are deleted, in the given order, from the B+-tree of
Exercise; show how the tree will shrink and show the final tree. The deleted values are 65, 75,
43, 18, 20, 92, 59, 37.
Exercise
A PARTS file with Part# as the key field includes records with the following Part# values: 23, 65, 37, 60, 46, 92, 48, 71, 56, 59, 18, 21, 10, 74, 78, 15, 16, 20, 24, 28, 39, 43, 47, 50, 69, 75, 8, 49, 33, 38. Suppose that the search field values are inserted in the given order in a B+-tree of order p = 4 and pleaf = 3; show how the tree will expand and what the final tree will look like.
Step-by-step solution
Step 1 of 10
In the B+-tree deletion algorithm, the deletion of a key value from a leaf node is handled as follows:
(1) If the key value is not the rightmost value in its leaf node, it is simply removed from the leaf; no internal node is affected.
Step 2 of 10
(2) If the key value deleted is the rightmost value in its leaf node, then its value also appears in an internal node. In this case, the key value to its left in the leaf node replaces the deleted key value in the internal node.
From the data, deleting 65 affects only a leaf node.
Deleting 75 will cause a leaf node to become less than half full, so it is combined with the next node, and 75 is also removed from the internal node.
Step 4 of 10
Deleting 43 causes a leaf node to become less than half full, so it is combined with the next node. The combined node has 3 entries; its rightmost entry, 48, can replace 43 in the internal node.
Step 6 of 10
In the next step we delete 18; it is the rightmost entry in a leaf node and so also appears in an internal node of the B+-tree. Now the leaf node is less than half full and is combined with the next node.
The value 18 must also be removed from the internal node, causing an underflow in that internal node. One approach for dealing with an underflowing internal node is to reorganize its values with its child nodes, so 21 is moved up into the underflowing node, leading to the following tree.
Step 8 of 10
Deleting 59 causes an underflow, and the remaining value, 60, is combined with the next leaf node. Hence 60 is no longer the rightmost entry in a leaf node. This is normally handled by moving 56 up to replace 60 in the internal node; but since this leads to an underflow in the node that used to contain 56, the nodes can be reorganized as follows.
Step 10 of 10
Finally, removing 37 causes a serious underflow, leading to a reorganization of the whole tree. One approach to deleting the value in the root node is to use the rightmost value in the next leaf node to replace the root and move this leaf node into the left subtree. In this case the resulting tree may look as follows.
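The combine-on-underflow behavior used throughout this deletion sequence can be sketched as follows (a simplified leaf-level model; parent-node updates are omitted):

```python
def delete_from_leaf(leaves, i, key, min_keys=2):
    """Delete `key` from leaf i; if the leaf drops below min_keys
    (less than half full for pleaf = 3), combine it with the next leaf."""
    leaves[i].remove(key)
    if len(leaves[i]) < min_keys and i + 1 < len(leaves):
        merged = sorted(leaves[i] + leaves[i + 1])
        leaves[i:i + 2] = [merged]     # replace the two leaves by one
    return leaves

# Deleting 75 from the leaf (74, 75) leaves it less than half full,
# so it is combined with the next leaf (78, 92):
print(delete_from_leaf([[74, 75], [78, 92]], 0, 75))  # [[74, 78, 92]]
```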
Chapter 17, Problem 22E
Problem
Exercise 1
Suppose that the following search field values are deleted, in the given order, from the B+-tree of
Exercise 2; show how the tree will shrink and show the final tree. The deleted values are 65, 75,
43, 18, 20, 92, 59, 37.
Exercise 2
A PARTS file with Part# as the key field includes records with the following Part# values: 23, 65, 37, 60, 46, 92, 48, 71, 56, 59, 18, 21, 10, 74, 78, 15, 16, 20, 24, 28, 39, 43, 47, 50, 69, 75, 8, 49, 33, 38. Suppose that the search field values are inserted in the given order in a B+-tree of order p = 4 and pleaf = 3; show how the tree will expand and what the final tree will look like.
Step-by-step solution
Step 1 of 1
This problem is identical to Problem 21E; the tree shrinks in the same way, and the final tree is the same as in that solution.
Chapter 17, Problem 23E
Problem
Algorithm 17.1 outlines the procedure for searching a nondense multilevel primary index to
retrieve a file record. Adapt the algorithm for each of the following cases:
a. A multilevel secondary index on a nonkey nonordering field of a file. Assume that option 3 of
Section 17.1.3 is used, where an extra level of indirection stores pointers to the individual records
with the corresponding index field value.
Step-by-step solution
Chapter 17, Problem 24E
Problem
Suppose that several secondary indexes exist on nonkey fields of a file, implemented using
option 3 of Section 17.1.3; for example, we could have secondary indexes on the fields
Department_code, Job_code, and Salary of the EMPLOYEE file of Exercise. Describe an
efficient way to search for and retrieve records satisfying a complex selection condition on these
fields, such as (Department_code = 5 AND Job_code = 12 AND Salary = 50,000), using the
record pointers in the indirection level.
Exercise
Consider a disk with block size B = 512 bytes. A block pointer is P = 6 bytes long, and a record
pointer is PR = 7 bytes long. A file has r = 30,000 EMPLOYEE records of fixed length. Each
record has the following fields: Name (30 bytes), Ssn (9 bytes), Department_code (9 bytes),
Address (40 bytes), Phone (10 bytes), Birth_date (8 bytes), Sex (1 byte), Job_code (4 bytes),
and Salary (4 bytes, real number). An additional byte is used as a deletion marker.
b. Calculate the blocking factor bfr and the number of file blocks b, assuming an unspanned
organization.
c. Suppose that the file is ordered by the key field Ssn and we want to construct a primary index
on Ssn. Calculate (i) the index blocking factor bfri (which is also the index fan-out fo); (ii) the
number of first-level index entries and the number of first-level index blocks; (iii) the number of
levels needed if we make it into a multilevel index; (iv) the total number of blocks required by the
multilevel index; and (v) the number of block accesses needed to search for and retrieve a record
from the file—given its Ssn value—using the primary index.
d. Suppose that the file is not ordered by the key field Ssn and we want to construct a secondary
index on Ssn. Repeat the previous exercise (part c) for the secondary index and compare with
the primary index.
e. Suppose that the file is not ordered by the nonkey field Department_code and we want to
construct a secondary index on Department_code, using option 3 of Section 17.1.3, with an extra
level of indirection that stores record pointers. Assume there are 1,000 distinct values of
Department_code and that the EMPLOYEE records are evenly distributed among these values.
Calculate (i) the index blocking factor bfri (which is also the index fan-out fo); (ii) the number of
blocks needed by the level of indirection that stores record pointers; (iii) the number of first-level
index entries and the number of first-level index blocks; (iv) the number of levels needed if we
make it into a multilevel index; (v) the total number of blocks required by the multilevel index and
the blocks used in the extra level of indirection; and (vi) the approximate number of block
accesses needed to search for and retrieve all records in the file that have a specific
Department_code value, using the index.
f. Suppose that the file is ordered by the nonkey field Department_code and we want to construct
a clustering index on Department_code that uses block anchors (every new value of
Department_code starts at the beginning of a new block). Assume there are 1,000 distinct values
of Department_code and that the EMPLOYEE records are evenly distributed among these
values. Calculate (i) the index blocking factor bfri (which is also the index fan-out fo); (ii) the
number of first-level index entries and the number of first-level index blocks; (iii) the number of
levels needed if we make it into a multilevel index; (iv) the total number of blocks required by the
multilevel index; and (v) the number of block accesses needed to search for and retrieve all
records in the file that have a specific Department_code value, using the clustering index
(assume that multiple blocks in a cluster are contiguous).
g. Suppose that the file is not ordered by the key field Ssn and we want to construct a B+-tree
access structure (index) on Ssn. Calculate (i) the orders p and pleaf of the B+-tree; (ii) the
number of leaf-level blocks needed if blocks are approximately 69% full (rounded up for
convenience); (iii) the number of levels needed if internal nodes are also 69% full (rounded up for
convenience); (iv) the total number of blocks required by the B+-tree; and (v) the number of block
accesses needed to search for and retrieve a record from the file, given its Ssn value, using the
B+-tree.
h. Repeat part g, but for a B-tree rather than for a B+-tree. Compare your results for the B-tree
and for the B+-tree.
Step-by-step solution
Step 1 of 2
The EMPLOYEE file contains the fields Name, Ssn, Department_code, Address, Phone,
Birth_date, Sex, Job_code, Salary.
Consider that the secondary indexes are maintained on the fields Department_code, Job_code
and Salary. The fields Department_code, Job_code and Salary are non-key fields.
Step 2 of 2
The steps to retrieve records satisfying the complex selection condition (Department_code = 5 AND Job_code = 12 AND Salary = 50,000), using the record pointers in the indirection level, are as follows:
1. First, retrieve the record pointers of the records that satisfy the condition Department_code = 5, using the secondary index on Department_code.
2. Then, among the record pointers retrieved in step 1, keep those whose records satisfy the condition Job_code = 12, using the secondary index on Job_code.
3. Then, among the record pointers retrieved in step 2, keep those whose records satisfy the condition Salary = 50,000, using the secondary index on Salary.
4. Finally, fetch only the records whose pointers survived all three steps; this intersects the three pointer sets without ever scanning the whole file.
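The intersection strategy in the steps above can be sketched with sets of record pointers (the index contents and (block, slot) pointer values here are illustrative, not from the text):

```python
# Hypothetical secondary indexes: each maps a field value to the set
# of record pointers (block#, slot#) stored in the indirection level.
dept_index = {5: {(1, 0), (2, 3), (7, 1)}, 6: {(3, 2)}}
job_index = {12: {(2, 3), (7, 1), (9, 0)}, 13: {(1, 0)}}
salary_index = {50_000: {(7, 1), (2, 3), (4, 4)}}

def select_conjunction(dept, job, salary):
    """Intersect the pointer sets instead of scanning the file."""
    ptrs = dept_index.get(dept, set())
    ptrs = ptrs & job_index.get(job, set())       # keep pointers in both
    ptrs = ptrs & salary_index.get(salary, set())
    return ptrs                                   # only these records are fetched

print(sorted(select_conjunction(5, 12, 50_000)))  # [(2, 3), (7, 1)]
```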
Chapter 17, Problem 25E
Problem
Adapt Algorithms 17.2 and 17.3, which outline search and insertion procedures for a B+-tree, to a
B-tree.
Step-by-step solution
Step 1 of 2
n <- block containing root node of B-tree;
read block n;
while (n is not a leaf node of the tree) do
begin
if (n.Ki == K for some i) (* n.Ki refers to the ith search field value in node n *)
then exit (* K is found in an internal node; follow its record pointer *)
else if K < n.K1
then n <- n.P1 (* n.Pi refers to the ith tree pointer in node n *)
else n <- n.Pi, where n.Ki-1 < K < n.Ki (n <- n.Pq if K > n.Kq-1);
read block n
end;
if (n.Ki == K for some i in leaf node n)
then follow the record pointer for K to retrieve the record
else the search value K does not exist (* if we reach a leaf without finding K, the value is not in the file *);
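The same adaptation can also be sketched in runnable form (the node layout and the example tree below are illustrative; minimum-occupancy constraints are not enforced):

```python
class Node:
    def __init__(self, keys, children=None):
        self.keys = keys                  # sorted search field values
        self.children = children or []   # empty for leaf nodes

def btree_search(n, key):
    """B-tree search: unlike in a B+-tree, the key may be found
    in an internal node, ending the search early."""
    while n is not None:
        if key in n.keys:                # found at this level
            return True
        if not n.children:               # leaf reached without a match
            return False
        # follow the ith pointer, where K(i-1) < key < Ki
        i = sum(1 for k in n.keys if k < key)
        n = n.children[i]
    return False

root = Node([37], [Node([21], [Node([10, 15, 18]), Node([23, 30])]),
                   Node([60], [Node([40, 48]), Node([65, 92])])])
print(btree_search(root, 48), btree_search(root, 49))  # True False
```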
Step 2 of 2
(* insertion of a value K into a B-tree of order p, adapted from the B+-tree algorithm *)
n <- block containing root node of B-tree;
read block n;
while (n is not a leaf node of the tree) do
begin
if (n.Ki == K for some i) then exit (* K already exists in the tree *)
push address of n on stack S;
if K < n.K1
then n <- n.P1
else n <- n.Pi, where n.Ki-1 < K < n.Ki;
read block n
end;
if (K is found in leaf node n)
then exit (* K already exists *)
else if (n is not full)
then insert K and its record pointer into node n in key order
else begin (* split node n *)
copy n to temp (* temp can hold one extra entry *);
insert K and its record pointer into temp in key order;
j <- ceiling(p/2);
new <- a new empty node;
move the entries after temp.Kj to new;
n <- the entries before temp.Kj;
K <- temp.Kj (* the middle key moves up; unlike the B+-tree, no copy of it is kept below *);
finished <- false;
repeat
if stack S is empty
then begin (* the root was split *)
create a new root containing K and the pointers to n and new;
finished <- true
end
else begin
pop stack S into m; read block m;
if (m is not full)
then begin insert K and a pointer to new into m in key order; finished <- true end
else begin (* split internal node m in the same way *)
copy m to temp; insert K and a pointer to new into temp;
j <- ceiling(p/2); new <- a new node containing the entries after temp.Kj;
m <- the entries before temp.Kj; K <- temp.Kj; n <- m
end
end
until finished;
end;
Chapter 17, Problem 26E
Problem
It is possible to modify the B+-tree insertion algorithm to delay the case where a new level is
produced by checking for a possible redistribution of values among the leaf nodes. Figure 17.17
illustrates how this could be done for our example in Figure 17.12; rather than splitting the
leftmost leaf node when 12 is inserted, we do a left redistribution by moving 7 to the leaf node to
its left (if there is space in this node). Figure 17.17 shows how the tree would look when
redistribution is considered. It is also possible to consider right redistribution. Try to modify the
B+-tree insertion algorithm to take redistribution into account.
Step-by-step solution
Step 1 of 1
Refer to figure 17.17 for the redistribution of the values among the leaf nodes at a new level. The
figure shows inserting the values 12, 9 and 6. In the figure, value 12 is inserted into the leaf node
by moving 7 to its left leaf node through left redistribution.
The values 12, 9 and 6 can be distributed among the leaf nodes, at a new level, using right
redistribution as follows:
• When a new value is inserted in a leaf node, the tree is divided into leaf nodes and internal nodes. Every value that appears in an internal node also appears as the rightmost value of some node at the leaf level, and the tree pointer to the left of this value in the internal node leads to the subtree containing it.
• If a new value needs to be inserted in a leaf node and the leaf node is full, then it is split. The first j = ceiling((pleaf + 1)/2) values, where pleaf denotes the order of the leaf nodes, are retained in the original node, and the rest of the values are moved to a new leaf node. A duplicate of the jth search value is retained at the parent node, together with a pointer to the new node.
• This new entry is inserted in the parent node. If the parent node is full, then it too is split. The jth search value is moved to the parent, and the entries up to Pj are kept in the original internal node, where Pj is the jth tree pointer and j = floor((p + 1)/2).
• The entries from Pj+1 up to the last entry present in the node are kept in the new internal node.
The splitting of parent node and leaf nodes continues in this way and results in new level for the
tree.
The modified tree insertion algorithm based on the right redistribution is as follows:
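A redistribution check of this kind can be sketched as follows (this sketch implements the left redistribution of Figure 17.17; the right-redistribution case is symmetric, and the leaf/separator layout is an assumption):

```python
import math

def insert_with_redistribution(leaves, seps, i, key, cap=3):
    """Insert `key` into leaf i of a B+-tree leaf level.
    On overflow, first try LEFT redistribution: move the smallest key
    to the left sibling (if it has room) and update the separator,
    instead of splitting. `seps[i-1]` separates leaves i-1 and i."""
    leaf = sorted(leaves[i] + [key])
    if len(leaf) <= cap:
        leaves[i] = leaf
        return "no-split"
    left = leaves[i - 1] if i > 0 else None
    if left is not None and len(left) < cap:
        moved, leaf = leaf[0], leaf[1:]   # shift the smallest key left
        left.append(moved)
        leaves[i] = leaf
        seps[i - 1] = moved               # separator becomes the moved key
        return "redistributed"
    j = math.ceil((cap + 1) / 2)          # fall back to the usual split
    leaves[i:i + 1] = [leaf[:j], leaf[j:]]
    seps.insert(i, leaf[j - 1])           # copy of the split key goes up
    return "split"

leaves, seps = [[3, 5], [7, 8, 12]], [5]
print(insert_with_redistribution(leaves, seps, 1, 9), leaves, seps)
# redistributed [[3, 5, 7], [8, 9, 12]] [7]
```

When the left sibling is already full, the function falls back to the ordinary split-and-propagate behavior.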
Chapter 17, Problem 27E
Problem
Step-by-step solution
Step 1 of 1
n <- block containing root node of B+-tree;
read block n;
while (n is not a leaf node of the tree) do
begin
if K <= n.K1 (* n.Ki refers to the ith search field value in node n *)
then n <- n.P1 (* n.Pi refers to the ith tree pointer in node n *)
else n <- n.Pi, where n.Ki-1 < K <= n.Ki;
read block n
end;
if (no entry in leaf node n has n.Ki == K)
then the value K does not exist; exit (* nothing to delete *)
else begin
delete the entry for K from leaf node n;
if K also appears in an internal (ancestor) node
then replace it there with the new rightmost value of n;
if n is now less than half full
then redistribute entries with, or merge n into, a sibling node
(and update or remove the corresponding entry in the parent)
end;
Chapter 17, Problem 28E
Problem
Exercise
Step-by-step solution
Step 1 of 2
// x is the root of the subtree and k is the key which is to be deleted.
// If k is deleted successfully, then B-Tree-Delete returns true; otherwise it returns false.
Note: this procedure is designed so that whenever it is called recursively on a node, that node has at least t keys, where t is the minimum degree of the B-tree.
If x is a leaf, then
if k is in x, simply delete k from x.
If x is an internal node and k is in x, then
let y be the child of x that precedes k; if y has at least t keys, find the predecessor of k in the subtree rooted at y, recursively delete it, and copy it over k in x.
Symmetrically, if the child z that follows k has at least t keys, use the successor of k instead.
If both y and z contain only t - 1 keys, merge k and all of z into y, and recursively delete k from y.
If x is an internal node and k is not in x, determine the child c of x that must contain k.
If c has t or more keys, then recurse into c.
Otherwise, if an immediate sibling of c has t or more keys, then let l be the key in that sibling adjacent to c: move the appropriate key of x down into c and move l up into x, then recurse into c.
Step 2 of 2
Otherwise, merge c with an immediate sibling and move the appropriate key of x down into the merged node; then recursively call
B-Tree-Delete(c, k).
Chapter 20, Problem 1RQ
Problem
Step-by-step solution
Step 1 of 1
Multiuser system
A multiuser system is a system that many users can use at the same time, accessing the data concurrently.
The lost update problem: two or more transactions read a record at the same time, but when the records are saved, only the last record saved will reflect its change, while all the other changes will be lost.
The temporary update (dirty read) problem: one transaction updates a record and then fails before committing; meanwhile, another transaction reads the temporary update. The data read from the temporary update is incorrect, or "dirty data".
The incorrect summary problem can be illustrated with an airline seat reservation example: a person wants to buy a ticket, so the system computes a summary of how many open seats remain on the plane. Between the time the summary computation starts and finishes, some seats are reserved by other agents, so the summary returned to the customer is inaccurate because it does not reflect the true number of seats available.
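The lost update problem described above can be reproduced with a deterministic interleaving (the schedule and values below are illustrative):

```python
# Two transactions both read X, then both write back values computed
# from the stale copy they read; T2's update is lost.
db = {"X": 100}

def lost_update_schedule():
    t1_x = db["X"]        # T1: read_item(X)  -> 100
    t2_x = db["X"]        # T2: read_item(X)  -> 100 (same stale value)
    db["X"] = t2_x - 30   # T2: write_item(X) -> 70
    db["X"] = t1_x + 50   # T1: write_item(X) -> 150, overwriting T2
    return db["X"]

print(lost_update_schedule())  # 150, but any serial order would give 120
```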
Chapter 20, Problem 2RQ
Problem
Step-by-step solution
Step 1 of 1
Types of Failures
Failures in database management system are categorized as transaction, system, and media
failures.
There are many possible reasons for a transaction to fail during execution:
Computer failure:
During transaction execution, the computer hardware, media, software, or network may crash. Such crashes cause database management system failures.
Transaction or system error:
Operations such as division by zero or integer overflow cause the transaction to fail, as do logical programming errors and erroneous parameter values. The user may also interrupt the system during transaction execution.
Local errors:
Errors or exception conditions detected by the transaction itself cause failures; the transaction halts and cancels its effects, because something along the way prevents it from proceeding.
Disk failure:
The data stored in the disk blocks may be lost because of a read-write error or a read/write head
crash. This could occur during a read or a write operation of the transaction.
Physical problems:
Power failure, theft, fire, destruction, and many other physical problems can also cause failures.
Catastrophic failure:
Catastrophic failures occur very rarely. They include many forms of physical misfortune to the database, and there is an endless list of such problems:
• A fire accident may cause the loss of the physical devices and of the data.
Chapter 20, Problem 3RQ
Problem
Discuss the actions taken by the read_item and write_item operations on a database.
Step-by-step solution
Step 1 of 1
In a database, the read_item(X) and write_item(X) operations transfer the data item X between disk and a program variable.
Actions taken by the read_item(X) operation on a database:
Find the address of the disk block that contains item X.
Copy that disk block into a buffer in main memory, if the block is not already in some main memory buffer.
Copy item X from the buffer to the program variable named X.
Actions taken by the write_item(X) operation on a database:
Find the address of the disk block that contains item X.
Copy that disk block into a buffer in main memory, if the block is not already in some main memory buffer.
Copy item X from the program variable named X into its correct location in the buffer.
Store the updated block from the buffer back to disk (either immediately or at some later point in time).
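These buffer-mediated steps can be sketched with a toy model (the block layout and item names are assumptions for illustration):

```python
disk = {0: {"X": 5, "Y": 9}}      # block number -> items on that block
buffers = {}                       # main-memory buffer pool

def read_item(block, name):
    if block not in buffers:                 # find the disk block...
        buffers[block] = dict(disk[block])   # ...and copy it into a buffer
    return buffers[block][name]              # copy item into program variable

def write_item(block, name, value):
    if block not in buffers:
        buffers[block] = dict(disk[block])
    buffers[block][name] = value             # update the item in the buffer
    disk[block] = dict(buffers[block])       # flush block to disk (may be deferred)

x = read_item(0, "X")
write_item(0, "X", x + 1)
print(disk[0]["X"])   # 6
```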
Chapter 20, Problem 4RQ
Problem
Draw a state diagram and discuss the typical states that a transaction goes through during
execution.
Step-by-step solution
Step 1 of 2
The state transition diagram contains the states active, partially committed, committed, failed, and terminated: begin_transaction leads into the active state, end_transaction moves the transaction from active to partially committed, and from there it either commits or fails and is finally terminated.
Step 2 of 2
begin_transaction marks the start of transaction execution.
end_transaction specifies that the read and write operations have finished.
A transaction's commit point is the point in the log where the transaction has completed successfully, along with all of the reads and writes that go with it; from that point on, its effects are permanent.
Chapter 20, Problem 5RQ
Problem
What is the system log used for? What are the typical kinds of records in a system log? What are
transaction commit points, and why are they important?
Step-by-step solution
Step 1 of 2
System log:
The system log is used to recover from failures that affect transactions.
The system maintains the log to keep track of all transaction operations that affect the values of database items; essentially, it records everything needed to undo or redo the transactions.
Step 2 of 2
The typical kinds of records in a system log are:
[start_transaction, T] - transaction T has started
[read_item, T, X] - T reads item X
[write_item, T, X, old_value, new_value] - T writes item X
[commit, T] - T has completed successfully
Transaction commit points are points in the log where the transaction has completed successfully, along with all of the reads and writes that go with it. They are important because only transactions that have reached their commit point have their effects guaranteed to persist; transactions that fail before their commit point are undone during recovery.
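A minimal sketch of scanning a log for commit points during recovery (the record layout follows the bracketed form above; the log contents are illustrative):

```python
log = [
    ("start_transaction", "T1"),
    ("write_item", "T1", "X", 5, 8),   # old value 5, new value 8
    ("start_transaction", "T2"),
    ("write_item", "T2", "Y", 1, 2),
    ("commit", "T1"),                   # T1's commit point
]                                       # crash here: no commit record for T2

def classify(log):
    started = {r[1] for r in log if r[0] == "start_transaction"}
    committed = {r[1] for r in log if r[0] == "commit"}
    # committed transactions are redone; the rest must be undone
    return sorted(committed), sorted(started - committed)

redo, undo = classify(log)
print(redo, undo)   # ['T1'] ['T2']
```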
Chapter 20, Problem 6RQ
Problem
Discuss the atomicity, durability, isolation, and consistency preservation properties of a database
transaction.
Step-by-step solution
Step 1 of 4
Atomicity:
• This property states that a transaction must be treated as an atomic unit of work: either all of its operations are executed, or none of them are.
• The database state is well defined either as it was before the transaction executed or as it is after the transaction completes (or aborts); no in-between state is exposed.
• This property requires that a transaction be executed to completion, or not at all.
• If there is a failure midway, if the user explicitly cancels the operation, or if an internal error occurs, the database ensures that no partial state from the leftover operations remains.
• The database can UNDO or ROLLBACK all the changes, restoring the database to the state it was in before.
• If a transaction fails to complete for some reason, such as a system crash during transaction execution, the recovery technique must undo any effects of the transaction on the database.
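The UNDO/ROLLBACK behavior described above can be sketched with before-images (a toy model; the items and values are illustrative):

```python
db = {"X": 10, "Y": 20}

def run_atomically(ops, fail_after=None):
    """Apply (name, new_value) ops; on failure, roll back using the
    saved before-images so no partial state is left behind."""
    undo = []                              # (name, old_value) before-images
    try:
        for i, (name, new) in enumerate(ops):
            undo.append((name, db[name]))
            db[name] = new
            if fail_after is not None and i == fail_after:
                raise RuntimeError("crash mid-transaction")
    except RuntimeError:
        for name, old in reversed(undo):   # UNDO in reverse order
            db[name] = old
        return False
    return True

ok = run_atomically([("X", 11), ("Y", 21)], fail_after=1)
print(ok, db)   # False {'X': 10, 'Y': 20} -- all changes rolled back
```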
Step 2 of 4
Durability or permanency:
• The changes applied to the database by a committed transaction must persist in the database and must not be lost if a failure occurs.
• If a transaction updates a chunk of data in a database and commits, then the database holds
the modified data.
• Even if a transaction commits but the system fails before the data could be written on to the
disk, then the data will be updated once the system springs back into action.
Comment
Step 3 of 4
Isolation:
• A transaction should appear as though it is being executed in isolation from other transactions
simultaneously or in parallel.
• That is, the execution of a transaction should not be interfered with by any other transactions
executing concurrently.
• Isolation is enforced by the concurrency control subsystem of the DBMS, which ensures that a transaction does not make its updates visible to other transactions until it is committed.
• In simple terms, one transaction cannot read data written by another transaction until that transaction has completed.
• If two transactions are executing concurrently and one wants to see the changes made by the other, it must wait until the other is finished.
Comment
Step 4 of 4
Consistency preservation:
• The consistency property ensures that the database remains in a consistent state before the
start of the transaction and after the transaction is over (whether it is successful or not).
• It states that when transaction is finished the data will remain in a consistent state.
• A transaction either creates a new and valid state of data, or, if any failure occurs, returns all
data to its state before the transaction was started.
• Execution of transaction should take the database from one consistent state to another.
Comment
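The atomicity and consistency properties above can be illustrated with a small sketch. The account names, amounts, and `transfer` helper below are illustrative, not from the text: a transfer either applies completely or is rolled back to the pre-transaction state.

```python
# Minimal sketch of atomicity: a transfer either commits fully or is rolled back.
# Account names, amounts, and the consistency rule are illustrative.

def transfer(db, src, dst, amount):
    snapshot = dict(db)          # before-image of the whole (tiny) database
    try:
        db[src] -= amount
        if db[src] < 0:          # consistency rule: no negative balances
            raise ValueError("insufficient funds")
        db[dst] += amount        # all updates applied: the transaction commits
    except Exception:
        db.clear()
        db.update(snapshot)      # UNDO/ROLLBACK: restore the before-image
        return False
    return True

db = {"A": 100, "B": 50}
assert transfer(db, "A", "B", 30) is True
assert db == {"A": 70, "B": 80}
assert transfer(db, "A", "B", 1000) is False   # fails midway...
assert db == {"A": 70, "B": 80}                # ...and leaves no partial state
```

The snapshot/restore step plays the role of the recovery technique: a failed transaction leaves the database exactly as it was before the transaction started.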
Chapter 20, Problem 7RQ
Problem
What is a schedule (history)? Define the concepts of recoverable, cascade-less, and strict
schedules, and compare them in terms of their recoverability.
Step-by-step solution
Step 1 of 4
A schedule (or history) S of n transactions T1, T2 , ...,Tn is an ordering of the operations of the
transactions subject to the constraint that, for each transaction Ti that participates in S, the
operations of Ti in S must appear in the same order in which they occur in Ti.
If we can ensure that a transaction T, once committed, never has to be rolled back, then we have the demarcation between recoverable and nonrecoverable schedules.
Comment
Step 2 of 4
Recoverable:
A schedule S is recoverable if no transaction T in S commits until all transactions T’, that have
written an item that T reads, have committed.
A transaction T reads from transaction T’ in a schedule S if some item X is first written by T’ and
read later by T.
In addition, T’ should not be aborted before T reads item X, and there should be no
transactions that write X after T’ writes it and before T reads it (unless those transactions, if any,
have aborted before T reads X).
Comment
Step 3 of 4
Cascadeless schedule:
A schedule is said to avoid cascading rollback if every transaction in the schedule reads only items that were written by committed transactions; this guarantees that the values read will never be undone.
Otherwise, an uncommitted transaction may have to be rolled back because it read an item written by a transaction that subsequently failed.
This form of rollback is undesirable, since it can lead to undoing a significant amount of work. It is therefore desirable to restrict schedules to those in which cascading rollbacks cannot occur.
Comment
Step 4 of 4
Strict schedule:
Transactions can neither read nor write an item X until the last transaction that wrote X has
committed or aborted.
The process of undoing a write (X) operation of an aborted transaction is simply to restore the
before image, the old-value for X.
Though this always works correctly for strict schedules, it may not work for recoverable or
cascadeless schedules.
Comment
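The three notions above can be checked mechanically. A minimal sketch in Python (the operation encoding and the `classify` helper name are assumptions; commits are written ('c', txn), and aborts are not modeled):

```python
# Sketch: classify a schedule as recoverable, cascadeless, and/or strict.
# Ops are ('r', txn, item), ('w', txn, item), or ('c', txn) for commit.
# Aborts are not modeled; encoding and helper name are illustrative.

def classify(schedule):
    recoverable = cascadeless = strict = True
    committed = set()
    last_writer = {}                       # item -> txn of the last write so far
    reads_from = []                        # (reader, writer) pairs
    commit_pos = {op[1]: i for i, op in enumerate(schedule) if op[0] == 'c'}
    for op in schedule:
        kind, t = op[0], op[1]
        if kind == 'c':
            committed.add(t)
            continue
        x = op[2]
        w = last_writer.get(x)
        if w is not None and w != t and w not in committed:
            strict = False                 # touching X while its writer is live
            if kind == 'r':
                cascadeless = False        # dirty read
        if kind == 'r' and w is not None and w != t:
            reads_from.append((t, w))
        if kind == 'w':
            last_writer[x] = t
    for reader, writer in reads_from:
        if commit_pos.get(writer, 10**9) > commit_pos.get(reader, 10**9):
            recoverable = False            # reader commits before its source
    return recoverable, cascadeless, strict

# Schedule S4 from Exercise 24 later in the chapter:
S4 = [('r',1,'X'),('r',2,'Z'),('r',1,'Z'),('r',3,'X'),('r',3,'Y'),
      ('w',1,'X'),('w',3,'Y'),('r',2,'Y'),('w',2,'Z'),('w',2,'Y'),
      ('c',1),('c',2),('c',3)]
print(classify(S4))  # → (False, False, False): S4 is not even recoverable
```

Each test mirrors the definitions directly: strictness checks every read or write against an uncommitted last writer, cascadelessness checks only reads, and recoverability compares commit order along the reads-from relation.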
Chapter 20, Problem 8RQ
Problem
Discuss the different measures of transaction equivalence. What is the difference between
conflict equivalence and view equivalence?
Step-by-step solution
Step 1 of 3
1.) Conflict equivalence: Two schedules are said to be conflict equivalent if the order of any two
conflicting operations is the same in both schedules. Two operations in a schedule are said to be
conflict if they belong to different transactions, access the same database item, and at least one
of the two operations is a write item operation. If two conflicting operations are applied in different
orders in two schedules, the effect can be different on the database or on other transactions in
the schedules, and hence the schedules are not conflict equivalent.
Comment
Step 2 of 3
2.) View equivalence: Another less restrictive definition of schedules is called view equivalence.
Two schedules S and S' are said to be view equivalent if the following 3 conditions hold:
1.) The same set of transactions participates in S and S', and S and S' include the same
operation of those transactions.
2.) For any operation ri(X) of Ti in S, if the value of X read by the operation was written by an operation wj(X) of Tj, the same condition must hold for the value of X read by the operation ri(X) of Ti in S'.
3.) If the operation wk(Y) of Tk is the last operation to write item Y in S, then wk(Y) of Tk must also be the last operation to write item Y in S'.
The idea behind view equivalence is that, as long as each read operation of a transaction reads the result of the same write operation in both schedules, the write operations of each transaction must produce the same results. The read operations are thus said to have the same view in both schedules. Condition 3 ensures that the final write operation on each data item is the same in both schedules, so the database state is the same at the end of both schedules.
Comment
Step 3 of 3
The difference between view equivalence and conflict equivalence arises under the unconstrained write assumption, where the value written by an operation wi(X) in Ti can be independent of the old value of X in the database. This is called a blind write, and it is illustrated by the following schedule Sg of three transactions T1: r1(X); w1(X); T2: w2(X); and T3: w3(X):
Sg: r1(X); w2(X); w1(X); w3(X);
In Sg the operations w2(X) and w3(X) are blind writes, since T2 and T3 do not read the value of X. The schedule Sg is view serializable but not conflict serializable. All conflict-serializable schedules are view serializable, but not vice versa. Testing for view serializability has been shown to be NP-hard, meaning that finding an efficient polynomial-time algorithm for this problem is highly unlikely.
Comment
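The view-equivalence conditions can be checked by brute force on small schedules. A sketch in Python (operation encoding and helper names are assumptions; the schedule tested is the blind-write example Sg with T1: r1(X); w1(X); T2: w2(X); T3: w3(X)):

```python
# Brute-force view-serializability check (practical only for tiny schedules).
# Ops are ('r', txn, item) or ('w', txn, item); encoding is illustrative.
from itertools import permutations

def view_info(sched):
    last = {}                                  # item -> writer (None = initial)
    rf, final = [], {}
    for op, t, x in sched:
        if op == 'r':
            rf.append((t, x, last.get(x)))     # which write this read sees
        else:
            last[x] = t
            final[x] = t
    return set(rf), final                      # reads-from relation, final writes

def view_serializable(sched):
    txns = sorted({t for _, t, _ in sched})
    target = view_info(sched)
    for order in permutations(txns):
        # serial schedule induced by this transaction order
        serial = [op for t in order for op in sched if op[1] == t]
        if view_info(serial) == target:
            return order                       # a view-equivalent serial order
    return None

Sg = [('r', 1, 'X'), ('w', 2, 'X'), ('w', 1, 'X'), ('w', 3, 'X')]
print(view_serializable(Sg))  # → (1, 2, 3): Sg is view equivalent to T1;T2;T3
```

Sg has a cycle of conflicting operations (r1(X) before w2(X), but w2(X) before w1(X)), so it is not conflict serializable, yet the brute-force check finds a view-equivalent serial order.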
Chapter 20, Problem 9RQ
Problem
What is a serial schedule? What is a serializable schedule? Why is a serial schedule considered
correct? Why is a serializable schedule considered correct?
Step-by-step solution
Step 1 of 4
Serial schedule:
A schedule “S” is referred to as serial if, for each transaction “T” participating in the schedule, the operations of T are executed consecutively in the schedule.
• From this perspective, only one transaction at a time is active, and when that transaction commits, the execution of the next transaction begins.
Comment
Step 2 of 4
Serializable schedule:
For “n” transactions there are n! possible serial schedules, and many more possible nonserial schedules. A nonserial schedule is referred to as serializable if it is equivalent to one of the serial schedules of the same transactions.
Comment
Step 3 of 4
A serial schedule is considered correct on the assumption that the transactions are independent of each other. By the “consistency preservation” property, when a transaction runs in isolation it executes from beginning to end without interference from other transactions. Thus, its result on the database is correct.
Comment
Step 4 of 4
The simplest way to establish the correctness of a serializable schedule is through equivalence: compare the results of the schedules on the database, and if both produce the same final state of the database, the two schedules are equivalent. A serializable schedule is therefore correct because it is equivalent to some serial schedule, and a serial schedule is correct by definition.
Comment
Chapter 20, Problem 10RQ
Problem
What is the difference between the constrained write and the unconstrained write assumptions?
Which is more realistic?
Step-by-step solution
Step 1 of 1
The constrained write assumption states that any write operation wi(X) in Ti is preceded by an ri(X) in Ti, and that the value written by wi(X) in Ti depends only on the value of X read by ri(X). This assumes that computation of the new value of X is a function f(X) based on the old value of X read from the database.
The unconstrained write assumption states that the value written by an operation wi(X) in Ti can be independent of the old value of X in the database. This is called a blind write, and it is illustrated by the following schedule Sg of three transactions T1: r1(X); w1(X); T2: w2(X); and T3: w3(X):
Sg: r1(X); w2(X); w1(X); w3(X);
In Sg the operations w2(X) and w3(X) are blind writes, since T2 and T3 do not read the value of X.
The constrained write assumption is more realistic, since we often need to take into account the value of a variable before editing it in an application or query.
Comment
Chapter 20, Problem 11RQ
Problem
Discuss how serializability is used to enforce concurrency control in a database system. Why is
serializability sometimes considered too restrictive as a measure of correctness for schedules?
Step-by-step solution
Step 1 of 4
The concept of serializability of schedules is used to identify which schedules are correct when
transaction executions have interleaving of their operations in the schedules. A schedule S of n
transactions is serializable if it is equivalent to some serial schedule of the same n transactions.
Saying that a non serial schedule S is serializable is equivalent of saying that it is correct,
because it is equivalent to a serial schedule, which is considered correct.
Two schedules are result equivalent if they produce the same final state of database. However
two schedules may accidentally produce same final state, so result equivalence cannot be used
to define equivalence of schedules.
Comment
Step 2 of 4
Conflict equivalence: Two schedules are said to be conflict equivalent if the order of any two
conflicting operations is the same in both schedules. Two operations in a schedule are said to be
conflict if they belong to different transactions, access the same database item, and at least one
of the two operations is a write item operation. If two conflicting operations are applied in different
orders in two schedules, the effect can be different on the database or on other transactions in
the schedules, and hence the schedules are not conflict equivalent.
Comment
Step 3 of 4
View equivalence: Another less restrictive definition of schedules is called view equivalence.
Two schedules S and S' are said to be view equivalent if the following 3 conditions hold:
1.) The same set of transactions participates in S and S', and S and S' include the same
operation of those transactions.
2.) For any operation ri(X) of Ti in S, if the value of X read by the operation was written by an operation wj(X) of Tj, the same condition must hold for the value of X read by the operation ri(X) of Ti in S'.
3.) If the operation wk(Y) of Tk is the last operation to write item Y in S, then wk(Y) of Tk must also be the last operation to write item Y in S'.
Comment
Step 4 of 4
An example is the type of transaction known as a debit-credit transaction: for example, those that apply deposits and withdrawals to a data item whose value is the current balance of a bank account. The semantics of debit-credit operations is that they update the value of a data item X by either adding to or subtracting from its current value. Because these operations are commutative, it is possible to produce correct schedules that are not serializable.
With the additional knowledge, or semantics, that the operations between each ri(I) and wi(I) are commutative, the order of executing the (read, update, write) sequences is not important, as long as each (read, update, write) sequence by a particular transaction Ti on a particular item is not interrupted by conflicting operations. Hence a nonserializable schedule can also be considered correct. Researchers have been working on extending concurrency control theory to deal with cases where serializability is considered too restrictive as a condition for correctness of schedules.
Comment
Chapter 20, Problem 12RQ
Problem
Describe the four levels of isolation in SQL. Also discuss the concept of snapshot isolation and
its effect on the phantom record problem.
Step-by-step solution
Step 1 of 2
The statement SET TRANSACTION ISOLATION LEVEL is used to specify the isolation level, where the value can be SERIALIZABLE, REPEATABLE READ, READ COMMITTED, or READ UNCOMMITTED. SERIALIZABLE is the default isolation level in the SQL standard, but some systems use READ COMMITTED as the default level.
1. Level 0: A transaction has level 0 isolation if it does not overwrite the dirty reads of higher-level transactions.
This isolation level has the value READ UNCOMMITTED. It lets a query see data modified by another statement whether or not that transaction has committed. This is also called a dirty read.
Example (sketch; the table and column names are illustrative):
Statement 1:
BEGIN TRAN
UPDATE stu SET marks = 90 WHERE id = 1;
-- transaction still open here
COMMIT;
Statement 2:
SELECT * FROM stu; -- run at READ UNCOMMITTED
Statement 2, executed after the update of the stu table by statement 1 but before its COMMIT, displays the uncommitted (dirty) records.
2. Level 1: The transaction having this isolation level has no lost updates. Such isolation level
has the value READ COMMITTED.
In this isolation level, the SQL query statement takes only committed values. If any transaction is
locked or incomplete, then the select statement will wait until all the transactions complete.
3. Level 2: The transaction having this isolation level has no dirty reads as well as no lost
updates. Such isolation level has the value REPEATABLE READ.
Repeatable read is the extension to the committed read. It ensures that if the same query is
executed again in the transaction, it will not read the change in the data value that another query
has made. No other user can modify the data values until the transaction is committed or rolled
back by the previous user.
4. Level 3: In addition to the properties of level 2, isolation level 3 prevents phantoms. Such an isolation level has the value SERIALIZABLE. The serializable isolation level works like repeatable read except that it also prevents phantoms when the same query is executed twice. This option works with range locks; it locks the whole table if no condition is specified on an index.
Comment
Step 2 of 2
Snapshot isolation:
Snapshot isolation is used in the concurrency control protocols of some commercial DBMSs. Under snapshot isolation, the data items read by a transaction are based on the committed values of the items in the database snapshot taken when the transaction starts.
Snapshot isolation ensures that the phantom record problem does not occur, because the transaction sees only the records that existed in the database at the time the transaction began.
Comment
Chapter 20, Problem 13RQ
Problem
Define the violations caused by each of the following: dirty read, nonrepeatable read, and
phantoms.
Step-by-step solution
Step 1 of 1
Violations caused by :
Dirty read –
A transaction reads a value written by another transaction that has not yet committed. If the writing transaction later aborts, the reading transaction has used a value that never validly existed, so its result becomes incorrect.
Nonrepeatable read –
The transaction reads a value from a record. Another transaction changes the values of the
record that was read. When the initial transaction reads the record again, the values are different.
Phantoms –
A transaction reads a set of rows from a table based on some condition specified in the SQL WHERE clause. Another transaction then inserts a new row that satisfies the condition. If the initial transaction repeats its query, the new (phantom) row appears in the result even though it was not there before.
Comment
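The dirty-read violation above can be simulated in a few lines. The 'database' dictionary and the transaction steps below are illustrative, not DBMS API calls:

```python
# Tiny simulation of a dirty read: T2 reads a value T1 wrote before T1 aborts.
# The 'database' and the interleaving of steps are illustrative only.

db = {"balance": 100}

# T1 writes an uncommitted update...
t1_before_image = db["balance"]
db["balance"] = 500            # uncommitted write by T1

# ...T2 reads it (a dirty read, as allowed under READ UNCOMMITTED)...
dirty_value = db["balance"]

# ...then T1 aborts, restoring its before-image.
db["balance"] = t1_before_image

assert dirty_value == 500      # T2 acted on a value that never validly existed
assert db["balance"] == 100
```

Any decision T2 makes based on `dirty_value` is based on data that, after T1's abort, was never part of a committed database state.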
Chapter 20, Problem 14E
Problem
read_item(X);
X := X + M;
if X > 90 then exit
else write_item(X);
Discuss the final result of the different schedules in Figures 20.3(a) and (b), where M = 2 and N =
2, with respect to the following questions: Does adding the above condition change the final
outcome? Does the outcome obey the implied consistency rule (that the capacity of X is 90)?
Step-by-step solution
Step 1 of 1
read_item(X);
X := X + M;
if X > 90 then exit
else write_item(X);
This condition does not change the final outcome unless the initial value of X > 88.
The outcome does obey the implied consistency rule that X does not exceed 90, since the added check prevents write_item(X) from storing a value greater than 90.
Comment
Chapter 20, Problem 15E
Problem
Repeat Exercise 20.14, adding a check in T1 so that Y does not exceed 90.
read_item(X);
X := X + M;
if X > 90 then exit
else write_item(X);
Discuss the final result of the different schedules in Figures 20.3(a) and (b), where M = 2 and N =
2, with respect to the following questions: Does adding the above condition change the final
outcome? Does the outcome obey the implied consistency rule (that the capacity of X is 90)?
Step-by-step solution
Step 1 of 1
read_item(X);
X:= X+M;
else write_item(X);
T1                          T2
read_item(X);
X := X - N;
                            read_item(X);
                            X := X + M;
write_item(X);
read_item(Y);
                            if X > 90 then exit
                            else write_item(X);
Y := Y + N;
if Y > 90 then exit
else write_item(Y);
This condition does not change the final output unless the initial value of X > 88 or
Y > 88.
This output obeys the implied consistency rule that X < 90 and Y < 90.
Chapter 20, Problem 16E
Problem
Add the operation commit at the end of each of the transactions T1 and T2 in Figure 20.2, and
then list all possible schedules for the modified transactions. Determine which of the schedules
are recoverable, which are cascade-less, and which are strict.
Step-by-step solution
Step 1 of 6
T1                          T2
read_item(X);               read_item(X);
X := X - N;                 X := X + M;
write_item(X);              write_item(X);
read_item(Y);               commit T2
Y := Y + N;
write_item(Y);
commit T1
Using shorthand notation, these transactions can be written as:
T1: r1(X); w1(X); r1(Y); w1(Y); C1;
T2: r2(X); w2(X); C2;
Comment
Step 2 of 6
With m = 2 transactions, where T1 has n1 = 5 operations and T2 has n2 = 3 operations, the total number of possible schedules is (n1 + n2)! / (n1! n2!) = 8! / (5! 3!) = 56.
Comment
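The schedule count can be verified with a one-line computation (a sketch; `schedule_count` is an illustrative helper name, not from the text):

```python
# The number of interleavings of two transactions that preserve each
# transaction's internal operation order is the binomial coefficient
# (n1 + n2)! / (n1! * n2!).
from math import factorial

def schedule_count(n1, n2):
    return factorial(n1 + n2) // (factorial(n1) * factorial(n2))

print(schedule_count(5, 3))  # → 56, matching the count in the solution
```

The same formula with n1 = 4 and n2 = 2 (the transactions of Figure 20.2 without commits) gives the 15 schedules used in Exercise 17.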
Step 3 of 6
So there are 56 possible schedules, and each schedule can be classified as strict, cascadeless, recoverable, or nonrecoverable. [The schedule-by-schedule listing with its classifications is not recoverable from this copy.]
Comment
Chapter 20, Problem 17E
Problem
List all possible schedules for transactions T1 and T2 in Figure 20.2, and determine which are
conflict serializable (correct) and which are not.
Step-by-step solution
Step 1 of 1
Transaction T1 has four operations (r1(X); w1(X); r1(Y); w1(Y)) and T2 has two (r2(X); w2(X)), so there are (4 + 2)! / (4! 2!) = 15 possible schedules. [The schedule-by-schedule listing with its conflict-serializability classification is not recoverable from this copy.]
Comment
Chapter 20, Problem 18E
Problem
How many serial schedules exist for the three transactions in Figure 20.8(a)? What are they?
What is the total number of possible schedules?
Step-by-step solution
Step 1 of 2
T1, T2, and T3 are the three transactions of Figure 20.8(a). [The operation listing, including the write_item(X) operations, is not recoverable from this copy.]
Comment
Step 2 of 2
T1 T2 T3
T3 T2 T1
T2 T3 T1
T2 T1 T3
T3 T1 T2
T1 T3 T2
In general, the number of distinct serial schedules for n transactions is n!; here 3! = 6, which are the six orderings listed above. The total number of possible (serial and nonserial) schedules is given by the multinomial coefficient (n1 + n2 + n3)! / (n1! n2! n3!), where ni is the number of operations in transaction Ti.
Comment
Chapter 20, Problem 19E
Problem
Write a program to create all possible schedules for the three transactions in Figure 20.8(a), and
to determine which of those schedules are conflict serializable and which are not. For each
conflict-serializable schedule, your program should print the schedule and list all equivalent serial
schedules.
Step-by-step solution
Step 1 of 1
The original pseudocode is rewritten below as a runnable Python sketch. The three transactions of Figure 20.8(a) are assumed here as placeholders (T1: r1(X); w1(X); T2: r2(X); w2(X); T3: r3(X); w3(X)); substitute the actual operations from the figure.

from itertools import permutations

# Each operation is (transaction_id, action, item).
T1 = [(1, 'r', 'X'), (1, 'w', 'X')]
T2 = [(2, 'r', 'X'), (2, 'w', 'X')]
T3 = [(3, 'r', 'X'), (3, 'w', 'X')]

def interleavings(*txns):
    # Generate every schedule that preserves each transaction's internal order.
    def rec(rem, acc):
        if all(not r for r in rem):
            yield tuple(acc)
            return
        for i, r in enumerate(rem):
            if r:
                nxt = list(rem)
                nxt[i] = r[1:]
                yield from rec(nxt, acc + [r[0]])
    yield from rec(list(txns), [])

def equivalent_serial_orders(schedule):
    # Precedence graph: edge Ti -> Tj for each pair of conflicting operations
    # (different transactions, same item, at least one write, Ti's op first).
    edges = set()
    for i, (ti, ai, xi) in enumerate(schedule):
        for tj, aj, xj in schedule[i + 1:]:
            if ti != tj and xi == xj and 'w' in (ai, aj):
                edges.add((ti, tj))
    txns = sorted({t for t, _, _ in schedule})
    # The equivalent serial schedules are exactly the topological orders of
    # the graph; an empty result means the schedule is not conflict serializable.
    return [p for p in permutations(txns)
            if all(p.index(a) < p.index(b) for a, b in edges)]

for s in interleavings(T1, T2, T3):
    orders = equivalent_serial_orders(list(s))
    if orders:
        print(s, '-> equivalent serial schedules:', orders)
Comment
Chapter 20, Problem 20E
Problem
Why is an explicit transaction end statement needed in SQL but not an explicit begin statement?
Step-by-step solution
Step 1 of 1
A transaction is an atomic operation. In SQL a transaction begins implicitly when its first statement is executed, so there is only one way for it to begin and no explicit begin statement is needed:
BEGIN_TRANSACTION (implicit)
- - - - - ; // READ OR WRITE //
- - - - - ;
COMMIT_TRANSACTION, which makes its updates permanent (commit),
or
ROLLBACK, which removes its partial updates (which may be incorrect) from the database (abort).
The system, however, cannot tell on its own when the transaction's last statement has been executed, so it is important for the database system to be told explicitly how a transaction ends. It is for this reason that an explicit end statement (COMMIT or ROLLBACK) is needed in an SQL query.
Comment
Chapter 20, Problem 21E
Problem
Describe situations where each of the different isolation levels would be useful for transaction
processing.
Step-by-step solution
Step 1 of 2
SERIALIZABLE: This level preserves consistency in all situations; thus it is the safest execution mode. It is recommended for execution environments where every update is crucial for a correct result, for example airline reservation, debit/credit, salary increase, and so on.
REPEATABLE READ: This level is similar to SERIALIZABLE except that the phantom problem may occur here. Thus, with record locking (finer granularity), this isolation level must be avoided where phantoms matter. It can be used in all types of environments, except in environments where accurate summary information (e.g., computing a total sum over a set of rows) is required.
Comment
Step 2 of 2
READ COMMITTED: At this level a transaction may see two different values of the same data item during its execution. A transaction at this level applies a write lock and keeps it until it commits. It also applies a read (shared) lock, but the lock is released as soon as the data item is read by the transaction. This isolation level may be used for queries such as account balance, weather, or departure and arrival times, and so on.
READ UNCOMMITTED: At this level a transaction applies neither a shared lock nor a write lock. The transaction is not allowed to write any data item, and it may give rise to dirty reads, unrepeatable reads, and phantoms. It may be used in environments where only a statistical average over a large amount of data is required.
Comment
Chapter 20, Problem 22E
Problem
Which of the following schedules is (conflict) serializable? For each serializable schedule,
determine the equivalent serial schedules.
Step-by-step solution
Step 1 of 5
Serializable schedule:
1) Create a node labeled Ti in the graph for each transaction Ti that participates in schedule S.
2) Create an edge in the graph from Ti to Tj where Ti executes write_item(X) and Tj later executes read_item(X).
3) Create an edge in the graph from Ti to Tj where Ti executes read_item(X) and Tj later executes write_item(X).
4) Create an edge in the graph from Ti to Tj where Ti executes write_item(X) and Tj later executes write_item(X).
The schedule S is conflict serializable if and only if the resulting precedence graph contains no cycle.
Comment
Step 2 of 5
(a)
Given schedule:
Conflict graph:
Comment
Step 3 of 5
(b)
Given schedule:
Conflict graph:
The conflict graph has a cycle involving T1 and T3. Hence, the schedule S is not conflict serializable.
Comment
Step 4 of 5
(c)
Given schedule:
Conflict graph:
Comment
Step 5 of 5
(d)
Given schedule:
Conflict graph:
Chapter 20, Problem 23E
Problem
Consider the three transactions T1 T2, and T3, and the schedules S1 and S2 given below. Draw
the serializability (precedence) graphs for S1 and S2 and state whether each schedule is
serializable or not. If a schedule is serializable, write down the equivalent serial schedule(s).
S1: r1 (X); r2 (Z); r1 (Z); r3 (X); r3 (Y); w1 (X); w3 (Y); r2 (Y); w2 (Z); w2 (Y);
S2: r1 (X); r2 (Z); r3 (X); r1 (Z); r2 (Y); r3 (Y); w1 (X); w2 (Z); w3 (Y); w2 (Y);
Step-by-step solution
Step 1 of 2
S1: r1(X); r2(Z); r1(Z); r3(X); r3(Y); w1(X); w3(Y); r2(Y); w2(Z); w2(Y)
Comment
Step 2 of 2
S2: r1(X); r2(Z); r3(X); r1(Z); r2(Y); r3(Y); w1(X); w2(Z); w3(Y); w2(Y)
Comment
Chapter 20, Problem 24E
Problem
Consider schedules S3, S4, and S5 below. Determine whether each schedule is strict,
cascadeless, recoverable, or nonrecoverable. (Determine the strictest recoverability condition
that each schedule satisfies.)
S3: r1 (X); r2 (Z); r1 (Z); r3 (X); r3 (Y); w1 (X); c1; w3 (Y); c3; r2(Y); w2(Z); w2(Y); c2;
S4: r1 (X); r2 (Z); r1 (Z); r3 (X); r3 (Y); w1 (X); w3 (Y); r2(Y); w2(Z); w2(Y); c1; c2; c3;
S5: r1 (X); r2 (Z); r3 (X); r1 (Z); r2 (Y); r3 (Y); w1 (X); c1; w2(Z); w3(Y); w2(Y); c3; c2;
Step-by-step solution
Step 1 of 5
Strict schedule: A schedule is said to be a strict schedule if no transaction reads or writes an item X until the last transaction that wrote X has committed (or aborted).
• S3 is strict: every read or write of an item occurs only after the transaction that last wrote that item has committed.
• S4 is not even recoverable: T2 reads Y after the uncommitted w3(Y), and c2 occurs before c3.
• S5 is cascadeless but not strict: no transaction reads an uncommitted value, but w2(Y) occurs after w3(Y) and before c3.
Comment
Step 2 of 5
Comment
Step 3 of 5
Schedule S3:
• If the T1 aborts first and then T3 and T2 are committed, then the schedule S3 is recoverable as
rolling back of T1 does not affect T2 and T3.
• If the T1 commits first and then T3 aborts and then T2 commits, then the schedule S3 is not
recoverable as rolling back of T3 will affect T2 as it has read the value of y written by T3.
• If the T1 and T3 commits and then T2 aborts, then the schedule S3 is recoverable as rolling
back of T2 does not affect T1 and T3.
Comment
Step 4 of 5
Schedule S4:
• If the T1 aborts first and then T2 and T3 are committed, then the schedule S4 is recoverable as
rolling back of T1 does not affect T2 and T3.
• If the T1 commits first and then T2 aborts and then T3 commits, then the schedule S4 is
recoverable as rolling back of T1 does not affect T2 and T3. The value of y which is read and
written by T3 will be restored by the rollback of T2.
• If the T1 and T2 commits and T3 aborts, then the schedule S4 is not recoverable as rolling back
of T3 will affect T2 as it has read the value of y written by T3.
Comment
Step 5 of 5
Schedule S5:
• If the T1 aborts first and then T3 and T2 are committed, then the schedule S5 is recoverable as
rolling back of T1 does not affect T2 and T3. T1 writes the value of x which is not read by T2 nor
T3.
• If T1 commits first and then T3 aborts, T2 is not affected by the rollback of T3, because in S5 T2 reads Y before T3 writes it. However, since w2(Y) occurs after w3(Y) and before c3, undoing T3's write by restoring its before-image would incorrectly overwrite T2's write; this is why S5 is not strict.
• If the T1 and T3 commits and then T2 aborts, then the schedule S5 is recoverable as rolling
back of T2 does not affect T1 and T3.
Comment
Chapter 21, Problem 1RQ
Problem
Step-by-step solution
Step 1 of 2
Two-phase locking:
Two-phase locking is a locking scheme in which a transaction cannot request a new lock after it has released any of its locks. It involves two phases:
• Locking (growing) phase
• Unlocking (shrinking) phase
Locking phase:
This is the expanding or growing phase, in which new locks are acquired but none are released.
Unlocking phase:
This is the second phase, referred to as the shrinking phase, in which the transaction releases its existing locks and does not acquire any new locks.
Comment
Step 2 of 2
Guarantee of serializability:
The attraction of the two-phase algorithm derives from a theorem which proves that the two-phase locking algorithm always leads to serializable schedules: if every transaction in a schedule follows the two-phase locking protocol, then the schedule is guaranteed to be serializable.
With the two-phase locking protocol the schedule is guaranteed to be serializable because the protocol prevents interference among different transactions; enforcing two-phase locking avoids the problems of lost update, uncommitted dependency, and inconsistent analysis.
Comment
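The two-phase rule itself can be sketched in a few lines. A minimal single-mode lock manager in Python (class and method names are illustrative; there is no shared/exclusive distinction, blocking, or deadlock handling):

```python
# Sketch of the two-phase rule: a transaction may acquire locks only during
# its growing phase; the first unlock starts the shrinking phase, after which
# any further lock request is a protocol violation. Illustrative only.

class TwoPhaseTxn:
    def __init__(self, name, lock_table):
        self.name, self.locks, self.shrinking = name, set(), False
        self.lock_table = lock_table       # item -> owning transaction name

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError("2PL violation: lock after first unlock")
        if self.lock_table.get(item, self.name) != self.name:
            raise RuntimeError(item + " held by " + self.lock_table[item])
        self.lock_table[item] = self.name
        self.locks.add(item)

    def unlock(self, item):
        self.shrinking = True              # shrinking phase begins
        self.locks.discard(item)
        del self.lock_table[item]

table = {}
t1 = TwoPhaseTxn("T1", table)
t1.lock("X"); t1.lock("Y")                 # growing phase
t1.unlock("X")                             # shrinking phase starts here
try:
    t1.lock("Z")                           # violates the two-phase rule
except RuntimeError as e:
    print(e)                               # → 2PL violation: lock after first unlock
```

Strict and rigorous 2PL (next question) correspond to tightening `unlock` so that it is only permitted at commit or abort time.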
Chapter 21, Problem 2RQ
Problem
What are some variations of the two-phase locking protocol? Why is strict or rigorous two-phase
locking often preferred?
Step-by-step solution
Step 1 of 2
According to the two-phase locking protocol, locks are handled by the transactions themselves, and there are a number of variations of two-phase locking. That is:
Conservative (static) 2PL requires a transaction to lock all the items it accesses before the transaction begins execution, by predeclaring its read-set and write-set; it is a deadlock-free protocol.
Basic 2PL is the technique in which a transaction locks data items incrementally. This may cause deadlock, which must be dealt with.
Comment
Step 2 of 2
Strict 2PL: In this variation, a transaction T does not release any of its exclusive (write) locks until after it commits or aborts. Hence no other transaction can read or write an item that is written by T unless T has committed. Strict 2PL guarantees strict schedules.
Rigorous 2PL is an even more restrictive variation of strict 2PL; it also guarantees strict schedules. In this variation, a transaction T does not release any of its locks (read or write) until after it commits or aborts, which makes it easier to implement than strict 2PL.
Strict or rigorous 2PL is often preferred because the schedules it produces are strict, which simplifies recovery.
Comment
Chapter 21, Problem 3RQ
Problem
Discuss the problems of deadlock and starvation, and the different approaches to dealing with
these problems.
Step-by-step solution
Step 1 of 4
Deadlock:
• A deadlock refers to a situation in which a transaction Ti waits for an item that is locked by
transaction Tj. The transaction Tj in turn waits for an item that is locked by transaction Tk.
• When each transaction in a set of transactions is waiting for an item that is locked by other
transaction, then it is called deadlock.
Example:
Suppose there are two transaction T1 and T2 and there are two items X and Y.
• Initially transaction T1 hold the item X and transaction T2 hold the item Y.
• In order for the transaction T1 to complete its execution, it needs item Y which is locked by
transaction T2.
• In order for the transaction T2 to complete its execution, it needs item X which is locked by
transaction T1.
Such a situation is known as deadlock situation because neither transaction T1 and T2 can
complete its execution.
Comment
Step 2 of 4
• Deadlock prevention: The transaction acquires the lock on all the items it needs before starting
the execution. If it cannot acquire a lock on an item, then it should not lock any other items and
should wait and try to acquire locks again.
• Timeouts: A transaction is aborted if it waits for a period longer than the system defined time.
Comment
Step 3 of 4
Starvation:
• Starvation refers to a situation in which a low priority transaction waits indefinitely while other
high priority transactions execute normally.
Comment
Step 4 of 4
• Use the first come first serve queue to maintain the transactions that are waiting. The
transactions can acquire lock on an item in the same order they have been placed in the queue.
• Increase the priority of the transactions that are waiting longer so that at some point of time it
becomes the transaction with highest priority and proceeds to execute.
Comment
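Deadlock detection is usually done by building a wait-for graph and checking it for a cycle. A sketch in Python (the graph encoding and function name are assumptions):

```python
# Deadlock detection sketch: in a wait-for graph, an edge Ti -> Tj means Ti
# is waiting for an item held by Tj. A deadlock exists iff the graph has a
# cycle, found here with a depth-first search.

def has_deadlock(wait_for):
    visited, on_stack = set(), set()
    def dfs(node):
        visited.add(node)
        on_stack.add(node)                 # nodes on the current DFS path
        for nxt in wait_for.get(node, []):
            if nxt in on_stack or (nxt not in visited and dfs(nxt)):
                return True                # back edge found: a cycle exists
        on_stack.discard(node)
        return False
    return any(dfs(n) for n in wait_for if n not in visited)

# T1 waits for T2 and T2 waits for T1: the classic two-transaction deadlock.
assert has_deadlock({"T1": ["T2"], "T2": ["T1"]})
assert not has_deadlock({"T1": ["T2"], "T2": []})
```

This mirrors the example in the answer: T1 holding X and waiting for Y, while T2 holds Y and waits for X, produces exactly the cyclic graph in the first assertion.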
Chapter 21, Problem 4RQ
Problem
Compare binary locks to exclusive/shared locks. Why is the latter type of locks preferable?
Step-by-step solution
Step 1 of 2
Binary locks:
A binary lock has only two states (locked and unlocked). It is simple but too restrictive, and it is not used in practice.
Exclusive/shared locks:
Exclusive/shared locks provide more general locking capabilities and are used in practical database locking schemes.
In this scheme, a read-locked (share-locked) item allows other transactions to read the item, whereas a write-locked (exclusive-locked) item is held exclusively by a single transaction. There are three locking operations:
Read-lock(X)
Write-lock(X)
Unlock(X)
Comment
Step 2 of 2
If we use the shared/exclusive locking scheme, the system must follow these rules:
(1) A transaction T must issue the operation read-lock(X) or write-lock(X) before any read-item(X) operation is performed in T.
(2) A transaction T must issue the operation write-lock(X) before any write-item(X) operation is performed in T.
(3) A transaction T must issue the operation unlock(X) after all read-item(X) and write-item(X) operations are completed in T.
(4) A Transaction T will not issue a read lock (X) operation if it already holds a read (Shared) lock
or a write (Exclusive) lock on item X. This rule may be relaxed.
Comment
Chapter 21, Problem 5RQ
Problem
Step-by-step solution
Step 1 of 2
Transactions are started in timestamp order; hence, if transaction Ti starts before transaction Tj, then TS(Ti) < TS(Tj).
Notice that the older transaction has the smaller timestamp value.
Two schemes that prevent deadlock are called wait-die and wound-wait.
Suppose transaction Ti tries to lock an item X but is not able to because X is locked by some other transaction Tj with a conflicting lock. The following rules are used by these schemes.
Comment
Step 2 of 2
Wait-die:
If TS(Ti) < TS(Tj), that is, Ti is older than Tj, then Ti is allowed to wait; otherwise (Ti is younger than Tj) abort Ti (Ti dies) and restart it later with the same timestamp.
Wound-wait:
If TS(Ti) < TS(Tj), that is, Ti is older than Tj, then abort Tj (Ti wounds Tj) and restart it later with the same timestamp; otherwise (Ti is younger than Tj) Ti is allowed to wait. In other words, a younger transaction is allowed to wait for an older one, whereas an older transaction requesting an item held by a younger transaction preempts the younger transaction by aborting it.
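The two rules reduce to simple decision functions. A minimal sketch, assuming a smaller timestamp means an older transaction (the function names and return strings are illustrative):

```python
def wait_die(ts_requester, ts_holder):
    """Wait-die: an older requester waits; a younger requester dies (aborts)."""
    return "wait" if ts_requester < ts_holder else "abort requester"

def wound_wait(ts_requester, ts_holder):
    """Wound-wait: an older requester wounds (aborts) the holder; a younger one waits."""
    return "abort holder" if ts_requester < ts_holder else "wait"
```

Note the symmetry: in both schemes the only transaction ever aborted is the younger one, which is what makes starvation impossible when timestamps are preserved across restarts.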
Chapter 21, Problem 6RQ
Problem
Describe the cautious waiting, no waiting, and timeout protocols for deadlock prevention.
Step-by-step solution
Step 1 of 3
Cautious waiting:
Suppose that transaction Ti tries to lock an item X but is not able to do so because X is locked by some other transaction Tj with a conflicting lock.
The rule is: if Tj is not blocked (not waiting for some other locked item), then Ti is blocked and allowed to wait; otherwise, abort Ti.
That is, if Ti is waiting for Tj, let it wait only as long as Tj is not itself waiting for some other transaction to release an item.
Step 2 of 3
No waiting:
If a transaction is unable to obtain a lock, it is aborted immediately and resubmitted after a certain time delay, without waiting at all.
Step 3 of 3
Timeout:
If a transaction waits for a period longer than a system-defined timeout period, the system assumes that the transaction may be deadlocked and aborts it, regardless of whether a deadlock actually exists.
If the timeout protocol is used for deadlock prevention, some transactions that were not deadlocked may nevertheless be aborted and have to be resubmitted.
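The timeout protocol can be sketched in a few lines. This is a hypothetical helper (the function name and return values are mine); note that it never inspects a wait-for graph, it simply gives up after the deadline, which is exactly why it can abort transactions that were not actually deadlocked.

```python
import time

def run_with_timeout(acquire, timeout_s):
    """Keep trying to acquire a lock; abort if we wait longer than timeout_s.

    `acquire` is a callable returning True once the lock is granted.
    Returns "granted" or "aborted" (an aborted transaction would be
    resubmitted later).
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if acquire():
            return "granted"
        time.sleep(0.01)  # back off briefly before retrying
    # Past the deadline: assume we *may* be deadlocked and abort,
    # even though no deadlock check was actually performed.
    return "aborted"
```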
Chapter 21, Problem 7RQ
Problem
Step-by-step solution
Step 1 of 1
Timestamp:
A timestamp is a unique identifier created by the DBMS to identify a transaction. Timestamp values are assigned in the order in which the transactions are submitted to the system.
One implementation is to use a counter that is incremented each time its value is assigned to a transaction. In this scheme, transaction timestamps are numbered 1, 2, 3, and so on. Because a computer counter has a finite maximum value, the system must periodically reset the counter to zero when no transactions are executing for some short period of time.
Alternatively, the system may implement timestamps using the current date/time value of the system clock, ensuring that no two timestamp values are generated during the same tick of the clock.
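The counter scheme can be sketched as follows. This is an illustrative class (its name and the chosen maximum are assumptions); the key point it models is that the counter may only wrap around during a quiet period with no active transactions.

```python
class TimestampGenerator:
    """Counter-based transaction timestamps: 1, 2, 3, ...

    `max_value` models the counter's finite range; a real system could
    only reset the counter during a period when no transactions run.
    """

    def __init__(self, max_value=2**31 - 1):
        self.max_value = max_value
        self.counter = 0

    def next_timestamp(self, active_transactions=0):
        if self.counter == self.max_value:
            if active_transactions > 0:
                raise RuntimeError("cannot reset the counter while transactions run")
            self.counter = 0  # periodic reset during a quiet period
        self.counter += 1
        return self.counter
```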
Chapter 21, Problem 8RQ
Problem
Discuss the timestamp ordering protocol for concurrency control. How does strict timestamp
ordering differ from basic timestamp ordering?
Step-by-step solution
Step 1 of 3
The timestamp ordering protocol manages concurrent execution so that the transaction timestamps determine the serializability order.
The protocol maintains two timestamp values for each data item X:
(1) read_TS(X): the largest timestamp among the timestamps of transactions that have successfully read item X.
(2) write_TS(X): the largest timestamp among the timestamps of transactions that have successfully written item X.
The timestamp ordering protocol ensures that any conflicting read and write operations are executed in timestamp order.
Step 2 of 3
How strict timestamp ordering differs from basic timestamp ordering:
Strict timestamp ordering additionally guarantees strict schedules. When a transaction T issues a read_item(X) or write_item(X) operation such that TS(T) > write_TS(X), the operation is delayed until the transaction T' that wrote the current value of X (hence TS(T') = write_TS(X)) has committed or aborted.
Step 3 of 3
In basic timestamp ordering, whenever a transaction T issues a write_item(X) operation:
(a) If read_TS(X) > TS(T) or write_TS(X) > TS(T), then a younger transaction has already read or written the item, so abort and roll back T and reject the operation.
(b) Otherwise, execute the write_item(X) operation of T and set write_TS(X) to TS(T).
Whenever a transaction T issues a read_item(X) operation:
(a) If write_TS(X) > TS(T), then a younger transaction has already written the item, so abort and roll back T and reject the operation.
(b) If write_TS(X) ≤ TS(T), then execute the read_item(X) operation of T and set read_TS(X) to the larger of TS(T) and the current read_TS(X).
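The basic timestamp ordering checks above can be sketched directly. This is a minimal illustration (class name and return strings are my own); a full implementation would also roll back and restart the aborted transaction.

```python
class BasicTO:
    """Basic timestamp ordering checks for read_item / write_item (sketch)."""

    def __init__(self):
        self.read_ts = {}   # item -> largest TS of a transaction that read it
        self.write_ts = {}  # item -> largest TS of a transaction that wrote it

    def read_item(self, ts, x):
        if self.write_ts.get(x, 0) > ts:
            return "abort"  # a younger transaction already wrote X
        self.read_ts[x] = max(ts, self.read_ts.get(x, 0))
        return "ok"

    def write_item(self, ts, x):
        if self.read_ts.get(x, 0) > ts or self.write_ts.get(x, 0) > ts:
            return "abort"  # a younger transaction already read or wrote X
        self.write_ts[x] = ts
        return "ok"
```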
Chapter 21, Problem 9RQ
Problem
Discuss two multiversion techniques for concurrency control. What is a certify lock? What are the
advantages and disadvantages of using certify locks?
Step-by-step solution
Step 1 of 4
Multiversion concurrency control techniques retain the old values of a data item when the item is updated, alongside the newer versions. The purpose of keeping the older values is to maintain serializability while still allowing some read operations, which would otherwise be rejected, to proceed by reading an older version of the item that is compatible with the transaction's position in the serial order.
Step 2 of 4
Consider the description of the two multiversion techniques for concurrency control discussed
above:
1. Multiversion technique based on timestamp ordering:
In this technique, several versions of each data item X are maintained. For each version, two values must be kept:
• read_TS: the read timestamp of the version; the largest of all the timestamps of transactions that have successfully read the version.
• write_TS: the write timestamp of the version; the timestamp of the transaction that wrote the value of the version.
Whenever a write operation is performed on an item X, a new version of X is created with its own read_TS and write_TS, while the previous versions are retained.
Step 3 of 4
2. Multiversion two-phase locking using certify locks:
In this technique, there are three locking modes for each item. These three locking modes are as follows:
• Read
• Write
• Certify
So the state of a locked item is one of these three lock modes (or unlocked).
• In the standard locking scheme, if a transaction holds a write lock on an item, then no other transaction is allowed to access it. Here, by contrast, other transactions T' are allowed to read an item X while a single transaction T holds a write lock on X.
• For this purpose, two versions of X are maintained. When a transaction is to commit, it must obtain a certify lock on every item it has written.
Step 4 of 4
Certify Lock:
A certify lock is acquired only when all the updated values are to be finalized so that the item reaches a stable state. It is comparable to a commit, at which point all the updates performed by a successful transaction need to be made permanent.
• When the transaction is completed and is ready to commit, a certify lock is obtained on each item it has written, giving the transaction exclusive access to those items.
• The updating of the data item can then be completed securely, and the new committed version is installed free from any interference.
• While an item is certify-locked, no other transaction can access it at all, not even for reading.
Advantage: reads can proceed concurrently with a single writer, so certify locks allow more concurrency than ordinary two-phase locking. Disadvantage: a committing transaction may be delayed until the certify locks can be acquired, and deadlocks can arise.
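The three lock modes can be summarized as a compatibility table. This sketch follows the usual multiversion 2PL compatibilities (read coexists with read and write; certify coexists with nothing); the names and table layout are my own.

```python
# Lock compatibility for multiversion 2PL (read / write / certify).
# COMPATIBLE[(requested, held)] is True if a new request in `requested`
# mode can be granted while another transaction holds a `held` lock.
COMPATIBLE = {
    ("read",    "read"):    True,
    ("read",    "write"):   True,   # readers still see the committed version
    ("read",    "certify"): False,
    ("write",   "read"):    True,
    ("write",   "write"):   False,  # only one uncommitted version at a time
    ("write",   "certify"): False,
    ("certify", "read"):    False,  # commit must wait for all readers
    ("certify", "write"):   False,
    ("certify", "certify"): False,
}

def can_grant(requested, held_modes):
    """Grant `requested` only if it is compatible with every lock already held."""
    return all(COMPATIBLE[(requested, h)] for h in held_modes)
```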
Chapter 21, Problem 10RQ
Problem
How do optimistic concurrency control techniques differ from other concurrency control
techniques? Why are they also called validation or certification techniques? Discuss the typical
phases of an optimistic concurrency control method.
Step-by-step solution
Step 1 of 2
In all concurrency control techniques, a certain degree of checking is done before a database operation can be executed. For example, in locking, a check is made to determine whether the item being accessed is locked. In timestamp ordering, the transaction timestamp is checked against the read and write timestamps of the item. Such checking represents overhead during transaction execution. In optimistic concurrency control techniques, also known as validation or certification techniques, no checking is done while the transaction is executing. In one of the validation schemes, updates in the transaction are not applied directly to the database items until the transaction reaches its end. During transaction execution, all updates are made to local copies of the data items that are kept for the transaction. At the end of transaction execution, a validation phase checks whether any of the transaction's updates violate serializability. Certain information needed by the validation phase must be kept in the system. If serializability is not violated, the transaction is committed and the database is updated from the local copies; otherwise, the transaction is aborted and restarted later.
Step 2 of 2
1.) Read phase: A transaction can read values of committed data items from the database.
However, updates are applied only to local copies of the data items kept in the transaction
workspace.
2.) Validation phase: Checking is performed to ensure that serializability will not be violated if
the transaction updates are applied to the database.
3.) Write phase: If the validation phase is successful, the transaction updates are applied to the database; otherwise, the updates are discarded and the transaction is restarted.
The idea behind optimistic concurrency control is to do all the checks at once; hence, transaction execution proceeds with minimal overhead until the validation phase is reached. Since the validation phase decides whether the transaction can be committed or must be aborted, the method is also called a validation or certification technique.
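The three phases can be sketched end to end. This is an illustrative simplification (names are my own, and the validation step uses a simple backward check against the write-sets of transactions that committed after this one started, which is one of several possible validation tests, not necessarily the book's exact condition):

```python
class OptimisticTxn:
    """Optimistic (validation-based) concurrency control, simplified."""

    def __init__(self, db, committed_log):
        self.db = db                      # shared dict: item -> value
        self.log = committed_log          # write-sets of committed transactions
        self.start = len(committed_log)   # how many commits we had seen at start
        self.read_set = set()
        self.local = {}                   # workspace: our uncommitted writes

    def read(self, x):
        # Read phase: read committed values, see our own local writes.
        self.read_set.add(x)
        return self.local.get(x, self.db.get(x))

    def write(self, x, value):
        self.local[x] = value             # applied only to the local copy

    def commit(self):
        # Validation phase: fail if a concurrently committed transaction
        # wrote anything we read.
        for write_set in self.log[self.start:]:
            if write_set & self.read_set:
                return "abort"
        # Write phase: apply the workspace to the database.
        self.db.update(self.local)
        self.log.append(set(self.local))
        return "commit"
```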
Comment
Chapter 21, Problem 11RQ
Problem
What is snapshot isolation? What are the advantages and disadvantages of concurrency control
methods that are based on snapshot isolation?
Step-by-step solution
Step 1 of 1
Snapshot isolation:
Snapshot isolation is used as the basis of concurrency control protocols in some commercial DBMSs. Under snapshot isolation, a transaction reads data items based on the committed values of the items in the database snapshot as of the time the transaction starts.
• Because a statement or transaction sees only the records that were committed in the database when the transaction started, snapshot isolation ensures that the phantom record problem does not arise.
• Snapshot isolation also ensures that the problems of nonrepeatable read and dirty read do not occur during transaction execution.
• Concurrency control methods based on snapshot isolation have reduced overhead compared with two-phase locking, since there is no need to apply read locks to items in read operations.
• However, nonserializable schedules can occur under snapshot isolation: a few anomalies, such as the write-skew anomaly and the read-only transaction anomaly, can violate serializability.
Chapter 21, Problem 12RQ
Problem
How does the granularity of data items affect the performance of concurrency control? What
factors affect selection of granularity size for data items?
Step-by-step solution
Step 1 of 3
The size of the data items that can be locked is often referred to as the data item granularity. A smaller data item size is called fine granularity; a larger size is called coarse granularity.
Step 2 of 3
1.) First notice that the larger the data item size is, the lower the degree of concurrency
permitted. For example, if the data item size is a disk block, a transaction T that needs to lock a
record B must lock the whole disk block X that contains B because a lock is associated with the
whole data item (block). Now, if another transaction S wants to lock a different record C that
happens to reside in the same block X in a conflicting lock mode, it is forced to wait. If the data
item size were a single record, transaction S would be able to proceed, because it would be locking
a different data item (record).
2.) The smaller the data item size is, the more the number of items in the database. Because
every item is associated with a lock, the system will have a larger number of active locks to be
handled by lock manager. More lock and unlock operations will be performed, causing a higher
overhead. In addition, more storage space will be required for the lock table. For timestamps, storage is required for the read_TS and write_TS of each item, and there will be similar overhead for handling a large number of items.
Step 3 of 3
The best item size depends on the types of transactions involved. If a typical transaction accesses a small number of records, it is advantageous to have the data item granularity be one record. On the other hand, if a transaction typically accesses many records in the same file, it may be better to have block or file granularity so that the transaction will consider all those records as one data item.
Chapter 21, Problem 13RQ
Problem
Step-by-step solution
Step 1 of 2
When an insert or delete operation creates or removes an item in the database, the new item cannot be accessed until the item is created and the insert operation is completed. Two locking techniques handle such operations:
(1) two-phase locking and (2) index locking. Using two-phase locking, a delete operation may be performed only if the transaction deleting the tuple holds an exclusive lock on the tuple to be deleted.
Step 2 of 2
A transaction that inserts a new tuple into the database is automatically given an exclusive lock on the inserted tuple.
Consider a transaction that scans a relation and another transaction that inserts a tuple into that relation. If only tuple locks are used, nonserializable schedules can result: the transaction scanning the relation reads the information that indicates which tuples the relation contains, while the transaction inserting a tuple updates that same information. That information should therefore itself be treated as a data item, and transactions inserting or deleting a tuple must acquire an exclusive lock on it.
Index locking protocols provide higher concurrency while preventing the phantom problem by requiring locks on certain index buckets.
Chapter 21, Problem 14RQ
Problem
Step-by-step solution
Step 1 of 1
Multiple granularity locking allows locks to be set on objects that contain other objects, exploiting the hierarchical nature of the "contains" relationship among database objects.
It makes locking decisions work correctly for all transactions even though data containers are nested, and it is used where the appropriate granularity level differs for various mixes of transactions.
• Multiple granularity locking improves concurrency control performance while ensuring correctness and efficiency.
• To implement multiple granularity locking, some extra types of locks, termed intention locks, are required.
Chapter 21, Problem 15RQ
Problem
Step-by-step solution
Step 1 of 2
Intention locks:
To make locking at multiple granularity levels practical, additional types of locks, called intention locks, are needed.
The main idea behind intention locks is for a transaction to indicate, along the path from the root to the desired node, which type of lock it will require later on one of the node's descendants.
(The transaction does not lock the object itself, but declares an intention to lock part of the object.) There are three types of intention locks.
Step 2 of 2
Intention-shared (IS) indicates that a shared lock (S) will be requested on some descendant node(s).
Intention-exclusive (IX) indicates that an exclusive lock (X) will be requested on some descendant node(s).
Shared-intention-exclusive (SIX) indicates that the current node is locked in shared mode but an exclusive lock (X) will be requested on some descendant node(s).
In addition:
(1) Before a transaction can acquire an S lock on a given row, it must first acquire an IS or stronger lock on the table containing the row.
(2) Before a transaction can acquire an X lock on a given row, it must first acquire an IX lock on the table containing that row.
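The five modes (IS, IX, S, SIX, X) form the standard multiple-granularity compatibility matrix, which can be written down directly (this table is the usual MGL scheme; the Python layout is my own):

```python
# Compatibility matrix for multiple granularity locking.
# COMPAT[requested][held] is True if a lock in mode `requested` can be
# granted on a node while another transaction holds a `held` lock on it.
COMPAT = {
    "IS":  {"IS": True,  "IX": True,  "S": True,  "SIX": True,  "X": False},
    "IX":  {"IS": True,  "IX": True,  "S": False, "SIX": False, "X": False},
    "S":   {"IS": True,  "IX": False, "S": True,  "SIX": False, "X": False},
    "SIX": {"IS": True,  "IX": False, "S": False, "SIX": False, "X": False},
    "X":   {"IS": False, "IX": False, "S": False, "SIX": False, "X": False},
}

def compatible(requested, held):
    """True if `requested` can be granted while `held` is held by another txn."""
    return COMPAT[requested][held]
```

Note that IS and IX are compatible with each other, which is the point of Problem 28E below: intentions alone do not conflict; the real conflict check happens at the descendant nodes.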
Chapter 21, Problem 16RQ
Problem
Step-by-step solution
Step 1 of 1
Latches are used to guarantee the physical integrity of a page when that page is being written from the buffer to disk.
A latch is acquired for the page, the page is written to disk, and the latch is then released.
Locks held for such a short duration are called latches; latches do not follow the usual concurrency control protocol such as two-phase locking.
Chapter 21, Problem 17RQ
Problem
What is a phantom record? Discuss the problem that a phantom record can cause for
concurrency control.
Step-by-step solution
Step 1 of 2
Phantom record:
A phantom record is a new record, inserted by some transaction T, that satisfies a condition on which a set of records accessed by another transaction T' is based. If T' comes before T in the equivalent serial order, the new record should not be seen by T'. The two transactions logically conflict, yet in that case there is really no record in common between them, since T' may have locked all the records it needs before T inserted the new record.
Step 2 of 2
For example, suppose transaction T' is accessing all EMPLOYEE records whose Dno = 5 (say, to sum their salaries) while transaction T inserts a new EMPLOYEE record with Dno = 5. If the equivalent serial order is T followed by T', then T' must read the new EMPLOYEE record and include its salary in the sum calculation; if the serial order is T' followed by T, the new salary should not be included. In the latter case there is really no record in common between the two transactions, since T' may have locked all the records with Dno = 5 before T inserted the new record. The record that causes the conflict is a phantom record: it suddenly appeared in the database upon being inserted.
If the other operations in the two transactions do not conflict, the conflict due to the phantom record may not be recognized by the concurrency control protocol.
Chapter 21, Problem 18RQ
Problem
Step-by-step solution
Step 1 of 2
Index locking:
An index includes entries that have an attribute value, plus a set of pointers to all records in the file with that value. If the index entry is locked before the record itself can be accessed, then the conflict over the phantom record can be detected, because transaction T' would request a read lock on the index entry and transaction T would request a write lock on the same entry before either could place locks on the actual records.
Since the index locks conflict, the phantom conflict would be detected.
Step 2 of 2
Example:
An index on Dno of EMPLOYEE would include an entry for each distinct Dno value, plus a set of pointers to all EMPLOYEE records with that value.
If the index entry is locked before the records themselves can be accessed, then the conflict over the phantom record can be detected: transaction T' would request a read lock on the index entry for Dno = 5, and transaction T would request a write lock on the same entry, before they could place the locks on the actual records.
Since the index locks conflict, the phantom conflict would be detected.
Chapter 21, Problem 19RQ
Problem
Step-by-step solution
Step 1 of 1
Predicate lock:
A predicate lock locks all records that satisfy an arbitrary logical predicate. Index locking is a special case of predicate locking for which an index supports efficient implementation of the predicate lock.
In general, predicate locking has a great deal of locking overhead and is too expensive to use in practice.
Chapter 21, Problem 20E
Problem
Prove that the basic two-phase locking protocol guarantees conflict serializability of schedules.
(Hint: Show that if a serializability graph for a schedule has a cycle, then at least one of the
transactions participating in the schedule does not obey the two-phase lockingprotocol.)
Step-by-step solution
Step 1 of 1
This proof is by contradiction; assume binary locks for simplicity.
Let there be n transactions T1, T2, ..., Tn such that they all obey the basic two-phase locking rule: no transaction has an unlock operation followed by a lock operation. Suppose that a non-(conflict-)serializable schedule S for T1, T2, ..., Tn does occur; then the precedence (serialization) graph for S must have a cycle. Hence, there must be some sequence within the schedule of the form:
S: ...; [o1(X); ...; o2(X);] ...; [o2(Y); ...; o3(Y);] ...; [on(Z); ...; o1(Z);] ...
where each pair of operations between square brackets [o, o] is conflicting (either [w, w], [w, r], or [r, w]), in order to create an arc in the serialization graph. This implies that in transaction T1, a sequence of the following form occurs:
T1: ...; o1(X); ...; o1(Z); ...
Furthermore, T1 has to unlock item X (so T2 can lock it before applying o2(X) to follow
the rules of locking) and has to lock item Z (before applying o1(Z), but this must occur
after Tn has unlocked it). Hence, a sequence in T1 of the following form occurs:
T1: ...; o1(X); ...; unlock(X); ... ; lock(Z); ...; o1(Z); ...
This implies that T1 does not obey the two-phase locking protocol (since lock(Z) follows
unlock(X)), contradicting our assumption that all transactions in S follow the two-phase
locking protocol.
Chapter 21, Problem 21E
Problem
Modify the data structures for multiple-mode locks and the algorithms for read_lock(X),
write_lock(X), and unlock(X) so that upgrading and downgrading of locks are possible. (Hint: The
lock needs to check the transaction id(s) that hold the lock, if any.)
Step-by-step solution
Step 1 of 1
A list of the transaction ids that have read-locked an item is maintained, as well as the (single) transaction id that has write-locked an item. Only read_lock and write_lock are shown below; unlock is modified analogously to remove the requesting transaction id from the appropriate list.
read_lock(X, Tn):
B: if lock(X) = "unlocked"
   then begin lock(X) <- "read-locked";
        no_of_reads(X) <- 1;
        read_locking_transactions(X) <- {Tn}
   end
   else if lock(X) = "read-locked"
   then begin (* share the lock *)
        no_of_reads(X) <- no_of_reads(X) + 1;
        add Tn to read_locking_transactions(X)
   end
   else if lock(X) = "write-locked" and write_locking_transaction(X) = Tn
   then begin (* downgrade: Tn already holds the exclusive lock *)
        lock(X) <- "read-locked";
        no_of_reads(X) <- 1;
        read_locking_transactions(X) <- {Tn}
   end
   else begin
        wait (until lock(X) = "unlocked" and the lock manager wakes up the transaction);
        goto B;
   end;
write_lock(X, Tn):
B: if lock(X) = "unlocked"
   then begin lock(X) <- "write-locked"; write_locking_transaction(X) <- Tn end
   else if lock(X) = "read-locked" and read_locking_transactions(X) = {Tn}
   then begin (* upgrade: Tn is the only transaction holding the read lock *)
        lock(X) <- "write-locked"; write_locking_transaction(X) <- Tn
   end
   else begin
        wait (until lock(X) = "unlocked" and the lock manager wakes up the transaction);
        goto B;
   end;
Chapter 21, Problem 22E
Problem
Step-by-step solution
Step 1 of 1
Strict two-phase locking guarantees strict schedules, since no other transaction can read or write an item written by a transaction T until T has committed; thus the condition for a strict schedule is satisfied.
Chapter 21, Problem 23E
Problem
Prove that the wait-die and wound-wait protocols avoid deadlock and starvation.
Step-by-step solution
Step 1 of 2
Two schemes that prevent deadlocks are called wait-die and wound-wait. Suppose that transaction Ti tries to lock an item X but is not able to because X is locked by some other transaction Tj with a conflicting lock. The rules followed by these schemes are as follows:
• Wait-die: If TS(Ti) < TS(Tj), then (Ti older than Tj) Ti is allowed to wait; otherwise (Ti younger than Tj) abort Ti (Ti dies) and restart it later with the same timestamp.
• Wound-wait: If TS(Ti) < TS(Tj), then (Ti older than Tj) abort Tj (Ti wounds Tj) and restart it later with the same timestamp; otherwise (Ti younger than Tj) Ti is allowed to wait.
Step 2 of 2
In wait-die, a transaction can wait only for a younger transaction; in wound-wait, a transaction can wait only for an older one. In both schemes, all waiting therefore goes in a single direction with respect to transaction timestamps, so no cycle can form in the wait-for graph, and deadlock cannot occur. Starvation is also avoided: an aborted transaction is restarted with its original timestamp, so it becomes ever older relative to newly arriving transactions and must eventually be the oldest active transaction, at which point neither scheme will abort it.
Chapter 21, Problem 24E
Problem
Step-by-step solution
Step 1 of 1
In cautious waiting, a transaction Ti can wait on a transaction Tj (and hence Ti becomes blocked) only if Tj is not blocked at that time, say time b(Ti), when Ti waits.
Later, at some time b(Tj) > b(Ti), Tj may itself become blocked and wait on another transaction Tk, but Tj can never wait on an already blocked transaction, since this is not allowed by the protocol. Hence, whenever Ti waits on Tj we have b(Ti) < b(Tj): the wait-for graph among the blocked transactions follows the blocking times. A cycle in this graph would require b(Ti) < b(Ti) for some transaction, which is impossible, so deadlock cannot occur.
Chapter 21, Problem 27E
Problem
Why is two-phase locking not used as a concurrency control method for indexes such as B+-
trees?
Step-by-step solution
Step 1 of 1
Two-phase locking can also be applied to indexes such as B+-trees, where the nodes of an index
correspond to disk pages. However, holding locks on index pages until the shrinking phase of
2PL could cause an undue amount of transaction blocking because searching an index always
starts at the root. Therefore, if a transaction wants to insert a record (write operation), the root
would be locked in exclusive mode, so all other conflicting lock requests for the index must wait
until the transaction enters the shrinking phase. This blocks all other transactions from accessing
the index, so in practice other approaches to locking an index must be used.
Chapter 21, Problem 28E
Problem
The compatibility matrix in Figure 21.8 shows that IS and IX locks are compatible. Explain why
this is valid.
Step-by-step solution
Step 1 of 1
IS and IX are compatible. When transaction T holds an IS lock on a node and T' requests an IX lock, T intends only shared locks on some descendants, while T' intends an exclusive lock on a descendant node that might well be different from any node on which T is working; any real conflict will be detected at the descendant level, where the actual S and X locks are requested.
Similarly, if T' holds IX and T requests an IS lock, T intends to access a descendant node that may be different from any accessed by T'. Hence the two operations are compatible.
Chapter 21, Problem 29E
Problem
The MGL protocol states that a transaction T can unlock a node N, only if none of the children of
node N are still locked by transaction T. Show that without this condition, the MGL protocol would
be incorrect.
Step-by-step solution
Step 1 of 2
The rule that a node can be unlocked only when none of its children are still locked by transaction T enforces the two-phase locking behavior needed to produce serializable schedules. If this rule is not followed, a schedule may fail to be serializable; a nonserializable schedule will not produce correct results, and thus the protocol fails.
Step 2 of 2
This rule ensures serializability by governing the order in which a transaction T locks, manipulates, and unlocks data items. Suppose transaction T wants to insert data into a leaf node, and suppose the root node is unlocked before the data is inserted and the leaf node unlocked. Now consider the situation in which the leaf node is full: the insertion calls for a split that must propagate upward, but since the root has been unlocked and might now be locked by another transaction T', the operation cannot proceed. Hence, without the rule, the protocol fails.
Chapter 22, Problem 1RQ
Problem
Discuss the different types of transaction failures. What is meant by catastrophic failure?
Step-by-step solution
Step 1 of 1
Types of failures:
Computer failure:
A hardware or main memory failure occurs during transaction execution; anything that was not committed to disk is lost, and the system must be restarted.
Transaction error:
Divide by zero or integer overflow may cause the transaction to fail; this kind of failure may also occur because of erroneous parameter values or because of a logical programming error.
Local errors or exception conditions:
Errors or exception conditions are detected by the transaction itself; the transaction halts and cancels everything it has done so far because some condition prevents it from proceeding.
Disk failure:
Some disk blocks may lose their data because of a read or write malfunction or because of a read/write head crash.
Catastrophic failure:
This includes the many forms of physical misfortune that can befall the database server (fire, theft, flooding, and so on); the list of such problems is endless.
Chapter 22, Problem 2RQ
Problem
Discuss the actions taken by the read_item and write_item operations on a database.
Step-by-step solution
Step 1 of 1
In a database, the read_item and write_item operations take the following actions.
Actions taken by the read_item operation (assume the read operation is performed on data item X):
1. Copy the disk block that contains item X into a buffer in main memory, if that block is not already in some main memory buffer.
2. Copy item X from the buffer to the program variable named X.
Actions taken by the write_item operation (assume the write operation is performed on data item X):
1. Copy the disk block that contains item X into a buffer in main memory, if that block is not already in some main memory buffer.
2. Copy item X from the program variable named X into its correct location in the buffer.
3. Store the updated block from the buffer back to disk (either immediately or at some later point in time).
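These steps can be sketched with a toy buffer pool. This is an illustrative class (names and the block-lookup helper are assumptions); the key behavior it models is that write_item changes only the buffer, and the disk copy changes only when the block is flushed.

```python
class BufferedDB:
    """Sketch of read_item / write_item through a main-memory buffer pool."""

    def __init__(self, disk):
        self.disk = disk       # block_id -> {item: value}, the 'disk' copy
        self.buffers = {}      # blocks currently cached in main memory

    def _block_of(self, x):
        # Illustrative: assume a catalog tells us which block holds item x.
        return next(b for b, blk in self.disk.items() if x in blk)

    def read_item(self, x):
        b = self._block_of(x)
        if b not in self.buffers:            # copy the block into a buffer
            self.buffers[b] = dict(self.disk[b])
        return self.buffers[b][x]            # copy item to the program variable

    def write_item(self, x, value):
        b = self._block_of(x)
        if b not in self.buffers:
            self.buffers[b] = dict(self.disk[b])
        self.buffers[b][x] = value           # update only the buffer

    def flush(self, b):
        self.disk[b] = dict(self.buffers[b]) # store the block back to disk
```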
Chapter 22, Problem 3RQ
Problem
What is the system log used for? What are the typical kinds of entries in a system log? What are
checkpoints, and why are they important? What are transaction commit points, and why are they
important?
Step-by-step solution
Step 1 of 4
System log: Recovery from transaction failures usually means that the database is restored to
the most recent consistent state just before the time of failure. To do this, the system must keep
information about changes that were applied to data items by various transactions. This
information is typically kept in the system log. Thus system logs help in data recovery in case of
failures.
Step 2 of 4
1.) If there is extensive damage to a wide portion of the database due to catastrophic failure,
such as a disk crash, the recovery method restores a past copy of the database that was backed
up to archival storage and reconstructs a more current state by reapplying or redoing the
operations of committed transactions from the backed up log, up to the time of failure.
2.) When the database is not physically damaged, but has become inconsistent due to non-
catastrophic failures the strategy is to reverse any changes that caused inconsistency by undoing
some operations. It may also be necessary to re-do some operations in order to restore a
consistent state of database. In this case, we do not need a complete archival copy of the
database. Rather, the entries kept in the online system log are consulted during recovery.
The typical kinds of entries in a system log are:
1.) [start_transaction, T]: transaction T has started execution.
2.) [write_item, T, X, old_value, new_value]: transaction T has changed the value of database item X from old_value to new_value.
3.) [read_item, T, X]: transaction T has read the value of database item X (used for checking accesses to the database).
4.) [commit, T]: transaction T has completed successfully.
5.) [abort, T]: transaction T has been aborted.
A [checkpoint] record is also written into the log periodically.
Step 3 of 4
Checkpoint: This is a type of entry in the system log. A [checkpoint] record is written into the log
periodically at that point when the system writes out to the database on disk all DBMS buffers
that have been modified. As a consequence, all transactions that have their [commit, T] entries in the log before a [checkpoint] entry do not need to have their WRITE operations redone in case of a system crash, since all their updates will be recorded in the database on disk during checkpointing. A checkpoint record may also include additional information, such as a list of
active transaction ids, and the locations of the first and the most recent records in the log for
active transaction. This can facilitate undoing transaction operations in the event that a
transaction must be rolled back.
Step 4 of 4
Commit Point: A transaction reaches its commit point when all of its operations that access the database have executed successfully and the effect of all its operations has been recorded in the log; beyond this point the transaction is committed and cannot be rolled back.
1.) A transaction cannot change the database on disk until it reaches its commit point.
2.) A transaction does not reach its commit point until all its update operations are recorded in the log and the log is force-written to disk.
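How the log, checkpoints, and commit points work together at recovery time can be sketched as a redo pass. This is a deliberately minimal illustration (record shapes and the function name are my own, not the book's exact algorithm): only transactions with a [commit] entry are redone, and only from the last checkpoint onward.

```python
def redo_after_crash(log, db):
    """Redo the writes of committed transactions recorded after the last checkpoint."""
    # Find the position of the most recent checkpoint record in the log.
    last_cp = max((i for i, r in enumerate(log) if r[0] == "checkpoint"),
                  default=-1)
    committed = {r[1] for r in log if r[0] == "commit"}
    # Writes before the checkpoint are already on disk; redo the rest,
    # but only for transactions that reached their commit point.
    for rec in log[last_cp + 1:]:
        if rec[0] == "write_item" and rec[1] in committed:
            _, tid, item, old_value, new_value = rec
            db[item] = new_value
    return db
```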
Chapter 22, Problem 4RQ
Problem
How are buffering and caching techniques used by the recovery subsystem?
Step-by-step solution
Step 1 of 1
The recovery process is often closely intertwined with operating system functions. In general, one or more disk pages that include the data items to be updated are cached into main memory buffers and then updated in memory before being written back to disk.
As the performance gap between disk and CPU increases, disk I/O has become a major performance bottleneck for data-intensive applications; disk I/O latency, in particular, is much more difficult to improve than disk bandwidth.
Buffering and caching in main memory are therefore used extensively to bridge this performance gap, and the recovery subsystem controls when updated buffers may be flushed back to disk.
Chapter 22, Problem 5RQ
Problem
What are the before image (BFIM) and after image (AFIM) of a data item? What is the difference
between in-place updating and shadowing, with respect to their handling of BFIM and AFIM?
Step-by-step solution
Step 1 of 3
The old value of the data item before updating is called the before image (BFIM).
The new value of the data item after updating is called the after image (AFIM).
Step 2 of 3
When a modified buffer is flushed back to disk, one of two strategies can be followed:
In-place updating
Shadowing
Step 3 of 3
In-place updating writes the buffer back to the same original disk location, overwriting the old value (BFIM) of any changed data items on disk.
Shadowing:
The updated buffer is written at a different disk location, so both the BFIM and the AFIM are kept on disk; hence it is not strictly necessary to maintain a log for recovery.
Chapter 22, Problem 6RQ
Problem
Step-by-step solution
Step 1 of 2
In database recovery techniques, recovery is achieved by performing only UNDOs, only REDOs, or a combination of the two.
The log entry information recorded for a write command is what the UNDO and REDO operations need.
UNDO-type log entries include the old value (BFIM) of the item in the database before the write operation was executed.
This type of entry is used to restore all BFIMs on disk, that is, to remove all AFIMs.
Step 2 of 2
REDO-type entries include the new value (AFIM) of the item in the database after the write operation was
executed.
These entries are useful to restore all AFIMs to disk.
Comment
Chapter 22, Problem 7RQ
Problem
Step-by-step solution
Step 1 of 1
When in-place updating (immediate or deferred) is used, a log is necessary for
recovery, and in this case it must be available to the recovery manager.
This is enforced by the write-ahead logging (WAL) protocol:
The BFIM of the data item is recorded in the appropriate log entry, and that log entry is flushed to
disk before the BFIM is overwritten with the AFIM in the database on disk.
In other words, before a data item's AFIM is flushed to the database disk, its BFIM must be written to the log, and
the log must be saved on stable storage (the log disk).
Similarly, before a transaction commits, all its AFIMs must be written to the log and the log must be saved on stable storage.
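The WAL rule above can be sketched in a few lines of Python; the names and record format are illustrative only:

```python
# Sketch of the write-ahead logging rule: the log entry holding the
# BFIM must reach stable storage before the AFIM overwrites it on disk.
# Record format (op, item, BFIM, AFIM) is an assumption for this sketch.

stable_log = []               # log records already forced to the log disk
database_disk = {"A": 10}

def wal_write(item, new_value):
    bfim = database_disk[item]
    # Rule: append (and force) the UNDO entry holding the BFIM first...
    stable_log.append(("write_item", item, bfim, new_value))
    # ...only then may the AFIM overwrite the BFIM on the database disk.
    database_disk[item] = new_value

wal_write("A", 20)
```

Because the log record is forced before the overwrite, a crash at any point leaves enough information on stable storage to restore the BFIM.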
Comment
Chapter 22, Problem 8RQ
Problem
Identify three typical lists of transactions that are maintained by the recovery subsystem.
Step-by-step solution
Step 1 of 1
For efficient recovery, the DBMS recovery subsystem may need to
maintain a number of lists of transactions. The three typical lists are:
A list of active transactions, which have started but not committed.
A list of committed transactions since the last checkpoint.
A list of aborted transactions since the last checkpoint.
Comment
Chapter 22, Problem 9RQ
Problem
What is meant by transaction rollback? What is meant by cascading rollback? Why do practical
recovery methods use protocols that do not permit cascading rollback? Which recovery
techniques do not require any rollback?
Step-by-step solution
Step 1 of 4
Transaction rollback means that if a transaction fails after some of its disk writes, those writes must
be undone.
That is,
database recovery is achieved either by performing only UNDOs, only REDOs, or a
combination of the two.
Comment
Step 2 of 4
Cascading rollback occurs when the failure and rollback of one transaction requires the rollback of
other transactions that read values written by it.
Meanwhile, any values derived from the rolled-back values must also be undone.
Comment
Step 3 of 4
Practical recovery methods use protocols that do not permit cascading rollback because it is
complex and time-consuming.
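Why undoing one transaction can cascade is easy to see in a small sketch; the reads-from relation below is invented for illustration:

```python
# Hypothetical sketch of why cascading rollback is expensive: undoing
# one failed transaction forces the rollback of every transaction that
# read a value it wrote, transitively.

# reads_from[t] = set of transactions whose written values t has read
reads_from = {"T2": {"T1"}, "T3": {"T2"}, "T4": set()}

def cascade(failed, reads_from):
    to_undo = {failed}
    changed = True
    while changed:                      # follow the reads-from edges
        changed = False
        for t, sources in reads_from.items():
            if t not in to_undo and sources & to_undo:
                to_undo.add(t)
                changed = True
    return to_undo

rolled_back = cascade("T1", reads_from)   # T2 read from T1, T3 from T2
```

Protocols that prevent reading uncommitted values keep every `reads_from` set empty of active transactions, so the cascade never grows past the failed transaction itself.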
Comment
Step 4 of 4
The deferred update (NO-UNDO/REDO) recovery technique does not require any rollback.
Comment
Chapter 22, Problem 10RQ
Problem
Discuss the UNDO and REDO operations and the recovery techniques that use each.
Step-by-step solution
Step 1 of 1
To describe a protocol for write-ahead logging, we must distinguish between two
types of log entry information included for a write:
UNDO
REDO
An UNDO-type log entry includes the old value (BFIM) of the item, since this is needed to undo
the effect of the operation from the log.
A REDO-type entry includes the new value (AFIM) of the item written by the operation, since
this is needed to redo the effect of the operation from the log.
In the UNDO/REDO algorithm, both types of log entries are combined. In addition, cascading rollback
is possible when the read_item entries in the log are considered to be UNDO-type entries.
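A minimal sketch of UNDO- and REDO-type entries follows; the record format and names are assumptions, not from a real system:

```python
# Sketch of UNDO- and REDO-type log entries. Each write entry carries
# the BFIM (for UNDO) and the AFIM (for REDO); the tuple layout
# (op, txn, item, BFIM, AFIM) is invented for this example.

log = [("write_item", "T1", "A", 5, 9)]
db = {"A": 9}

def undo(db, entry):
    _, _, item, bfim, _ = entry
    db[item] = bfim                       # restore the old value

def redo(db, entry):
    _, _, item, _, afim = entry
    db[item] = afim                       # re-apply the new value

undo(db, log[0])       # db["A"] back to 5
redo(db, log[0])       # db["A"] forward to 9 again
```

Note that both operations are idempotent: applying either of them twice has the same effect as applying it once, which recovery relies on when a crash interrupts recovery itself.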
Comment
Chapter 22, Problem 11RQ
Problem
Discuss the deferred update technique of recovery. What are the advantages and disadvantages
of this technique? Why is it called the NO-UNDO/REDO method?
Step-by-step solution
Step 1 of 5
The main idea of this technique is to defer or postpone any actual updates to the database
until the transaction completes its execution successfully and reaches its commit point.
With this technique, the updates are recorded only in the log and in the cache buffers.
After the transaction reaches its commit point, the log is force-written to disk and the updates
are recorded in the database.
A transaction may not commit until all of its write operations are successfully recorded in the log.
This means that the system must check that the log is actually written to disk.
Example:-
Comment
Step 2 of 5
Log file:
Start
write
commit
check point
start
write
write
commit
start
write
start
write
Comment
Step 3 of 5
However, the transactions that had not committed at the time of the crash did not have their changes written to disk.
Comment
Step 4 of 5
Advantages:
Recovery is made easier: any transaction that reached its commit point (according to the log) simply has its writes applied to the database (REDO).
Cascading rollback does not occur, because no other transaction sees the work of an uncommitted transaction.
Disadvantages:
Concurrency is limited, because a transaction's updates are not applied to the database until it reaches its commit point.
Comment
Step 5 of 5
It is called the NO-UNDO/REDO recovery method because the second step of this protocol (a transaction does not
reach its commit point until all its update operations are recorded in the log and the log is force-written
to disk) is a restatement of the write-ahead logging (WAL) protocol, and
because the database is never updated on disk until after the transaction commits, there is
never a need to UNDO any operations.
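A hedged sketch of the NO-UNDO/REDO idea, with made-up transactions and a simplified commit step:

```python
# Sketch of deferred update (NO-UNDO/REDO): writes are held in the log
# and local buffers until commit, so the database disk never contains
# uncommitted values and UNDO is never needed. Names are illustrative.

log = []                       # force-written to disk at commit point
database_disk = {"A": 1, "B": 2}

def run(txn, writes, commits):
    pending = []
    for item, value in writes:
        pending.append((txn, item, value))   # recorded in the log only
    if commits:
        log.extend(pending)                  # force-write the log, then...
        for _, item, value in pending:
            database_disk[item] = value      # ...apply to the database
    # if the transaction fails, pending is simply discarded: NO-UNDO

run("T1", [("A", 10)], commits=True)
run("T2", [("B", 99)], commits=False)        # crash before commit
```

T2's write to B never reaches the database disk, so recovery only has to REDO the logged writes of committed transactions such as T1.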
Comment
Chapter 22, Problem 12RQ
Problem
How can recovery handle transaction operations that do not affect the database, such as the
printing of reports by a transaction?
Step-by-step solution
Step 1 of 1
If a transaction has actions that do not affect the database, such as generating and printing
messages or reports from information retrieved from the database, and it fails before completion,
we may not want the user to get these reports, since the transaction has failed to complete. If such
erroneous reports are produced, part of the recovery process would have to inform the user that
these reports are wrong, since the user may take an action based on them that affects
the database. Hence, such reports must be generated only after the transaction reaches its
commit point. A common method of dealing with such actions is to issue the commands that
generate the reports but keep them as batch jobs, which are executed only after the transaction
reaches its commit point. If the transaction fails, the batch jobs are canceled.
Comment
Chapter 22, Problem 13RQ
Problem
Discuss the immediate update recovery technique in both single-user and multiuser
environments. What are the advantages and disadvantages of immediate update?
Step-by-step solution
Step 1 of 2
Immediate update applies the write operations to the database as the transaction is executing:
when the transaction issues an update command, the database can be updated without
any need to wait for the transaction to reach its commit point. However, the update operation must still
be recorded in the log before it is applied to the database, following the write-ahead logging protocol, which maintains two kinds of log entries:
(1) REDO log entries: a record of each updated data item's new value.
(2) UNDO log entries: a record of each updated data item's old value.
And:
(1) Transaction T may not update the database until all its UNDO entries have been written to the
log.
(2) Transaction T is not allowed to commit until all its REDO and UNDO log entries are written.
Comment
Step 2 of 2
Advantages:-
Immediate update allows higher concurrency, because transactions write continuously to the
database rather than waiting until the commit point.
Disadvantages:-
It can lead to cascading rollbacks, which are time-consuming and may be problematic.
Comment
Chapter 22, Problem 14RQ
Problem
What is the difference between the UNDO/REDO and the UNDO/NO-REDO algorithms for
recovery with immediate update? Develop the outline for an UNDO/NO-REDO algorithm.
Step-by-step solution
Step 1 of 3
Recovery techniques based on immediate update can be used in single-user and multiuser
environments.
In the UNDO/REDO category, the recovery scheme applies both undo and redo operations to recover the database
from failure.
Comment
Step 2 of 3
In the UNDO/REDO algorithm, the AFIMs of a transaction may be flushed to the database disk before the transaction commits.
For this reason, the recovery manager undoes all uncommitted transactions during recovery.
In addition, it is possible that a transaction had completed execution and was ready to commit but its updates were not yet reflected on disk, so the write operations of committed transactions must be redone.
Comment
Step 3 of 3
In the UNDO/NO-REDO algorithm, all AFIMs of a transaction are flushed to the database disk, under WAL,
before it commits.
For this reason, the recovery manager only undoes the transactions that were active at the time of failure;
no committed transaction ever needs to be redone.
Comment
Chapter 22, Problem 15RQ
Problem
Describe the shadow paging recovery technique. Under what circumstances does it not require a
log?
Step-by-step solution
Step 1 of 3
Shadow paging considers the database to be made up of a number of fixed-size disk
pages (or disk blocks), say n, for recovery purposes.
The shadow paging technique is used to manage the access of data items by concurrent
transactions using two directories (current and shadow). The directory arrangement is
illustrated below.
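A small sketch of the two-directory arrangement, with invented page numbers and contents:

```python
# Illustrative sketch of shadow paging with a current and a shadow
# directory. Block numbers and page contents are made up.

pages = {0: "p0-old", 1: "p1-old", 2: "free"}     # disk blocks
shadow_dir = {0: 0, 1: 1}        # saved directory: logical page -> block
current_dir = dict(shadow_dir)   # working copy used by the transaction

def write_page(logical_page, new_contents):
    # The updated page is written to an unused block; the shadow
    # directory still points at the old version of the page.
    free_block = next(b for b, v in pages.items() if v == "free")
    pages[free_block] = new_contents
    current_dir[logical_page] = free_block

write_page(1, "p1-new")
seen_by_txn = pages[current_dir[1]]      # "p1-new": via current directory
after_crash = pages[shadow_dir[1]]       # "p1-old": recovery simply
                                         # reinstates the shadow directory
```

Because the shadow directory is never modified during execution, recovery from a crash needs no log: discarding the current directory restores the pre-transaction state.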
Comment
Step 2 of 3
Comment
Step 3 of 3
Comment
Chapter 22, Problem 16RQ
Problem
Step-by-step solution
Step 1 of 2
Comment
Step 2 of 2
In the analysis phase, the dirty pages in the buffer and the set of transactions
active at the time of the crash are identified. The appropriate point in the log where redo is to start is also
determined.
In the redo phase, redo operations are applied from that point forward. In the undo phase, the log is scanned
backward and the operations of transactions active at the time of the crash are undone in reverse
order.
Comment
Chapter 22, Problem 17RQ
Problem
What are log sequence numbers (LSNs) in ARIES? How are they used? What information do the
Dirty Page Table and Transaction Table contain? Describe how fuzzy checkpointing is used in
ARIES.
Step-by-step solution
Step 1 of 4
In ARIES, every log record is associated with a log sequence number (LSN) that is monotonically
increasing and indicates the address of the log record on disk. Each database page also stores the
LSN of the log record that last modified it, so during recovery the LSNs are used to decide whether
an update must be redone or has already reached disk, and to chain together the log records of each
transaction for undo.
Comment
Step 2 of 4
The Transaction Table contains an entry for each active transaction, with information such as the
transaction ID, the transaction status, and the LSN of the most recent log record for the
transaction.
The Dirty Page Table contains an entry for each dirty page in the buffer, which includes the page
ID and the LSN of the earliest update applied to that page.
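A toy illustration of monotonically increasing LSNs and the two tables (all contents invented for this sketch):

```python
# Sketch of monotonically increasing log sequence numbers (LSNs) and
# the two ARIES tables, with made-up record contents.

next_lsn = 0
log = []

def append_log(record):
    global next_lsn
    next_lsn += 1                      # LSNs only ever grow
    log.append((next_lsn, record))
    return next_lsn

transaction_table = {}   # txn id -> (status, LSN of latest record for txn)
dirty_page_table = {}    # page id -> LSN of first record that dirtied it

lsn1 = append_log(("update", "T1", "pageA"))
transaction_table["T1"] = ("active", lsn1)
dirty_page_table.setdefault("pageA", lsn1)

lsn2 = append_log(("update", "T1", "pageA"))
transaction_table["T1"] = ("active", lsn2)      # latest LSN advances
dirty_page_table.setdefault("pageA", lsn2)      # no effect: the entry
                                                # keeps the earliest LSN
```

The Dirty Page Table keeping the *earliest* dirtying LSN is what lets the redo phase start at the smallest such LSN rather than at the beginning of the log.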
Comment
Step 3 of 4
Fuzzy checkpointing is used to reduce the cost of checkpointing and to allow the system to
continue executing transactions while the checkpoint is in progress. The system:
Writes a begin_checkpoint record and then an end_checkpoint record in the log; with the end_checkpoint record, the contents of the
Transaction Table and the Dirty Page Table are appended to the end of the log.
Writes the LSN of the begin_checkpoint record to a special file. This special file is
accessed during recovery to locate the last checkpoint.
Comment
Step 4 of 4
In practice, the fuzzy checkpointing technique allows the system to resume transaction
processing after the begin_checkpoint record is written to the log, without having to wait for checkpointing
step 2 (force-writing all modified memory buffers to disk) to finish. Until that step is completed, the previous checkpoint record remains valid.
To accomplish this, the system maintains a pointer to the valid checkpoint, which continues to
point to the previous checkpoint record in the log. Once step 2 is concluded, the pointer is changed to
point to the new checkpoint in the log.
Comment
Chapter 22, Problem 18RQ
Problem
What do the terms steal/no-steal and force/no-force mean with regard to buffer management for
transaction processing?
Step-by-step solution
Step 1 of 1
A system is said to steal buffers if it allows buffers containing dirty (updated but
uncommitted) data to be written back to disk before the transaction commits; with no-steal, such buffers cannot be flushed before commit, so UNDO is never needed.
A system is said to force buffers if every transaction's updates are guaranteed to be written to disk
at commit time; with no-force, committed updates may be flushed later, so REDO may be needed after a crash.
Comment
Chapter 22, Problem 19RQ
Problem
Step-by-step solution
Step 1 of 1
Prepare phase:
The global coordinator (initiating node) asks all participants to prepare (to promise to commit or
roll back the transaction, even if there is a failure).
Commit phase:
If all participants respond to the coordinator that they are prepared, the coordinator asks
all nodes to commit the transaction; if any participant cannot prepare, the coordinator asks all
nodes to roll back the transaction.
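The two phases reduce to a simple decision rule, sketched here with hypothetical participant callbacks:

```python
# Hedged sketch of the two-phase commit decision rule: commit only if
# every participant votes "prepared", otherwise roll back everywhere.
# Participants are modeled as callables returning their vote.

def two_phase_commit(participants):
    # Phase 1 (prepare): collect a vote from every participant.
    votes = [p() for p in participants]
    # Phase 2 (commit/rollback): unanimous "prepared" => commit.
    return "commit" if all(votes) else "rollback"

always_ready = lambda: True    # a node that can prepare
crashed_node = lambda: False   # a node that cannot prepare

outcome_ok = two_phase_commit([always_ready, always_ready])
outcome_bad = two_phase_commit([always_ready, crashed_node])
```

A single "no" vote forces a global rollback, which is exactly what makes the prepare promise (commit or roll back even after a failure) necessary.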
Comment
Chapter 22, Problem 20RQ
Problem
Step-by-step solution
Step 1 of 1
Catastrophic failures are handled by disaster recovery: the entire database along with the
log file is periodically copied to an inexpensive, large-capacity storage device. When a catastrophe strikes,
the most recent backup copy is restored to where the database used to be.
Comment
Chapter 22, Problem 21E
Problem
Suppose that the system crashes before the [read_item, T3, A] entry is written to the log in
Figure 22.1(b). Will that make any difference in the recovery process?
Step-by-step solution
Step 1 of 1
If the system crashes before the [read_item, T3, A] entry is written to the log, there will be no
difference in the recovery process, because read_item operations are needed only for
determining whether cascading rollback of additional transactions is necessary.
Comment
Chapter 22, Problem 22E
Problem
Suppose that the system crashes before the [write_item, T2, D, 25, 26] entry is written to the log
in Figure 22.1(b). Will that make any difference in the recovery process?
Step-by-step solution
Step 1 of 2
When the system crashes before the transaction T2 performs a write operation on item D, there
will be a difference in the recovery process.
Comment
Step 2 of 2
During the recovery process, the following transactions must be rolled back.
• The transaction T3 has not reached its commit point, so transaction T3 has to be rolled back.
• The transaction T2 also has not reached its commit point, so transaction T2 has to be rolled back.
Hence, the transactions T2 and T3 have to be rolled back in the recovery process.
Comment
Chapter 22, Problem 23E
Problem
Figure shows the log corresponding to a particular schedule at the point of a system crash for
four transactions T1 T2, T3, and T4. Suppose that we use the immediate update protocol with
checkpointing. Describe the recovery process from the system crash. Specify which transactions
are rolled back, which operations in the log are redone and which (if any) are undone, and
whether any cascading rollback takes place.
Step-by-step solution
Step 1 of 5
• Undo all the write operations of the transaction that are not committed.
• Redo all the write operations of the transaction that are committed after the check point.
Comment
Step 2 of 5
• The transaction T3 has not reached its commit point, so transaction T3 has to be rolled back.
• The transaction T2 also has not reached its commit point, so transaction T2 has to be rolled back.
Comment
Step 3 of 5
• write_item, T4, D, 25, 15: The transaction T4 must redo the write operation on item D.
• write_item, T4, A, 30, 20: The transaction T4 must redo the write operation on item A.
Comment
Step 4 of 5
Comment
Step 5 of 5
Comment
Chapter 22, Problem 24E
Problem
Suppose that we use the deferred update protocol for the example in Figure 22.6. Show how the
log would be different in the case of deferred update by removing the unnecessary log entries;
then describe the recovery process, using your modified log. Assume that only REDO operations
are applied, and specify which operations in the log are redone and which are ignored.
Step-by-step solution
Step 1 of 2
In the case of deferred update, after removing the unnecessary log entries, the write operations of
uncommitted transactions are not recorded in the database until the transactions commit. So the
write operations of T2 and T3 would not have been applied to the database, and T4 would
have read the previous values of items A and B, thus leading to a recoverable schedule.
Comment
Step 2 of 2
The list of committed transactions T since the last checkpoint contains only transaction
T4. The list of active transactions T' contains transactions T2 and T3.
Only the WRITE operations of the committed transactions are to be redone. Hence, REDO is
applied to:
[write_item,T4,B,15]
[write_item,T4,A,20]
The transactions that were active and did not commit, that is, transactions T2 and T3, are
canceled and must be resubmitted. Their operations do not have to be undone, since they were never applied to the database.
Comments (1)
Chapter 22, Problem 25E
Problem
How does checkpointing in ARIES differ from checkpointing as described in Section 22.1.4?
Step-by-step solution
Step 1 of 1
The main difference is that with ARIES, main memory buffers that have been modified are not
flushed to disk. ARIES, however writes additional information to the LOG in the form of a
Transaction Table and a Dirty Page Table when a checkpoint occurs.
Comment
Chapter 22, Problem 26E
Problem
How are log sequence numbers used by ARIES to reduce the amount of REDO work needed for
recovery? Illustrate with an example using the information shown in Figure 22.5. You can make
your own assumptions as to when a page is written to disk.
Step-by-step solution
Step 1 of 1
ARIES can be used to reduce the amount of REDO work through log sequence numbers as
follows:
• ARIES reduces the amount of REDO work by starting redoing after the point, where all prior
changes have been applied to the database. ARIES performs REDO at the position in the log
that corresponds to smallest LSN, M.
• In the Figure 22.5, REDO must start at the log position 1 as the smallest LSN in Dirty Page
Table is 1.
• When the LSN stored in a page is smaller than the LSN of the log record being redone, the page has not yet absorbed that change, so the page is updated and the change is propagated to the
database.
• In Figure 22.5, the transaction performs the update of page C, and page C has an LSN of
7.
• When REDO starts at log position 1, page C is propagated to the database. But the page C is
not changed as its LSN (7) is greater than the LSN of current log position (1).
• Now consider the LSN 2. Page B is associated with this LSN and it would be propagated to the
database. The page B would be updated if its LSN is less than 2. Similarly, the page
corresponding to LSN 6 would be updated.
• However the page corresponding to the LSN 7 need not be updated as the LSN of page C, that
is 7, is not less than the current log position.
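The page-LSN test can be sketched as follows; the page LSNs are assumptions chosen to mirror the example above:

```python
# Sketch of the page-LSN test ARIES uses to skip unnecessary REDO work.
# The page LSNs below are assumptions made for illustration.

page_lsn = {"A": 3, "B": 1, "C": 7}       # LSN stored in each disk page

def needs_redo(log_lsn, page):
    # Redo the logged change only if the page on disk has not already
    # absorbed it, i.e. its stored LSN is older than the log record's.
    return page_lsn[page] < log_lsn

# Log records (record LSN, page touched); C's change at LSN 6 < 7 is
# already reflected on disk, so only B's change needs redoing.
redo_needed = [lsn for lsn, page in [(2, "B"), (6, "C")] if needs_redo(lsn, page)]
```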
Comment
Chapter 22, Problem 27E
Problem
What implications would a no-steal/force buffer management policy have on checkpointing and
recovery?
Step-by-step solution
Step 1 of 1
• A no-steal/force buffer management policy means that a cache or buffer page that has been
updated by a transaction cannot be written to disk before the transaction commits.
• Force means that pages updated by a transaction are written to disk before the transaction commits.
• During a checkpoint under no-steal, the step that flushes all modified main-memory buffers to disk would not
be able to write pages updated by uncommitted transactions.
• With force, after a transaction is done, its updates are written to disk. If there is a failure
during that transaction, REDO may still be needed; UNDO is not needed, since uncommitted
updates are never written to disk.
Comment
Chapter 22, Problem 28E
Problem
Choose the correct answer for each of the following multiple-choice questions:
Incremental logging with deferred updates implies that the recovery system must
c. store both the old and new value of the updated item in the log
d. store only the Begin Transaction and Commit Transaction records in the log
Step-by-step solution
Step 1 of 1
Incremental logging with deferred updates implies that the recovery system must necessarily:
Option (b)
Comment
Chapter 22, Problem 29E
Problem
Choose the correct answer for each of the following multiple-choice questions:
b. the log record for an operation should be written before the actual data is written
c. all log records should be written before a new transaction begins execution
Step-by-step solution
Step 1 of 1
The write ahead logging (WAL) protocol simply means that the log record for an operation should
be written before the actual data is written.
Option (b)
The log record for an operation should be written before the actual data is written.
Comment
Problem
Chapter 22, Problem 30E
Choose the correct answer for each of the following multiple-choice questions:
In case of transaction failure under a deferred update incremental logging scheme, which of the
following will be needed?
a. an undo operation
b. a redo operation
Step-by-step solution
Step 1 of 1
In case of transaction failure under a deferred update incremental logging scheme, which of the
following will be needed:
Option (c)
Comments (1)
Chapter 22, Problem 31E
Problem
Choose the correct answer for each of the following multiple-choice questions:
For incremental logging with immediate,updates, a log record for a transaction would contain
a. a transaction name, a data item name, and the old and new value of the item
b. a transaction name, a data item name, and the old value of the item
c. a transaction name, a data item name, and the new value of the item
Step-by-step solution
Step 1 of 1
For incremental logging with immediate updates a log record for a transaction would contain.
Option (a)
A Transaction name, data item name, old value of item, new value of item
Comment
Chapter 22, Problem 32E
Problem
Choose the correct answer for each of the following multiple-choice questions:
For correct behavior during recovery, undo and redo operations must be
a. commutative
b. associative
c. idempotent
d. distributive
Step-by-step solution
Step 1 of 1
For correct behavior during recovery, undo and redo operations must be
Option (c)
Idempotent
Comment
Chapter 22, Problem 33E
Problem
Choose the correct answer for each of the following multiple-choice questions:
When a failure occurs, the log is consulted and each operation is either undone or redone. This
is a problem because
Step-by-step solution
Step 1 of 1
When a failure occurs, the log is consulted and each operation is either undone or redone.
Option (a)
Comment
Chapter 22, Problem 34E
Problem
Choose the correct answer for each of the following multiple-choice questions:
Using a log-based recovery scheme might improve performance as well as provide a recovery
mechanism by
b. writing the appropriate log records to disk during the transaction’s execution
c. waiting to write the log records until multiple transactions commit and writing them as a batch
Step-by-step solution
Step 1 of 1
Using a log-based recovery scheme might improve performance as well as provide a
recovery mechanism by:
Option (c)
Waiting to write the log records until multiple transactions commit and writing them as a batch.
Comment
Chapter 22, Problem 35E
Problem
Choose the correct answer for each of the following multiple-choice questions:
a. a transaction writes items that have been written only by a committed transaction
Step-by-step solution
Step 1 of 1
Option (d)
A transaction writes & reads an item that is previously written by an uncommitted transaction.
Comment
Chapter 22, Problem 36E
Problem
Choose the correct answer for each of the following multiple-choice questions:
Step-by-step solution
Step 1 of 1
Option (b)
Comment
Chapter 22, Problem 37E
Problem
Choose the correct answer for each of the following multiple-choice questions:
If the shadowing approach is used for flushing a data item back to disk, then
b. the item is written to the same disk location from which it was read
Step-by-step solution
Step 1 of 1
If the shadowing approach is used for flushing a data item back to disk then.
Option (b)
Comment
Chapter 30, Problem 1RQ
Problem
Discuss what is meant by each of the following terms: database authorization, access control,
data encryption, privileged (system) account, database audit, audit trail.
Step-by-step solution
Step 1 of 1
Database authorization
Database authorization ensures the security of the portions of the database against unauthorized
access.
Access control
The most common security problem is preventing unauthorized persons from accessing the system,
either to obtain information or to inject malicious content that modifies the database. A DBMS
must include security mechanisms that restrict access to the database system.
The DBMS performs this function by creating user accounts and passwords that control the login process and
keep unauthorized users out.
Data encryption
Sensitive data, such as ATM or credit card numbers provided by a bank, must be protected
while it is transmitted through a communications network; encryption provides additional protection for the
database. The data is encoded so that unauthorized users who access it will have
difficulty decoding it.
Privileged account
The DBA account provides important capabilities. The commands are privileged that include
granting and revoking commands of privileges to individual accounts, users, or user groups by
performing following actions
• Account creation
• Privilege granting
• Privilege revocation
Database audit
A database audit is performed if modifications or alterations of the database are suspected to have
occurred without authorization. It consists of reviewing the log to examine all
accesses and operations applied to the database during a certain period of time.
Audit trail
A database log that is used mainly for security purposes, recording all details of accesses and
operations applied to the database, is referred to as an audit trail.
Comment
Chapter 30, Problem 2RQ
Problem
Which account is designated as the owner of a relation? What privileges does the owner of a
relation have?
Step-by-step solution
Step 1 of 1
Owner account is designated as the owner of a relation which is typically the account that was
used when the relation was created in the first place. The owner of a relation is given all
privileges on that relation. The owner account holder can pass privileges on any of the owner
relation to other users by granting privileges to their accounts.
Comment
Chapter 30, Problem 3RQ
Problem
Step-by-step solution
Step 1 of 1
The view mechanism is an important discretionary authorization mechanism in its own right.
For example:-
If the owner A of a relation R wants another account B to be able to retrieve only some fields of
R, then A can create a view V of R that includes only those attributes and then grant SELECT on
V to B. the same applies to limiting B to retrieving only certain tuples of R; a view V can be
created by defining the view by means of a query that selects only those tuples from R that A
wants to allow B to access.
Comment
Chapter 30, Problem 4RQ
Problem
Discuss the types of privileges at the account level and those at the relation level.
Step-by-step solution
Step 1 of 1
There are two levels of privileges to be assigned to use the database system, account level and
relation (or table level).
• At account level, each account of the relation holds particular privileges independently specified
by the database administrator in the database.
• At relation level, each individual relation or view in the database accessing privileges are
controlled by database administrator.
Account level
It includes privileges that apply to the account itself, independently of the relations in the
database, such as the CREATE SCHEMA or CREATE TABLE privilege, the CREATE VIEW privilege,
the ALTER privilege, the DROP privilege, the MODIFY privilege, and the SELECT privilege.
Relation level
• Each type of command can be applied for each user by specifying the individual relation.
The access matrix model, an authorization model, is used for granting and revoking privileges.
Comment
Chapter 30, Problem 5RQ
Problem
Step-by-step solution
Step 1 of 1
Granting and revoking of privileges should be performed so that it ensures secure and authorized
access and hence both of them should be controlled on each relation R in a database.
It is carried out by assigning an owner account, which is the account that was used when the
relation was created. The owner of the relation is the one who uses all privileges on that relation.
Granting of privileges
The owner account holder can transfer the privileges on any of the relations owned to other
users by issuing GRANT command (granting privileges) to their accounts. Types of privileges
granted on each individual relation R by using GRANT command are as follows,
• SELECT privilege on some relation, gives the privilege to retrieve the information (tuples) from
that relation.
• Modification privilege is provided to do insert, delete, and update operations that modify the
database.
Revoking of privileges
When any of the privileges is granted it is given temporarily, it should be necessary to cancel that
privilege after the task has been completed. REVOKE command is used in SQL for canceling the
privileges granted to them.
Comment
Chapter 30, Problem 6RQ
Problem
Discuss the system of propagation of privileges and the restraints imposed by horizontal and
vertical propagation limits.
Step-by-step solution
Step 1 of 2
It is possible for a user to receive a certain privilege from two or more sources. For example, A
may receive a certain privilege from both B and C. If B revokes the privilege from A, A will still
have it by virtue of C's grant; only if C also revokes the privilege will A lose it permanently.
A DBMS that allows propagation of privileges must keep track of how all privileges were
granted, so that revoking of privileges can be done correctly and completely.
Comment
Step 2 of 2
Since propagation of privileges can lead to many accounts having a privilege on a relation without
the knowledge of its owner, there must be ways to restrict the number of accounts that can hold
privileges on a relation. This can be done by limiting horizontal propagation and by
limiting vertical propagation.
Limiting horizontal propagation to an integer number i means that an account B given the
GRANT OPTION can grant the privilege to at most i other accounts.
Vertical propagation limits the depth of the granting of privileges. Granting of privileges with
vertical propagation zero is equivalent to granting the privileges with no GRANT OPTION. If
account A grants privileges to account B with vertical propagation set to j>0, this means that the
account B has GRANT OPTION on the privilege, but B can grant the privilege to other accounts
only with a vertical propagation less than j. In effect vertical propagation limits the sequence of
GRANT OPTIONS that can be given from one account to the next based on single original grant
of the privileges.
For example, suppose that A grants SELECT to B on the EMPLOYEE relation with horizontal
propagation = 1 and vertical propagation = 2. B can grant SELECT to at most one account, because
horizontal propagation = 1. Additionally, B can grant the privilege to another account only with vertical
propagation set to 0 or 1. Thus we can limit propagation by using these two methods.
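The two limits from the example can be modeled as a small predicate; the function and its parameters are hypothetical:

```python
# Illustrative model of horizontal and vertical propagation limits on
# GRANT; the numbers mirror the example above (horizontal = 1,
# vertical = 2 for account B).

def grant(grantee_state, new_vertical):
    # grantee_state = (grants_already_made, horizontal_limit, vertical_limit)
    made, h, v = grantee_state
    # The grantee may grant only while under its horizontal limit, and
    # only with a strictly smaller vertical propagation than its own.
    allowed = made < h and 0 <= new_vertical < v
    return allowed

# B holds SELECT with horizontal propagation 1 and vertical propagation 2.
first = grant((0, 1, 2), new_vertical=1)     # True: within both limits
second = grant((1, 1, 2), new_vertical=1)    # False: horizontal limit hit
too_deep = grant((0, 1, 2), new_vertical=2)  # False: must be < 2
```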
Comment
Chapter 30, Problem 7RQ
Problem
Step-by-step solution
Step 1 of 1
1.) Select (retrieval or read) privilege on R: Gives the account retrieval privilege. In SQL this
gives the account the privilege to use SELECT statement to retrieve the tuples from R
2.) Modify privilege on R: This gives the account the capability to modify tuples of R. In SQL
this privilege is further divided into UPDATE, DELETE, and INSERT privileges to apply
corresponding SQL commands to R. Additionally, both the INSERT and UPDATE privileges can
specify that only certain attributes of R can be updated by the account.
3.) Reference privileges on R: This gives the account the capability to reference relation R
when specifying integrity constraints. This privilege can also be restricted to specific attributes of
R.
To create a view an account must have SELECT privilege on all relations involved in view
definition.
Comment
Chapter 30, Problem 8RQ
Problem
Step-by-step solution
Step 1 of 2
a. Discretionary Access Control (DAC) policies are characterized by a high degree of flexibility,
which makes them suitable for a large variety of application domains.
By contrast, Mandatory Access Control (MAC) policies have the drawback of being too rigid, in
that they require a strict classification of subjects and objects into security levels, and therefore
they are applicable to very few environments.
Comment
Step 2 of 2
b. The main drawback of DAC models is their vulnerability to malicious attacks, such as Trojan
horses embedded in application programs. The reason is that discretionary authorization models
do not impose any control on how information is propagated and used once it has been
accessed by authorized user to do so.
By contrast Mandatory Access Control policies ensure a high degree of protection- in a way,
they prevent any illegal flow of information.
Comment
Chapter 30, Problem 9RQ
Problem
What are the typical security classifications? Discuss the simple security property and the *-
property, and explain the justification behind these rules for enforcing multilevel security.
Step-by-step solution
Step 1 of 1
Typical security classes are top secret (TS), secret (S), confidential (C), and unclassified (U),
where TS is the highest level and U the lowest.
The first rule (the simple security property) is that no subject can read an object whose security
classification is higher than the subject's security clearance.
The second restriction (the *-property) is less intuitive; it prohibits a subject from writing an
object at a lower security classification than the subject's security clearance. Violations of this
rule would allow information to flow from higher to lower classifications, which violates a basic
tenet of multilevel security.
Chapter 30, Problem 10RQ
Problem
Describe the multilevel relational data model. Define the following terms: apparent key,
polyinstantiation, filtering.
Step-by-step solution
Step 1 of 3
Define:
1.) Apparent key: The apparent key of a multilevel relation is the set of attributes that would
have formed the primary key in a regular (single-level) relation.
Step 2 of 3
2.) Filtering: A multilevel relation will appear to contain different data to subjects with different
clearance levels. In some cases, it is possible to store a single tuple in the relation at a higher
classification level and produce the corresponding tuples at a lower-level classification through a
process known as filtering.
Step 3 of 3
3.) Polyinstantiation: In some cases, it is necessary to store two or more tuples at different
classification levels with the same value for the apparent key. This leads to the concept of
polyinstantiation, where several tuples can have the same apparent key value but different
attribute values for users at different classification levels.
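The interplay of filtering and polyinstantiation can be sketched in Python. This is a hypothetical illustration; the relation contents, level names, and function names are invented for the example.

```python
# Sketch of a multilevel relation. Classification levels: U < C < S < TS.
LEVELS = {"U": 0, "C": 1, "S": 2, "TS": 3}

# Each tuple carries a classification. Two tuples share the apparent key
# "Smith" at different levels -- an example of polyinstantiation.
employee = [
    {"name": "Smith", "salary": 40000, "class": "U"},
    {"name": "Smith", "salary": 80000, "class": "S"},  # polyinstantiated tuple
    {"name": "Brown", "salary": 60000, "class": "C"},
]

def filter_view(relation, clearance):
    """Return only the tuples visible at the subject's clearance (filtering)."""
    return [t for t in relation if LEVELS[t["class"]] <= LEVELS[clearance]]

u_view = filter_view(employee, "U")   # U sees only the unclassified Smith tuple
s_view = filter_view(employee, "S")   # S sees all three tuples
```

The same relation thus appears to contain different data at each clearance level, which is exactly the filtering behavior described above.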
Chapter 30, Problem 11RQ
Problem
Step-by-step solution
Step 1 of 1
flexibility, which makes them suitable for a large variety of application domains.
The main drawback of DAC models is their vulnerability to malicious attacks, such as
MAC have the drawback of being too rigid and they are only applicable in limited
environments.
In many practical situations discretionary policies are preferred because they offer a
Comment
Chapter 30, Problem 12RQ
Problem
What is role-based access control? In what ways is it superior to DAC and MAC?
Step-by-step solution
Step 1 of 1
Role-based access control (RBAC) is a technology for managing and enforcing security in
large-scale, enterprise-wide systems. The basic notion is that permissions are associated with
roles, and users are assigned to appropriate roles.
Roles can be created using the CREATE ROLE and DESTROY ROLE commands; GRANT and
REVOKE are used to assign privileges to and revoke them from roles.
RBAC appears to be a viable alternative to traditional DAC and MAC; it ensures that only
authorized users are given access to certain data or resources.
Many DBMSs support the concept of roles, where privileges can be assigned to roles.
A role hierarchy in RBAC is a natural way of organizing roles to reflect the organization's lines of
authority and responsibility.
Using an RBAC model is a highly desirable goal for addressing the key security requirements of
web-based applications.
DAC and MAC models lack the capabilities needed to support the security requirements of
emerging enterprises and web-based applications.
Chapter 30, Problem 13RQ
Problem
What are the two types of mutual exclusion in role-based access control?
Step-by-step solution
Step 1 of 1
Two roles are said to be mutually exclusive if a user cannot use both roles. The two types are:
1. Authorization-time exclusion.
2. Runtime exclusion.
Authorization-time exclusion
It is a static constraint: two roles that are mutually exclusive are never both included in a user's
authorization at the same time.
Runtime exclusion
It is a dynamic constraint: two mutually exclusive roles may both be authorized to one user at
the same time, but the user can activate only one of them; that is, both roles cannot be
activated at the same time.
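The two kinds of exclusion can be sketched as follows. This is a hypothetical illustration; the role names and the constraint sets are invented, not part of any real RBAC API.

```python
# Static (authorization-time) exclusion: these pairs may never both appear
# in one user's authorization.
STATIC_EXCLUSIVE = {("purchaser", "approver")}

def can_assign(user_roles, new_role):
    """Refuse an assignment that would give a user both statically exclusive roles."""
    return not any((r, new_role) in STATIC_EXCLUSIVE or
                   (new_role, r) in STATIC_EXCLUSIVE for r in user_roles)

# Runtime exclusion: both roles may be authorized, but a session may have
# only one of them active at a time.
RUNTIME_EXCLUSIVE = {("teller", "auditor")}

def can_activate(active_roles, role):
    """Refuse activation when a runtime-exclusive partner is already active."""
    return not any((r, role) in RUNTIME_EXCLUSIVE or
                   (role, r) in RUNTIME_EXCLUSIVE for r in active_roles)
```

A user holding both "teller" and "auditor" authorizations would pass `can_assign` but could never have both roles active in the same session.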
Chapter 30, Problem 14RQ
Problem
Step-by-step solution
Step 1 of 1
In row-level access control, as the name suggests, access control rules are enforced on the data
row by row.
• It ensures data security by allowing permissions to be set not only for a column or table but
also for each row.
• The database administrator initially provides the user with a default session label.
• Unauthorized users are prevented from viewing or altering certain data by the labels
assigned.
• A user with low-level authorization is represented by a low label number; such a user is denied
access to data carrying a higher-level number.
• If a label is not given to a row, it is automatically assigned based on the user's session
label.
Chapter 30, Problem 15RQ
Problem
Step-by-step solution
Step 1 of 1
A Label Security policy is defined by the administrator. The policy is invoked
automatically whenever data affected by the policy is accessed through an application. When the
policy is implemented, a new column is added to each row.
The new column contains the label for each row, which represents the sensitivity of the row
as per the policy. Each user has an identity in label-based security; it is compared with the label
assigned to each row to determine whether the user has the right to view the contents of
that row.
The database administrator has the privilege to set an initial label for the row.
Label security administrator defines the security labels for data and authorizations that govern
access to specified projects for users.
Example
If a user has SELECT privilege on the table, Label Security will automatically evaluate each row
returned by the query to determine whether the user is provided with the rights to view the data.
If the user is assigned with sensitivity level 25, the user can view all rows that have a security
level of 25 or lower.
Label security can be used to perform security checks on statements that include insert, delete,
and update.
Comment
Chapter 30, Problem 16RQ
Problem
Step-by-step solution
Step 1 of 1
SQL injection attacks are among the most common threats to database systems. Types of injection
attacks include:
• SQL manipulation
• Code injection
Explanation
SQL Manipulation
SQL manipulation is a modification attack that changes an SQL command in the application, for
example by extending a query with additional query components using set operations such as
UNION, INTERSECT, or MINUS.
Example
The hacker can try to change or manipulate the SQL statement as follows:
Thus the hacker, knowing that "john" is a valid login, is able to log in to the database
system without knowing his password.
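The manipulation can be demonstrated end to end with Python's built-in sqlite3 module. The table, the account "john", and the injected string are hypothetical; the point is that string concatenation lets the attacker turn the WHERE clause into a tautology.

```python
import sqlite3

# In-memory database with one user account for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (login TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('john', 'secret')")

def vulnerable_login(login, password):
    # The application builds the query by concatenating raw user input.
    query = ("SELECT login FROM users WHERE login = '" + login +
             "' AND password = '" + password + "'")
    return conn.execute(query).fetchall()

# A wrong password normally fails...
assert vulnerable_login("john", "wrong") == []
# ...but the manipulated input appends OR '1'='1', which is always true,
# so the login succeeds without the real password.
assert vulnerable_login("john", "x' OR '1'='1") == [("john",)]
```

Because AND binds tighter than OR, the final condition is `(login='john' AND password='x') OR '1'='1'`, which matches every row.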
Code Injection
• It allows the addition of extra SQL statements or commands to the original SQL
statement by exploiting a computer bug caused by processing invalid data.
• The attacker injects code into a computer program to change its course of action.
• It is one of the methods used to hack a system and obtain information without
authorization.
• A database or operating system (OS) function call is injected into the SQL statements to
change the data or to make a system call that is considered to be privileged.
Example
The given query makes the user request a page from a web server. The attacker can manipulate
the string given as input, such as the URL of the web page, to perform other illegal operations.
Chapter 30, Problem 17RQ
Problem
Step-by-step solution
Step 1 of 1
Database Fingerprinting
The attacker identifies the type of back-end database in order to craft attacks that exploit
weaknesses specific to that DBMS.
Denial of Service
The attacker can overflow buffers with requests, consume an excessive amount of resources, or
delete data, thus denying service to the intended users.
Bypassing Authentication
The attacker can access the database system as an authorized user and perform all the desired
operations.
The attacker can obtain sensitive information such as the type and structure of the back-end
database of a web application. This is possible because the default error pages returned by
application servers are often descriptive.
Using this information, the attacker can use tools to execute commands on the database; for
example, the attacker can execute stored procedures and functions from a remote SQL interface.
This kind of attack also makes use of logical flaws within the database to escalate the level of
access.
Problem
Chapter 30, Problem 18RQ
Step-by-step solution
Step 1 of 1
SQL injection attacks can be prevented by applying certain programming rules to all
procedures and functions that are accessed through the web. Some of the techniques include:
Bind Variables
• Bind variables (parameters) protect against injection attacks and also improve
performance.
• User input should be bound to a parameter instead of being concatenated into the statement;
in the example, the input '1' is assigned to a bind variable 'empid' instead of directly passing
string parameters.
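The bind-variable technique can be sketched with sqlite3, whose `?` placeholder is its parameter syntax. The table and the 'empid' value are hypothetical, mirroring the example described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (empid TEXT, name TEXT)")
conn.execute("INSERT INTO employee VALUES ('1', 'Alice')")

def find_employee(empid):
    # The ? placeholder binds the input as a value, never as SQL text,
    # so injected quotes are simply compared against the empid column.
    return conn.execute(
        "SELECT name FROM employee WHERE empid = ?", (empid,)).fetchall()

assert find_employee("1") == [("Alice",)]
assert find_employee("1' OR '1'='1") == []   # injection attempt matches nothing
```

The same malicious string that defeats a concatenated query is harmless here, because the driver never parses it as part of the statement.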
Filtering Input
• It is used to remove escape characters from input strings by using the Replace function of SQL.
• For example, a single-quote delimiter (') is replaced by two single quotes ('').
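The quote-doubling filter is a one-line transformation; the sketch below shows it in Python (the input strings are invented for illustration).

```python
# Double every single quote so that a quote in user input cannot
# terminate the SQL string literal early.
def escape_quotes(value):
    return value.replace("'", "''")

assert escape_quotes("O'Brien") == "O''Brien"
assert escape_quotes("x' OR '1'='1") == "x'' OR ''1''=''1"
```

After escaping, the injected quotes become literal characters inside the string value instead of SQL delimiters. Bind variables remain the preferred defense; escaping is a fallback.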
Function Security
Standard and custom database functions should be restricted, as they can be exploited in SQL
function injection attacks.
Chapter 30, Problem 19RQ
Problem
Step-by-step solution
Step 1 of 1
Statistical databases are used mainly to produce statistics on various populations. The database
may contain data on individuals, which should be protected from user access. Users are
permitted to retrieve statistical information on the populations, such as averages, sums, counts,
minimums, maximums, and standard deviations.
A population is a set of tuples of a relation (table) that satisfy some selection condition.
Statistical database security techniques fail to protect individual data in some
situations.
For example, we may want to retrieve the number of individuals in a population or the average
income in the population.
Chapter 30, Problem 20RQ
Problem
How is privacy related to statistical database security? What measures can be taken to ensure
some degree of privacy in statistical databases?
Step-by-step solution
Step 1 of 2
Statistical database are used mainly to produce statistics about various populations. The
database may contain confidential data about individuals, which should be protected from user
access. However, users are permitted to retrieve statistical information about the populations,
such as averages, sums, counts, maximums, minimums, and standard deviations. Since there
can be ways to retrieve private information using aggregate function when much information is
available about a person, statistical database that store information impose potential threats to
privacy.
Consider an example: a PERSON relation with attributes Name, Ssn, Income, Address, City, Zip,
Sex, and Last_degree.
A population is a set of tuples of a relation that satisfy some selection condition. Hence, each
selection condition on the PERSON relation will specify a particular population of PERSON
tuples, for example Sex = 'F' or Last_degree = 'M.Tech'.
Statistical queries involve applying statistical functions to a population of tuples, for example
AVG(Income). However, access to personal information is not allowed. Statistical database
security techniques must prohibit queries that retrieve attribute values, allowing only
queries that involve aggregate functions such as SUM, MIN, MAX, AVG, COUNT, and
STANDARD DEVIATION. Such queries are sometimes called statistical queries.
Step 2 of 2
Q1: SELECT COUNT(*) FROM PERSON WHERE <condition>;
Q2: SELECT AVG(Income) FROM PERSON WHERE <condition>;
Suppose someone is interested in finding the income of Jane Smith, who is a female with last
degree 'M.S.' and lives in Houston. Combining all these conditions, suppose Q1 returns 1. Then
Q2 with the same condition returns Jane Smith's income. Even if the result of Q1 is not 1, the
MAX and MIN functions can still be used to narrow the range of her income.
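The inference attack described above can be reproduced in a few lines of Python. The relation contents and helper names are hypothetical; what matters is that when COUNT shows the population has one member, the "aggregate" AVG is that member's individual value.

```python
# Toy PERSON relation: only Jane Smith matches all three conditions.
people = [
    {"name": "Jane Smith", "sex": "F", "degree": "M.S.", "city": "Houston", "income": 70000},
    {"name": "Amy Jones",  "sex": "F", "degree": "M.S.", "city": "Dallas",  "income": 65000},
]

def count_q(rows, **cond):
    """Q1: COUNT(*) over the population selected by cond."""
    return sum(1 for r in rows if all(r[k] == v for k, v in cond.items()))

def avg_income_q(rows, **cond):
    """Q2: AVG(Income) over the same population."""
    sel = [r["income"] for r in rows if all(r[k] == v for k, v in cond.items())]
    return sum(sel) / len(sel)

# Q1 reveals the population has exactly one member...
assert count_q(people, sex="F", degree="M.S.", city="Houston") == 1
# ...so Q2's "average" is Jane Smith's individual income.
assert avg_income_q(people, sex="F", degree="M.S.", city="Houston") == 70000
```

This is why a minimum-population threshold on statistical queries (the first measure below) is needed.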
1.) No statistical queries are permitted whenever the number of tuples in the population
specified by the selection condition falls below some threshold.
4.) Partition the database into groups; any query must refer to complete groups, never
to subsets of records within groups.
Chapter 30, Problem 21RQ
Problem
What is flow control as a security measure? What types of flow control exist?
Step-by-step solution
Step 1 of 3
Flow control regulates the distribution or flow of information among accessible objects. A flow
between object X and object Y occurs when a program reads values from X and writes values
into Y. Flow control checks that information contained in some object does not flow explicitly or
implicitly into less protected objects. Thus, a user cannot get indirectly in Y what he or she
cannot get directly in X. Most flow controls employ some concept of security class; the transfer
of information from a sender to a receiver is allowed only if the receiver's security class is at least
as privileged as the sender's.
Examples of a flow control program include preventing a service program from leaking a
customer's confidential data, and blocking the transmission of secret military data to an unknown
classified user.
A flow policy specifies the channels along which information is allowed to move. The simplest
flow policy specifies just two classes of information, confidential (C) and nonconfidential (N),
and allows all flows except those from class C to class N. This policy can solve the
confidentiality problem that arises when a service program handles data such as customer
information, some of which may be confidential.
Step 2 of 3
Access control mechanisms are responsible for checking users' authorizations for resource
access: only granted operations are executed. Flow controls can be enforced by an extended
access control mechanism, which involves assigning a security class to each running program.
The program is allowed to read a particular memory segment only if its class is as high as that of
the segment. It is allowed to write into a segment only if its class is as low as that of the segment.
This automatically ensures that no information transmitted by the program can move from a higher
to a lower class. For example, a military program with secret clearance can only read from
objects that are unclassified and confidential and can only write into objects that are secret or top
secret.
Two types of flows exist:
1.) Explicit flows: occur as a consequence of assignment instructions, such as Y := f(X1, ..., Xn).
2.) Implicit flows: generated by conditional instructions, such as if f(Xm+1, ..., Xn) then Y := f(X1, ...,
Xm).
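The two flow types can be made concrete with a short sketch. The variable names are invented; the point is that in the implicit case the protected value reaches Y without ever appearing on the right-hand side of an assignment from it.

```python
# X is the protected value (a single secret bit here).
secret = 1

# Explicit flow: an assignment copies information from X into Y directly.
y_explicit = secret

# Implicit flow: no assignment from secret to y occurs, yet after the
# conditional, y encodes secret's value anyway.
if secret == 1:
    y_implicit = 1
else:
    y_implicit = 0

assert y_explicit == secret
assert y_implicit == secret   # the bit leaked through control flow alone
```

Flow control mechanisms must therefore track conditional (control-flow) dependencies, not just assignments.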
Step 3 of 3
Flow control mechanisms must verify that only authorized flows, both explicit and implicit, are
executed. A set of rules must be satisfied to ensure secure information flows. Rules may be
expressed using flow relations among classes and assigned to information, stating the
authorized flow within the system. This relation can define, for a class, the set of classes where
information can flow, or can state the specific relations to be verified between two classes to
allow information to flow from one to another. In general, flow control mechanisms implement the
control by assigning a label to each object and by specifying the security class of the object.
Labels are then used to verify the flow relations defined in the model.
Chapter 30, Problem 22RQ
Problem
Step-by-step solution
Step 1 of 2
A covert channel allows a transfer of information that violates the security policy.
Specifically, a covert channel allows information to pass from a higher classification level to a
lower classification level through improper means. Covert channels can be classified into two
broad categories:
1.) Timing channels: In a timing channel, information is conveyed by the timing of events or
processes.
2.) Storage channels: In storage channels, temporal synchronization is not required; information
is conveyed by accessing system information that is otherwise inaccessible to the user.
Step 2 of 2
As a simple example of a covert channel, consider a distributed database system in which two
nodes have user security levels of secret (S) and unclassified (U). In order for a transaction to
commit, both nodes must agree to commit. They may only perform operations that are
consistent with the *-property, which states that in any transaction, the S site cannot write or
pass information to the U site. However, if these two sites collude to set up a covert channel
between them, a transaction involving secret data may be committed unconditionally by the U
site, while the S site may do so in some predefined, agreed-upon way so that certain information
is passed from the S site to the U site. Measures such as locking prevent concurrent writing of
information by users with different security levels into the same objects, preventing storage-type
covert channels. Operating systems and distributed databases provide control over the
multiprogramming of operations, which allows sharing of resources without the possibility of
encroachment of one program or process into another's memory or other resources in the
system, thus preventing timing-oriented covert channels. In general, covert channels are not a
major problem in well-implemented, robust database implementations. However, certain
schemes may be contrived by clever users that implicitly transfer information.
Some security experts believe that one way to avoid covert channels is to disallow programmers
from gaining access to sensitive data that a program will process after the program has been
put into operation.
Chapter 30, Problem 23RQ
Problem
What is the goal of encryption? What process is involved in encrypting data and then recovering
it at the other end?
Step-by-step solution
Step 1 of 1
Suppose data is communicated over a channel that may not be secure, so that it can fall into the
wrong hands. In this situation, encryption lets us disguise the message so that even if the
transmission is diverted, the message will not be revealed. The goal of encryption is thus to
maintain secure data in an insecure environment.
The process involves encrypting the data with an encryption key before transmission; the
resulting data has to be decrypted using a decryption key at the other end to recover the original
data.
Chapter 30, Problem 24RQ
Problem
Step-by-step solution
Step 1 of 3
Public key encryption: Public key encryption is based on mathematical functions rather than
operations on bit patterns. They also involve the use of two separate keys, in contrast to
conventional encryption, which uses one key only. The use of two keys can have profound
consequences in the areas of confidentiality, key distribution, and authentication. The two keys
used for public key encryption are referred to as the public key and the private key. Invariably, the
private key is kept secret, but it is referred to as private key rather than secret key to avoid
confusion with conventional encryption.
Step 2 of 3
1.) Plain text: The readable message or data that is fed into the algorithm as input.
2.) Encryption algorithm: The algorithm that performs transformations on the plain text.
3. and 4.) Public key and private key: If one of these is used for encryption, the other is used for
decryption.
5.) Cipher text: The encrypted data or scrambled text produced for a given plaintext and set of
keys.
6.) Decryption algorithm: This algorithm accepts the cipher text and the matching key and
produces the original plain text.
Step 3 of 3
The public key is made public for others to use, whereas the private key is known only to its
owner. The scheme relies on one key for encryption and the other for decryption.
1.) Each user generates a pair of keys to be used for the encryption and decryption of messages.
2.) Each user places one of the keys in a public register or other accessible file. This is the public
key. The companion key is kept private.
3.) If a sender wishes to send a private message to a receiver, the sender encrypts the message
using the receiver's public key.
4.) When the receiver receives the message, he or she decrypts it using the receiver's private
key. No other recipient can decrypt the message because only the receiver knows his or her
private key.
Chapter 30, Problem 25RQ
Problem
Step-by-step solution
Step 1 of 1
The RSA encryption algorithm incorporates results from number theory, combined with the
difficulty of determining the prime factors of a large number. The RSA algorithm also operates
with modular arithmetic, mod n.
Two keys, e and d, are used for encryption and decryption. An important property is that they
can be interchanged. n is chosen as a large integer that is a product of two large distinct prime
numbers, a and b. The encryption key e is a randomly chosen number between 1 and n that is
relatively prime to (a-1)*(b-1). The plaintext block P is encrypted as P^e mod n. Because the
exponentiation is performed mod n, factoring P^e to uncover the encrypted plaintext is difficult.
However, the decryption key d is carefully chosen so that (P^e)^d mod n = P. The key d can be
computed from the condition that d*e = 1 mod ((a-1)*(b-1)). Thus, the legitimate receiver who
knows d simply computes (P^e)^d mod n = P and recovers P without having to factor P^e.
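The algorithm above can be verified with a toy example. The primes (written a and b here to match the text, often p and q elsewhere) and the plaintext value are illustrative and far too small to be secure.

```python
# Toy RSA with tiny parameters; requires Python 3.8+ for pow(e, -1, phi).
a, b = 61, 53
n = a * b                      # n = 3233
phi = (a - 1) * (b - 1)        # (a-1)*(b-1) = 3120
e = 17                         # relatively prime to phi
d = pow(e, -1, phi)            # modular inverse, so d*e = 1 mod phi

P = 65                         # plaintext block, must satisfy P < n
C = pow(P, e, n)               # encryption: P^e mod n
assert pow(C, d, n) == P       # decryption: (P^e)^d mod n recovers P
assert (d * e) % phi == 1      # the condition that defines d
```

Because e and d are interchangeable, `pow(pow(P, d, n), e, n)` also returns P, which is the property digital signatures exploit.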
Chapter 30, Problem 26RQ
Problem
Step-by-step solution
Step 1 of 1
A symmetric key uses the same key for both encryption and decryption; this characteristic
makes fast encryption and decryption possible, which suits sensitive data in the database.
• The message is encrypted with a secret key and can be decrypted with the same secret key.
• An algorithm used for symmetric key encryption is called a symmetric key algorithm; as such
algorithms are mostly used for encrypting the content of a message, they are also called content
encryption algorithms.
• The secret key is derived from a password string supplied by the user, by applying the same
function to the string at both the sender and the receiver. Thus it is also referred to as a
password-based encryption algorithm.
• Content encrypted with a longer key is more difficult to break than content encrypted with a
shorter key, as the strength of the encryption depends entirely on the key.
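The defining property (one shared, password-derived key; the same operation encrypts and decrypts) can be sketched with a toy XOR cipher. This is an assumption-laden illustration only: a repeated SHA-256 digest as a keystream is not a secure cipher, and the password and message are invented.

```python
import hashlib

def xor_cipher(data: bytes, password: str) -> bytes:
    """Toy password-based symmetric cipher: XOR with a password-derived keystream."""
    # Both sender and receiver derive the same keystream from the shared password.
    key = hashlib.sha256(password.encode()).digest()
    stream = (key * (len(data) // len(key) + 1))[:len(data)]
    # XOR is its own inverse, so this one function both encrypts and decrypts.
    return bytes(x ^ k for x, k in zip(data, stream))

cipher = xor_cipher(b"sensitive row", "shared-password")
assert cipher != b"sensitive row"                             # content is disguised
assert xor_cipher(cipher, "shared-password") == b"sensitive row"  # same key recovers it
```

A wrong password yields a different keystream and therefore garbage on decryption, which is exactly why the key must be kept secret and be long enough to resist guessing.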
Chapter 30, Problem 27RQ
Problem
What is the public key infrastructure scheme? How does it provide security?
Step-by-step solution
Step 1 of 2
1. Plain text: This is the data or readable message that is fed into the algorithm as input.
2. Encryption algorithm: This algorithm performs various transformations on the plaintext.
3. Public and private keys: These are a pair of keys that have been selected so that if one is
used for encryption, the other is used for decryption. The exact transformations performed by the
encryption algorithm depend on the public or private key that is provided as input.
4. Cipher text: This is the scrambled message produced as output. It depends on the plain text
and the key. For a given message, two different keys will produce two different cipher texts.
5. Decryption algorithm: This algorithm accepts the cipher text and the matching key and
produces the original plaintext. A general-purpose public-key cryptographic algorithm relies on
one key for encryption and a different but related key for decryption.
Step 2 of 2
1. Each user generates a pair of keys to be used for the encryption and decryption of messages.
2. Each user places one of the two keys in a public register or an accessible file (the public key);
the companion key is kept private.
3. If a user wishes to send a private message to a receiver, the sender encrypts it using the
receiver's public key.
4. The receiver receives the message and decrypts it using the receiver's private key. No other
user can decrypt the message, and thus this provides security for the data.
Chapter 30, Problem 28RQ
Problem
Step-by-step solution
Step 1 of 1
A digital signature is a means of associating a mark unique to an individual with a body of text.
The mark should be unforgeable; that is, others should be able to verify that the signature comes
from the originator.
The signature must be different for each use. This can be achieved by making each digital
signature a function of the message that it is signing, together with a timestamp.
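The "function of message plus timestamp" idea can be sketched by signing a hash with the toy RSA key pair below. The primes, message, and timestamp are hypothetical, and parameters this small are not remotely secure; the sketch only shows the mechanism.

```python
import hashlib

# Toy RSA key pair (illustrative only, not secure).
a, b = 61, 53
n, phi = a * b, (a - 1) * (b - 1)
e = 17
d = pow(e, -1, phi)            # signer's private exponent (Python 3.8+)

def sign(message: str, timestamp: str) -> int:
    # The signature is a function of the message AND the timestamp,
    # so signing the same message later yields a different mark.
    digest = int(hashlib.sha256((message + timestamp).encode()).hexdigest(), 16) % n
    return pow(digest, d, n)   # only the private-key holder can produce this

def verify(message: str, timestamp: str, signature: int) -> bool:
    digest = int(hashlib.sha256((message + timestamp).encode()).hexdigest(), 16) % n
    return pow(signature, e, n) == digest

sig = sign("transfer $100", "2024-01-01T00:00:00")
assert verify("transfer $100", "2024-01-01T00:00:00", sig)
```

Anyone holding the public exponent e can check the mark, but with this tiny modulus a tampered message would almost surely fail verification rather than certainly; a real scheme uses keys large enough to make forgery computationally infeasible.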
Chapter 30, Problem 29RQ
Problem
Step-by-step solution
Step 1 of 1
A digital certificate combines the public key with the identity of the person who holds the
corresponding private key into a statement that is digitally signed. Certificates are issued and
signed by a certification authority (CA).
1. The certificate owner information, which is a unique identifier known as the distinguished name
(DN) of the owner. It includes the owner's name, organization, and other related information
about the owner.
4. The validity period is specified by 'Valid From' and 'Valid To' dates.
6. The digital signature of the certification authority (CA) that issues the certificate.
All the information is encoded through a message-digest function, which creates the signature.
Chapter 30, Problem 30E
Problem
Step-by-step solution
Step 1 of 3
Protecting data from unauthorized access is referred to as data privacy. The data warehouses in
which a large amount of data is stored must be kept private and secure.
There are many challenges associated with data privacy. Some of them are as follows:
• In order to preserve data privacy, performing data mining and analysis should be minimized.
Usually, a large amount of data is collected and stored in centralized location. Violating one
security policy will expose all the data. So, it is better to avoid storing data in central warehouse.
Step 2 of 3
• The database contains personal data of the individuals. So, personal data of the individuals is
to be kept secure and private.
• A lot of people in the organization and outside the organization access the data. Data must be
protected from illegal access/attacks.
Step 3 of 3
• A good security mechanism should be imposed to protect the data from unauthorized users. It
includes physical security which includes protecting the location where the data is stored.
• Provide controlled and limited access to the data. Ensure that only authorized users can access
the data by using biometrics, passwords etc. Also impose mechanism so that they access the
data that they need.
• It is better to avoid storing data in central warehouse and distribute the data in different
locations.
Chapter 30, Problem 31E
Problem
What are some of the current outstanding challenges for database security?
Step-by-step solution
Step 1 of 3
1.) Data Quality: The database community needs techniques and organizational solutions to
assess and attest to the quality of data. Techniques may be as simple as quality stamps posted
on Web sites. We also need techniques that provide more efficient integrity semantics verification
and tools for the assessment of data quality, based on techniques such as record linkage.
Step 2 of 3
2.) Intellectual property Rights: With the widespread use of internet and intranets, legal and
informational aspects of data are becoming major concerns of organizations. To address these
concerns, watermarking techniques for relational data have recently been proposed. The main
purpose of digital watermarking is to protect content from unauthorized duplication and
distribution by enabling provable ownership of the content. It has traditionally relied upon the
availability of large noise domain within which the object can be altered while retaining its
essential properties. However, research is needed to assess the robustness of such techniques
and to investigate different approaches aimed at preventing intellectual property right violations.
Step 3 of 3
3.) Data Survivability: Database systems need to operate and continue their functions, even
with reduced capabilities, despite disruptive events such as information warfare attacks. A
DBMS, in addition to making every effort to prevent an attack and detecting one in the event of
occurrence, should be able to do following:
1.) Confinement: Take immediate action to eliminate the attacker's access to the system and to
isolate or contain the problem to prevent further spread.
2.) Damage assessment: Determine the extent of the problem, including failed functions and
corrupted data.
4.) Repair: Recover corrupted or lost data and repair or reinstall failed system functions to
reestablish a normal level of operation.
5.) Fault treatment: To the extent possible, identify the weaknesses exploited in the attack and
take steps to prevent a recurrence.
The goal of the information warfare attacker is to damage the organization's operation and the
fulfillment of its mission through disruption of its information systems. The specific target of an
attack may be the system itself or its data. While attacks that bring the system down outright are
severe and dramatic, they must also be well timed to achieve the attacker's goal, since attacks
will receive immediate and concentrated attention in order to bring the system back to
operational condition, diagnose how the attack took place, and install preventive measures.
Chapter 30, Problem 32E
Problem
Consider the relational database schema in Figure 5.5. Suppose that all the relations were
created by (and hence are owned by) user X, who wants to grant the following privileges to user
accounts A, B, C, D, and E
a. Account A can retrieve or modify any relation except DEPENDENT and can grant any of these
privileges to other users.
b. Account B can retrieve all the attributes of EMPLOYEE and DEPARTMENT except for Salary,
Mgr_ssn, and Mgr_start_date.
c. Account C can retrieve or modify WORKS_ON but can only retrieve the Fname, Minit, Lname,
and Ssn attributes of EMPLOYEE and the Pname and Pnumber attributes of PROJECT.
d. Account D can retrieve any attribute of EMPLOYEE or DEPENDENT and can modify
DEPENDENT.
e. Account E can retrieve any attribute of EMPLOYEE but only for EMPLOYEE tuples that have
Dno = 3.
f. Write SQL statements to grant these privileges. Use views where appropriate.
Step-by-step solution
Step 1 of 6
TO USER_A
Step 2 of 6
SUPERSSN, DNO
FROM EMPLOYEE;
TO USER_B;
TO USER_B;
Step 3 of 6
FROM EMPLOYEE;
TO USER_C;
CREATE VIEW PROJ1 AS
SELECT PNAME, PNUMBER
FROM PROJECT;
TO USER_C;
Step 4 of 6
Step 5 of 6
WHERE DNO = 3;
Step 6 of 6
Chapter 30, Problem 33E
Problem
Suppose that privilege (a) of Exercise is to be given with GRANT OPTION but only so that
account A can grant it to at most five accounts, and each of these accounts can propagate the
privilege to other accounts but without the GRANT OPTION privilege. What would the horizontal
and vertical propagation limits be in this case?
Consider the relational database schema in Figure 5.5. Suppose that all the relations were
created by (and hence are owned by) user X, who wants to grant the following privileges to user
accounts A, B, C, D, and E
a. Account A can retrieve or modify any relation except DEPENDENT and can grant any of these
privileges to other users.
b. Account B can retrieve all the attributes of EMPLOYEE and DEPARTMENT except for Salary,
Mgr_ssn, and Mgr_start_date.
c. Account C can retrieve or modify WORKS_ON but can only retrieve the Fname, Minit, Lname,
and Ssn attributes of EMPLOYEE and the Pname and Pnumber attributes of PROJECT.
d. Account D can retrieve any attribute of EMPLOYEE or DEPENDENT and can modify
DEPENDENT.
e. Account E can retrieve any attribute of EMPLOYEE but only for EMPLOYEE tuples that have
Dno = 3.
f. Write SQL statements to grant these privileges. Use views where appropriate.
Step-by-step solution
Step 1 of 1
User X grants account A the privilege with a horizontal propagation limit of 5 and a vertical
propagation limit of 1, so that user A can then grant it with a level-0 vertical limit (i.e., without
the GRANT OPTION) to at most five users, who then cannot further grant the privilege.
Chapter 30, Problem 34E
Problem
Consider the relation shown in Figure 30.2(d). How would it appear to a user with classification
U? Suppose that a classification U user tries to update the salary of ‘Smith’ to $50,000; what
would be the result of this action?
Step-by-step solution
Step 1 of 1
If a classification-U user tried to update the salary of 'Smith' to $50,000, a third
polyinstantiation of the Smith tuple would result, as follows.