CHAPTER 11
DATABASE CONCEPTS
Database: A database is a collection of interrelated data.
Database system: is a computer based record keeping system.
Purpose/Functions of databases
 Databases can be used to serve the same collection of data to as many applications as possible.
 Database reduces data redundancy to a large extent.
 Database can control data inconsistency.
 Database facilitates sharing of data.
 Database enforces standards.
 Database can ensure data security.
 Integrity can be maintained through databases.
 Database provides a centralized control of the data.
Differences between file processing and database system
 In a file processing system, records are permanently stored in various files and different
programs can access those files. This approach has many limitations, such as data redundancy,
data inconsistency, unstandardized data and insecure data.
 Database provides centralized control of the data which controls data redundancy and
data inconsistency. It also enforces standards, ensures data security and integrity.
Data redundancy
 Duplication of data is known as data redundancy.
Problems associated with data redundancy
 Student records are maintained in the school as well as in the hostel. Suppose that the
address of a hosteller changes. If the student forgets to inform the school authorities,
the same student's record differs in the hostel and school files. This leads to
inconsistent data.
 Another problem of data redundancy is the wastage of storage space.
How do database management systems overcome the problems of data redundancy?
 Database systems do not maintain separate copies of the same data. All the data are
kept in one place and all the applications refer to the centrally maintained database.
 Data updates happen in one place, and the changed information is available to all the
applications referring to it.

Figure: Application programs 1, 2 and 3 serve Users 1, 2 and 3, and all of them refer to a single shared database.

Data inconsistency
 Data is said to be inconsistent if two entries about the same data do not match. That is,
data become inconsistent if redundancy is not controlled.
 Data inconsistency can be reduced by,
 Controlling redundancy
 Propagating updates.
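The school and hostel example can be sketched in a few lines of Python. This is only an illustration: the student id, name and addresses are hypothetical sample data, and plain dictionaries stand in for the files and for the central database.

```python
# Redundant design: the school and the hostel each keep their own copy
# of the student's address (hypothetical sample data).
school_file = {"S101": {"name": "Anu", "address": "12 Lake Road"}}
hostel_file = {"S101": {"name": "Anu", "address": "12 Lake Road"}}

# The address changes, but only the hostel is informed.
hostel_file["S101"]["address"] = "7 Hill View"
inconsistent = school_file["S101"]["address"] != hostel_file["S101"]["address"]

# Centralized design: one copy, and both applications refer to it.
database = {"S101": {"name": "Anu", "address": "12 Lake Road"}}

def school_view(student_id):
    """Address as seen by the school application."""
    return database[student_id]["address"]

def hostel_view(student_id):
    """Address as seen by the hostel application."""
    return database[student_id]["address"]

database["S101"]["address"] = "7 Hill View"  # a single update in one place
consistent = school_view("S101") == hostel_view("S101")

print(inconsistent, consistent)  # True True
```

With two copies, one missed update is enough to make the records disagree. With one shared copy, the update is propagated to every application automatically.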
Sharing of data
 It means that each of the database users may have access to the same piece of data and
each of them may use it for different purposes.
Standardizing database
 A standardized database ensures that all the data follow the standards laid down by the
company or organization using the database.
 Standardizing makes data interchange or migration between systems possible.
Data security
 Data security refers to the protection of data against accidental disclosure to
unauthorized persons.
 Data security can be ensured by
 Accessing database through proper channels
 Carrying out authorization checks whenever access to sensitive data is attempted
Data integrity
 Data integrity refers to the overall completeness, accuracy and consistency of data.
 Data integrity can be maintained through integrity checks to ensure that data values
conform to certain specified rules.
Ex: Range checks & value matching
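A range check and a value-matching check might look like the sketch below. The field names (`marks`, `grade`), the 0 to 100 range and the set of grades are assumptions chosen for illustration; they are not taken from any particular database.

```python
def range_check(value, low, high):
    """Range check: the value must lie between low and high (inclusive)."""
    return low <= value <= high

def value_match(value, allowed):
    """Value matching: the value must be one of the permitted values."""
    return value in allowed

def validate_mark_record(record):
    """Return a list of integrity violations for a hypothetical mark record."""
    errors = []
    if not range_check(record["marks"], 0, 100):
        errors.append("marks out of range")
    if not value_match(record["grade"], {"A", "B", "C", "D", "E"}):
        errors.append("unknown grade")
    return errors

print(validate_mark_record({"marks": 85, "grade": "A"}))   # []
print(validate_mark_record({"marks": 120, "grade": "Z"}))  # both checks fail
```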
Database Abstraction
 The process of hiding internal irrelevant details from users to ease the user interaction
with database is called data abstraction.
Different types of database users
1. End user
2. Application system analyst
3. Physical storage system analyst
End user
 End user is a person who is not computer trained but uses the database to retrieve
information.
 Ex: The customer of a bank who checks his/her account balance
Application system analyst
 Application system analyst is the person who is concerned about the database at logical
level.
 That is, this person deals with what data constitute the database, the relationships
between the data entities and so on, without considering the implementation details at the
physical level.
Physical storage system analyst
 Physical storage system analyst is a person who is concerned about the implementation
details of the database at the physical level.
 That is, which storage device can be used for storing the database, what will be the base
address etc.
Various levels of database implementation OR Different levels of database abstraction
1. Internal level
2. Conceptual level
3. External level
Internal level (Physical level)
 It is the lowest level of abstraction.
 It is also known as physical level
 This level describes how the data are actually stored in the storage medium
 This level uses complex low level data structures.
 Physical storage system analyst is the person who is concerned about this level
Conceptual level
 This level describes what data are actually stored inside the database.
 The level also describes the relationship among data
 This level describes database logically in terms of simple data structures
 Application system analyst is the person who is concerned about this level
External level (View level)
 This is the level closest to the users
 This level is concerned with the way data is viewed by the users
 Most of the users are not concerned about the complete data in the database. They are
concerned only about the data they require.
 Ex: In a bank database, the customer is concerned only about his/her account details and
not about the other information stored in the database.
 It is also known as view level.

Figure: Levels of database abstraction. View level (View 1, View 2, View 3): for example, the personal details and the mark details of the student. Conceptual level: the data and the type of data. Physical level: the device and address for storing the data.


Data independence
 The ability to modify scheme definition in one level without affecting the scheme
definition in the next higher level is called data independence.
 There are two levels of data independence. They are
1. Physical data independence
2. Logical data independence
Physical data independence
 It is the ability to modify the scheme at physical level without affecting the scheme at the
conceptual level.
 Modifications at physical level improve the performance of the system.
 Application programs remain the same even if the scheme at the physical level gets
modified.
Logical data independence
 It is the ability to modify the conceptual scheme without making any changes at the
scheme at the external/view level.
 Modifications at the conceptual level are necessary whenever the logical structure of the
database is altered due to unavoidable reasons. Ex: Introducing paternity leave in the
employee database, or introducing a fine for the late return of books into the software of a
public library.
 Application programs remain the same.
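Both kinds of data independence can be tried out with Python's built-in `sqlite3` module. This is a rough sketch under stated assumptions: the `employee` table, its columns and the `leave_view` view are hypothetical, the view plays the role of the external level, and creating an index stands in for a change at the physical level.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, casual_leave INTEGER)"
)
conn.execute("INSERT INTO employee VALUES (1, 'Ravi', 12)")

# External level: the application only ever reads this view.
conn.execute(
    "CREATE VIEW leave_view AS SELECT emp_id, name, casual_leave FROM employee"
)

def leave_report():
    """The 'application program': it knows only the view, not the table."""
    return conn.execute("SELECT * FROM leave_view").fetchall()

before = leave_report()

# Physical-level change: add an index to speed up lookups by name.
conn.execute("CREATE INDEX idx_employee_name ON employee(name)")
after_physical_change = leave_report()

# Conceptual-level change: introduce paternity leave as a new column.
conn.execute("ALTER TABLE employee ADD COLUMN paternity_leave INTEGER DEFAULT 0")
after_logical_change = leave_report()

print(before == after_physical_change == after_logical_change)  # True
```

The function `leave_report` is never edited, yet it keeps returning the same rows after both changes.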
Different data models
Data model is a collection of conceptual tools for describing data, data relationships, data
semantics and consistency constraints. It helps in describing the structure of data at the logical
level. It is a link between user’s view of the world and bits stored in computer.
There are 3 data models that are used for database management. They are,
1. Relational data model
2. Network data model
3. Hierarchical data model
Relational data model
 In the relational data model, data is organized into tables (i.e. rows & columns).
 These tables are called relations. In a database, a relation means a 'table' in which data
are organized in the form of rows and columns. Therefore, in a database, relations are
equivalent to tables.
 Each row in a table represents a relationship among a set of values. Since a table is a
collection of such relationships, it is also referred to as a relation or entity.
 Rows in a relation are called tuples and columns are called attributes. A tuple is
also known as a 'record'.
 The users of a relational database system may query these tables, insert new tuples,
delete tuples and modify tuples. There are several languages for doing these tasks.
One such language is a relational query language.

Suppliers
Supp#   Supp-name    Status   City
S1      Britannia    10       Delhi
S2      New bakers   30       Mumbai

Items
Item#   Item-name   Price
11      Milk        15.00
12      Cake        5.00

Shipments
Supp#   Item#   Qty-supplied
S1      12      10
S1      13      20
S1      16      20
Note: The primary keys are Supp# in Suppliers, Item# in Items, and Supp# together with Item# in Shipments.

Network data model


 In network data model, data is represented by collection of records and relationships
among data are represented by links.
 A record is a collection of fields/attributes, each of which contains only one data value.
 A link is an association between two records.

Hierarchical model
 In hierarchical model also data is represented as records and relationships among data are
represented using links.
 The only difference is that in the hierarchical model, records are organized as trees rather
than arbitrary graphs.
 The record type at the top of the tree is known as the root.
 The operations on a hierarchical database are performed through a data manipulation
language.

Relational Model
Terminologies
Relation
 A relation means a 'table', in which data are organized in the form of rows and columns.
 A relation has the following properties
 In any given column of a table, all items must be of the same kind.
 In a row, a column cannot have more than one value.
 All rows of a relation are distinct.
 There is no order maintained for rows and columns inside a relation.
 Distinct names are given for the columns of a relation.
Domain
A domain is a pool of values from which the actual values of a given column are drawn.
That is, set of permitted values of an attribute is known as the domain of that attribute.
Tuple
Rows in a relation are called tuples. A tuple is also known as a 'record'.
Attributes
Columns in a relation are called attributes.
Degree
The number of attributes in a relation is known as the degree of a relation.
Cardinality
The number of rows in a relation is known as the cardinality of a relation.
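As a small worked example, the degree and cardinality of the Suppliers relation can be counted in Python. Representing the relation as a tuple of attribute names plus a set of row tuples is an assumption made for this sketch; a set is used because all rows of a relation are distinct and unordered.

```python
# Suppliers relation from these notes: attribute names and tuples (rows).
suppliers_attributes = ("Supp#", "Supp-name", "Status", "City")
suppliers_tuples = {
    ("S1", "Britannia", 10, "Delhi"),
    ("S2", "New bakers", 30, "Mumbai"),
}

degree = len(suppliers_attributes)   # number of attributes (columns)
cardinality = len(suppliers_tuples)  # number of tuples (rows)

print(degree, cardinality)  # 4 2
```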
View
 A view is a table whose contents are taken from other tables depending upon a condition.
 A view does not really exist in its own right but is derived from one or more base
table(s).
 Base tables are the tables that actually contain data and are used for deriving views.
 There is no stored file created for storing the contents of a view. Only its definition is
stored.
 Every time a view is referred to, its contents are derived from its base tables.
 A view is also referred to as a virtual table.
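The behaviour of a view can be observed with Python's built-in `sqlite3` module. In this sketch the table and view names (`items`, `costly_items`), the condition `price > 10` and the extra 'Bread' row are assumptions for illustration; the first two rows are the Items table from these notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Base table: it actually stores the data.
conn.execute(
    "CREATE TABLE items (item_no INTEGER PRIMARY KEY, item_name TEXT, price REAL)"
)
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [(11, "Milk", 15.00), (12, "Cake", 5.00)],
)

# View: only this definition is stored, not the rows it produces.
conn.execute(
    "CREATE VIEW costly_items AS "
    "SELECT item_no, item_name FROM items WHERE price > 10"
)

first = conn.execute("SELECT * FROM costly_items ORDER BY item_no").fetchall()

# A new row in the base table (hypothetical) shows up in the view
# the next time the view is referred to.
conn.execute("INSERT INTO items VALUES (13, 'Bread', 25.00)")
second = conn.execute("SELECT * FROM costly_items ORDER BY item_no").fetchall()

print(first)   # [(11, 'Milk')]
print(second)  # [(11, 'Milk'), (13, 'Bread')]
```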
Keys
Primary key
 A primary key is a set of one or more attributes that can uniquely identify tuples within
the relation.
 If the primary key consists of more than one attribute, it is called composite primary key.
 Primary key is non redundant. That is, it does not have duplicate values.
 The non primary key attributes of a table are known as non key attributes.
Suppliers
Supp#   Supp-name    Status   City
S1      Britannia    10       Delhi
S2      New bakers   30       Mumbai
Here Supp# is the primary key.

Items
Item#   Item-name   Price
11      Milk        15.00
12      Cake        5.00
Here Item# is the primary key.

Shipments
Supp#   Item#   Qty-supplied
S1      12      10
S1      13      20
S1      16      20
Here Supp# & Item# together form the primary key.

Candidate keys
 All attribute combinations in a relation that can serve as primary key are candidate keys.
 Ex: In Items table, Item# and Item-name are the candidate keys.
Alternate key
 A candidate key that is not the primary key is known as alternate key.
 Ex: In Items table, Item-name is the alternate key.
Foreign key
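A foreign key is a field (or collection of fields) in one table whose values must match the
primary key of a row in another table. That is, the foreign key is defined in a second table,
but it refers to the primary key in the first table.
Ex: In Shipments table, Supp# is a foreign key that refers to Supp# (the primary key) in Suppliers table.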
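Primary and foreign keys can be tried out with Python's built-in `sqlite3` module. This is a sketch, not a prescribed schema: the column names are adapted from the Suppliers and Shipments tables (SQL identifiers cannot contain `#`), only the supplier reference is declared as a foreign key, and the supplier 'S9' is a made-up value that does not exist. SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce foreign keys

conn.execute(
    "CREATE TABLE suppliers ("
    " supp_no TEXT PRIMARY KEY, supp_name TEXT, status INTEGER, city TEXT)"
)
conn.execute(
    "CREATE TABLE shipments ("
    " supp_no TEXT REFERENCES suppliers(supp_no),"  # foreign key
    " item_no INTEGER,"
    " qty_supplied INTEGER,"
    " PRIMARY KEY (supp_no, item_no))"              # composite primary key
)
conn.execute("INSERT INTO suppliers VALUES ('S1', 'Britannia', 10, 'Delhi')")
conn.execute("INSERT INTO shipments VALUES ('S1', 12, 10)")

def try_insert(sql):
    """Run an INSERT and report whether the key constraints accepted it."""
    try:
        conn.execute(sql)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"

# Same (supp_no, item_no) pair again: violates the composite primary key.
duplicate_key = try_insert("INSERT INTO shipments VALUES ('S1', 12, 99)")
# Supplier 'S9' is not in suppliers: violates the foreign key.
unknown_supplier = try_insert("INSERT INTO shipments VALUES ('S9', 12, 5)")
# New item for an existing supplier: allowed.
new_item = try_insert("INSERT INTO shipments VALUES ('S1', 13, 20)")

print(duplicate_key, unknown_supplier, new_item)  # rejected rejected accepted
```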

Relational Algebra
 Relational algebra is a collection of operations on relations.
Operations in relational algebra
 Select
 Project
 Cartesian product
 Union
 Set difference
 Set intersection
Unary operation
 Operations that act on a single relation are called unary operations.
 Ex: select, project
Binary operation
 Operations that act on pairs of relations are called binary operations.
 Ex: Cartesian product, Union, Set difference, Set intersection
Select operation
 It selects tuples from a relation based on a given predicate/condition.
 Selection is denoted by the lowercase Greek letter sigma ( σ ).
 Format is
σ_{condition} (relation)
 Ex: σ_{Price > 10} (Items)
 Ex: σ_{City = "Delhi"} (Suppliers)
 To specify the predicate, the relational operators =, <, >, ≠, ≤, ≥ can be used.
 More than one condition can be combined using the connectives 'and' ( ∧ ) and 'or' ( ∨ ).
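The select operation can be mimicked in Python as a filter over rows. In this sketch a relation is a list of dictionaries and the predicate is an ordinary function; both are representation choices made for illustration. The data comes from the Items and Suppliers tables in these notes.

```python
items = [
    {"Item#": 11, "Item-name": "Milk", "Price": 15.00},
    {"Item#": 12, "Item-name": "Cake", "Price": 5.00},
]
suppliers = [
    {"Supp#": "S1", "Supp-name": "Britannia", "Status": 10, "City": "Delhi"},
    {"Supp#": "S2", "Supp-name": "New bakers", "Status": 30, "City": "Mumbai"},
]

def select(relation, predicate):
    """Keep only the tuples for which the predicate is true."""
    return [row for row in relation if predicate(row)]

# Price > 10 on Items
costly = select(items, lambda r: r["Price"] > 10)

# City = "Delhi" on Suppliers
delhi = select(suppliers, lambda r: r["City"] == "Delhi")

# Two conditions combined with 'or'
delhi_or_high_status = select(
    suppliers, lambda r: r["City"] == "Delhi" or r["Status"] > 20
)

print([r["Item-name"] for r in costly])            # ['Milk']
print([r["Supp#"] for r in delhi])                 # ['S1']
print([r["Supp#"] for r in delhi_or_high_status])  # ['S1', 'S2']
```

Selection never changes the columns: every result row has the same attributes as the relation it came from.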

Project operation
 Projection selects the specified attributes of a relation in the specified order.
 It is denoted by the Greek letter pi ( ∏ ).
 Format is
∏_{attributes} (relation)
 Ex: ∏_{Item#, Item-name} (Items)
 Ex: ∏_{Supp#, Supp-name} (Suppliers)
 Duplicate tuples are automatically removed from the resulting relation.
 Ex: ∏_{Hobby} (Person)
Result will be,
Hobby
stamps
coins
hiking
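Projection, including the removal of duplicate tuples, can be sketched in Python as follows. The list-of-dictionaries representation is an assumption for illustration; the rows are the Shipments table from these notes, where every shipment comes from supplier S1.

```python
shipments = [
    {"Supp#": "S1", "Item#": 12, "Qty-supplied": 10},
    {"Supp#": "S1", "Item#": 13, "Qty-supplied": 20},
    {"Supp#": "S1", "Item#": 16, "Qty-supplied": 20},
]

def project(relation, attributes):
    """Keep only the named attributes, in the given order, dropping duplicates."""
    seen = set()
    result = []
    for row in relation:
        values = tuple(row[a] for a in attributes)
        if values not in seen:  # a relation cannot hold duplicate tuples
            seen.add(values)
            result.append(dict(zip(attributes, values)))
    return result

# Projecting on Supp# collapses three rows into one.
print(project(shipments, ["Supp#"]))         # [{'Supp#': 'S1'}]
# Projecting on Qty-supplied keeps 10 and 20 once each.
print(project(shipments, ["Qty-supplied"]))  # [{'Qty-supplied': 10}, {'Qty-supplied': 20}]
```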
Cartesian product operation


 It is a binary operation.
 It is denoted by a cross ( × ).
 Cartesian product of two relations A & B is denoted as A × B.
 Cartesian product produces a relation whose degree is equal to the sum of the degrees of the
relations operated upon.
 The number of tuples of the new relation is equal to the product of the number of tuples of the
relations operated upon.
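The two counting rules above can be checked on the Suppliers and Items tables from these notes. This sketch represents each relation as a list of plain tuples and uses `itertools.product` from the Python standard library to pair every supplier with every item.

```python
from itertools import product

suppliers = [
    ("S1", "Britannia", 10, "Delhi"),
    ("S2", "New bakers", 30, "Mumbai"),
]  # degree 4, 2 tuples
items = [
    (11, "Milk", 15.00),
    (12, "Cake", 5.00),
]  # degree 3, 2 tuples

# Every supplier tuple is joined end to end with every item tuple.
cartesian = [s + i for s, i in product(suppliers, items)]

degree = len(cartesian[0])    # 4 + 3
cardinality = len(cartesian)  # 2 * 2

print(degree, cardinality)  # 7 4
print(cartesian[0])         # ('S1', 'Britannia', 10, 'Delhi', 11, 'Milk', 15.0)
```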

Union operation
 It is a binary operation that requires two relations.
 It produces a third relation that contains tuples from both the operand relations.
 It is denoted by ∪.
 Union of two relations A & B is denoted as A ∪ B.
 A U B is valid only when the relations A & B satisfy the following conditions.
 Relations A & B must be of the same degree. That is, they must have the same
number of attributes.
 The domain of the ith attribute in A and the ith attribute of B must be the same.
 All the duplicate tuples will automatically be removed.
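The conditions for a valid union can be expressed as checks in Python. This is a simplified sketch: each relation is a dictionary holding a tuple of domains (here just Python types, which is an assumption made for illustration) and a set of tuples. The relation names and the `item_prices` relation are hypothetical; the supplier rows come from the Suppliers table.

```python
def union(a, b):
    """Union of two union-compatible relations; duplicates vanish in the set."""
    if len(a["domains"]) != len(b["domains"]):
        raise ValueError("relations must have the same degree")
    if a["domains"] != b["domains"]:
        raise ValueError("the i-th attributes must have the same domain")
    return {"domains": a["domains"], "tuples": a["tuples"] | b["tuples"]}

delhi_suppliers = {"domains": (str, str), "tuples": {("S1", "Britannia")}}
all_suppliers = {
    "domains": (str, str),
    "tuples": {("S1", "Britannia"), ("S2", "New bakers")},
}
item_prices = {"domains": (int, float), "tuples": {(11, 15.00)}}

result = union(delhi_suppliers, all_suppliers)
print(len(result["tuples"]))  # 2, because ('S1', 'Britannia') is kept only once

try:
    union(delhi_suppliers, item_prices)
except ValueError as err:
    print(err)  # the i-th attributes must have the same domain
```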

Set difference operation


 It is denoted by minus ( - ) sign.
 It gives the tuples that are in one relation but not in another.
 A-B gives the tuples in relation ‘A’ but not in relation ‘B’.
Set intersection operation


 It finds the tuples that are common to both the relations operated upon.
 It is denoted by ∩.
 A ∩ B gives the tuples common to both A & B.

 A ∩ B = A – (A – B)
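Python sets offer a quick way to check set difference, set intersection and the identity above. The two relations below are hypothetical single-attribute relations of supplier numbers, made up for this sketch.

```python
# Two union-compatible relations (hypothetical supplier numbers).
a = {("S1",), ("S2",), ("S3",)}
b = {("S2",), ("S3",), ("S4",)}

difference = a - b    # tuples in A but not in B
intersection = a & b  # tuples common to A and B

print(sorted(difference))    # [('S1',)]
print(sorted(intersection))  # [('S2',), ('S3',)]

# The identity: intersection expressed using set difference only.
print(intersection == a - (a - b))  # True
```

This is why set intersection is not a basic operation: it can always be rewritten with two set differences.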
