
UNIT 2

INTRODUCTION TO DBMS
DBMS
Data is raw, unprocessed facts, such as a name or an age,
which can be stored and processed by a
computer.
What is a Database?
A Database is a collection of related data
organized in a way that data can be easily
accessed, managed and updated.
Introduction to DBMS
• A DBMS is software that allows the creation,
definition and manipulation of a database,
allowing users to store, process and analyze data
easily.
• Examples:
• MySQL
• Oracle
DATABASE MANAGEMENT SYSTEMS

A database management system (DBMS) defines, creates
and maintains a database. The DBMS also allows
controlled access to data in the database. A DBMS is a
combination of five components: hardware,
software, data, users and procedures.

Figure : DBMS components


Hardware
The hardware is the physical computer system that allows
access to data.
Software
The software is the actual program that allows users to
access, maintain and update data. In addition, the software
controls which user can access which parts of the data in the
database.

Data
The data in a database is stored physically on the storage
devices. In a database, data is a separate entity from the
software that accesses it.
Users

In a DBMS, the term users has a broad meaning. We can
divide users into two categories: end users and application
programmers.

Procedures
The last component of a DBMS is a set of procedures or
rules that should be clearly defined and followed by the users
of the database.
Disadvantages of File-Based Systems
• Data redundancy and inconsistency

• Difficulty in accessing data

• Data isolation (data is scattered across separate files, often in different formats, so one application cannot easily use another's data)

• Security problems
DBMS vs. File System
• A DBMS is usually more expensive, while a traditional file system is cheap.
• A DBMS suits large systems, while a traditional file system
suits a small system with a small number of items.
• A DBMS requires a lot of design effort, while a traditional file
system needs very little.
• A DBMS provides strong security controls, while a traditional file system
offers little protection.
• A DBMS lets many users share data, while a traditional file system
makes data sharing difficult.
• A DBMS is flexible, while a traditional file system lacks
flexibility and has many limitations.
• A DBMS has a complex backup system, while a traditional file
system has a simple one.
ADVANTAGES OF DBMS

The main advantages of a database management system
are:
• Data sharing
• Less Redundancy
• Security
• Multiple user access
• Less data inconsistency
• Provides backup and recovery
• Confidentiality
DISADVANTAGES OF DBMS

The main disadvantages of a database
management system are:
• Increased costs
• Size and Complexity
• Security
• Database Failures
• Technical Staff Requirement
DATABASE ARCHITECTURE

Figure: Database Architecture


Internal level

The internal level determines where data is actually stored
on the storage devices. This level deals with low-level
access methods and how bytes are transferred to and from
storage devices.

Conceptual level

The conceptual level defines the logical view of the data. It
describes the structure of the whole database. The
conceptual level is an intermediary between the internal and external levels.
External level

The external level interacts directly with the user. It
describes the part of the database that a particular user
group is interested in and hides the rest of the database.
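One common way to picture the external level is a view defined over the full table. The sketch below is only an illustration, assuming SQLite through Python's standard sqlite3 module; the employee table, its columns and the employee_directory view are hypothetical names, not part of the slides.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in DBMS.
conn = sqlite3.connect(":memory:")

# Conceptual level: the full logical structure (hypothetical table).
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, salary REAL)"
)
conn.execute("INSERT INTO employee VALUES (1, 'Asha', 52000.0)")

# External level: a view that shows one user group only the columns
# it needs and hides the rest (here, the salary column).
conn.execute(
    "CREATE VIEW employee_directory AS SELECT emp_id, name FROM employee"
)

cursor = conn.execute("SELECT * FROM employee_directory")
view_columns = [column[0] for column in cursor.description]
view_rows = cursor.fetchall()
print(view_columns, view_rows)
```

Users of employee_directory never see salary, while the stored table is unchanged.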
People who interact with a database
People who work with a database:
1. End users: the people who interact with
the database through applications.
2. Database designers: they are responsible for
identifying the data to be stored in the database and for
choosing appropriate structures to represent and
store the data.
3. Application programmers: the people who
write application programs that use the database.
4. Database administrator (DBA): the person who
is responsible for the management of the database.
Responsibilities of a DBA
• Software maintenance and installation
• Data Handling
• Backup and Recovery
• Security
• Troubleshooting Data
DATABASE MODELS
• A database model defines the logical design of
data. The model also describes the relationships
between different parts of the data.
• It describes the structure of the data and the operations that can be
performed on it.
• In the history of database design, three models
have been in use: the hierarchical model, the
network model and the relational model.
Representational data model
1. Hierarchical data model
• Data is organized in a tree-like structure in which
each child node depends on exactly one parent node.
Example Diagram
2. Network data model
• Data is organised in the form of a graph.
• A parent can have many children and a child can
have many parents.
3. THE RELATIONAL DATABASE MODEL

In the relational database model, the data is represented as a
set of relations. A relation appears as a two-dimensional
table.

Figure: An example of a relation


A relation in an RDBMS has the following features:

• Name. Each relation in a relational database should have
a name that is unique among other relations.
• Attributes. Each column in a relation is called an
attribute. The attributes are the column headings in the
table.
• Tuples. Each row in a relation is called a tuple. A tuple
defines a collection of attribute values.
Cardinality of a relation − Number of tuples in a relation.
Degree of a relation − Number of attributes in a relation.
Relation key − Each row has one or more attributes, known as
relation key, which can identify the row in the relation (table) uniquely.
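The terms above can be checked on a small table. This is a minimal sketch assuming SQLite via Python's sqlite3 module; the student table and its four rows are made-up sample data, not the relation from the missing figure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A relation named "student" with three attributes (hypothetical sample).
conn.execute(
    "CREATE TABLE student (stu_id INTEGER PRIMARY KEY, stu_name TEXT, stu_age INTEGER)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Ram", 19), (2, "Sita", 20), (3, "Gita", 19), (4, "Ravi", 21)],
)

cursor = conn.execute("SELECT * FROM student")
degree = len(cursor.description)      # number of attributes (columns)
cardinality = len(cursor.fetchall())  # number of tuples (rows)
print(degree, cardinality)
```

Here stu_id plays the role of the relation key, since it is declared as the primary key.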
DATABASE DESIGN

• The design of any database is a lengthy and involved
task that can only be done through a step-by-step
process.
• The first step normally involves interviewing potential
users of the database.
• The second step is to build an entity-relationship model
(ERM) that defines the entities, the attributes of those
entities and the relationships between those entities.
Entity-relationship model (ERM)

In this step, the database designer creates an entity-relationship
(E-R) diagram to show the entities for which
information needs to be stored and the relationships between
those entities. E-R diagrams use several geometric shapes,
but we use only a few of them here:
❑ Rectangles represent entity sets
❑ Ellipses represent attributes
❑ Diamonds represent relationship sets
❑ Lines link attributes to entity sets and entity sets to relationship sets
Entity
Entities are represented by means of rectangles.
Rectangles are named with the entity set they
represent.
Relationship
• Relationships are represented by diamond-shaped
boxes. The name of the relationship is written
inside the diamond.
• Example: a diamond labelled Teaches.
Cardinality
• Cardinality is the number of occurrences of an
entity in one entity set that can be associated with
occurrences in another entity set through a relationship.
• It is also known as the mapping cardinality or cardinality ratio.
One-to-one
• When only one instance of each entity is
associated with the relationship, it is marked
as '1:1'.
One-to-many
• When one instance of the entity on the left can be associated with more
than one instance of the entity on the right, the relationship is marked as '1:N'.
• Example: One department can have many employees, while each
employee works for only one department.
Many-to-one
• When more than one instance of the entity on the left can be
associated with only one instance of the entity on the right,
the relationship is marked as 'N:1'.
Many-to-many
• More than one instance of the entity on the left and more than
one instance of the entity on the right can be
associated with the relationship. It is commonly marked as 'M:N'.
Example of Many to Many
• Each department can have any number of
employees working on a specific task.
Attributes
• Attributes are the properties of entities.
Attributes are represented by means of
ellipses.
Types of attributes
• Identifying Attribute is used to uniquely identify
an instance of an entity.
• Simple and composite attribute
• Stored and derived attribute
• Single valued and multi valued
Simple Attribute
The attribute Roll_No is simple: it cannot be
subdivided further.
Composite Attribute
• Name is a composite attribute, which can be
further subdivided into first name and last name.
Multivalued attributes are depicted by a
double ellipse.
• PhoneNo is a multivalued attribute because it can have
multiple values for the same student entity.
Stored and derived Attribute
• Derived attribute: Age (computed from BirthDate rather than stored)
• Stored attribute: BirthDate
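A derived attribute is computed when needed instead of being stored. The following sketch shows one way Age could be derived from the stored BirthDate; the function name age_on and the sample dates are illustrative assumptions.

```python
from datetime import date


def age_on(birth_date: date, today: date) -> int:
    """Derive the age in whole years from a stored birth date."""
    years = today.year - birth_date.year
    # Subtract one year if the birthday has not yet occurred this year.
    if (today.month, today.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


# Only the birth date is stored; the age is derived on request.
stored_birth_date = date(2000, 6, 15)
print(age_on(stored_birth_date, date(2024, 6, 14)))
print(age_on(stored_birth_date, date(2024, 6, 15)))
```

Because the value is recomputed each time, it never goes stale the way a stored age column would.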
A very simple E-R diagram with three entity sets, their attributes and
the relationship between the entity sets.

Figure: Entities, attributes and relationships in an E-R diagram


Types of key
1. Super key
2. Candidate key
3. Primary key
4. Foreign key
Primary key
• It is a key that can uniquely identify each record in a table.
PRIMARY KEY

• Primary key:{Stu_Id}
Candidate key
A candidate key is a minimal set of columns in a table that can uniquely
identify any database record without referring to any other data.

For example, STUD_NO and STUD_PHONE are both
candidate keys for the relation STUDENT.
Super key
A set of attributes that can uniquely identify a tuple is known as
a super key.

Primary key + any other attribute = Super key

Example: {Stu_Id}, {Stu_Id, Stu_Name}, {Stu_Id, Stu_Name, Stu_Age}
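The difference between a super key and a candidate key can be tested on sample rows. This is a rough sketch: the helper names is_super_key and is_candidate_key and the three student rows are hypothetical. Passing the check on one set of rows only shows that the data is consistent with the key; a real key is a rule about every possible row.

```python
def is_super_key(rows, attributes):
    """True if no two rows share the same values for `attributes`."""
    seen = set()
    for row in rows:
        key = tuple(row[name] for name in attributes)
        if key in seen:
            return False
        seen.add(key)
    return True


def is_candidate_key(rows, attributes):
    """A candidate key is a super key from which no attribute can be dropped."""
    if not is_super_key(rows, attributes):
        return False
    return not any(
        is_super_key(rows, [name for name in attributes if name != dropped])
        for dropped in attributes
    )


# Made-up sample data: names and ages repeat, Stu_Id does not.
students = [
    {"Stu_Id": 1, "Stu_Name": "Ram", "Stu_Age": 19},
    {"Stu_Id": 2, "Stu_Name": "Sita", "Stu_Age": 20},
    {"Stu_Id": 3, "Stu_Name": "Ram", "Stu_Age": 19},
]

print(is_super_key(students, ["Stu_Id", "Stu_Name"]))      # super key
print(is_candidate_key(students, ["Stu_Id", "Stu_Name"]))  # not minimal
print(is_candidate_key(students, ["Stu_Id"]))              # minimal
```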
Foreign key
- A foreign key identifies a row/record in another table.
- It creates a relationship between two tables.
- It is a column (or set of columns) in the child table that references the
primary key of the parent table.
Foreign key
• Primary key of department is the foreign key
in employee table.
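The department and employee case can be sketched as follows, assuming SQLite through Python's sqlite3 module. The column names and sample rows are hypothetical. SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`; other database systems usually enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch

# Parent table: its primary key is referenced by the child table.
conn.execute(
    "CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT NOT NULL)"
)
# Child table: dept_id is a foreign key pointing at department.dept_id.
conn.execute(
    """CREATE TABLE employee (
           emp_id   INTEGER PRIMARY KEY,
           emp_name TEXT NOT NULL,
           dept_id  INTEGER REFERENCES department(dept_id)
       )"""
)

conn.execute("INSERT INTO department VALUES (10, 'Sales')")
conn.execute("INSERT INTO employee VALUES (1, 'Asha', 10)")  # valid reference

try:
    # Department 99 does not exist, so this row must be rejected.
    conn.execute("INSERT INTO employee VALUES (2, 'Ravi', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

employee_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(orphan_rejected, employee_count)
```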
Integrity Constraints
• Data integrity ensures that the data in the database is correct,
consistent and valid.
• It prevents the entry of invalid data.
• Every relation has some conditions that must hold for it to be a valid
relation. These conditions are called Relational Integrity
Constraints.
• They are of the following types:
Domain integrity
• Domain integrity can be specified for each
attribute by defining its range or domain.
• Example: The domain of the attribute age is
between 10 and 25.
Entity Integrity (primary key integrity)

• Each tuple in a relation must be uniquely identifiable so
that it can be retrieved separately.
• This is enforced using a primary key (NOT NULL and
UNIQUE).
Check constraint
• A CHECK constraint restricts the values of a
column to those that satisfy a condition, such as lying within a range.
Not Null constraint
• A NOT NULL constraint prevents a column from
having a NULL value. Once a NOT
NULL constraint is applied to a column, you
cannot pass a null value to that column. It
forces the column to always contain a value.
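The constraints described above can be tried together on one table. This is a minimal sketch assuming SQLite via Python's sqlite3 module; the student table, the try_insert helper and the sample rows are illustrative. The age range 10 to 25 comes from the domain integrity example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE student (
           stu_id   INTEGER PRIMARY KEY,                   -- entity integrity
           stu_name TEXT NOT NULL,                         -- NOT NULL constraint
           age      INTEGER CHECK (age BETWEEN 10 AND 25)  -- domain / CHECK
       )"""
)


def try_insert(row):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO student VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


results = {
    "valid row": try_insert((1, "Ram", 18)),
    "age out of range": try_insert((2, "Sita", 40)),
    "missing name": try_insert((3, None, 18)),
    "duplicate primary key": try_insert((1, "Gita", 20)),
}
print(results)
```

Only the first row is stored; each of the other three breaks exactly one constraint.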
OPERATIONS ON RELATIONS

In a relational database we can define several operations to create
new relations based on existing ones. We define various operations
in this section: insert, delete, update, select, join, union, intersection
and difference.
Structured Query Language (SQL) is the language for use on
relational databases. It is a declarative rather than procedural
language, which means that users declare what they want without
having to write a step-by-step procedure.
TYPES OF SQL STATEMENTS
• Two basic types of SQL statements are covered here:
1. DATA DEFINITION LANGUAGE (DDL)
2. DATA MANIPULATION LANGUAGE (DML)
DATA DEFINITION LANGUAGE (DDL)
• It is used to create, modify and destroy databases and database objects.
• The main commands are:
o CREATE – is used to create the database or its objects (such as tables,
indexes, functions, views, stored procedures and triggers).
o DROP – is used to delete objects from the database.
o ALTER – is used to alter the structure of the database.
o RENAME – is used to rename an object existing in the database.
CREATE
Syntax
• CREATE TABLE table_name (
column1 datatype,
column2 datatype,
column3 datatype,
....
);
EXAMPLE
• CREATE TABLE Persons (
PersonID int,
LastName varchar(255),
FirstName varchar(255),
Address varchar(255),
City varchar(255)
);
ALTER
• The ALTER TABLE statement is used to add,
delete, or modify columns in an existing table.

SYNTAX
• ALTER TABLE table_name
ADD column_name datatype;
EXAMPLE
• ALTER TABLE Customers
ADD Email varchar(255);

• This command adds an Email column to the
Customers table.
DROP
Syntax
• DROP TABLE table_name;

EXAMPLE
DROP TABLE STUDENT;
This command deletes the STUDENT table.
RENAME COMMAND

The RENAME command is used to set a new name for an
existing table. The syntax is:

RENAME TABLE old_table_name TO new_table_name;

Example: RENAME TABLE student TO students_info;

The above query renames the table student to
students_info.
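The four DDL commands can be run in sequence to see their effect on the schema. The sketch below assumes SQLite via Python's sqlite3 module and reuses the Persons table from the CREATE example. SQLite has no RENAME TABLE statement, so the rename uses its `ALTER TABLE ... RENAME TO` form. The new name People and the helper table_names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def table_names():
    """List the tables that currently exist in the database."""
    query = "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    return [row[0] for row in conn.execute(query)]


# CREATE: define a new table.
conn.execute(
    """CREATE TABLE Persons (
           PersonID int,
           LastName varchar(255),
           FirstName varchar(255),
           Address varchar(255),
           City varchar(255)
       )"""
)

# ALTER: add a column to the existing table.
conn.execute("ALTER TABLE Persons ADD Email varchar(255)")
columns = [row[1] for row in conn.execute("PRAGMA table_info(Persons)")]

# RENAME: SQLite spells this as ALTER TABLE ... RENAME TO.
conn.execute("ALTER TABLE Persons RENAME TO People")
after_rename = table_names()

# DROP: remove the table and all of its data.
conn.execute("DROP TABLE People")
after_drop = table_names()

print(columns, after_rename, after_drop)
```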
DML(Data Manipulation Language)
• The SQL commands that deal with the manipulation of data
present in the database belong to DML, or Data Manipulation
Language, and this includes most SQL statements.

• SELECT – is used to retrieve data from the database.
• INSERT – is used to insert data into a table.
• UPDATE – is used to update existing data within a table.
• DELETE – is used to delete records from a database table.
SELECT
SYNTAX
SELECT column1, column2, ...
FROM table_name;

EXAMPLE
• SELECT * FROM table_name;
INSERT
SYNTAX
• INSERT INTO table_name (column1, column2,
column3, ...)
VALUES (value1, value2, value3, ...);

Example
• INSERT INTO Customers (CustomerName, City,
Country)
VALUES ('Cardinal', 'Stavanger', 'Norway');
Update
• UPDATE table_name
SET column1 = value1, column2 = value2, ...
WHERE condition;

• UPDATE Customers
SET ContactName = 'Alfred Schmidt',
City= 'Frankfurt'
WHERE CustomerID = 1;
DELETE
Syntax
• DELETE FROM table_name WHERE condition;
Example:
DELETE FROM Student WHERE name = 'RAM';
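The four DML statements can be followed end to end on one row. This is a minimal sketch assuming SQLite via Python's sqlite3 module. The Customers columns follow the INSERT and UPDATE examples, while the table definition itself (including CustomerID as an auto-assigned primary key) is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE Customers (
           CustomerID   INTEGER PRIMARY KEY,  -- assigned automatically by SQLite
           CustomerName TEXT,
           ContactName  TEXT,
           City         TEXT,
           Country      TEXT
       )"""
)

# INSERT: add one row.
conn.execute(
    "INSERT INTO Customers (CustomerName, City, Country) "
    "VALUES ('Cardinal', 'Stavanger', 'Norway')"
)
# SELECT: read it back.
after_insert = conn.execute(
    "SELECT CustomerID, CustomerName, City FROM Customers"
).fetchall()

# UPDATE: change columns of the row that matches the WHERE condition.
conn.execute(
    "UPDATE Customers SET ContactName = 'Alfred Schmidt', City = 'Frankfurt' "
    "WHERE CustomerID = 1"
)
after_update = conn.execute(
    "SELECT ContactName, City FROM Customers WHERE CustomerID = 1"
).fetchall()

# DELETE: remove the rows that match the WHERE condition.
conn.execute("DELETE FROM Customers WHERE CustomerName = 'Cardinal'")
remaining = conn.execute("SELECT COUNT(*) FROM Customers").fetchone()[0]

print(after_insert, after_update, remaining)
```

Leaving out the WHERE clause in UPDATE or DELETE would affect every row in the table.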
SQL AGGREGATE FUNCTIONS
Introduction
• An aggregate function allows you to perform a
calculation on a set of values to return a single
scalar value

SQL Aggregate Functions
1. Count
The COUNT function returns the number of
records (rows) that a query selects.
Example

• SELECT COUNT(*) FROM employee_tbl;


2. MAX
The SQL MAX function returns the
maximum value of a column in a set of records.

Example

• SELECT MAX(daily_typing_pages) FROM employee_tbl;

3. MIN
The SQL MIN function returns the
minimum value of a column in a set of
records.
Example

• SELECT MIN(daily_typing_pages) FROM employee_tbl;

4. AVG
• The SQL AVG function returns the
average of a column across a set of records.
Example

SELECT AVG(daily_typing_pages) FROM employee_tbl;

5. SUM
• The SQL SUM function returns the sum
of a column across a set of records.
Example

SELECT SUM(daily_typing_pages) FROM employee_tbl;
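The five aggregate functions can be compared side by side on a small table. The sketch assumes SQLite via Python's sqlite3 module. The employee_tbl rows below are made-up sample data, since the original output screenshots are not available, and the id and name columns are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee_tbl (id INTEGER, name TEXT, daily_typing_pages INTEGER)"
)
# Hypothetical sample rows, not the data from the original slides.
conn.executemany(
    "INSERT INTO employee_tbl VALUES (?, ?, ?)",
    [(1, "John", 250), (2, "Ram", 220), (3, "Jack", 170), (4, "Jill", 300)],
)

# Each aggregate collapses the whole column into a single value.
count, maximum, minimum, average, total = conn.execute(
    """SELECT COUNT(*),
              MAX(daily_typing_pages),
              MIN(daily_typing_pages),
              AVG(daily_typing_pages),
              SUM(daily_typing_pages)
       FROM employee_tbl"""
).fetchone()

print(count, maximum, minimum, average, total)
```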

Normalization

• Normalization is the process by which a given set of
relations is transformed into a new set of relations with
a more solid structure, one with less redundancy.
• Normalization is needed to remove anomalies in the
insertion, deletion and update of data in a database.
• Several normal forms have been proposed, including
1NF, 2NF, 3NF, BCNF (Boyce-Codd Normal Form),
4NF, 5NF and so on.
Problems Without Normalization
Insertion Anomaly
• If a single table stores both student and branch information,
inserting data for 100 students of the same branch repeats
the branch information for all 100 students.
Update Anomaly
• What if Mr. X leaves the college or is no longer
the HOD of the computer science department? In
that case all the student records will have to be
updated, and if by mistake we miss any record, it
will lead to data inconsistency. This is the update
anomaly.
Deletion Anomaly
• Student information and branch information are stored
together. Hence, if student records are deleted at the
end of the academic year, we also lose the
branch information.
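The update and deletion anomalies can be reproduced on a single combined table. This is a sketch under stated assumptions: SQLite via Python's sqlite3 module, a hypothetical student table that stores the branch and its HOD in every row, and made-up names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One table mixing student facts with branch facts (hypothetical layout).
conn.execute(
    "CREATE TABLE student (stu_id INTEGER PRIMARY KEY, name TEXT, branch TEXT, hod TEXT)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [(1, "Ram", "CSE", "Mr. X"), (2, "Sita", "CSE", "Mr. X")],
)

# Update anomaly: the HOD changes, but only one of the rows is updated.
conn.execute("UPDATE student SET hod = 'Mr. Y' WHERE stu_id = 1")
hods_for_cse = sorted(
    row[0]
    for row in conn.execute("SELECT DISTINCT hod FROM student WHERE branch = 'CSE'")
)

# Deletion anomaly: removing the students also removes every trace of the branch.
conn.execute("DELETE FROM student")
branches_left = conn.execute(
    "SELECT COUNT(DISTINCT branch) FROM student"
).fetchone()[0]

print(hods_for_cse, branches_left)
```

After the partial update the table claims two different HODs for the same branch, and after the delete no branch is recorded at all.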
First normal form (1NF)
•A table is in first normal form if it contains only atomic values,
i.e. there are no repeating groups or multi-valued cells.
•Each cell of the table holds exactly one
value.

Figure: An example of 1NF


Second Normal Form (2NF)

•Second normal form says that every non-prime attribute
should be fully functionally dependent on the prime key
attributes, i.e. on the whole key.
•There should not be any partial dependency.
•Partial dependency occurs when a non-prime attribute is
functionally dependent on only part of a candidate key.
Partial Dependency
Let us see an example:
Example
<StudentProject>

The prime key attributes are Stud_ID and Proj_ID.
Stud_Name can be determined by Stud_ID alone, which is a
partial dependency.
Proj_Name can be determined by Proj_ID alone, which is also a
partial dependency.
Therefore, the <StudentProject> relation violates 2NF
and is considered a bad database design.
To remove the partial dependency and the 2NF violation, decompose
the table:

Student Table: Stud_ID, Proj_ID, Stud_Name
Project Table: Proj_ID, Proj_Name
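A decomposition is only useful if the original rows can be rebuilt with a join. The sketch below shows one possible decomposition of StudentProject, assuming SQLite via Python's sqlite3 module. It is a variant, not the layout on the slide: it uses three tables, keeping the Stud_ID and Proj_ID pairs in a separate link table so that each name is stored once. The table names and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Unnormalized rows: (Stud_ID, Proj_ID, Stud_Name, Proj_Name), made-up data.
unnormalized = [
    (1, 101, "Ram", "Library"),
    (1, 102, "Ram", "Payroll"),
    (2, 101, "Sita", "Library"),
]

conn.executescript(
    """
    CREATE TABLE student (stud_id INTEGER PRIMARY KEY, stud_name TEXT);
    CREATE TABLE project (proj_id INTEGER PRIMARY KEY, proj_name TEXT);
    CREATE TABLE student_project (
        stud_id INTEGER REFERENCES student(stud_id),
        proj_id INTEGER REFERENCES project(proj_id),
        PRIMARY KEY (stud_id, proj_id)
    );
    """
)

# Each name now depends on the whole key of its own table.
for stud_id, proj_id, stud_name, proj_name in unnormalized:
    conn.execute("INSERT OR IGNORE INTO student VALUES (?, ?)", (stud_id, stud_name))
    conn.execute("INSERT OR IGNORE INTO project VALUES (?, ?)", (proj_id, proj_name))
    conn.execute("INSERT INTO student_project VALUES (?, ?)", (stud_id, proj_id))

student_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]

# Joining the three tables reproduces the original rows.
rejoined = conn.execute(
    """SELECT sp.stud_id, sp.proj_id, s.stud_name, p.proj_name
       FROM student_project AS sp
       JOIN student AS s ON s.stud_id = sp.stud_id
       JOIN project AS p ON p.proj_id = sp.proj_id
       ORDER BY sp.stud_id, sp.proj_id"""
).fetchall()

print(student_count, rejoined == unnormalized)
```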


Third normal form

•For a relation to be in Third Normal Form, it must be in
Second Normal Form, and no non-prime attribute may be
transitively dependent on the prime key attribute.
•When an indirect relationship causes a functional
dependency, it is called a transitive dependency.
•If P -> Q and Q -> R hold, then P -> R is a transitive
dependency.
•To achieve 3NF, eliminate the transitive dependency.
Example

Student_Detail

The Student_Detail table is not in 3NF because it has a transitive functional dependency:
Stu_ID → Zip
Zip → City
Hence Stu_ID → City
The relation <Student_Detail> therefore violates the Third Normal Form
(3NF).
To remove the violation, split the table and remove the transitive
functional dependency.

Student_Detail: Stu_ID, Stu_Name, Zip
ZipCodes: Zip, City
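Functional dependencies can be checked against sample rows before splitting a table. This is a rough sketch: the helper fd_holds and the three rows are hypothetical, and a dependency that holds in one sample is only evidence, not proof, that it holds in general.

```python
def fd_holds(rows, lhs, rhs):
    """True if equal values for `lhs` always come with equal values for `rhs`."""
    seen = {}
    for row in rows:
        left = tuple(row[name] for name in lhs)
        right = tuple(row[name] for name in rhs)
        # Remember the first right-hand value seen for each left-hand value.
        if seen.setdefault(left, right) != right:
            return False
    return True


# Made-up rows of the unsplit Student_Detail relation.
rows = [
    {"Stu_ID": 1, "Stu_Name": "Ram", "City": "Pune", "Zip": "411001"},
    {"Stu_ID": 2, "Stu_Name": "Sita", "City": "Pune", "Zip": "411001"},
    {"Stu_ID": 3, "Stu_Name": "Gita", "City": "Delhi", "Zip": "110001"},
]

print(fd_holds(rows, ["Stu_ID"], ["Zip"]))   # Stu_ID -> Zip
print(fd_holds(rows, ["Zip"], ["City"]))     # Zip -> City
print(fd_holds(rows, ["Zip"], ["Stu_ID"]))   # Zip is not a key of the unsplit table

# Projecting onto the two smaller schemas removes the repeated Zip/City pairs.
student_detail = sorted({(r["Stu_ID"], r["Stu_Name"], r["Zip"]) for r in rows})
zip_codes = sorted({(r["Zip"], r["City"]) for r in rows})
print(student_detail, zip_codes)
```

Each city now appears once per zip code instead of once per student.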
BCNF
Boyce-Codd Normal Form (BCNF) is a stricter
version of Third Normal Form.
BCNF states that:
•For any non-trivial functional dependency X → A, X must be
a super-key.
•In the given relations, Stu_ID is the super-key in
the relation Student_Detail and Zip is the super-key
in the relation ZipCodes.
•So Stu_ID → Stu_Name, Zip and Zip → City, which
confirms that both relations are in BCNF.
