
Introduction to DBMS

What is a Database?

A structured collection of related data

A filing cabinet, an address book, a
telephone directory, a timetable, etc.

In Access, your Database is your collection
of related tables
A database is a storage space for content /
information (data)
But what is data? And where is it now?
Data is factual information about objects and concepts,
such as:
Measurements, statistics
You can find it in:
Filing cabinets, spreadsheets, folders, ledgers, lists, colleagues’
memories, piles of papers on your desk
Data vs. Information

 Data – a collection of facts made up of text, numbers and dates:
Murray 35000 7/18/86

 Information – the meaning given to data in the way it is interpreted:
Mr. Murray is a salesperson whose annual salary is
$35,000 and whose hire date is July 18, 1986.

Data -> Field -> Record -> Table
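The step from data to information can be sketched in a few lines of Python. This is only an illustration: the Employee class, its field names and the two helper functions are assumptions made for this example, not part of the original slides. The raw string is the data; giving each value a named, typed field turns it into a record that can be interpreted.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass
class Employee:
    """One record: each attribute is a field (names are hypothetical)."""
    last_name: str
    annual_salary: int
    hire_date: date


def parse_employee(raw: str) -> Employee:
    """Turn a raw data string such as 'Murray 35000 7/18/86' into a record."""
    name, salary, hired = raw.split()
    return Employee(name, int(salary), datetime.strptime(hired, "%m/%d/%y").date())


def describe(e: Employee) -> str:
    """Interpret the record, i.e. present the data as information."""
    hired = f"{e.hire_date:%B} {e.hire_date.day}, {e.hire_date.year}"
    return f"{e.last_name} earns ${e.annual_salary:,} a year and was hired on {hired}."


employee = parse_employee("Murray 35000 7/18/86")
sentence = describe(employee)
print(sentence)
```

A table is then simply a collection of such records that share the same fields.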


What does “managing information”
mean?
Making information work for us

Making information useful

Avoiding "accidental disorganisation"

Making information easily accessible and integrated
with the rest of our work
Managing as re-organising
We often need to access and re-sort data for various
uses. These may include:
Creating mailing lists
Writing management reports
Generating lists of selected news stories
Identifying various client needs
Managing as re-processing
The processing power of a database allows it to:
 Sort
 Match
 Link
 Aggregate
 Skip fields
 Calculate
 Arrange
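A few of the operations listed above (sort, match, link, aggregate) can be tried out with Python's built-in sqlite3 module. The client and purchase tables and all their values are made up for this sketch; any relational DBMS would offer equivalent operations.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE client (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
CREATE TABLE purchase (client_id INTEGER, amount REAL);
INSERT INTO client VALUES (1, 'Jones', 'Harrison'), (2, 'Smith', 'Rye'),
                          (3, 'Johnson', 'Palo Alto');
INSERT INTO purchase VALUES (1, 20.0), (1, 5.0), (2, 12.5);
""")

# Sort: arrange clients alphabetically.
sorted_names = [r[0] for r in conn.execute("SELECT name FROM client ORDER BY name")]

# Match: select only the clients that satisfy a condition.
rye_clients = [r[0] for r in conn.execute(
    "SELECT name FROM client WHERE city = ?", ("Rye",))]

# Link + aggregate: join the two tables and total the purchases per client.
totals = conn.execute("""
    SELECT c.name, SUM(p.amount)
    FROM client c JOIN purchase p ON p.client_id = c.id
    GROUP BY c.id, c.name
    ORDER BY c.name
""").fetchall()

print(sorted_names, rye_clients, totals)
```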
Database Management System (DBMS)
 Collection of interrelated data
 Set of programs to access the data
 DBMS contains information about a particular enterprise
 DBMS provides an environment that is both convenient and
efficient to use.
 Database Applications:
 Banking: all transactions
 Airlines: reservations, schedules
 Universities: registration, grades
 Sales: customers, products, purchases
 Manufacturing: production, inventory, orders, supply chain
 Human resources: employee records, salaries, tax deductions
 Databases touch all aspects of our lives
Purpose of Database System
In the early days, database applications were
built on top of file systems
Drawbacks of using file systems to store data:
Data redundancy and inconsistency
 Multiple file formats, duplication of information in
different files
Difficulty in accessing data
 Need to write a new program to carry out each new task
Data isolation — multiple files and formats
Integrity problems
 Integrity constraints (e.g. account balance > 0) become
part of program code
 Hard to add new constraints or change existing ones
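In a DBMS an integrity constraint is declared once, with the data, instead of being repeated in every program. The sketch below uses Python's sqlite3 module and a hypothetical account table; the try_update helper is an assumption of this example. The rule "balance > 0" is stated in the table definition and the database rejects any update that violates it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The constraint lives in the schema, not in application code.
conn.execute("""
    CREATE TABLE account (
        account_number TEXT PRIMARY KEY,
        balance REAL NOT NULL CHECK (balance > 0)
    )
""")
conn.execute("INSERT INTO account VALUES ('A-101', 500.0)")
conn.commit()


def try_update(new_balance: float) -> bool:
    """Return True if the DBMS accepts the new balance, False if it rejects it."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute(
                "UPDATE account SET balance = ? WHERE account_number = 'A-101'",
                (new_balance,),
            )
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_update(250.0)   # satisfies the CHECK constraint
rejected = try_update(-10.0)   # violates it, so the row is left untouched
final_balance = conn.execute("SELECT balance FROM account").fetchone()[0]
print(accepted, rejected, final_balance)
```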
Drawbacks of using file systems (cont.)
Atomicity of updates
 Failures may leave database in an inconsistent state with
partial updates carried out
 E.g. transfer of funds from one account to another should
either complete or not happen at all
Concurrent access by multiple users
 Concurrent access needed for performance
 Uncontrolled concurrent accesses can lead to
inconsistencies
 E.g. two people reading a balance and updating it at the
same time
Security problems
Database systems offer solutions to all the above
problems
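The funds-transfer example can be sketched with a transaction in Python's sqlite3 module. The account numbers, balances, the "balance >= 0" rule and the transfer and balances helpers are assumptions chosen for illustration. The transfer credits the destination first, so a failed debit would leave a partial update behind if the two steps were not wrapped in one transaction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE account (
        account_number TEXT PRIMARY KEY,
        balance REAL NOT NULL CHECK (balance >= 0)
    )
""")
conn.executemany("INSERT INTO account VALUES (?, ?)",
                 [("A-101", 500.0), ("A-215", 700.0)])
conn.commit()


def transfer(src: str, dst: str, amount: float) -> bool:
    """Move funds atomically: both updates happen, or neither does."""
    try:
        with conn:  # one transaction: commit on success, roll back on error
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE account_number = ?",
                (amount, dst))
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE account_number = ?",
                (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False


def balances() -> dict:
    return dict(conn.execute("SELECT account_number, balance FROM account"))


ok = transfer("A-101", "A-215", 100.0)        # succeeds
after_ok = balances()
failed = transfer("A-101", "A-215", 1000.0)   # debit fails, credit is undone
after_failed = balances()
print(ok, after_ok, failed, after_failed)
```

After the failed transfer the balances are the same as before it, which is the "complete or not happen at all" behaviour described above.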
In Short,
 A Database Management System (DBMS) is a set of computer programs that
controls the creation, maintenance, and the use of a database.
 It allows organizations to place control of database development in the hands of
database administrators (DBAs) and other specialists.
 A DBMS is a system software package that supports the use of an integrated collection of
data records and files known as a database.
 It allows different user application programs to easily access the same database.
 DBMSs may use any of a variety of database models, such as the network model
or relational model.
 In large systems, a DBMS allows users and other software to store and retrieve data
in a structured way.
 Instead of having to write computer programs to extract information, users can ask
simple questions in a query language.
 Thus, many DBMS packages provide fourth-generation programming languages
(4GLs) and other application development features.
 It helps to specify the logical organization for a database and access and use the
information within a database. It provides facilities for controlling data access,
enforcing data integrity, managing concurrency, and restoring the database from
backups. A DBMS also provides the ability to logically present database information
to users.
Levels of Data Abstraction
Physical level: describes how a record (e.g.,
customer) is stored.
Logical level: describes data stored in database,
and the relationships among the data.
    type customer = record
        name : string;
        street : string;
        city : string;
    end;
View level: application programs hide details of
data types. Views can also hide information (e.g.,
salary) for security purposes.
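A view that hides the salary column might look like the following sketch, written with Python's sqlite3 module. The employee table, the employee_public view and the sample row are hypothetical. Note that SQLite has no user accounts, so here the view only limits what a query returns; a multi-user DBMS would combine such a view with access permissions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Logical level: the full record, including the sensitive salary field.
CREATE TABLE employee (name TEXT, street TEXT, city TEXT, salary REAL);
INSERT INTO employee VALUES ('Murray', 'Main', 'Harrison', 35000);

-- View level: the same data with the salary hidden.
CREATE VIEW employee_public AS
    SELECT name, street, city FROM employee;
""")

cursor = conn.execute("SELECT * FROM employee_public")
view_columns = [d[0] for d in cursor.description]
view_rows = cursor.fetchall()
print(view_columns, view_rows)
```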
View of Data
An architecture for a database system

What data do users and application programs see?
What data is stored? Describes data properties such as data semantics and data relationships.
How is data actually stored? E.g. are we using disks? Which file system?
Database Languages and Interfaces
A DBMS must provide appropriate languages and interfaces for each category of users.

DBMS Languages
Data Definition Language (DDL): Used by the DBA and database
designers to specify the conceptual schema of a database. In many
DBMSs, the DDL is also used to define internal and external schemas
(views). In some DBMSs, separate storage definition language (SDL)
and view definition language (VDL) are used to define internal and
external schemas.
DDL Compiler

Data Manipulation Language (DML): Used to specify database
retrievals and updates (insertion, deletion, modifications)
- DML commands (data sublanguage) can be embedded in a
general-purpose programming language (host language).
- Alternatively, stand-alone DML commands can be applied directly
(query language).
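The difference between DDL and DML can be illustrated with SQL embedded in Python through the sqlite3 module. In this sketch Python plays the role of the host language and SQL is the data sublanguage; the customer table and its contents are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the schema.
conn.execute("""
    CREATE TABLE customer (
        customer_id   TEXT PRIMARY KEY,
        customer_name TEXT NOT NULL,
        customer_city TEXT
    )
""")

# DML: insertion, modification, deletion ...
conn.execute("INSERT INTO customer VALUES ('192-83-7465', 'Johnson', 'Palo Alto')")
conn.execute("INSERT INTO customer VALUES ('019-28-3746', 'Smith', 'Rye')")
conn.execute("UPDATE customer SET customer_city = 'Harrison' WHERE customer_name = 'Smith'")
conn.execute("DELETE FROM customer WHERE customer_name = 'Johnson'")

# ... and retrieval.
remaining = conn.execute(
    "SELECT customer_name, customer_city FROM customer").fetchall()
print(remaining)
```

The same SQL statements could also be typed directly into an interactive SQL shell, which corresponds to the stand-alone query language case.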
Types of DML

-Procedural DML:
• Also called record-at-a-time (record-oriented) or low-level DML
• Must be embedded in a programming language.
• Searches for and retrieves individual database records and uses looping
and other constructs of the host programming language to retrieve multiple
records.

-Declarative or non-procedural DML:


• Also called set-at-a-time (set-oriented) or high-level DML.
• Can be used as a stand-alone query language or can be embedded in a
programming language.
• Searches for and retrieves information from multiple related database
records in a single command.

- host language: general-purpose language, e.g. C++
- data sublanguage: DML
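The contrast between record-at-a-time and set-at-a-time retrieval can be sketched as follows, again with Python's sqlite3 module. The account table, its balances and both function names are hypothetical. The first function only imitates a procedural DML: it fetches records one by one and lets the host language loop do the filtering, whereas the second states the wanted result in a single declarative command.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account_number TEXT, balance REAL)")
conn.executemany("INSERT INTO account VALUES (?, ?)",
                 [("A-101", 500.0), ("A-215", 700.0), ("A-217", 750.0)])


def rich_accounts_procedural(threshold: float) -> list:
    """Record-at-a-time: the host language loops and tests each record."""
    result = []
    cursor = conn.execute("SELECT account_number, balance FROM account")
    while True:
        row = cursor.fetchone()      # retrieve one record
        if row is None:
            break
        if row[1] > threshold:       # the filtering logic lives in the program
            result.append(row[0])
    return sorted(result)


def rich_accounts_declarative(threshold: float) -> list:
    """Set-at-a-time: one command describes the whole result set."""
    rows = conn.execute(
        "SELECT account_number FROM account WHERE balance > ? "
        "ORDER BY account_number", (threshold,))
    return [r[0] for r in rows]


procedural = rich_accounts_procedural(600.0)
declarative = rich_accounts_declarative(600.0)
print(procedural, declarative)
```

Both functions return the same accounts; in the declarative version the DBMS, not the programmer, decides how to find them.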
Data Models
A collection of tools for describing
data
data relationships
data semantics
data constraints
Entity-Relationship model
Relational model
Other models:
object-oriented model
semi-structured data models
Older models: network model and hierarchical
model
Entity-Relationship Model
Example of schema in the entity-relationship model
Entity Relationship Model (Cont.)
E-R model of real world
Entities (objects)
 E.g. customers, accounts, bank branch
Relationships between entities
 E.g. Account A-101 is held by customer Johnson
 Relationship set depositor associates customers with
accounts
Widely used for database design
A database design in the E-R model is usually converted to a
design in the relational model (coming up next),
which is used for storage and processing
Relational Model
Example of tabular data in the relational model (each column heading is an attribute)

customer-id   customer-name  customer-street  customer-city  account-number
192-83-7465   Johnson        Alma             Palo Alto      A-101
019-28-3746   Smith          North            Rye            A-215
192-83-7465   Johnson        Alma             Palo Alto      A-201
321-12-3123   Jones          Main             Harrison       A-217
019-28-3746   Smith          North            Rye            A-201
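The table above can be loaded and queried with Python's sqlite3 module, as a rough sketch. The table name customer_account is an assumption, and underscores replace the hyphens in the attribute names because hyphens are not valid in unquoted SQL identifiers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE customer_account (
        customer_id TEXT, customer_name TEXT, customer_street TEXT,
        customer_city TEXT, account_number TEXT
    )
""")
rows = [
    ("192-83-7465", "Johnson", "Alma",  "Palo Alto", "A-101"),
    ("019-28-3746", "Smith",   "North", "Rye",       "A-215"),
    ("192-83-7465", "Johnson", "Alma",  "Palo Alto", "A-201"),
    ("321-12-3123", "Jones",   "Main",  "Harrison",  "A-217"),
    ("019-28-3746", "Smith",   "North", "Rye",       "A-201"),
]
conn.executemany("INSERT INTO customer_account VALUES (?, ?, ?, ?, ?)", rows)

# Which accounts does Johnson hold?
johnson_accounts = [r[0] for r in conn.execute(
    "SELECT account_number FROM customer_account "
    "WHERE customer_name = ? ORDER BY account_number", ("Johnson",))]

# Which accounts are shared by more than one customer?
joint_accounts = [r[0] for r in conn.execute(
    "SELECT account_number FROM customer_account "
    "GROUP BY account_number HAVING COUNT(*) > 1")]

print(johnson_accounts, joint_accounts)
```

Notice that Johnson's and Smith's names and addresses are stored twice in this single table; that repetition is the kind of redundancy a better relational design avoids by splitting customers and accounts into separate tables.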
A Sample Relational Database
Categories of data models
Conceptual (high-level, semantic) data models:
Provide concepts that are close to the way many
users perceive data. (Also called entity-based or
object-based data models.)
Physical (low-level, internal) data models:
Provide concepts that describe details of how data
is stored in the computer.
Implementation (representational) data
models: Provide concepts that fall between the
above two, balancing user views with some
computer storage details.
History of Data Models
Relational Model: proposed in 1970 by E.F. Codd (IBM),
first commercial system in 1981-82. Now in several
commercial products (DB2, ORACLE, SQL Server,
SYBASE, INFORMIX).
Network Model: the first one to be implemented by
Honeywell in 1964-65 (IDS System). Adopted heavily due
to the support by CODASYL (CODASYL - DBTG report of
1971). Later implemented in a large variety of systems -
IDMS (Cullinet - now CA), DMS 1100 (Unisys), IMAGE
(H.P.), VAX -DBMS (Digital Equipment Corp.).
Hierarchical Data Model: implemented in a joint effort by
IBM and North American Rockwell around 1965. Resulted
in the IMS family of systems. The most popular model.
Other systems based on this model: System 2k (SAS Inc.)

History of Data Models
Object-oriented Data Model(s): several models have been
proposed for implementing in a database system. One set
comprises models of persistent O-O Programming
Languages such as C++ (e.g., in OBJECTSTORE or
VERSANT), and Smalltalk (e.g., in GEMSTONE).
Additionally, systems like O2, ORION (at MCC - then
ITASCA), IRIS (at H.P.- used in Open OODB).
Object-Relational Models: the most recent trend. Started
with Informix Universal Server. Exemplified in recent
versions of systems such as Oracle 10g, DB2, and SQL Server.

Hierarchical Model
• ADVANTAGES:
• Hierarchical Model is simple to construct and operate on
• Corresponds to a number of natural hierarchically organized
domains - e.g., assemblies in manufacturing, personnel
organization in companies
• Language is simple; uses constructs like GET, GET UNIQUE,
GET NEXT, GET NEXT WITHIN PARENT etc.
• DISADVANTAGES:
• Navigational and procedural nature of processing
• Database is visualized as a linear arrangement of records
• Little scope for "query optimization"

Network Model
• ADVANTAGES:
• Network Model is able to model complex relationships and
represents semantics of add/delete on the relationships.
• Can handle most situations for modeling using record types and
relationship types.
• Language is navigational; uses constructs like FIND, FIND
member, FIND owner, FIND NEXT within set, GET etc.
Programmers can do optimal navigation through the database.
• DISADVANTAGES:
• Navigational and procedural nature of processing
• Database contains a complex array of pointers that thread
through a set of records.
• Little scope for automated "query optimization"

Data Independence
Logical Data Independence: The capacity to change the conceptual schema without
having to change the external schemas and their application programs.
The conceptual schema may change by adding or removing a record type or
data item to
· expand the database
· reduce the database

Physical Data Independence: The capacity to change the internal schema without
having to change the conceptual schema.

Reorganize physical files to improve performance,
e.g. of a query such as "List all sections offered in Fall 1998"
When a schema at a lower level is changed, only the mappings between this
schema and higher-level schemas need to be changed in a DBMS that fully supports
data independence. The higher-level schemas themselves are unchanged. Hence, the
application programs need not be changed since they refer to the external schemas.

Disadvantages of two levels of mappings:
Overhead during compilation or execution of a query or
program
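Both kinds of data independence can be demonstrated on a small scale with Python's sqlite3 module. Everything named here (the section table, the section_list view, the course codes, the index name) is invented for the sketch. The application query goes through a view that stands in for an external schema, and it keeps working unchanged after a column is added to the base table (a conceptual change) and after an index is created (a physical change).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE section (course TEXT, semester TEXT, year INTEGER);
INSERT INTO section VALUES ('CS101', 'Fall', 1998),
                           ('CS102', 'Spring', 1999),
                           ('MA201', 'Fall', 1998);
-- External schema: what the application program sees.
CREATE VIEW section_list AS SELECT course, semester, year FROM section;
""")


def fall_1998() -> list:
    """The application's query: list all sections offered in Fall 1998."""
    query = ("SELECT course FROM section_list "
             "WHERE semester = 'Fall' AND year = 1998 ORDER BY course")
    return [r[0] for r in conn.execute(query)]


before = fall_1998()

# Logical change: expand the database with a new data item.
conn.execute("ALTER TABLE section ADD COLUMN room TEXT")
# Physical change: add an access path to speed up retrieval.
conn.execute("CREATE INDEX idx_section_term ON section (semester, year)")

after = fall_1998()
print(before, after)
```

The query text never changes; only the DBMS's internal mapping from the view to the stored data is affected.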
Components of DBMS
DBMS Engine accepts logical requests from the various other DBMS
subsystems, converts them into physical equivalents, and actually accesses
the database and data dictionary as they exist on a storage device.
Data Definition Subsystem helps users to create and maintain the data
dictionary and define the structure of the files in a database.
Data Manipulation Subsystem helps users to add, change, and delete
information in a database and query it for valuable information. Software tools
within the data manipulation subsystem are most often the primary interface
between users and the information contained in a database. It allows users to
specify their logical information requirements.
Application Generation Subsystem contains facilities to help users
develop transaction-intensive applications. Such applications usually require that the user
perform a detailed series of tasks to process a transaction. The subsystem provides
easy-to-use data entry screens, programming languages, and interfaces.
Data Administration Subsystem helps users to manage the overall
database environment by providing facilities for backup and recovery, security
management, query optimization, concurrency control, and change
management.
Components of DBMS
[Figure: DBMS Component Modules]
Overall System Structure
[Figure: overall system structure, top to bottom:
Users: naïve users (tellers, agents, etc.), application programmers, sophisticated users, database administrator
Interfaces: application interfaces, application programs, query, database scheme
Query processor: embedded DML precompiler, DML compiler, DDL interpreter, application program object code, query evaluation engine
Storage manager: transaction manager, buffer manager, file manager
Disk storage: data files, indices, statistical data, data dictionary]
[Figure: Components of a DBMS]
Multi-User DBMS Architectures
Teleprocessing

File-server

Client-server
Teleprocessing
Traditional architecture.

Single mainframe with a number of terminals
attached.

Trend is now towards downsizing.


Teleprocessing Topology
File-Server
File-server is connected to several workstations
across a network.

Database resides on file-server.

DBMS and applications run on each
workstation.

Disadvantages include:
 Significant network traffic.
 Copy of DBMS on each workstation.
 Concurrency, recovery and integrity control more complex.
File-Server Architecture
Client-Server
Server holds the database and the DBMS.

Client manages user interface and runs
applications.
Advantages include:
wider access to existing databases;
increased performance;
possible reduction in hardware costs;
reduction in communication costs;
increased consistency.
Client-Server Architecture
Alternative Client-Server Topologies
Transaction Processing Monitors
Program that controls data transfer between
clients and servers in order to provide a consistent
environment, particularly for Online Transaction
Processing (OLTP).
Transaction Processing Monitor as middle
tier of a three-tier client-server
architecture
System Catalog
Repository of information (metadata) describing
the data in the database.
Typically stores:
 names of authorized users;
 names of data items in the database;
 constraints on each data item;
 data items accessible by a user and the type of access.
Used by modules such as Authorization Control
and Integrity Checker.
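Most DBMSs let you query the catalog itself. As a small illustration, SQLite keeps its catalog in a table called sqlite_master and exposes column metadata through PRAGMA table_info; the account table below is a made-up example. SQLite has no user accounts, so the authorization entries listed above have no counterpart in this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE account (
        account_number TEXT PRIMARY KEY,
        balance REAL NOT NULL CHECK (balance > 0)
    )
""")

# Names of the data items (tables) recorded in the catalog.
table_names = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]

# Per-column metadata: (name, declared type, NOT NULL flag).
columns = [(r[1], r[2], bool(r[3]))
           for r in conn.execute("PRAGMA table_info(account)")]

# The stored definition, including the constraint on balance.
definition = conn.execute(
    "SELECT sql FROM sqlite_master WHERE name = 'account'").fetchone()[0]

print(table_names, columns)
print(definition)
```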
Information Resource Dictionary System
(IRDS)
Response to an attempt to standardize data
dictionary interfaces.

Objectives:
extensibility of data;
integrity of data;
controlled access to data.
IRDS services interface
