
7/4/2016

Distributed DBMS
M. T. Özsu & P. Valduriez
Ch.1

Outline

Introduction
    What is a distributed DBMS
    Distributed DBMS Architecture
Background
Distributed Database Design
Database Integration
Semantic Data Control
Distributed Query Processing
Multidatabase query processing
Distributed Transaction Management
Data Replication
Parallel Database Systems
Distributed Object DBMS
Peer-to-Peer Data Management
Web Data Management
Current Issues

File Systems

[Figure: programs 1, 2 and 3, each with its own data description (1, 2, 3), access their own files (File 1, File 2, File 3).]

Database Management

[Figure: application programs 1, 2 and 3 (with data semantics) access a shared database through a DBMS that handles description, manipulation and control.]

Motivation

[Figure: database technology (integration) and computer networks (distribution) combine into distributed database systems (integration).]
integration ≠ centralization

Distributed Computing

A number of autonomous processing elements (not necessarily homogeneous) that are interconnected by a computer network and that cooperate in performing their assigned tasks.

What is being distributed?
    Processing logic
    Function
    Data
    Control

What is a Distributed Database System?

A distributed database (DDB) is a collection of multiple, logically interrelated databases distributed over a computer network.

A distributed database management system (DDBMS) is the software that manages the DDB and provides an access mechanism that makes this distribution transparent to the users.

Distributed database system (DDBS) = DDB + DDBMS

What is not a DDBS?

A timesharing computer system
A loosely or tightly coupled multiprocessor system
A database system which resides at one of the nodes of a network of computers; this is a centralized database on a network node

Centralized DBMS on a Network

[Figure: Sites 1 to 5 connected by a communication network.]

Distributed DBMS Environment

[Figure: Sites 1 to 5 connected by a communication network.]

Implicit Assumptions

Data stored at a number of sites: each site logically consists of a single processor.
Processors at different sites are interconnected by a computer network: not a multiprocessor system
    Parallel database systems
Distributed database is a database, not a collection of files: data logically related as exhibited in the users' access patterns
    Relational data model
D-DBMS is a full-fledged DBMS
    Not remote file system, not a TP system

Data Delivery Alternatives

Delivery modes
    Pull-only
    Push-only
    Hybrid
Frequency
    Periodic
    Conditional
    Ad-hoc or irregular
Communication Methods
    Unicast
    One-to-many
Note: not all combinations make sense

Distributed DBMS Promises

Transparent management of distributed, fragmented, and replicated data
Improved reliability/availability through distributed transactions
Improved performance
Easier and more economical system expansion
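
The pull-only and push-only delivery modes listed under Data Delivery Alternatives can be contrasted with a small sketch. The `DataSource` class and its method names are hypothetical and exist only for this illustration; a hybrid source, as here, simply offers both paths.

```python
class DataSource:
    """Hypothetical data source offering both delivery modes."""

    def __init__(self, value=None):
        self._value = value
        self._subscribers = []

    def pull(self):
        """Pull-only: the client asks and the source answers."""
        return self._value

    def subscribe(self, callback):
        """Register a client callback for push delivery."""
        self._subscribers.append(callback)

    def update(self, value):
        """Push: the source sends each new value to every subscriber."""
        self._value = value
        for callback in self._subscribers:
            callback(value)


source = DataSource()
received = []                      # what one subscribed client has been sent
source.subscribe(received.append)  # client opts in to push delivery
source.update(42)                  # the source initiates the transfer
latest = source.pull()             # the client initiates the transfer
```

In this sketch the push happens on every update, which roughly corresponds to conditional delivery; a periodic variant would call the subscribers on a timer instead.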

Transparency

Transparency is the separation of the higher level semantics of a system from the lower level implementation issues.
Fundamental issue is to provide data independence in the distributed environment
    Network (distribution) transparency
    Replication transparency
    Fragmentation transparency
        horizontal fragmentation: selection
        vertical fragmentation: projection
        hybrid

Example

[Figure]
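
As a rough illustration of the fragmentation types listed under Transparency, the sketch below treats horizontal fragmentation as a selection over rows and vertical fragmentation as a projection over columns that keeps the key. The sample rows and the `SITE` attribute are invented for the example, and the helper names are hypothetical.

```python
# Invented EMP rows; SITE is a hypothetical attribute used only to fragment on.
EMP = [
    {"ENO": "E1", "ENAME": "Ann", "TITLE": "Engineer", "SITE": "Paris"},
    {"ENO": "E2", "ENAME": "Bob", "TITLE": "Analyst", "SITE": "Boston"},
    {"ENO": "E3", "ENAME": "Eve", "TITLE": "Engineer", "SITE": "Paris"},
]


def horizontal_fragment(rows, predicate):
    """Selection: keep whole rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]


def vertical_fragment(rows, key, attributes):
    """Projection: keep the key plus a subset of the attributes."""
    return [{name: row[name] for name in (key, *attributes)} for row in rows]


# Horizontal fragments are rebuilt with a union.
emp_paris = horizontal_fragment(EMP, lambda row: row["SITE"] == "Paris")
emp_other = horizontal_fragment(EMP, lambda row: row["SITE"] != "Paris")
rebuilt_h = sorted(emp_paris + emp_other, key=lambda row: row["ENO"])

# Vertical fragments are rebuilt with a join on the shared key.
emp_names = vertical_fragment(EMP, "ENO", ["ENAME"])
emp_jobs = vertical_fragment(EMP, "ENO", ["TITLE", "SITE"])
jobs_by_eno = {row["ENO"]: row for row in emp_jobs}
rebuilt_v = [{**names, **jobs_by_eno[names["ENO"]]} for names in emp_names]
```

Fragmentation transparency means the user keeps querying EMP as a whole while the system performs the union or join. A hybrid scheme would apply one kind of fragmentation to the fragments produced by the other.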

Transparent Access

SELECT ENAME, SAL
FROM   EMP, ASG, PAY
WHERE  DUR > 12
AND    EMP.ENO = ASG.ENO
AND    PAY.TITLE = EMP.TITLE

[Figure: sites Tokyo, Paris, Boston, Montreal and New York connected by a communication network. Data stored per site:]
    Paris: Paris projects, Paris employees, Paris assignments, Boston employees
    Boston: Boston projects, Boston employees, Boston assignments
    New York: Boston projects, New York employees, New York projects, New York assignments
    Montreal: Montreal projects, Paris projects, New York projects with budget > 200000, Montreal employees, Montreal assignments

Distributed Database - User View

[Figure: users see a single distributed database.]
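
To make the Transparent Access query concrete, here is a minimal sketch that runs it against a single in-memory SQLite database standing in for the user's view of the data. The column layouts (in particular `PNO`) and all rows are assumptions made for the example; in a distributed DBMS the same query text would be issued unchanged while the relations are fragmented and replicated across sites.

```python
import sqlite3


def run_example_query():
    """Run the slide's query on a tiny, invented data set."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE EMP (ENO TEXT PRIMARY KEY, ENAME TEXT, TITLE TEXT);
        CREATE TABLE ASG (ENO TEXT, PNO TEXT, DUR INTEGER);
        CREATE TABLE PAY (TITLE TEXT PRIMARY KEY, SAL INTEGER);
        INSERT INTO EMP VALUES ('E1', 'Ann', 'Engineer'), ('E2', 'Bob', 'Analyst');
        INSERT INTO ASG VALUES ('E1', 'P1', 24), ('E2', 'P1', 6);
        INSERT INTO PAY VALUES ('Engineer', 40000), ('Analyst', 34000);
    """)
    rows = conn.execute("""
        SELECT ENAME, SAL
        FROM   EMP, ASG, PAY
        WHERE  DUR > 12
        AND    EMP.ENO = ASG.ENO
        AND    PAY.TITLE = EMP.TITLE
    """).fetchall()
    conn.close()
    return rows


result = run_example_query()
```

With this data only the employee whose assignment lasts longer than 12 satisfies the predicate, so the result holds a single (name, salary) pair.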

Distributed DBMS - Reality

[Figure: user queries and user applications reach DBMS software at several sites, linked through a communication subsystem.]

Types of Transparency

Data independence
Network transparency (or distribution transparency)
    Location transparency
    Fragmentation transparency
Replication transparency
Fragmentation transparency

Reliability Through Transactions

Replicated components and data should make distributed DBMS more reliable.
Distributed transactions provide
    Concurrency transparency
    Failure atomicity
Distributed transaction support requires implementation of
    Distributed concurrency control protocols
    Commit protocols
Data replication
    Great for read-intensive workloads, problematic for updates
    Replication protocols

Potentially Improved Performance

Proximity of data to its points of use
    Requires some support for fragmentation and replication
Parallelism in execution
    Inter-query parallelism
    Intra-query parallelism
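
The Reliability Through Transactions slide names commit protocols without picking one. As one well-known instance, the sketch below outlines two-phase commit: a coordinator collects votes, then tells every participant to commit only if all voted yes. The `Participant` class and function names are hypothetical, and logging, timeouts and coordinator failure are deliberately left out.

```python
class Participant:
    """Hypothetical site taking part in a distributed transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "initial"

    def prepare(self):
        """Phase 1: vote yes (ready) or no (abort)."""
        self.state = "ready" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Coordinator logic: commit everywhere or abort everywhere."""
    # Phase 1 (voting): build a list so that every participant is asked.
    votes = [p.prepare() for p in participants]
    decision = "commit" if all(votes) else "abort"
    # Phase 2 (decision): broadcast the same outcome to all participants.
    for p in participants:
        if decision == "commit":
            p.commit()
        else:
            p.abort()
    return decision
```

Because every site receives the same decision, the transaction's effects appear at all sites or at none, which is the failure atomicity property the slide refers to.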

Parallelism Requirements

Have as much of the data required by each application at the site where the application executes
    Full replication
How about updates?
    Mutual consistency
    Freshness of copies

System Expansion

Issue is database scaling
Emergence of microprocessor and workstation technologies
    Demise of Grosch's law
    Client-server model of computing
Data communication cost vs telecommunication cost

Distributed DBMS Issues

Distributed Database Design
    How to distribute the database
    Replicated & non-replicated database distribution
    A related problem in directory management
Query Processing
    Convert user transactions to data manipulation instructions
    Optimization problem
        min{cost = data transmission + local processing}
    General formulation is NP-hard
Concurrency Control
    Synchronization of concurrent accesses
    Consistency and isolation of transactions' effects
    Deadlock management
Reliability
    How to make the system resilient to failures
    Atomicity and durability
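
The query processing objective on the Distributed DBMS Issues slide, min{cost = data transmission + local processing}, can be illustrated with a toy comparison of two execution strategies. The unit costs, tuple counts and strategy names below are invented for the example; enumerating strategies like this is only practical for tiny cases, since the general formulation is NP-hard.

```python
def strategy_cost(tuples_shipped, tuples_processed,
                  transfer_cost=10.0, processing_cost=1.0):
    """cost = data transmission + local processing (arbitrary units)."""
    return tuples_shipped * transfer_cost + tuples_processed * processing_cost


def cheapest_strategy(strategies):
    """Return the name of the strategy with the minimum total cost."""
    return min(strategies, key=lambda name: strategy_cost(*strategies[name]))


# Hypothetical join of EMP (400 tuples) and ASG (1000 tuples) stored at
# different sites: (tuples shipped over the network, tuples processed locally).
strategies = {
    "ship EMP to the ASG site": (400, 1400),
    "ship ASG to the EMP site": (1000, 1400),
}
best = cheapest_strategy(strategies)
```

Both strategies do the same local work here, so the choice is driven entirely by how much data crosses the network.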

Relationship Between Issues

[Figure: interdependencies among directory management, query processing, distribution design, reliability, concurrency control and deadlock management.]

Related Issues

Operating System Support
    Operating system with proper support for database operations
    Dichotomy between general purpose processing requirements and database processing requirements
Open Systems and Interoperability
    Distributed Multidatabase Systems
    More probable scenario
    Parallel issues

Architecture

Defines the structure of the system
    components identified
    functions of each component defined
    interrelationships and interactions between components defined

ANSI/SPARC Architecture

[Figure: users work with external views defined by the external schema; these map to the conceptual view (conceptual schema), which maps to the internal view (internal schema).]
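
The three ANSI/SPARC levels can be loosely mapped onto familiar SQL objects. In the sketch below, which uses SQLite purely for convenience, a base table plays the role of the conceptual schema, an index stands for one internal-level storage decision, and a view stands for an external view. The table, index and view names and the rows are assumptions made for the example, and the mapping is an approximation rather than a definition of the architecture.

```python
import sqlite3


def build_three_level_example():
    """Create a tiny database with one object per ANSI/SPARC level."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Conceptual level: the logical definition of the data.
        CREATE TABLE EMP (ENO TEXT PRIMARY KEY, ENAME TEXT, TITLE TEXT);
        -- Internal level: a physical access-path choice, invisible to users.
        CREATE INDEX EMP_BY_TITLE ON EMP (TITLE);
        -- External level: what one group of users is allowed to see.
        CREATE VIEW ENGINEERS AS
            SELECT ENO, ENAME FROM EMP WHERE TITLE = 'Engineer';
        INSERT INTO EMP VALUES ('E1', 'Ann', 'Engineer'), ('E2', 'Bob', 'Analyst');
    """)
    return conn


conn = build_three_level_example()
engineers = conn.execute("SELECT ENAME FROM ENGINEERS ORDER BY ENO").fetchall()
```

Dropping or adding the index changes nothing in the view's result, which is the kind of data independence the layered schemas are meant to provide.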

Generic DBMS Architecture

[Figure]

DBMS Implementation Alternatives

[Figure]

Dimensions of the Problem

Distribution
    Whether the components of the system are located on the same machine or not
Heterogeneity
    Various levels (hardware, communications, operating system)
    DBMS important one
        data model, query language, transaction management algorithms
Autonomy
    Not well understood and most troublesome
    Various versions
        Design autonomy: Ability of a component DBMS to decide on issues related to its own design.
        Communication autonomy: Ability of a component DBMS to decide whether and how to communicate with other DBMSs.
        Execution autonomy: Ability of a component DBMS to execute local operations in any manner it wants to.

Client/Server Architecture

[Figure]

Advantages of Client-Server Architectures

More efficient division of labor
Horizontal and vertical scaling of resources
Better price/performance on client machines
Ability to use familiar tools on client machines
Client access to remote data (via standards)
Full DBMS functionality provided to client workstations
Overall better system price/performance

Database Server

[Figure]

Distributed Database Servers

[Figure]

Datalogical Distributed DBMS Architecture

[Figure: external schemas ES1, ES2, ..., ESn are defined over the global conceptual schema (GCS), which maps to the local conceptual schemas LCS1, LCS2, ..., LCSn, each with its local internal schema LIS1, LIS2, ..., LISn.]

Peer-to-Peer Component Architecture

[Figure: user requests enter and system responses leave through the user processor, which works with the data processor at each site.]
USER PROCESSOR
    User Interface Handler
    Semantic Data Controller
    Global Query Optimizer
    Global Execution Monitor
DATA PROCESSOR
    Local Query Processor
    Local Recovery Manager
    Runtime Support Processor
Schemas and stores shown: External Schema, Global Conceptual Schema, GD/D, Local Conceptual Schema, Local Internal Schema, System Log, Database

Datalogical Multi-DBMS Architecture

[Figure: global external schemas GES1, GES2, ..., GESn are defined over the global conceptual schema (GCS); each component DBMS keeps its own local external schemas (LES11 ... LES1n through LESn1 ... LESnm), local conceptual schema (LCS1, LCS2, ..., LCSn) and local internal schema (LIS1, LIS2, ..., LISn).]

MDBS Components & Execution

[Figure: a global user request enters the multi-DBMS layer, which sends global subrequests to DBMS1, DBMS2 and DBMS3; local user requests go directly to individual DBMSs.]

Mediator/Wrapper Architecture

[Figure]