
Chapter 6

Database Design

In this chapter, you will learn:

That successful database design must reflect the information system of which the database is a part
That successful information systems are subject to frequent evaluation and revision within a framework known as the Systems Development Life Cycle (SDLC)

In this chapter, you will learn:

That, within the information system, the most successful databases are subject to frequent evaluation and revision within a framework known as the Database Life Cycle (DBLC)
How to conduct evaluation and revision within the SDLC and DBLC frameworks
What database design strategies exist:

top-down vs. bottom-up design
centralized vs. decentralized design



Changing Data into Information

Data
Raw facts stored in databases
Seldom immediately useful to a decision maker
Need additional processing to become useful information

Changing Data into Information

Information
Required by the decision maker
Data must be put together or transformed to produce information that can create insight
Data processed and presented in a meaningful form

Tabulating data (cross-classification table)
Plotting data
Reporting data

Changing Data into Information


Data transformation: process that changes data into information

Simple tabulation
Detailed report
Statistical procedure
Graphic presentation

Provide the basis for decision making, which is a driving force
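
As an illustration of a simple tabulation, the sketch below turns raw facts into a cross-classification table. The sales records and the function name `cross_tabulate` are hypothetical and serve only to show the idea.

```python
from collections import Counter

# Hypothetical raw facts: one (region, product) pair per sale.
sales = [
    ("East", "Widget"), ("East", "Gadget"), ("West", "Widget"),
    ("East", "Widget"), ("West", "Gadget"), ("West", "Widget"),
]

def cross_tabulate(rows):
    """Count how often each (row key, column key) pair occurs."""
    counts = Counter(rows)
    row_keys = sorted({r for r, _ in rows})
    col_keys = sorted({c for _, c in rows})
    # One dictionary per row key, with a count for every column key.
    return {r: {c: counts[(r, c)] for c in col_keys} for r in row_keys}

table = cross_tabulate(sales)
for region, by_product in table.items():
    print(region, by_product)
```

The individual sale records say little on their own; the tabulated counts per region and product are the kind of summary a decision maker can act on.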


Changing Data into Information

Without sufficient, reliable data, one cannot obtain information
Without information, one cannot make good decisions

The Information System

Database: a carefully designed and constructed repository of facts
The fact repository is part of a larger system called an information system
The database is therefore part of an information system

The Information System

Information System
Provides data collection, storage, and retrieval
Also facilitates the transformation of data into information
Manages both data and information


The Information System

Information System

It has several components including:

People
Hardware
Software
Databases
Application programs
Procedures


The Information System (Cont.)

Building an information system has two processes

Systems analysis
Systems development


The Information System (Cont.)

Systems Analysis

Establishes the need for and the extent of an information system

Systems development
Process of creating an information system
Within systems development, applications transform data into information in the form of formal reports, tabulations, and graphic displays



The Information System (Cont.)

Performance of an information system depends on

Database design and implementation (called database development)
Application design and implementation
Administrative procedures

Both systems analysis and systems development require careful planning


The Information System (Cont.)

Database development
Process of database design and implementation
Primary objective of database design: to create complete, normalized, non-redundant, and fully integrated conceptual, logical, and physical database models

Implementation
Creating the storage structure
Loading data into the database
Providing for data management


The Information System (Cont.)

Detailed procedure for building an information system
Detailed procedure for building a database system
Kept as general as possible


Systems Development Life Cycle (SDLC)

Traces the procedure (history) of an information system
Important to the system designer
Provides the broad framework within which database design and application development can be mapped out and evaluated
In other words, database design and application development take place within the confines of an information system

Systems Development Life Cycle (SDLC)

Information system Database design Application development


Systems Development Life Cycle (SDLC)

The SDLC is divided into five phases, ordered by time or procedure

Planning
Analysis
Detailed systems design
Implementation
Maintenance

It is an iterative rather than a strictly sequential procedure (each phase may be revisited and refined)


Systems Development Life Cycle

Figure 6.2

Systems Development Life Cycle (SDLC)

SDLC's planning phase

General overview of the organization and its objectives
An initial assessment must be made
The assessment should answer:

Should the existing system be continued?
Should the existing system be modified?
Should the existing system be replaced?


Systems Development Life Cycle (SDLC)

SDLC's planning phase

Study the existing system
Explore alternative solutions
Feasibility study:

Technical (technical aspects of hardware and software requirements)
Financial (system cost, labor cost, ...)


Systems Development Life Cycle (SDLC)

SDLC's analysis phase

Examines in greater detail the problems defined in the previous (planning) phase
Addresses both individual (user) needs and organizational needs
Existing hardware and software systems are studied to determine the current system's performance, functionality, and potential problems, as well as future opportunities


Systems Development Life Cycle (SDLC)

SDLC's analysis phase

Requires end users and system designers to work together
Carefully study relationships and database models
Create the logical design using various tools
Define the logical systems
Provide functional descriptions
Define data transformations and documentation

Systems Development Life Cycle (SDLC)

SDLC's detailed systems design phase

Complete the design of the system processes

All technical specifications
Menus
Reports
Devices
Conversion from old system
Training Methodologies


Systems Development Life Cycle (SDLC)

SDLC's implementation phase

Hardware, DBMS software, and application programs are installed
Database design is implemented
Actual database is created
Tables, views, user authorizations


Systems Development Life Cycle (SDLC)

SDLC's implementation phase

Database contents

Customized user programs
Database interface programs
Conversion programs

Documentation
User training
Refining and evaluation


Systems Development Life Cycle (SDLC)

SDLC's maintenance phase

Once the system is operational, end users begin to request changes to it, which generate system maintenance activities

Corrective maintenance due to system errors
Adaptive maintenance due to changes in the business environment
Perfective maintenance to enhance the system


Database Lifecycle (DBLC)

Within the larger information system, the database is subject to a life cycle
The database life cycle contains six phases

Database initial study
Database design
Implementation and loading
Testing and evaluation
Operation
Maintenance and evolution

Database Lifecycle (DBLC)

Figure 6.3

Phase 1: Database Initial Study

Determine how and why the current system fails
Requires strong communication and interpersonal skills


Phase 1: Database Initial Study

Conducted by one person or by a team (project leader, senior systems analyst, ...), depending on the scale of the system


Phase 1: Database Initial Study


Analyze company situation

Operating environment
Organizational structure

Define problems and constraints
Define database system specification

Determine objectives
Determine scope
Determine boundaries


Initial Study Activities

Figure 6.4


Initial Study Activities: Analysis of company's situation

Analysis: to break up any whole into its parts so as to find out their nature, function, and so on
Company situation: describes the general conditions in which a company operates, its organizational structure, and its mission


Initial Study Activities: Analysis of company's situation

The designer must discover what the company's operational components are, how they function, and how they interact


Initial Study Activities: Analysis of company's situation

What is the organization's general operating environment?
What is its mission within that environment?
What is the organization's structure?


Initial Study Activities: Define problems and constraints

Study the existing system (functions, operations, system requirements, people)
Collect very broad problem descriptions from people at different levels
Distinguish the major problems
Study the possible constraints

Initial Study Activities: Define objectives

Based on the major problems, define objectives

What is the proposed system's initial objective?
Will the system interface with other existing or future systems in the company?
Will the system share the data with other systems or users?


Initial Study Activities: Define scope and boundary

Scope: defines the extent of the design according to operational requirements (whole departments, partial departments, individual department)
Scope helps define the required data structures, the type and number of entities, and the physical size of the database

Initial Study Activities: Define scope and boundary

Boundaries: limits external to the system
Imposed by hardware and software


Phase 2: Database Design

Focuses on the design of the database model that supports operations and objectives
Most critical DBLC phase
Makes sure the final product meets requirements


Phase 2: Database Design

Focus on the requirements of data characteristics
Two views of data

Business view of data as a source of information
Designer's view of the data structure (its access and the activities required to transform the data into information)


Two Views of Data

Figure 6.5

Phase 2: Database Design

Create conceptual design
DBMS software selection
Create logical design
Create physical design

Procedure flow in the database design



I. Conceptual Design

Data modeling creates an abstract data structure to represent real-world items
Must embody a clear understanding of the business and its functional areas
High level of abstraction
Hardware, software, and database model might not yet be identified (hardware and software independent)
All that is needed is there, and all that is there is needed


I. Conceptual Design

Four steps
Data analysis and requirements
Entity relationship modeling and normalization
Data model verification
Distributed database design


Data analysis and Requirements

The first step in conceptual design is to discover the data element characteristics
Appropriate data element characteristics are those that can be transformed into appropriate information


Data analysis and Requirements

Designer's efforts are focused on:

Information needs

What kind of information is needed?
What kind of information does the current system provide?
What kind of new information do we need to obtain?

Information users

Who will use the information?
How is the information to be used?
What is the end user's interface?


Data analysis and Requirements

Designer's efforts are focused on:

Information sources

Where is the information to be found?
How is the information to be extracted?

Information constitution

What data elements are needed to produce the information?
What are the attributes, and what relationships exist among the data?
What is the data volume, and how frequently are the data used?
What data transformations are to be used to generate the required information?

Data analysis and Requirements

Data sources
Developing and gathering end-user data views
Direct observation of the current system
Interfacing with the systems design group

Business rules

A brief and precise description of a policy, procedure, or principle within a specific organization's environment

Data analysis and Requirements

Business rules help the designer define entities, attributes, relationships, connectivities, cardinalities, and constraints
Made by policy makers or managers
Documented in the company's procedures, standards, and operations manuals


Data analysis and Requirements

Business rules yield several important benefits in the design of a new system
Help standardize the company's view of data
Constitute a communication tool between users and designers
Allow the designer to understand the nature, role, and scope of the data


Data analysis and Requirements

Business rules yield several important benefits in the design of a new system
Allow the designer to understand business processes
Allow the designer to develop appropriate relationship participation rules and foreign key constraints
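
A minimal sketch of how business rules might translate into participation rules and foreign key constraints, using Python's built-in sqlite3 module. The two rules assumed here ("each invoice is generated by exactly one customer" and "an invoice may be handled by a sales representative") and all table and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE customer  (cus_id INTEGER PRIMARY KEY, cus_name TEXT NOT NULL);
CREATE TABLE sales_rep (rep_id INTEGER PRIMARY KEY, rep_name TEXT NOT NULL);
CREATE TABLE invoice (
    inv_id INTEGER PRIMARY KEY,
    cus_id INTEGER NOT NULL REFERENCES customer(cus_id),  -- mandatory participation
    rep_id INTEGER REFERENCES sales_rep(rep_id)           -- optional participation
);
""")

def violates(sql):
    """Return True if the statement is rejected by a constraint."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False

conn.execute("INSERT INTO customer VALUES (1, 'Acme')")
# An invoice without a sales rep is allowed (optional participation).
no_rep_rejected = violates("INSERT INTO invoice VALUES (100, 1, NULL)")
# An invoice without a customer is rejected (mandatory participation).
no_customer_rejected = violates("INSERT INTO invoice VALUES (101, NULL, NULL)")
invoice_count = conn.execute("SELECT COUNT(*) FROM invoice").fetchone()[0]
```

Here the wording of each rule ("exactly one" versus "may") decides whether the foreign key column is declared NOT NULL.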


Entity Relationship Modeling and Normalization

The designer must communicate and enforce appropriate standards to be used in the documentation of the design
Standards: use of diagrams and symbols, documentation writing style, layout, and any other conventions


Entity Relationship Modeling and Normalization

Business rules usually define the nature of the relationship(s) among the entities
The process of defining business rules and developing the conceptual model using ERDs is summarized in Table 6.2


Entity Relationship Modeling and Normalization

Table 6.2

E-R Modeling is Iterative

Figure 6.8

Concept Design: Tools and Sources


Entity Relationship Modeling and Normalization

During E-R modeling, the designer must

Define entities, attributes, primary keys, and foreign keys
Decide whether to add new primary key attributes in order to satisfy end-user and/or processing requirements
Decide how to treat multivalued attributes
Decide whether to add derived attributes to satisfy processing requirements
Decide where to place foreign keys in 1:1 relationships

Entity Relationship Modeling and Normalization

During E-R modeling, the designer must

Avoid unnecessary ternary relationships
Draw the corresponding E-R diagram
Normalize the data model
Include all the data element definitions in the data dictionary
Decide on standard naming conventions


Data Model Verification

The E-R model is verified against the proposed system processes
Verification requires that the model be run through a series of tests against:
End-user views and required transactions
Access paths, security, concurrency control
Business-imposed data requirements and constraints


Data Model Verification

Revision of the original database design starts with a careful reevaluation of the entities
Followed by a detailed examination of the attributes
The process serves several important purposes

The emergence of attribute details may lead to a revision of the entities


Data Model Verification

Focus on attribute details can provide clues about the nature of relationships
To satisfy processing and/or end-user requirements, it may be necessary to create a new primary key to replace an existing primary key
Normalization helps guard against undesirable redundancies
Meet the end-user requirements


Data Model Verification

Module: an information system component that handles a specific function, such as inventory, payroll, orders, ...
Creating and using modules accomplishes several important ends
Modules are easy to add and delete when working in a team
Simplify design work
Prototype quickly
Reuse


Data Model Verification

Modules represent E-R model fragments
Each module or fragment must be verified against the complete E-R model
The verification process is detailed in Table 6.4


E-R Model Verification Process

Table 6.4

Data Model Verification

The verification process requires continuous verification of business transactions as well as system and user requirements
It is repeated for each of the system's modules


Iterative Process of Verification

Figure 6.10


Data Model Verification

Define major components as modules
Verification starts with the central (most important) entity
The central entity is the entity involved in the greatest number of the model's relationships
Identify the module (subsystem) to which the central entity belongs, and define the module's scope and boundary
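
The definition of the central entity can be sketched as a simple count over the model's relationships. The entity names and the list of relationships below are hypothetical; a real model would supply them from its E-R diagram.

```python
from collections import Counter

# Hypothetical relationships from an E-R model, as (entity, entity) pairs.
relationships = [
    ("CUSTOMER", "INVOICE"),
    ("INVOICE", "LINE"),
    ("PRODUCT", "LINE"),
    ("INVOICE", "EMPLOYEE"),
    ("VENDOR", "PRODUCT"),
]

def central_entity(rels):
    """Return the entity that takes part in the most relationships."""
    degree = Counter()
    for a, b in rels:
        degree[a] += 1
        if b != a:  # count a unary (self) relationship only once
            degree[b] += 1
    entity, count = degree.most_common(1)[0]
    return entity, count

entity, count = central_entity(relationships)
print(entity, count)
```

In this sample data INVOICE takes part in three relationships, so verification would start with the module that contains it.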

Data Model Verification

Within the central entity/module framework, we must

Ensure the module's cohesivity: the strength of the relationships found among the module's entities
Analyze relationships between modules in terms of module coupling: the extent to which modules are independent of one another. The lower the coupling, the better.
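
One rough way to get a feel for cohesivity and coupling is to count the relationships that stay inside a module and those that cross module boundaries. This is only an assumed proxy, not a standard metric, and the modules and relationships below are hypothetical.

```python
# Hypothetical assignment of entities to modules.
modules = {
    "SALES": {"CUSTOMER", "INVOICE", "LINE"},
    "INVENTORY": {"PRODUCT", "VENDOR"},
}

# Hypothetical relationships, as (entity, entity) pairs.
relationships = [
    ("CUSTOMER", "INVOICE"),
    ("INVOICE", "LINE"),
    ("PRODUCT", "LINE"),
    ("VENDOR", "PRODUCT"),
]

def cohesion_and_coupling(modules, rels):
    """Count relationships inside each module and across modules."""
    owner = {entity: name for name, members in modules.items() for entity in members}
    internal = {name: 0 for name in modules}
    cross = 0
    for a, b in rels:
        if owner[a] == owner[b]:
            internal[owner[a]] += 1  # contributes to the module's cohesivity
        else:
            cross += 1               # contributes to coupling between modules
    return internal, cross

internal, cross = cohesion_and_coupling(modules, relationships)
print(internal, cross)
```

Many internal relationships and few crossing ones suggest that the module boundaries were drawn in sensible places.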


Data Model Verification

Processes are often classified by

Frequency (daily, weekly, monthly, yearly, and exceptions)
Operational type (INSERT or ADD, UPDATE or CHANGE, DELETE, queries and reports, batches, maintenance and backups)

All identified processes must be verified against the E-R model
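
A minimal sketch of one such check: each process lists the entities it needs, and any entity missing from the E-R model is flagged. The process names, classifications, and entity names are hypothetical.

```python
# Entities present in the (hypothetical) E-R model.
er_entities = {"CUSTOMER", "INVOICE", "LINE", "PRODUCT"}

# Hypothetical processes, classified by frequency and operational type.
processes = [
    {"name": "Record sale", "frequency": "daily", "operation": "INSERT",
     "entities": {"INVOICE", "LINE"}},
    {"name": "Monthly sales report", "frequency": "monthly", "operation": "REPORT",
     "entities": {"INVOICE", "CUSTOMER"}},
    {"name": "Reorder stock", "frequency": "weekly", "operation": "UPDATE",
     "entities": {"PRODUCT", "VENDOR"}},
]

def unsupported(processes, entities):
    """Map each process to the entities it needs that the model lacks."""
    gaps = {}
    for process in processes:
        missing = process["entities"] - entities
        if missing:
            gaps[process["name"]] = sorted(missing)
    return gaps

gaps = unsupported(processes, er_entities)
print(gaps)
```

A non-empty result means either the model must be extended or the process definition revisited, which is the kind of revision the iterative verification is meant to surface.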


Distributed Database Design

Portions of the database may reside in different physical locations
Requires development of data distribution and allocation strategies


II. DBMS Software Selection

DBMS software selection is critical
Advantages and disadvantages of each DBMS should be carefully studied
Factors affecting the purchasing decision
Cost
DBMS features and tools
Underlying model
Portability
DBMS hardware requirements


III. Logical Design

Logical design follows the decision to use a specific database model (hierarchical, network, relational, or object-oriented)
Once the database model is selected, we can map the conceptual design onto a logical design
It is software-dependent but hardware-independent

III. Logical Design

Translates the conceptual design into the internal model
Maps objects in the model to constructs of a specific DBMS (DB2, SQL Server, Oracle, IMS, Informix, Access, MySQL, Ingres, ...)
For a relational DBMS, logical design includes the design of tables, indexes, views, transactions, access authorities, ...

III. Logical Design

Design components in relational DBMS

Tables
Indexes
Views
Transactions
Access authorities
Others
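
To make these components concrete, here is a small sketch using Python's built-in sqlite3 module. The product table, its columns, and the index and view names are hypothetical. SQLite has no user accounts, so access authorities (GRANT and similar statements) are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Table
CREATE TABLE product (
    prod_code     TEXT PRIMARY KEY,
    prod_descript TEXT NOT NULL,
    prod_price    REAL NOT NULL,
    prod_qoh      INTEGER NOT NULL
);
-- Index to speed up searches by description
CREATE INDEX idx_product_descript ON product (prod_descript);
-- View exposing only products that are running low
CREATE VIEW low_stock AS
    SELECT prod_code, prod_qoh FROM product WHERE prod_qoh < 10;
""")

# Transaction: both inserts are committed together, or neither is.
with conn:
    conn.execute("INSERT INTO product VALUES ('P1', 'Hammer', 9.95, 4)")
    conn.execute("INSERT INTO product VALUES ('P2', 'Saw', 14.50, 25)")

low = conn.execute("SELECT prod_code FROM low_stock").fetchall()
print(low)
```

The same conceptual entity would be expressed with different statements in another DBMS, which is why logical design is software-dependent.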


IV. Physical Design

Selection of data storage and data access characteristics
Storage characteristics are a function of the types of devices supported by the hardware, the type of data access methods supported by the system, and the DBMS selected
It affects the location of the data and the performance of the system

IV. Physical Design

Very technical and hardware dependent (experience required)
More important in older hierarchical and network models
Becomes more complex for distributed systems
Designers favor software that hides physical details

Physical Organization

Figure 6.12

Phase 3: Implementation and Loading

Creation of special storage-related constructs to house end-user tables
Data loaded into tables
Assign rights to use the database to the database administrator
Create the table spaces within the database
Create the tables within the table spaces
Assign access rights to the table spaces


Phase 3: Implementation and Loading

Other issues need to be addressed in this phase

Performance: an important factor, and an issue for monitoring and evaluation


Phase 3: Implementation and Loading

Other issues need to be addressed in this phase

Physical security
Password security
Access rights
Audit trails
Data encryption

Backup and recovery


Phase 3: Implementation and Loading

Other issues need to be addressed in this phase

Integrity: enforced through the proper use of primary and foreign key rules
Company standards
Concurrency controls
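
How primary and foreign key rules enforce integrity can be sketched with Python's built-in sqlite3 module. The department and employee tables are hypothetical, and the exact error messages depend on the DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE department (dept_code TEXT PRIMARY KEY, dept_name TEXT NOT NULL);
CREATE TABLE employee (
    emp_num   INTEGER PRIMARY KEY,
    dept_code TEXT NOT NULL REFERENCES department(dept_code)
);
INSERT INTO department VALUES ('ACCT', 'Accounting');
INSERT INTO employee VALUES (1, 'ACCT');
""")

def rejected(sql):
    """Return the error text if the DBMS rejects the statement, else None."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError as exc:
        return str(exc)
    return None

# Entity integrity: a duplicate primary key value is refused.
duplicate_key = rejected("INSERT INTO employee VALUES (1, 'ACCT')")
# Referential integrity: a row pointing at a nonexistent department is refused.
orphan_row = rejected("INSERT INTO employee VALUES (2, 'MKTG')")
# Referential integrity: a department that still has employees cannot be deleted.
dangling_delete = rejected("DELETE FROM department WHERE dept_code = 'ACCT'")
print(duplicate_key, orphan_row, dangling_delete, sep="\n")
```

All three statements are rejected by the DBMS itself, so the rules hold no matter which application issues the statement.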


Phase 3: Implementation and Loading

Concurrency controls

Concurrency control: allowing simultaneous access to a database while preserving data integrity
Example



Phase 4: Testing and Evaluation

Database is tested and fine-tuned for performance, integrity, concurrent access, and security constraints
Done in parallel with application programming


Phase 4: Testing and Evaluation

Actions taken if tests fail

For performance, the designer should consider fine-tuning based on reference manuals
Modification of physical design
Modification of logical design
Upgrade or change of DBMS software or hardware


Phase 5: Operation

Database considered operational
Starts process of system evaluation
Unforeseen problems may surface
Demand for change is constant


Phase 6: Maintenance and Evolution

Preventive maintenance (backup)
Corrective maintenance (recovery)
Adaptive maintenance (enhancing performance, adding entities and attributes)
Assignment of access permissions
Generation of database access statistics to improve the efficiency and usefulness of system audits and to monitor performance
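
As a small illustration of preventive maintenance, the sketch below copies a database with the backup method of Python's built-in sqlite3 module. The part table is hypothetical, and an in-memory target stands in for what would normally be a backup file.

```python
import sqlite3

# Source database with one hypothetical table.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE part (part_code TEXT PRIMARY KEY)")
source.execute("INSERT INTO part VALUES ('A100')")
source.commit()

# Target of the backup; a file path would normally be used here.
backup = sqlite3.connect(":memory:")
source.backup(backup)  # copy the whole source database into the target

# Corrective maintenance would read the data back from the copy.
restored = backup.execute("SELECT part_code FROM part").fetchall()
print(restored)
```

Other DBMSs provide their own backup and recovery utilities; the point is only that a backup must be taken before it can be used for recovery.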


Phase 6: Maintenance and Evolution

Periodic security audits based on system-generated statistics
Periodic system usage summaries


DB Design Strategy Notes

Two classical approaches to database design

Top-down design
1) Identify data sets (entities)
2) Define data elements (attributes)

Bottom-up design
1) Identify data elements (attributes)
2) Group them into data sets (entities)
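
The bottom-up steps can be sketched as a grouping operation: start from individual data elements and collect them under the identifier that determines each one. The attribute names and their identifiers below are hypothetical, and in practice finding those identifiers is the hard part of the analysis.

```python
from collections import defaultdict

# Step 1: hypothetical data elements, each paired with the identifier
# that determines it.
data_elements = [
    ("cus_name", "cus_id"),
    ("cus_phone", "cus_id"),
    ("inv_date", "inv_id"),
    ("inv_total", "inv_id"),
    ("prod_price", "prod_code"),
]

def group_into_entities(elements):
    """Step 2: group attributes that share an identifier into one data set."""
    entities = defaultdict(list)
    for attribute, identifier in elements:
        entities[identifier].append(attribute)
    return dict(entities)

entities = group_into_entities(data_elements)
for identifier, attributes in entities.items():
    print(identifier, attributes)
```

A top-down design would run the other way: name CUSTOMER, INVOICE, and PRODUCT first, and then list the attributes of each.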



Top-Down vs. Bottom-Up

Figure 6.14


Centralized vs. Decentralized Design

Depending on the scope and size of the system, design approaches can also be classified as centralized or decentralized

Centralized design
Typical of simple databases
Conducted by a single person or a small team



Centralized vs. Decentralized Design

Decentralized design
Larger numbers of entities and complex relations
Design may be divided into several modules
Spread across multiple sites
Developed by teams


Decentralized Design

Figure 6.16

Decentralized Design

The boundaries of the data subsets, and the interrelations among them, must be defined very precisely
All the independently designed modules are then integrated
The sub-conceptual models are aggregated into one large conceptual model, whose combination and transactions must be verified

Decentralized Design

The aggregation process requires the designer to create a single model in which various aggregation problems must be addressed

Synonyms (the same object under different names) and homonyms (the same name used for different objects)
Entities and entity subtypes
Conflicting object definitions (different data types or different domain definitions for the same attributes)
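
Some of these aggregation problems can be detected mechanically if each module keeps a data dictionary. The sketch below assumes a hypothetical dictionary format (attribute name mapped to a meaning and a data type) and made-up entries; it covers synonyms, homonyms, and conflicting data types, but not entity subtypes.

```python
from collections import defaultdict

# Hypothetical data dictionaries: module -> {attribute name: (meaning, data type)}
dictionaries = {
    "SALES": {
        "cus_id": ("customer identifier", "INTEGER"),
        "code": ("product code", "TEXT"),
    },
    "BILLING": {
        "client_no": ("customer identifier", "INTEGER"),
        "code": ("payment status code", "TEXT"),
    },
    "SHIPPING": {
        "cus_id": ("customer identifier", "TEXT"),
    },
}

def find_conflicts(dicts):
    """Report synonyms, homonyms, and data type conflicts across modules."""
    names_by_meaning = defaultdict(set)
    meanings_by_name = defaultdict(set)
    types_by_meaning = defaultdict(set)
    for attributes in dicts.values():
        for name, (meaning, dtype) in attributes.items():
            names_by_meaning[meaning].add(name)
            meanings_by_name[name].add(meaning)
            types_by_meaning[meaning].add(dtype)
    # Synonyms: one meaning, several names.
    synonyms = {m: sorted(n) for m, n in names_by_meaning.items() if len(n) > 1}
    # Homonyms: one name, several meanings.
    homonyms = {n: sorted(m) for n, m in meanings_by_name.items() if len(m) > 1}
    # Conflicting definitions: one meaning, several data types.
    type_conflicts = {m: sorted(t) for m, t in types_by_meaning.items() if len(t) > 1}
    return synonyms, homonyms, type_conflicts

synonyms, homonyms, type_conflicts = find_conflicts(dictionaries)
print(synonyms, homonyms, type_conflicts, sep="\n")
```

The check depends on the modules describing meanings in a comparable way, which is one reason the design standards and data dictionary conventions need to be agreed on before the modules are designed separately.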