Вы находитесь на странице: 1из 61

Chapter 5

Organizing Data and Information


ITEC 1010 Information and Organizations

Data
Data A necessity for almost any enterprise to carry out its
business. Consists of raw facts, and when organized may be transformed into information

Database A collection of data organized to meet users needs Database management system (DBMS) A group of programs that manipulate the database and

provide an interface between the database and the user of the database or other application programs
Information and Organizations

ITEC 1010

DBMS Discussion (1)


A collection of programs that enables you to store, modify, and extract information from a database. There are many different types of DBMSs, ranging from small systems that run on personal computers to huge systems that run on mainframes. The following are examples of database applications: computerized library systems automated teller machines flight reservation systems computerized parts inventory systems From a technical standpoint, DBMSs can differ widely. The terms
ITEC 1010 Information and Organizations

DBMS Discussion (2)


relational, network, flat, and hierarchical all refer to the way a DBMS organizes information internally. The internal organization can affect how quickly and flexibly you can extract information. Requests for information from a database are made in the form of a query, which is a stylized question. For example, the query SELECT ALL WHERE NAME = "SMITH" AND AGE > 35 requests all records in which the NAME field is SMITH and the AGE
ITEC 1010 Information and Organizations

DBMS Discussion (3)


field is greater than 35. The set of rules for constructing queries is known as a query language. Different DBMSs support different query languages, although there is a semi-standardized query language called SQL (structured query language). Sophisticated languages for managing database systems are called fourth-generation languages, or 4GLs for short. The information from a database can be presented in a variety of formats. Most DBMSs include a report writer program that enables you to output data in the form of a report. Many DBMSs also include
ITEC 1010 Information and Organizations

DBMS Discussion (4)


a graphics component that enables you to output information in the form of graphs and charts.

ITEC 1010

Information and Organizations

Hierarchy of Data

Schematic
ITEC 1010 Information and Organizations

Hierarchy of data

Example
Personel file

Database Database

Department file Payroll file 005-10-6321 Johns Francine 10-7-65 549-77-1001 Buckley Bill 2-17-79 098-40-1370 Fiske Steven 1-5-85 098-40-1370 Fiske Steven 1-5-85 598

(Project database)

Files Files Records Records Fields Fields Characters Characters (bytes) (bytes)
ITEC 1010

(Personnel file)

Fiske

(Last name field)

(Record containing SSN, last name, first name, date of hire)

1000100

(Letter F in ASCII)

Information and Organizations

Terminology
Database
A collection of integrated and related files

File
A collection of related records

Record
A collection of related fields

Field
A group of characters

Character
Basic building block of information, represented by a byte
ITEC 1010 Information and Organizations

Data Entities, Attributes, and Keys


Entity
A generalized class of people, places, or things (objects) for
which data are collected, stored, and maintained E.g., Customer, Employee

Attribute
A characteristic of an entity; something the entity is identified by E.g., Customer name, Employee name

Keys
A field or set of fields in a record that is used to identify the
ITEC 1010

record E.g, A field or set of fields that uniquely identifies the record
Information and Organizations

Keys and Attributes


Employee # 005-10-6321 549-77-1001 098-40-1370 Last name Johns Buckley Fiske First name Francine Bill Steven Hire date 10-7-65 2-17-79 1-5-85 Dept. # 257 650 598

Key field
ITEC 1010

Attributes (fields)
Information and Organizations

Entities (records)

The Traditional Approach


The traditional approach Separate files are created and stored for each
application program

Schematic
ITEC 1010 Information and Organizations

Data

Files Payroll

Application programs Payroll programs

Users Reports

Invoicing

Invoicing programs

Reports

Inventory control

Inventory control programs Management inquiries programs

Reports

Management inquiries
ITEC 1010

Reports

Information and Organizations

Drawbacks
Data redundancy Duplication of data in separate files Lack of data integrity The degree to which the data in any one file is accurate Program-data dependence A situation in which program and data organized for
one application are incompatible with programs and data organized differently for another application
ITEC 1010 Information and Organizations

Database Approach
The database approach

A pool of related data is shared by multiple


application programs Rather than having separate data files, each application uses a collection of data that is either joined or related in the database
Schematic
ITEC 1010 Information and Organizations

Payroll program Payroll data Inventory data Invoicing Data Other data Inventory program

Reports

Database management system

Reports

Invoicing program

Reports

Other programs Database


ITEC 1010

Reports

Interface

Applications programs

Users

Information and Organizations

Advantages

ITEC 1010

Improved strategic use of corporate data Reduced data redundancy Improved data integrity Easier modification and updating Data and program independence Better access to data and information Standardization of data access A framework for program development Better overall protection of the data Shared data and information resources
Information and Organizations

Disadvantages
Relatively high cost of purchasing and
operating a DBMS in a mainframe operating environment Increased cost of specialized staff Increased vulnerability

ITEC 1010

Information and Organizations

Data Modeling and Database Models (1)


Planned data redundancy

A way of organizing data in which the logical

database design is altered so that certain data entities are combined Summary totals are carried in the data records rather than calculated from elemental data Some data attributes are repeated in more than one data entity to improve database performance
ITEC 1010 Information and Organizations

Data Modeling and Database Models (2)


Data model A map or diagram of entities and their relationships Enterprise data modeling Data modeling done at the level of the entire
organization

Entity-relationship (ER) diagrams A data model that uses basic graphical symbols to show
the organization of and relationships between data
ITEC 1010 Information and Organizations

Example: Entity Relationship (ER) Diagram for a Customer Ordering Database

Schematic
ITEC 1010 Information and Organizations

Last name

Attributes

Colour

Entities

First name

Customer

Order

Product

Name

1:N one-to-many relationship Identification number ITEC 1010 Information and Organizations Identification number

Hierarchical Database Model


Hierarchical database model

A data model in which data are organized in a


top-down, or inverted tree structure

Schematic
ITEC 1010 Information and Organizations

Project 1

Department A

Department B

Department C

Employee 1

Employee 2

Employee 3

Employee 4

Employee 5

Employee 6

ITEC 1010

Information and Organizations

Network Data Model


Network data model
An expansion of the hierarchical database model
with an owner-member relationship in which a member may have many owners
Project 1 Project 2

Department A
ITEC 1010

Department B
Information and Organizations

Department C

Relational Data Model


Relational data model

All data elements are placed in two-dimensional


tables, called relations, that are the logical equivalent of files

Schematic
ITEC 1010 Information and Organizations

Data Table 1: Project Table


Project Number 155 498 226 Description Payroll Widgets Sales manager Dept. Number 257 632 598

Data Table 2: Department Table


Dept. Number 257 632 598 Dept. Name Accounting Manufacturing Marketing Manager SSN 421-55-99993 765-00-3192 098-40-1370

Data Table 3: Manager Table


SSN 005-10-6321 549-77-1001 098-40-1370 Last Name Johns Buckley Fiske First Name Francine Bill Steven Hire Date 10-7-65 2-17-79 1-5-85 Dept. Number 257 650 598

ITEC 1010

Information and Organizations

Relational Database Terminology


Selecting Data manipulation that eliminates rows according to
certain criteria

Projecting Data manipulation that eliminates columns in a table Joining Data manipulation that combines two or more tables Linked Relating tables in a relational database together
ITEC 1010 Information and Organizations

Linking Data Tables to Answer an Inquiry

Schematic
ITEC 1010 Information and Organizations

Project Number 155 498 226

Description Payroll Widgets Sales manager

Dept. Number 257 632 598

Dept. Number 257 632 598

Dept. Name Accounting Manufacturing Marketing

Manager SSN 421-55-99993 765-00-3192 098-40-1370

SSN 005-10-6321 549-77-1001 098-40-1370

Last Name Johns Buckley Fiske

First Name Francine Bill Steven

Hire Date 10-7-65 2-17-79 1-5-85

Dept. Number 257 650 598

ITEC 1010

Information and Organizations

Building and Modifying a Relational Database


Using Microsoft Access

Screen snap
ITEC 1010 Information and Organizations

ITEC 1010

Information and Organizations

Schemas and Subschemas


Schema

A description of the entire database Subschema A file that contains a description of a subset of
the database and identifies which users can perform modifications on the data items in that subset
Schematic
ITEC 1010 Information and Organizations

DBMS

Schema

Subschema A

Subschema B

Subschema C

User 1

User 2

User 3

User 4

User 5

ITEC 1010

Information and Organizations

Schema Discussion
Pronounced skee-ma, the structure of a database system, described in a formal language supported by the database management system (DBMS). In a relational database, the schema defines the tables, the fields in each table, and the relationships between fields and tables. Schemas are generally stored in a data dictionary. Although a schema is defined in text database language, the term is often used to refer to a graphical depiction of the database structure.
ITEC 1010 Information and Organizations

Data Definition Language


Data Definition Language (DDL)

A collection of instructions and commands used


to define and describe data and data relationships in a specific database

Schematic
ITEC 1010 Information and Organizations

SCHEMA DESCRIPTION SCHEMA NAME IS XXXX AUTHOR XXXX DATE XXXX FILE DESCRIPTION FILE NAME IS XXXX ASSIGN XXXX FILE NAME IS XXXX ASSIGN XXXX AREA DESCRIPTION AREA NAME IS XXXX RECORD DESCRIPTION RECORD NAME ISXXXX RECORD ID IS XXXX LOCATION MODE ISXXXX WITHIN XXX AREA FROM XXXX THRU XXXX SET DESCRIPTION SET NAME IS XXXX ORDER IS XXXX MODE IS XXXX MEMBER IS XXXX . . .

ITEC 1010

Information and Organizations

Data Dictionary
Data Dictionary

A detailed description of all data used in the


database

Schematic
ITEC 1010 Information and Organizations

NORTHWESTERN MANUFACTURING PREPARED BY: DATE: APPROVED BY: VERSION: PAGE: DATA ELEMENT NAME: DESCRIPTION: OTHER NAMES: VALUE RANGE: DATA TYPE: POSITIONS: D. BORDWELL 04 AUGUST J. EDWARDS 3.1 1 OF 1

DATE:

13 OCTOBER

PARTNO INVENTORY PART NUMER PTNO 100 TO 5000 NUMERIC 4 POSITIONS OR COLUMNS

ITEC 1010

Information and Organizations

Data Dictionary Features


Provide a standard definition of terms and data
elements Assist programmers in designing and writing programs Simplify database modification Reduce data redundancy Increase data reliability Faster program development Easier modification of data and information
ITEC 1010 Information and Organizations

Logical and Physical Access Paths


Logical access path (LAP)

Application requires information from the


DBMS

Physical access path (PAP)

DBMS accesses a storage device to retrieve


data
Schematic
ITEC 1010 Information and Organizations

Data on storage devices

Physical access path (PAP)

DBMS

Logical access path (LAP)

Management inquiries

Other software

Application programs

ITEC 1010

Information and Organizations

Manipulating Data
Concurrency control A method of dealing with a situation in which two or
more people need to access the same record in a database at the same time

Data manipulation language (DML) The commands that are used to manipulate the data in a
database

Structured query language (SQL) A standardized data manipulation language


ITEC 1010 Information and Organizations

Structured Query Language (SQL)


Invented at IBMs Almaden Research Centre (San Jose, CA) in the 1970s E.g.,
SELECT * FROM EMPLOYEE WHERE JOB_CLASSIFICATION = C2 Select all (*) columns from the EMPLOYEE table in which the JOB_CLASSIFICATION field is equal to C2
ITEC 1010 Information and Organizations

SQL Discussion (1)


Abbreviation of structured query language, and pronounced either see-kwell or as separate letters. SQL is a standardized query language for requesting information from a database. The original version called SEQUEL (structured English query language) was designed by an IBM research center in 1974 and 1975. SQL was first introduced as a commercial database system in 1979 by Oracle Corporation. Historically, SQL has been the favorite query language for database management systems running on minicomputers and mainframes.
ITEC 1010 Information and Organizations

SQL Discussion (2)


Increasingly, however, SQL is being supported by PC database systems because it supports distributed databases (databases that are spread out over several computer systems). This enables several users on a local-area network to access the same database simultaneously. Although there are different dialects of SQL, it is nevertheless the closest thing to a standard query language that currently exists. In 1986, ANSI approved a rudimentary version of SQL as the official standard, but most versions of SQL since then have included many
ITEC 1010 Information and Organizations

SQL Discussion (3)


extensions to the ANSI standard. In 1991, ANSI updated the standard. The new standard is known as SAG SQL.

ITEC 1010

Information and Organizations

Database Output

Screen snap
ITEC 1010 Information and Organizations

ITEC 1010

Information and Organizations

Popular Database Management Systems for End Users


Microsoft Access 98 Lotus Approach 98 Inprise (formerly Borland) dBASE DBMS Selection Criteria

ITEC 1010

Database size Number of concurrent users Performance Integration Features The vendor Cost
Information and Organizations

Distributed Databases
Distributed database

A database in which the actual data may be


spread across several smaller databases connected via telecommunications devices

Pretty picture
ITEC 1010 Information and Organizations

ITEC 1010

Information and Organizations

Data Warehouse
Data warehouse A relational database management system designed

Data mart A subset of a data warehouse for small and medium-

specifically to support management decision making Current evolution of Decision Support Systems (DSSs)

size businesses or departments within larger companies


Schematic
ITEC 1010 Information and Organizations

Relational databases Data extraction process

Hierarchical databases

Network databases

Data cleanup process

Flat files

Spreadsheets End user access

Data wharehouse

Query and analysis tools

ITEC 1010

Information and Organizations

Designing a Customer Data Warehouse


Sharply define your goals and objectives before
you build the warehouse Choose the software that best fits your goals Determine who/what should be in the database Develop a plan Measure results

ITEC 1010

Information and Organizations

Data Mining Applications


Data mining The automated discovery of patterns and relationships
in a data warehouse

Data mining applications



ITEC 1010

Market segmentation Customer queries Fraud detection Direct marketing Market basket analysis Trend analysis
Information and Organizations

On-Line Analytical Processing (OLAP)


On-line analytic processing (OLAP)
Access to multidimensional databases providing
managerially useful display techniques Now used to store and deliver data warehouse information

Data warehouse and OLAP


Provides top-down, query-driven analysis

Data mining
Provides bottom-up, discovery-driven analysis
ITEC 1010 Information and Organizations

Open Database Connectivity (ODBC)


Open database connectivity (ODBC)
A set of standards that ensures software written to
comply with these standards can be used with any ODBC-compliant database

Schematic
ITEC 1010 Information and Organizations

dBASE

ODBC Import

ACCESS database

ODBC Export

Paradox database

ODBC Link

Lotus 1-2-3 spreadsheet

ITEC 1010

Information and Organizations

Object-Relational Database Management Systems (ORDBMS)


Object-relational database management system (ORDBMS)
A DBMS capable of manipulating audio, video, and graphical data. Hypertext Users can search and manipulate alphanumeric data in an
unstructured way

Hypermedia
Allows businesses to search and manipulate multimedia forms of
data

Spatial data technology


Use of an object-relational database to store and access data
according to the location it describes and to permit spatial queries and analysis
Information and Organizations

ITEC 1010

End of Chapter 5
Chapter 6
ITEC 1010 Information and Organizations