
DB2 - IBM’s Relational DBMS

CTS-PAC Version 1.1 1


Session 1



Topics to be covered in this session
• Introduction to databases - covers their advantages and
the types of databases (time : 30 min)
• Relational database concepts - covers terminology,
the ER model, normalisation, an introduction to
database objects, Codd’s relational rules, and an
introduction to SQL.



Introduction to Databases
What is Data ?
‘A representation of facts or instruction in a form
suitable for communication’ - IBM Dictionary

What is a Database ?
‘Is a repository for stored data’ - C.J.Date



contd...
What is a database system ?
An integrated and shared repository for stored data or
collection of stored operational data used by
application systems of some particular enterprise.
Or
‘Nothing more than a computer-based record keeping
system’.



Advantages of DBMS over File Management Systems
• Reduced data redundancy
• Multiple views
• Shared data
• Data independence (logical/physical)
• Data dictionary
• Search versatility
• Cost effective
• Security & Control
• Recovery restart & Backup
• Concurrency
TYPES OF DATABASES (or Models)

• Hierarchical Model
• Network Model
• Relational Model
• Object-Oriented Model



contd...
• HIERARCHICAL
• Top down structure resembling an upside-down
tree
• Parent child relationship
• First logical database model
• Available in legacy systems on Mainframe
computers
• Example - IMS



contd...

• NETWORK
• Does not distinguish between parent and child. Any
record type can be associated with any number of
arbitrary record types
• Enhanced to overcome limitations of the Hierarchical
model, but in reality there is minimal difference due
to frequent enhancements


contd...

• RELATIONAL
• Data is stored in tables, in the form of rows and columns.
• Examples - DB2, Oracle, Sybase, Ingres etc

• OBJECT-ORIENTED MODEL
• Data attributes and methods that operate on those
attributes are encapsulated in structures called
objects


RELATIONAL DB CONCEPTS



Relational Properties
• Why Relational ? - Relation is a mathematical
term for a table - Hence Relational database ‘is
perceived’ by the users as a set of tables.
• All data values are atomic.
• Entries in columns are from the same domain
• Sequence of rows (T-B) is insignificant
• Each row is unique
• Sequence of columns (L-R) is insignificant



Relational Concepts (or Terminology)
• Relation : A table or File
• Tuple : Row contains an entry for each attribute
• Attributes : Columns or the characteristics that
define the entity
• Domain : A range of values (or pool)
• Entity : Some object about which we wish to store
information
• Null : Represents an unknown value
• Atomic : Smallest unit of data; the individual data
value



contd...
• Candidate key : An attribute (or a set of
attributes) that can uniquely identify each
row (tuple) in the relation (table). The term applies
mainly during design; once a choice is made, the primary
and alternate keys take its place.
• Primary key : The candidate key that is chosen
to uniquely identify each row.
• Alternate key : The remaining candidate keys that
were not chosen as primary key
• Foreign key : An attribute of one relation that
refers to the primary key of another relation.
Entity Relationship Model

• E-R model is a logical representation of data for a
business area
• Represented as entities, relationships between entities
and attributes of both relationships and entities
• E-R models are outputs of the analysis phase, i.e. they are
conceptual data models expressed in the form of an
E-R diagram


Normalisation (1NF - 5NF)

• It is done to bring the design of a database to a
standardized form
• 1NF : All entities must have a unique identifier, or key,
that can be composed of one or more attributes. All
attributes must be atomic and non repeating.
• 2NF : Partial functional dependencies removed - all
attributes that are not a part of the key must depend on
the entire key for that entity.


contd...

• 3NF : Transitive dependencies removed - attributes that
are not a part of the key must not depend on any non-key
attribute.
• 4NF : Multi valued dependencies removed
• 5NF : Remaining anomalies removed


Types of Integrity

• Entity Integrity : Rule states that no column that is
part of a primary key can have a null value
• Referential Integrity : Rule states that every foreign
key in the first table must either match a primary key
value in the second table or must be wholly null
• Domain Integrity : Integrity of information allowed in
a column
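
The first two integrity rules can be tried out with a small sketch. It is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for DB2, the table and column names (customer, orders, cust_no, order_no) are assumptions, and the foreign key is placed in the orders table.

```python
import sqlite3

# SQLite (in memory) stands in for DB2 here; this is an assumption for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.execute(
    "CREATE TABLE customer (cust_no INT NOT NULL PRIMARY KEY, cust_name TEXT)"
)
conn.execute(
    "CREATE TABLE orders ("
    " order_no INT NOT NULL PRIMARY KEY,"
    " cust_no INT REFERENCES customer (cust_no))"
)
conn.execute("INSERT INTO customer VALUES (100, 'Asha')")


def try_sql(sql):
    """Run one statement and report whether an integrity rule rejected it."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


results = {
    # Entity integrity: a primary key column cannot be null.
    "null primary key": try_sql("INSERT INTO orders VALUES (NULL, 100)"),
    # Referential integrity: the foreign key matches a primary key value ...
    "matching foreign key": try_sql("INSERT INTO orders VALUES (1, 100)"),
    # ... or is wholly null ...
    "null foreign key": try_sql("INSERT INTO orders VALUES (2, NULL)"),
    # ... but it cannot point at a customer that does not exist.
    "dangling foreign key": try_sql("INSERT INTO orders VALUES (3, 999)"),
}
for case, outcome in results.items():
    print(case, "->", outcome)
```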



Example of a Relational Structure

CUSTOMER Places ORDERS


ORDERS Has PRODUCTS



The above relations can be interpreted as
follows :
• A Customer can place any number of orders (one-to-
many)
• Each order relates to only one customer (many-to-one)
• One order can contain many products, and one product
can be a part of many orders (many-to-many)


contd...

• In the above example Customer, Order & Product are
called ENTITIES.
• An Entity may transform into table(s).
• The unique identity for information stored in an
ENTITY is called a PRIMARY KEY. Eg. Customer-No
uniquely identifies each customer


contd...

A table essentially consists of
• Attributes, which define the characteristics of the
table
• Primary key, which uniquely identifies each row of
data stored in a table
• Secondary & Foreign Keys/indexes


contd...
Table Definition :
Table ‘Customer’ -
Attributes - Customer-No, Cust-name,
Cust-location, Cust-Id, Order-no...
Primary Key - Customer-No
Secondary Key - Cust-Id
Foreign-Key - Order-no


contd...

• The Relationships transform into Foreign Keys. For example,
Customer is related to Orders through ‘Order-No’, which is
the Foreign Key in Customer and the Primary Key in Orders.
So the relationship ‘Places’ is expressed through the
Order-No.
• As per entity integrity, the Primary Key, Order-No,
of the table ‘Orders’ can never be null, while it
can be null in the table ‘Customer’, where it is a Foreign Key.


contd...

• Tables exist in Tablespaces. A tablespace can contain
one or more tables
• Apart from the Primary Key, a table can have many
secondary keys/indexes, which exist in Indexspaces.
• These tablespaces and indexspaces together exist in a
Database


contd...

• To do transformations as described above we need a
tool that provides a way to create tables,
manipulate the data present in them, and create
relationships, indexes, tablespaces, indexspaces and so on.
DB2 provides SQL, which performs these functions.
The next part deals briefly with SQL and its functions.
A detailed explanation will be taken up later.


CODD’S RELATIONAL RULES

• 1. All information in a relational database is
represented explicitly at the logical level and in exactly
one way - by values in tables

• 2. Each and every datum (atomic value) in a relational
database is guaranteed to be logically accessible by
resorting to a combination of table name, primary key
value, and column name


contd...

• 3. Null values are supported for representing missing
information in a systematic way irrespective of the
datatype.

• 4. The database description is represented at the logical
level in the same way as ordinary data, so that
authorised users can apply the same relational language
to its interrogation as they apply to the regular data.


contd...

• 5. A relational system may support several languages
and various modes of terminal use. However there
must be one language whose statements can express all
of the following items: (1) data definitions (2) view
definitions (3) data manipulation (interactive and by
program) (4) integrity constraints (5) authorisation (6)
transaction boundaries (begin, commit, rollback)


contd...

• 6. All views that are theoretically updatable are also
updatable by the system

• 7. The capability of handling a base relation or a
derived relation (view) as a single operand applies not
only to the retrieval of data but also to the
insertion, update and deletion of data


contd...

• 8. Application programs and terminal activities remain
logically unimpaired whenever any changes are made
in either storage representations or access methods

• 9. Application programs and terminal activities remain
logically unimpaired when information-preserving
changes of any kind that theoretically permit
unimpairment are made to the base tables.


contd...

• 10. Integrity constraints specific to a particular
relational database must be definable in the relational
data sublanguage and storable in the catalog, not in the
application programs.

• 11. The data manipulation sublanguage of a relational
DBMS must enable application programs and inquiries
to remain logically the same whether and whenever
data are physically centralized or distributed.


contd...

• 12. If a relational system has a low-level
(single-record-at-a-time) language, that low level cannot be
used to subvert or bypass the integrity rules and
constraints expressed in the higher-level relational
language (multiple-records-at-a-time)


An introduction to SQL
SQL or Structured Query Language is
• A powerful language that performs the functions of
data manipulation (DML), data definition (DDL) and
data control or data authorization (DAL/DCL).
• A non-procedural language - it acts on a set of data,
and the user does not need to know how the data is
retrieved. A single SQL statement can do the work of
more than one procedure.
• Very flexible


contd...
SQL - Features
• What you want and not how to get it
• Unlike COBOL or 4GLs, SQL is coded without
data-navigational instructions. The optimal access
paths are determined by the DBMS. This is
advantageous because the database knows better
than the user how it has stored the data.
• Set level processing & multiple row processing


The following are the Operations that can be
performed by a SQL on the database tables :
• Select
• Project
• Union
• Intersection
• Difference
• Cartesian Product
• Join
• Divide



Session 2



Topics to be covered in this session

• SQL - this is dealt with here because the creation,
manipulation and use of all other data objects involve SQL.
• DB2 objects - Database, Tablespaces & Indexspaces -
creation & use, and other terminologies associated with
databases.


Topics dealt with, in SQL

• Definition and Types
• Usage of SQL with examples, scalar and column
functions
• Subqueries and Multiple queries, DMLs
• Static & Dynamic SQLs


Structured Query Language - SQL

• Standard query language for RDBMS
• Non procedural lang : Programmer specifies what data
is needed but not how to retrieve it
• Used also to define data structures, control access to
the data and delete occurrences of data
• Uses set-level processing


SQL - Types - based on the functionality

• Data Definition Language (DDL) - CREATE, ALTER,
DROP
• Data Manipulation Language (DML) - DELETE,
INSERT, SELECT, UPDATE
• Data Control Language (DCL) - GRANT, REVOKE


SQL - Types

• Production SQL or Ad-Hoc SQL
• Embedded SQL or Stand-alone SQL
• Static or Dynamic SQL


SQL - Selection & Projection

• Selection retrieves a specified subset of rows from a table
• Projection retrieves a specified subset of
columns (but all rows) from the table
Eg : SELECT Cust_no, Cust_name FROM Customer;
A WHERE clause defines the predicates for the SQL
operation, for example WHERE Cust_no = 1000.
A WHERE clause can combine multiple conditions
using AND & OR.
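
A minimal sketch of the two operations follows. It runs against SQLite through Python's sqlite3 module rather than DB2, and the sample rows and the cust_location values are made up for illustration.

```python
import sqlite3

# SQLite stands in for DB2; table contents are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (cust_no INTEGER, cust_name TEXT, cust_location TEXT)"
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(1000, "Asha", "Chennai"), (1500, "Ravi", "Pune"), (2500, "Mina", "Chennai")],
)

# Projection: a subset of columns, all rows.
projection = conn.execute(
    "SELECT cust_no, cust_name FROM customer ORDER BY cust_no"
).fetchall()

# Selection: all columns, only the rows that satisfy the predicate.
selection = conn.execute(
    "SELECT * FROM customer WHERE cust_location = 'Chennai' ORDER BY cust_no"
).fetchall()

# Both at once, with two conditions combined by AND.
both = conn.execute(
    "SELECT cust_name FROM customer "
    "WHERE cust_location = 'Chennai' AND cust_no > 2000"
).fetchall()

print(projection)
print(selection)
print(both)
```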



Select distinct, select in range :
SELECT Cust_no, Cust_name, Cust_addr
FROM Customer
WHERE Cust_no BETWEEN 10000 AND 20000;

SELECT Cust_no, Cust_name, Cust_addr
FROM Customer
WHERE Cust_no NOT BETWEEN 1000 AND 2000;

SELECT Cust_no, Cust_name, Cust_addr
FROM Customer
WHERE Cust_no IN (1000, 2000);


contd...

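SELECT Cust_no, Cust_name, Cust_addr
FROM Customer
WHERE Cust_id LIKE '425%';
(use NOT LIKE to select the rows that do not match)

Note :- ‘_’ stands for a single char ; ‘%’ for a string of zero or more chars
ESCAPE ‘\’ - defines ‘\’ as the escape char; when it precedes ‘_’ or ‘%’
in the pattern, that character is matched literally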
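
The wildcard and escape behaviour can be checked with a short sketch. SQLite (via Python's sqlite3) is used as a stand-in for DB2, and the cust_id values are invented so that each pattern has something to match.

```python
import sqlite3

# SQLite stands in for DB2; the cust_id values are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cust_id TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?)",
    [("425",), ("4251",), ("425A",), ("50%OFF",), ("50XOFF",)],
)


def ids(where):
    """Return the cust_id values that satisfy the given WHERE condition."""
    sql = "SELECT cust_id FROM customer WHERE " + where + " ORDER BY cust_id"
    return [row[0] for row in conn.execute(sql)]


percent = ids("cust_id LIKE '425%'")      # '%' matches zero or more characters
underscore = ids("cust_id LIKE '425_'")   # '_' matches exactly one character
not_like = ids("cust_id NOT LIKE '425%'")
# With ESCAPE '\', the sequence \% means a literal percent sign.
escaped = ids(r"cust_id LIKE '50\%%' ESCAPE '\'")

print(percent, underscore, not_like, escaped)
```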



contd...

NULL : To check for null the syntax is ‘IS NULL’ or ‘IS
NOT NULL’.

SELECT Cust_no, Cust_name, Order_no
FROM Customer
WHERE Order_no IS NULL;

However, a null Order_no compared using the ordinary operators
(=, <>, <, > and so on) never evaluates to true, so such
rows are not returned by a query that uses those comparisons.
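
A small sketch makes the point concrete. It uses SQLite through Python's sqlite3 as a stand-in for DB2; the three sample rows are assumptions chosen so that one customer has a null order number.

```python
import sqlite3

# SQLite stands in for DB2; sample rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cust_no INTEGER, order_no INTEGER)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [(1, 7), (2, None), (3, 9)],  # customer 2 has a null order_no
)


def cust_nos(where):
    """Return the customer numbers that satisfy the given WHERE condition."""
    sql = "SELECT cust_no FROM customer WHERE " + where + " ORDER BY cust_no"
    return [row[0] for row in conn.execute(sql)]


is_null = cust_nos("order_no IS NULL")
is_not_null = cust_nos("order_no IS NOT NULL")
equals_null = cust_nos("order_no = NULL")  # never true, so no rows come back
not_seven = cust_nos("order_no <> 7")      # the null row is not returned either

print(is_null, is_not_null, equals_null, not_seven)
```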



Order by and Group by clauses :

• Order by sorts the retrieved rows (those that satisfy
the WHERE clause, if any) in the specified order
• Group by operator causes the table represented by the
FROM clause to be rearranged into groups, such that
within one group all rows have the same value for the
Group by column (not physically in the database). The
Select clause is applied to the grouped data and not to
the original table.
Here ‘HAVING’ is used to eliminate groups, just as
WHERE is used to eliminate rows.


Example :-

SELECT Order_No, SUM(No_Prodts)
FROM ORDERS
GROUP BY Order_No
HAVING AVG(No_Prodts) < 10
ORDER BY Order_No ;
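
The query above can be run as a sketch against SQLite (via Python's sqlite3) in place of DB2. The sample rows are assumptions, with several rows per order number so that grouping has a visible effect.

```python
import sqlite3

# SQLite stands in for DB2; the rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_no INTEGER, no_prodts INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 4), (1, 6), (2, 12), (2, 10), (3, 3)],
)

# Groups: order 1 -> sum 10, avg 5; order 2 -> sum 22, avg 11; order 3 -> sum 3, avg 3.
# HAVING drops order 2 because its average is not below 10.
grouped = conn.execute(
    "SELECT order_no, SUM(no_prodts) "
    "FROM orders "
    "GROUP BY order_no "
    "HAVING AVG(no_prodts) < 10 "
    "ORDER BY order_no"
).fetchall()

print(grouped)
```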



Functions

• Types are two :
• Column Function
• Scalar Function


Column Functions

• Compute from a group of rows an aggregate value for the
specified column(s)
• AVG, COUNT, MAX, MIN, SUM
• Rules for column Functions - Refer Handout


Scalar Functions

• Are applied to a column or expression and operate on a
single value.
• CHAR, DATE, DAY(S), DECIMAL, DIGITS,
FLOAT, HEX, HOUR, INTEGER, LENGTH,
MICROSECOND, MINUTE, MONTH, SECOND,
SUBSTR, TIME, TIMESTAMP, VALUE,
VARGRAPHIC, YEAR
• Rules for Scalar Functions - Refer handout


Complex SQL’s
• A SQL statement is termed complex when the data that is
to be retrieved comes from more than one table
• SQL provides two ways of coding a complex SQL
• Subqueries and
• Joins


Subqueries

• Nested select statements
• Specified using the IN (or NOT IN) predicate, equality
or non-equality predicate (‘=’ or ‘<>’) and comparative
operator (<, <=, >, >=)
• When using the equality, non-equality or comparative
operators, the inner query should return only a single
value


contd...
• SELECT Cust_No, Cust_Name
FROM CUSTOMER WHERE Order_No IN
( SELECT Order_No FROM ORDERS
WHERE No_Prdts < 5);

• SELECT Cust_No, Cust_addr
FROM CUSTOMER WHERE Order_No =
( SELECT Order_No FROM ORDERS
WHERE No_Prdts = 5);


contd...

• Nested select statements give the user the flexibility
to query multiple tables
• A specialized form is the Correlated Subquery - the nested
Select stmt refers back to the columns in previous
select stmts
• It works in Top-Bottom-Top fashion
• A Noncorrelated Subquery works in Bottom-to-Top
fashion


Eg - Correlated Subquery..

• SELECT A.Cust_name, A.Cust_addr
FROM CUSTOMER A WHERE A.Order_No IN
(SELECT B.Order_No FROM CUSTOMER B
WHERE A.Cust_id = B.Cust_id)
ORDER BY A.Cust_id, A.Cust_no ;


Correlated Subquery using EXISTS clause :

SELECT Cust_No, Cust_name FROM CUSTOMER A
WHERE EXISTS (SELECT * FROM ORDERS B
WHERE B.Order_No = A.Order_No
AND B.Order_No = 5);
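
The following sketch shows a correlated EXISTS subquery next to an equivalent noncorrelated IN subquery. It uses SQLite through Python's sqlite3 in place of DB2, and it assumes a layout in which each order row carries the customer number (cust_no) and a product count (no_prodts); that layout and the sample rows are illustrative choices.

```python
import sqlite3

# SQLite stands in for DB2; schema and rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cust_no INTEGER, cust_name TEXT)")
conn.execute(
    "CREATE TABLE orders (order_no INTEGER, cust_no INTEGER, no_prodts INTEGER)"
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [(100, "Asha"), (200, "Ravi"), (300, "Mina")],
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 100, 3), (2, 100, 8), (3, 200, 2)],
)

# Correlated: the inner SELECT refers to a.cust_no from the outer query,
# so it is evaluated for each candidate customer row.
with_big_order = conn.execute(
    "SELECT cust_no, cust_name FROM customer a "
    "WHERE EXISTS (SELECT * FROM orders b "
    "              WHERE b.cust_no = a.cust_no AND b.no_prodts > 5) "
    "ORDER BY a.cust_no"
).fetchall()

# Correlated NOT EXISTS: customers with no orders at all.
without_orders = conn.execute(
    "SELECT cust_no, cust_name FROM customer a "
    "WHERE NOT EXISTS (SELECT * FROM orders b WHERE b.cust_no = a.cust_no) "
    "ORDER BY a.cust_no"
).fetchall()

# Noncorrelated: the inner SELECT stands on its own and can be evaluated first.
via_in = conn.execute(
    "SELECT cust_no, cust_name FROM customer "
    "WHERE cust_no IN (SELECT cust_no FROM orders WHERE no_prodts > 5) "
    "ORDER BY cust_no"
).fetchall()

print(with_big_order, without_orders, via_in)
```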



Multiple levels of Subquery
SELECT Cust_no, Cust_name, Cust_addr
FROM CUSTOMER A
WHERE Order_no IN
(SELECT Order_no FROM ORDERS B
WHERE Prod_id IN
(SELECT Prod_id FROM PRODUCTS
WHERE Prod_name = 'NUTS'));


Joins

OUTER JOIN : For the tables being joined, both
matching and nonmatching rows are returned.
Duplicate columns may be eliminated.
The columns of a nonmatching row that come from the other
table will have nulls in them.

INNER JOIN : Only the matching rows are returned, so one or
more rows from either or both of the tables being joined may
not be included in the table that results from the join
operation
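
The difference between the two join types shows up clearly in a small sketch. SQLite (via Python's sqlite3) is used here instead of DB2, with an assumed layout where orders carry the customer number; the explicit INNER JOIN / LEFT OUTER JOIN syntax is the sketch's choice.

```python
import sqlite3

# SQLite stands in for DB2; schema and rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cust_no INTEGER, cust_name TEXT)")
conn.execute("CREATE TABLE orders (order_no INTEGER, cust_no INTEGER)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [(100, "Asha"), (200, "Ravi"), (300, "Mina")],  # Mina has no orders
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 100), (2, 100), (3, 200)],
)

# Inner join: only customers that have at least one matching order.
inner = conn.execute(
    "SELECT c.cust_name, o.order_no "
    "FROM customer c INNER JOIN orders o ON o.cust_no = c.cust_no "
    "ORDER BY c.cust_no, o.order_no"
).fetchall()

# Left outer join: every customer; order columns are null where nothing matches.
outer = conn.execute(
    "SELECT c.cust_name, o.order_no "
    "FROM customer c LEFT OUTER JOIN orders o ON o.cust_no = c.cust_no "
    "ORDER BY c.cust_no, o.order_no"
).fetchall()

print(inner)
print(outer)
```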



DML’s

INSERT :

Eg: INSERT INTO Tablename (column1,
column2, column3, ...)
VALUES (value1, value2, value3, ...)

If a column is omitted in an INSERT stmt and that
column is defined as NOT NULL, the INSERT fails; if the
column allows nulls, it is set to null


contd...

• If the column is defined as NOT NULL WITH
DEFAULT, it is set to its default value
• Omitting the list of columns is equivalent to specifying
all columns
• SELECT - INSERT
INSERT INTO TEMP (A#, B)
SELECT A#, SUM(B) FROM TEMP1 GROUP
BY A# ;
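
The INSERT rules above can be exercised with a short sketch. SQLite (via Python's sqlite3) stands in for DB2, so the default is written as NOT NULL DEFAULT 0 rather than DB2's NOT NULL WITH DEFAULT, and the table names detail, summary and must_have and the column a_no are hypothetical substitutes for TEMP1, TEMP and A#.

```python
import sqlite3

# SQLite stands in for DB2; all table and column names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE detail (a_no INTEGER, b INTEGER)")
conn.executemany("INSERT INTO detail VALUES (?, ?)", [(1, 10), (1, 5), (2, 7)])
conn.execute(
    "CREATE TABLE summary (a_no INTEGER, b INTEGER NOT NULL DEFAULT 0, note TEXT)"
)

# SELECT - INSERT: one summary row per group of detail rows.
conn.execute(
    "INSERT INTO summary (a_no, b) SELECT a_no, SUM(b) FROM detail GROUP BY a_no"
)

# Omitted columns: b takes its default, the nullable column note becomes null.
conn.execute("INSERT INTO summary (a_no) VALUES (9)")
rows = conn.execute("SELECT a_no, b, note FROM summary ORDER BY a_no").fetchall()

# Omitting a NOT NULL column that has no default makes the INSERT fail.
conn.execute("CREATE TABLE must_have (x INTEGER NOT NULL, y INTEGER)")
try:
    conn.execute("INSERT INTO must_have (y) VALUES (1)")
    omitted_not_null = "ok"
except sqlite3.IntegrityError:
    omitted_not_null = "rejected"

print(rows, omitted_not_null)
```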



contd...
UPDATE:

Eg: UPDATE tablename SET Columnname(s) =
scalar expression
WHERE [ condition ]

• Single or Multiple row updates
• Update with a Subquery


contd...
DELETE:

Eg: DELETE FROM Tablename WHERE
[ condition ];

• Single or multiple row delete or deletion of all rows


Static SQL

• Hard-coded into an application program
• Cannot be modified during the program’s execution
except for changes to the values assigned to the host
variables
• Cursors are used to access set-level data
• The general form is EXEC SQL
[SQL stmts]
END-EXEC.


Dynamic SQL

• Stmts can change throughout the program’s execution
• When the SQL is bound, the application plan or
package that is created does not contain the same info
as that for a static SQL program
• The access paths cannot be determined before
execution
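
The contrast between the two styles can be sketched loosely in Python. This is only an analogy, not DB2 embedded SQL: sqlite3 has no bind step, so the sketch shows just the difference between a fixed statement text whose values alone change (as with host variables) and a statement text assembled at run time. The function names and the sample table are assumptions.

```python
import sqlite3

# SQLite stands in for DB2; names and rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (cust_no INTEGER, cust_name TEXT, cust_location TEXT)"
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(1000, "Asha", "Chennai"), (1500, "Ravi", "Pune"), (2500, "Mina", "Chennai")],
)

# "Static" style: the statement text never changes; only the value supplied
# for the placeholder (the analogue of a host variable) varies.
STATIC_SQL = "SELECT cust_name FROM customer WHERE cust_no = ?"


def find_static(cust_no):
    return conn.execute(STATIC_SQL, (cust_no,)).fetchall()


# "Dynamic" style: the statement text itself is built during execution.
ALLOWED_COLUMNS = {"cust_no", "cust_location"}


def find_dynamic(filters):
    clauses, params = [], []
    for column, value in filters.items():
        if column not in ALLOWED_COLUMNS:  # never splice unchecked names into SQL
            raise ValueError("unknown column: " + column)
        clauses.append(column + " = ?")
        params.append(value)
    sql = "SELECT cust_name FROM customer"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    sql += " ORDER BY cust_no"
    return conn.execute(sql, params).fetchall()


print(find_static(1500))
print(find_dynamic({"cust_location": "Chennai"}))
```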



SQL Guidelines :

- Refer handout
- Mullins, chapter 2



Topics dealt with, in DB2 objects

• Databases, Stogroup, Tablespaces (types, creation and
modification)
• Indexspaces (creation and modification)
• Some more terms associated with tablespaces


DB2 Objects

• Databases - User & system (catalog)
• A collection of logically related objects - like
Tablespaces, Indexspaces, Tables etc.
• Not a physical kind of object - may occupy more
than one disk space
• A STOGROUP & BUFFERPOOL must be defined
for each database. Stogroup and user-defined VSAM
are the two storage allocations for a DB2 dataset
definition.


Stogroup

• It is a collection of direct access volumes, all of the
same device type
• The option is defined as a part of tablespace definition
• When a given space needs to be extended, storage is
acquired from the appropriate stogroup


contd...

• In a given database, all the spaces need not have the
same stogroup
• These are, in a sense, the most physical of the various
storage objects in DB2
• More than one volume can be defined in a stogroup.
DB2 keeps track of which volume was defined first &
uses that volume.


VCAT Option

• User Defined VSAM datasets have to be defined
explicitly by the AMS utility IDCAMS
• Two types of VSAM datasets are used - ESDS & LDS.
Linear Data set is more efficiently used by DB2
• VSAM datasets defined here are different from plain
VSAM datasets - they can be accessed only through the VSAM
Media Manager


Tablespaces

• Logical address space on secondary storage to hold one
or more tables
• A ‘SPACE’ is basically an extendable collection of
pages with each page of size 4K or 32K bytes.
• It is the storage unit for recovery and reorganizing
purposes
• Three Types of Tablespaces - Simple, Partitioned &
Segmented


Simple Tablespaces

• Can contain more than one stored table
• Depending on the application, storing more than one table
might enable faster retrieval for joins using these tables
• Usually only one table is preferred. This is because a single
page can contain rows from all tables defined in the
tablespace.
• LOAD with the REPLACE option deletes all data


Segmented Tablespaces

• Can contain more than one stored table, but in a
segmented space
• A ‘Segment’ consists of a logically contiguous set of n
pages
• No segment is allowed to contain records for more
than one table
• Sequential access to a particular table is more efficient


contd...

• Mass Delete is much more efficient than in any other
Tablespace
• Reorganizing the tablespace will restore every table to
its clustered order
• Lock Table on table locks only the table, not the entire
tablespace
• If a table is dropped, the space for that table can be
reclaimed with minimum reorg


Partitioned Tablespaces

• Only one table in a partitioned TS; 1 to 64
partitions/TS
• It is partitioned in accordance with value ranges for
a single column or a combination of columns. Hence these
column(s) cannot be updated


contd...

• Individual partitions can be independently recovered
and reorganized
• Different partitions can be stored on different storage
groups for efficient access.


Tablespace parameters to be specified for TS
creation
• Locksize - indicates the type of locking DB2 performs
for the given TS
• Page
• Table
• Tablespace
• ANY - DB2 decides the lock size, usually starting with page locks


contd...

• USING - method of storage allocation - Stogroup or
Vcat
• PCTFREE - % of space in each page left free for future inserts
• FREEPAGE - no of pages after which an empty page is
left
• Bufferpool - BP0, BP1, BP2 & BP32K
• CLOSE - Yes/No - whether the underlying VSAM
datasets are closed each time the table is no longer in use. Max no
of datasets that can be open in DB2 at a time is 10,000


contd...

• ERASE - Yes/No - whether the physical DASD where the
TS resides is to be overwritten with binary zeros when the TS
is dropped
• NUMPARTS - For Partitioned Tablespaces
• SEGSIZE - For Segmented Tablespaces


Table Parameters for Creation

• Column Definition
• Format : CREATE TABLE TABLENAME (Column
Definitions)
• PRIMARY KEY (Columns) / FOREIGN KEY (referential constraint)
• UNIQUE (Colname)


contd...

• 1. LIKE Table name / View name
• 2. IN Database.Tablespace name
• FOREIGN KEY ... REFERENCES dbname.table ON DELETE ‘relation
condition for delete’
• Table1 references table2 (target) - Table2’s Primary key
is the foreign key defined in Table1


contd...

• The conditions are CASCADE, RESTRICT & SET
NULL (the delete rule of the referential constraint in the
foreign key definition)
• Under RESTRICT, deleting (or updating the primary key of) rows in
the target is allowed only if there are no matching rows in the
referencing table
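
The three delete rules can be compared with a sketch. It runs on SQLite (via Python's sqlite3) rather than DB2, and the customer/orders tables and the helper name delete_parent are assumptions made for illustration.

```python
import sqlite3


def delete_parent(rule):
    """Delete a referenced customer row under the given ON DELETE rule.

    Returns the outcome and the rows left in the referencing orders table.
    SQLite stands in for DB2; table names are illustrative assumptions.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enable foreign key checks in SQLite
    conn.execute("CREATE TABLE customer (cust_no INT NOT NULL PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE orders ("
        " order_no INT NOT NULL PRIMARY KEY,"
        " cust_no INT REFERENCES customer (cust_no) ON DELETE " + rule + ")"
    )
    conn.execute("INSERT INTO customer VALUES (100)")
    conn.execute("INSERT INTO orders VALUES (1, 100)")
    try:
        conn.execute("DELETE FROM customer WHERE cust_no = 100")
        outcome = "deleted"
    except sqlite3.IntegrityError:
        outcome = "rejected"
    remaining = conn.execute("SELECT order_no, cust_no FROM orders").fetchall()
    conn.close()
    return outcome, remaining


for rule in ("CASCADE", "RESTRICT", "SET NULL"):
    print(rule, delete_parent(rule))
```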



Alter & Drop stmts

• ALTER : ALTER TABLE <Tablename>
ADD Column Data-type [ not null with default ]
• Alter allows primary & Foreign key specifications to
be changed
• It does not support changes to width or data type of a
column or dropping a column


contd...

• DROP : DROP TABLE <Tablename>
• Similar stmts are there for INDEX.


Some general rules for RI & Table Parameters

• Avoid nulls in columns participating in
arithmetic logic or comparisons
• Primary key cols cannot be null
• Limit referential structures to no more than
three levels in a direction


contd...

• Use DB2’s inherent features rather than pgm coded
RI’s.
• Do not use RI’s on tables built from another RI system
• Consider using Fieldprocs or Editprocs or Validprocs


Index Parameters for Creation
• CREATE INDEX Indexname ON Tablename
(Colnames asc/desc)
• CLUSTER
• SUBPAGES
• USING STOGROUP/VCAT (the corresponding name)
• PRIQTY / SECQTY ; ERASE Yes/No
• BUFFERPOOL
• CLOSE - Yes/No
• FREEPAGE
• PCTFREE



Index Guidelines - What to do ?

1. Consider indexing on columns used in
UNION, DISTINCT, GROUP BY, ORDER BY &
WHERE clauses.
2. Limit the indexing of frequently updated columns
3. Explicitly create a clustering index
4. Create a unique index on the primary key and
indexes on foreign keys


contd...

5. Consider overloading an index when the row length of the table to
be accessed is short
6. At least one index must be defined for a table with
more than 100 pages
7. Use a multicolumn index rather than multiple indexes
(appln dependent); however the latter requires more
DASD.


contd...

8. Create indexes before loading the table.
9. Clustering reduces I/O; the DB2 optimizer usually tries
to use an index on a clustered column before using the
other indexes.
10. Optimize the Subpages parameter
11. Specify Indexspace freespace the same as
tablespace freespace


contd...

12. Use the DEFER option while creating the index.
RECOVER INDEX utility can then be used to populate
the index. Recover utility populates index entries
faster.
13. Use different STOGROUPs for Tablespaces &
indexspaces
14. Create critical indexes in a different bufferpool than
the tablespaces.


Index Guidelines - What Not to do ?

1. Avoid indexing on variable-length columns
2. Limit the number of indexes on partitioned TS
3. Avoid indexes if the table is very small (< 10 pages), or
it has heavy inserts and deletes and is relatively small (< 20
pages), or it is accessed with a scan. Avoid defining
redundant indexes


Some more terms & concepts associated with
Tables
VIEWS:
• It is a logical derivation of a table from other
table/tables. A View does not exist in its own right.
• They provide a certain amount of logical independence
• They allow the same data to be seen by different users
in different ways
• In DB2 a view that is to accept an update must be
derived from a single base table


Some more terms & concepts associated with
Tables
Aliases and Synonyms :
Both mean ‘another name’ for the table.
The difference is that a synonym is private to the
user who created it. Aliases are used mainly for
accessing remote tables (in distributed data
processing), whose names carry a location prefix.
Using an alias gives a shorter name.


Some more terms & concepts associated with
Tables
Format:
CREATE VIEW <Viewname> (<columns>)
AS Subquery (Subquery - Select from
other Table(s))
CREATE ALIAS <Aliasname> FOR
<Tablename>
CREATE SYNONYM <Synonymname> FOR
<Tablename>
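
A view's behaviour can be sketched as follows, using SQLite through Python's sqlite3 as a stand-in for DB2. The view name chennai_customers and the sample rows are assumptions; aliases and synonyms are not shown because SQLite has no equivalent statements.

```python
import sqlite3

# SQLite stands in for DB2; names and rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (cust_no INTEGER, cust_name TEXT, cust_location TEXT)"
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(1000, "Asha", "Chennai"), (1500, "Ravi", "Pune")],
)

# The view stores only its defining subquery, not a copy of the rows.
conn.execute(
    "CREATE VIEW chennai_customers AS "
    "SELECT cust_no, cust_name FROM customer WHERE cust_location = 'Chennai'"
)

VIEW_QUERY = "SELECT cust_no, cust_name FROM chennai_customers ORDER BY cust_no"
before = conn.execute(VIEW_QUERY).fetchall()

# A change to the base table is visible through the view immediately.
conn.execute("INSERT INTO customer VALUES (3000, 'Kavi', 'Chennai')")
after = conn.execute(VIEW_QUERY).fetchall()

print(before)
print(after)
```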



Session 3



Topic to be covered in this session

• The following topics will be covered in this session
• Application programming using DB2 - 1 day
• Data control Language, SPUFI, QMF, Appln pgming
Guidelines - 0.5 days


Application programming using DB2

• Application environments supporting DB2 :
• IMS (Batch/Online), CICS, TSO (Batch/Online)
• CAF - Call Attach Facility
• All DB2 application types can execute concurrently
• Host Language support - Cobol, PL/1, C, Fortran or
Assembly lang


Steps involved in creating a DB2 application

• Coding the application
• using Host variables
• using Embedded SQL
• using Cursors
• issue DCLGEN command


contd...

• Precompile the program
• Compile & Link edit the program
• Bind


Host Variables

• These are variables (or rather areas of storage) defined in
the host language to hold the column values and predicate values
used with a DB2 table. These are referenced in the SQL stmt.
• A means of moving data from and to DB2 tables
• DCLGEN produces host variables that match the
columns of the table


Host Variables

Can be used in
• ‘INTO’ CLAUSE OF SELECT & FETCH
STATEMENTS
• AS INPUT OF ‘SET’ CLAUSE OF UPDATE STMTS
• AS INPUT FOR THE ‘VALUES’ CLAUSE OF
INSERT STATEMENT
• IN WHERE CLAUSE OF SELECT, INSERT,
UPDATE & DELETE
• AS LITERALS IN SELECT LIST OF A SELECT
STATEMENT
Example

• SELECT Cust_No, Cust_name, Cust_addr
INTO :H-Cust-No, :H-Cust-name, :H-Cust-addr
FROM CUSTOMER
WHERE Cust_No = :H-Cust-No;


Embedded SQL statements

• It is like file I/O
• Normally the embedded SQL statements contain the
host variables coded in the INTO and WHERE clauses, as
shown in the example above
• They are preceded by EXEC SQL
• SELECT, INSERT, UPDATE & DELETE stmts can be
coded inline


Using Cursors

• can be likened to a pointer
• used when a large number of rows are to be selected
• can be used for modifying data using a ‘FOR UPDATE
OF’ clause


Cursors

• DECLARE : name assigned for a particular SQL stmt
• OPEN : readies the cursor for row retrieval; sometimes
builds the result table. However it does not assign
values to the host variables
• FETCH : returns data from the results table one row at
a time and assigns the values to the specified host variables
• CLOSE : releases all resources used by the cursor
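
The four cursor steps have a rough counterpart in Python's sqlite3 module, sketched below. This is an analogy rather than embedded DB2 SQL: the mapping of cursor(), execute(), fetchone() and close() to DECLARE, OPEN, FETCH and CLOSE is approximate, and the variables h_cust_no and h_cust_name only play the role of host variables.

```python
import sqlite3

# SQLite stands in for DB2; names and rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cust_no INTEGER, cust_name TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [(1000, "Asha"), (1500, "Ravi"), (2500, "Mina")],
)

cur = conn.cursor()  # roughly DECLARE: a handle for one query
cur.execute(         # roughly OPEN: the result table becomes available
    "SELECT cust_no, cust_name FROM customer ORDER BY cust_no"
)

fetched = []
while True:
    row = cur.fetchone()  # roughly FETCH: one row at a time
    if row is None:       # no more rows (DB2 signals this with SQLCODE +100)
        break
    h_cust_no, h_cust_name = row  # values land in the "host variables"
    fetched.append((h_cust_no, h_cust_name))

cur.close()  # roughly CLOSE: release the cursor's resources
print(fetched)
```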



DCLGEN

• issued for a single table
• prepares the structure of the table in a COBOL
copybook
• The copybook contains a ‘SQL DECLARE TABLE’
stmt along with a working storage host variable definition
for the table


Precompile

• searches for all the SQL stmts and DB2 related INCLUDE
members and comments out every SQL stmt in the
program
• the SQL stmts are replaced by a CALL to the DB2
runtime interface module, along with parameters.
• All SQL statements are extracted and put in a Database
Request Module (DBRM)


Contd...

• places a time stamp in the modified source and the
DBRM so that these are tied. If there is a mismatch between
them, a runtime error of ‘-818’, timestamp mismatch,
results
• all DB2 related INCLUDE stmts must be placed
between EXEC SQL & END-EXEC keywords for the
precompiler to recognize them


Compile & Link

• The modified COBOL source output by the precompiler is compiled
• The compiled code is link edited into an executable load
module
• The appropriate DB2 host language interface module
should also be included in the link edit step (e.g.
DSNALI for the Call Attach Facility)


Bind

• A type of compiler for SQL statements
• It reads the SQL statements from the DBRM and
produces a mechanism to access data (in an efficient
manner) as directed by the SQL statements being
bound
• Checks syntax, checks for correctness of table &
column definitions against the catalog info & performs
authorization validation


Bind Types

• BIND PLAN : accepts as input one or more DBRMs
and outputs an application plan containing executable
logic representing optimized access paths to DB2 data.
• BIND PACKAGE : accepts as input a single DBRM
and produces a single package containing the
optimized access path. The PLAN in this case contains
a reference to the physical location of the package(s).


What is a Package ?

• It is a single bound DBRM with optimized access paths
• It also contains a location identifier, a collection
identifier and a package identifier
• A package can have multiple versions, each with its
own version identifier


Advantages of Package

• Reduced bind time
• can specify bind options at the programmer level
• versioning
• provides for remote data access (in DB2 V2.3
or higher)


Data Control language

• GRANT & REVOKE
• GRANT : grants the table privileges, plan & package
privileges, collection privileges, database privileges,
use privileges and system privileges
• A user with the SYSADM privilege will be responsible for
overall control of the system


contd...

• Format of GRANT :
GRANT SELECT, UPDATE(NAME,NO)
ON TABLE EMPL
TO A, B, C;   -- or TO PUBLIC
GRANT ALL ON TABLE EMPL TO PUBLIC;
GRANT EXECUTE ON PLAN PLANA TO USERA;

CTS-PAC Version 1.1 117


contd...

• The table privileges allowed are SELECT, UPDATE,
DELETE and INSERT (on both base tables & views), and
ALTER & INDEX (on base tables only)
• There are no specific DROP privileges; a table can be
dropped by its owner or a SYSADM

CTS-PAC Version 1.1 118


contd...

• A user who has the authority to grant a privilege to
another user can also grant it WITH GRANT OPTION,
which allows the recipient to grant that privilege to
others in turn
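
A minimal sketch of how the grant option passes a privilege along. EMPL and USERA come from the earlier examples; USERB is a hypothetical second user.

```sql
-- USERA receives SELECT on EMPL and may grant it onward.
GRANT SELECT ON TABLE EMPL TO USERA WITH GRANT OPTION;

-- Issued later by USERA: pass the same privilege to USERB.
GRANT SELECT ON TABLE EMPL TO USERB;
```

Without WITH GRANT OPTION on the first statement, the second statement would fail for USERA.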

CTS-PAC Version 1.1 119


contd...

• REVOKE : this statement revokes privileges given to a
user. The user who granted a privilege also has the
authority to REVOKE it.
• It is not possible to be column-specific when revoking
an UPDATE privilege
REVOKE SELECT ON TABLE EMPL FROM USERA;
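
Because REVOKE cannot name columns, one possible way to narrow an UPDATE privilege is to revoke it entirely and grant it again for the columns that should remain. This is a sketch using the EMPL table and USERA from the examples above.

```sql
-- Remove the UPDATE privilege on every column of EMPL...
REVOKE UPDATE ON TABLE EMPL FROM USERA;

-- ...then grant it again for the NAME column only.
GRANT UPDATE(NAME) ON TABLE EMPL TO USERA;
```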

CTS-PAC Version 1.1 120


For the following, refer to the handout

• List of common SQL return codes and solutions
• JCL for compiling and binding a DB2 program

CTS-PAC Version 1.1 121


Application development guidelines

• Code modular DB2 programs and make them as small
as possible
• use unqualified table names in SQL statements; this
enables movement from one environment to
another (test to production)
• Never use SELECT * in an embedded SQL program;
name the required columns explicitly
• use joins rather than subqueries
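
The last two guidelines can be illustrated roughly as follows. The DEPT table and the columns EMPNO, DEPTNO and DEPTNAME are hypothetical; an embedded program would also add an INTO clause or use a cursor.

```sql
-- Name only the columns the program needs instead of SELECT *.
SELECT EMPNO, NAME
  FROM EMPL
 WHERE DEPTNO = 'D01';

-- Subquery form.
SELECT E.EMPNO, E.NAME
  FROM EMPL E
 WHERE E.DEPTNO IN (SELECT D.DEPTNO
                      FROM DEPT D
                     WHERE D.DEPTNAME = 'SALES');

-- Equivalent join form (assumes DEPTNO is unique in DEPT).
SELECT E.EMPNO, E.NAME
  FROM EMPL E, DEPT D
 WHERE E.DEPTNO   = D.DEPTNO
   AND D.DEPTNAME = 'SALES';
```

An explicit column list also keeps the program working if columns are later added to the table.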

CTS-PAC Version 1.1 122


contd...

• use the WHERE clause to filter out unwanted data
• use cursors when fetching multiple rows, though they
add overhead
• use the FOR UPDATE OF clause when a cursor is used for
UPDATE or DELETE - this ensures data integrity.
• use INSERTs minimally; use the LOAD utility instead of
INSERT, if the inserts are not application dependent
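
A sketch of the cursor guidelines above. The cursor name EMPCSR, the columns EMPNO and DEPTNO, and the COBOL-style host variables are hypothetical; in a COBOL program each statement would be coded between EXEC SQL and END-EXEC rather than ended with a semicolon.

```sql
-- Cursor over only the rows the program needs; the WHERE clause
-- filters inside DB2 rather than in the program.
DECLARE EMPCSR CURSOR FOR
  SELECT EMPNO, NAME
    FROM EMPL
   WHERE DEPTNO = :HV-DEPTNO
     FOR UPDATE OF NAME;

OPEN EMPCSR;

FETCH EMPCSR INTO :HV-EMPNO, :HV-NAME;

-- Positioned update of the row just fetched.
UPDATE EMPL
   SET NAME = :HV-NEW-NAME
 WHERE CURRENT OF EMPCSR;

CLOSE EMPCSR;
```

The FETCH is normally repeated in a loop until SQLCODE +100 signals that no more rows remain.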

CTS-PAC Version 1.1 123


QMF - Query Management Facility

• It is an MVS- and VM-based query tool
• allows end users to enter SQL queries and produce a
variety of reports and graphs from the results
• QMF queries can be formulated in several ways : by
direct SQL statements, by means of the relational prompted
query interface or by Query-By-Example (QBE). QBE is
similar to SQL in some ways but more user friendly

CTS-PAC Version 1.1 124


SPUFI

• supports the online execution of SQL statements from
a TSO terminal
• used by developers to check SQL statements or view
table details
• The SPUFI panel specifies the input data set in which the
SQL statements are coded, options for default settings and
editing, and the output data set.

CTS-PAC Version 1.1 125


Session 4

CTS-PAC Version 1.1 126


Topics to be covered in this Session

The duration of this session is 0.5 days


• DB2 Utilities
• DB2 Security
• DB2 catalog & Optimizer
• Performance tuning

CTS-PAC Version 1.1 127


DB2 System administration

• DB2 UTILITIES
• CHECK
• COPY, MERGECOPY
• RECOVER
• LOAD
• REORG, RUNSTATS
• EXPLAIN

CTS-PAC Version 1.1 128


Check

• checks the integrity of DB2 data structures


• checks the referential integrity between two tables and
also checks DB2 indexes for consistency

CTS-PAC Version 1.1 129


contd...

• can delete invalid rows and copy them to an exception
table
• Use CHECK DATA after loading a table with the
‘ENFORCE NO’ option (i.e. without ‘ENFORCE
CONSTRAINTS’) or after the partial recovery of
tablespaces in a referential set

CTS-PAC Version 1.1 130


Copy

• used to create an image copy of the complete
tablespace or of one partition of the tablespace - either a
full image copy or an incremental image copy
• every successful execution of the COPY utility places in the
table SYSIBM.SYSCOPY at least one row that
indicates the status of the image copy
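
The rows that COPY records can be inspected with an ordinary query. This is a sketch only; the database name DBEMPL and tablespace name TSEMPL are hypothetical.

```sql
-- Image copies recorded for one tablespace, newest first.
-- ICTYPE 'F' = full image copy, 'I' = incremental image copy.
SELECT DBNAME, TSNAME, DSNUM, ICTYPE, ICDATE, ICTIME, DSNAME
  FROM SYSIBM.SYSCOPY
 WHERE DBNAME = 'DBEMPL'
   AND TSNAME = 'TSEMPL'
   AND ICTYPE IN ('F', 'I')
 ORDER BY ICDATE DESC, ICTIME DESC;
```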

CTS-PAC Version 1.1 131


Mergecopy

• The MERGECOPY utility combines multiple
incremental image copy data sets into a new full or
incremental image copy data set

CTS-PAC Version 1.1 132


Recover

• Standard unit of recovery is a tablespace
• restores DB2 tablespaces and indexes to a specific
point in time
• data can be recovered for single pages, pages that
contain I/O errors, a single partition or an entire
tablespace
• indexes are always recovered from the actual table
data, not from image copy and log data, as in the case
of tablespace recovery

CTS-PAC Version 1.1 133


Load

• used to accomplish bulk inserts into DB2 tables
• can replace the current data or append to it, i.e. LOAD
DATA REPLACE or LOAD DATA RESUME YES
• if a job terminates in any phase of LOAD REPLACE
the utility has to be terminated and rerun

CTS-PAC Version 1.1 134


contd...

• if a job terminates in any phase other than
UTILINIT (which sets up and initializes the LOAD
utility), the tablespace must first be restored using a
full RECOVER, if the LOG NO option of LOAD was
specified. After the tablespace is restored, the error is
to be corrected, the utility terminated and the job rerun.

CTS-PAC Version 1.1 135


Reorg

• reorganizes DB2 tables and indexes, thereby
improving their efficiency of access
• reclusters data, resets free space to the amount
specified in the CREATE DDL statement and deletes and
redefines the underlying VSAM data sets for
STOGROUP-defined objects

CTS-PAC Version 1.1 136


Runstats

• collects statistical information for DB2 tables,
tablespaces, partitions, indexes, and columns.
• it can place this info in the catalog tables with DB2
optimizer statistics or DBA monitoring statistics or
with all statistics that have been gathered
• it can also be run to report statistics without updating
the statistics currently in use

CTS-PAC Version 1.1 137


Reorg Job stream

• the total reorg schedule should include
• a RUNSTATS job or step : to record current tablespace
and index statistics in the DB2 catalog
• two COPY steps for each tablespace being reorganized :
so that data is recoverable. The second copy step is
required after the REORG if it was performed with the
LOG NO option

CTS-PAC Version 1.1 138


contd...

• After a REORG is run with the LOG NO option, DB2
turns on the copy pending status flag for the tablespaces
specified in the REORG.
• When the LOG NO parameter is specified it is better to
take an image copy of the tablespace being reorganized
immediately after the reorg
• a REBIND job for all plans using tables in any of the
tablespaces being reorganized

CTS-PAC Version 1.1 139


Explain

• this feature details the access paths chosen by the
DB2 optimizer for SQL statements
• used for performance monitoring
• When EXPLAIN is requested the access paths that
DB2 chooses are put in coded format into the table
PLAN_TABLE, which is created in the default
database

CTS-PAC Version 1.1 140


contd...

• To EXPLAIN a single SQL statement, precede that
statement with the EXPLAIN command
EXPLAIN ALL SET QUERYNO = integer
FOR SQL statement
• the other method is specifying EXPLAIN(YES) with the
BIND command
• then PLAN_TABLE is to be queried to get the required
information.
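
A short sketch of the first method, followed by the query against PLAN_TABLE. The query number 100 and the columns EMPNO and DEPTNO are arbitrary choices for illustration, and a PLAN_TABLE is assumed to exist already under the current authorization ID.

```sql
-- Record the access path for one statement under query number 100.
EXPLAIN ALL SET QUERYNO = 100 FOR
  SELECT EMPNO, NAME
    FROM EMPL
   WHERE DEPTNO = 'D01';

-- Read back what the optimizer chose.
SELECT QUERYNO, QBLOCKNO, PLANNO, METHOD, TNAME,
       ACCESSTYPE, MATCHCOLS, ACCESSNAME, INDEXONLY
  FROM PLAN_TABLE
 WHERE QUERYNO = 100
 ORDER BY QBLOCKNO, PLANNO;
```

In the result, ACCESSTYPE 'I' indicates index access and 'R' a tablespace scan; ACCESSNAME gives the index used, if any.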

CTS-PAC Version 1.1 141


contd...

• the information provided includes the type of access to the
particular tables used in the SQL or Package or Plan,
the order in which the tables are joined in a JOIN,
whether a SORT is required and so on
• Since the EXPLAIN results are dependent on the DB2
catalog, it is better to run RUNSTATS before running an
EXPLAIN

CTS-PAC Version 1.1 142


DB2 Security

• LOCKING SERVICES :
These are provided by an MVS subsystem called the
IMS Resource Lock Manager (IRLM).
It is used to control concurrent access to DB2 data,
regardless of whether IMS is present in a system or not.

CTS-PAC Version 1.1 143


contd...

• The above is based on transaction processing - the
system component that provides this is a
‘TRANSACTION MANAGER’
• COMMIT & ROLLBACK are the key methods of
implementing this
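
A minimal sketch of a unit of work ended by COMMIT or ROLLBACK, as it might be run from SPUFI or a TSO batch program. The ACCOUNT table and its columns are hypothetical. Under CICS or IMS the commit is requested through the transaction manager's own calls (for example CICS SYNCPOINT) rather than the SQL statements shown here.

```sql
-- Both updates belong to one unit of work.
UPDATE ACCOUNT SET BALANCE = BALANCE - 100 WHERE ACCTNO = 'A1';
UPDATE ACCOUNT SET BALANCE = BALANCE + 100 WHERE ACCTNO = 'B1';

-- Make both changes permanent together.
COMMIT;

-- Had either update failed, the program would instead undo
-- everything since the last commit point:
-- ROLLBACK;
```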

CTS-PAC Version 1.1 144


Explicit locking facilities

• the SQL statement LOCK TABLE


• the ISOLATION parameter on the BIND PACKAGE
command - the two possible values are RR(‘Repeatable
Read’) & CS(‘Cursor Stability’)

CTS-PAC Version 1.1 145


contd...

• the tablespace LOCKSIZE parameter - physically DB2
locks data in terms of pages or tables or tablespaces.
This parameter is specified through the ‘LOCKSIZE’
option of ‘CREATE or ALTER TABLESPACE’. The options are
‘TABLESPACE’, ‘TABLE’, ‘PAGE’ or ‘ANY’

CTS-PAC Version 1.1 146


contd...

• the ACQUIRE/RELEASE parameters on the BIND
PLAN command specify when table locks (which are
implicitly acquired by DB2) are to be acquired and
released.
• Types : ACQUIRE(USE) & ACQUIRE(ALLOCATE)
• RELEASE(COMMIT) & RELEASE(DEALLOCATE)
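
Two of the explicit locking facilities listed above are plain SQL and can be sketched as follows. The database and tablespace names DBEMPL and TSEMPL are hypothetical; the ISOLATION, ACQUIRE and RELEASE options belong to the BIND command and are not shown.

```sql
-- Explicit table lock that still lets other programs read EMPL.
LOCK TABLE EMPL IN SHARE MODE;

-- Explicit table lock that prevents concurrent updates by others.
LOCK TABLE EMPL IN EXCLUSIVE MODE;

-- Set the lock granularity for a tablespace to page level.
ALTER TABLESPACE DBEMPL.TSEMPL LOCKSIZE PAGE;
```

A program would normally issue only one of the two LOCK TABLE statements, depending on whether it intends to change the data.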

CTS-PAC Version 1.1 147


Session 5

CTS-PAC Version 1.1 148


Topics to be covered in this Session

The duration of this session is 0.5 days


• DB2 Catalog & Directory

CTS-PAC Version 1.1 149


Catalog Tables & the DB2 directory
• Repository for all DB2 objects - contains 43 tables
• Each table maintains data about an aspect of the DB2
environment
• The data refers to info about tablespaces, tables,
indexes, privileges, utilities run on DB2 and so on,
e.g. SYSIBM.SYSTABLES, SYSIBM.SYSINDEXES,
SYSIBM.SYSCOLUMNS ...
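
Since the catalog consists of ordinary tables, it can be read with ordinary SELECT statements. The sketch below assumes a creator ID of USERA and a table named EMPL.

```sql
-- Tables owned by one creator.
SELECT NAME, DBNAME, TSNAME, COLCOUNT
  FROM SYSIBM.SYSTABLES
 WHERE CREATOR = 'USERA'
   AND TYPE    = 'T';

-- Column definitions of one of those tables.
SELECT NAME, COLNO, COLTYPE, LENGTH, NULLS
  FROM SYSIBM.SYSCOLUMNS
 WHERE TBCREATOR = 'USERA'
   AND TBNAME    = 'EMPL'
 ORDER BY COLNO;
```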

CTS-PAC Version 1.1 150


contd...
• When standard DB2 SQL is used, the DB2 catalog is
either accessed or updated, e.g. when a ‘CREATE
TABLE’ statement is issued the catalog tables
SYSIBM.SYSTABLES, SYSIBM.SYSCOLUMNS &
SYSIBM.SYSFIELDS are updated.
• However the DB2 catalog is only semi-active. This
is because statistics such as the number of rows and the
physical order of the rows for a set of keys are
updated only after running the RUNSTATS utility
• DB2 catalog is integrated - DB2 catalog and DB2
DBMS are inherently bound together
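
The semi-active behaviour can be observed by looking at the statistics columns of a table's catalog row. This is a sketch assuming the column names of recent DB2 for z/OS catalogs (older releases expose the row count as CARD rather than CARDF); USERA and EMPL are placeholder names.

```sql
-- Row count and page count as last recorded by RUNSTATS.
-- A value of -1 means statistics have not been gathered yet.
SELECT NAME, CARDF, NPAGES
  FROM SYSIBM.SYSTABLES
 WHERE CREATOR = 'USERA'
   AND NAME    = 'EMPL';
```

Rows inserted after the last RUNSTATS run are not reflected in these values until the utility is run again.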
CTS-PAC Version 1.1 151
contd...

• It is nonsubvertible - the DB2 catalog cannot be updated
behind DB2’s back, i.e. if a table of 10 columns is
created, it is not possible to go and change the number
of columns directly in the catalog to 15. It has to be
done using standard SQL statements (for example,
dropping and recreating the table)

CTS-PAC Version 1.1 152


DB2 Optimizer

• Analyzes the SQL statements and determines the most
efficient way to access data - gives physical data
independence
• It evaluates the following factors : CPU cost, I/O cost,
DB2 catalog statistics & the SQL statement
• it estimates the CPU time and cost involved in applying
predicates, traversing pages and sorting

CTS-PAC Version 1.1 153


contd...

• It estimates the cost of physically retrieving and
writing the data
• The information pertaining to the state of the tables that
will be accessed by the SQL statements is provided
by the catalog

CTS-PAC Version 1.1 154


Performance Tuning

• The performance of an application can be monitored
and enhanced at the application level as well as at the
database level
• On the application side, the SQL statements can be tuned
to make them more efficient and to avoid redundancy
• It is better to structure the SQL statements so that they
perform only the necessary operations

CTS-PAC Version 1.1 155


contd...

• On the database side, the major enhancements can be
made to the definitions of tables and indexes & to the
distribution of tablespaces and indexspaces
• The application run statistics are obtained from
EXPLAIN or from DB2PM monitor reports

CTS-PAC Version 1.1 156


Thank U

CTS-PAC Version 1.1 157
