
Teradata SQL for Business Users

Course # 36838
Version 14.0.0
Student Guide
Trademarks
The following names are registered names or trademarks and are used throughout this manual.
The product or products described in this book are licensed products of Teradata Corporation or
its affiliates.

Teradata, BYNET, DBC/1012, DecisionCast, DecisionFlow, DecisionPoint, Eye logo design,
InfoWise, Meta Warehouse, MyCommerce, SeeChain, SeeCommerce, SeeRisk, Teradata
Decision Experts, Teradata Source Experts, WebAnalyst, You’ve Never Seen Your Business
Like This Before, and Raising Intelligence are trademarks or registered trademarks of Teradata
Corporation or its affiliates.

Adaptec and SCSISelect are trademarks or registered trademarks of Adaptec, Inc.
AMD Opteron and Opteron are trademarks of Advanced Micro Devices, Inc.
BakBone and NetVault are trademarks or registered trademarks of BakBone Software, Inc.
EMC², PowerPath, SRDF, and Symmetrix are registered trademarks of EMC Corporation.
GoldenGate is a trademark of GoldenGate Software, a division of Oracle Corporation.
Hewlett-Packard and HP are registered trademarks of Hewlett-Packard Company.
Intel, Pentium, and XEON are registered trademarks of Intel Corporation.
IBM, CICS, RACF, Tivoli, z/OS, and z/VM are registered trademarks of International Business
Machines Corporation.
Linux is a registered trademark of Linus Torvalds.
LSI and Engenio are registered trademarks of LSI Corporation.
Microsoft, Active Directory, Windows, Windows NT, and Windows Server are registered
trademarks of Microsoft Corporation in the United States and other countries.
Novell and SUSE are registered trademarks of Novell, Inc., in the United States and other
countries.
QLogic and SANbox are trademarks or registered trademarks of QLogic Corporation.
SAS and SAS/C are trademarks or registered trademarks of SAS Institute Inc.
SPARC is a registered trademark of SPARC International, Inc.
Symantec, NetBackup, and VERITAS are trademarks or registered trademarks of Symantec
Corporation or its affiliates in the United States and other countries.
Unicode is a collective membership mark and a service mark of Unicode, Inc.
UNIX is a registered trademark of The Open Group in the United States and other countries.

Other product and company names mentioned herein may be the trademarks of their respective
owners.

The materials included in this book are a licensed product of Teradata Corporation.

Copyright Teradata Corporation ©2007-2012


Miamisburg, Ohio, U.S.A.
All Rights Reserved.
Table of Contents

Module 1 - Database Concepts


Logical vs. Physical Model .......................................................................................................... 1-4
Relational Databases .................................................................................................................... 1-6
Relationships ................................................................................................................................ 1-8
Teradata Database and Users ..................................................................................................... 1-10
Storing Objects ........................................................................................................................... 1-12
Locating Objects ........................................................................................................................ 1-14
Module 1: Summary................................................................................................................... 1-16
Module 1: Review Questions ..................................................................................................... 1-18

Module 2 - SQL Assistant Features


A Table with Rows and Columns ................................................................................................ 2-4
Teradata Object Naming Conventions ......................................................................................... 2-6
SQL Assistant .............................................................................................................................. 2-8
Logging On Through SQL Assistant ......................................................................................... 2-10
Windows of SQL Assistant ........................................................................................................ 2-12
Listing Database Objects............................................................................................................ 2-14
HELP TABLE Command .......................................................................................................... 2-16
Other SQL HELP Commands .................................................................................................... 2-18
Setting a Default Database via SQL .......................................................................................... 2-20
Setting a Default Database via SQL Assistant ........................................................................... 2-24
The Teradata “SHOW” Command............................................................................................. 2-26
Other SQL SHOW Commands .................................................................................................. 2-28
Session Information via SELECT .............................................................................................. 2-30
Session Information via HELP SESSION ................................................................................. 2-32
Tools Options – Query Tab ........................................................................................................ 2-34
Tools Options – Code Editor Tab .............................................................................................. 2-36
Tools Options – Answerset Tab ................................................................................................. 2-38
Tools Options – History Tab ...................................................................................................... 2-40
Shortcuts to Typing Object Names ............................................................................................ 2-42
Commenting Lines of SQL ........................................................................................................ 2-44
Logging on to Multiple Systems ................................................................................................ 2-46
The Employee_Sales Database .................................................................................................. 2-48
The Emp_Views Database ......................................................................................................... 2-50
Module 2: Summary .................................................................................................................. 2-52
Module 2: Review Questions..................................................................................................... 2-54
Module 2: Lab Exercises ........................................................................................................... 2-56

Module 3 - Basic SELECT Clauses


SQL: Structured Query Language ............................................................................................... 3-4
Three SQL Classifications ........................................................................................................... 3-6
A Simple SQL SELECT .............................................................................................................. 3-8
Projecting All Columns and All Rows ...................................................................................... 3-10
Aliasing a Column Using AS .................................................................................................... 3-12
Aliasing Mistake? ...................................................................................................................... 3-14
Ordering Rows Using ORDER BY ........................................................................................... 3-16
Other Ordering Options ............................................................................................................. 3-18
Projecting Literal Values ........................................................................................................... 3-20
Using WHERE to Eliminate Rows ............................................................................................ 3-22
The ASCII Collating Sequence and Teradata Mode ................................................................. 3-24
The ASCII Collating Sequence and ANSI Mode ...................................................................... 3-26
Basic Logical Operators ............................................................................................................ 3-28
DISTINCT Option ..................................................................................................................... 3-30
Other Built-In Functions ............................................................................................................ 3-32
Recommended Coding Conventions ......................................................................................... 3-34
Module 3: Summary .................................................................................................................. 3-36
Module 3: Review Questions..................................................................................................... 3-38
Module 3: Lab Exercise ............................................................................................................. 3-40

Module 4 - Logical Operators


Logical Operators Introduction ................................................................................................... 4-4
The “AND” Condition ................................................................................................................. 4-6
The “OR” Condition .................................................................................................................... 4-8
Mixing AND and OR................................................................................................................. 4-10
Parentheses and the Predicate .................................................................................................... 4-12
The IN Operator ......................................................................................................................... 4-14
The NOT IN Operator ............................................................................................................... 4-16
The BETWEEN Operator .......................................................................................................... 4-18
Incorrect Sequencing of the BETWEEN ................................................................................... 4-20
Explaining the Incorrect Sequencing of BETWEEN ................................................................ 4-22
Precedence of Operators ............................................................................................................ 4-24
Module 4: Summary .................................................................................................................. 4-26
Module 4: Review Questions..................................................................................................... 4-28
Module 4: Lab Exercise ............................................................................................................. 4-30
Module 5 - Operators and NULL Processing
NULL ........................................................................................................................................... 5-4
Conditional Expressions and NULL ............................................................................................ 5-6
What Gets Returned ..................................................................................................................... 5-8
NULL and the Business Question .............................................................................................. 5-10
NOT NULL and the Business Question..................................................................................... 5-12
Negating Conditions and Operators ........................................................................................... 5-14
The IN Operator and NULL....................................................................................................... 5-16
The NOT IN Operator and NULL ............................................................................................. 5-18
NULL Literal in an IN-List ........................................................................................................ 5-20
Including NULL to an IN-List ................................................................................................... 5-22
NULL Literal in a NOT IN-List ................................................................................................ 5-24
Three Versions of NOT IN ........................................................................................................ 5-26
Module 5: Review Questions ..................................................................................................... 5-28
Module 5: Summary................................................................................................................... 5-30
Module 5: Lab Exercise ............................................................................................................. 5-32

Module 6 - Data Types and Functionality


Character Data Types ................................................................................................................... 6-4
Character Functionality ................................................................................................................ 6-6
BETWEEN Functionality with CHARACTER ........................................................................... 6-8
Integer Data Types ..................................................................................................................... 6-10
Decimal Data Types ................................................................................................................... 6-12
Float Data Type .......................................................................................................................... 6-14
Byte Data Types ......................................................................................................................... 6-16
Date Data Type .......................................................................................................................... 6-18
ARRAY Data Type .................................................................................................................... 6-20
NUMBER Data Type ................................................................................................................. 6-22
Arithmetic Operators.................................................................................................................. 6-24
Arithmetic and Derived Values .................................................................................................. 6-26
DATE Arithmetic ....................................................................................................................... 6-28
Data Type Conversions Using CAST ........................................................................................ 6-30
Data Type Conversions and Rounding ...................................................................................... 6-32
Concatenating Data Types ......................................................................................................... 6-34
Concatenated Example Results .................................................................................................. 6-36
FORMAT ................................................................................................................................... 6-40
SQL Assistant Methods for FORMAT ...................................................................................... 6-42
SQL Assistant Formatting Examples ......................................................................................... 6-44
Year, Month and Day Formatting Options ................................................................................ 6-46
Module 6: Summary................................................................................................................... 6-48
Module 6: Review Questions ..................................................................................................... 6-50
Module 6: Lab Exercise ............................................................................................................. 6-52

Module 7 - Basic SQL Functions


What are Functions? .................................................................................................................... 7-4
UPPER & LOWER...................................................................................................................... 7-6
UPPER & LOWER for Case Sensitivity ..................................................................................... 7-8
CHARACTER_LENGTH ......................................................................................................... 7-10
TRIM ......................................................................................................................................... 7-14
Trimming Other Than Space ..................................................................................................... 7-16
Trimming Numbers ................................................................................................................... 7-18
POSITION ................................................................................................................................. 7-20
Other Examples Using POSITION ............................................................................................ 7-22
SUBSTRING ............................................................................................................................. 7-24
SUBSTRING and Numbers ....................................................................................................... 7-26
LIKE .......................................................................................................................................... 7-28
LIKE Examples Using “%” ....................................................................................................... 7-30
LIKE Examples Using “_” ........................................................................................................ 7-32
LIKE & ESCAPE ...................................................................................................................... 7-34
CASESPECIFIC ........................................................................................................................ 7-36
EXTRACT ................................................................................................................................. 7-38
ADD_MONTHS ........................................................................................................................ 7-40
The Calendars ............................................................................................................................ 7-42
Calendar Differences ................................................................................................................. 7-44
Additional Calendar Functions .................................................................................................. 7-46
Module 7: Summary .................................................................................................................. 7-50
Module 7: Review Questions..................................................................................................... 7-52
Module 7: Lab Exercise ............................................................................................................. 7-54

Module 8 - Set Operators


What are Set Operators? .............................................................................................................. 8-4
The Three Set Operators .............................................................................................................. 8-6
UNION ........................................................................................................................................ 8-8
UNION ALL .............................................................................................................................. 8-10
INTERSECT .............................................................................................................................. 8-12
EXCEPT (MINUS) .................................................................................................................... 8-14
EXCEPT and ALL ..................................................................................................................... 8-16
Module 8: Summary .................................................................................................................. 8-18
Module 8: Review Questions..................................................................................................... 8-20
Module 8: Lab Exercise ............................................................................................................. 8-22

Module 9 - Subqueries


Subquery Introduction ................................................................................................................. 9-4
Basic Subquery Concepts ............................................................................................................ 9-6
Relating Concepts and Subqueries .............................................................................................. 9-8
Adding Conditions ..................................................................................................................... 9-10
Nesting Subqueries .................................................................................................................... 9-12
Multiple Column Matching ........................................................................................................ 9-14
NULL and NOT IN Subquery.................................................................................................... 9-16
Module 9: Summary................................................................................................................... 9-18
Module 9: Review Questions ..................................................................................................... 9-20
Module 9: Lab Exercise ............................................................................................................. 9-22

Module 10 - Inner Join


Inner Join Concepts .................................................................................................................... 10-4
Inner Join vs. Subquery .............................................................................................................. 10-6
Table Name Qualifications and Aliasing ................................................................................. 10-10
Varied Forms of INNER Join .................................................................................................. 10-12
Many-Table INNER Joins........................................................................................................ 10-14
Varied Forms of Many-Table Inner Joins ................................................................................ 10-16
Using Parentheses to Understand Order .................................................................................. 10-18
Using Parentheses with Other Forms ....................................................................................... 10-20
Self Joins .................................................................................................................................. 10-22
Guaranteeing Uniqueness......................................................................................................... 10-24
IN vs. Inner Join ....................................................................................................................... 10-26
NOT IN vs. Inner Join.............................................................................................................. 10-28
Cross Join ................................................................................................................................. 10-30
Mistakes on Table Aliasing...................................................................................................... 10-32
Mistakes on Column Aliasing .................................................................................................. 10-34
Module 10: Summary............................................................................................................... 10-36
Module 10: Review Questions ................................................................................................. 10-38
Module 10: Lab Exercise ......................................................................................................... 10-40

Module 11 - Outer Join


Outer Join Concepts ................................................................................................................... 11-6
Outer Join Syntax ..................................................................................................................... 11-10
Types of Outer Joins ................................................................................................................ 11-12
Employee as Left Outer ........................................................................................................... 11-14
Nulls and the Inner Table ......................................................................................................... 11-16
Department as Outer ................................................................................................................ 11-18
Outer Joins and WHERE ......................................................................................................... 11-20
Syntax Variations ..................................................................................................................... 11-22
Parts of Speech ......................................................................................................................... 11-24
Three Table Inner Join - Review.............................................................................................. 11-26
Three Table Outer Join ............................................................................................................ 11-28
Multiple Table Variations ........................................................................................................ 11-30
Three Table Outer Join Results ................................................................................................ 11-32
Uncharacteristic Data and Outer Join ...................................................................................... 11-34
Considering Nulls .................................................................................................................... 11-36
Full Outer Join ......................................................................................................................... 11-38
Module 11: Summary .............................................................................................................. 11-40
Module 11: Review Questions................................................................................................. 11-42
Module 11: Lab Exercise ......................................................................................................... 11-44

Module 12 - Correlated Subqueries


Subquery Review ....................................................................................................................... 12-4
Correlated Subquery Terminology ............................................................................................ 12-6
Correlated Subquery Processing ................................................................................................ 12-8
NOT IN vs. NOT EXISTS....................................................................................................... 12-10
NOT IN Review ....................................................................................................................... 12-12
NOT EXISTS vs. NOT IN Logic ........................................................................................... 12-14
Multiple Correlations ............................................................................................................... 12-16
Module 12: Summary .............................................................................................................. 12-18
Module 12: Review Questions................................................................................................. 12-20
Module 12: Lab Exercise ......................................................................................................... 12-22

Module 13 - Aggregation


The Aggregate Functions........................................................................................................... 13-4
Aggregate Functionality ............................................................................................................ 13-6
COUNT(*) ................................................................................................................................. 13-8
Getting Department Sums........................................................................................................ 13-10
Aggregating Groups................................................................................................................. 13-12
Using GROUP BY ................................................................................................................... 13-14
The HAVING Clause .............................................................................................................. 13-16
WHERE Clause Explain .......................................................................................................... 13-18
HAVING on Non-Aggregates ................................................................................................. 13-20
Aggregation and Joins ............................................................................................................. 13-22
Correlated Subqueries and Aggregation .................................................................................. 13-24
A Complex Example................................................................................................................ 13-26
COUNT DISTINCT ................................................................................................................ 13-28
Module 13: Summary .............................................................................................................. 13-30
Module 13: Review Questions................................................................................................. 13-32
Module 13: Lab Exercise ......................................................................................................... 13-34

Module 14 - CASE


CASE ......................................................................................................................................... 14-4
Valued Form (Projection List) ................................................................................................... 14-6
Valued Form and Null ............................................................................................................... 14-8
Searched Form ......................................................................................................................... 14-10
Searched Form (Complex Example) ....................................................................................... 14-12
CASE and Aggregation ........................................................................................................... 14-14
NULLIF Function .................................................................................................................... 14-16
NULLIF for Division ............................................................................................................... 14-18
COALESCE Function .............................................................................................................. 14-20
COALESCE and Multiple Arguments ..................................................................................... 14-22
NULLIF and COALESCE Aggregation Quiz ......................................................................... 14-24
Module 14: Summary............................................................................................................... 14-26
Module 14: Review Questions ................................................................................................. 14-28
Module 14: Lab Exercise ......................................................................................................... 14-30

Module 15 - Permanent and Derived Tables


Data Definition Language .......................................................................................................... 15-4
Set vs. Multiset ........................................................................................................................... 15-6
Column Level Options ............................................................................................................... 15-8
Index Level Options ................................................................................................................. 15-10
Deleting vs. Dropping Tables .................................................................................................. 15-12
Creating and Dropping Secondary Indexes.............................................................................. 15-14
Help Index ................................................................................................................................ 15-16
Using “Real” Tables - “Temporarily” ...................................................................................... 15-18
“Derived” Tables...................................................................................................................... 15-20
Some Derived Table Examples ................................................................................................ 15-22
Complex Derived Table Join ................................................................................................... 15-24
Module 15: Summary............................................................................................................... 15-26
Module 15: Review Questions ................................................................................................. 15-28
Module 15: Lab Exercise ......................................................................................................... 15-30

Module 16 - SAMPLE and RANDOM


SAMPLE - Introduction ............................................................................................................. 16-4
SAMPLE - Syntax ..................................................................................................................... 16-6
Multiple Samples (Number of Rows) ........................................................................................ 16-8
Multiple Samples (Percentage of Rows) .................................................................................. 16-10
SAMPLE WITH REPLACEMENT ........................................................................................ 16-12
WITH REPLACEMENT (Multiple Samples) ......................................................................... 16-14
Other Considerations................................................................................................................ 16-16
Significance of the Order of Operations .................................................................................. 16-18
Using Derived Tables............................................................................................................... 16-20
Stratified Sampling – What is it? ............................................................................................. 16-22
Stratified Sampling (No Replacement) .................................................................................... 16-24
Stratified Sampling (With Replacement) ................................................................................. 16-26
RANDOMIZED ALLOCATION ............................................................................................ 16-28
The RANDOM Function ......................................................................................................... 16-30
RANDOM and Limitations ...................................................................................................... 16-32
Module 16: Summary............................................................................................................... 16-34
Module 16: Review Questions ................................................................................................. 16-36
Module 16: Lab Exercise ......................................................................................................... 16-38
Module 17 - TOP N
TOP N Defined .......................................................................................................................... 17-4
TOP N Limitations .................................................................................................................... 17-6
TOP N Example ......................................................................................................................... 17-8
TOP N WITH TIES ................................................................................................................. 17-10
Without Ties – Same Result .................................................................................................... 17-12
WITH TIES – Same Result ..................................................................................................... 17-14
Getting Bottom Results............................................................................................................ 17-16
Bottom Results – WITH TIES ................................................................................................. 17-18
Unordered Rows ...................................................................................................................... 17-20
The PERCENT Option ............................................................................................................ 17-22
PERCENT Option – WITH TIES............................................................................................ 17-24
PERCENT Option and Millions of Rows ................................................................................ 17-26
Module 17: Summary .............................................................................................................. 17-28
Module 17: Review Questions................................................................................................. 17-30
Module 17: Lab Exercise ......................................................................................................... 17-32

Module 18 - Window Aggregates - Part 1


Window Aggregate Functions ................................................................................................... 18-4
The GROUP COUNT Window ................................................................................................. 18-6
Relating the Result to the Syntax............................................................................................... 18-8
GROUP COUNT and Null ...................................................................................................... 18-10
GROUP COUNT(*) ................................................................................................................ 18-12
Group SUM and AVG Window .............................................................................................. 18-14
Group AVG and QUALIFY .................................................................................................... 18-16
GROUP COUNT and PARTITION ........................................................................................ 18-18
GROUP COUNT, PARTITION, and Null .............................................................................. 18-20
GROUP COUNT and Null Partitions ...................................................................................... 18-22
GROUP SUM and Partition..................................................................................................... 18-24
GROUP SUM and Reordering ................................................................................................ 18-26
SQL ORDER BY to Preserve Order ....................................................................................... 18-28
Window ORDER BY to Preserve Order ................................................................................. 18-30
Qualifying on a Windowed Non-Aggregated .......................................................................... 18-32
WHERE vs. QUALIFY ........................................................................................................... 18-34
Order of Group SUM and Aggregation ................................................................................... 18-36
Module 18: Summary .............................................................................................................. 18-38
Module 18: Review Questions................................................................................................. 18-40
Module 18: Lab Exercise ......................................................................................................... 18-42
Module 19 - Window Aggregates - Part 2
What’s in this Module? .............................................................................................................. 19-4
Cumulative Sum ......................................................................................................................... 19-6
Cumulative Sum with Partitioning ............................................................................................. 19-8
Moving Sum ............................................................................................................................. 19-10
Moving AVG – Not in Range .................................................................................................. 19-12
Moving Difference ................................................................................................................... 19-14
Moving Difference and QUALIFY.......................................................................................... 19-16
Moving Difference and Partition ............................................................................................. 19-18
Remaining Window ................................................................................................................. 19-20
Remaining Window and Partition ............................................................................................ 19-22
RESET WHEN ........................................................................................................................ 19-24
Module 19: Summary............................................................................................................... 19-26
Module 19: Review Questions ................................................................................................. 19-28
Module 19: Lab Exercise ......................................................................................................... 19-30

Module 20 - RANK
Ranking Values .......................................................................................................................... 20-4
QUALIFY With no Tied Values ................................................................................................ 20-6
QUALIFY With Tied Ending Values ........................................................................................ 20-8
Qualifying Without Rank Projection ....................................................................................... 20-10
Bottom Values by ASC Rank .................................................................................................. 20-12
Bottom Values by DESC Rank ................................................................................................ 20-14
RANK and PARTITION.......................................................................................................... 20-16
ROW_NUMBER ..................................................................................................................... 20-18
ROW_NUMBER vs. RANK ................................................................................................... 20-20
ROW_NUMBER and PARTITION ........................................................................................ 20-22
ROW_NUMBER and RESET WHEN .................................................................................... 20-24
Finding Median Values ............................................................................................................ 20-26
Module 20: Summary............................................................................................................... 20-28
Module 20: Review Questions ................................................................................................. 20-30
Module 20: Lab Exercise ......................................................................................................... 20-32

Module 21 - QUANTILE and WIDTH_BUCKET


QUANTILE ............................................................................................................................... 21-4
QUANTILE and QUALIFY ...................................................................................................... 21-6
QUANTILE with no Projected Value ........................................................................................ 21-8
Aggregation and QUANTILE .................................................................................................. 21-10
OLAP vs. Window Aggregates ................................................................................................ 21-12
QUANTILE and GROUP BY.................................................................................................. 21-14
Varying a QUANTILE............................................................................................................. 21-16
Ordering a QUANTILE ........................................................................................................... 21-18
Module 21: Summary............................................................................................................... 21-20
Module 21: Review Questions................................................................................................. 21-22
Module 21: Lab Exercise ......................................................................................................... 21-24

Module 22 - Extended Grouping Functions


Extended Grouping Functions ................................................................................................... 22-4
Aggregation Review .................................................................................................................. 22-6
ROLLUP .................................................................................................................................... 22-8
Two-Level Rollup .................................................................................................................... 22-10
Switching Rollup Column Order ............................................................................................. 22-12
Null Group vs. Total ................................................................................................................ 22-14
The GROUPING Function ...................................................................................................... 22-16
CUBE vs. ROLLUP................................................................................................................. 22-18
CUBE Result ........................................................................................................................... 22-20
CUBE and GROUPING Function ........................................................................................... 22-22
The GROUPING SETS Function ............................................................................................ 22-24
Adding Grand Totals ............................................................................................................... 22-26
Combining Grouping Sets ....................................................................................................... 22-28
Module 22: Summary .............................................................................................................. 22-30
Module 22: Review Questions................................................................................................. 22-32
Module 22: Lab Exercise ......................................................................................................... 22-34

Module 23 - Views
What is a View? ......................................................................................................................... 23-4
Creating and Using Views ......................................................................................................... 23-6
Replacing a View via SQL Assistant......................................................................................... 23-8
Using Views to Rename Columns ........................................................................................... 23-10
Join View ................................................................................................................................. 23-12
Joining Views .......................................................................................................................... 23-14
Using View to Format for SQL Assistant................................................................................ 23-16
Views with Aggregates ............................................................................................................ 23-18
Aggregates and HAVING........................................................................................................ 23-20
Views and TOP N .................................................................................................................... 23-22
Restrictions on Views .............................................................................................................. 23-24
Advantages and Suggestions ................................................................................................... 23-26
Module 23: Review Questions................................................................................................. 23-28
Module 23: Lab Exercise ......................................................................................................... 23-30
Module 24 - Derived Tables and Volatile Tables
Temporary Table Choices .......................................................................................................... 24-4
Another Derived Table Syntax Form ......................................................................................... 24-6
Volatile Table Syntax................................................................................................................. 24-8
Volatile Table Restrictions....................................................................................................... 24-10
HELP and SHOW (Volatile) TABLE ...................................................................................... 24-12
ON COMMIT DELETE ROWS (Implicit Transactions) ........................................................ 24-14
ON COMMIT PRESERVE ROWS (Implicit Transactions) ................................................... 24-16
ON COMMIT DELETE ROWS (Explicit Transactions) ........................................................ 24-18
ON COMMIT PRESERVE ROWS (Explicit Transactions) ................................................... 24-20
Limitations ............................................................................................................................... 24-22
Using INSERT-SELECT ......................................................................................................... 24-24
Inserting a Single Row ............................................................................................................. 24-26
UPDATE .................................................................................................................................. 24-28
Updating with Joins ................................................................................................................. 24-30
DELETE................................................................................................................................... 24-32
Deleting with Joins................................................................................................................... 24-34
Module 24: Summary............................................................................................................... 24-36
Module 24: Review Questions ................................................................................................. 24-38
Module 24: Lab Exercise ......................................................................................................... 24-40

Appendix A: Review Question Solutions


Appendix B: Lab Exercise Solutions
Appendix C: SQL Assistant
Appendix D: The Lab Environment
Module 1

Database Concepts

After completing this module, you should be able to:

• Describe entities and attributes.

• Describe tables, columns and rows.

• Describe databases and users.

Database Concepts Page 1-1




Table of Contents
Logical vs. Physical Model .......................................................................................................... 1-4
Relational Databases .................................................................................................................... 1-6
Relationships ................................................................................................................................ 1-8
Teradata Database and Users ..................................................................................................... 1-10
Storing Objects ........................................................................................................................... 1-12
Locating Objects ........................................................................................................................ 1-14
Module 1: Summary................................................................................................................... 1-16
Module 1: Review Questions ..................................................................................................... 1-18



Logical vs. Physical Model
On the facing page, “enterprise” can be thought of as representing a particular facet or application
of a company. A logical model is a relational way of describing this enterprise such that:
• Entities are relational representations of real-world people, places and things, such as
Employees.
• Attributes are relational representations of the qualities that an entity possesses.
• Each attribute value can be identified as belonging to a set of values called a domain. For
instance, the set of all possible employee numbers may be the integers between 10,000 and
50,000.

A physical model, on the other hand, is a way of defining how these logical representations
might be represented to a “Relational Database Management System” (RDBMS), or, more
simply, a “Database”. In a physical model:
• Entities may be represented as tables. (e.g. an entity name may become a table name)
• Attributes may be represented as columns. (e.g. the attribute “employee number” may
become “Employee_Number”)
• Domains are represented as data types. (e.g. employee number may be stored as an
INTEGER)
• An actual employee can be represented as a row in the table.
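
The mapping above can be sketched in SQL. This is only an illustration, not a definition taken from the course database: the table name, column names, column lengths and the inserted values are assumptions, and the CHECK range simply mirrors the example domain of 10,000 to 50,000 mentioned earlier.

```sql
-- Hypothetical physical table for the logical "Employee" entity.
-- Entity    -> table      (Employee)
-- Attribute -> column     (Employee_Number, Last_Name, First_Name)
-- Domain    -> data type  (INTEGER), narrowed here by a CHECK constraint
CREATE TABLE Employee
( Employee_Number INTEGER NOT NULL
    CHECK (Employee_Number BETWEEN 10000 AND 50000)
, Last_Name       CHAR(20)
, First_Name      VARCHAR(30)
);

-- An actual employee becomes a row in the table.
-- The number 10006 is made up so that it falls inside the example domain.
INSERT INTO Employee (Employee_Number, Last_Name, First_Name)
VALUES (10006, 'Stein', 'John');
```

The data type alone (INTEGER) allows far more values than the logical domain does, which is why the sketch adds a constraint; whether a real design would enforce the domain this way is a design choice.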



Logical vs. Physical Model

In the Logical Model:


• We represent the people, places, things and events of an enterprise as
entities.
• Entities have attributes (e.g. entity of employees has a last name, first
name, employee number etc.)
• Attributes have a domain that represents all possible values for that
attribute.
• Physical rows do not exist.
• You cannot query entities.
• Entities relate to other entities. (e.g. employees work in departments as
described in the department entity)
In the Physical Model:
• Entities can become tables.
• Attributes can become columns.
• Domains can be represented with a data type. (e.g. integer, character, etc.)
• We are concerned with physical (database) access (e.g. Teradata)
• Rows are Physical (accessible).
• Tables relate to other tables.



Relational Databases
To say that a database consists of a collection of logically related tables means that each
physical database table relates to other tables in much the same way (perhaps identically) as entities
in the logical model relate to one another. This will be discussed on the next page.



Relational Databases

• Relational Databases are founded on Set Theory and based on the Logical Model.
• A Relational Database consists of a collection of logically related tables.
• A table is a two dimensional representation of data consisting of columns and
rows.
Table: EMPLOYEE (slide callouts mark the table, one column, and one row)

EMPLOYEE  MANAGER          DEPARTMENT  JOB     LAST      FIRST    HIRE    BIRTH   SALARY
NUMBER    EMPLOYEE NUMBER  NUMBER      CODE    NAME      NAME     DATE    DATE    AMOUNT
1006      1019             301         312101  Stein     John     861015  631015  3945000
1008      1019             301         312102  Kanieski  Carol    870201  680517  3925000
1005      0801             403         431100  Ryan      Loretta  861015  650910  4120000
1004      1003             401         412101  Johnson   Darlene  861015  560423  4630000
1007                                           Villegas  Arnando  870102  470131  5970000
1003      0801             401         411100  Trader    James    860731  570619  4785000

The employee table has:


– 9 columns of data
– 6 rows of data (1 per employee)
– Unknown data values represented by nulls
– Column and row order are inconsequential
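
The last two points can be illustrated with a small query. This is a sketch under assumptions: the underscore-style column names (Employee_Number, Last_Name, Department_Number) are hypothetical renderings of the column headings shown on the slide.

```sql
-- Unknown values are stored as nulls, so they are tested with IS NULL
-- (a comparison such as "= NULL" never evaluates to true).
SELECT Employee_Number
     , Last_Name
FROM   Employee
WHERE  Department_Number IS NULL
-- Row order in the table is inconsequential; if a particular order
-- is wanted in the result, the query has to ask for it.
ORDER BY Last_Name;
```

Likewise, the columns come back in the order the query lists them, regardless of how the table was defined.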



Relationships
To say that a database consists of a collection of logically related tables means that each
physical database table relates to other tables in much the same way (perhaps identically) as entities
in the logical model relate to one another. Examples are shown on the facing page.



Relationships

Entities can relate to themselves or other entities based upon the definition of primary keys and
foreign keys.

These are logical, based upon how they relate in the real world.

Primary keys are unique attributes of entities and cannot be null.

Foreign keys are not usually unique and may be null.

SQL joins are based upon relationships among tables.

These physical relationships may or may not be based upon PK and FK definitions.

DEPARTMENT

DEPARTMENT  DEPARTMENT                BUDGET    MANAGER
NUMBER      NAME                      AMOUNT    EMPLOYEE NUMBER
PK                                              FK
501         marketing sales           80050000  1017
301         research and development  46560000  1019
302         product planning          22600000  1016
403         education                 93200000  1005
402         software support          30800000  1011
401         customer support          98230000  1003
201         technical operations      29380000  1025

EMPLOYEE

EMPLOYEE  MANAGER          DEPARTMENT  JOB     LAST      FIRST    HIRE    BIRTH   SALARY
NUMBER    EMPLOYEE NUMBER  NUMBER      CODE    NAME      NAME     DATE    DATE    AMOUNT
PK        FK               FK          FK
1006      1019             301         312101  Stein     John     861015  631015  3945000
1008      1019             301         312102  Kanieski  Carol    870201  680517  3925000
1005      0801             403         431100  Ryan      Loretta  861015  650910  4120000
1004      1003             401         412101  Johnson   Darlene  861015  560423  4630000
1007                                   432101  Villegas  Arnando  870102  470131  5970000
1003      0801             401         411100  Trader    James    860731  570619  4785000
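
As a preview of how a join follows such a relationship, the sketch below matches each employee to a department through the department number. The underscore-style column names and the table aliases are assumptions made for illustration; joins themselves are covered later in the course.

```sql
-- Employee.Department_Number plays the role of the foreign key;
-- Department.Department_Number plays the role of the primary key.
SELECT e.Last_Name
     , e.First_Name
     , d.Department_Name
FROM   Employee   e
INNER JOIN Department d
       ON e.Department_Number = d.Department_Number;
```

An employee whose department number is null matches no department row, so an inner join like this one would leave that employee out of the result.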



Teradata Database and Users
When the facing page states that Teradata objects are physical things, this may be interpreted as
being opposed to logical. They are physical in the sense that they progress from “abstract” to
“representative,” or from “intangible” to “tangible”.

These objects must be defined to exist somewhere on the database, so they are created “inside,”
or said to be “owned” by, a database. Since many of these objects require space, the
“repository” called a “database” is defined to have an amount of space at create time. Space is a
concept deferred for a different time and place.



Teradata Databases and Users

The following are Teradata database objects.


They are physical things.

A Teradata Database is:


• A database created with the CREATE DATABASE command.
• Defined with an amount of space (in bytes)
• A repository for objects (e.g. tables, views, procedures, etc.)

A Teradata User is:


• A database created with the CREATE USER command.
• Defined with an amount of space (in bytes)
• A repository for objects (e.g. tables, views, procedures, etc.)
• Defined to have a password

That is: A USER is a database with a password that allows someone to
“logon” to Teradata and submit queries to it to retrieve results.
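
A minimal sketch of the two commands follows. Every name and number in it is hypothetical (HR_Data, hr_analyst, the owning database Sysdba, the space amounts, and the password), and creating databases or users normally requires privileges that business users may not have.

```sql
-- A database: a repository defined with an amount of space (in bytes).
CREATE DATABASE HR_Data FROM Sysdba
AS PERM = 10000000;

-- A user: the same kind of repository, but also defined with a password,
-- which is what allows someone to log on with it.
CREATE USER hr_analyst FROM Sysdba
AS PERM = 1000000
 , PASSWORD = ChangeMe123;
```

The only structural difference between the two statements is the PASSWORD option, which matches the slide's point that a user is a database with a password.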



Storing Objects
As stated earlier, objects like tables, views, macros, etc. are stored in a database's space. This
means that repository objects, when created, do not require a definition of space. Object names
within a repository must also be unique.



Storing Objects

These two tables require space to store rows.

They can all be in (owned by) the same (non-user) database, or in different (non-user)
databases.

They could be in a user’s database, or in another user’s database.

DEPARTMENT

DEPARTMENT  DEPARTMENT                BUDGET    MANAGER
NUMBER      NAME                      AMOUNT    EMPLOYEE NUMBER
PK                                              FK
501         marketing sales           80050000  1017
301         research and development  46560000  1019
302         product planning          22600000  1016
403         education                 93200000  1005
402         software support          30800000  1011
401         customer support          98230000  1003
201         technical operations      29380000  1025

EMPLOYEE

EMPLOYEE  MANAGER          DEPARTMENT  JOB     LAST      FIRST    HIRE    BIRTH   SALARY
NUMBER    EMPLOYEE NUMBER  NUMBER      CODE    NAME      NAME     DATE    DATE    AMOUNT
PK        FK               FK          FK
1006      1019             301         312101  Stein     John     861015  631015  3945000
1008      1019             301         312102  Kanieski  Carol    870201  680517  3925000
1005      0801             403         431100  Ryan      Loretta  861015  650910  4120000
1004      1003             401         412101  Johnson   Darlene  861015  560423  4630000
1007                                   432101  Villegas  Arnando  870102  470131  5970000
1003      0801             401         411100  Trader    James    860731  570619  4785000



Locating Objects
Later on, when we learn to write SQL, we will see how important a role the naming convention
plays in locating our objects.

Page 1-14 Database Concepts


Locating Objects
The queries that we write must, somehow, let Teradata know:

• Which database(s) “own” the tables or objects we reference in our queries (i.e., the locations of the objects).
• The names of the participating objects.
• The names of the participating columns.

Note that periods are used to separate names.

Within SQL, the convention for qualifying is databasename.tablename.columnname

Without a database qualifier, Teradata relies on a default setting.

DEPARTMENT (Table)

DEPARTMENT  DEPARTMENT                BUDGET    MANAGER EMPLOYEE
NUMBER      NAME                      AMOUNT    NUMBER
PK                                              FK
501         marketing sales           80050000  1017
301         research and development  46560000  1019
302         product planning          22600000  1016
403         education                 93200000  1005
402         software support          30800000  1011
401         customer support          98230000  1003
201         technical operations      29380000  1025

EMPLOYEE (View)

EMPLOYEE  MANAGER EMPLOYEE  DEPARTMENT  JOB     LAST      FIRST    HIRE    BIRTH   SALARY
NUMBER    NUMBER            NUMBER      CODE    NAME      NAME     DATE    DATE    AMOUNT
1006      1019              301         312101  Stein     John     861015  631015  3945000
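
As a sketch of the qualifying convention, the first request below names the database, the table, and the column explicitly, while the second leaves the database out and relies on the default setting. Both assume the Employee_Sales database and its employee table with a last_name column, which are introduced with the labs in Module 2.

```sql
-- Fully qualified: databasename.tablename.columnname
SELECT Employee_Sales.employee.last_name
FROM   Employee_Sales.employee;

-- Unqualified: Teradata looks for "employee" in the session's default database
SELECT last_name
FROM   employee;
```

The second form succeeds only when the default database actually owns an object named employee.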

Database Concepts Page 1-15


Module 1: Summary
Check your understanding of the concepts discussed in this module by completing the review
questions as directed by your instructor.

Page 1-16 Database Concepts


Module 1: Summary

• Relational modeling is a way of modeling the people, places and things in an enterprise.

• Physical databases contain accessible objects that are based upon the logical model.

• Teradata Databases are repositories that own space and hold objects.

• Teradata Users are databases with passwords.

• SQL queries need to tell Teradata in which databases the objects can be found.

Database Concepts Page 1-17


Module 1: Review Questions
Check your understanding of the concepts discussed in this module by completing the review
questions as directed by your instructor.

Page 1-18 Database Concepts


Module 1: Review Questions

True or False:

1. Logical models do not contain data.
   True
2. You can access rows from an entity.
   False
3. The physical model is always a direct reflection of a logical model.
   False
4. Entity attributes may become table columns.
   True
5. Data types are typically based upon domains.
   True
6. The Teradata database can always determine which database owns which object without any help from the query.
   False

Database Concepts Page 1-19




Module 2

SQL Assistant Features

After completing this module, you should be able to:

• Log on and use SQL Assistant to submit SQL to Teradata.
• Use the Explorer Tree GUI to drill down into database object information.
• Apply basic settings to SQL Assistant for tailoring its usage.
• Compare and contrast certain SQL Assistant features with SQL alternatives such as HELP, SHOW, and session/system variables.
• Identify databases, tables, and views referenced in the labs of this course.

SQL Assistant Features Page 2-1




Table of Contents
A Table with Rows and Columns ................................................................................................ 2-4
Teradata Object Naming Conventions ......................................................................................... 2-6
SQL Assistant .............................................................................................................................. 2-8
Logging On Through SQL Assistant ......................................................................................... 2-10
Windows of SQL Assistant ........................................................................................................ 2-12
Listing Database Objects............................................................................................................ 2-14
HELP TABLE Command .......................................................................................................... 2-16
Other SQL HELP Commands .................................................................................................... 2-18
Setting a Default Database via SQL .......................................................................................... 2-20
Setting a Default Database via SQL (continued) ....................................................................... 2-22
Setting a Default Database via SQL Assistant ........................................................................... 2-24
The Teradata “SHOW” Command............................................................................................. 2-26
Other SQL SHOW Commands .................................................................................................. 2-28
Session Information via SELECT .............................................................................................. 2-30
Session Information via HELP SESSION ................................................................................. 2-32
Tools Options – Query Tab ........................................................................................................ 2-34
Tools Options – Code Editor Tab .............................................................................................. 2-36
Tools Options – Answerset Tab ................................................................................................. 2-38
Tools Options – History Tab ...................................................................................................... 2-40
Shortcuts to Typing Object Names ............................................................................................ 2-42
Commenting Lines of SQL ........................................................................................................ 2-44
Logging on to Multiple Systems ................................................................................................ 2-46
The Employee_Sales Database .................................................................................................. 2-48
The Emp_Views Database ......................................................................................................... 2-50
Module 2: Summary................................................................................................................... 2-52
Module 2: Review Questions ..................................................................................................... 2-54
Module 2: Lab Exercises............................................................................................................ 2-56
Module 2: Lab Exercises (continued) ........................................................................................ 2-58

SQL Assistant Features Page 2-3


A Table with Rows and Columns
In keeping with the concepts discussed on the previous page, the facing page contrasts object
names, from a logical model, to names used in physical tables.

Page 2-4 SQL Assistant Features


A Table with Rows and Columns

EMPLOYEE

EMP   MGR EMP  DEPT  JOB     LAST      FIRST    HIRE    BIRTH   SAL
NUM   NUM      NUM   CODE    NAME      NAME     DATE    DATE    AMT
PK    FK       FK    FK
1006  1019     301   312101  Stein     John     761015  531015  2945000
1008  1019     301   312102  Kanieski  Carol    770201  580517  2925000
1005  0801     403   431100  Ryan      Loretta  761015  550910  3120000
1004  1003     401   412101  Johnson   Darlene  761015  460423  3630000
1007  1005     403   432101  Villegas  Arnando  770102  370131  4970000
1003  0801     401   411100  Trader    James    760731  470619  3785000

EMP NUM is named employee_number
MGR EMP NUM is named manager_employee_number
DEPT NUM is named department_number
JOB CODE is named job_code
LAST NAME is named last_name
FIRST NAME is named first_name
HIRE DATE is named hire_date
BIRTH DATE is named birthdate
SAL AMT is named salary_amount
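
With these physical names, a request against the table might look like the following sketch. The Employee_Sales database qualifier is taken from later pages of this module, and the department value 301 comes from the sample rows above.

```sql
-- Columns are referenced by their physical names, not the logical-model labels
SELECT employee_number
     , last_name
     , first_name
     , salary_amount
FROM   Employee_Sales.employee
WHERE  department_number = 301;
```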

SQL Assistant Features Page 2-5


Teradata Object Naming Conventions
The naming conventions on the facing page are for Teradata mode processing only and not ANSI
standard.

The Teradata Database offers two transaction protocols (Teradata mode and ANSI mode) for submitting SQL.
Transaction protocols can be an important issue and are discussed more fully in the
“Application Design and Development” class; they are beyond the scope of an
introductory course on SQL.

The ANSI standard for naming conventions is:


• Uppercase Characters only
• Digits 0-9
• Underscore (“_”)
• Names may not begin with a numeric digit
• Maximum of 18 characters

Using Double-Quotes When Naming Objects


You may use non-standard characters in any object name by enclosing the name in double-quotes
(“ ”). The double-quotes must then be used whenever the object is referenced. Keywords may
also be used as names, and follow the same rules as names containing non-standard characters.
Double-quotes can also be placed around valid names, but in that case they are not required
when the name is referenced.

Note: In SQL Assistant, the double-quotes do not appear in the Explorer Tree.

Examples using double-quotes:

• “Last Name” (space is invalid so double-quotes are required)
• “Last_Name” (valid technique, but need not be used during its reference)
• “Table” (since “Table” is a keyword, double-quotes are required)
• “999” (a name may not begin with a digit, so double-quotes are required)
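
The examples above can be sketched in SQL as follows. The table name name_demo and the data types are made up for illustration and are not part of the course databases. Note that SQL expects the plain straight double-quote character, not the typographic quotes printed in this guide.

```sql
-- Hypothetical table for illustration only
CREATE TABLE name_demo
( "Last Name"  VARCHAR(30)   -- contains a space: quotes always required
, "Last_Name"  VARCHAR(30)   -- valid name: quotes optional
, "Table"      INTEGER       -- keyword: quotes always required
, "999"        INTEGER       -- begins with a digit: quotes always required
);

SELECT "Last Name"
     , Last_Name      -- no quotes needed when referencing a valid name
     , "Table"
     , "999"
FROM   name_demo;
```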

Page 2-6 SQL Assistant Features


Teradata Object Naming Conventions

These are the rules used for naming objects in the Teradata database.

• Object names are composed of the following characters:


A - Z (Upper or lowercase)
0-9
#, $, _

• Object names are limited to 30 characters.


• Object names may not begin with a numeric digit.

Examples of named objects.

• Account_Table
• Financials_2001_DB
• Sales_$_Column
• #_of_Years_Column
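
As a brief sketch, the example names above can be used without double-quotes because every character in them is allowed by the Teradata rules. The data types below are assumptions chosen only to make the statement complete.

```sql
-- Names come from the examples above; the data types are assumed
CREATE TABLE Account_Table
( Sales_$_Column     DECIMAL(10,2)
, #_of_Years_Column  INTEGER
);
```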

SQL Assistant Features Page 2-7


SQL Assistant
What is Teradata SQL Assistant?
Teradata SQL Assistant is an information discovery tool designed for Windows XP and Windows 2000.
Teradata SQL Assistant retrieves data from any ODBC-compliant database server. The data can then be
manipulated and stored on the desktop PC.

How Teradata SQL Assistant Works


Teradata SQL Assistant combines the data retrieved from ODBC databases with desktop applications
such as Excel to create consolidated reports, or to analyze the merged data. Teradata SQL Assistant
records all SQL activity, complete with source identification, timings, row counts and notes. This is
especially useful in data mining because the historical record can be used to build scripts from the SQL
that produced positive results.

Teradata SQL Assistant’s Key Features


The key features of Teradata SQL Assistant are:
• Create reports from any RDBMS that provides an ODBC interface
• Export data from the database to a file on a PC
• Import data from a PC file directly to the database
• Use an import file to create many similar reports (query results or Answer sets). For example,
display the DDL (SQL) that was used to create a list of tables.
• Send queries to any ODBC database or the same query to many different databases
• Create a historical record of the submitted SQL with timings and status information such as
success or failure
• Use SQL syntax examples to tailor statements
• Use the Database Explorer Tree to easily view database objects
• Use a procedure builder that provides a list of valid statements for building the logic of a stored
procedure
• Limit data returned to prevent runaway queries

Teradata SQL is an ANSI compliant product. Teradata has its own extensions to the language,
as do most vendors. Teradata SQL is fully certified at the SQL92 Entry level, with some
intermediate, some full and some SQL-99 Core features also implemented.

Page 2-8 SQL Assistant Features


SQL Assistant

SQL Assistant is a Windows-based front-end tool for submitting SQL to Teradata.

SQL Assistant has the following properties:


• Is Windows-based.

• Is ODBC-compliant.
(Can be used to access any ODBC-compliant Database.)
• Saves information about previous query result sets.
• Permits retrieval of previously used queries.
• May be used to import or export data.
(Not covered by this course.)
• Allows many options for tuning how it works.

SQL Assistant Features Page 2-9


Logging On Through SQL Assistant
SQL Assistant logons are accomplished by clicking on the green (Connect) plug at the upper
right-hand corner of the tool.

Prior to logging on to the database, a “Data Source” must have been defined. One method for
setting up a data source is the following, although many other ways exist.

Under the “Tools” pull-down menu:

• Choose “Define Data Source”.
• Under USER DSN or SYSTEM DSN, click “ADD”.
• Choose “Teradata” (at the bottom of the list).
• In the next screen:
o Type in a descriptive name of your choice.
o Type in a description for it (optional).
o Type in a list of IP addresses for the PEs of the database.
o Type in a user name (may be typed over during logon).
o It is recommended that you not type in a password, since a stored password would let
anyone using this data source log on as your user without having to enter it.
• Click OK.

The example on the facing page assumes that a data source has already been defined. In this
case, all that is required is to double-click the data source, or click it and choose “OK”, then type
in a password.

Page 2-10 SQL Assistant Features


Logging On Through SQL Assistant

The sequence for logging on to a Teradata User (a repository for objects) varies depending on how SQL Assistant is set up.
In our example, a user name appears, requiring only a password.

To connect (log on) to Teradata using SQL Assistant:

1. Click on the “Connect” plug.
2. Click on “File” or “Machine” Data Source.
3. Double-click the appropriate system.
4. Fill in the logon information.
5. Click “OK”.

(Screenshot callout: Connect Plug)

SQL Assistant Features Page 2-11


Windows of SQL Assistant
There are 4 windows for SQL Assistant:

1. Explorer Tree:
A GUI into Teradata’s Data Dictionary. This window does not appear until after a
successful logon to Teradata, and then only if it was left open when the previous session
logged off.

This window can be opened through either a toolbar button or the “View” pull-down menu.

2. Query Window:
The window from which one submits SQL requests to Teradata. By right-clicking this
window, one can perform many operations, such as changing the type and size of fonts or
inserting a SQL template for certain SQL operations.

3. History Window:
Provides a history of previous SQL requests sent to Teradata. By clicking on a request
text stored in history, one can retrieve that request back into the Query Window for
resubmission.

The number of requests that can be kept is set under the “TOOLS > OPTIONS >
History” menu.

One may also find it handy to double-click the “Notes” box and enter a note for personal
reference.

4. Results Window:
Where results are returned. There are a great many ways one may tailor the viewing of
results. Many can be accomplished through SQL, and many more through SQL
Assistant itself.

For detailed information on SQL Assistant, refer to the user’s guide for
“Teradata SQL Assistant for Microsoft Windows” (B035-2430-067A).

Page 2-12 SQL Assistant Features


Windows of SQL Assistant
SQL Assistant is a Windows-based GUI to Teradata’s data dictionary.

You can navigate through it using standard Windows features like:

- Right-clicking on areas
- Double-clicking on fields
- Clicking on drop-downs
- Clicking and dragging from the Explorer Tree to the Query Window

Our discussion of this tool will be introductory.

Feel free to experiment with using it during labs!

(Screenshot callouts. Toolbar: Connect and Disconnect Plugs, Clear Query, Explorer Tree, Submit Query. Windows: Explorer Tree Window, Query Window, Results Window, History Window.)

SQL Assistant Features Page 2-13


Listing Database Objects
Before we begin writing SQL, it would be useful to get familiar with the databases, tables, and
views we will be referencing. This page is the first in a series of pages that familiarize the
student with these objects.

On the following page we begin by showing how one can use either the tool’s Explorer Tree or
SQL to retrieve names (table names, view names, etc.) for the objects residing in a database or
user. Recall that both Databases and Users are repositories for tables and other objects; the
difference is that a user can also log on. To better understand this difference, compare the
minimum syntax needed to create each:

To create a database:
CREATE DATABASE abc AS PERM = 1000000;
To create a user:
CREATE USER abc AS PERM = 1000000, PASSWORD = lucky;

Note that a password is required for a user and not allowed for a database. Also note that each
may be assigned an amount of “permanent” space for the objects it may own.

The Explorer Tree lists object names by category, whereas the HELP DATABASE command
lists them in alphabetical order regardless of category.

Note that, in the HELP command, DATABASE and USER are interchangeable. Both return the
same information. Since it is SQL, the result appears in the Results Window.

Page 2-14 SQL Assistant Features


Listing Database Objects

SQL “HELP” commands may be used to retrieve information on objects.

You can also obtain object information from the Explorer Tree.

To submit SQL, click the Submit Query button (marked in the screenshot) or press F5.

Contrast the Explorer Tree information on Employee_Sales with the result of the SQL request, which can be either of the 2 following commands:

HELP DATABASE Employee_Sales;
HELP USER Employee_Sales;

SQL Assistant Features Page 2-15


HELP TABLE Command
The facing page shows 2 methods for retrieving column information on tables. The information
retrieved via the SQL request is more detailed than that retrieved via the Explorer Tree. And
because it is SQL, the result appears in the Results Window.

The Explorer Tree requires only pointing and clicking – as is expected from a GUI tool. Also,
more information can be viewed by either moving the cursor arrow onto a column, or by
dragging the edge of the Explorer Tree window to the right.

Page 2-16 SQL Assistant Features


HELP TABLE Command

The “HELP TABLE” command returns information about a table, which is a database or user object.
This query gives Teradata the “databasename.tablename” to identify the object.

You can obtain information from the Explorer Tree by clicking on objects.

(Screenshot callouts: Explorer Tree information on table Employee; SQL Request; cursor over column Last_Name shows hidden information; result of HELP TABLE command on table Employee in the query window.)
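
The request in the screenshot is not reproduced in the text. Based on the description above, it would be of the following form, assuming the Employee table in the Employee_Sales database.

```sql
-- databasename.tablename identifies the table regardless of the default database
HELP TABLE Employee_Sales.Employee;
```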

SQL Assistant Features Page 2-17


Other SQL HELP Commands
The HELP commands shown on the facing page are ways to retrieve information on other
objects in a database or user. An important point to remember when retrieving any information
using SQL is which database owns the object.

In some examples the database name is specified in the SQL request; in other cases it is not.
The reason for this has to do with something called your “default” database. Your default is shown
in SQL Assistant on a banner near the top of the tool. If the object you are referencing exists
in this database, there is no need to qualify it in your SQL. If it does not exist in this
database, then it must be qualified with its database name in the SQL request.

Default databases are discussed on the next slide.

Page 2-18 SQL Assistant Features


Other SQL HELP Commands

The 3 different levels for referencing (qualifying) objects are: Database, Table, and Column.
Example: “databasename.tablename.columnname”

Discuss the various levels of qualification to the right of the HELP command.

Databases and Users:

HELP DATABASE Employee_Sales;
HELP USER Dave_Jones;

Database Objects:

HELP TABLE Employee;
HELP VIEW Emp_Views.emp;
HELP MACRO payroll_3;
HELP COLUMN Employee_Sales.employee.last_name;
HELP INDEX employee;
HELP STATISTICS employee;
HELP CONSTRAINT employee.over_21;
HELP VOLATILE TABLE vol_tab1;
HELP JOIN INDEX Employee_Sales.Employee;
HELP HASH INDEX Employee_Sales.Department;
HELP TRIGGER trigger1;
HELP PROCEDURE proc1;
HELP FUNCTION func1;

SQL Assistant Features Page 2-19


Setting a Default Database via SQL

Page 2-20 SQL Assistant Features


Setting a Default Database via SQL

Each logon is provided with a default database name.

This gives users the option of not having to continually specify the same
(long) database name in their SQL queries.

Only one default may be set at any time.

Users have the ability to change this default database value via SQL.

Only database names can be defaulted.

A default database can be automatically set at a user level (at logon).

A default database may also be set in SQL Assistant as a logon option.

SQL Assistant Features Page 2-21


Setting a Default Database via SQL (continued)
As seen on the previous slide, knowing which database is your default is an important concept
if you don’t want to spend a substantial amount of time qualifying databases and tables.

To avoid always having to qualify objects, you can set a single default database; objects owned
by that database can then be referenced without the database name. You may have only one
default set at any time, and this default may be changed using SQL at any time during your
session. Once set, the new default remains until it is changed again, or until you log off. Each
time you log on to Teradata, your default database reverts to the one defined for your user.
This default need not be your own user name; it can be set to something else when your user
is created.

If you recall, the minimum syntax required to create a user is:

CREATE USER abc AS PERM = 1000000, PASSWORD = lucky;

The default database for this user will be the user itself (“username” is the default). It could
have been set to another database had the syntax below been used to create “abc”. This would
be your default database if your user was created from our standardized script, or it may have
been changed by your DBA.

CREATE USER abc AS
PERM = 1000000,
PASSWORD = lucky,
DEFAULT DATABASE = Emp_Views;

On the facing page, the new default database will be Employee_Sales.
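
The request itself appears only in the screenshot. A minimal sketch of changing and checking the default for the current session follows. The last statement shows how the logon default could be changed permanently for the example user abc, which normally requires the appropriate privileges or your DBA.

```sql
-- Change the default database for the rest of this session
DATABASE Employee_Sales;

-- Confirm the current default
SELECT DATABASE;

-- Change the default that user abc receives at every logon
MODIFY USER abc AS DEFAULT DATABASE = Emp_Views;
```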

Page 2-22 SQL Assistant Features


Setting a Default Database via SQL
(continued)
A default database assignment remains for a session until it is changed.
The user-defined logon default is reacquired at the next logon.
Mouse over the top of the Explorer Tree, as shown below, to display these defaults.

(Screenshot callouts: logged on as user DLM; default database is DLM; the request shown changes the default database to Employee_Sales; clicking on “SQL Statement” in History will recall it to the Query Window later.)

After issuing the database default request shown, which database will be checked for the following?

HELP TABLE Employee;

SQL Assistant Features Page 2-23


Setting a Default Database via SQL Assistant
Here we see that a default database can be established through SQL Assistant. A default
database set in this tool overrides the default database defined for your user at logon, so you
can choose a different database. Regardless, any default database may be changed using SQL as
shown previously.

A default database can either be typed in manually at each logon or be stored more permanently
(though it can still be changed later) through the TOOLS > DEFINE DATA SOURCE path, so
that you don’t need to keep typing it in for each logon.

Page 2-24 SQL Assistant Features


Setting a Default Database via SQL
Assistant
A default database can be assigned through SQL Assistant like this.

You can override the default database of your logon from within SQL Assistant by setting it here.

The pathway to this screen is:

TOOLS > DEFINE ODBC DATA SOURCE > SYSTEM DSN > CONFIGURE

This will override your user-defined default database.

SQL Assistant Features Page 2-25


The Teradata “SHOW” Command
The facing page contrasts the result of the SHOW command with similar information provided
by the SQL Assistant Explorer Tree.

Note that the result of the SHOW command appears in the Results Window, whereas the
“right-click” > “Show Definition” method places its result in the Query Window (not shown).
Except for the location of the result, they display the same information. The nice thing about
the Explorer Tree method is that one can easily edit the definition and create a different table
from an existing one without having to copy from the Results Window to the Query Window.

The definition shown includes syntax that was supplied by default at creation time rather than
typed explicitly. For example, the following were likely not included when the table was
initially created:

• Set (do not store duplicate rows in this table; versus multiset, which allows duplicate rows)
• (Permanent) Journaling references
• Fallback
• Character Set Latin
• Not Case Specific
• Checksum

Each of these settings has a default, and the values shown were likely obtained from those defaults.
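
To make the point about defaults concrete, the sketch below creates one hypothetical table in short form and another with the listed options written out. Which values a system applies by default depends on its configuration and session mode, so treat the long form as an illustration of the kind of syntax SHOW TABLE adds, not as guaranteed output. The table names show_demo and show_demo_explicit are made up.

```sql
-- Short form, as a user would typically type it
CREATE TABLE show_demo
( id     INTEGER
, label  VARCHAR(20)
);

-- Display the stored definition, including options filled in by default
SHOW TABLE show_demo;

-- A similar table with the options spelled out explicitly
CREATE SET TABLE show_demo_explicit
, NO FALLBACK
, NO BEFORE JOURNAL
, NO AFTER JOURNAL
, CHECKSUM = DEFAULT
( id     INTEGER
, label  VARCHAR(20) CHARACTER SET LATIN NOT CASESPECIFIC
)
PRIMARY INDEX (id);
```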

Page 2-26 SQL Assistant Features


The Teradata “SHOW” Command

The “SHOW” command may be used to provide the definition of an object.

In this example, we are getting the definition for a table in database Employee_Sales.

Contrast the SQL “SHOW TABLE” (on the right) with the Explorer Tree technique below.

(Screenshot callouts: result of submitting the SHOW TABLE; Explorer Tree: Right-Click > Show Definition.)

SQL Assistant Features Page 2-27


Other SQL SHOW Commands
Other SHOW commands are shown on the facing page. Although the HELP commands are
more plentiful than are those for the SHOW command, the SHOW command displays the syntax
used for creating the object, and, thus, can provide much more information about the object. The
definition shown includes any syntax acquired during the initial create by default and not by
explicitly typing the syntax itself.

Page 2-28 SQL Assistant Features


Other SQL SHOW Commands

Some other SQL SHOW commands.

Discuss the various levels of qualification shown on the right.

Database Objects:

SHOW TABLE Employee;
SHOW VIEW Emp_Views.emp;
SHOW MACRO payroll_3;
SHOW TRIGGER Employee_Sales.trig1;
SHOW PROCEDURE proc1;
SHOW FUNCTION func1;

For a complete list of objects referenced by the SHOW command you can issue the following:

HELP 'SQL SHOW';

SQL Assistant Features Page 2-29


Session Information via SELECT
The facing page shows how you can obtain information about your current session by using a
SELECT statement to “project” values found in session variables. These are sometimes referred
to as “built-in functions”. Regardless, they are keywords. The more important of these values
are likely to be those returned from USER and DATABASE. Those for ACCOUNT and SESSION
are likely to be more important to DBAs, since account information is tied closely to your
request priority, while session (more specifically, “session number”) can be joined to dictionary
tables to provide more information about your user. Joins will be discussed in a later module.

A brief description follows:

• USER: Who you are logged on as.
• DATABASE: Your active or current database default.
• ACCOUNT: The account information associated with your logon.
• SESSION: The session number for your current session.

Page 2-30 SQL Assistant Features


Session Information via SELECT

One method for retrieving information about your current session is to reference the
session variables below. (These are often referred to as “Built-in Functions”)

The SELECT is the main focus of the next module.

(Screenshot column callouts: Logon User, Session Number, Database Default, Logon Account String)
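
The request in the screenshot is not reproduced in the text. A request along the following lines returns the four values in the order of the column callouts above: logon user, session number, default database, and account string.

```sql
-- Teradata allows a SELECT of session keywords without a FROM clause
SELECT USER, SESSION, DATABASE, ACCOUNT;
```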

SQL Assistant Features Page 2-31


Session Information via HELP SESSION
The HELP SESSION command may be used to provide more information about a session than
what we’ve seen previously using session variables. Included in this single-row result
are the values shown on the previous page, namely:
• USER: Who you are logged on as.
• ACCOUNT: Your account information.
• DATABASE: Your current default database.

Other valuable information includes:


• LOGON DATE
• LOGON TIME
• TRANSACTION SEMANTICS

For a complete list of values, feel free to issue this command during a lab. Much of the
information displayed from using this statement is described in other courses such as Advanced
SQL, Physical Implementation, and Application Design and Development.

Page 2-32 SQL Assistant Features


Session Information via HELP SESSION

Another method for listing session information is to issue the request shown below.
Note that the scroll bar indicates more information is available for viewing.
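
The request referred to is the single statement below; it takes no object name.

```sql
-- Returns a single row describing the current session
HELP SESSION;
```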

SQL Assistant Features Page 2-33


Tools Options – Query Tab
Some of the more relevant options are described below.

Allowing Multiple Queries


Teradata SQL Assistant allows you to connect to multiple data sources. Each connection opens a
separate Query window. You can also have multiple queries within a Query window. Each
query opens a new tab in the Query window.

Splitting the Query Window Into Two Windows


Split the query window into two independent scrolling windows to view two different parts of
the query at once.

To split the Query Window into two scrolling windows


On the right side of the Query Window, drag the bar at the top of the vertical scroll bar
downwards.

Parameterized Queries
Queries may contain Named Parameters, which makes it easy to reuse a query because all that
changes are the data values (for example, in a Where clause). Named Parameters function like
variables. Enter the value for a named parameter once. If it is used in multiple places within the
query that same value will be used in all places.

Note: The values entered for named parameters will be saved to the Notes column of History
for future reference. Named Parameters are indicated by a “?” immediately followed by a name.
The name can consist of alphanumeric characters plus the “_” symbol. When a parameterized
query is executed, a prompt appears for each parameter before the query is submitted.

For example, if the following query is submitted, a prompt appears to enter a value for
NameStart:

Select * From PhoneBook Where LastName Like '%?NameStart%'
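
To make the substitution concrete: if the value Sm is entered at the NameStart prompt, the request sent to the database would be expected to read as in the first statement below. The second statement is a hypothetical request against the course's employee table that uses one parameter in two places, so the value is entered only once. The parameter name MinAmount and the alias amount_over are made up for illustration.

```sql
-- What is submitted after entering Sm for ?NameStart
SELECT * FROM PhoneBook WHERE LastName LIKE '%Sm%';

-- One named parameter used twice; the same entered value is used in both places
SELECT last_name
     , salary_amount
     , salary_amount - ?MinAmount AS amount_over
FROM   Employee_Sales.employee
WHERE  salary_amount >= ?MinAmount;
```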

Zooming the Query Window


You can zoom the contents of a window in and out by using the Ctrl button and mouse wheel.

Page 2-34 SQL Assistant Features


Tools Options – Query Tab

TOOLS > OPTIONS > QUERY

Check this to allow for submitting only the queries that you have highlighted.

SQL Assistant Features Page 2-35


Tools Options – Code Editor Tab
The query formatting feature adds line breaks and indentation before certain keywords, making
SQL that comes from automatic code generators or other sources more readable.
To format a query:
1. Ensure a statement exists in the Query Window.
2. Do one of the following:
• Right-click in the Query Window, then click Format Query
• Press Ctrl+Q
• Select Edit > Format Query
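
As an illustration of the effect, a request that arrives on one line might be reformatted roughly as shown below. The exact line breaks and indentation depend on the tool and its Code Editor settings, so this is only a sketch. The table and column names are those of the course's Employee_Sales.employee table.

```sql
-- Before: everything on one line
select last_name, first_name from Employee_Sales.employee where department_number = 301 order by last_name;

-- After Format Query: line breaks before the main keywords
select last_name, first_name
from Employee_Sales.employee
where department_number = 301
order by last_name;
```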

Note: If you want to retain special formatting when pasting information from other applications,
use Shift+Insert instead of Ctrl+V or the Paste tool button. If you use Ctrl+V to paste text from
applications such as Microsoft Office, the special formatting from those applications will be
changed as follows:
• Formatted text from Microsoft Excel will be pasted as unformatted data
• Where text is preceded by bullets from programs such as Microsoft Word, each
paragraph will be pasted prefixed with a period in place of the bullet
• Numbered paragraphs will be pasted prefixed with the number

Note: Some keywords will cause a line break and possibly cause the new line to be indented. If
a keyword is found to already be the first word on a line and it is already prefixed by a tab
character, then its indentation level will not change.

Indentation
When you press the Enter key, the new line will automatically indent to the same level as the line
above. If you highlight one or more lines in the query and press the Tab key, those lines are
indented one level. If you press Shift-Tab, the highlighted lines are unindented by one level.

This indentation of lines will only apply if the selected text includes a line feed character. For
example, you must either select at least part of two lines, or if selecting only one line, then the
cursor must be at the beginning of the next line. (Note that this is always the case when you use
the margin to select a line.) If no line end is included in the selected text, or no text is selected,
then a tab character will simply be inserted.

Page 2-36 SQL Assistant Features


Tools Options – Code Editor Tab

TOOLS > OPTIONS > CODE EDITOR

The “Code Editor” tab allows you to customize how your “Query Window” displays your SQL.

Among other things, you can:

• Convert keywords to uppercase as you type.
• Display line numbers as shown.
• Use “Left Margin” to change the background for line numbers.
• Use “Background” to change the background for your SQL code.

SQL Assistant Features Page 2-37


Tools Options – Answerset Tab
Displaying Commas to Mark Thousand Separators
To display commas
1. Right-click in the Answerset cell you wish to change and select Format Cells.
2. Check Display 1000 separators.
3. Click OK.

Displaying Decimal Places


To display decimal places
1. Right-click in the Answerset cell you wish to change and select Decimal Places.
2. Select a number between 0 and 4.

Formatting a Block of Cells


To format a block of cells
1. Select the area to be formatted.
2. Right-click and select Format Cells.
3. Set the decimal places, foreground or background color, font name, font style, or font
size.
4. Click OK.

Changing the Font for the Entire Window


To change the font for the entire window
1. Do one of the following:
◦ From the View menu, choose Set Font to bring up the Font dialog.
◦ Right-click to bring up the Shortcut menu and click Set Font.
2. Change the font name, style, and size.
3. Click OK.
Note: The font change applies to the current window and all future Answerset windows.

Zooming the Answerset Window


You can zoom the contents of a window in and out by holding the Ctrl key and rolling the mouse wheel.

Page 2-38 SQL Assistant Features


Tools Options – Answerset Tab

TOOLS > OPTIONS > ANSWERSET

Check this and choose an alternating line color for your result.

For a large answer set, this prompts you to continue after this many rows are returned.

SQL Assistant Features Page 2-39


Tools Options – History Tab
Rearranging History Columns
History columns can be rearranged by dragging the column header to a new position. The new
column order will be used each time you open the History window.

Filtering the History Rows


You can sort and filter the History rows in various ways to help organize the information.

History Filter Operators


All history rows are stored in a single History database. The History Filter dialog allows you to
specify a set of filters to be applied to the history rows. The operators include >, <, =, and LIKE.
The filter applies to the entire history table. When you click in the fields or boxes in the Filter
dialog, the possible operators and proper format are displayed at the bottom of the
dialog.

To show the history filter options, right-click on a query in the History window and choose Filter.

Note: The operator box accepts only applicable operators for the filter function.
To filter the History table
1. Select the History window.
2. Right-click in the History window and select Filter.
3. Set the history filter as needed.

History Filter Option: Description

Date: Filters by date range. Clicking the combo box brings up a calendar. Place a filter operator
(>, <, or =) in the operator box. To display the history for the most recent n days instead of basing
it on a fixed date, check Previous ‘n’ days and enter the number of days in the date box.

Data Source: Filters by data source name. Enter a data source name, optionally containing
wildcard characters. Check Use current Data Source to filter by the current data source only.
Note: The Use current Data Source filter option is used only when the Allow connection to
multiple data sources option is not checked.

User Name: Shows only those rows for a specific User Name.

Statement Type: Shows only those rows in which the query contains the specified statement
type. For example, Select or Create Table.

Statement Count: Shows only those rows in which the query contains this many statements (use
operator <, > or =).

Row Count: Shows only those rows in which the query affected this many rows (use operator <,
> or =).

Page 2-40 SQL Assistant Features


Tools Options – History Tab

TOOLS > OPTIONS > HISTORY

These are the default settings.

(continued)

Elapsed Time: Shows only those rows in which the elapsed time matches the time entered (use
operator <, > or =, and specify the time as hh:mm:ss).

Show successful queries only: Check this box to filter for successful queries only. Queries with
errors are ignored.

SQL Assistant Features Page 2-41


Shortcuts to Typing Object Names
On the facing page, match the numbers on the left to the corresponding numbers in the Query
Window to determine what happens when you perform the action shown. These shortcuts
can save considerable typing in queries of any length.

Page 2-42 SQL Assistant Features


Shortcuts to Typing Object Names

The following keystroke combinations may be used in conjunction with the Explorer Tree as shortcuts to typing.

01 Click-Drag

02 Click-Ctrl-Drag

03 Ctrl-Ctrl-Ctrl-Drag

04 Click-Shift-Click-Drag

SQL Assistant Features Page 2-43


Commenting Lines of SQL
The facing page shows two different techniques one can use to comment lines of SQL.
Commenting lines means that they will not be included for execution by the database. You may
also comment out the trailing portion of a line in a SQL request, like this:

HELP DATABASE DBC; -- this is a comment
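
As a small illustration of the two commenting styles described here, the sketch below uses the double-dash form for single lines and the slash-asterisk form for a block. The SHOW TABLE statement inside the block comment is only a placeholder chosen for this example.

```sql
-- A double-dash comments out the rest of the line it appears on.
HELP DATABASE DBC;   -- trailing comment; HELP DATABASE is still submitted

/* Everything between the slash-asterisk markers is ignored,
   so this form can disable several lines at once.
SHOW TABLE Employee_Sales.Employee;
*/
```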

Page 2-44 SQL Assistant Features


Commenting Lines of SQL

ANSI Standard method for commenting lines one at a time.

Teradata method for commenting blocks of lines.

Only the highlighted SQL gets submitted. (The “HELP DATABASE”)

Note how SQL Assistant italicizes the lines and changes the color.

SQL Assistant Features Page 2-45


Logging on to Multiple Systems
The facing page illustrates how to log on to multiple Teradata systems in the same instance.

Page 2-46 SQL Assistant Features


Logging on to Multiple Systems

Note the 2 Explorer Tree references, one for each system.

SQL Assistant Features Page 2-47


The Employee_Sales Database
The intent of the facing page is to aid the student in understanding the objects that will be
referenced during the labs. Keeping the information in the Explorer Tree can greatly help the
student reference the correct column and table names for the lab exercises.

The Employee_Sales database contains all tables referenced by the lab exercises.

Page 2-48 SQL Assistant Features


The Employee_Sales Database

Database containing tables used in the labs.

Note that columns include:
- Names
- Data Types
- NULL/NOT NULL Attributes

Note that index information is available as well.

The explorer tree diagram on the right lists all of the tables used in labs for this course.

Two tables have their columns listed. These two tables are very commonly used.

SQL Assistant Features Page 2-49


The Emp_Views Database
The intent of the facing page is to aid the student in understanding the objects that will be
referenced during the labs. Keeping the information in the Explorer Tree can greatly help the
student reference the correct column and table names for the lab exercises.

The Emp_Views database contains all views referenced by the lab exercises.

Page 2-50 SQL Assistant Features


The Emp_Views Database

Database containing views used in the labs.

Note that columns include:
- Names
- Data Types
- NULL/NOT NULL Attributes

Note that index information is not available.

The explorer tree diagram on the right lists all of the views used in labs for this course.

For the labs, always check your default database!

SQL Assistant Features Page 2-51


Module 2: Summary
A summary of this module is discussed.

Page 2-52 SQL Assistant Features


Module 2: Summary

• SQL Assistant is an ODBC-based tool used to access any ODBC-compliant database.
• SQL Assistant provides GUI access to the Teradata Dictionary.
• SQL commands like SHOW and HELP can also be used to provide
dictionary information.
• History can be used to recall a previously issued request to the Query
Window.
• One must always be mindful of their current database default setting.
• A database default is set via a SQL request.
• There are 2 databases used for labs: Employee_Sales (tables) and
Emp_Views (views).
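
To make the point about default databases concrete, here is a minimal sketch using the two lab databases named above. It assumes your user has access to both; the unqualified Employee reference resolves against whatever default is current.

```sql
DATABASE Employee_Sales;   -- set the default database via a SQL request
SELECT DATABASE;           -- session variable: shows the current default

SHOW TABLE Employee;       -- resolves to Employee_Sales.Employee

DATABASE Emp_Views;        -- switch the default to the views database
SHOW VIEW Employee;        -- now resolves to Emp_Views.Employee
```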

SQL Assistant Features Page 2-53


Module 2: Review Questions
Check your understanding of the concepts discussed in this module by completing the review
questions as directed by your instructor.

Page 2-54 SQL Assistant Features


Module 2: Review Questions

True or False:

1. A user can have only one default database set at a time.
True
2. The following is a valid SQL request: HELP DATABASE;
False – a database name is required with this syntax.
3. The HELP TABLE command returns table index information.
False
4. You can click and drag information from the Explorer Tree to the Query
Window.
True
5. The following is a valid column name: _abc_
True
6. For the following request, the database will assume that the object “Employee”
is a database name: HELP COLUMN Employee.Last_Name;
False – it assumes it to be a table name.
7. The following is a valid SQL request: SHOW COLUMN Employee.Last_Name;
False – you cannot perform a SHOW on a column.
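
For comparison with the invalid forms in questions 2 and 7, the following sketch lists forms that the answers above imply are acceptable. It assumes the default database has already been set to Employee_Sales so that the unqualified Employee reference can be resolved.

```sql
HELP DATABASE Employee_Sales;        -- a database name is supplied
HELP TABLE Employee_Sales.Employee;  -- describes the table's columns
HELP COLUMN Employee.Last_Name;      -- "Employee" is read as a table name
SHOW TABLE Employee_Sales.Employee;  -- SHOW applies to objects, not columns
```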

SQL Assistant Features Page 2-55


Module 2: Lab Exercises
Check your understanding of the concepts discussed in this module by completing the lab
exercises as directed by your instructor.

Page 2-56 SQL Assistant Features


Module 2: Lab Exercises

1. Log on to SQL Assistant and review the options in the “TOOLS” pull down menu.

Note the Function Key assignments for “Execute” and “EXPLAIN”.

Click on the “List Tables” option > Tables > OK


To which SQL command is this result set similar?

Next, see if you can successfully perform the “List Columns” option. Choose a
database and a table name based on what you have learned about the lab
environment.

2. Review the options in the “View” pull down menu.

Note the options for “Explorer Tree” and “History”. Click on each of these and
note the indentations to show whether they are turned on or not.

SQL Assistant Features Page 2-57


Module 2: Lab Exercises (continued)
Check your understanding of the concepts discussed in this module by completing the lab
exercises as directed by your instructor.

Page 2-58 SQL Assistant Features


Module 2: Lab Exercises (continued)

3. Try each of the following in the order shown, and note if it fails by looking at the
bottom-left portion of the utility screen. For those that fail, “double-click” the
“Notes” field for the failed request in the “History Window”.

HELP DATABASE;
HELP DATABASE yourusername;
DATABASE yourusername;
SHOW TABLE Employee;
DATABASE Employee_Sales;
SHOW TABLE Employee;
SHOW TABLE Emp_VIEWS.Employee;
SHOW VIEW Emp_VIEWS.Employee;

4. Practice dragging and dropping columns from the “Explorer Tree” window onto
the “Query Window”.

SQL Assistant Features Page 2-59


Notes:

Page 2-60 SQL Assistant Features


Module 3

Basic SELECT Clauses

After completing this module, you will be able to:

• Distinguish between 3 classes of queries.
• Select rows and columns from a table based upon equality.
• Use ORDER BY to sort result sets.
• Alias columns to give them new names.
• Use DISTINCT to project a distinct list of result rows.
• Apply WHERE constraints to conditionally return rows.
• Project character or numeric literal values.
• Write SQL in a way that is more structured and easier to read.

Basic SELECT Clauses Page 3-1


Notes:

Page 3-2 Basic SELECT Clauses


Table of Contents
SQL: Structured Query Language................................................................................................ 3-4
Three SQL Classifications ........................................................................................................... 3-6
A Simple SQL SELECT .............................................................................................................. 3-8
Projecting All Columns and All Rows ....................................................................................... 3-10
Aliasing a Column Using AS ..................................................................................................... 3-12
Aliasing Mistake? ...................................................................................................................... 3-14
Ordering Rows Using ORDER BY............................................................................................ 3-16
Other Ordering Options ............................................................................................................. 3-18
Projecting Literal Values............................................................................................................ 3-20
Using WHERE to Eliminate Rows ............................................................................................ 3-22
The ASCII Collating Sequence and Teradata Mode .................................................................. 3-24
The ASCII Collating Sequence and ANSI Mode ...................................................................... 3-26
Basic Logical Operators ............................................................................................................. 3-28
DISTINCT Option ..................................................................................................................... 3-30
Other Built-In Functions ............................................................................................................ 3-32
Recommended Coding Conventions .......................................................................................... 3-34
Module 3: Summary................................................................................................................... 3-36
Module 3: Review Questions ..................................................................................................... 3-38
Module 3: Lab Exercise ............................................................................................................. 3-40

Basic SELECT Clauses Page 3-3


SQL: Structured Query Language
Most of the SQL in this course adheres to ANSI standards. ANSI is an acronym for “American
National Standards Institute”. Teradata is fully compliant with ANSI entry-level SQL and
supports much of the other levels as well.

SQL is the “non-procedural” language of the database (Teradata). By non-procedural, it is meant
that a SQL request states what result is wanted rather than the steps for producing it. Procedural
languages contain looping constructs and IF/ELSE logic that SQL itself does not contain. Such
constructs do, however, exist in Stored Procedures. Stored Procedures are SQL based, but they are
not taught in our SQL courses; they are taught instead in Application Design and Development as
a programming language.

The database is a server that parses and processes requests from a client or host. In the scheme
of things, requests are made up of one or more statements. A statement is a single SQL construct
such as a SELECT, so multiple DML statements can be bundled into a single request. A
“request” is also often referred to as a “query” or as a “statement”; however, referring to a
request as a statement is correct only if the request contains a single statement.
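
The difference between a statement and a request can be sketched as follows. This assumes the BTEQ convention that a semicolon placed at the start of the next line keeps the following statement in the same request; other clients may bundle statements differently.

```sql
-- One request containing one statement:
SELECT department_name FROM Employee_Sales.department;

-- One request containing two statements (BTEQ: the leading semicolon
-- on the second line continues the same request):
SELECT department_number FROM Employee_Sales.department
;SELECT department_name FROM Employee_Sales.department;
```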

In the previous module we were introduced to various SQL commands. Those were HELP,
SHOW, and SELECT (for session variables). In this module we shall look at the more common
uses for SELECT.

Page 3-4 Basic SELECT Clauses


SQL: Structured Query Language

• A complete data access and maintenance language
• Designed for Relational Database Management Systems (RDBMS)
• An industry standard for relational databases
• A non-procedural language
• Three defined SQL standards:
  ◦ SQL-89 (SQL 1)
  ◦ SQL-92 (SQL 2)
    - Entry Level
    - Intermediate Level
    - Full Level
  ◦ SQL-99 (SQL 3)
    - Core
    - Enhanced
• ANSI has also met in 2003 and again in 2008.

Basic SELECT Clauses Page 3-5


Three SQL Classifications
The three classifications of SQL requests are DDL, DML and DCL. This course deals, almost
exclusively, with DML, or “data manipulation language”. Data manipulation refers to SQL that
manipulates data by either selecting, updating, inserting, or deleting rows of a table (and hence
its columns).

DCL is SQL that deals with controlling privileges and is taught in the Teradata Administration
class.

DDL refers to SQL that affects objects. Such SQL places “write” locks on the Teradata
dictionary and, as such, has the potential to block other queries from parsing. DDL should
“normally” be performed during off hours.
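
One statement from each classification is sketched below. The database Sandbox_Db, the table Dept_Notes, and the user some_user are hypothetical names introduced only for this illustration; they are not part of the lab environment.

```sql
-- DDL: defines an object (and so writes to the Teradata dictionary)
CREATE TABLE Sandbox_Db.Dept_Notes
  (department_number SMALLINT
  ,note_text         VARCHAR(100));

-- DML: manipulates rows of the table
INSERT INTO Sandbox_Db.Dept_Notes VALUES (401, 'reviewed');
SELECT * FROM Sandbox_Db.Dept_Notes;

-- DCL: controls privileges on the object
GRANT SELECT ON Sandbox_Db.Dept_Notes TO some_user;
```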

Page 3-6 Basic SELECT Clauses


Three SQL Classifications

There are different classifications of SQL requests.

Data Definition Language (DDL)

CREATE  Define a database object (table, view, macro, index, trigger or stored procedure).
DROP    Remove a table, view, macro, index, trigger or stored procedure.
ALTER   Change a database object.

Data Manipulation Language (DML)

SELECT  Select data from one or more tables.
INSERT  Place a new row into a table.
UPDATE  Change data values in one or more existing rows.
DELETE  Remove one or more rows from a table.

Data Control Language (DCL)

GRANT   Give user privileges on database objects.
REVOKE  Remove user privileges on database objects.
GIVE    Transfer database ownership.

Basic SELECT Clauses Page 3-7


A Simple SQL SELECT
SELECT is the most basic of all SQL constructs. A SELECT is said to “project” rows of column
values from a table (or tables – as in a join). The result set for the query on the facing page can
be accomplished in many ways, any of which are simply a matter of personal style.

It is important to note that, throughout this text, ALL RESULT SETS ARE GENERATED BY
USING BTEQ AND NOT SQL ASSISTANT!

There are a few reasons for doing this.

• BTEQ does not parse requests going to Teradata or results coming back from it. This provides
a truer picture of what the database is doing.
• Title dashes are returned to help separate headings from result rows.
• Unless the column title is longer than the column's display width, you can count the number
of title dashes to determine the column width.

The facing page discusses how the alignment of the columns can tell you whether the data for
that column is character (left justified) or numeric (right justified).

With the top query it is not clear in which database the table resides. Only the author knows the
answer to this question. The bottom query is clear about the database involved, but it also
takes more effort to type.

The order of the result seems arbitrary. This can, and will, be rectified on a later page.

The heading corresponds to the name of the column in the table.

If copied and pasted correctly, the number of title dashes shown should match the definition
of the column in the table (i.e., a 30-character width).

Page 3-8 Basic SELECT Clauses


A Simple SQL SELECT

To obtain a list of all valid department names, two possible methods are:

SELECT Department_Name
FROM Department;

SELECT Employee_Sales.Department.Department_Name
FROM Employee_Sales.Department;

Recall that qualifications for SQL are: databasename.tablename.columnname

department_name
------------------------------
education
None
software support
technical operations
president
product planning
research and development
marketing sales
customer support

Note that the order of the result appears to be random.

Note also that character data is left justified.

The default column heading is the column name which is also left justified due to the column being character.

Read how results are formatted for this class on the left-hand page!

Basic SELECT Clauses Page 3-9


Projecting All Columns and All Rows
To select all columns of a table you use a “*” (often referred to as “star” or “splat”). This is a
shortcut for having to list each and every column by name. (Thank goodness!)

Numeric data is right justified and character data is left justified and, unless the column name is
longer than the data type (as is the case for some of these columns), you can count the number of
title dashes to determine the column width. For instance, an integer is a whole number that is
plus-or-minus 2 billion and change, so, including the sign, it takes 11 characters to display its
values. (10 digits plus the sign equals 11 characters)

Nulls are shown, but are not discussed until the next module. So hold on to that thought!

Page 3-10 Basic SELECT Clauses


Projecting All Columns and All Rows

This is a “shortcut” for displaying all column values for all rows in the department table.

SELECT * FROM Department;

department_number  department_name            budget_amount  manager_employee_number
-----------------  -------------------------  -------------  -----------------------
              403  education                      932000.00                     1005
              600  None                                   ?                     1099
              402  software support               308000.00                     1011
              100  president                      400000.00                      801
              302  product planning               226000.00                     1016
              301  research and development       465600.00                     1019
                ?  technical operations           293800.00                     1025
              401  customer support               982300.00                     1003
              501  marketing sales                308000.00                     1017

Numeric column values are right justified in the field space, along with their column
headings, to signify numeric data.

Character column values are left justified in the field space, along with their column
headings, to signify character data.

The two values showing a “?” both represent NULLs (“?” is an invalid numeric value).

Basic SELECT Clauses Page 3-11


Aliasing a Column Using AS
You can use the AS keyword to rename an object. In our examples we are renaming columns.
Tables may be aliased as well, but this concept is covered later in our discussion on joins where
it is more appropriate.

Note that, as a new name, the alias becomes the new heading by default. Aliases are names we can
reference elsewhere in the same query. By stating that the AS keyword is optional we mean that the
following would work as well.

SELECT department_number "Dept Nbr"
,department_name DeptName
,budget_amount Budget
,manager_employee_number Mgr#
FROM Employee_Sales.department;

Another example of using double-quotes is shown. Notice how it appears as a heading for the
result set.

Page 3-12 Basic SELECT Clauses


Aliasing a Column Using AS

You can rename, or “alias”, a projected column name using the optional “AS”
keyword. An alias is the assignment of a new name. It is a renaming of the column
for the current query only.

Show all column values for all rows of the department table, renaming the column
names to something shorter.

SELECT department_number AS "Dept Nbr"
,department_name AS DeptName
,budget_amount AS Budget
,manager_employee_number AS Mgr#
FROM Employee_Sales.department;

As new names, aliases now become the names for the column headings.

Dept Nbr  DeptName                              Budget         Mgr#
--------  ------------------------------  ------------  -----------
     403  education                          932000.00         1005
     600  None                                       ?         1099
     402  software support                   308000.00         1011
     201  technical operations               293800.00         1025
     100  president                          400000.00          801
     302  product planning                   226000.00         1016
     301  research and development           465600.00         1019
     501  marketing sales                    308000.00         1017
     401  customer support                   982300.00         1003

Note the heading.

Basic SELECT Clauses Page 3-13


Aliasing Mistake?
The query on the facing page was typed this way intentionally, to illustrate what happens if a
comma is (unintentionally?) omitted from a query. Note how the column “Budget_Amount”,
which should have been projected into the result, has, instead, become only a heading for the
column “Department_Name”.

As far as commas are concerned, there need not be a space between a comma and a column
name. For instance, the following, with no spaces around the commas, would be perfectly
acceptable.

SELECT department_number,department_name,budget_amount,
manager_employee_number
FROM Employee_Sales.department;

Page 3-14 Basic SELECT Clauses


Aliasing Mistake?

Based on our discussion from the previous page, can you determine what is
happening with this query and its result?

Show all column values for all rows of the department table without applying aliases.

SELECT department_number,
department_name
budget_amount,
manager_employee_number
FROM Employee_Sales.department;

department_number  budget_amount                   manager_employee_number
-----------------  ------------------------------  -----------------------
              403  education                                          1005
              600  None                                               1099
              402  software support                                   1011
              201  technical operations                               1025
              100  president                                           801
              302  product planning                                   1016
              301  research and development                           1019
              501  marketing sales                                    1017
              401  customer support                                   1003

Basic SELECT Clauses Page 3-15


Ordering Rows Using ORDER BY
The ORDER BY clause can be used to place rows in an order according to your desire! The
example on the facing page is using the (implied) default of ascending, which could have been
specified explicitly like this.
ORDER BY DeptName ASC;

Note how it was acceptable for us to reference the alias name in the ORDER BY clause.

Page 3-16 Basic SELECT Clauses


Ordering Rows Using ORDER BY

The ORDER BY clause can be used to order result rows.

Show all column values for all rows in department, ordered by department name.

SELECT department_number AS Dept#
,department_name AS DeptName
,budget_amount AS Budget
,manager_employee_number AS Mgr#
FROM Employee_Sales.department
ORDER BY DeptName;

The default ORDER BY is “ascending”. (The default is “ASC”.)

You could order explicitly doing either of these:
ORDER BY DeptName ASC;
ORDER BY DeptName DESC;

 Dept#  DeptName                              Budget         Mgr#
------  ------------------------------  ------------  -----------
   401  customer support                   982300.00         1003
   403  education                          932000.00         1005
   501  marketing sales                    308000.00         1017
   600  None                                       ?         1099
   100  president                          400000.00          801
   302  product planning                   226000.00         1016
   301  research and development           465600.00         1019
   402  software support                   308000.00         1011
   201  technical operations               293800.00         1025

Basic SELECT Clauses Page 3-17


Other Ordering Options
Other options for ordering result rows are shown on the facing page. A description of these
follows.

• ORDER BY manager_employee_number, department_number;
Orders by department number (implied ascending) within manager number (implied
ascending).

• ORDER BY manager_employee_number DESC, department_number;
A combination that orders by department number (implied ascending) within manager
number (explicit descending).

• ORDER BY 3, 1;
Ordering by column position. Orders by column 1 (department number, implied
ascending) within column 3 (manager number, implied ascending).

• ORDER BY 4; (Invalid – no 4th column in the projected list.)

• ORDER BY 3 DESC, 1;
Ordering by column position. Orders by column 1 (department number, implied
ascending) within column 3 (manager number, explicit descending).

• ORDER BY 3, department_number DESC;
A combination that orders by department number (explicit descending) within column 3
(manager number, implied ascending).

• ORDER BY department_name;
Although not projected, department name (implied ascending) determines the order.
Ordering by a column (or columns) not projected may be useful under certain
circumstances.
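
As a worked illustration of ordering by position, the two requests below should produce the same row order, because position 3 in the projected list is manager_employee_number and position 1 is department_number. This is a sketch against the lab table used on the facing page.

```sql
-- By position: column 3 descending, then column 1 ascending
SELECT department_number, budget_amount, manager_employee_number
FROM Employee_Sales.department
ORDER BY 3 DESC, 1;

-- By name: the same ordering written out explicitly
SELECT department_number, budget_amount, manager_employee_number
FROM Employee_Sales.department
ORDER BY manager_employee_number DESC, department_number ASC;
```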

Page 3-18 Basic SELECT Clauses


Other Ordering Options

There are many different techniques that may be used for ordering result rows.
Discuss what each option shown is attempting to do and if it is valid or not.

SELECT department_number, budget_amount,
manager_employee_number
FROM Employee_Sales.department

Valid    ORDER BY manager_employee_number, department_number;
Valid    ORDER BY manager_employee_number DESC, department_number;
Valid    ORDER BY 3, 1;
Invalid  ORDER BY 4;
Valid    ORDER BY 3 DESC, 1;
Valid    ORDER BY 3, department_number DESC;
Valid    ORDER BY department_name;

Given that there are four columns in Department, what about these two?
Valid    SELECT * FROM department ORDER BY 2;
Invalid  SELECT * FROM department ORDER BY 10;

Basic SELECT Clauses Page 3-19


Projecting Literal Values
Placing literal values into a projected list is a technique that is typically used in more advanced
situations. Suffice it to say that it is an important concept, even though our examples may
indicate otherwise. Examples that illustrate this usage best might be too advanced for an
introductory class.

Notice that character literals must be enclosed in single quotes so as not to be confused (by the
database) with an object reference. Numeric values are not enclosed in single quotes because –
well – because they are numeric.

In answer to the questions posed:

What effect did ordering by a literal have on the result?
Ordering by identical values has an effect that appears as random since one value is as good as
another. Or, to put it another way, ordering only makes sense when the values we order by are
different from one another.

Why are there 9 rows returned?
Since there is no WHERE clause (condition), the database returns a result row for each table
row.

Page 3-20 Basic SELECT Clauses


Projecting Literal Values

With SQL you can project literal values as well as column values.

Character literals are enclosed inside single quotes while numeric data is not.

SELECT 'Department number' AS Character_Literal,
12345 AS Numeric_Literal
FROM Department
ORDER BY 1;

Character_Literal  Numeric_Literal
-----------------  ---------------
Department number            12345
Department number            12345
Department number            12345
Department number            12345
Department number            12345
Department number            12345
Department number            12345
Department number            12345
Department number            12345

What effect would ordering by a literal have on the result?

Why are there 9 rows returned?

Basic SELECT Clauses Page 3-21


Using WHERE to Eliminate Rows
A SELECT clause called the “WHERE” clause can be used to restrict the number of rows
returned.

In the examples on the facing page, the top one shows how to apply a condition to a numeric
column (using no single quotes), while the bottom example shows how to apply a condition to a
character column (single quotes required). If a character literal is written without
single quotes, the database considers it to be an object name.
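
The quoting rule can be sketched with the lab table as follows. The last condition is left commented out because, per the note above, the unquoted word would be read as an object name; the exact error returned is not shown here.

```sql
-- Numeric column: the value is written without quotes
SELECT department_name
FROM Employee_Sales.department
WHERE department_number = 401;

-- Character column: the value is enclosed in single quotes
SELECT department_number
FROM Employee_Sales.department
WHERE department_name = 'education';

-- Without quotes, education is treated as an object (column) name,
-- so this form is expected to fail:
-- WHERE department_name = education;
```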

Page 3-22 Basic SELECT Clauses


Using WHERE to Eliminate Rows

Show the name for department 401.

SELECT 'Department Number' AS Literal1,
Department_Number AS D#,
'Has the name of' AS Literal2,
Department_Name AS DName
FROM Department
WHERE Department_Number = 401
ORDER BY 1;

Literal1                    D#  Literal2         DName
-----------------  -----------  ---------------  -----------------------------
Department Number          401  Has the name of  customer support

Show department number for the customer support department.

SELECT 'The' AS Literal1,
Department_Name AS DName,
'department is numbered' AS Literal2,
Department_Number AS D#
FROM Department
WHERE Department_Name = 'customer support'
ORDER BY 1;

Literal1  DName                           Literal2                         D#
--------  ------------------------------  ----------------------  -----------
The       customer support                department is numbered          401

Basic SELECT Clauses Page 3-23


The ASCII Collating Sequence and Teradata Mode
The table definition for the query on the right follows.

It is a one-column table that stores many of the ASCII character values. It is a “MULTISET”
table because, in a single-column table that is not case specific, the upper and lower case forms
of a letter are treated as the same value; storing both in a SET table would cause a duplicate row
violation. Tables created in Teradata mode are, by default, SET tables and not MULTISET as
shown here. Note how column “c1” is defined as NOT CASESPECIFIC to show no case
sensitivity.

CREATE MULTISET TABLE DLM.collation_t ,FALLBACK ,
     NO BEFORE JOURNAL,
     NO AFTER JOURNAL,
     CHECKSUM = DEFAULT
     (
      c1 CHAR(1) CHARACTER SET LATIN NOT CASESPECIFIC)
PRIMARY INDEX (c1);

The ASCII chart is shown here.

+ -- --- -- --- -- --- -- --- -- --- -- --- -- --- -- --- +
| 00 nul | 01 soh | 02 stx | 03 etx | 04 eot | 05 enq | 06 ack | 07 bel |
| 08 bs | 09 ht | 0a nl | 0b vt | 0c np | 0d cr | 0e so | 0f si |
| 10 dle | 11 dc1 | 12 dc2 | 13 dc3 | 14 dc4 | 15 nak | 16 syn | 17 etb |
| 18 can | 19 em | 1a sub | 1b esc | 1c fs | 1d gs | 1e rs | 1f us |
| 20 sp | 21 ! | 22 " | 23 # | 24 $ | 25 % | 26 & | 27 ' |
| 28 ( | 29 ) | 2a * | 2b + | 2c , | 2d - | 2e . | 2f / |
| 30 0 | 31 1 | 32 2 | 33 3 | 34 4 | 35 5 | 36 6 | 37 7 |
| 38 8 | 39 9 | 3a : | 3b ; | 3c < | 3d = | 3e > | 3f ? |
| 40 @ | 41 A | 42 B | 43 C | 44 D | 45 E | 46 F | 47 G |
| 48 H | 49 I | 4a J | 4b K | 4c L | 4d M | 4e N | 4f O |
| 50 P | 51 Q | 52 R | 53 S | 54 T | 55 U | 56 V | 57 W |
| 58 X | 59 Y | 5a Z | 5b [ | 5c \ | 5d ] | 5e ^ | 5f _ |
| 60 ` | 61 a | 62 b | 63 c | 64 d | 65 e | 66 f | 67 g |
| 68 h | 69 i | 6a j | 6b k | 6c l | 6d m | 6e n | 6f o |
| 70 p | 71 q | 72 r | 73 s | 74 t | 75 u | 76 v | 77 w |
| 78 x | 79 y | 7a z | 7b { | 7c | | 7d } | 7e ~ | 7f del |
+ -- --- -- --- -- --- -- --- -- --- -- --- -- --- -- --- +

Page 3-24 Basic SELECT Clauses


The ASCII Collating Sequence and Teradata Mode

The sort sequence for Teradata Mode is shown below.

SELECT * FROM collation_t
ORDER BY 1;

Result in sort order (each row is a single character; read the left column from top to
bottom, then the right column):

Left column:   Null ! " # % & ' ( ) * + , - . /
Right column:  0 9 : ; < = > ? @ a A Z z [ \ ] ^ _ { } ~

In Teradata Mode upper and lower case values for the same letter are equal.

The definition for the table “Collation_t” is shown on the left page along with the
possible ASCII values and sorting order.

Teradata Mode is not case sensitive. The issue of case sensitivity affects upper
and lower case letters.

Here upper and lower case sort the same, so it only matters that the letter “a/A” is
shown before the letter “z/Z”, in any order.

Basic SELECT Clauses Page 3-25


The ASCII Collating Sequence and ANSI Mode
The table definition for the query on the right follows.

It is a one column table that stores many of the ASCII character values. It is a “SET” table to
show that, as a single column table, the upper and lower case letters for the same value are
actually different rows and not duplicates. Tables created in ANSI mode are, by default,
MULTISET tables, and not SET tables as shown here. Note how column “c1” is defined as
CASESPECIFIC to show case sensitivity.

CREATE SET TABLE DLM.collation_a ,FALLBACK ,
     NO BEFORE JOURNAL,
     NO AFTER JOURNAL,
     CHECKSUM = DEFAULT
     (
      c1 CHAR(1) CHARACTER SET LATIN CASESPECIFIC)
PRIMARY INDEX (c1);
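
To make the contrast between the two tables concrete, here is a minimal sketch. It assumes a
Teradata mode session, that both tables hold the letters 'a' and 'A', and the general rule
(stated here as an assumption rather than taken from this guide) that a comparison is case
sensitive when either operand is CASESPECIFIC.

```sql
-- collation_t: c1 is NOT CASESPECIFIC, so 'a' and 'A' compare as equal.
-- Both the 'a' row and the 'A' row are expected to qualify.
SELECT c1
FROM DLM.collation_t
WHERE c1 = 'a';

-- collation_a: c1 is CASESPECIFIC, so only the lower case 'a' row
-- is expected to qualify.
SELECT c1
FROM DLM.collation_a
WHERE c1 = 'a';

-- A case-sensitive test against the NOT CASESPECIFIC column can be
-- requested by adding the CASESPECIFIC attribute in the comparison.
SELECT c1
FROM DLM.collation_t
WHERE c1 (CASESPECIFIC) = 'a';
```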

Page 3-26 Basic SELECT Clauses


The ASCII Collating Sequence and ANSI Mode

The sort sequence for ANSI Mode is shown below.

SELECT * FROM collation_a
ORDER BY 1;

Result in sort order (each row is a single character; read the left column from top to
bottom, then the right column):

Left column:   Null ! " # % & ' ( ) * + , - . /
Right column:  0 9 : ; < = > ? @ A Z [ \ ] ^ _ a z { } ~

In ANSI Mode upper and lower case values for the same letter are different.

The definition for the table “Collation_a” is shown on the left page.

ANSI Mode is case sensitive. The issue of case sensitivity affects upper
and lower case letters.

Here upper and lower case sort differently, with upper case values sorting before
lower case values.

Basic SELECT Clauses Page 3-27


Basic Logical Operators
Some of the basic operators are shown on the facing page. More will be discussed later. The
only additional thing to note about these is that they must be written as shown. For instance,
“<>” (not equal) cannot be written as “><”. (You could type it that way, but the request would
fail.)
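
As a short sketch of the two spellings, the requests below use the Department table from the
earlier examples and are expected to return the same rows. The first uses the ANSI symbol, the
second the Teradata abbreviation.

```sql
-- ANSI standard symbol for "not equal".
SELECT Department_Number, Department_Name
FROM Department
WHERE Department_Number <> 401;

-- Teradata extension: the abbreviation NE in place of the symbol.
SELECT Department_Number, Department_Name
FROM Department
WHERE Department_Number NE 401;
```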

Page 3-28 Basic SELECT Clauses


Basic Logical Operators

The chart shows various operators for WHERE constraints.

These will be discussed in more depth later.

                            ANSI Standard   Teradata Extension
Equal                       =               EQ
Not Equal                   <>              NE
Less Than                   <               LT
Greater Than                >               GT
Greater Than or Equal To    >=              GE
Less Than or Equal To       <=              LE

Basic SELECT Clauses Page 3-29


DISTINCT Option
For the DISTINCT option, the keyword must come first and all projected columns must follow
it, or the request will fail. Uses that are not allowed, then, are:

SEL job_code DISTINCT department_number
FROM employee;

SEL DISTINCT department_number, DISTINCT job_code
FROM employee;

The facing page discusses aggregate processing to obtain a distinct list. This is a strategy
that will be discussed later.

The bottom line is: Add an ORDER BY to ensure correct ordering.
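
For contrast with the two disallowed forms, here is a minimal sketch of an accepted form,
followed by a GROUP BY request that is expected to produce the same distinct list. GROUP BY
is covered later in the course and appears here only as an illustration.

```sql
-- DISTINCT appears once, directly after SELECT, and applies to the
-- whole projected row. ORDER BY guarantees the sequence of the result.
SELECT DISTINCT department_number, job_code
FROM employee
ORDER BY 1, 2;

-- The same distinct list obtained through aggregate (GROUP BY) processing.
SELECT department_number, job_code
FROM employee
GROUP BY department_number, job_code
ORDER BY 1, 2;
```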

Page 3-30 Basic SELECT Clauses


DISTINCT Option

Find all the different job codes assigned to employees.

Without DISTINCT option:

SELECT department_number AS D#
      ,job_code AS J_Cd
FROM employee
ORDER BY 1, 2;

    D#    J_Cd
------ -------
     ?  211100
     ?  222101
     ?  321100
   301  311100
   301  312101
   301  312102
   401  411100
   401  412101
   401  412101
   401  412101
   401  412102
   401  412102
   401  413201
   402  421100
   402  422101
   403       ?
   403       ?
   403       ?
   403       ?
   403       ?
   403  431100
   501  511100
   501  512101
   501  512101
   501  512101
   999  111100

DISTINCT option with DISTINCT processing:

SELECT DISTINCT
       department_number AS D#
      ,job_code AS J_Cd
FROM employee;

    D#    J_Cd
------ -------
   301  311100
   501  512101
     ?  222101
   999  111100
   501  511100
   402  421100
   401  411100
   403  431100
   402  422101
   401  412101
   301  312102
   401  413201
   301  312101
   401  412102
     ?  321100
     ?  211100
   403       ?

DISTINCT option with AGGREGATE (GROUP BY) processing:

SELECT DISTINCT
       department_number AS D#
      ,job_code AS J_Cd
FROM employee;

    D#    J_Cd
------ -------
     ?  211100
     ?  222101
     ?  321100
   301  311100
   301  312101
   301  312102
   401  411100
   401  412101
   401  412102
   401  413201
   402  421100
   402  422101
   403       ?
   403  431100
   501  511100
   501  512101
   999  111100

Note: When order is important, add an ORDER BY when using DISTINCT!

Basic SELECT Clauses Page 3-31


Other Built-In Functions
Earlier we learned about certain session variables or built-in functions for use in SQL. Here we
show some other ones to use if needed. There are more of these that are available, but many are
specific to role and not relevant to our discussion.

Page 3-32 Basic SELECT Clauses


Other Built-in Functions

Built-in functions we have already learned.

SESSION - contains the session-id
DATABASE - contains the current database
ACCOUNT - contains the user account info
USER - contains the user name for this session

Other ones that are available for use.

SELECT DATE, TIME, CURRENT_DATE, CURRENT_TIME, CURRENT_TIMESTAMP;

    Date     Time     Date Current Time(0)             Current TimeStamp(6)
-------- -------- -------- --------------- --------------------------------
09/03/27 11:33:02 09/03/27  11:33:02+00:00 2009-03-27 11:33:02.240000+00:00

Some information about the request and its result.

• The DATE and TIME keywords are Teradata extensions to the ANSI standard.
• Note that the values are considered numeric since they are all right justified.
• Those having “CURRENT_” in the name are ANSI standard.

Basic SELECT Clauses Page 3-33


Recommended Coding Conventions
The recommended coding convention, at the right, represents a style of coding that many people
use and that has become (somewhat) a de facto standard. We say “somewhat” because SQL is, after
all, a free-form language and can be written in what is termed “paragraph style” as shown.

The paragraph-style query has two problems. It will fail because there is no space between the
WHERE keyword and the column named “department_number”. In addition, the comma between
hire_date and salary_amount is missing, so salary_amount is treated as an alias for hire_date
rather than as a fourth column.

select last_name,first_name,hire_date salary_amount from employee
wheredepartment_number = 401 order by last_name;

What was probably meant:

SELECT last_name, first_name, hire_date, salary_amount
FROM employee
WHERE department_number = 401
ORDER BY last_name;

Page 3-34 Basic SELECT Clauses


Recommended Coding Conventions

Although SQL is considered a “free-form” language, the following represents a
commonly used convention for SQL coding.

SELECT last_name
,first_name
,hire_date
,salary_amount
FROM employee
WHERE department_number = 401
ORDER BY last_name;

The convention below, often referred to as “paragraph-style”, can be difficult to
debug. Identify two potential problems with the following query.

select last_name,first_name,hire_date
salary_amount from employee
wheredepartment_number = 401 order by last_name;

Basic SELECT Clauses Page 3-35


Module 3: Summary
A summary of this module is discussed.

Page 3-36 Basic SELECT Clauses


Module 3: Summary

• SQL has 3 classes of statements.
  - DDL
  - DML
  - DCL
• The number of rows returned can be affected by a condition applied via a
  WHERE clause.
• You can rearrange the order of the rows in the result set by using
  ORDER BY.
• You can alias a column name using AS.
• Operators like =, <>, <=, >=, <, > can be used as qualifiers.
• DISTINCT can be used to project a distinct list of result rows.
• You can project literal values as well as column values.
• Get into good habits of writing SQL early and avoid writing in paragraph
  form.
• Understanding the ASCII collating sequence can help when writing
  conditions.

Basic SELECT Clauses Page 3-37


Module 3: Review Questions
Check your understanding of the concepts discussed in this module by completing the review
questions as directed by your instructor.

Page 3-38 Basic SELECT Clauses


Module 3: Review Questions

True or False:

1. “SELECT * FROM Employee ORDER BY 1;” is a valid SQL construct.
   True
2. The SQL DELETE is considered a DDL request.
   False – it is a DML request
3. DISTINCT automatically performs a sort.
   True
4. A WHERE clause can be used to eliminate columns from a result.
   False – WHERE affects only row counts
5. A character literal not enclosed in single quotes is interpreted as an
   object name.
   True
6. Double quotes can also be used to display literal values.
   False – single quotes are used to do this
7. The built-in functions DATE and TIME are ANSI standard.
   False

Basic SELECT Clauses Page 3-39


Module 3: Lab Exercise
Check your understanding of the concepts discussed in this module by completing the lab
exercise as directed by your instructor.

Page 3-40 Basic SELECT Clauses


Module 3: Lab Exercise

1. Select all columns for all departments from the department table.

2. Request a report of employee last and first names and salary for all of
manager 1019's employees. Order the report in last name ascending
sequence.

3. Project a distinct list of job codes which have been assigned to people
and are greater than 510000 and sort the result descending.

4. What are the first names of people with a last name of “Brown”?

5. How many people have been assigned job codes greater than or equal
to 510001?
(Since aggregation has not been taught yet, you will have to count them
manually. Or can SQL Assistant tell you?)

Basic SELECT Clauses Page 3-41


Page 3-42 Basic SELECT Clauses


Module 4

Logical Operators

After completing this module, you will be able to:

• Use Logical Operators in writing queries.
• Link Logical Operators together with AND – OR – BETWEEN – IN
  and NOT IN.
• Identify unsatisfiable conditions.
• Use parentheses to show order of precedence and improve
  readability.
• Incorporate NULL syntax into queries correctly.
• Identify issues introduced by adding NULL into WHERE conditions.

Logical Operators Page 4-1


Page 4-2 Logical Operators


Table of Contents
Logical Operators Introduction .................................................................................................... 4-4
The “AND” Condition ................................................................................................................. 4-6
The “OR” Condition .................................................................................................................... 4-8
Mixing AND and OR ................................................................................................................. 4-10
Parentheses and the Predicate .................................................................................................... 4-12
The IN Operator ......................................................................................................................... 4-14
The NOT IN Operator ................................................................................................................ 4-16
The BETWEEN Operator .......................................................................................................... 4-18
Incorrect Sequencing of the BETWEEN ................................................................................... 4-20
Explaining the Incorrect Sequencing of BETWEEN ................................................................. 4-22
Precedence of Operators ............................................................................................................ 4-24
Module 4: Summary................................................................................................................... 4-26
Module 4: Review Questions ..................................................................................................... 4-28
Module 4: Lab Exercise ............................................................................................................. 4-30

Logical Operators Page 4-3


Logical Operators Introduction
We touched on the logical operators earlier and now we shall look at them in more depth,
especially BETWEEN, IN and NOT IN. We will also go into depth on the effect of NULLs in
query processing, which is considerable and extremely important.

The concept of NULL is one of the most important in all of SQL. Although it can be tricky
(and difficult) to deal with, a good SQL course would be remiss if it didn’t go into depth on
NULL and continue with it throughout the class. It will definitely add a level of complexity
to your SQL education.

Page 4-4 Logical Operators


Logical Operators Introduction

The comparison operators that we will be studying in this module are shown below.
The symbols are ANSI standard.
The abbreviations are Teradata extensions.

Comparison Operators
(alternative abbreviations included where applicable):

=   (EQ)            Equal
<>  (NE)            Not equal
>   (GT)            Greater than
<   (LT)            Less than
>=  (GE)            Greater than or equal to
<=  (LE)            Less than or equal to
BETWEEN . . . AND   Inclusive range
[NOT] IN            Test against predefined set
IS [NOT] NULL       Test for nulls

Logical Operators Page 4-5


The “AND” Condition
Before we go into more depth about Logical Operators, we need to look at them in the context of
connecting many of them together with constructs like AND and OR. The example on the top is
OK, but the SQL you write will certainly not always be as simple as having just one
constraint. Typically you will have multiple constraints linked together with the constructs
discussed in this module.

As you will soon find out, when linking constraints together with AND, all conditions must
evaluate true for a row, or that row will not be projected. In other words, for our example at
the bottom of the facing page, we will only project rows if the last name for the employee
equals ‘Brown’ and their first name equals ‘Alan’ (no case sensitivity).

Page 4-6 Logical Operators


The “AND” Condition

Employee Table

Last Name  First Name  Dept#
---------  ----------  -----
Brown      Alan          401
Brown      Allen         801
Brown      Cary          567
Smith      Mary          900
Jones      Jimmy         401

Retrieve all employees with last name of “Brown”.

SELECT Last_Name, First_Name, Dept#
FROM Employee
WHERE Last_Name = 'brown';

last_name  first_name       dept#
---------- ---------- -----------
Brown      Alan               401
Brown      Allen              801
Brown      Cary               567

Retrieve information for employee “Alan Brown”.

SELECT Last_Name, First_Name, Dept#
FROM Employee
WHERE Last_Name = 'brown'
AND First_Name = 'alan';

last_name  first_name       dept#
---------- ---------- -----------
Brown      Alan               401

All conditions linked with AND must evaluate “True” in the same row for the
predicate to evaluate “True”.

Logical Operators Page 4-7


The “OR” Condition
When using ‘OR’ to link multiple conditions, only one of the conditions need be true. In our
example we will project all columns for employees whose last name is equal to ‘Brown’ or their
first name is equal to ‘Mary’. As shown, there is no employee that satisfies both conditions, but
they each satisfy at least one of them.

Page 4-8 Logical Operators


The “OR” Condition

Contrast the AND with the OR, below:

Employee Table

Last Name  First Name  Dept#
---------  ----------  -----
Brown      Alan          401
Brown      Allen         801
Brown      Cary          567
Smith      Mary          900
Jones      Jimmy         401

Retrieve employees with last name “Brown” and first name “Mary”.

SELECT *
FROM Employee
WHERE Last_Name = 'brown'
AND First_Name = 'mary';

*** Query completed. No rows found.

Only one condition need evaluate “True” in the same row for the predicate to
evaluate “True” using “OR”.

Retrieve employees with last name “Brown” or first name “Mary”.

SELECT *
FROM Employee
WHERE Last_Name = 'brown'
OR First_Name = 'mary';

last_name  first_name       dept#
---------- ---------- -----------
Smith      Mary               900
Brown      Alan               401
Brown      Allen              801
Brown      Cary               567

Logical Operators Page 4-9


Mixing AND and OR
Our SQL begins to take on another dimension when we start combining conditions with AND
and OR (as well as other operators). The first example is just another simple example like the
ones we saw earlier. The example on the bottom combines AND with OR so that we now have 3
different conditions. Note that it says that AND is evaluated first. We will discuss the
order of precedence later, but for now think of the WHERE clause as containing only 2
conditions linked with OR:

WHERE
     Last_Name = 'brown' AND Dept# = 401
OR
     First_Name = 'mary'

The result shows that “Mary Smith” doesn’t satisfy either of the AND’ed conditions, but she
does satisfy the (only) other condition, that the first name is “Mary”. “Alan Brown”, in turn,
satisfies both AND’ed conditions: his last name is “Brown” and he works in department 401.

We will continue this thread and add more examples to show how a query grows in complexity.
But first we take small steps.

Page 4-10 Logical Operators


Mixing AND and OR

Contrast the 2 queries below.

Employee Table

Last Name  First Name  Dept#
---------  ----------  -----
Brown      Alan          401
Brown      Allen         801
Brown      Cary          567
Smith      Mary          900
Jones      Jimmy         401

Retrieve employees with last name “Brown” and work in department 401.

SELECT *
FROM Employee
WHERE Last_Name = 'brown'
AND Dept# = 401;

last_name  first_name       dept#
---------- ---------- -----------
Brown      Alan               401

Retrieve employees with last name “Brown” and work in department 401, OR first
name “Mary”. (“AND” is evaluated first)

SELECT *
FROM Employee
WHERE Last_Name = 'brown'
AND Dept# = 401
OR First_Name = 'mary';

last_name  first_name       dept#
---------- ---------- -----------
Smith      Mary               900
Brown      Alan               401

Logical Operators Page 4-11


Parentheses and the Predicate
Now we see how parentheses can be added to a query to make it more readable. The example at
the top returns the same result with the parentheses as it does without them. All the
parentheses do is make the order of evaluation explicit.

If we rearrange the parentheses as shown for the query at the bottom of the page, the query
now answers an entirely different business question. We still have only 2 conditions,
but now they are linked with AND:

WHERE
     Last_Name = 'brown'
AND
     Dept# = 401 OR First_Name = 'mary'

Notice that Mary Smith no longer qualifies. Her first name satisfies the OR’ed half, but her
last name is not “Brown”, so the first condition is false.

Alan Brown still qualifies because his last name is “Brown” (first condition) AND he works in
department 401 (one of the OR’ed conditions, which satisfies the second half of this AND’ed
condition).

Page 4-12 Logical Operators


Parentheses and the Predicate

The use of the parentheses in this query illustrates the order of evaluation of “AND”
and “OR” for the query on the previous page.

Employee Table

Last Name  First Name  Dept#
---------  ----------  -----
Brown      Alan          401
Brown      Allen         801
Brown      Cary          567
Smith      Mary          900
Jones      Jimmy         401

Retrieve employees with last name “Brown” and work in department 401,
or first name is “Mary”.

SELECT *
FROM Employee
WHERE (Last_Name = 'brown'
AND Dept# = 401)
OR (First_Name = 'mary');

last_name  first_name       dept#
---------- ---------- -----------
Smith      Mary               900
Brown      Alan               401

Rearranging the parentheses changes the business question.

Retrieve employees with last name of “Brown”,
AND first name of “Mary” or work in department 401.

SELECT *
FROM Employee
WHERE (Last_Name = 'brown')
AND (Dept# = 401
OR First_Name = 'mary');

last_name  first_name       dept#
---------- ---------- -----------
Brown      Alan               401

Logical Operators Page 4-13


The IN Operator
The IN operator is a nicer way of writing many equality conditions against a single column
that are OR’ed together. The values are listed as a set, and a row qualifies when its column
value is equal to one of the values in the set. The structure of the IN list is simple and
intuitive enough: it is a comma-delimited list of values. The values need not be separated by
a comma and a space; the comma alone is enough.

In our example we see how the database rewrites the query to an equivalent one where each
condition is equality based and linked with OR. To get the optimizer plan you simply prefix the
query with the keyword “EXPLAIN”.

EXPLAIN
SELECT *
FROM Employee
WHERE First_Name IN ('alan', 'allen');

Another way to do this in SQL Assistant is to press the F6 key. Just make sure to highlight
the query if it is among many, so as not to get an EXPLAIN for all of them.

One of the nice things about an EXPLAIN is that it not only shows the plan for how the
database is going to perform your query, but may also show how the query was rewritten in
doing so.
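
The same idea applies to numeric columns. As a minimal sketch, using the simplified Employee
table from the slides in this module (its Dept# column name is taken from those slides), the
two requests below are expected to return the same rows.

```sql
-- IN list: numeric values need no quotes.
SELECT *
FROM Employee
WHERE Dept# IN (401, 567);

-- Equivalent form with equality conditions linked by OR.
SELECT *
FROM Employee
WHERE Dept# = 401
OR Dept# = 567;
```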

Page 4-14 Logical Operators


The IN Operator

Retrieve employees whose first names are in the following list (as a set).

Employee Table

Last Name  First Name  Dept#
---------  ----------  -----
Brown      Alan          401
Brown      Allen         801
Brown      Cary          567
Smith      Mary          900
Jones      Jimmy         401

SELECT *
FROM Employee
WHERE First_Name IN ('alan', 'allen');

last_name  first_name       dept#
---------- ---------- -----------
Brown      Alan               401
Brown      Allen              801

Submitting the request by prefixing it with the EXPLAIN keyword will display the
optimizer rewrite of this query (see left hand page).

. . . .
3) We do an all-AMPs RETRIEVE step from DLM.Employee by way of an
   all-rows scan with a condition of ("(DLM.Employee.first_name =
   'allen') OR (DLM.Employee.first_name = 'alan')") into Spool 1
   (group_amps), which is built locally on the AMPs. The size of
   Spool 1 is estimated with no confidence to be 7 rows (595 bytes).
   The estimated time for this step is 0.02 seconds.
. . . .

Rewrite Equivalent:

WHERE
     First_Name = 'allen'
OR
     First_Name = 'alan';

Logical Operators Page 4-15


The NOT IN Operator
Just as the IN operator is a nice way to link a number of OR’ed equality conditions on a single
column, NOT IN is a convenient way to get the not-equal result. Notice that the equivalent
form of the query, generated by the optimizer as shown in an EXPLAIN, changes all OR’ed
conditions to AND’ed conditions and changes = to <>.

Let’s compare IN

First_Name IN ('alan', 'allen')

Written as -

First_Name = 'alan'
OR First_Name = 'allen'

In this example the only rows satisfying at least one of these conditions are those with one
of the two values shown. All others evaluate false or unknown and are not returned.

to NOT IN

First_Name NOT IN ('alan', 'allen')

Written as -

First_Name <> 'alan'
AND First_Name <> 'allen'

In this example the rows satisfying both of these conditions are those with any value other
than the two shown. For those two, either 'alan' or 'allen' makes one of the conditions false,
which makes the entire condition false: when one of many AND’ed conditions is false
(regardless of unknown conditions involving null), the entire AND’ed condition is false.
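
The "unknown" case deserves a brief illustration. The sketch below assumes an Employee table
in which some rows have a null First_Name. For such a row each <> comparison evaluates to
unknown rather than true, so the row is not returned by NOT IN; an explicit IS NULL test is
one way to include it.

```sql
-- Rows with a null First_Name are not returned:
-- a null compared with 'alan' is unknown, not true.
SELECT *
FROM Employee
WHERE First_Name NOT IN ('alan', 'allen');

-- Adding an IS NULL test also returns the rows with a null First_Name.
SELECT *
FROM Employee
WHERE First_Name NOT IN ('alan', 'allen')
OR First_Name IS NULL;
```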

Page 4-16 Logical Operators


The NOT IN Operator

Retrieve employees whose first names are NOT IN the following list (as a set).

Employee Table

Last Name  First Name  Dept#
---------  ----------  -----
Brown      Alan          401
Brown      Allen         801
Brown      Cary          567
Smith      Mary          900
Jones      Jimmy         401

SELECT *
FROM Employee
WHERE First_Name NOT IN ('alan', 'allen');

last_name  first_name       dept#
---------- ---------- -----------
Brown      Cary               567
Smith      Mary               900
Jones      Jimmy              401

Submitting the request by prefixing it with the EXPLAIN keyword will yield the
optimizer rewrite of this query.

. . . .
3) We do an all-AMPs RETRIEVE step from DLM.Employee by way
   of an all-rows scan with a condition of ("(DLM.Employee.first_name
   <> 'alan') AND (DLM.Employee.first_name <> 'allen')") into Spool 1
   (group_amps), which is built locally on the AMPs. The size of
   Spool 1 is estimated with no confidence to be 20 rows (1,700
   bytes). The estimated time for this step is 0.02 seconds.
. . . .

Rewrite Equivalent:

WHERE
     First_Name <> 'allen'
AND
     First_Name <> 'alan';

Logical Operators Page 4-17


The BETWEEN Operator
SQL has a BETWEEN operator that can be used to apply range constraints to either numeric or
character values. (We shall discuss using it with character data later since this gets a little
more complicated.) More typically, however, range constraints are used on numeric data.

As can be seen in the EXPLAIN of the query, BETWEEN is inclusive. Making it inclusive makes
it easier to control the ranges that are required. And since the optimizer rewrites BETWEEN
as its “>= / <=” equivalent, it makes no difference which form you write; both will perform
the same.
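
As a minimal sketch of the equivalence, using the simplified Employee table from the slides,
the first two requests below are expected to return the same rows. The third shows how to
write an exclusive range, which BETWEEN cannot express directly.

```sql
-- Inclusive range written with BETWEEN.
SELECT *
FROM Employee
WHERE Dept# BETWEEN 401 AND 567;

-- The same inclusive range written with inequalities.
SELECT *
FROM Employee
WHERE Dept# >= 401
AND Dept# <= 567;

-- An exclusive range: departments 401 and 567 themselves do not qualify.
SELECT *
FROM Employee
WHERE Dept# > 401
AND Dept# < 567;
```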

Page 4-18 Logical Operators


The BETWEEN Operator

You can use BETWEEN to perform range constraints.

BETWEEN is inclusive, as can be seen in the optimizer rewrite.

Employee Table

Last Name  First Name  Dept#
---------  ----------  -----
Brown      Alan          401
Brown      Allen         801
Brown      ?             567
Smith      Mary            ?
Jones      Jimmy         401

SELECT *
FROM Employee
WHERE Dept# BETWEEN 401 AND 567;

last_name  first_name       dept#
---------- ---------- -----------
Jones      Jimmy              401
Brown      Alan               401
Brown      ?                  567

Explanation
--------------------------------------------------------------------------------------------
. . . .
3) We do an all-AMPs RETRIEVE step from customer_service.Employee by way of an all-rows
   scan with a condition of ("(customer_service.Employee.dept# <= 567) AND
   (customer_service.Employee.dept# >= 401)") into Spool 1 (group_amps), which is built
   locally on the AMPs. The size of Spool 1 is estimated with no confidence to be 4 rows (340
   bytes). The estimated time for this step is 0.02 seconds.
. . . .

Logical Operators Page 4-19


Incorrect Sequencing of the BETWEEN
When writing a condition using BETWEEN you must make sure that the order of the values in
the range is correct. The smaller value must precede the larger value. The reason for this has
to do with how the database rewrites the BETWEEN as an equivalent condition using
inequalities. The rewrite always has the form “>= first value AND <= second value”. If you
reverse the order of the values, the inequalities stay the same and only the values change
places.

In our example, no department number can be greater than or equal to 567, and at the same time
less than or equal to 401. This does not fail! It is considered a success with no rows
satisfying the conditional expression.

Page 4-20 Logical Operators


Incorrect Sequencing of the BETWEEN

Contrast this query with the one following it:

Employee Table

Last Name  First Name  Dept#
---------  ----------  -----
Brown      Alan          401
Brown      Allen         801
Brown      ?             567
Smith      Mary            ?
Jones      Jimmy         401

SELECT *
FROM Employee
WHERE Dept# BETWEEN 567 AND 401;

For a row to be projected, the net result of all WHERE conditions must be TRUE.

This condition is deemed unsatisfiable by the optimizer.

The optimizer rewrites the previous query to the following.

How many rows can satisfy this condition?

SELECT *
FROM Employee
WHERE Dept# >= 567
AND Dept# <= 401;

*** Query completed. No rows found.
This is not a failure!

Logical Operators Page 4-21


Explaining the Incorrect Sequencing of BETWEEN
When we perform an EXPLAIN for the previous query we see that the database recognizes this
condition as “unsatisfiable.” This means that it can never evaluate as being true. Since it is the
only condition for the query the database simply returns “no rows found”.

Next we shall see what happens when we integrate other conditions into this conditional
expression.

Page 4-22 Logical Operators


Explaining the Incorrect Sequencing of BETWEEN

The EXPLAIN of the previous query illustrates how the optimizer is aware of the
unsatisfiable condition.

Only a quick PI access is attempted with an unsatisfiable condition.

The optimizer never estimates 0 rows, so the EXPLAIN showing the 1 row can be ignored.

Employee Table

Last Name  First Name  Dept#
---------  ----------  -----
Brown      Alan          401
Brown      Allen         801
Brown      ?             567
Smith      Mary            ?
Jones      Jimmy         401

SELECT *
FROM Employee
WHERE Dept# BETWEEN 567 AND 401;

Explanation
--------------------------------------------------------------------------------------------
1) First, we do a single-AMP RETRIEVE step from DLM.Employee by way of the primary index
   "DLM.Employee.last_name = _LATIN '010000001800000001E0'XC" with unsatisfiable conditions
   into Spool 1 (one-amp), which is built locally on that AMP. The size of Spool 1 is estimated
   with high confidence to be 1 row (52 bytes). The estimated time for this step is 0.00 seconds.
-> The contents of Spool 1 are sent back to the user as the result of statement 1. The total
   estimated time is 0.00 seconds.

Logical Operators Page 4-23


Precedence of Operators
The last point we shall discuss is operator precedence. Here, and on the next page, we
show examples of how complex it can get to determine the order in which conditions are
applied.

On the facing page, the query on the left side is rewritten on the right side with parentheses
in a way that doesn’t change the query but, hopefully, makes it easier to understand. You could
also use parentheses to change the order. For instance, you could do the following to force the
last OR’ed condition to be included in the AND’ed condition. By doing this, however, you create
an entirely different query with a different result set.

SELECT Last_Name, Salary_Amount,
       Manager_Employee_Number,
       Job_Code, Department_Number
FROM Employee
WHERE
     Manager_Employee_Number
          NOT IN (1011, 801, 1017, 1019, 1005, 1003)
OR Salary_Amount
          BETWEEN 10000 AND 20000
AND
     (
     Department_Number IN (301, 401)
     OR Job_Code IN (111100, 211100)
     );
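
A smaller sketch of the same effect, assuming the simplified five-row Employee table from the
slides earlier in this module: the two requests differ only in parentheses, yet answer
different questions.

```sql
-- AND is evaluated before OR, so this reads as:
--   Last_Name = 'brown' OR (First_Name = 'mary' AND Dept# = 401)
-- With the slide data, the three Brown rows qualify.
SELECT *
FROM Employee
WHERE Last_Name = 'brown'
OR First_Name = 'mary'
AND Dept# = 401;

-- Parentheses force the OR to be evaluated first.
-- With the slide data, only Alan Brown (department 401) qualifies.
SELECT *
FROM Employee
WHERE (Last_Name = 'brown'
OR First_Name = 'mary')
AND Dept# = 401;
```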

Page 4-24 Logical Operators


Precedence of Operators

The following is the order of precedence for predicates (WHERE conditions).

Evaluation Precedence:
1. Parentheses evaluated first
2. NOT operators
3. AND operators
4. OR operators
5. Operators of equal precedence evaluated from left to right

Without parentheses:

SELECT Last_Name, Salary_Amount,
       Manager_Employee_Number,
       Job_Code, Department_Number
FROM Employee
WHERE
     Manager_Employee_Number
          NOT IN (1011, 801, 1017, 1019, 1005, 1003)
OR Salary_Amount
          BETWEEN 10000 AND 20000
AND Department_Number IN (301, 401)
OR Job_Code IN (111100, 211100);

With parentheses:

SELECT Last_Name, Salary_Amount,
       Manager_Employee_Number,
       Job_Code, Department_Number
FROM Employee
WHERE
     (Manager_Employee_Number NOT IN
          (1011, 801, 1017, 1019, 1005, 1003))
OR (Salary_Amount
          BETWEEN 10000 AND 20000
     AND Department_Number IN (301, 401))
OR (Job_Code IN (111100, 211100));

Logical Operators Page 4-25


Module 4: Summary
A summary of this module is discussed.

Page 4-26 Logical Operators


Module 4: Summary

• Operators =, <>, <=, >=, <, >, IN, NOT IN and BETWEEN can be used as
  qualifiers.

• The net result of all conditions in the WHERE clause must be true for a row
  to be projected.

• All AND'ed conditions in a list must evaluate true.

• Only one condition of an OR'ed list need evaluate true.

• Parentheses can be used to change the order of operations.

• An EXPLAIN of a query can reveal the optimizer's rewrite and provide a
  better understanding of the query.

• BETWEEN is inclusive when used to qualify on a range of values.

• IN and NOT IN lists can be used as a shortcut to replace long
  “AND/OR” linked conditions.

Logical Operators Page 4-27


Module 4: Review Questions
Check your understanding of the concepts discussed in this module by completing the review
questions as directed by your instructor.

Page 4-28 Logical Operators


Module 4: Review Questions

True or False:

1. The IN operator is a short-cut for replacing a list of OR'ed conditions.
   True

2. When using BETWEEN, only numeric data may be compared.
   False – character data values may be compared as well

3. “Unsatisfiable” conditions are those which can never be true.
   True

4. The following are equivalent operator conditions. C1 > 500 vs. C1 >= 501
   True – but only if C1 is an integer

5. The items inside an IN list must be in order from lowest to highest.
   False

Logical Operators Page 4-29


Module 4: Lab Exercise
Check your understanding of the concepts discussed in this module by completing the lab
exercise as directed by your instructor.

Page 4-30 Logical Operators


Module 4: Lab Exercise

1. List last names and department numbers for employees in departments 301,
   401, and 501.

2. Project the last names of employees whose salary is greater than or
   equal to $28,078.

3. Modify #1 to include those employees who have a job code of either
   512102 or 432101.

4. Modify #3 to show only those whose salary amounts are between
   $50,000 and $60,000.

Logical Operators Page 4-31


Notes:

Page 4-32 Logical Operators


Module 5

Operators and NULL Processing

After completing this module, you will be able to:

• Describe what NULLs are and how they affect query results.
• Incorporate NULL syntax into queries correctly.
• Identify issues introduced by adding NULL into WHERE conditions.

Operators and NULL Processing Page 5-1


Notes:

Page 5-2 Operators and NULL Processing


Table of Contents
NULL ........................................................................................................................................... 5-4
Conditional Expressions and NULL ............................................................................................ 5-6
What Gets Returned ..................................................................................................................... 5-8
NULL and the Business Question .............................................................................................. 5-10
NOT NULL and the Business Question..................................................................................... 5-12
Negating Conditions and Operators ........................................................................................... 5-14
The IN Operator and NULL....................................................................................................... 5-16
The NOT IN Operator and NULL ............................................................................................. 5-18
NULL Literal in an IN-List ........................................................................................................ 5-20
Including NULL to an IN-List ................................................................................................... 5-22
NULL Literal in a NOT IN-List ................................................................................................ 5-24
Three Versions of NOT IN ........................................................................................................ 5-26
Module 5: Review Questions ..................................................................................................... 5-28
Module 5: Summary................................................................................................................... 5-30
Module 5: Lab Exercise ............................................................................................................. 5-32

Operators and NULL Processing Page 5-3


NULL
Null is, beyond a doubt, one of the trickiest concepts that we deal with when using SQL. Null
can be both a performance issue and a result integrity issue. Some of the concepts that we
will address can be challenging, but they are significant and crucial to understanding SQL
and relational databases in general.

Null is a tough concept to describe. Some say it is a missing value. In truth, null is a
concept: an idea of how to deal with “unknown-ness”. Something that is unknown is
inconvenient, to say the least. Saying the value is missing almost implies that we forgot to
assign it, which could be true for a null as well; however, null represents an unknown.

• How can you compare a known value to an unknown value?
• How can you perform arithmetic on unknown quantities?
• How can you represent these in a report?

All are good questions that must be dealt with, but first they must be understood.

• NULL is a keyword (upper or lower case is fine) that represents an unknown.
• NULL is not a data type since there is no way to physically represent an unknown.
• NULL is not the same thing as a blank or a space (character data).
• NULL is not the same thing as a zero (0) (numeric data).
• When you compare a null with any other value (including another null), the comparison
evaluates to “unknown”.
• When a null is used as a value in a computation, the result of the expression is NULL.

By default, in SQL Assistant and in BTEQ, null is displayed as a question mark (?). You can
change this in the respective tool to whatever you like, but remember, what you choose could be
mistaken for that value instead of an unknown. (After all, how do you display an unknown with
a known quantity or entity?)

Note: The phrase “NULL value” could be considered incorrect by some since a null is
unknown. However, we shall refer to a “NULL value” from time to time for purposes of
discussion.

Page 5-4 Operators and NULL Processing


NULL

• NULL represents something of unknown quantity or value.
• NULL is an SQL keyword.
• NULL is not a data type.
• For an ascending sort, NULLs sort before numbers and characters.
• Do not confuse NULL with a space (blank) or a zero (0).

Any arithmetic operation involving a NULL operand (literal, column, or expression)
computes a NULL result.

Col_A [ + - * / ] Col_B → NULL

Any logical conditional expression involving a NULL operand (literal, column, or
expression) evaluates as “unknown” (not TRUE and not FALSE).

Col_A [ >, >=, <, <=, =, <> ] Col_B → result is unknown

Recall that only conditions evaluating TRUE result in row projection.

Although you can set the displayed value for NULL in both BTEQ and SQL
Assistant, the default is to display a “?” (question mark).
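
The two rules above can be sketched against the Employee table used earlier in the course. The 10 percent raise and the 30000 threshold are illustrative assumptions, not part of the course material.

```sql
-- Arithmetic: for any row where Salary_Amount is NULL,
-- the derived column Raised_Salary is also NULL.
SELECT Last_Name,
       Salary_Amount,
       Salary_Amount * 1.10 AS Raised_Salary
FROM Employee;

-- Comparison: for a row where Salary_Amount is NULL, the condition
-- evaluates unknown (not TRUE), so that row is not projected.
SELECT Last_Name, Salary_Amount
FROM Employee
WHERE Salary_Amount > 30000;
```

Note that the second request also drops the NULL rows if the operator is reversed to <=, since the comparison is unknown either way.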

Operators and NULL Processing Page 5-5


Conditional Expressions and NULL
The facing page describes how NULL can be used in an expression. As a literal we use the
NULL keyword.

It is important to note that although it is incorrect to use “= NULL” in a comparison, nulls
are often compared in just that way when we compare one column with another and one or
both columns contain a null for the row being evaluated.

When used as a literal, NULL may be referenced only in one of the following two ways.

WHERE Col1 IS NULL
WHERE Col1 IS NOT NULL

You may not use it with a comparison operator (=, <>, <, and so on) since such a
comparison can never evaluate true or false.

Using NULL in an arithmetic expression will be discussed in a later module.

Lastly, there is a subtle, but significant, difference between asking if a column value IS NULL
vs. comparing a column to a null via an operator. For instance, if we could write WHERE Col1
= NULL, it would evaluate unknown, whereas writing WHERE Col1 IS NULL will evaluate true
if the value for Col1 is null.

Page 5-6 Operators and NULL Processing


Conditional Expressions and NULL

Conditional expressions may involve columns or literals.

Logically and semantically, there is a subtle, but distinct, difference between:

• Comparing two columns, where one or both contain a null. (e.g., WHERE Col1 = Col2)
• Asking if a column or expression result is null. (i.e., WHERE Col1 IS NULL)

We cannot write a condition like → WHERE C1 = NULL since it can never evaluate TRUE.
But we can write → WHERE C1 IS NULL
This evaluates TRUE only where the value for C1 is unknown – this can be determined.

Checking whether a column IS, or IS NOT, NULL:

Invalid:                   Valid:
WHERE Salary = NULL        WHERE Salary IS NULL
WHERE Salary <> NULL       WHERE Salary IS NOT NULL

Comparing two columns, either of which may contain a null:

• WHERE Old_Salary [ operator ] New_Salary
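
A hedged sketch of what the column-to-column case means in practice follows. The table name Salary_History and the column Employee_Number are hypothetical, introduced only for illustration; Old_Salary and New_Salary come from the line above.

```sql
-- Plain comparison: any row where either column is NULL evaluates
-- unknown and is silently left out of the result.
SELECT Employee_Number, Old_Salary, New_Salary
FROM Salary_History
WHERE Old_Salary <> New_Salary;

-- Same business question, but also returning rows where exactly one
-- of the two salaries is unknown.
SELECT Employee_Number, Old_Salary, New_Salary
FROM Salary_History
WHERE Old_Salary <> New_Salary
   OR (Old_Salary IS NULL AND New_Salary IS NOT NULL)
   OR (Old_Salary IS NOT NULL AND New_Salary IS NULL);
```

Whether the rows with an unknown salary belong in the answer is a business decision; the point is that the first request excludes them without any warning.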

Operators and NULL Processing Page 5-7


What Gets Returned
The facing page discusses what amounts to “truth table logic”. With normal “binary” logic
we only have to consider “true” or “false” evaluations. With relational databases there are three
possibilities to consider: “true”, “false”, and “unknown”. Any time a null is involved in a
comparison of any kind, the result is unknown. This also means that null can never equal itself
(evaluate true) since an unknown can never be equal to another unknown.

In relational databases, to say that a row wasn’t projected because the condition evaluated false is
incorrect. We say that it didn’t project because it didn’t evaluate true. This is a significant
statement! It will cause many a quandary with respect to writing bug-free SQL.

Page 5-8 Operators and NULL Processing


What Gets Returned

In SQL, the database only projects column values for those rows that contain data
that satisfy all of the WHERE conditions.
If columns for a row are not projected, it is not because the conditions evaluated
FALSE, but instead because they did not evaluate TRUE.
Observe:

For a row where the C1 value is 'X', the following “T/F/Unknown” evaluations occur.

Condition            Evaluation   Outcome
WHERE C1 = 'X'       T            row projected (true)
WHERE C1 = 'Y'       F            row not projected (not true)
WHERE C1 = NULL      Unknown      row not projected (not true)
WHERE C1 <> 'X'      F            row not projected (not true)
WHERE C1 <> 'Y'      T            row projected (true)
WHERE C1 <> NULL     Unknown      row not projected (not true)
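
One practical consequence, sketched below under the assumption of an Employee table with a nullable Dept# column (the abbreviated column name used on the slides in this module): a condition and its opposite do not, between them, cover every row. A third request is needed for the rows where both evaluate unknown.

```sql
-- 1. Rows where the condition evaluates TRUE.
SELECT * FROM Employee WHERE Dept# = 401;

-- 2. Rows where the opposite condition evaluates TRUE.
SELECT * FROM Employee WHERE Dept# <> 401;

-- 3. Rows where both of the conditions above evaluate unknown.
SELECT * FROM Employee WHERE Dept# IS NULL;
```

Each row of the table is returned by exactly one of the three requests.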

Operators and NULL Processing Page 5-9


NULL and the Business Question
One of the trickiest things to learn about null values is how to think of them in the
business environment. Null means unknown! No matter how you phrase it (not known,
unknown), the situations in which nulls occur should become obvious. The main thing to
remember is that there is a difference between what is unknown to you and what is actually
null in the table for the column.

Page 5-10 Operators and NULL Processing


NULL and the Business Question

Employee Table (note the addition of two nulls in the table!)

Last Name   First Name   Dept#
Brown       Alan         401
Brown       Allen        801
Brown       ?            567
Smith       Mary         ?
Jones       Jimmy        401

Retrieve employees whose first name is not known.

SELECT *
FROM Employee
WHERE First_Name IS NULL;

last_name   first_name  dept#
----------  ----------  -----------
Brown       ?           567

Retrieve employees whose first name is unknown, or work in department numbers less than 500.

SELECT *
FROM Employee
WHERE Dept# < 500
OR First_Name IS NULL;

last_name   first_name  dept#
----------  ----------  -----------
Jones       Jimmy       401
Brown       Alan        401
Brown       ?           567

SELECT *
FROM Employee
WHERE Dept# = NULL;

*** Failure 3731 The user must use IS NULL or IS NOT NULL to test for NULL values.

Operators and NULL Processing Page 5-11


NOT NULL and the Business Question
Here we discuss the concept of NOT NULL. Not null means that there is actually some value
present or that the value is a known quantity or entity. It is incorrect to use a comparison
operator to check for an unknown, since comparing anything to an unknown can never result
in either true or false.

Page 5-12 Operators and NULL Processing


NOT NULL and the Business Question

Retrieve all employees who have first names.

Employee Table

Last Name   First Name   Dept#
Brown       Alan         401
Brown       Allen        801
Brown       ?            567
Smith       Mary         ?
Jones       Jimmy        401

SELECT *
FROM Employee
WHERE First_Name IS NOT NULL;

last_name   first_name  dept#
----------  ----------  -----------
Smith       Mary        ?
Brown       Alan        401
Jones       Jimmy       401
Brown       Allen       801

SELECT *
FROM Employee
WHERE First_Name <> NULL;

*** Failure 3731 The user must use IS NULL or IS NOT NULL to test for NULL values.

Operators and NULL Processing Page 5-13


Negating Conditions and Operators
The difference between negating a condition vs. an operator is shown on the facing page. The
database will usually rewrite WHERE Dept# <> 401 to the expression WHERE NOT (Dept# =
401), so there is no difference, from a performance viewpoint, which method you choose. When
writing more involved SQL requests, beyond the scope of this class, you may find it safer to
negate an entire conditional expression rather than to negate each operator involved in the
conditional expression.

Page 5-14 Operators and NULL Processing


Negating Conditions and Operators

To retrieve employees other than those in department number 401,
you can perform either of the following:

Negate the operator -

SELECT *
FROM Employee
WHERE Dept# <> 401

Or negate the condition -

SELECT *
FROM Employee
WHERE NOT (Dept# = 401);

Employee Table

Last Name   First Name   Dept#   = 401   NE 401
Brown       Alan         401     T       F   Not True
Brown       Allen        801     F       T
Brown       ?            567     F       T
Smith       Mary         ?       ?       ?   Not True
Jones       Jimmy        401     T       F   Not True

last_name   first_name  dept#
----------  ----------  -----------
Brown       Allen       801
Brown       ?           567
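
A short sketch of why negating a whole condition can be the safer habit once more than one comparison is involved. The pairing of Dept# with First_Name is an illustrative assumption; the table is the Employee table shown on this page.

```sql
-- Negating the entire condition: the intent is plain to read.
SELECT *
FROM Employee
WHERE NOT (Dept# = 401 AND First_Name = 'alan');

-- Negating each operator instead: the AND must also become an OR
-- for the two requests to mean the same thing, which is easy to overlook.
SELECT *
FROM Employee
WHERE Dept# <> 401
   OR First_Name <> 'alan';
```

In neither form is a row returned when the overall condition evaluates unknown.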

Operators and NULL Processing Page 5-15


The IN Operator and NULL
We will now begin tackling some more complex issues surrounding nulls, beginning with
IN-list processing. The wording of the following paragraph is carefully crafted to help make
the concepts on a later page easier to understand.

For the conditional expression on the facing page, it is important to note that the EXPLAIN of
the request shows how a null comparison, via “=”, can be introduced into a request, even though
we are unable to write one in SQL directly. Here, the values for the column First_Name include
a null, so we have effectively introduced something we cannot type in via the syntax, namely
WHERE First_Name = NULL (NULL being introduced from the table itself). For instance:

First_Name = 'alan'

Becomes –

NULL = 'alan' (or 'alan' = NULL if this looks more recognizable to you)

No row having a null first name can ever be projected since a null, in a conditional
expression, can never evaluate true.

Page 5-16 Operators and NULL Processing


The IN Operator and NULL

Retrieve employees whose first names are in the following list (as a set).

SELECT *
FROM Employee
WHERE First_Name IN ('alan', 'allen');

Employee Table

Last Name   First Name   Dept#   =
Brown       Alan         401     T
Brown       Allen        801     T
Brown       ?            567     ?   Not True
Smith       Mary         ?       F   Not True
Jones       Jimmy        401     F   Not True

last_name   first_name  dept#
----------  ----------  -----------
Brown       Alan        401
Brown       Allen       801

The rewrite of this query shows an alternate method for writing this.

EXPLAIN
SELECT *
FROM Employee
WHERE First_Name
IN ('alan', 'allen');

. . . .
3) We do an all-AMPs RETRIEVE step from DLM.Employee by way of an
all-rows scan with a condition of ("(DLM.Employee.first_name =
'allen') OR (DLM.Employee.first_name = 'alan')") into Spool 1
(group_amps), which is built locally on the AMPs. The size of
Spool 1 is estimated with no confidence to be 7 rows (595 bytes).
The estimated time for this step is 0.02 seconds.
. . . .

Operators and NULL Processing Page 5-17


The NOT IN Operator and NULL
Just as with IN, NOT IN presents similar issues.

Page 5-18 Operators and NULL Processing


The NOT IN Operator and NULL

Retrieve employees whose first names are NOT IN the
following list (as a set).

SELECT *
FROM Employee
WHERE First_Name NOT IN ('alan', 'allen');

Employee Table

Last Name   First Name   Dept#   IN   NOT IN
Brown       Alan         401     T    F   Not True
Brown       Allen        801     T    F   Not True
Brown       ?            567     ?    ?   Not True
Smith       Mary         ?       F    T
Jones       Jimmy        401     F    T

last_name   first_name  dept#
----------  ----------  -----------
Smith       Mary        ?
Jones       Jimmy       401

The EXPLAIN of this query shows an alternate method for writing this.

. . . .
3) We do an all-AMPs RETRIEVE step from DLM.Employee by way
of an all-rows scan with a condition of ("(DLM.Employee.first_name
<> 'alan') AND (DLM.Employee.first_name <> 'allen')") into Spool 1
(group_amps), which is built locally on the AMPs. The size of
Spool 1 is estimated with no confidence to be 20 rows (1,700
bytes). The estimated time for this step is 0.02 seconds.
. . . .

Operators and NULL Processing Page 5-19


NULL Literal in an IN-List
Moving on to more challenging concepts, the facing page shows why one should not include a
null in an IN-list. Note how the “= NULL” condition gets inserted into the conditional
expression by the optimizer. This is something that we cannot do ourselves. Recall from the
immediately preceding pages that this is treated exactly as a condition like the following
would be:

WHERE Last_Name = First_Name

Since the value for either of these could be null, this syntax could actually end up comparing a
null to a value or to another null. The database does not know if this could happen until it does.
That’s a whole different issue than this:

WHERE Last_Name = NULL

which we can’t write.

So the database includes a comparison to null in the IN-list, as is standard.

What happens, then, is that we will never return the row for Mary Smith since the condition “null
= null” can never be true.

Page 5-20 Operators and NULL Processing


NULL Literal in an IN-List

Recall the difference between comparing two columns
where one or both may contain a null (e.g., WHERE Col1
= Col2) vs. asking if a column or expression is null (i.e.,
WHERE Col1 IS NULL).

SELECT *
FROM Employee
WHERE Dept# IN (401, 403, null);

Employee Table

Last Name   First Name   Dept#
Brown       Alan         401
Brown       Allen        801
Brown       ?            567
Smith       Mary         ?       Not included in the result.
Jones       Jimmy        401

last_name   first_name  dept#
----------  ----------  -----------
Jones       Jimmy       401
Brown       Alan        401

The EXPLAIN shows how the database correctly treats the NULL literal as a comparison.

Explanation
-------------------------------------------------------------------------------------------------
. . . .
3) We do an all-AMPs RETRIEVE step from DLM.Employee by way
of an all-rows scan with a condition of (
"(DLM.Employee.dept# = 403) OR
((DLM.Employee.dept# = 401) OR
(DLM.Employee.dept# = NULL ))")
into Spool 1 (group_amps), which is built locally on the AMPs.
The size of Spool 1 is estimated with no confidence to be 4

Operators and NULL Processing Page 5-21


Including NULL to an IN-List
Continuing our train of thought from the immediately preceding page, the facing page shows that
if we wish to include null departments in the condition, we must do so separately, as a
condition of its own, as shown. Note how the EXPLAIN now contains a condition (linked via OR)
that asks whether the department number is null, rather than using a comparison.

Page 5-22 Operators and NULL Processing


Including NULL to an IN-List

Note the difference between this EXPLAIN and that for the
query on the previous page.

SELECT *
FROM Employee
WHERE Dept# IS NULL
OR Dept# IN (401, 403);

Employee Table

Last Name   First Name   Dept#
Brown       Alan         401
Brown       Allen        801
Brown       ?            567
Smith       Mary         ?
Jones       Jimmy        401

last_name   first_name  dept#
----------  ----------  -----------
Smith       Mary        ?
Brown       Alan        401
Jones       Jimmy       401

Explanation
---------------------------------------------------------------------------------------------------------------------
. . . .
3) We do an all-AMPs RETRIEVE step from DLM.Employee by way of an
all-rows scan with a condition of (
"(DLM.Employee.department_number = 403) OR
((DLM.Employee.department_number = 401) OR
(DLM.Employee.department_number IS NULL ))")     Note: “IS NULL”
into Spool 1 (group_amps), which is built locally on the AMPs.
The size of Spool 1 is estimated with no confidence to be 9 rows (1,323 bytes).

Operators and NULL Processing Page 5-23


NULL Literal in a NOT IN-List
All of the immediately preceding pages have led up to the very important concept shown on this
page!

Let’s examine the conditional expression to see why Allen Brown, in department 801, was not
returned by reviewing the logic in the EXPLAIN.

WHERE Dept# <> NULL -- 801 <> NULL evaluates unknown and
AND Dept# <> 401 -- not true, so the next 2 conditions are
AND Dept# <> 403 -- rendered irrelevant.

For Allen Brown, 801 <> NULL evaluates as unknown and not true. Since all conditions are
linked with AND, all of them must evaluate as true for Allen Brown to be returned. The same
logic applies to each and every row. The result → No rows found.

To resolve this issue, simply remove NULL from the list. There is no reason for it to be there
anyway, is there? So why discuss it at all? Wait until we get to the module on Subqueries later
in the course!

Page 5-24 Operators and NULL Processing


NULL Literal in a NOT IN-List

This page is the reason for the immediately preceding pages.

No rows are returned due to the result of everything being linked with AND together
with the <> NULL condition never being able to be true.

SELECT *
FROM Employee
WHERE Dept# NOT IN
(401, 403, null);

*** Query completed.
No rows found.

Looking at the truth table:

Last Name   First Name   Dept#   <> 401   <> 403   <> NULL   T/F/?
Brown       Alan         401     F                           F
Brown       Allen        801     T        T        ?         ?
Brown       ?            567     T        T        ?         ?
Smith       Mary         ?       ?        ?        ?         ?
Jones       Jimmy        401     F                           F

The EXPLAIN shows how the database correctly treats the NULL literal as a comparison.

Partial Explanation
---------------------------------------------------------------------------------------------------------------------------------------------
3) We do an all-AMPs RETRIEVE step from DLM.Employee by way of an all-rows scan with a
condition of ("(DLM.Employee.dept# <> NULL) AND ((DLM.Employee.dept# <> 401) AND
(DLM.Employee.dept# <> 403 ))") into Spool 1 (group_amps), which is built locally on the AMPs.
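
Nobody types NULL into a NOT IN list on purpose, but a list produced by a subquery can contain one without anyone noticing. The sketch below previews that situation. It assumes a hypothetical Department table whose Dept# column may contain nulls; subqueries themselves are covered later in the course.

```sql
-- If any Dept# returned by the subquery is NULL, the NOT IN condition
-- can never evaluate TRUE, and no rows are found, for the same reason
-- as with NOT IN (401, 403, NULL).
SELECT *
FROM Employee
WHERE Dept# NOT IN
      (SELECT Dept#
       FROM Department);

-- Filtering the nulls out of the subquery avoids the problem.
SELECT *
FROM Employee
WHERE Dept# NOT IN
      (SELECT Dept#
       FROM Department
       WHERE Dept# IS NOT NULL);
```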

Operators and NULL Processing Page 5-25


Three Versions of NOT IN
Each of the three examples on the facing page is equivalent and will yield the very same
EXPLAIN plan.

Page 5-26 Operators and NULL Processing


Three Versions of NOT IN

The following 3 queries return the same result.

SELECT *
FROM Employee
WHERE First_Name NOT IN ('alan', 'allen');

SELECT *
FROM Employee
WHERE First_Name <> 'alan'
AND First_Name <> 'allen';

SELECT *
FROM Employee
WHERE NOT (First_Name = 'alan'
OR First_Name = 'allen');

All three return the explain shown below.

Explanation
---------------------------------------------------------------------------
. . .
3) We do an all-AMPs RETRIEVE step from DLM.Employee by way of an
all-rows scan with a condition of ("(DLM.Employee.first_name <>
'allen ') AND (DLM.Employee.first_name <> 'alan ')") into Spool 1
(group_amps), which is built locally on the AMPs. The size of
Spool 1 is estimated with no confidence to be 10 rows (520 bytes).
The estimated time for this step is 0.02 seconds.
. . .

Operators and NULL Processing Page 5-27


Module 5: Review Questions
Check your understanding of the concepts discussed in this module by completing the review
questions as directed by your instructor.

Page 5-28 Operators and NULL Processing


Module 5: Review Questions

True or False:

1. NULLs are always displayed as “?”.
   False

2. NULL is a data type.
   False

3. A NULL is treated like a zero (0) or a space (' ').
   False

4. NULLs involved in computations always return NULL.
   True

5. NULLs used in comparisons always return unknown.
   True

6. You can include NULL inside an IN or NOT IN list.
   True – but it may not return the desired result.

Operators and NULL Processing Page 5-29


Module 5: Summary
A summary of this module is discussed.

Page 5-30 Operators and NULL Processing


Module 5: Summary

• Describe what NULLs are and how they affect query results.
• Incorporate NULL syntax into queries correctly.
• Identify issues introduced by adding NULL into WHERE conditions.

Operators and NULL Processing Page 5-31


Module 5: Lab Exercise
Check your understanding of the concepts discussed in this module by completing the lab
exercise as directed by your instructor.

Page 5-32 Operators and NULL Processing


Module 5: Lab Exercise

1. Request separate reports of employees who have not been assigned
   to a department, then those who have not been given a job code.

2. Using an IN list, display employees with any of the following job codes:
   412101, 412109, NULL.

3. Rewrite #2 using all OR’ed conditions.

4. List employees with unassigned job codes who have salaries between
   30K and 40K.

Operators and NULL Processing Page 5-33


Notes:

Page 5-34 Operators and NULL Processing


Module 6

Data Types and Functionality

After completing this module, you will be able to:


• Identify various data types for table columns.
• Determine various effects of trailing spaces on character values.
• Use CAST to convert from one data type to another.
• Use FORMAT to display results in a more desirable form.
• Perform various kinds of arithmetic on numeric data.
• Use functions like
ABS(arg), EXP(arg), LOG(arg), LN(arg), SQRT(arg).
• Use various formatting options on date fields.
• Concatenate fields together.

Data Types and Functionality Page 6-1


Notes:

Page 6-2 Data Types and Functionality


Table of Contents
Character Data Types ................................................................................................................... 6-4
Character Functionality ................................................................................................................ 6-6
BETWEEN Functionality with CHARACTER ........................................................................... 6-8
Integer Data Types ..................................................................................................................... 6-10
Decimal Data Types ................................................................................................................... 6-12
Float Data Type .......................................................................................................................... 6-14
Byte Data Types ......................................................................................................................... 6-16
Date Data Type .......................................................................................................................... 6-18
ARRAY Data Type .................................................................................................................... 6-20
NUMBER Data Type ................................................................................................................. 6-22
Arithmetic Operators.................................................................................................................. 6-24
Arithmetic and Derived Values .................................................................................................. 6-26
DATE Arithmetic ....................................................................................................................... 6-28
Data Type Conversions Using CAST ........................................................................................ 6-30
Data Type Conversions and Rounding ...................................................................................... 6-32
Concatenating Data Types ......................................................................................................... 6-34
Concatenated Example Results .................................................................................................. 6-36
Concatenated Example Results (continued)............................................................................... 6-38
FORMAT ................................................................................................................................... 6-40
SQL Assistant Methods for FORMAT ...................................................................................... 6-42
SQL Assistant Formatting Examples ......................................................................................... 6-44
Year, Month and Day Formatting Options ................................................................................ 6-46
Module 6: Summary................................................................................................................... 6-48
Module 6: Review Questions ..................................................................................................... 6-50
Module 6: Lab Exercise ............................................................................................................. 6-52

Data Types and Functionality Page 6-3


Character Data Types
In addition to those character data types shown, LONG VARCHAR is also available.

In general, CHARACTER, VARCHAR, LONG VARCHAR, and CHARACTER LARGE
OBJECT (CLOB) data types represent character data.

Character data is automatically translated between the client and the database. Its form-of-use
is determined by the client character set or session character set. The form of character data internal to
Teradata Database is determined by the server character set attribute of the column.

Fixed character columns will always occupy the number of characters defined by the data type
(usually padded with spaces), whereas variable character columns will contain only the number
of characters for the stored value. Space is a valid character and will count as one of the
characters whether trailing, leading, or anywhere in the value.

Page 6-4 Data Types and Functionality


CHARACTER Data Types

There are three basic CHARACTER data types.

• CHARACTER(n) or CHAR(n) (where “n” is the number of characters)
  Fixed length, left justified, right padded with spaces to fill the length.
  Example: Last_Name, below, is always 20 characters and 20 bytes of storage.

• VARCHAR(n) (where “n” is the maximum number of characters)
  Variable length. Spaces count as valid characters.
  The number of bytes for storage varies according to the value.
  Examples: 'this value' has 10 characters and takes 10 bytes of storage.
            'this value   ' (three trailing spaces) has 13 characters and takes 13 bytes of storage.

• CLOB(n[K|M|G]) (Character Large OBject – Kilobytes; Megabytes; Gigabytes)
  Length of up to 2GB in size.
  Examples: CLOB(3200) – CLOB(32K) – CLOB(32M) – CLOB(2G)

CREATE TABLE Data_Types
(Last_Name CHAR(20),
 First_Name VARCHAR(20),
 Thesis CLOB(2M) );
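
As a rough sketch of the fixed versus variable length behavior, the request below inserts one row into the Data_Types table defined above and asks for the length of each value. The sample names are invented, and CHARACTER_LENGTH is used on the assumption that it reports the number of characters held in the column.

```sql
-- Thesis is not supplied, so it is NULL for this row.
INSERT INTO Data_Types (Last_Name, First_Name)
VALUES ('Ryan', 'Loretta');

-- Last_Name is CHAR(20): the length is expected to be 20,
-- because the stored value is padded with trailing spaces.
-- First_Name is VARCHAR(20): the length is expected to be 7,
-- the number of characters actually stored.
SELECT CHARACTER_LENGTH(Last_Name)  AS Last_Name_Length,
       CHARACTER_LENGTH(First_Name) AS First_Name_Length
FROM Data_Types;
```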

Data Types and Functionality Page 6-5


Character Functionality
The facing page shows how trailing spaces do not change the value of a character string. This
makes sense given that, when comparing two character values, the database pads the shorter
one with spaces until both are the same length.

The answers to the list are:

1. 'abc' = 'ABC' → true
2. '190' = '190 ' → true
3. 'abc ' = ' abc ' → false
4. ' 190' = '190 ' → false
5. ' ' = ' ' → true
6. '' = ' ' → true
7. '' = null → unknown (not true)

Page 6-6 Data Types and Functionality


Character Functionality

When comparing two character strings of different lengths:

1. The shorter string gets padded to the right with spaces, making them the same
   length.
2. The database then compares the values.

In Teradata, trailing spaces do not change the value of a character field.

Recall that, by default, Teradata is not case sensitive.

Determine whether or not these conditional expressions evaluate true.

1. 'abc' = 'ABC'       T
2. '190' = '190 '      T
3. 'abc ' = ' abc '    F
4. ' 190' = '190 '     F
5. ' ' = ' '           T
6. '' = ' '            T   ('' is two single quotes, i.e. a zero-length string)
7. '' = null           F   ('' is two single quotes, i.e. a zero-length string)
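
One way to try a few of these comparisons yourself is sketched below. It assumes a Teradata-mode session (not case specific) and uses a CASE expression, which is not covered in this module, purely as a device for displaying the outcome. Item 7 is left out because a comparison to the NULL literal is rejected with Failure 3731, as shown in Module 5.

```sql
SELECT CASE WHEN 'abc' = 'ABC'  THEN 'true' ELSE 'not true' END AS Test_1,
       CASE WHEN '190' = '190 ' THEN 'true' ELSE 'not true' END AS Test_2,
       CASE WHEN ' 190' = '190 ' THEN 'true' ELSE 'not true' END AS Test_4,
       CASE WHEN '' = ' '       THEN 'true' ELSE 'not true' END AS Test_6;
```

The results should line up with the answers listed above; in an ANSI-mode session, where comparisons are case specific, Test_1 would be expected to differ.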

Data Types and Functionality Page 6-7


BETWEEN Functionality with CHARACTER
Although the discussion on the facing page concerns BETWEEN, any inequality works the same way.
For instance, the following condition returns both “Ryan” and “Stein”, along with every other
last name that sorts after 'r', because any last name that begins with “R” and has at least one
more letter also satisfies the “>” condition → WHERE Last_Name > 'r'
As noted on the facing page, case sensitivity plays a large role when qualifying any character
value. Since Teradata mode is not case sensitive (not case specific) by default, the sort sequence
looks like the following. Recall that a space sorts before numeric or character values.
Aa
.
. Upper and Lower Case
.
Zz
You can change the case sensitivity when defining the table by adding “CASE SPECIFIC” to the
definition of the character column.
In ANSI mode, the default for character columns is to be case specific, in which case the sort
sequence is the one below.
A
.
. Upper Case
.
Z
.
.
.
a
. Lower Case
.
.
z
Note that the complete sort sequence, with respect to case sensitivity, was shown in an earlier
module. Fortunately, inequalities are used mostly on numeric data values.

Page 6-8 Data Types and Functionality


BETWEEN Functionality with CHARACTER
BETWEEN can be used with character data.
Consider what happens when case sensitivity gets involved
(Last_Name is defined as CHAR(20))

WHERE Last_Name BETWEEN 'r' AND 's'

If Not Case Specific: (underscore used to represent a space)

'r _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _'   <- BETWEEN this
'R y a n _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _'
's _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _'   <- AND this
'S t e i n _ _ _ _ _ _ _ _ _ _ _ _ _ _ _'

If Case Specific: No rows returned. Why?

'R _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _'
'R y a n _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _'
'S _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _'
'S t e i n _ _ _ _ _ _ _ _ _ _ _ _ _ _ _'
. . .
'r _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _'   <- BETWEEN this
's _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _'   <- AND this
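
The two cases can be reproduced with a small Python model. It is a sketch under stated
assumptions: the function name td_between is hypothetical, values are padded to CHAR(20), the
non case specific comparison is modeled by upper-casing both sides, and the case specific
comparison uses plain ASCII order (all upper case letters before all lower case letters), as in
the sort sequence shown in the notes.

```python
def td_between(value: str, low: str, high: str,
               width: int = 20, case_specific: bool = False) -> bool:
    """Model 'value BETWEEN low AND high' for a CHAR(width) column."""
    def sort_key(text: str) -> str:
        padded = text.ljust(width)          # blank-pad to the column width
        return padded if case_specific else padded.upper()
    return sort_key(low) <= sort_key(value) <= sort_key(high)


last_names = ["Ryan", "Stein", "Rogers", "Short"]

not_case_specific = [n for n in last_names if td_between(n, "r", "s")]
case_specific = [n for n in last_names
                 if td_between(n, "r", "s", case_specific=True)]

print(not_case_specific)
print(case_specific)
```

In the first list only the names beginning with "R" qualify: "Stein" and "Short" sort after 's'
followed by spaces, because a space sorts before any letter. The second list is empty, which
answers the "Why?" on the page: every name starts with an upper case letter, and in a case
specific sequence all of those sort before the lower case 'r'.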

Data Types and Functionality Page 6-9


Integer Data Types
Integer data types store whole numbers. The different integer data types exist to provide
different storage sizes. Of these, BYTEINT is a Teradata extension to the ANSI standard. It
occupies only a single byte of storage.

Although there is no real need to memorize the ranges of these different types, knowing what
they are will help with understanding certain types of data type conversions beyond this module.
We will discuss some conversions later in this module.

Page 6-10 Data Types and Functionality


Integer Data Types
BYTEINT
Teradata extension.
1 byte of storage
Range: -128 to +127
Characters needed to display: 4

SMALLINT
ANSI Standard
2 bytes of storage
Range: -32,768 to +32,767
Characters needed to display: 6

INTEGER (INT)
ANSI Standard
4 bytes of storage
Range: -2,147,483,648 to +2,147,483,647
Characters needed to display: 11

BIGINT
ANSI Standard
8 bytes of storage
Range: -9,223,372,036,854,775,808 to +9,223,372,036,854,775,807
Characters needed to display: 20

Examples of the data type in a CREATE TABLE

CREATE TABLE Data_Types
(Last_Name CHAR(20),
First_Name VARCHAR(20),
Thesis CLOB(2M),
Years_Employed BYTEINT,
Employee_Age SMALLINT,
Employee_Number INT,
Population BIGINT );
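
The ranges and display widths above do not need to be memorized, because they follow directly
from the number of bytes of storage. The Python sketch below derives them, assuming ordinary
two's-complement signed integers; the helper names are hypothetical.

```python
INTEGER_TYPES = {"BYTEINT": 1, "SMALLINT": 2, "INTEGER": 4, "BIGINT": 8}


def signed_range(num_bytes: int) -> tuple:
    """Smallest and largest values of a signed integer of the given size."""
    bits = 8 * num_bytes
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def display_width(num_bytes: int) -> int:
    """Characters needed to display any value: the digits plus a sign."""
    lowest, _ = signed_range(num_bytes)
    return len(str(lowest))


for name, size in INTEGER_TYPES.items():
    lowest, highest = signed_range(size)
    print(f"{name:<8} {size} byte(s)  {lowest:,} to {highest:,}  "
          f"width {display_width(size)}")
```

The display widths (4, 6, 11, and 20 characters) matter later in this module, when numbers are
converted to character data.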

Data Types and Functionality Page 6-11


Decimal Data Types
Decimal data types, as expected, hold numbers that have a fractional part. It is important to
understand that the first number in the definition (‘m’) is the total number of digits that the
data type will support, while the second number (‘n’) says how many of those ‘m’ digits are
fractional.

The amount of storage required depends, again, on the size (number of digits) of the decimal.
The facing page relates the number of digits in a decimal to the number of digits in an integer
to help determine storage requirements. If you disregard the decimal point and consider only the
value of ‘m’, the storage required is the same as for the smallest integer type that can hold
any value of that decimal type once the decimal point is removed.

Page 6-12 Data Types and Functionality


Decimal Data Types

DECIMAL(m,n) or DEC(m,n) or NUMERIC(m,n)

Where: “m” is the total number of digits. (maximum value for “m” is 38)
“n” is the number of decimal digits. (<= “m”)

Storage Requirements

When value of “m” is: Then:
1 or 2 (fits into BYTEINT) Uses 1 byte of storage.
3 or 4 (fits into SMALLINT) Uses 2 bytes of storage.
5 to 9 (fits into INTEGER) Uses 4 bytes of storage.
10 to 18 (fits into BIGINT) Uses 8 bytes of storage.
19 to 38 Uses 16 bytes of storage.

CREATE TABLE Data_Types


(Last_Name CHAR(20),
First_Name VARCHAR(20),
Thesis CLOB(2M),
Years_Employed BYTEINT,
Employee_Age SMALLINT,
Employee_Number INT,
Population BIGINT,
Salary_Amount DEC(18,2) );
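
The storage rule can be written as a small lookup. The Python sketch below is only an
illustration of the table on this page; the function names are hypothetical, and the 19 to 38
digit band follows from the fact that a 19-digit value no longer fits into a BIGINT.

```python
def decimal_storage_bytes(m: int) -> int:
    """Bytes of storage for DECIMAL(m, n), based on total digits m only."""
    if not 1 <= m <= 38:
        raise ValueError("m must be between 1 and 38")
    # (largest m in the band, bytes used)
    for max_digits, size in ((2, 1), (4, 2), (9, 4), (18, 8), (38, 16)):
        if m <= max_digits:
            return size


def split_digits(m: int, n: int) -> tuple:
    """Digits left and right of the decimal point for DECIMAL(m, n)."""
    if not 0 <= n <= m:
        raise ValueError("n must be between 0 and m")
    return m - n, n


# Salary_Amount DEC(18,2) from the CREATE TABLE above
print(decimal_storage_bytes(18), split_digits(18, 2))
```

For Salary_Amount DEC(18,2) this reports 8 bytes, with 16 digits to the left of the decimal
point and 2 to the right.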

Data Types and Functionality Page 6-13


Float Data Type
Float data types can be used to represent data values that are beyond the ranges of the other
numeric data types. Note, however, that they have only 15 digits of precision compared to the
integer and decimal data types.

Page 6-14 Data Types and Functionality


Float Data Type

FLOAT
Number in the range of 2 × 10^-307 to 2 × 10^+308
Can be used to represent a very large number, but with only 15 digits of precision.

Example: 4.35400000000000E-001
= 4.35400000000000 × 10^-1
= 4.354 × 0.1
= 0.4354

CREATE TABLE Data_Types


(Last_Name CHAR(20),
First_Name VARCHAR(20),
Thesis CLOB(2M),
Years_Employed BYTEINT,
Employee_Age SMALLINT,
Employee_Number INT,
Population BIGINT,
Salary_Amount DEC(18,2),
Bigger_Than_BIGINT FLOAT );
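
The 15-digit limit is easy to see in any language that uses 8-byte floating point numbers. The
Python sketch below assumes that FLOAT behaves like an IEEE 64-bit double, which is what Python's
float type is on common platforms; it illustrates the idea and is not output from the database.

```python
import sys

# The example value from the page, written in exponent notation.
value = float("4.35400000000000E-001")
print(value)

# Decimal digits that are guaranteed to survive in a 64-bit float.
print(sys.float_info.dig)

# A whole number that fits easily into a BIGINT ...
big = 9007199254740993        # 2**53 + 1, a 16-digit number
# ... cannot be told apart from its neighbor once it is stored as a float.
print(float(big) == float(big - 1))
```

The last line prints True: two different 16-digit integers collapse into the same FLOAT value.
This is the trade-off behind a column such as Bigger_Than_BIGINT. The range is far larger, but
exactness is lost beyond about 15 digits.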

Data Types and Functionality Page 6-15


Byte Data Types
The BYTE data type is not translated into ASCII by the database but is, rather, stored “as is”
from the client or host source. This data type has been available since the earliest days of the
product. The BLOB data type is simply a much larger BYTE type (or, put the other way, BYTE is a
smaller BLOB).

BLOBs (Binary Large Objects) and CLOBs (Character Large Objects) are each a different form
of LOB (Large Object).

Page 6-16 Data Types and Functionality


Byte Data Types

These are:
• Never translated by the Teradata Database
• Handled as if they were n-byte, unsigned binary integers
• Suitable for digitized image information (BLOB)

BYTE(n)
VARBYTE(n)
Where “n” = Number of bytes between 1 and 64,000.
(These two are Teradata extensions to the ANSI syntax)

BLOB(n[K|M|G])
(Binary Large OBject)
Fixed length up to 2GB in size.
Examples:
BLOB(3200)
BLOB(32K)
BLOB(32M)
BLOB(2G)

CREATE TABLE Data_Types
(Last_Name CHAR(20),
First_Name VARCHAR(20),
Thesis CLOB(2M),
Years_Employed BYTEINT,
Employee_Age SMALLINT,
Employee_Number INT,
Population BIGINT,
Salary_Amount DEC(18,2),
Bigger_Than_BIGINT FLOAT,
Img001 BYTE(32000),
Img002 VARBYTE(64000),
Img100 BLOB(300M));

Data Types and Functionality Page 6-17


Date Data Type
DATE data types are a special type of INTEGER data type. The difference between the two is
the type of arithmetic that the database performs on each. For column C1 (below), the database
performs integer arithmetic. So if column C1 = 991231, then C1 + 1 is 991232.

C1 INT -- 991231 +1 = 991232

For column C2 (below), the data type is DATE, which is also stored as an integer. The database
will perform DATE arithmetic on this column. It happens that the integer number 991231
(earlier paragraph) also represents a valid date (1999-12-31), however the integer representation
for January 1st of 2000 (The day after 1999-12-31 being 2000-01-01) is 1000101 and not 991232.
In other words, the year goes from 99 to 100.

C2 DATE -- 991231 +1 = 1000101

One way to determine whether an integer represents a valid date is to add nineteen million
(19,000,000) to it using normal (integer) arithmetic. If the resulting number reads as a valid
date in the format YYYYMMDD, then the integer represents that date. For instance:

101 + 19000000 = 19000101 Thus, 101 represents this valid date.


100 + 19000000 = 19000100 Thus, 100 represents an invalid date.
1080229 + 19000000 = 20080229 Thus, 1080229 represents this valid date.
-1239296 + 19000000 = 17760704 Thus, -1239296 represents a valid date.
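
The same check can be written as a short Python sketch. It assumes the encoding implied by the
examples above, namely (year - 1900) * 10000 + month * 100 + day; the function names are
hypothetical, and Python's own date type is used only to decide whether a date exists.

```python
import datetime


def date_to_int(d: datetime.date) -> int:
    """Integer form of a date: (year - 1900) * 10000 + month * 100 + day."""
    return (d.year - 1900) * 10000 + d.month * 100 + d.day


def int_to_date(n: int) -> datetime.date:
    """Add 19,000,000 and read the result as YYYYMMDD."""
    yyyymmdd = n + 19000000
    year, rest = divmod(yyyymmdd, 10000)
    month, day = divmod(rest, 100)
    # Raises ValueError when the pieces do not form a real calendar date.
    return datetime.date(year, month, day)


def is_valid_date_int(n: int) -> bool:
    try:
        int_to_date(n)
        return True
    except ValueError:
        return False


last_day_of_1999 = datetime.date(1999, 12, 31)
next_day = last_day_of_1999 + datetime.timedelta(days=1)
print(date_to_int(last_day_of_1999), date_to_int(next_day))

for n in (101, 100, 1080229, -1239296):
    print(n, is_valid_date_int(n))
```

The first line of output shows the year rolling over from 99 to 100 (991231 becomes 1000101),
which is the difference between DATE arithmetic and integer arithmetic described above.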

The maximum date stored is ‘9999-12-31’.


The minimum date stored is ‘0001-01-01’.

When specifying a literal date, get into the habit of prefixing the character date (in the
format ‘YYYY-MM-DD’) with the DATE keyword, like this → DATE '2001-01-01'.

Page 6-18 Data Types and Functionality


Date Data Type

DATE
• Is stored internally as data type INTEGER (4 bytes of storage).
• Supports full date intelligent arithmetic.

ANSI Standard form for literal is DATE 'YYYY-MM-DD'

Maximum date is: DATE '9999-12-31'
Minimum date is: DATE '0001-01-01'

CREATE TABLE Data_Types
(Last_Name CHAR(20),
First_Name VARCHAR(20),
Thesis CLOB(2M),
Years_Employed BYTEINT,
Employee_Age SMALLINT,
Employee_Number INT,
Population BIGINT,
Salary_Amount DEC(18,2),
Bigger_Than_BIGINT FLOAT,
Img001 BYTE(32000),
Img002 VARBYTE(64000),
Img100 BLOB(300M),
BirthDate DATE);

Example of usage in a query:

SELECT *
FROM Data_Types
WHERE BirthDate < DATE '2000-01-01';

Data Types and Functionality Page 6-19


ARRAY Data Type
Teradata 14.0 provides support for the ARRAY data type. The Teradata ARRAY data type is a
user-defined type (UDT). This differs from the ANSI standard which does not consider an
ARRAY data type to be a UDT.

Teradata also supports an Oracle-compatible form of the ARRAY type called VARRAY. However,
unlike the Oracle VARRAY data type, the Teradata VARRAY type can be defined in multiple
dimensions.

The array data type is a user-defined type (UDT) with a fixed number of defined elements. It has
the following characteristics:
• An array data type is defined by the CREATE TYPE statement, like other UDTs
• It can be used as:
◦ A column of a table
◦ Parameter to a UDF/UDM/XSP/SP
◦ Local variable inside a SP
• All elements within the array have the same data type
• All elements default to an un-initialized state unless the DEFAULT NULL clause is
specified at creation time
• An array can be formed of a single dimension (1-D) or multiple dimensions(n-D)
• Overall size of an array is limited to ~64 KB of storage


Page 6-20 Data Types and Functionality


ARRAY Data Type

Teradata 14.0 provides support for the ARRAY data type.

The Teradata ARRAY data type is a user-defined type (UDT). This differs
from the ANSI standard which does not consider an ARRAY data type to
be a UDT.

It also supports an Oracle-compatible form of ARRAY type called VARRAY.

However, unlike the Oracle VARRAY data type, the Teradata VARRAY type can be defined in
multiple dimensions:

• One-Dimensional Array

• Multi-Dimensional Array

Data Types and Functionality Page 6-21


NUMBER Data Type
A new numeric data type called NUMBER can be used for a column, parameter, or member of a
database object. The NUMBER data type can store both fixed point and floating point decimal
numbers and is stored as a variable length field.

NUMBER data type provides the following business value:


• Provides a data type similar to Number data type of Oracle. Provides an easier transition
of Number data, during migration from Oracle to Teradata, by avoiding costly and
cumbersome process of adjusting precision and scale of number by intermediate third-
party tools.
• Allows increasing the precision and/or scale of a Number column without modification of
the rows.
• Provides more accuracy than existing numeric types while providing more flexibility than
existing Decimal and Integer types.
• Allows numbers to be stored in the range of ± [1E-130 to 9.99…9E125] as well as 0.
• Provides more accurate results, with over 38 digits of accuracy.
• Enables greater efficiency in storing numeric data because NUMBER is a variable-length
data type that can vary from 0 to 18 bytes, depending on the value stored.
• Provides a DBSControl flag RoundNumberAsDec which can be used to control the
rounding behavior of NUMBER.

Use the NUMBER data type when migrating from Oracle or another database that uses the
NUMBER data type. Teradata’s NUMBER data type is similar to Oracle’s NUMBER data type,
with minor exceptions. DB2 also supports Oracle’s NUMBER data type for compatibility reasons,
but stores it in a fixed length, as opposed to the variable length used by Oracle and Teradata.

The NUMBER data type is also useful in many other situations.

Page 6-22 Data Types and Functionality


NUMBER Data Type

NUMBER is a numeric data type that can be used as a column/parameter/member in the database
objects. NUMBER data type can store both fixed and floating point decimal numbers and is
stored as a variable length field.

NUMBER data type:


• Provides a data type similar to Number data type of Oracle.
• Provides an easier transition of Number data, during migration.
• Allows increasing the precision and/or scale of a Number column.
• Provides more accuracy than existing numeric types.
• Provides more flexibility than existing Decimal and Integer types.
• Allows numbers to be stored in the range of ± [1E-130 to 9.99…9E125]
as well as 0.
• Provides more accurate results, with over 38 digits of accuracy.
• Enables greater efficiency in storing numeric data because NUMBER is
a variable-length data type that can vary from 0 to 18 bytes, depending
on the value stored.

Data Types and Functionality Page 6-23


Arithmetic Operators
The arithmetic operators on the facing page may be used to perform various calculations. Note
that multiplication is performed by using the asterisk, also referred to as a “star” or a “splat” by
some.

5 * 4 = 20
2 ** 3 = 8
8 / 4 = 2
9 / 4 = 2 (throw away the remainder and keep the quotient)
9 MOD 4 = 1 (throw away the quotient and keep the remainder)
8.00 / 3 = 2.67 (decimal division)

The order of operations follows the rules of arithmetic.

400 / 25 * 2 - 10 + 3**2 = ((((400 / 25) * 2) - 10) + (3**2)) = 31
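
These results can be checked outside the database. The Python sketch below mirrors the examples
with the closest Python operators, which is an assumption of the illustration: // stands in for
integer division and % for MOD. Only positive operands are used, because Python's // rounds
toward negative infinity and may not match SQL integer division for negative values.

```python
from decimal import Decimal, ROUND_HALF_UP

print(5 * 4)     # multiplication
print(2 ** 3)    # exponentiation
print(9 // 4)    # integer division keeps the quotient
print(9 % 4)     # modulo keeps the remainder

# Decimal division, shown to two decimal places.
decimal_quotient = (Decimal("8.00") / 3).quantize(Decimal("0.01"),
                                                  rounding=ROUND_HALF_UP)
print(decimal_quotient)

# Order of operations: exponentiation first, then * and / from left to
# right, then + and - from left to right.
plain = 400 // 25 * 2 - 10 + 3 ** 2
grouped = ((((400 // 25) * 2) - 10) + (3 ** 2))
print(plain, grouped)
```

Both forms of the last expression give 31, confirming that the parentheses in the notes only
spell out the order the operators already follow.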

Arithmetic expressions may be used in projections or predicates (conditionals).

Page 6-24 Data Types and Functionality


Arithmetic Operators

The following arithmetic operations can appear as


• Projected Values
• Conditional Expressions

Operator Meaning
() Evaluated first
* Multiply
/ Divide
+ Add (positive value)
- Subtract (negative value)

Teradata Extensions:
Operator Meaning
** Exponentiation
MOD Modulo (remainder)

The order of operations for these is as follows.

• ()
• Exponentiation
• Multiplication, Division, and Modulo (left to right)
• Addition and Subtraction (left to right)

Data Types and Functionality Page 6-25


Arithmetic and Derived Values
The facing page shows more examples of how to perform arithmetic using combinations of
columns and literals, both in projections and in conditionals.

Page 6-26 Data Types and Functionality


Arithmetic and Derived Values

Examples using arithmetic expressions.

In a projection.
SELECT 5 + 2; → 7
SELECT 8.40 / 4.20; → 2.00
SELECT 10 - 7 * 8; → -46
SELECT 2**3 / 2 + 4; → 8.00000000000000E 000 (exponentiation causes float)
SELECT Salary_Amount * 1.25; → Give a 25% raise?
SELECT Budget_Amount - 1e2; → Subtract 100 from the budget amount
SELECT 1e6 - 100000; → 9.00000000000000E 005
SELECT 240378 MOD 100; → 78

In a conditional Expression.
SELECT * FROM Employee WHERE Salary_Amount * 1.25 > 200000;
SELECT Job_Code FROM Job WHERE Job_Code MOD 1000 = 125;

Data Types and Functionality Page 6-27


DATE Arithmetic
While we are discussing various forms of arithmetic, it might be a good idea to discuss date
arithmetic. An earlier page discussed the DATE data type and how it is stored and referenced
internally as an integer. Here we discuss various uses of arithmetic involving date values. It is
important to note how literal dates should be referenced.

Page 6-28 Data Types and Functionality


DATE Arithmetic

You can perform intelligent date arithmetic on any date value.

Find the date 40 days from today.
SEL DATE + 40; → Yields a date
Find the date 300 days from a specific date.
SELECT DATE'1999-01-01' + 300; → Yields a date
Find the number of days between the 2 dates.
SEL CURRENT_DATE - DATE'2007-11-30'; → Yields an integer of days
Find the number of days each employee has worked for the company (divide by the number of
days in a year to get years).
SEL CURRENT_DATE - Hire_Date FROM Employee; → Yields an integer of days
Find the employees who have worked for the company for more than 20 years.
SEL * FROM Employee WHERE CURRENT_DATE - Hire_Date > 365.25 * 20;
Find employees who were hired within the last 100 days.
SELECT * FROM employee WHERE Hire_Date > DATE - 100;
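
The same calculations can be tried with Python's date type, which follows the same two rules: a
date plus a number of days yields a date, and a date minus a date yields a number of days. The
sketch below is an illustration only. The fixed "current date" and the hire dates are made-up
values, chosen so that the results are repeatable.

```python
import datetime

# Date + integer number of days -> date
start = datetime.date(1999, 1, 1)
later = start + datetime.timedelta(days=300)
print(later)

# Date - date -> integer number of days
as_of = datetime.date(2009, 4, 1)            # stand-in for CURRENT_DATE
days = (as_of - datetime.date(2007, 11, 30)).days
print(days)

# Hypothetical hire dates, for illustration only.
hire_dates = {
    "Ryan": datetime.date(1985, 6, 15),
    "Short": datetime.date(1995, 1, 1),
    "Stein": datetime.date(2009, 2, 1),
}

# More than 20 years of service, counted in days.
long_service = [name for name, hired in hire_dates.items()
                if (as_of - hired).days > 365.25 * 20]

# Hired within the last 100 days.
recent = [name for name, hired in hire_dates.items()
          if hired > as_of - datetime.timedelta(days=100)]

print(long_service, recent)
```

Note that the 20-year test compares a count of days with 365.25 * 20, exactly as the WHERE
clause on this page does, rather than comparing calendar years.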

Data Types and Functionality Page 6-29


Data Type Conversions Using CAST
CAST is a function that can be used to change the data type of a resulting column or expression.
Data types may be changed in many ways: CHAR to CHAR; CHAR to NUMBER; NUMBER to
CHAR; NUMBER to NUMBER; etc.
All in all, data type conversions can be quite complex and somewhat challenging. The facing
page discusses conversions using CAST. Some examples using the Teradata extension to the
ANSI CAST are shown below.
SELECT Last_Name (CHAR(10));
SELECT 3.777 (INT);
SELECT Last_Name (CHAR(10)) FROM Employee;

Note how the CAST syntax forms a single construct while the extended form is actually two
constructs. These can be made into a single construct, as shown, which may sometimes be
required:

SELECT Last_Name (CHAR(10)); → SELECT (Last_Name (CHAR(10)));
SELECT 3.777 (INT); → SELECT (3.777 (INT));
SELECT Last_Name (CHAR(10)) FROM Employee; → SELECT (Last_Name (CHAR(10))) FROM Employee;

Also, the Teradata extended form may produce a result that differs from CAST. For instance,
converting 12 to a 3 character field with the extended form is shown below.

SELECT 12 (CHAR(3));

12
---
  1

Note that the heading is left justified and so is the result. The result contains 3 characters,
the first two of which are spaces. To explain what happened, one must first understand that the
value 12 is a BYTEINT. A BYTEINT displays in 4 characters (a sign plus up to three digits, as in
+123), right justified. Starting from the left, the first 3 characters of that conversion are
taken and then left justified into the character space. Thus the result is '  1'.

In contrast, the ANSI CAST actually removes (or trims) the spaces in front of the number before
acquiring the characters.
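
The difference between the two forms can be modeled in a few lines of Python. This is a sketch
of the behavior described above, not database code: the function names are hypothetical, and
the first function applies only to values in the BYTEINT range, whose default display width is 4
characters.

```python
def extended_form_to_char(value: int, width: int) -> str:
    """Model of 'value (CHAR(width))' for a BYTEINT value."""
    field = f"{value:>4d}"           # right justified in 4 characters
    return field[:width].ljust(width)  # take characters from the left


def ansi_cast_to_char(value: int, width: int) -> str:
    """Model of 'CAST(value AS CHAR(width))'."""
    text = str(value)                # no leading spaces to begin with
    return text[:width].ljust(width)


print(repr(extended_form_to_char(12, 3)))
print(repr(ansi_cast_to_char(12, 3)))
```

The extended form yields two spaces followed by '1', while the CAST model yields '12' followed
by one space, because the leading spaces are gone before the characters are taken.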

** To answer the note at the bottom of the facing page: any conversion that results in a
truncation while in ANSI mode will fail. The exception is the Teradata extended form of the
CAST, which truncates in either mode.

Page 6-30 Data Types and Functionality


Data Type Conversions Using CAST

You can change the data type for a column or expression by using the CAST function.
The general form of this function is → CAST( expression AS data type );

Some examples of using CAST follow.

SELECT CAST(Last_Name AS CHAR(10)) FROM Employee;
SEL CAST(3.7777 AS INTEGER); → this will truncate the decimal to 3
SELECT CAST(Budget_Amount * 1.375 AS DEC(15,3)) FROM Department;
SELECT CAST(1.777 AS CHAR(10)); → this will left-justify 1.777 into a CHAR(10) field
SEL CAST(Salary_Amount AS CHAR(11)) FROM Employee;

SEL CAST(1.77 AS CHAR(2)); → this will truncate the decimal to “1.” left justified as
a two character field

What should happen to this request?


SELECT CAST(Last_Name AS INTEGER) FROM Employee;

** Read the left-hand page to learn how truncation using CAST works in ANSI mode.

Data Types and Functionality Page 6-31


Data Type Conversions and Rounding
The two methods that may be used for rounding are controlled by settings in DBSControl
(system “tunables” set by your administrator). One method for rounding is the more traditional
one that has a bias for rounding up:
“If the digit to the right of the rounded digit is 5 or more, round up. If the digit is less
than 5, round down.”

A more “unbiased” method is:


“If the digit to the right of the rounded digit is more than 5, round up. If it is less than
5, round down. If it is equal to 5, round to the even numbered digit (if the rounding digit is
even, leave it even; otherwise round up to the even digit).”

Rounding works the same whether using the CAST or the Teradata extended method explained
on the earlier left-hand page. So the following are equivalent to the examples on the
right-hand page.

Method 1:
SEL 1.34999 (DEC(2,1)); < 5, round down → 1.3
SEL 1.35000 (DEC(2,1)); >= 5, round up → 1.4
SEL 1.35001 (DEC(2,1)); >= 5, round up → 1.4

Method 2:
SEL 1.34999 (DEC(2,1)); < 5, round down → 1.3
SEL 1.35000 (DEC(2,1)); equal to 5, round to the even number → 1.4
SEL 1.45000 (DEC(2,1)); equal to 5, round to the even number → 1.4
SEL 1.35001 (DEC(2,1)); > 5, round up → 1.4

Page 6-32 Data Types and Functionality


Data Type Conversions and Rounding

One important thing to know and understand is how rounding takes place inside
your system.

There are 2 methods for rounding. Which method is used is determined by a setting made by
your Teradata administrator.

Method 1:
SEL CAST(1.34999 AS DEC(2,1)); < 5, round down → 1.3
SEL CAST(1.35000 AS DEC(2,1)); >= 5, round up → 1.4
SEL CAST(1.35001 AS DEC(2,1)); >= 5, round up → 1.4

Method 2:
SEL CAST(1.34999 AS DEC(2,1)); < 5, round down → 1.3
SEL CAST(1.35000 AS DEC(2,1)); equal to 5, round to the even number → 1.4
SEL CAST(1.45000 AS DEC(2,1)); equal to 5, round to the even number → 1.4
SEL CAST(1.35001 AS DEC(2,1)); > 5, round up → 1.4
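
Both methods exist in Python's decimal module, which makes it easy to compare them side by side.
The sketch below is an analogy, not the database itself: it assumes Method 1 corresponds to
ROUND_HALF_UP and Method 2 to ROUND_HALF_EVEN, and the helper name to_dec_2_1 is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP, ROUND_HALF_EVEN


def to_dec_2_1(text: str, rounding: str) -> Decimal:
    """Round a decimal literal to one fractional digit, as DEC(2,1) holds."""
    return Decimal(text).quantize(Decimal("0.1"), rounding=rounding)


literals = ["1.34999", "1.35000", "1.45000", "1.35001"]

method_1 = [str(to_dec_2_1(x, ROUND_HALF_UP)) for x in literals]
method_2 = [str(to_dec_2_1(x, ROUND_HALF_EVEN)) for x in literals]

for literal, up, even in zip(literals, method_1, method_2):
    print(f"{literal}  method 1: {up}  method 2: {even}")
```

The two methods differ only when the discarded part is exactly half. Here that is the case for
1.45000, which becomes 1.5 under Method 1 and 1.4 under Method 2. The value 1.35000 becomes 1.4
either way, since 4 is the even digit.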

Data Types and Functionality Page 6-33


Concatenating Data Types
Concatenation of data is discussed here because this concept requires that the database perform
an “implicit” casting (conversion) of data to character. Whether the source values are numeric or
character, they are always displayed as character when concatenated into a larger field. In other
words, concatenation is a method for combining several fields or columns into one, larger, field.

The actual method of conversion is the Teradata extended form discussed on the previous left-
hand page. In other words, numbers are right justified into the appropriate character space. So
an integer number would, by default, become an 11 character field with the value right justified
into the field. Examples of these will follow on the next page.

Page 6-34 Data Types and Functionality


Concatenating Data Types

Concatenation is a method for combining several columns or expressions into a larger, single
field of character data type using an implicit CAST.

Examples of results will be shown on the next page.

Two consecutive pipe characters (or vertical bars) are interpreted by the database
as a request to perform a concatenation of fields.

Examples:
SELECT Last_Name || First_Name FROM Employee;
SELECT Last_Name || ', ' || First_Name FROM Employee;
SELECT First_Name||' '||Last_Name||' is '||(DATE - Birthdate) / 365 FROM Employee;
SELECT 123||'ABC'||456;

For the previous examples, the resulting single field results will all have a data type
of VARCHAR.

You can CAST the result of a concatenation like this.


SELECT CAST( Last_Name || ', ' || First_Name AS CHAR(100) )
FROM Employee;

SELECT CAST(123||'ABC'||456 AS CHAR(3)); -- This will result in a truncation

Data Types and Functionality Page 6-35


Concatenated Example Results
The examples on the facing page illustrate the effect of concatenation. The resulting character
field is actually built, or created, from many other fields. Regardless of the data types from the
original sourcing fields or columns, the resulting concatenated field is one of variable character.
This resulting field may now be treated as a single character value and, as such, may become the
object of other manipulations or as arguments in functions.

Page 6-36 Data Types and Functionality


Concatenated Example Results

In the examples, last_name is CHAR(20) and first_name is VARCHAR(30).

SELECT Last_Name || ', ' || First_Name FROM Employee WHERE Last_Name = 'Brown';

((last_name||', ')||first_name)
----------------------------------------------------
Brown               , Allen
Brown               , Alan

SELECT First_Name || ', ' || Last_Name FROM Employee WHERE Last_Name = 'Brown';

((first_name||', ')||last_name)
----------------------------------------------------
Allen, Brown
Alan, Brown

Data Types and Functionality Page 6-37


Concatenated Example Results (continued)
Here we see how the database converts numbers to characters – implicitly.

Page 6-38 Data Types and Functionality


Concatenated Example Results
(continued)
Recall that concatenated fields end up as a single variable-character field.

In the examples, 123 is data type BYTEINT.
(3 digits plus the sign are a total of 4 characters, right justified)
The value “456” is a data type of SMALLINT.

SELECT 123||'ABC'||456 AS Concat;

Concat
-------------
 123ABC   456

CAST automatically “trims” the “leading” and “trailing” spaces after it casts the number
to VARCHAR.

SELECT CAST(123 AS VARCHAR(4))||'ABC'||CAST(4567 AS VARCHAR(4)) AS Concat;

Concat
-----------
123ABC4567

SELECT CAST(123||'ABC'||456 AS CHAR(3));

((123||'ABC')||456)
-------------------
 12

This result gets truncated.
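
The spacing in these results follows from the default display widths of the integer types. The
Python sketch below models the implicit conversion under that assumption: each number is right
justified into 4, 6, 11, or 20 characters according to the smallest integer type that holds it.
The function names are hypothetical.

```python
def default_width(number: int) -> int:
    """Display width of the smallest integer type that holds the number."""
    if -128 <= number <= 127:
        return 4       # BYTEINT
    if -32768 <= number <= 32767:
        return 6       # SMALLINT
    if -2147483648 <= number <= 2147483647:
        return 11      # INTEGER
    return 20          # BIGINT


def as_char(value) -> str:
    """Numbers are right justified; character values pass through."""
    if isinstance(value, int):
        return str(value).rjust(default_width(value))
    return value


def concat(*parts) -> str:
    return "".join(as_char(part) for part in parts)


result = concat(123, "ABC", 456)
print(repr(result), len(result))
print(repr(result[:3]))     # what fits into CHAR(3)
```

The full result is 13 characters long: one space before 123 and three spaces before 456. Cutting
it down to 3 characters keeps a space and the digits 12, which is the truncated value shown on
the page.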

Data Types and Functionality Page 6-39


FORMAT
Formatting is a method of reporting information in a more stylized manner. Most of the
formatting characters are inserted into resulting fields while others (namely the “X” the “Z” and
the “9”) are used more to describe an action to be taken. In any case, all formatting characters
may be upper or lower case.

Although the format specification may not be longer than 30 characters, it may be used to format
a wide field. For instance, the following format specification may not include another ‘x’ since
there are 30 of them already specified.

FORMAT 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'

However, you may format a wider result like either of these:

FORMAT 'x(31)' -or- FORMAT 'X(255)'

The widest formatting option is 255.

Page 6-40 Data Types and Functionality


FORMAT

You can reformat the appearance of how a value is displayed by using FORMAT.
(FORMAT is a Teradata extension to the ANSI Standard.)

Although there are many different formatting options, other than those used for
dates (later) the ones we will show are:
“-” → Dash
“%” → Percent sign
“$” → Dollar sign
“Z” → Leading zero suppress (“Z” or “z”)
“9” → Show leading zeroes
“X” → Character display (“X” or “x”)
“.” → Decimal or period
“,” → Comma
“B” → Blank or space (“B” or “b”)
“/” → Slash or division sign
“:” → Colon

• A FORMAT specification can contain a maximum of 30 characters.


• A FORMAT phrase can describe up to 18 digit positions.
• The output string produced using a FORMAT phrase can have a maximum of 255 characters.

Data Types and Functionality Page 6-41


SQL Assistant Methods for FORMAT
As an ODBC tool, SQL Assistant parses its output (to Teradata) and its input (from Teradata).
The “trick” is to pass to SQL Assistant a character field that has the formatted result. A
character field, whether fixed length or variable length, will be accepted by the tool as character
and not be reformatted. (After all, there is nothing more basic than being character.)

The facing page shows how we format the field and move it into the spool as a character value.
What SQL Assistant “sees” is a character field, not knowing or caring that it holds a reformatted
value.

Page 6-42 Data Types and Functionality


SQL Assistant Methods for FORMAT

Before providing examples with results, let us look at the different styles used for
formatting result fields.

These are the formatting styles available for use when using SQL Assistant.

SELECT CAST( Salary_Amount (FORMAT '$$$$,$$$,$$9.99') AS CHAR(15) ) . . . .

SELECT CAST( CAST(Salary_Amount AS FORMAT '$$$$,$$$,$$9.99') AS CHAR(15) ) . . .

Data Types and Functionality Page 6-43


SQL Assistant Formatting Examples
The facing page gives examples of formatting with SQL Assistant.

Page 6-44 Data Types and Functionality


SQL Assistant Formatting Examples

The various methods for formatting any field are too numerous to mention.

Here we show different examples and styles.

SELECT CAST('abcdefg' AS FORMAT 'X(255)') (CHAR(255));

'abcdefg' 'abcdefg'
--------- -----------------------------------------------------------------
abcdefg abcdefg ← (255 character space) →

SELECT CAST(CAST(1234567.89 AS FORMAT '$zzz,zzz,zz9.99') AS VARCHAR(15)),
CAST( (1234567.89 (FORMAT '$$$$,$$$,zz9.99')) AS VARCHAR(15));

1234567.89      1234567.89
--------------- ---------------
$  1,234,567.89 $1,234,567.89   ← Left justified in a 12 character field.

Data Types and Functionality Page 6-45


Year, Month and Day Formatting Options
The facing page shows many of the formatting options that can be used to format date data. As
mentioned earlier, all formatting characters may be typed as either upper or lower case.

Page 6-46 Data Types and Functionality


Year, Month and Day Formatting Options

The following are available for date formatting. Lower case is allowed.

Y4 → Four digit year (use also 'YYYY')
YY → Two digit year
M4 → Full name of month (use also 'MMMM')
M3 → Three character abbreviation (use also 'MMM')
MM → Two digit month
E4 → Full day of week name (use also 'EEEE')
E3 → Abbreviated day of week name (use also 'EEE')
D3 → Three digit day of year (use also 'DDD')
DD → Two digit day of month

SELECT CURRENT_DATE (FORMAT 'Y4-M3-DD');

Date
-----------
2009-Apr-01

SELECT CURRENT_DATE (FORMAT 'y4:m4:d3');

Date
------------------
2009:April:091

SELECT CURRENT_DATE (FORMAT 'y4:d3/m3/e3');

Date
----------------
2009:091/Apr/Wed
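
To see how the formatting characters combine, the Python sketch below interprets the nine date
options listed on this page and nothing else. It is a simplified model with hypothetical names,
not the database's FORMAT logic: any character that is not one of the listed options is copied
to the output unchanged.

```python
import datetime
import re

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday",
        "Saturday", "Sunday"]          # date.weekday() returns 0 for Monday

# Longer spellings come first so that 'yyyy' is not read as 'yy' twice.
TOKEN = re.compile(
    r"yyyy|y4|yy|mmmm|m4|mmm|m3|mm|eeee|e4|eee|e3|ddd|d3|dd",
    re.IGNORECASE)


def td_date_format(d: datetime.date, fmt: str) -> str:
    """Render a date using the Y/M/E/D options listed on this page."""
    def render(match):
        token = match.group(0).lower()
        if token in ("yyyy", "y4"):
            return f"{d.year:04d}"
        if token == "yy":
            return f"{d.year % 100:02d}"
        if token in ("mmmm", "m4"):
            return MONTHS[d.month - 1]
        if token in ("mmm", "m3"):
            return MONTHS[d.month - 1][:3]
        if token == "mm":
            return f"{d.month:02d}"
        if token in ("eeee", "e4"):
            return DAYS[d.weekday()]
        if token in ("eee", "e3"):
            return DAYS[d.weekday()][:3]
        if token in ("ddd", "d3"):
            return f"{d.timetuple().tm_yday:03d}"
        return f"{d.day:02d}"          # 'dd'
    return TOKEN.sub(render, fmt)


sample = datetime.date(2009, 4, 1)     # the date used in the examples
for fmt in ("Y4-M3-DD", "y4:m4:d3", "y4:d3/m3/e3"):
    print(td_date_format(sample, fmt))
```

Run against April 1, 2009, the three format strings reproduce the three results shown above.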

Data Types and Functionality Page 6-47


Module 6: Summary
The facing page is a summary for the module.

Page 6-48 Data Types and Functionality


Module 6: Summary

• Data types can be character, numeric, or just byte string.

• Case sensitivity has an effect on how SQL works with character data.

• Arithmetic can be performed on numeric data.

• Many functions have been created to operate on all data types.

• You can change the data type for any expression by using the CAST function.

• You can use FORMAT to change how a column is displayed.

• Many date formatting options exist to tailor how dates can be displayed.

Data Types and Functionality Page 6-49


Module 6: Review Questions
Check your understanding of the concepts discussed in this module by completing the review
questions as directed by your instructor.

Page 6-50 Data Types and Functionality


Module 6: Review Questions

True or False:

1. The FLOAT data type has more precision than does a decimal data type.
False – float only has 15 digits of precision.
2. Character data types cannot be converted to numeric data types.
False
3. FORMAT 'd2' is a valid formatting option.
False – for 2-digit day formatting “dd” must be used.
4. The expression → 'a   ' = 'A          ' evaluates true.
('a' is followed by 3 spaces; 'A' is followed by 10 spaces.)
True
5. You can use the CAST function to change a data type or to format results.
True
6. The comma “,” is a valid formatting character.
True
7. The formatting character “9” may be used to display leading or trailing
zeroes.
False – it can only display leading zeroes.

Data Types and Functionality Page 6-51


Module 6: Lab Exercise
Check your understanding of the concepts discussed in this module by completing the lab
exercise as directed by your instructor.

Page 6-52 Data Types and Functionality


Module 6: Lab Exercise

1. Find and list employees’ first and last names where the last name
begins with “R”, “S”, or “T”. (Do this without regard to
case sensitivity.)

2. Write a request that will show the salary amount for the people
identified in #1 if they were given a 10% increase in salary that gave
them a salary > 50K.

3. Project new employee job codes (from the Employee table) for all those
job codes ending in 101, increasing them by 100. Include last names,
job codes, and department numbers to help verify results.
first_name last_name
------------ --------------------
Peter Rabbit
Larry Ratzlaff
Frank Rogers
Nora Rogers
Irene Runyon
Loretta Ryan
Michael Short
John Stein

Data Types and Functionality Page 6-53


Notes:

Page 6-54 Data Types and Functionality


Module 7

Basic SQL Functions

After completing this module, you should be able to discuss the features
and usage of the following functions, including: ANSI vs. Teradata
Extension; Character Vs. Numeric; Supporting Syntax.

• UPPER and LOWER


• CHARACTER_LENGTH
• TRIM
• POSITION
• SUBSTRING
• LIKE
• CASESPECIFIC
• EXTRACT
• ADD_MONTHS
• TYPE
• DEFAULT

Basic SQL Functions Page 7-1


Notes:

Page 7-2 Basic SQL Functions


Table of Contents
What are Functions?..................................................................................................................... 7-4
UPPER & LOWER ...................................................................................................................... 7-6
UPPER & LOWER for Case Sensitivity ..................................................................................... 7-8
CHARACTER_LENGTH ......................................................................................................... 7-10
CHARACTER_LENGTH (continued) ...................................................................................... 7-12
TRIM .......................................................................................................................................... 7-14
Trimming Other Than Space ...................................................................................................... 7-16
Trimming Numbers .................................................................................................................... 7-18
POSITION ................................................................................................................................. 7-20
Other Examples Using POSITION ............................................................................................ 7-22
SUBSTRING ............................................................................................................................. 7-24
SUBSTRING and Numbers ....................................................................................................... 7-26
LIKE........................................................................................................................................... 7-28
LIKE Examples Using “%” ....................................................................................................... 7-30
LIKE Examples Using “_” ......................................................................................................... 7-32
LIKE & ESCAPE....................................................................................................................... 7-34
CASESPECIFIC ........................................................................................................................ 7-36
EXTRACT ................................................................................................................................. 7-38
ADD_MONTHS ........................................................................................................................ 7-40
The Calendars............................................................................................................................. 7-42
Calendar Differences.................................................................................................................. 7-44
Additional Calendar Functions .................................................................................................. 7-46
Additional Calendar Functions (continued) ............................................................................... 7-48
Module 7: Summary................................................................................................................... 7-50
Module 7: Review Questions ..................................................................................................... 7-52
Module 7: Lab Exercise ............................................................................................................. 7-54

Basic SQL Functions Page 7-3


What are Functions?

Page 7-4 Basic SQL Functions


What are Functions?

The general form of a function is:


function name ( argument list or expression )

Functions are SQL constructs that perform certain and varied operations on an
argument list or expression containing:
• Single columns or groups of columns
• Literals
• Expressions involving computations
• Results of other functions (nested functions)

Functions may be any of the following:


• Non-ANSI Compliant (not supported by ANSI – e.g. HELP DATABASE)
• Partially ANSI Compliant (a Teradata function supporting some of the
ANSI requirements, but not all)
• Fully ANSI Compliant (a Teradata function supporting the full set of ANSI
requirements)

Some functions may be fully compliant and yet provide extended capabilities
supporting functionality beyond that defined by the ANSI Standard.
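
As a minimal sketch of the argument forms listed above, the request below passes a literal, a single column, an expression, and the result of another function to a function. It assumes the course's Employee table; the column aliases are hypothetical names chosen for illustration.

```sql
-- Sketch only: one function call for each kind of argument.
SELECT UPPER('abc')                              AS from_literal,     -- a literal
       UPPER(Last_Name)                          AS from_column,      -- a single column
       CHARACTER_LENGTH(First_Name || Last_Name) AS from_expression,  -- an expression
       UPPER(TRIM(Last_Name))                    AS from_nested       -- result of another function
FROM Employee
WHERE Employee_Number = 1010;
```

In the nested case the innermost function (TRIM) is evaluated first, and its result becomes the argument of the outer function (UPPER).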

Basic SQL Functions Page 7-5


UPPER & LOWER
We begin our module on basic functions by looking at two of the more elemental ones. The
UPPER and LOWER functions take the values of a character column and make them either
all uppercase or all lowercase, depending on the function. These functions may only be
used on character data values.

LOWER function supports Unicode


In versions prior to Teradata Database 14.0, the LOWER function only accepted a character set
Latin string as the input argument. In Teradata Database 14.0, the LOWER function supports an
input argument string of the Latin, Unicode, Graphic, or KanjiSJIS character set, which could
include Greek, Cyrillic, Eastern European characters, etc.

The LOWER function result is in the same character set as the input argument. The only
exception to this is when the input is the Kanji1 character set. With Kanji1, the LOWER
function returns the result in the Latin character set, as previously. The Kanji1 character set is
deprecated and should no longer be used anywhere.

Page 7-6 Basic SQL Functions


UPPER & LOWER

UPPER ( expression )
Returns “expression” as uppercase.
LOWER ( expression )
Returns “expression” as lowercase.

UPPER and LOWER are ANSI compliant.


LOWER function supports Unicode.

SELECT CAST(First_Name AS CHAR(10)) AS FName,
       CAST(Last_Name AS CHAR(10)) AS LName,
       CAST(UPPER(First_Name) AS CHAR(10)) AS U_FName,
       CAST(LOWER(Last_Name) AS CHAR(10)) AS L_LName
FROM Employee WHERE Employee_Number = 1010;

FName      LName      U_FName    L_LName
---------- ---------- ---------- ----------
Frank      Rogers     FRANK      rogers

Basic SQL Functions Page 7-7


UPPER & LOWER for Case Sensitivity
The facing page illustrates the effect that case sensitivity has on result sets. The table definition
for the queries is shown below.
Because ANSI mode is always case sensitive, the ANSI-mode results for the queries on the facing
page (run against the very same table) are shown below, together with one additional, more
typical, query.
CREATE SET TABLE employee
(employee_number INTEGER,
manager_employee_number INTEGER,
department_number SMALLINT,
job_code INTEGER,
last_name CHAR(20) CASESPECIFIC ,
first_name VARCHAR(30) NOT CASESPECIFIC ,
hire_date DATE ,
birthdate DATE ,
salary_amount DECIMAL(10,2))
UNIQUE PRIMARY INDEX (employee_number);

ANSI Mode Results:


SELECT First_Name FROM Employee
WHERE First_Name = 'ALAN';
*** Query completed. No rows found.
SELECT Last_Name FROM Employee
WHERE UPPER(Last_Name) = 'TRADER';
*** Query completed. One row found. One column returned.
last_name
--------------------
Trader

SELECT Last_Name FROM Employee
WHERE Last_Name = 'Trader';
*** Query completed. One row found. One column returned.

last_name
--------------------
Trader

Page 7-8 Basic SQL Functions


UPPER & LOWER for Case Sensitivity

Transaction processing in Teradata mode is not case specific (not case sensitive).

SELECT First_Name FROM Employee
WHERE First_Name = 'ALAN';

first_name
------------------------------
Alan

However, if the column is defined as CASESPECIFIC:

SELECT Last_Name FROM Employee
WHERE Last_Name = 'TRADER';

*** Query completed. No rows found.

You can then make the query not case sensitive (NOT CASESPECIFIC) by
performing a “case-blind” test by using either UPPER or LOWER.

SELECT Last_Name FROM Employee
WHERE UPPER(Last_Name) = 'TRADER';

last_name
--------------------
Trader

A later page will describe how to make Teradata case-sensitive (CASESPECIFIC).
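
The same case-blind test can be sketched with LOWER, or with the function applied to both sides when the comparison value's case is not known in advance. This is an illustrative sketch against the course's Employee table, where Last_Name is defined as CASESPECIFIC.

```sql
-- Case-blind test using LOWER instead of UPPER.
SELECT Last_Name
FROM Employee
WHERE LOWER(Last_Name) = 'trader';

-- Applying the same function to both sides removes any dependence
-- on how the comparison value happens to be typed.
SELECT Last_Name
FROM Employee
WHERE UPPER(Last_Name) = UPPER('Trader');
```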

Basic SQL Functions Page 7-9


CHARACTER_LENGTH
The CHARACTER_LENGTH function returns the number of characters (hence the “length”) of
a character string expression. The result is an integer number that represents this length. If the
character string is fixed length (e.g. a column defined as CHAR(20)), then the length will
always be the same for every value (recall that spaces are valid characters and padding the right
with trailing spaces makes each value the same length). If the character string is variable in
length, then the function returns the number of characters for each value, which may vary due to
the nature of variable-length data values. Trailing spaces are still counted but, for the most part,
there should be no trailing spaces, though they can occur depending upon how the data is
loaded.
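
A minimal sketch of the fixed-length behavior described above, assuming the course's Employee table in which Last_Name is CHAR(20). The aliases stored_length and trimmed_length are hypothetical names.

```sql
-- Compare the padded (stored) length with the length of the actual text.
SELECT Last_Name,
       CHARACTER_LENGTH(Last_Name)       AS stored_length,   -- 20 for a CHAR(20) column
       CHARACTER_LENGTH(TRIM(Last_Name)) AS trimmed_length   -- 6 for the value 'Rogers'
FROM Employee
WHERE Employee_Number = 1010;
```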

Page 7-10 Basic SQL Functions


CHARACTER_LENGTH

CHARACTER_LENGTH ( expression )

Returns an INTEGER number that represents the number of characters in a character
string. This function is ANSI compliant.

SELECT CHARACTER_LENGTH('abc'), CHARACTER_LENGTH('abc   ');

Characters('abc') Characters('abc   ')
----------------- --------------------
                3                    6

SELECT DISTINCT CHARACTER_LENGTH(Last_Name) FROM Employee;

Characters(last_name)
---------------------
                   20

Last_Name is defined as CHARACTER(20). This means that every value contains 20 characters.

SELECT First_Name, CHARACTER_LENGTH(First_Name)
FROM Employee
WHERE Employee_Number = 801;

first_name                     Characters(first_name)
------------------------------ ----------------------
I.B.                                                4

First_Name is defined as VARCHAR(30), so the number of characters varies.

Basic SQL Functions Page 7-11


CHARACTER_LENGTH (continued)

Page 7-12 Basic SQL Functions


CHARACTER_LENGTH (continued)

CHARACTER_LENGTH ( expression )

SELECT first_name,
       CHARACTER_LENGTH(first_name)
FROM employee;

first_name     Characters(first_name)
-------------- ----------------------
Irene                               5
Robert                              6
Nora                                4
Alan                                4
John                                4
Charles                             7
Paulene                             7
Larry                               5
Carol                               5
John                                4
James                               5
Edward                              6
Frank                               5
James                               5
Allen                               5
Michael                             7
Ron                                 3
Peter                               5
I.B.                                4
Jim                                 3
William                             7
Darlene                             7
Domingus                            8
Arnando                             7
Albert                              6
Loretta                             7

SELECT last_name,
       CHARACTER_LENGTH(last_name)
FROM employee
WHERE department_number IN (401, 501);

last_name     Characters(last_name)
------------- ---------------------
Runyon                           20
Rogers                           20
Brown                            20
Phillips                         20
Johnson                          20
Ratzlaff                         20
Rabbit                           20
Trader                           20
Wilson                           20
Hoover                           20
Machado                          20

Basic SQL Functions Page 7-13


TRIM
To remove a particular series of leading or trailing characters, you use the TRIM function. By
default it removes spaces, and only at the beginning or end of a value. You can trim BOTH the
leading and trailing spaces (or a different character) by using the BOTH keyword, which is also
the default. TRIM is often used to transform a fixed-length value into a variable-length value. It
only accepts character data but may be used on numeric data (discussed later). When it is used
on numeric data values, the database performs an implicit CAST on the numeric value (to
make it character) prior to trimming it.
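
The three keywords can be sketched with literals only, so no table is needed. The column aliases are hypothetical names chosen for illustration.

```sql
SELECT TRIM(LEADING '0' FROM '000123')     AS no_leading_zeros,  -- '123'
       TRIM(BOTH '*' FROM '**abc**')       AS no_stars,          -- 'abc'
       TRIM(TRAILING FROM 'abc   ') || '!' AS no_trailing;       -- 'abc!'
```

In the last column no trim character is named, so the default (space) is removed before the '!' is concatenated.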

Page 7-14 Basic SQL Functions


TRIM

TRIM ( [ LEADING | BOTH | TRAILING ] [ trim_character ] FROM expression )

Removes LEADING, TRAILING, or BOTH (leading and trailing) occurrences of a specified
character from a string. The default character is the space. TRIM is ANSI standard.

SELECT Last_Name||', '||First_Name
FROM Employee WHERE Employee_Number = 1001;

((last_name||', ')||first_name)
----------------------------------------------------
Hoover              , William

Last_Name is defined as CHAR(20).

SELECT TRIM(Last_Name)||', '||First_Name
FROM Employee WHERE Employee_Number = 1001;

((Trim(BOTH FROM last_name)||', ')||first_name)
----------------------------------------------------
Hoover, William

Note the default in the heading display.

Basic SQL Functions Page 7-15


Trimming Other Than Space
The facing page illustrates how one can trim something other than spaces. Only a single
character may be specified for trimming.

Page 7-16 Basic SQL Functions


Trimming Other Than Space

SELECT TRIM(TRAILING FROM 'abc ') || 'XYZ';
SELECT TRIM(TRAILING ' ' FROM 'abc ') || 'XYZ';

These two requests return the same result.

(Trim(TRAILING ' ' FROM 'abc ')||'XYZ')
-----------------------------------------------
abcXYZ

SELECT 'abc_______', TRIM(TRAILING '_' FROM 'abc_______');

'abc_______' Trim(TRAILING '_' FROM 'abc_______')
------------ ------------------------------------
abc_______   abc

Here we trim trailing underscores from a field.

SELECT CHARACTER_LENGTH('ABC   '),
       CHARACTER_LENGTH(TRIM('ABC   '));

Characters('ABC   ') Characters(Trim(BOTH FROM 'ABC   '))
-------------------- ------------------------------------
                   6                                    3

Using TRIM to count characters in a fixed-length string.

Basic SQL Functions Page 7-17


Trimming Numbers
As was mentioned on an earlier left-hand page, trimming numbers is also possible. When
trimming numeric data values, the database performs an implicit casting of the value to character
before doing the trim. The number will be cast, left-justified, into the appropriate character
space for the size of the number. When a numeric value gets trimmed it gets left-justified into a
variable character space that fits the data type. Prior to the conversion, the numeric value was
right justified into the character space, but trimming it for leading and trailing spaces has the
effect of shifting it into the space left-justified.

For instance:

• BYTEINT → left-justified into a variable character 4 space (recall that BYTEINT is
plus-or-minus 127; including the sign requires 4 characters).

• SMALLINT → left-justified into a variable character 6 space (recall that SMALLINT is
plus-or-minus 32767; including the sign requires 6 characters).

• INTEGER → left-justified into a variable character 11 space (recall that INTEGER is
plus-or-minus 2,147,483,647; including the sign requires 11 characters).

• BIGINT → left-justified into a variable character 20 space (recall that BIGINT is
plus-or-minus 9,223,372,036,854,775,807; including the sign requires 20 characters).

Decimals are handled likewise.
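
A common use of trimming numbers is building readable text from numeric columns. The request below is a sketch under the assumption that the course's Employee table is available; the alias emp_note is a hypothetical name.

```sql
-- Each TRIM casts the number to character and removes the padding spaces,
-- so no gaps appear between the words and the numbers.
SELECT 'Employee ' || TRIM(Employee_Number) ||
       ' works in department ' || TRIM(Department_Number) AS emp_note
FROM Employee
WHERE Employee_Number = 1010;
```

Without the TRIM calls, each number would still be converted implicitly, but it would arrive right-justified in its full character space, leaving leading spaces in the text.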

Page 7-18 Basic SQL Functions


Trimming Numbers

When using TRIM on numeric fields, the database performs an implicit CAST to
character prior to doing the trim.

SELECT -999, TRIM(-999);

  -999 Trim(BOTH FROM -999)
------ --------------------
  -999 -999

Note the alignment of the trimmed result. Left alignment indicates character data.

CAST performs an implicit TRIM.

SELECT -777||CAST(-888 AS CHAR(4))||TRIM(-999) AS ConcatData;

ConcatData
----------------
  -777-888-999

The first value (-777) is a six-character field, right justified. The other two fields are both
trimmed.

The issue with using CAST is knowing how big to make the character length if it
were a column instead of a literal.

Basic SQL Functions Page 7-19


POSITION
POSITION returns an integer value that represents the starting position of one string inside
another. Only the position of the first occurrence of the string is returned.
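
A short sketch using literals shows the three cases worth remembering: a match, no match (which returns zero, the reason the slide examples filter on a position greater than 0), and a multi-character search string. The aliases are hypothetical names.

```sql
SELECT POSITION('b'  IN 'abcabc') AS first_b,   -- 2 (only the first 'b' is reported)
       POSITION('z'  IN 'abcabc') AS no_match,  -- 0 (string not found)
       POSITION('ca' IN 'abcabc') AS multi;     -- 3 (start of the first 'ca')
```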

Page 7-20 Basic SQL Functions


POSITION

POSITION (expression1 IN expression2 )

Returns an INTEGER value representing the numeric position of the first argument in
the second argument. POSITION is ANSI standard.

SELECT Last_Name, POSITION('i' IN Last_Name) AS ColPos
FROM Employee
WHERE ColPos > 0;

last_name                 ColPos
-------------------- -----------
Kanieski                       4
Kubic                          4
Stein                          4
Villegas                       2
Wilson                         2
Phillips                       3
Trainer                        4
Rabbit                         5
Hopkins                        5
Morrissey                      5

In the result:
• The numeric position of the letter ‘i’ is returned.
• Only the value for the first occurrence is returned.

Basic SQL Functions Page 7-21


Other Examples Using POSITION
As with the TRIM function, POSITION can be used with numeric values. Again, the Teradata
database performs an implicit casting of numeric-to-character. For this reason, the value
whose position we want to find must be specified as character, and we then need to trim the
numeric value being searched. The result of trimming a numeric data value was discussed
earlier, where we explained:

“When a numeric value gets trimmed it gets left-justified into a variable character space that
fits the data type. Prior to the conversion, the numeric value was right justified into the
character space, but trimming it for leading and trailing spaces has the effect of shifting it into
the space left-justified.”

Page 7-22 Basic SQL Functions


Other Examples Using POSITION

This example illustrates how one can find the position of a multiple
character field in another.

SELECT Last_Name, POSITION('il' IN Last_Name)
FROM Employee
WHERE POSITION('il' IN Last_Name) > 0
ORDER BY 1;

last_name            Position('il' in last_name)
-------------------- ---------------------------
Phillips                                       3
Villegas                                       2
Wilson                                         2

This example illustrates how one can find the position of a numeric digit
inside a number.

SELECT Employee_Number AS E#,
       POSITION('2' IN TRIM(E#)) AS P#
FROM Employee
WHERE P# > 0;

         E#          P#
----------- -----------
       1024           3
       1020           3
       1025           3
       1022           3
       1023           3
       1012           4
       1021           3
       1002           4

Read the left-hand page for a more thorough explanation of the second example.

Basic SQL Functions Page 7-23


SUBSTRING
The SUBSTRING function is used to “positionally” extract text from another data value. In the
ANSI syntax, we start from some position (numerically, according to the number line) and then
move right for a specified number of characters. There is no moving left (logically), so the value
for the keyword “FOR” must not be negative. We can, however, start the substring prior to the
text string (which begins on +1 of the number line). We can also begin (FROM) beyond the text,
but this will only return a “zero-length” string. To explain a zero-length string, think of this as a
VARCHAR(0), which is invalid to declare as a column in a table, but can be a result value.

For instance:
SELECT TYPE(SUBSTRING('contact' FROM 8 FOR 6)) AS testcol;
SELECT TYPE(SUBSTRING('contact' FROM 8 FOR 0)) AS testcol;

Both return the following:

testcol
---------------------------------------
VARCHAR(0) CHARACTER SET UNICODE
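
SUBSTRING is often combined with POSITION so that the starting point or length is computed rather than hard-coded. The sketch below splits a literal at its comma; the literal and the aliases are hypothetical values chosen for illustration.

```sql
-- POSITION(',' IN 'Rogers, Frank') is 7.
SELECT SUBSTRING('Rogers, Frank'
                 FROM 1
                 FOR POSITION(',' IN 'Rogers, Frank') - 1) AS before_comma,  -- 'Rogers'
       SUBSTRING('Rogers, Frank'
                 FROM POSITION(',' IN 'Rogers, Frank') + 2) AS after_comma;  -- 'Frank'
```

The first column takes 6 characters starting at position 1. The second starts two positions past the comma (skipping the comma and the space) and, with no FOR clause, runs to the end of the string.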

Page 7-24 Basic SQL Functions


SUBSTRING

SUBSTRING (expression1 FROM n1 [ FOR n2 ] )

Returns a substring of expression1, starting at the position designated by n1, for a
length of n2 (if present) or to the end of the string (if n2 is not present). SUBSTRING is
ANSI Standard.

                         C   o   n   t   a   c   t
--- --- --- --- --- --- --- --- --- --- --- --- --- --- --- ---
 -5  -4  -3  -2  -1   0  +1  +2  +3  +4  +5  +6  +7  +8  +9 +10

Following are some examples of the SUBSTRING function used on the word “Contact”.

SELECT SUBSTRING( 'Contact' FROM 4 FOR 3);   → tac
SELECT SUBSTRING( 'Contact' FROM 8 FOR 6);   → zero-length string ('')
SELECT SUBSTRING( 'Contact' FROM 4);         → tact
SELECT SUBSTRING( 'Contact' FROM -5);        → Contact
SELECT SUBSTRING( 'Contact' FROM 0 FOR 2);   → C
SELECT SUBSTRING( 'Contact' FROM -5 FOR 8);  → Co
SELECT SUBSTRING( 'Contact' FROM 2 FOR -1);  → Invalid

Basic SQL Functions Page 7-25


SUBSTRING and Numbers
As with TRIM and POSITION, you can use SUBSTRING on numeric data values as well. As before,
the database performs an implicit casting of the numeric value to character. Since TRIM isn’t
being used in our examples, the numbers are right-justified into the character space. You could
use SUBSTRING on a trimmed field, but then one wouldn’t know where to begin the substring,
since different numbers would have different lengths, thereby making the starting position
inconsistent. This issue is avoided when the numeric value is always right-justified into the same
character space.

Page 7-26 Basic SQL Functions


SUBSTRING and Numbers

In line with other numeric-to-character functionality, the database performs an implicit
conversion to character prior to using SUBSTRING.

34567 is an integer cast, right-justified, into an 11-character field (including the sign).
Beginning with the 3rd character, SUBSTRING then takes 5 characters.

_ _ _ _ _ _ 3 4 5 6 7

SELECT SUBSTRING(34567 FROM 3 FOR 5);

Substring(34567 From 3 For 5)
-----------------------------
    3

SELECT Salary_Amount, SUBSTRING(Salary_Amount FROM 1 FOR 10)
FROM Employee
WHERE Employee_Number = 1011;

As a DEC(10,2), and including the sign plus decimal: _ _ _ _ 5 2 5 0 0 . 0 0

salary_amount Substring(salary_amount From 1 For 10)
------------- --------------------------------------
     52500.00     52500.

Basic SQL Functions Page 7-27


LIKE
LIKE is not a function, but rather a logical operator, as are IN, NOT IN, AND, OR, etc. It is
included in this module because its functionality is similar to that of POSITION. The
functionality of LIKE, however, is greater than that of POSITION. Here is an example of how
they are similar:
SELECT * FROM Employee
WHERE last_name LIKE '%ith%';

SELECT * FROM Employee
WHERE POSITION('ith' IN last_name) > 0;

Unlike POSITION, LIKE does not return a numeric starting position. It simply searches to find
an occurrence of the value inside the string column or expression.

Of the two wild cards, “%” designates that any number of characters may precede (or follow) it.
The “_” is positional and means that the character for that particular position may be anything.

LIKE may only be used on character strings; however, you may nest a conversion of the data
type to character, which LIKE can then search.
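
As a sketch of nesting a conversion inside LIKE, the request below uses TRIM (which casts a number to character, as described earlier in this module) to search employee numbers. It assumes the course's Employee table.

```sql
-- Find four-character employee numbers that begin with "10".
-- TRIM converts the number to character and removes the padding,
-- so each "_" lines up with exactly one digit.
SELECT Employee_Number, Last_Name
FROM Employee
WHERE TRIM(Employee_Number) LIKE '10__';
```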

Page 7-28 Basic SQL Functions


LIKE

LIKE is a logical operator (recall BETWEEN, IN, NOT IN etc.) and not a function.
It is included in this module due to its complexity and its similarity to POSITION.

WHERE expression1 LIKE ' [ % | _ ] expression2 [ % | _ ] '

Searches for a character string pattern within another character string or character
string expression. “LIKE” is ANSI compliant.

LIKE recognizes two (2) wild cards: “%” (percent) and “_” (underscore).

“%” → Any number of characters preceding (or following) it, depending on placement.
“_” → Positional.

SELECT Last_Name FROM Employee WHERE Last_Name LIKE '%il%';

last_name
--------------------
Villegas
Wilson
Phillips

The leading “%” allows any number of characters preceding “il”; the trailing “%” allows any
number of characters following it.

Here is how to do this using POSITION:

SELECT Last_Name FROM Employee
WHERE POSITION('il' IN Last_Name) > 0;

Basic SQL Functions Page 7-29


LIKE Examples Using “%”
The percent (%) wildcard is used to signify that:
• Any number of characters may precede it from either the beginning of the string or from
the end of the previous search pattern.
• Any number of characters may follow it to either the end of the string or to the beginning
of the next search pattern.

In addition to the examples shown, observe the following queries and their results, where the
positions of the “o” and the “s” are switched:

SEL Last_name FROM employee
WHERE last_name LIKE '%s%o%';

last_name
--------------------
Short
Wilson
Johnson

SEL Last_name FROM employee
WHERE last_name LIKE '%o%s%';

last_name
--------------------
Rogers
Rogers
Hopkins
Johnson
Morrissey

Page 7-30 Basic SQL Functions


LIKE Examples Using “%”

Find employees having last names beginning with “Br”.

SELECT Last_Name FROM Employee WHERE Last_Name LIKE 'Br%';

last_name
--------------------
Brown
Brown

Find employees having last names ending with “wn”.
Recall last name is fixed character 20.

SELECT Last_Name FROM Employee WHERE Last_Name LIKE '%wn';

*** Query completed. No rows found.

SELECT Last_Name FROM Employee WHERE TRIM(Last_Name) LIKE '%wn';

last_name
--------------------
Brown
Brown

The value “Brown” is followed by 15 spaces. The following would work.

WHERE Last_Name LIKE '%wn               ';

Basic SQL Functions Page 7-31


LIKE Examples Using “_”
The second (and only other) wildcard used by LIKE is the underscore. This wildcard is positional
in that the placement of the underscore in the search string determines the place (position)
in the string that may hold any one character, as opposed to any number of characters (as with
“%”).

In our first example, we are looking for last names where the character “r” occurs as the second
character. To find names where a given character is the second from the last letter, as in the
second example on the facing page, TRIM is needed because last_name is a fixed-length column
padded with trailing spaces. TRIM would likely be advisable even if the last name were variable
character, since a space could occur as the last character for it as well. In that case, where the
TRIM is left off and a variable character column has a value like 'Brown ' (where there is a
space after ‘Brown’), the following query would include that last name in its result set,
because the trailing space matches the underscore.

SEL Last_name
FROM employee
WHERE last_name LIKE '%n_';

last_name
----------
Brown

Page 7-32 Basic SQL Functions


LIKE Examples Using “_”

Find employees having last names with “r” as the second character.
SELECT Last_Name FROM Employee WHERE Last_Name LIKE '_r%';

last_name
--------------------
Brown
Brown
Crane
Trader
Trainer

Find employees having last names with “w” as the second from the
last character.

SELECT Last_Name
FROM Employee
WHERE TRIM(Last_Name) LIKE '%w_';

last_name
--------------------
Brown
Brown

Basic SQL Functions Page 7-33


LIKE & ESCAPE
When the defined ESCAPE character is in the pattern string, it must be immediately followed by
an underscore, percent sign, or another ESCAPE character. In a left-to-right scan of the pattern
string the following rules apply when ESCAPE is specified:

• Until an instance of the ESCAPE character occurs, characters in the pattern are
interpreted at face value.
• When an ESCAPE character immediately follows another ESCAPE character, the two
character sequence is treated as though it were a single instance of the ESCAPE
character, considered as a normal character.
• When an underscore metacharacter immediately follows an ESCAPE character, the
sequence is treated as a single underscore character (not a wildcard character).
• When a percent metacharacter immediately follows an ESCAPE character, the sequence
is treated as a single percent character (not a wildcard character).
• When an ESCAPE character is not immediately followed by an underscore
metacharacter, a percent metacharacter, or another instance of itself, the scan stops and an
error is reported.
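
The rules above can be sketched as follows. The escape character '!' is an arbitrary choice made for this illustration, and DBC.Tables is used only because the facing page queries it; the table-name prefix "tmp" is hypothetical.

```sql
-- Escaped underscore: names beginning with the literal text "tmp_".
SELECT TableName
FROM DBC.Tables
WHERE TableName LIKE 'tmp!_%' ESCAPE '!';

-- Escaped percent: names containing a literal percent sign.
-- The first and last "%" are wildcards; "!%" is the literal character.
SELECT TableName
FROM DBC.Tables
WHERE TableName LIKE '%!%%' ESCAPE '!';

-- Doubled escape character: names containing a literal "!".
SELECT TableName
FROM DBC.Tables
WHERE TableName LIKE '%!!%' ESCAPE '!';
```

A pattern such as 'tmp!x%' with ESCAPE '!' would be rejected, because the escape character is followed by something other than an underscore, a percent sign, or itself.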

Page 7-34 Basic SQL Functions


LIKE & ESCAPE

To search for a character in a string that is a LIKE wild card you use the ESCAPE
clause.

• The escape character is followed by a wild card as a signal to the database to treat the
wild card as a character instead of as a wild card.
• The “escape” character is chosen by the author and may be any
character or digit.
• The chosen escape character should not be a character that is itself being searched for.

Find table names having an underscore as the second character in their names.

SELECT TableName
FROM DBC.Tables
WHERE TableName LIKE '_x_%' ESCAPE 'x';

TableName
------------------------------
T_CST_WT_COMMANDS
T_CST_CTRL_INPUT
T_CST_WT_OVERALL

Basic SQL Functions Page 7-35


CASESPECIFIC
Since the Teradata database is not case sensitive (case specific) in Teradata mode, there may be
times when we would like it to perform in a case sensitive manner. To accomplish this we use the
CASESPECIFIC function. As a Teradata extension to ANSI, this is the only function that
appears in the place where an argument list normally appears. That is, we might have expected it
to be used like this → CASESPECIFIC('brown'), but it is instead written in parentheses after
the expression it qualifies.

When using this function, values like “Brown” and “brown” are no longer equal since “B” and
“b” have a different case.

By default, ANSI mode is case sensitive (CASESPECIFIC).
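
A minimal sketch of the placement, assuming a Teradata-mode session and the course's Employee table, in which First_Name is defined as NOT CASESPECIFIC. The qualifier follows the expression it applies to, which may be a column or a literal.

```sql
-- Qualifying the column: 'ALAN' no longer matches the stored value 'Alan'.
SELECT First_Name
FROM Employee
WHERE First_Name (CASESPECIFIC) = 'ALAN';

-- Qualifying the literal, using the CS abbreviation, has the same effect.
SELECT First_Name
FROM Employee
WHERE First_Name = 'ALAN' (CS);
```

A comparison is treated as case specific when either side is case specific, so one qualifier is enough.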

Page 7-36 Basic SQL Functions


CASESPECIFIC

{ expression } ( { CASESPECIFIC | CS } )

Treats column or literal or expression as Case Sensitive.


CASESPECIFIC is a Teradata Extension to the ANSI standard.

SELECT Last_Name FROM Employee WHERE Last_Name = 'brown' (CASESPECIFIC);

*** Query completed. No rows found.

SELECT Last_Name FROM Employee WHERE Last_Name LIKE '%Ra%' (CASESPECIFIC);

last_name
--------------------
Ratzlaff
Rabbit

Note that “Crane” and “Trader” are also employees.

SELECT Last_Name FROM Employee
WHERE POSITION(('Ra' (cs)) IN Last_Name) > 0;

last_name
--------------------
Ratzlaff
Rabbit

Basic SQL Functions Page 7-37


EXTRACT
The EXTRACT function can be used to return portions of a date or time value. It cannot extract
combinations like “year-month” or “month-day”, but these can be obtained in other ways. For
instance, if carefully done, you could concatenate year with month or month
with day.

The result of the EXTRACT function is an integer value that represents that portion you are
extracting.
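
One way to build a “year-month” value by concatenation can be sketched as follows. TRIM is used because each EXTRACT result is an integer, which would otherwise be converted to character with leading spaces. The alias year_month is a hypothetical name.

```sql
SELECT TRIM(EXTRACT(YEAR  FROM DATE '2010-12-20')) || '-' ||
       TRIM(EXTRACT(MONTH FROM DATE '2010-12-20')) AS year_month;   -- '2010-12'
```

The “carefully done” caveat applies to single-digit months and days: a March date would produce '2010-3' rather than '2010-03', so extra formatting is needed if a fixed width matters.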

Page 7-38 Basic SQL Functions


EXTRACT

EXTRACT ( { YEAR | MONTH | DAY | HOUR | MINUTE | SECOND } FROM date-value )

Returns (year/month/day) portions from a date, or (hour/minute/second) portions
from a time. EXTRACT is partially ANSI compliant.

SELECT EXTRACT(YEAR FROM DATE'2010-12-20' + 30) AS Yr,
       EXTRACT(MONTH FROM DATE'2010-12-20' - 30) AS Mth,
       EXTRACT(DAY FROM DATE'2010-12-20' + 30) AS Dy;

Yr Mth Dy
----------- ----------- -----------
2011 11 19

SELECT EXTRACT(HOUR FROM TIME'10:20:30') AS Hr,
       EXTRACT(MINUTE FROM TIME'10:20:30') AS Mn,
       EXTRACT(SECOND FROM TIME'10:20:30') AS Scd;

Hr Mn Scd
----------- ----------- -----------
10 20 30

Basic SQL Functions Page 7-39


ADD_MONTHS
The ADD_MONTHS function is a Teradata extension to ANSI syntax. It is used to add months
to a particular date expression to yield a new date. The first argument must be a date value. If
the first argument is a literal date, then either of these forms is typical: DATE'YYYY-MM-DD'
or 'YYYY-MM-DD'. However, it is always better practice to use the DATE prefix whenever
referencing a date literal. The second argument must be an integer expression that represents the
number of months to add to the first (date) argument.

Since this can be used to add months to a date, it is also capable of adding years as well. The
second example for using the function illustrates this approach.
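
Applied to a column rather than to the current date, the same idea can be sketched as below. It assumes the course's Employee table; the aliases are hypothetical names.

```sql
SELECT Last_Name,
       Hire_Date,
       ADD_MONTHS(Hire_Date, 12 * 5) AS five_year_anniversary,    -- years expressed as months
       ADD_MONTHS(Hire_Date, -3)     AS three_months_before_hire  -- a negative value subtracts
FROM Employee
WHERE Employee_Number = 1010;
```

Writing the number of years as 12 * n keeps the intent visible in the request.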

Page 7-40 Basic SQL Functions


ADD_MONTHS

ADD_MONTHS(date-expression, integer-expression)

ADD_MONTHS creates a new date by adding an integer number of months
(integer-expression) to a date (date-expression).
ADD_MONTHS takes into account the Gregorian calendar (months may have 28, 29, 30,
or 31 days). This function is a Teradata extension.

SELECT DATE;                              09/05/01

SELECT ADD_MONTHS (DATE, 2);              2009-07-01

SELECT ADD_MONTHS (DATE, 12*14);          2023-05-01

SELECT ADD_MONTHS (DATE, -11);            2008-06-01

SELECT ADD_MONTHS ('2001-07-31', 2);      2001-09-30

SELECT ADD_MONTHS ('2003-12-31', 2);      2004-02-29

SELECT ADD_MONTHS ('2003-12-31', 14);     2005-02-28

Basic SQL Functions Page 7-41


The Calendars
A business calendar defines business and non-business days. The significance of a day being a
business day or a non-business day is user-determined. For example, a business day could be a
work day, and a non-business day could be either a non-working day, a weekend day, a holiday,
or a vacation day. You can define different week patterns (weekdays and weekends) and
exceptions (holidays and business open and closed days) for the system-defined calendars.

There are three Teradata system-defined business calendars that you can set for your session:
• Teradata
• ISO
• COMPATIBLE

All three calendars are based on the de facto international standard, the Gregorian calendar.

The Gregorian calendar has 365 days in most years and 366 days in a leap year. The calendars
support January 1, 1900, to December 31, 2100.

The default session calendar is Teradata. Each calendar defaults to all business days. You can
change that pattern using a Macro.

Sys_Calendar views provide business functionality for the three system-defined business
calendars.

The facing page illustrates the differences in the calendars.

Page 7-42 Basic SQL Functions


The Calendars

This query will produce the following results:

SELECT calendar_date, week_of_month
FROM Sys_Calendar.BusinessCal
WHERE week_of_month = 1
AND month_of_year = 1
AND year_of_calendar = 2011;

SET SESSION calendar = Teradata;      SET SESSION calendar = ISO;           SET SESSION calendar = Compatible;

calendar_date  week_of_month          calendar_date  week_of_month          calendar_date  week_of_month
-------------  -------------          -------------  -------------          -------------  -------------
11/01/02                   1          11/01/03                   1          11/01/01                   1
11/01/03                   1          11/01/04                   1          11/01/02                   1
11/01/04                   1          11/01/05                   1          11/01/03                   1
11/01/05                   1          11/01/06                   1          11/01/04                   1
11/01/06                   1          11/01/07                   1          11/01/05                   1
11/01/07                   1          11/01/08                   1          11/01/06                   1
11/01/08                   1          11/01/09                   1          11/01/07                   1

[Figure: three January 2011 calendar grids (S M T W T F S), one per session calendar, marking the days that fall in week 1 of the month.]

Basic SQL Functions Page 7-43


Calendar Differences
Each calendar differs in how it determines the first week of the year:

• TERADATA Calendar:
◦ The first full week of the year starts on Sunday.
◦ The days of the year before Sunday belong to Week 0. For example, if the year starts
on January 1, 2004 (a Thursday), then Week 0 is from January 1 to January 3. Week
1 begins on Sunday, January 4.

• ISO Calendar:
◦ This calendar follows the ISO and European standard.
◦ The week begins on Monday.
◦ The first week of the year is the first week that has at least 4 days. If a week has
fewer than 4 days, it belongs to the last week of the previous year (for example, week 52).
◦ There are no partial weeks. For example, if the year starts on January 1, 2008 (a
Tuesday) and the week start is Monday, week 1 of 2008 is from December 31, 2007,
to January 6, 2008.

• COMPATIBLE Calendar:
◦ This calendar is Oracle-compatible.
◦ It specifies that the first full week of a year begins on January 1, regardless of what
day of the week that is.
◦ There can be partial weeks with 1 day (for most years) or 2 days (for leap years) at
the end of the year.
◦ The day the week begins can change from year to year. For example, if January 1,
2011, is a Saturday, the first week of the year is from Saturday, January 1, 2011,
through Friday, January 7, 2011.

Page 7-44 Basic SQL Functions


Calendar Differences

Find the week number for a given date using the Teradata, Compatible,
and ISO calendars via the new calendar UDFs:

SET SESSION calendar = TERADATA;          (Week 0 of current year)

SELECT weeknumber_of_year(DATE '2011-01-01');

weeknumber_of_year(2011-01-01,'TERADATA')
-----------------------------------------
                                        0

SET SESSION calendar = COMPATIBLE;        (Week 1 of current year)

SELECT weeknumber_of_year(DATE '2011-01-01');

weeknumber_of_year(2011-01-01,'COMPATIBLE')
-------------------------------------------
                                          1

SET SESSION calendar = ISO;               (Last week of previous year)

SELECT weeknumber_of_year(DATE '2011-01-01');

weeknumber_of_year(2011-01-01,'ISO')
------------------------------------
                                  52

[Figure: a January 2011 calendar grid (S M T W T F S) beside each example, with Saturday, January 1 marked.]

Basic SQL Functions Page 7-45


Additional Calendar Functions
The facing pages show the syntax and tables listing the function names.

SYNTAX:
[TD_SYSFNLIB.] Function_Name ( expression [, calendar_name | NULL ] )

• expression:
an expression that results in a DATE, TIMESTAMP, or TIMESTAMP WITH TIME
ZONE value.

• calendar_name:
an optional business calendar name. The only possible values are Teradata, ISO, and
COMPATIBLE. This argument must be a character literal and cannot be a table column
or expression. If a calendar is not named, Teradata uses the calendar for the session.

• NULL:
an optional argument that selects the calendar that is set for the session.
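As a sketch of the optional arguments described above, the following requests name the calendar in the function call instead of changing the session calendar. The expected week numbers in the comments are the ones this module shows for January 1, 2011 under each calendar; the TD_SYSFNLIB qualifier is optional.

```sql
-- Teradata calendar: expected 0 (the days before the first Sunday are week 0).
SELECT TD_SYSFNLIB.WeekNumber_Of_Year(DATE '2011-01-01', 'TERADATA');

-- ISO calendar: expected 52 (the date belongs to the last week of 2010).
SELECT TD_SYSFNLIB.WeekNumber_Of_Year(DATE '2011-01-01', 'ISO');

-- NULL, like omitting the argument, uses the calendar set for the session.
SELECT WeekNumber_Of_Year(DATE '2011-01-01', NULL);
```

Naming the calendar in the call keeps a single request independent of whatever SET SESSION CALENDAR was issued earlier.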

Page 7-46 Basic SQL Functions


Additional Calendar Functions

Syntax
[TD_SYSFNLIB.] Function_Name ( expression [, calendar_name | NULL ] )

Function Name            Data Type  Purpose
-----------------------  ---------  ---------------------------------------------------------
DayNumber_of_week        Integer    Returns the number of days from the beginning of the week
                                    to the specified date.
DayNumber_of_month       Integer    Returns the number of days from the beginning of the month
                                    to the specified date.
DayNumber_of_year        Integer    Returns the number of days from the beginning of the year
                                    to the specified date.
DayNumber_of_calendar    Integer    Returns the number of days from the beginning of the
                                    business calendar to the specified date.
WeekNumber_of_year       Integer    Returns the number of weeks from the beginning of the year
                                    to the specified date.
WeekNumber_of_calendar   Integer    Returns the number of weeks from the beginning of the
                                    calendar to the specified date.
WeekNumber_of_month      Integer    Returns the number of weeks from the beginning of the month
                                    to the specified date.
DayOccurrence_of_month   Integer    Returns the nth occurrence of the weekday in the month for
                                    the specified date.

Basic SQL Functions Page 7-47


Additional Calendar Functions (continued)
The syntax and the argument descriptions (expression, calendar_name, and NULL) are the same as
for the functions on the preceding pages. The facing page lists the remaining function names.

Page 7-48 Basic SQL Functions


Additional Calendar Functions
(continued)

Syntax
[TD_SYSFNLIB.] Function_Name ( expression [, calendar_name | NULL ] )

Function Name              Data Type  Purpose
-------------------------  ---------  -------------------------------------------------------
MonthNumber_of_year        Integer    Returns the number of months from the beginning of the
                                      year to the specified date.
MonthNumber_of_quarter     Integer    Returns the number of months from the beginning of the
                                      quarter to the specified date.
MonthNumber_of_calendar    Integer    Returns the number of months from the beginning of the
                                      calendar to the specified date.
QuarterNumber_of_year      Integer    Returns the number of quarters from the beginning of the
                                      year to the specified date.
QuarterNumber_of_calendar  Integer    Returns the number of quarters from the beginning of the
                                      calendar to the specified date.
YearNumber_of_calendar     Integer    Returns the year of the specified date.
WeekNumber_of_quarter      Integer    Returns the number of weeks from the beginning of the
                                      quarter to the specified date.
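To make the table concrete, the sketch below applies three of these functions to one date. The date and the column aliases (Month_Nbr, Quarter_Nbr, Year_Nbr) are arbitrary illustrations, not names from the course database. Because May is the fifth month of the year and falls in the second quarter, the values 5, 2, and 2011 would be expected.

```sql
-- No calendar_name is given, so the session calendar is used.
SELECT MonthNumber_Of_Year(DATE '2011-05-15')    AS Month_Nbr,
       QuarterNumber_Of_Year(DATE '2011-05-15')  AS Quarter_Nbr,
       YearNumber_Of_Calendar(DATE '2011-05-15') AS Year_Nbr;
```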

Basic SQL Functions Page 7-49


Module 7: Summary
A summary of this module is discussed.

Page 7-50 Basic SQL Functions


Module 7: Summary

• SQL functions are of the form “function(arg)”.


• UPPER and LOWER return character data with its case changed.
• CHARACTER_LENGTH returns the length of a character string.
• TRIM removes leading and/or trailing spaces from a character string.
• POSITION returns the numeric position of one string inside another.
• SUBSTRING returns a portion of a value.
• LIKE is an operator that references the wildcards “%” and “_”.
• CASESPECIFIC enables the database to perform case specific tests.
• EXTRACT can be used to return portions of a date or time.
• ADD_MONTHS can add months to or subtract months from a date expression.

Basic SQL Functions Page 7-51


Module 7: Review Questions
Check your understanding of the concepts discussed in this module by completing the review
questions as directed by your instructor.

Page 7-52 Basic SQL Functions


Module 7: Review Questions

True or False:

1. The CHARACTER_LENGTH function can accept numeric values as input.


False – Numbers must be CAST as character first.
2. POSITION returns a value of SMALLINT.
False – it is data type integer.
3. SUBSTRING can accept character and numeric values as input.
True
4. The following syntax searches for a “%” in a column: C1 LIKE '%G%%'
ESCAPE 'G'
True
5. The ADD_MONTHS function may be used to add years to a date value.
True – e.g. ADD_MONTHS(DATE, 12*n)
6. EXTRACT can be used to return the day-of-week for a date.
False
7. Functions may not be nested. (e.g. function(function(function(arg))) is not
valid)
False

Basic SQL Functions Page 7-53


Module 7: Lab Exercise
Check your understanding of the concepts discussed in this module by completing the lab
exercise as directed by your instructor.

Page 7-54 Basic SQL Functions


Module 7: Lab Exercises

1. From the Employee table, display the last name and first name for
employees 1013, 1018, and 1024. Concatenate the columns so that you
see them as “last, first”.

2. Repeat #1. Replace your WHERE Clause, using LIKE to only list
employees who have an "LL" combination in their last name.

3. Using POSITION, change #2 to also include last names having an “FF”
combination in their last name, and instead display the result like this:

FullName
------------------------
Ratzlaff, L.
Villegas, A.
Phillips, C.

Basic SQL Functions Page 7-55


Notes:

Page 7-56 Basic SQL Functions


Module 8

Set Operators

After completing this module, you will be able to:

• Use the Union operator.


• Use the Intersect operator.
• Use the Except operator.

Set Operators Page 8-1


Notes:

Page 8-2 Set Operators


Table of Contents
What are Set Operators?............................................................................................................... 8-4
The Three Set Operators .............................................................................................................. 8-6
UNION ......................................................................................................................................... 8-8
UNION ALL .............................................................................................................................. 8-10
INTERSECT .............................................................................................................................. 8-12
EXCEPT (MINUS) .................................................................................................................... 8-14
EXCEPT and ALL ..................................................................................................................... 8-16
Module 8: Summary................................................................................................................... 8-18
Module 8: Review Questions ..................................................................................................... 8-20
Module 8: Lab Exercise ............................................................................................................. 8-22

Set Operators Page 8-3


What are Set Operators?
Set operators are aptly named for what they operate on – sets of data.

Page 8-4 Set Operators


What are Set Operators?

Set operators combine two or more results into a single result.

[Figure: Result 1 (hourly-paid rows from the Employee table) and Result 2 (salaried rows from the
Employee table), each with the columns Last Name, First Name, Emp Number, and Dept Number,
combine into one result containing all employee last names, first names, emp#, and dept#.]

It should make sense, then, that:


• Each individual result must have the same number of columns.
• The corresponding columns from each result should share the same data type.

The diagram is an illustration of a union of two results into one result.


Two other forms of set operators are available.

Set Operators Page 8-5


The Three Set Operators
Set operators are aptly named for what they operate on – sets of data. Whereas inner joins
(previous module) and outer joins (later module) return rows based upon some matching
condition, set operators deal with differences or commonalities among entire projected rows, not
just the columns referenced in a condition (predicate) or in the select list
(projection).

You can use set operators within the following operations:

• Simple queries
• Derived tables (not covered in this class)
• Subqueries
• Insert/Select clauses (later module)
• View definitions (later module)

SELECT statements connected by set operators can include all of the normal clause options for
SELECT except the WITH clause.
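As one hedged illustration of the list above, the sketch below places a set operator inside a subquery. It assumes the Employee and Department tables used elsewhere in this course; the department-name patterns are illustrative choices.

```sql
-- Employees who work in either a Sales or a Support department.
-- The UNION inside the subquery builds one combined list of
-- department numbers for the outer IN condition.
SELECT Last_Name,
       Department_Number
FROM   Employee
WHERE  Department_Number IN
       (SELECT Department_Number
        FROM   Department
        WHERE  Department_Name LIKE '%Sales%'
        UNION
        SELECT Department_Number
        FROM   Department
        WHERE  Department_Name LIKE '%Support%');
```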

Page 8-6 Set Operators


The Three Set Operators

The SQL set operators manipulate the result sets of two or more queries
by combining the results of each individual query into a single result set.

Set Operator      Function
----------------  ------------------------------------------------------------
INTERSECT         Returns only result rows that appear in all answer sets
                  generated by the individual SELECT statements.
MINUS / EXCEPT    Returns only rows returned by one SELECT minus those that
                  are duplicated in another set.
UNION             Combines two or more results into one result.

Set operators deal with actual sets of data (i.e. sets of rows).

Whereas join and subquery results are based upon IN, NOT IN, or equality of
one or more columns, set operators act upon rows of data as a set.

Set Operators Page 8-7


UNION
A UNION brings two or more result sets together as a single result set. The following rules
apply to all set operators:
• Each projected column list of a set operation must contain the same number of columns.
• The domains of the corresponding columns in each projected list must match (i.e.
character to character, integer to integer, decimal to decimal).
• An ORDER BY, if used, must follow the final SELECT and must reference a
column positionally, by number.
• When used, aliasing, format, title, data type, etc., are ignored in all but the very first
SELECT.

As an example of data type determination, if the data type for a column in the first projection is
CHARACTER(10), and that for the corresponding column in a subsequent projection is
CHARACTER(20), then values from the corresponding (longer) column will be truncated. In ANSI
mode this would result in an error.

If the data type for a column in the first projection is SMALLINT, and that for the
corresponding column in a subsequent projection is INTEGER, then a value from the corresponding
(larger) column that does not fit in a SMALLINT will result in a “numeric overflow”.
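Because the first SELECT determines the data type, one way to avoid truncation is to CAST the column in the first SELECT to the wider type. The following is a minimal sketch only: the VARCHAR(60) width and the Name_Text alias are assumptions made for illustration, and the Job table with its Description column is borrowed from the subquery examples in the next module.

```sql
-- The first SELECT sets the title and data type for the whole result,
-- so the CAST gives longer values from the second SELECT room to fit.
SELECT CAST(Department_Name AS VARCHAR(60)) AS Name_Text
FROM   Department
UNION
SELECT Description
FROM   Job
ORDER BY 1;
```

The same idea applies to numeric columns: casting a SMALLINT column in the first SELECT to INTEGER leaves room for larger values from later SELECTs.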

The following query is how one could write the SQL request on the facing page without using a
UNION.

SELECT Last_Name,
Department_Number AS Dept#,
Salary_Amount
FROM Employee
WHERE Department_Number = 401
OR Salary_Amount BETWEEN 35000 AND 38000
ORDER BY 1;

last_name                   Dept#  salary_amount
--------------------  -----------  -------------
Brown                         401       43100.00
Hoover                        401       25525.00
Hopkins                       403       37900.00
Johnson                       401       36300.00
Machado                       401       32300.00
Phillips                      401       24500.00
Rogers                        401       46000.00
Trader                        401       37850.00

Page 8-8 Set Operators


UNION

SELECT Last_Name,
       Department_Number AS Dept#,
       Salary_Amount
FROM Employee
WHERE Department_Number = 401
UNION
SELECT Last_Name,
       Department_Number,
       Salary_Amount
FROM Employee
WHERE Salary_Amount
      BETWEEN 35000 AND 38000;

• The number of columns for each select is the same.
• The data types for the corresponding columns must be of the same domain
(e.g. INT vs. SMALLINT vs. BYTEINT is OK).
• UNION does a DISTINCT.
• The first SELECT determines the heading, format, etc. - if used.

last_name       Dept#  salary_amount
------------ --------  -------------
Brown             401       43100.00
Hoover            401       25525.00
Hopkins           403       37900.00
Johnson           401       36300.00
Machado           401       32300.00
Phillips          401       24500.00
Rogers            401       46000.00
Trader            401       37850.00

Note:
• The result is unordered.
• There is a single result set.
• Some employees satisfy the conditions for both selects, and yet no
employee row occurs more than once.

[Figure: Venn diagram of Result 1 and Result 2.]

Set Operators Page 8-9


UNION ALL
The ALL option instructs the database not to eliminate duplicate rows. Since the effort (in the
form of CPU) to eliminate duplicates is not without cost, the ALL option avoids this cost
and, hence, performs better than the same request without it.
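When the SELECTs cannot return the same row, UNION and UNION ALL return the same rows, so ALL simply skips the duplicate-elimination step. A minimal sketch, assuming the Employee table used in this module, that Employee_Number is unique, and that each employee row carries exactly one department number (the department numbers are illustrative):

```sql
-- An employee row has a single Department_Number, so no row can
-- satisfy both SELECTs. Employee_Number keeps every projected row
-- distinct, so there is nothing for a plain UNION to remove.
SELECT Employee_Number, Last_Name, Department_Number
FROM   Employee
WHERE  Department_Number = 401
UNION ALL
SELECT Employee_Number, Last_Name, Department_Number
FROM   Employee
WHERE  Department_Number = 403
ORDER BY 3, 2;
```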

Page 8-10 Set Operators


UNION ALL
SELECT Last_Name,
       Department_Number AS Dept#,
       Salary_Amount
FROM Employee
WHERE Department_Number = 401
UNION ALL
SELECT Last_Name,
       Department_Number,
       Salary_Amount
FROM Employee
WHERE Salary_Amount BETWEEN 35000 AND 38000
ORDER BY 2, 1;

• Note the difference between this query (UNION ALL) and the earlier one (UNION).
• ORDER BY must be on the last SELECT, and it must be a positional
reference (i.e. not on a column name).

last_name          Dept#  salary_amount
----------- -----------  -------------
Brown               401       43100.00
Hoover              401       25525.00
Johnson             401       36300.00
Johnson             401       36300.00
Machado             401       32300.00
Phillips            401       24500.00
Rogers              401       46000.00
Trader              401       37850.00
Trader              401       37850.00
Hopkins             403       37900.00

Set Operators Page 8-11


INTERSECT
The INTERSECT operator does exactly what its name states: it returns the intersection of two or
more sets of rows. For the two tables below, the intersection would be the 1st, 2nd, and 3rd rows
of the left table, which match the 3rd, 4th, and 2nd rows of the right table respectively.

Left table          Right table
c1  c2  c3          c1  c2  c3
a   b   c           a   b   d
b   c   null        c   d   e
c   d   e           a   b   c
d   e   f           b   c   null

The following query answers the question on the facing page without using INTERSECT.

SELECT Last_Name,
Department_Number AS Dept#,
Salary_Amount
FROM Employee
WHERE Department_Number = 401
AND Salary_Amount BETWEEN 35000 AND 38000
ORDER BY 2, 1;

Page 8-12 Set Operators


INTERSECT

Returns result rows that appear in all answer sets generated by the individual
intersected SELECT statements.

SELECT Last_Name,
       Department_Number AS Dept#,
       Salary_Amount
FROM Employee
WHERE Department_Number = 401
INTERSECT
SELECT Last_Name,
       Department_Number,
       Salary_Amount
FROM Employee
WHERE Salary_Amount BETWEEN 35000 AND 38000
ORDER BY 2, 1;

Rows from the first SELECT:

last_name    Dept#  salary_amount
-----------  -----  -------------
Machado        401              ?
Rogers         401       46000.00
Brown          401       43100.00
Phillips       401       24500.00
Johnson        401       36300.00
Trader         401       37850.00
Hoover         401       25525.00

Rows from the second SELECT:

last_name    dept#  salary_amount
-----------  -----  -------------
Hopkins        403       37900.00
Trader         401       37850.00
Johnson        401       36300.00

Result (common rows):

last_name                   Dept#  salary_amount
--------------------  -----------  -------------
Johnson                       401       36300.00
Trader                        401       37850.00

Set Operators Page 8-13


EXCEPT (MINUS)
The EXCEPT operator removes, from one result set, all of the matching rows from a different
result set. If we take the two tables from the INTERSECT discussion (repeated below), the result
of the left table EXCEPT the right table would be whatever the INTERSECT did not return from
the left table, namely its 4th row. (Recall that the INTERSECT returned the 1st, 2nd, and 3rd
rows from the left table.)

Left table          Right table
c1  c2  c3          c1  c2  c3
a   b   c           a   b   d
b   c   null        c   d   e
c   d   e           a   b   c
d   e   f           b   c   null
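The two example tables can be reproduced to check both operators. The sketch below is an illustration under stated assumptions: the volatile table names T_Left and T_Right and the CHAR(1) column types are hypothetical, chosen only to mirror the tables above. It also relies on set operators treating NULLs in corresponding columns as matching, which is what the description above implies when it counts the row (b, c, null) as common to both tables.

```sql
-- Hypothetical scratch tables mirroring the two example tables.
CREATE VOLATILE TABLE T_Left  (c1 CHAR(1), c2 CHAR(1), c3 CHAR(1))
ON COMMIT PRESERVE ROWS;
CREATE VOLATILE TABLE T_Right (c1 CHAR(1), c2 CHAR(1), c3 CHAR(1))
ON COMMIT PRESERVE ROWS;

INSERT INTO T_Left  VALUES ('a', 'b', 'c');
INSERT INTO T_Left  VALUES ('b', 'c', NULL);
INSERT INTO T_Left  VALUES ('c', 'd', 'e');
INSERT INTO T_Left  VALUES ('d', 'e', 'f');

INSERT INTO T_Right VALUES ('a', 'b', 'd');
INSERT INTO T_Right VALUES ('c', 'd', 'e');
INSERT INTO T_Right VALUES ('a', 'b', 'c');
INSERT INTO T_Right VALUES ('b', 'c', NULL);

-- Rows common to both tables: (a,b,c), (b,c,null), (c,d,e).
SELECT c1, c2, c3 FROM T_Left
INTERSECT
SELECT c1, c2, c3 FROM T_Right;

-- Rows of T_Left that do not appear in T_Right: (d,e,f).
SELECT c1, c2, c3 FROM T_Left
EXCEPT
SELECT c1, c2, c3 FROM T_Right;
```

Swapping the two SELECTs in the EXCEPT changes the answer: T_Right EXCEPT T_Left would return only (a, b, d).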

The facing page has lines drawn through the rows that are being omitted due to either the left
table rows matching rows from the right table, or (for “Short”) a row that wasn’t involved at all.

[Figure: Venn diagram of Result 1 and Result 2; the part of Result 1 that lies outside Result 2
is labeled “This area is the EXCEPT result”.]

Page 8-14 Set Operators


EXCEPT (MINUS)

SELECT Last_Name, Salary_Amount
FROM Employee
WHERE Job_Code
      BETWEEN 312101 AND 412101
EXCEPT
SELECT Last_Name, Salary_Amount
FROM Employee
WHERE Salary_Amount
      BETWEEN 25000 AND 35000;

Result:

last_name    salary_amount
-----------  -------------
Johnson           36300.00
Rogers            46000.00
Rogers            56500.00
Trader            37850.00

Venn Diagram on left page.

Each SELECT run on its own returns the following rows:

SELECT Last_Name, Salary_Amount
FROM Employee
WHERE Job_Code
      BETWEEN 312101 AND 412101;

last_name    salary_amount
-----------  -------------
Kanieski          29250.00
Stein             29450.00
Hoover            25525.00
Rogers            46000.00
Rogers            56500.00
Trader            37850.00
Johnson           36300.00

SELECT Last_Name, Salary_Amount
FROM Employee
WHERE Salary_Amount
      BETWEEN 25000 AND 35000;

last_name    salary_amount
-----------  -------------
Kanieski          29250.00
Stein             29450.00
Hoover            25525.00
Ryan              31200.00
Machado           32300.00
Rabbit            26500.00
Lombardo          31000.00
Short             34700.00

Set Operators Page 8-15


EXCEPT and ALL
The facing page contrasts what is returned from an EXCEPT with what would be returned if
EXCEPT ALL were used. As with all set operators not using the ALL option, duplicate rows are
eliminated. Typically the ALL option is used in conjunction with UNION; however, it may be
used with any set operator.

Page 8-16 Set Operators


EXCEPT and ALL

To better understand the difference between these two, note:
• We are only projecting the last names.
• There are 2 “Rogers” in the table (see query below).
• By itself, EXCEPT returns only the distinct occurrences.

With EXCEPT:

SELECT Last_Name
FROM Employee
WHERE Job_Code
      BETWEEN 312101 AND 412101
EXCEPT
SELECT Last_Name
FROM Employee
WHERE Salary_Amount
      BETWEEN 25000 AND 35000;

last_name
-----------
Johnson
Rogers
Trader

With EXCEPT ALL:

SELECT Last_Name
FROM Employee
WHERE Job_Code
      BETWEEN 312101 AND 412101
EXCEPT ALL
SELECT Last_Name
FROM Employee
WHERE Salary_Amount
      BETWEEN 25000 AND 35000;

last_name
-----------
Johnson
Rogers
Rogers
Trader

SEL last_name, salary_amount, job_code, department_number
FROM Employee WHERE last_name = 'rogers';

last_name             salary_amount     job_code  department_number
--------------------  -------------  -----------  -----------------
Rogers                     46000.00       412101                401
Rogers                     56500.00       321100                302

Set Operators Page 8-17


Module 8: Summary
A summary of the module is presented.

Page 8-18 Set Operators


Module 8: Summary

• Set operators may be used in almost any SQL construct.
• Each SELECT must specify a 'FROM table_name' clause.
• Duplicate rows are eliminated unless the ALL option is used.
• Default order of evaluation is as follows:
- INTERSECT
- UNION
- EXCEPT (From left to right)
• Evaluation order can be manipulated with parentheses.
• Columns in the same relative position within their respective SELECT
statement, must have the same domain.
• Formats and titles are determined in the first SELECT only, and are
applied to the entire query result.
• ORDER BY clause must be specified following the last SELECT and
must use numeric designators.
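The default evaluation order and the effect of parentheses can be sketched as follows. The three conditions are illustrative; the sketch assumes the Employee table used in this module and that Employee_Number is unique, so that each projected row identifies one employee.

```sql
-- Default order: the INTERSECT is evaluated first, then the UNION.
-- Returns everyone in department 401, plus those employees in
-- department 403 who earn between 35000 and 38000.
SELECT Employee_Number, Last_Name FROM Employee
WHERE  Department_Number = 401
UNION
SELECT Employee_Number, Last_Name FROM Employee
WHERE  Department_Number = 403
INTERSECT
SELECT Employee_Number, Last_Name FROM Employee
WHERE  Salary_Amount BETWEEN 35000 AND 38000
ORDER BY 2;

-- Parentheses force the UNION to be evaluated first.
-- Returns only those employees in department 401 or 403 who earn
-- between 35000 and 38000.
(SELECT Employee_Number, Last_Name FROM Employee
 WHERE  Department_Number = 401
 UNION
 SELECT Employee_Number, Last_Name FROM Employee
 WHERE  Department_Number = 403)
INTERSECT
SELECT Employee_Number, Last_Name FROM Employee
WHERE  Salary_Amount BETWEEN 35000 AND 38000
ORDER BY 2;
```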

Set Operators Page 8-19


Module 8: Review Questions
Check your understanding of the concepts discussed in this module by completing the review
questions as directed by your instructor.

Page 8-20 Set Operators


Module 8: Review Questions

True or False:

1. The INTERSECT operator returns the same result as does the MINUS
operator, however MINUS is a Teradata extension.
False – MINUS and EXCEPT are equivalent (though MINUS is an extension)
2. The following is valid for a set operator: ORDER BY Last_Name
False – you must order by a positional number
3. The ALL option may potentially return more rows than if not using it.
True
4. Set operators may cause truncation among corresponding columns of
result sets.
True
5. An INTERSECT is just another way of returning an inner result.
False
6. If all three different set operators are referenced in a query, the UNION is
performed first.
False – the INTERSECT is first
7. “SELECT *” is a valid projection in a set operation.
True – as long as the numbers of columns projected among projections
remains constant

Set Operators Page 8-21


Module 8: Lab Exercise
Check your understanding of the concepts discussed in this module by completing the lab
exercise as directed by your instructor.

Page 8-22 Set Operators


Module 8: Lab Exercise

Use set operators taught in this module to do the following.

1. For those employees who work in departments 301 or 401, remove
those whose salary is less than $35,000.00 and order this result by last
name and then first name.

2. Use UNION to combine employees who earn more than $10,000.00 with
those who work in departments 301 or 401. Alias last name to LNM
and first name to FNM.

3. Add the ALL option to #2 and note the different result.

4. Change #1 to find those who satisfy both the department and salary
conditions.

Set Operators Page 8-23


Notes:

Page 8-24 Set Operators


Module 9

Subqueries

After completing this module, you will be able to:

• Write subqueries to replace an IN LIST or a NOT IN list.


• Distinguish between the INNER and OUTER portions of a subquery.
• Write subqueries utilizing multiple matches.
• Nest Subqueries.
• Use EXISTS and NOT EXISTS to test for conditions.
• Use ANY, SOME, and ALL with LIKE and equality test conditions.
• Recognize the effect NULL has on NOT IN subqueries.

Subqueries Page 9-1


Notes:

Page 9-2 Subqueries


Table of Contents
Subquery Introduction.................................................................................................................. 9-4
Basic Subquery Concepts............................................................................................................. 9-6
Relating Concepts and Subqueries ............................................................................................... 9-8
Adding Conditions ..................................................................................................................... 9-10
Nesting Subqueries .................................................................................................................... 9-12
Multiple Column Matching ........................................................................................................ 9-14
NULL and NOT IN Subquery.................................................................................................... 9-16
Module 9: Summary................................................................................................................... 9-18
Module 9: Review Questions ..................................................................................................... 9-20
Module 9: Lab Exercise ............................................................................................................. 9-22

Subqueries Page 9-3


Subquery Introduction
This is our first foray into involving more than a single table in a request. A subquery replaces
an IN-List by performing a SELECT to generate the set of values, which will then be joined to
the table in the query above it. The term “join”, in this context, refers to how the database
obtains the final result rows. In our example on the facing page, it gets the result set by joining
the employee table with the department table.

Page 9-4 Subqueries


Subquery Introduction

Recall how a list of values can be passed into a query as a set of WHERE values.

Literal list of all Department Numbers


SELECT *
FROM Employee
WHERE Department_Number [ NOT ] IN
(403,600,402,201,100,302,301,501,401);

You can replace the “IN LIST” with a subquery that will generate the set.

Derived list of all Department Numbers


SELECT *
FROM Employee
WHERE Department_Number [ NOT ] IN
(SELECT Department_Number FROM Department);

The query used to replace the list of values is called a “subquery.”


• Note how the subquery is not terminated with a semicolon.
• The derived list of values generated by the database is a distinct
list, even if the values are not unique.

Subqueries Page 9-5


Basic Subquery Concepts
The terminology on the facing page is quite important. These terms recur throughout this
class and in other texts. Here we simply use the terms “inner” and
“outer” to describe certain relationships between two tables, namely:

• Employees having department numbers that are not in the department table. (outer)
• Employees having department numbers that are in the department table. (inner)
• Departments that have people assigned to them (inner)
• Departments in which no one works (outer)

Note that items 2 and 3 (above) actually refer to the same set. But it is important to note that
their business focus is quite different.

Next, we shall relate these concepts to subqueries themselves.

Page 9-6 Subqueries


Basic Subquery Concepts

The basic terms for a subquery are:


• The inner portion consists of those rows whose data satisfy the conditions.
• The outer portion consists of those rows whose data do not satisfy the conditions.

Employees having invalid department numbers. (OUTER)
Employees and their departments -or- Departments and their employees. (INNER)
Departments having no employees. (OUTER)

[Figure: three Venn diagrams of Emp and Dept, one for each statement above.]

• Venn Diagrams are used only to illustrate concepts.


• Employees and Departments are each differing sets of data that happen to share a
common column – department number.
• Each Venn Diagram, above, depicts department numbers and how they relate to
each diagram as stated in the shaded boxes.

Subqueries Page 9-7


Relating Concepts and Subqueries
The relationship between the earlier concepts of “inner” and “outer” and subqueries is basically positional.
The query that is the object of the IN or NOT IN is referred to as the INNER query, and its table
as the INNER table. The table that we are projecting column values from is referred to as the
OUTER table, and its query as the OUTER query.

Notice how the concepts of “inner” and “outer” relate to the Venn-Diagram. The answers to the
questions posed are:

• Left side question – area 3.
• Right side question – area 1.

Both of these areas are “outer” areas and these result sets are referred to as outer-results. Area 2
represents (conceptually) the inner-results, or IN results. This is the area where department
numbers from both tables match.

Page 9-8 Subqueries


Relating Concepts and Subqueries

In the following, note the use of the terms “Inner” and “Outer”.
Only rows from the Outer Table may be projected.
The Inner Query is used to generate a distinct list (In List) of values.

[Figure: Venn diagram of Dept and Emp with its areas numbered 1, 2, and 3.]

For each query below: into which area of the diagram would this result set fall?

Outer Query References the Outer Table

Left side query:

SELECT *
FROM Employee
WHERE Department_Number NOT IN
      (SELECT Department_Number
       FROM Department);

Right side query:

SELECT *
FROM Department
WHERE Department_Number NOT IN
      (SELECT Department_Number
       FROM Employee);

Inner Query References the Inner Table

Subqueries Page 9-9


Adding Conditions
You may think of subqueries as you would any other conditions. Granted, they involve more
typing and they are more involved, but when it comes down to it, a subquery is just another
condition.

On the facing page we illustrate how one would interpret a business question from a subquery.
To do so, begin from the lowest level and move outward to the outer-most query. The bottom
example illustrates how two separate subqueries, AND’ed together, might be interpreted as
separate conditions. If they were OR’ed together, then the business question would be: “People
who are managers or work in support departments.” This is not an example of nested
subqueries, however, which will be discussed next.
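The OR’ed variation mentioned above can be sketched by changing only the operator that connects the two subqueries. It uses the same Employee, Job, and Department tables and columns as the bottom example on the facing page.

```sql
-- People who are managers OR work in support departments.
SELECT *
FROM   Employee
WHERE  Job_Code IN
       (SELECT Job_Code
        FROM   Job
        WHERE  Description LIKE '%Manager%')
OR     Department_Number IN
       (SELECT Department_Number
        FROM   Department
        WHERE  Department_Name LIKE '%Support%');
```

Each subquery is still evaluated as its own condition; only the way the two conditions are combined changes.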

Page 9-10 Subqueries


Adding Conditions

You can include conditions to the outer and inner queries like this.

SELECT *
FROM Employee
WHERE Job_Code = 412101
AND Department_Number IN
    (SELECT Department_Number
     FROM Department
     WHERE Department_Name LIKE '%Support%');

Outer query: Employees with job code 412101 who work in support departments
Subquery: Support Departments

You can include multiple subqueries like this.

SELECT *
FROM Employee
WHERE Job_Code IN
    (SELECT Job_Code
     FROM Job
     WHERE Description LIKE '%Manager%')
AND Department_Number IN
    (SELECT Department_Number
     FROM Department
     WHERE Department_Name LIKE '%Support%');

Outer query: People who are managers AND work in support departments
First subquery: Managers
Second subquery: Support Departments

Subqueries Page 9-11


Nesting Subqueries
The facing page illustrates how one may nest subqueries. By “nest”, we mean that the first
subquery (selecting “sales” departments) itself contains a subquery. To fully understand the
“nested-ness”, begin at the lowest level and work outward (or upward, depending upon how it is
written).

The previous example was not nested because each subquery was a separate condition, and
not part of another subquery.

Page 9-12 Subqueries


Nesting Subqueries

You can also “nest” subqueries.

To interpret the business question of the query, start from the lowest-level
query and work upward.

-- Employee information on Sales department managers who are not assigned to a customer.
SELECT *
FROM Employee
WHERE Department_Number IN
   -- Sales department managers who have no assigned customers.
   (SELECT Department_Number
    FROM Department
    WHERE Department_Name LIKE '%Sales%'
    AND Manager_Employee_Number NOT IN
       -- Sales people having customers
       (SELECT Sales_Employee_Number
        FROM Customer) );

Sales department managers who are not assigned to a customer.

Subqueries Page 9-13


Multiple Column Matching
With multiple-column matching, each combination of values from the outer table must match,
value for value, a combination from the inner table. For instance:

• D# 49; M# 100 does not match D# 49; M# 200
• D# 49; M# 100 does not match D# 10; M# 100
• D# 49; M# 100 does match D# 49; M# 100
• D# 49; M# 100 does not match D# 10; M# 200

It should go without saying that one cannot match 2 values with 3 values, but let’s say it anyway.
That is to say, a list of values can only be matched against a list with the same number of
values. Observe.

SELECT *
FROM Employee
WHERE (Employee_Number,
Department_Number,
Manager_Employee_Number)
IN (SELECT Department_Number,
Manager_Employee_Number
FROM Department);

*** Failure 3608 Not enough values in subquery.

Page 9-14 Subqueries


Multiple Column Matching

You can also match multiple outer columns to multiple inner columns.

To find employees who work in departments that they manage, begin by finding
all department managers from the department table.

-- Departments and their managers
SELECT Department_Number,
       Manager_Employee_Number
FROM Department;

Then select these people from the employee table.

-- Employees who work in departments that they manage
SELECT *
FROM Employee
WHERE (Department_Number, Employee_Number) IN
   (SELECT Department_Number,         -- Departments and their managers
           Manager_Employee_Number
    FROM Department);

Subqueries Page 9-15


NULL and NOT IN Subquery
The first query on the facing page recalls what occurs when a null is introduced into a NOT IN
list. That is, no rows are returned. Recall the EXPLAIN output.

Explanation
---------------------------------------------------------
. . . .
3) We do an all-AMPs RETRIEVE step from DLM.Employee by
way of an all-rows scan with a condition of
("(DLM.Employee.dept# <> NULL) AND
((DLM.Employee.dept# <> 401) AND DLM.Employee.dept# <>
403 ))") into Spool 1 (group_amps), which is built
locally on the AMPs.
. . . .

Recall that all AND conditions must evaluate true, and that, for this WHERE condition, the
condition EMPLOYEE.dept# <> NULL can never evaluate true, so no rows are returned.

In the middle example of the facing page we replaced the NOT IN list with a subquery. Since
the subquery may, in fact, return a null department number, the query could potentially result in
no rows being returned as well.

The example at the bottom of the page suggests adding a condition to the subquery that avoids
introducing a null into the resulting NOT IN list, thus avoiding the no-rows-returned result.
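
One way to check whether the inner table would actually feed a null into the NOT IN list is to count the null values first. This is only a sketch, assuming the Employee table and Department_Number column from the facing page; the alias Null_Dept_Rows is a name chosen here for illustration.

```sql
-- How many Employee rows would put a null into the NOT IN list?
SELECT COUNT(*) AS Null_Dept_Rows
FROM Employee
WHERE Department_Number IS NULL;
```

If the count is greater than zero, the unfiltered NOT IN subquery returns no rows, and the IS NOT NULL condition in the inner query is needed.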

Page 9-16 Subqueries


NULL and NOT IN Subquery

Recall from an earlier module that the following query will return zero rows.

SELECT *
FROM Department
WHERE Department_Number NOT IN (401, 403, null);

Consider the following query where the inner query returns a null department.
How many rows will be returned?
What would be a good condition to add to the inner query?
SELECT *
FROM Department
WHERE Department_Number NOT IN
(SELECT Department_Number FROM Employee);

Answer:
SELECT *
FROM Department
WHERE Department_Number NOT IN
(SELECT Department_Number FROM Employee
WHERE Department_Number IS NOT NULL);

Subqueries Page 9-17


Module 9: Summary
A summary of this module is discussed.

Page 9-18 Subqueries


Module 9: Summary

• IN and NOT IN may involve sets generated by query results.

• Subqueries are comprised of inner and outer tables and queries.

• You can only project rows from the outer table.

• The database derives a distinct list of subquery rows.

• Additional conditions may be referenced in either the inner or outer queries.

• Multiple match subqueries can be performed.

• Subqueries may be nested.

Subqueries Page 9-19


Module 9: Review Questions
Check your understanding of the concepts discussed in this module by completing the review
questions as directed by your instructor.

Page 9-20 Subqueries


Module 9: Review Questions

True or False:

1. ORDER BY is not allowed in the outer query.


False

2. WHERE is not allowed on the outer query.


False

3. DISTINCT is not allowed on the inner query.


False – it is automatically performed, but may be used if desired

4. The inner query must include a semi-colon.


False

Subqueries Page 9-21


Module 9: Lab Exercise
Check your understanding of the concepts discussed in this module by completing the lab
exercise as directed by your instructor.

Page 9-22 Subqueries


Module 9: Lab Exercise

1. Write a subquery that finds employees who are not employee managers
(i.e., not managers in the employee table).

2. Edit #1 to find employees who are neither employee managers nor
department managers.

Extra tough
3. Write a nested subquery that finds employees whose managers are
department managers that are not managers in the employee table.

Subqueries Page 9-23


Notes:

Page 9-24 Subqueries


Module 10

Inner Join

After completing this module, you will be able to:

• Project columns from many tables within the same projection.
• Distinguish between Subqueries and Inner Joins.
• Discuss differences in styles for coding join syntax.
• Contrast inner joins with cross joins.
• Join a table to itself (Self Join).
• Identify pitfalls associated with incorrect aliasing.
• Identify problems associated with many-to-many joins.

Inner Join Page 10-1


Notes:

Page 10-2 Inner Join


Table of Contents
Inner Join Concepts .................................................................................................................... 10-4
Inner Join vs. Subquery .............................................................................................................. 10-6
Table Name Qualifications and Aliasing ................................................................................. 10-10
Varied Forms of INNER Join .................................................................................................. 10-12
Many-Table INNER Joins........................................................................................................ 10-14
Varied Forms of Many-Table Inner Joins ................................................................................ 10-16
Using Parentheses to Understand Order .................................................................................. 10-18
Using Parentheses with Other Forms ....................................................................................... 10-20
Self Joins .................................................................................................................................. 10-22
Guaranteeing Uniqueness......................................................................................................... 10-24
IN vs. Inner Join ....................................................................................................................... 10-26
NOT IN vs. Inner Join.............................................................................................................. 10-28
Cross Join ................................................................................................................................. 10-30
Mistakes on Table Aliasing...................................................................................................... 10-32
Mistakes on Column Aliasing .................................................................................................. 10-34
Module 10: Summary............................................................................................................... 10-36
Module 10: Review Questions ................................................................................................. 10-38
Module 10: Lab Exercise ......................................................................................................... 10-40

Inner Join Page 10-3


Inner Join Concepts
The inner join is, perhaps, one of the most important concepts in all of SQL. Joins are the norm!
There are few times when they aren’t needed.

In our example, we might want to look up the department name for any number of employees.
The employee table holds no department information other than the department number, so the
name must come from the department table. The department number is stored in the employee
table precisely because it is what we need to get information for the department. The actual
syntax for inner join is (as you will soon see) fairly straightforward.

Page 10-4 Inner Join


Inner Join Concepts

• Inner joins project values based upon column values of one table matching
corresponding column values of another table based on equality.
• To get a report of employee number, last name, and department name, you would
need to join the employee table and the department table.
• Department number is the common column that determines the way data in these
two tables will be matched.
• Note the one-to-many relationship for the join condition.

EMPLOYEE

EMP   MGR EMP  DEPT  JOB     LAST      FIRST    HIRE    BIRTH   SAL
NUM   NUM      NUM   CODE    NAME      NAME     DATE    DATE    AMT
1006  1019     301   312101  Stein     John     761015  531015  2945000
1008  1019     301   312102  Kanieski  Carol    770201  580517  2925000
1005  0801     403   431100  Ryan      Loretta  761015  550910  3120000
1004  1003     401   412101  Johnson   Darlene  761015  460423  3630000
1007  1005     403   432101  Villegas  Arnando  770102  370131  4970000
1003  0801     401   411100  Trader    James    760731  470619  3785000

Result

employee_number  last_name  department_name
1004             Johnson    customer support
...              ...        ...

DEPARTMENT

DEPT  DEPT                  BUDGET    MGR EMP
NUM   NAME                  AMOUNT    NUM
501   marketing sales       80050000  1017
301   research and devel.   46560000  1019
302   product planning      22600000  1016
403   education             93200000  1005
402   software support      30800000  1011
401   customer support      98230000  1003
201   technical operations  29380000  1025

Inner Join Page 10-5


Inner Join vs. Subquery
The two bullets on the facing page ask us to contrast subqueries with inner joins.

• Inner joins are similar to subqueries in that an inner join returns an inner result.
Subqueries, however, can return outer result sets as well.
• Where subqueries are limited to projecting only from the outer table, inner joins can
project columns from any joined table.

Page 10-6 Inner Join


Inner Join vs. Subquery

Contrast the following bullets with what we know about subqueries.


• Inner Joins return only inner result sets.
• Inner joins can be used to project from any joined table.
[Venn diagrams of the Emp and Dept sets, one per case:]
(OUTER) Employees having invalid department numbers.
(INNER) Employees and their departments -or- Departments and their employees.
(OUTER) Departments having no employees.

• As with subqueries, Venn Diagrams are used only to illustrate concepts.
• As before, Employees and Departments are each differing sets of data that happen
to share a common column – department number.
• And each Venn Diagram, above, depicts department numbers and how they relate
to each diagram as stated in the shaded boxes.

Inner Join Page 10-7


Comparing Subqueries and Inner Joins
The facing page adds yet one more distinction between inner join and subquery: namely, that
inner joins do not guarantee a one-to-many relationship between tables as do subqueries. On
another note, the concept of outer results is not applicable to inner joins in that they project
column values only when the join condition evaluates true.

Page 10-8 Inner Join


Comparing Subqueries and Inner Joins

• Note the differences between the syntax used for a subquery and that for the join.
• The join condition must evaluate “True” in order to project column values.
• The SELECT *, in the case of the join, will project all columns from both tables for
comparisons that evaluate “True.”

Subquery:

SELECT Last_Name
FROM Employee
WHERE Department_Number IN
   (SELECT Department_Number
    FROM Department);

Join Equivalent:

SELECT Last_Name, Department_Name
FROM Employee, Department
WHERE Employee.Department_Number =
      Department.Department_Number;

Recall that for a subquery:

1. You can only project columns from the outer table.
2. A distinct list guarantees a one-to-many relationship between the inner and outer
table.
3. Can return an inner result (using IN) or an outer result (using NOT IN).

However, for an Inner Join:

1. You can project columns from any table.
2. Does not guarantee a one-to-many relationship between the tables.
3. Can only return an inner result.

Inner Join Page 10-9


Table Name Qualifications and Aliasing
Qualifications are often required and nearly always recommended. On the facing page we see
examples of qualifications that are not required, as well as some that are.

When not required:
• The name occurs in only one of the tables. (No ambiguity exists.)

When required:
• The name is ambiguous. (It occurs in more than one table.)

Aliasing is a way to give a table another, more user-friendly name, much as is done when
aliasing columns. Aliasing tables is optional (as it is for columns), but it is used in nearly all
queries involving joins. It is almost always recommended that one use the alias as a qualifier
whenever referencing a column, even though it is optional to do so. The reason for this is to easily
identify from which table the column is being projected. An example of typical usage follows.

SELECT emp.Last_Name,
emp.First_Name,
emp.Department_Number,
dept.Manager_Employee_Number
FROM Employee emp , Department AS dept
WHERE emp.Department_Number = dept.Department_Number;

Page 10-10 Inner Join


Table Name Qualifications and Aliasing

Just as you can alias column names, you may also alias table names.
Without double-quotes, aliases:
• May not contain non-standard characters.
• May not contain key-words.

SELECT employee.Last_Name,            -- Qualification not required.
       First_Name,
       Employee.Department_Number,
       d.Manager_Employee_Number
FROM Employee, Department AS d
WHERE Employee.Department_Number = d.Department_Number;

SELECT e.Last_Name,
       First_Name,
       e.Department_Number,           -- Qualification required.
       d.Manager_Employee_Number
FROM Employee e, Department d
WHERE e.Department_Number = d.Department_Number;

Inner Join Page 10-11


Varied Forms of INNER Join
There are two different forms of SQL for writing inner joins. Both of them are ANSI standard.
From a performance point of view it makes no difference which one you use since the database
rewrites the bottom version to make it the same as the top one before optimization.

The style at the top of the page is often referred to as the “implicit” form (Inner Join is not stated
so it is implied) while the style at the bottom is referred to as the “explicit” form (Inner Join is
stated). Another term some may use for the top form is the “comma” form. When using the
explicit form, the INNER keyword is optional.

Also notice that the “ON” clause references the join condition. The WHERE clause is used to
reference conditions that are “residual” to the join. The “ON” clause is mandatory. Join
conditions when using the “implicit” form are not mandatory. We shall discuss this later in the
module.

Page 10-12 Inner Join


Varied Forms of INNER Join

Another form for doing an inner join is the ANSI 92 syntax.


Both return the same result.
Both are optimized equally.

ANSI 88 (Implicit Form)

SELECT e.Last_Name,
       e.First_Name,
       e.Department_Number,
       d.Manager_Employee_Number
FROM Employee e, Department d
WHERE e.Department_Number = d.Department_Number
AND e.Last_Name = 'Brown';

Equivalent Results

ANSI 92 (Explicit Form)

SELECT e.Last_Name,
       e.First_Name,
       e.Department_Number,
       d.Manager_Employee_Number
FROM Employee AS e INNER JOIN Department AS d
ON e.Department_Number = d.Department_Number
WHERE e.Last_Name = 'Brown';

Inner Join Page 10-13


Many-Table INNER Joins
The example shows how more than 2 tables would be joined using the implicit join syntax. The
answers to the questions are provided.

Query result:
Jones Sales Manager

Explicit form:

SELECT e.Last_Name, d.Department_Name, j.Description
FROM Employee e JOIN Department d
ON e.Department_Number = d.Department_Number
JOIN Job j
ON e.Job_Code = j.Job_Code;

Page 10-14 Inner Join


Many-Table INNER Joins

You can join these 3 tables like this.

Notice the uniqueness involved.
If the tables have only the rows shown, what will this return?
How would you write this in explicit form?

SELECT e.Last_Name, d.Department_Name, j.Description
FROM Employee e, Department d, Job j
WHERE e.Department_Number = d.Department_Number
AND e.Job_Code = j.Job_Code;

Employee

Last    Department  Job
Name    Number      Code
Jones   100         6666
Smith   200         7777
Brown   300         8888
Adams   400         9999

Department

Department  Department
Number      Name
(Unique)
100         Sales
200         Marketing
600         Support

Job

Job
Code      Description
(Unique)
6666      Manager
5555      President
8888      Lead

Inner Join Page 10-15


Varied Forms of Many-Table Inner Joins
The facing page boldly attempts to show some of the more typical styles used for writing inner
joins. The example at the bottom of the facing page is likely to be rarely, if ever, used and is
only shown to illustrate yet another form of inner join.

Although not entirely obvious, the second and third forms must be precisely written so that each
join condition references the immediately preceding tables or a syntax error will result. Note
these examples, which place the “ON” (join) conditions improperly. Both of these yield the
failure shown below them.

SELECT e.Last_Name AS "Ln",
       e.Department_Number AS Dn,
       j.Description AS "Desc"
FROM Employee AS e JOIN Department AS d
ON e.Job_Code = j.Job_Code
JOIN Job AS j
ON e.Department_Number = d.Department_Number;

SELECT e.Last_Name AS "Ln",
       e.Department_Number AS Dn,
       j.Description AS "Desc"
FROM Department d JOIN
     Employee e JOIN
     Job j
ON e.Department_Number = d.Department_Number
ON e.Job_Code = j.Job_Code;

*** Failure 3782 Improper column reference in the search condition of a joined table.

The business concern on the facing page could be stated something like this.

“Provide the job description and department name for all accountants working in departments
having budgets over $50,000.00”

Page 10-16 Inner Join


Varied Forms of Many-Table Inner Joins

There are many different forms one may use when writing inner joins.
State the business concern for these queries.

SELECT Last_Name, d.Department_Name, j.Description
FROM Employee e, Department d, Job j
WHERE e.Department_Number = d.Department_Number
AND e.Job_Code = j.Job_Code
AND j.Description LIKE '%Accountant%'
AND d.Budget_Amount > 50000;

SELECT e.Last_Name AS "Ln", d.Department_Name AS Dn, j.Description AS "Desc"
FROM Employee AS e JOIN Department AS d
ON e.Department_Number = d.Department_Number
JOIN Job AS j
ON e.Job_Code = j.Job_Code
WHERE j.Description LIKE '%Accountant%'
AND d.Budget_Amount > 50000;

SELECT e.Last_Name AS "Ln", d.Department_Name AS Dn, j.Description AS "Desc"
FROM Department d JOIN
     Employee e JOIN
     Job j
ON e.Job_Code = j.Job_Code
ON e.Department_Number = d.Department_Number
WHERE j.Description LIKE '%Accountant%'
AND d.Budget_Amount > 50000;

Inner Join Page 10-17


Using Parentheses to Understand Order
In order to better understand the syntax failure on the previous page, consider using parentheses,
as shown on the facing page. The parentheses do not change anything about the query, but they
do provide a better understanding of how to write the joins correctly. A rather rote process can
be used to determine placement of parentheses.

1. Write the query in a more standardized form as shown.


2. Place your pencil or pen at the upper left-hand corner of the query and move it down the
left-hand side until you come to the first “ON” clause.
3. Place a right-parenthesis (close parenthesis) at the end of the join.
4. Place a left-parenthesis prior to the immediately preceding tables (or join result).

Continue this procedure for each join until done.

Page 10-18 Inner Join


Using Parentheses to Understand Order

• Correct placement of parentheses can illustrate how to correctly place join
conditions.
• Again, note that the key word INNER is optional.
• Also note that the number of join conditions is the number of tables minus 1.
• Whether aliasing or not, it is always best to use column qualifiers to match
columns to tables.

SELECT e.Last_Name AS "Ln", e.Department_Number AS Dn, j.Description AS "Desc"
FROM Employee AS e JOIN Department AS d
ON e.Department_Number = d.Department_Number
JOIN Job AS j
ON e.Job_Code = j.Job_Code;

Same Join

SELECT e.Last_Name AS "Ln", e.Department_Number AS Dn, j.Description AS "Desc"
FROM ( ( Employee AS e JOIN Department AS d
         ON e.Department_Number = d.Department_Number )
       JOIN Job AS j
       ON e.Job_Code = j.Job_Code );

Inner Join Page 10-19


Using Parentheses with Other Forms
Continuing with the concept of adding parentheses in order to correctly write inner joins, we
illustrate how to determine where to place them in a syntax that, as stated earlier, is best
avoided but is relevant nonetheless.

Page 10-20 Inner Join


Using Parentheses with Other Forms

Note in the example below that the key word INNER is optional. Also note that the
number of join conditions is the number of tables minus 1 and that best practice,
whether aliasing or not, is to always qualify, whether required or not, to match columns
to tables.

SELECT e.Last_Name AS "Ln", e.Department_Number AS Dn, j.Description AS "Desc"
FROM Department d JOIN
     Employee e JOIN
     Job j
ON e.Job_Code = j.Job_Code
ON e.Department_Number = d.Department_Number;

Same Join

SELECT e.Last_Name AS "Ln", e.Department_Number AS Dn, j.Description AS "Desc"
FROM ( Department d JOIN
       ( Employee e JOIN
         Job j
         ON e.Job_Code = j.Job_Code )
       ON e.Department_Number = d.Department_Number );

Inner Join Page 10-21


Self Joins
For tables that have “self references” like the employee table, it may be necessary to join the
table to itself in order to answer certain kinds of business questions. In our example we need to
find the names of employees and the names of their managers.

Note that the relationship between an employee’s name and the name of their manager is
between separate rows within the very same table. In order to display such information it is
necessary to join the table to itself as shown. Such a join is referred to as a self-join. Self joins
can be somewhat of a challenge for even experienced SQL coders. It is mainly the join condition
that poses such a challenge. At least one version of the table must be aliased or an error occurs as
shown below. The many things that are wrong with the following query should become obvious.

• The column list names are ambiguous.
• The table names are ambiguous.
• The join column names are ambiguous.

SELECT Last_Name,
First_Name,
Last_Name,
First_Name
FROM Employee JOIN Employee
ON Manager_Employee_Number = Manager_Employee_Number;

*** Failure 3868 A table or view without alias appears more than once in FROM clause.

Page 10-22 Inner Join


Self Joins

Sometimes it may be necessary to join a table to itself.


• Aliasing of at least one version of the table is necessary.
• In the query below, we project the name of the employee and name of the
manager -- as different rows in the same table -- onto the same result row.

Display the last name and first names of employees along with the last name and
first names of their managers for those working in departments 201 and 301.

SELECT Emp.Last_Name, Emp.First_Name, Mgr.Last_Name, Mgr.First_Name
FROM Employee Emp JOIN Employee Mgr
ON Emp.Manager_Employee_Number = Mgr.Employee_Number
WHERE Emp.Department_Number IN (201, 301);

Employee - Emp           Employee - Mgr

Emp#  Dept#  Mgr#        Emp#  Dept#  Mgr#
100   201    200         100   201    200
200   401    500         200   401    500
500   501    900         500   501    900

Inner Join Page 10-23


Guaranteeing Uniqueness
One of the most important differences between joins and subqueries is the need for establishing a
one-to-many relationship in a join, something that automatically is provided when writing a
subquery.

The facing page illustrates what happens when the join relationship is many-to-many. In such a
case, unintended result rows appear in the final result set. In the example, let’s assume that
employee 100 works only in department 30 (not department 55). Since the join is on manager
number, the result set will show that person working in both department 30 and department 55,
which is certainly not the real circumstance. Employees 400 and 500 share the same fate (as
would employee 600, were it not for the filter on manager 300). It would be difficult, if not
impossible, to view the result set and know who truly works in which department.
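
For contrast, here is a sketch of the same question asked as a subquery, assuming the Employee and Department rows shown on the facing page. Because the database derives a distinct list of manager numbers from the inner query, each qualifying employee should appear only once; the trade-off is that the department number can no longer be projected.

```sql
-- Employees managed by 300, where 300 is also a department manager.
-- The inner list is distinct, so manager 300 appearing in two
-- Department rows does not duplicate the Employee rows.
SELECT e.Employee_Number
FROM Employee e
WHERE e.Manager_Employee_Number = 300
  AND e.Manager_Employee_Number IN
      (SELECT Manager_Employee_Number
       FROM Department);
```

With the sample rows, this would list employees 100, 400, and 500 once each, rather than the six rows the many-to-many join produces.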

Page 10-24 Inner Join


Guaranteeing Uniqueness

When joining a many-to-many relationship, unintended result rows can be projected!

The example below depicts 3 rows joining to 2, producing 6 result rows!

SELECT e.Employee_Number AS Emp#,
       d.Department_Number AS Dept#
FROM Employee e, Department d
WHERE e.Manager_Employee_Number = d.Manager_Employee_Number
AND e.Manager_Employee_Number = 300;

Employee

Emp#  Dept#  Mgr#
100   30     300
200   10     400
400   55     300
500   30     300
600   95     500

Department

Dept#  Mgr#
20     100
30     300
55     300
90     500
95     500

Result

Emp#  Dept#
100   30
100   55
400   30
400   55
500   30
500   55

Inner Join Page 10-25


IN vs. Inner Join
The queries on the facing page contrast inner joins with subqueries.

Page 10-26 Inner Join


IN vs. Inner Join

Find employees who have valid department numbers.

Subquery form:

SELECT Employee_Number,
       First_name
FROM Employee
WHERE Department_Number IN
   (SELECT Department_Number FROM Department);

Inner Join form:

SELECT Employee_Number,
       First_name
FROM Employee e JOIN Department d
ON e.Department_Number = d.Department_Number;

Note that you may only rewrite a join as a subquery if you are only projecting
columns from one table!

Inner Join Page 10-27


NOT IN vs. Inner Join
Based upon what we saw on the previous page, we can now see what would happen if we tried to
accomplish the equivalent of a NOT IN subquery by using the join syntax on the facing page.
The subquery that would handle this situation without issue would be the following:

SELECT Last_Name,
First_Name
FROM Employee
WHERE Manager_Employee_Number NOT IN
(SELECT Manager_Employee_Number FROM Department);

Page 10-28 Inner Join


NOT IN vs. Inner Join

The NOT IN subquery would have no issue with obtaining the result intended here.
Find employees whose managers are not department managers.

SELECT Employee_Number,
       First_name
FROM Employee e JOIN Department d
ON e.Manager_Employee_Number <> d.Manager_Employee_Number;

Employee                  Department

Returns  Emp#  Mgr#       Dept#  Mgr#
3 rows   100   300        20     100
5 rows   200   400        30     300
3 rows   400   300        55     300
3 rows   500   300        90     500
3 rows   600   500        95     500

17 rows total

Mgr# 300 doesn’t match 3 Mgr#s in the department table. For example, employee 100
is returned once for each of departments 20, 90, and 95:

100  Carol  20
100  Carol  90
100  Carol  95

Inner Join Page 10-29


Cross Join
This is the only page that discusses cross joins. A cross join is rarely what is intended nowadays,
compared to earlier times. Of our two examples, the one that states CROSS JOIN is preferred
because it shows the “reader” that a cross join is intended, while the comma form may or may
not be intentional (perhaps the “writer” forgot the join condition).

As noted on the facing page, since no join condition exists, the database invents one for us,
whether we are pleased with it or not. The condition of “WHERE 1=1” always evaluates true.
Thus, you can read the row for employee “Smith” as “Project the employee number and last
name of this employee for each row in the department table where 1=1 is true”. The result is
to project these column values (from the “Smith” row) for each department row. The same thing
happens all over again for each employee row. As a different example, the following query
would return the result shown.

SELECT e.Employee_Number,
e.Last_Name,
d.Department_Number,
d.Manager_Employee_Number
FROM Employee e CROSS JOIN
Department d;

Emp#    Last_Name   dept#   Mgr#
------  ----------  ------  ------
100     Smith       20      100
100     Smith       30      300
100     Smith       55      300
200     Jones       20      100
200     Jones       30      300
200     Jones       55      300
400     Adams       20      100
400     Adams       30      300
400     Adams       55      300
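
The size of a cross join result is simply the number of rows in one table multiplied by the number of rows in the other. A quick sketch for confirming this, assuming the three-row Employee and three-row Department samples used above (Row_Count is an alias chosen here for illustration):

```sql
-- Every Employee row is paired with every Department row.
SELECT COUNT(*) AS Row_Count
FROM Employee e CROSS JOIN Department d;
```

For the sample rows this gives 3 x 3 = 9, matching the nine result rows listed above. On real tables the same multiplication is why an unintended cross join can be very expensive.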

Page 10-30 Inner Join


Cross Join

A CROSS join is a join where no join condition is specified.

Since no qualification exists, the database establishes a condition of “WHERE 1=1”.
Since this condition is true for each and every comparison, the following occurs.

SELECT Employee_Number,
       Last_Name
FROM Employee e, Department d;

Equivalent

SELECT Employee_Number,
       Last_Name
FROM Employee e CROSS JOIN
     Department d;

Project the column values where 1=1 is true.

     Employee             Department
     Emp#  Last_Name      Dept#  Mgr#
1    100   Smith          20     100
2    200   Jones          30     300
3    400   Adams          55     300

     Result
1    100   Smith
     100   Smith
     100   Smith
2    200   Jones
     200   Jones
     200   Jones
3    400   Adams
     400   Adams
     400   Adams

Inner Join Page 10-31


Mistakes on Table Aliasing
When using aliases in writing joins, one must be careful to always use alias names when
referencing and not the aliased table names. In our examples, the table Department has been
aliased as “Dept” in the FROM, but the join condition references Department as a table name
and does not reference the alias. The database will interpret this as having four (4) tables being
joined, namely: “Emp”, “Dept”, “Job” and “Department”. To understand what is happening one
must realize that the FROM clause is not required in a SQL request. For instance, the following
is technically acceptable, but terribly repugnant.

SELECT Last_Name
WHERE Employee.Department_Number =
Department.Department_Number;

On the facing page, the join condition references a table called “Department” which is not
referenced in a FROM clause, just like the previous example on this page. Also, a table called
“Dept” (in a “FROM” clause alias) has no join condition. The optimizer will interpret this as a
cross join between (likely) Dept and Department, the result of which will then be joined to
Employee (in the immediate example). Ouch!
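
As a minimal sketch of the fix (assuming the course's Employee and Department tables and the alias names Emp and Dept from the facing page), once a table has been aliased, every reference to it uses the alias, so no extra table is drawn into the join:

```sql
-- Each table is aliased once, and only the aliases are referenced.
SELECT Emp.Last_Name,
       Emp.First_Name,
       Dept.Department_Name
FROM Employee AS Emp JOIN Department AS Dept
  ON Emp.Department_Number = Dept.Department_Number;
```

A quick review habit is to scan the request for any table name that appears outside the FROM clause; each such reference is a candidate for the mistake described above.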

Page 10-32 Inner Join


Mistakes on Table Aliasing

• Be careful! Do not alias a table and then use the name instead of the alias.
• In the examples below, the first one will fail due to a syntax error (ANSI 92).
• The second will cause a 4-table join, one of which is a self join between Dept
(the aliased Department table) and Department!

SELECT Last_Name, First_Name,
       Department_Name, Job_Description
FROM Employee AS Emp JOIN Department Dept
ON Emp.Department_Number = Department.Department_Number
JOIN Job
ON Emp.Job_Code = Job.Job_Code;

SELECT Last_Name, First_name,
       Department_Name, Job_Description
FROM Employee Emp, Department AS Dept, Job
WHERE Emp.Department_Number = Department.Department_Number
AND Emp.Job_Code = Job.Job_Code;

Inner Join Page 10-33


Mistakes on Column Aliasing
Just as one can err by referencing the table name instead of an alias name in the join condition,
one can also err by qualifying a projected column with a table name instead of an alias name.
Consider the following queries, both valid, though equally obscene.

SELECT Employee.Last_Name
FROM Employee AS Emp;
Or
SELECT Employee.Last_Name
FROM Department;

In each case, there are two (2) tables involved. As stated on the previous page, cross joins will
be performed by the database to make this happen.

Page 10-34 Inner Join


Mistakes on Column Aliasing

Both forms of joins cause bad self joins when referring to the table
name in the select list instead of the alias!

SELECT Employee.Last_Name, First_Name,
       Department_Name, Job_Description
FROM Employee AS Emp JOIN Department Dept
  ON Emp.Department_Number = Dept.Department_Number
JOIN Job
  ON Emp.Job_Code = Job.Job_Code;

SELECT Employee.Last_Name, First_Name,
       Department_Name, Job_Description
FROM Employee Emp, Department Dept, Job
WHERE Emp.Department_Number = Dept.Department_Number
  AND Emp.Job_Code = Job.Job_Code;

Inner Join Page 10-35


Module 10: Summary
A summary of this module is discussed.

Page 10-36 Inner Join


Module 10: Summary

• Column values may be projected from any table of a join.

• Subqueries and inner joins can both return inner result sets.

• Inner joins have both an implicit form and an explicit form.

• Inner joins typically involve one-to-many relationships based on equality.

• A table may be joined to itself.

• Incorrect table and column references can cause incorrect result sets.

• Inner joins cannot return outer (NOT IN) result sets as can subqueries.

Inner Join Page 10-37


Module 10: Review Questions
Check your understanding of the concepts discussed in this module by completing the review
questions as directed by your instructor.

Page 10-38 Inner Join


Module 10: Review Questions

True or False:

1. For inner joins, each FROM clause requires an ON clause for join
conditions.
False – Only the explicit form requires an ON clause
2. Referencing a WHERE clause is invalid for the explicit form of inner join.
False – A WHERE clause may be needed for adding residual conditions
3. Many-to-many relationships are allowed with inner joins.
True
4. When performing a self join, table aliasing is required.
True – You may not reference the same table name twice without creating ambiguities
5. Inner join syntax requires at least one qualifying join column.
False – A WHERE clause may be needed for adding residual conditions
6. The explicit form of inner join can reject some uses of incorrect
qualifications.
True – But only in the ON clause and not in the project list
7. The implicit form of inner join is not ANSI standard.
False – Both forms are ANSI standard

Inner Join Page 10-39


Module 10: Lab Exercise
Check your understanding of the concepts discussed in this module by completing the lab
exercise as directed by your instructor.

Page 10-40 Inner Join


Module 10: Lab Exercise

1. List all employees by name, the name of their department, their original
salary, and salary again with a ten percent increase, for those working
in departments with budgets > $40,000.00. Make the last and first name
10 characters each and use the implicit form of inner join.

2. Find the department names and employee names for employees that
have both an “i” and an “e” in their last name. Make the last and first
name 10 characters each and use the explicit form of inner join.

3. Use POSITION to list department names that have people working in
them whose job description has the word “sales” in it. List the
employee names as well.

Optional
4. Write a cross join that lists all possible combinations of first names
and last names from employee.

Inner Join Page 10-41


Notes:

Page 10-42 Inner Join


Module 11

Outer Join

After completing this module, you will be able to:

• Distinguish between Inner and Outer joins.
• Distinguish between LEFT, RIGHT and FULL outer join results.
• Determine the effect of nulls in the source tables on result rows.
• Determine the effect of nulls in the result rows.
• Use correct terminology when referring to outer join syntax.
• Identify various ways of returning the same outer join results.
• Use parentheses to aid in understanding outer join syntax.
• Outer Join three or more tables together.

Outer Join Page 11-1


Notes:

Page 11-2 Outer Join


Table of Contents
Outer Join Concepts ................................................................................................................... 11-6
Outer Join Syntax ..................................................................................................................... 11-10
Types of Outer Joins ................................................................................................................ 11-12
Employee as Left Outer ........................................................................................................... 11-14
Nulls and the Inner Table ......................................................................................................... 11-16
Department as Outer ................................................................................................................ 11-18
Outer Joins and WHERE ......................................................................................................... 11-20
Syntax Variations ..................................................................................................................... 11-22
Parts of Speech ......................................................................................................................... 11-24
Three Table Inner Join - Review.............................................................................................. 11-26
Three Table Outer Join ............................................................................................................ 11-28
Multiple Table Variations ........................................................................................................ 11-30
Three Table Outer Join Results ................................................................................................ 11-32
Uncharacteristic Data and Outer Join ...................................................................................... 11-34
Considering Nulls .................................................................................................................... 11-36
Full Outer Join ......................................................................................................................... 11-38
Module 11: Summary............................................................................................................... 11-40
Module 11: Review Questions ................................................................................................. 11-42
Module 11: Lab Exercise ......................................................................................................... 11-44

Outer Join Page 11-3


What is an Outer Join?
Inner joins can be used to project column values from multiple tables, but only for rows that
satisfy the join criteria, e.g. where an employee's department number matches one in the
department table (i.e. the employee has a valid department). But what if we would like to show
information for employees who are, perhaps, new employees (null department numbers) or who have
incorrect or invalid departments? This is what an outer join is used for. As shown on the
facing page, we may find not only employees who are new hires or have invalid department
numbers, but also departments that have no one assigned to them.

Having said all of this, outer joins can be the most difficult joins to write, and there is much
yet to learn.

Page 11-4 Outer Join


What is an Outer Join?

Inner joins can retrieve only inner results.

   Employees and the departments in which they work (2).
   OR
   Departments with people assigned to them (2).

[Venn diagram: overlapping Emp and Dept circles. Area 1 = Emp only, area 2 = the overlap,
area 3 = Dept only.]

Outer joins can retrieve both the matching (inner) and non-matching (outer) rows.

   Employees and the departments in which they work (2), plus those with unmatched
   departments (1).
   OR
   Departments with people assigned to them (2), plus departments with no assigned people (3).

Outer Join Page 11-5


Outer Join Concepts
Before moving on to more detailed discussions of outer join, we will focus on a particular outer
join with an emphasis on employees. The question posed is a very important one to consider. In
past queries, where we discussed subqueries and inner joins, our concern was for matching
criteria. Since outer joins involve non-matching criteria as well as matching, the concept of what
should be projected for the inner table (in our example this is the department table) is crucial to
understanding our result as well as our intention.

Here is the question on the facing page, restated:

“If an employee has a department number that does not (or will not) match one in the
department table, what should be projected for their department name (or any other
department column)?”

This question gets to the very heart of outer joins. The simple answer is that, for every row
of the outer table that finds no match under the join criteria, the database projects a null
for each and every column of the inner table.

Page 11-6 Outer Join


Outer Join Concepts

• Inner joins retrieve only the INNER (matching join condition) result sets.
• Outer joins retrieve both the INNER and OUTER (non-matching) result sets.

• For OUTER JOIN, you can write one that returns the following:
   - Employees with valid departments
   - Employees with invalid departments
   (i.e. all employees)

What would an outer join project for the department name of employees in department
numbers 300 and 400, which don’t exist in the department table?

Employee
Last Name   Department Number
Jones       100
Smith       200
Brown       300
Adams       400

Department
Department Number (Unique)   Department Name
100                          Sales
200                          Support
600                          Finance

[Venn diagram of Emp and Dept: the overlap is the IN (matching) area; the non-overlapping
part is the NOT IN area.]

Outer Join Page 11-7


Simulating an Outer Join
Before looking at the actual syntax of an outer join, we take a detour and look at how one might
return an outer result by using the set operators taught in the previous module. (This is why
they were taught prior to this module.)

On the facing page we see the result of an inner join combined, by UNION ALL, with that of a
NOT IN query. The inner join can project columns from both tables, but it cannot return the
non-matching (outer) rows, so we union it with the NOT IN query to return the outer results as
well. Note how the second SELECT projects a null in place of the column it cannot project from
the inner table, which is referenced only in the subquery. You may now begin to understand how
important it is to know and understand the terms introduced in earlier modules.

Also, note how NULL (in the second SELECT) has been defined as character to match the data type
of the department name. The default data type for null is integer. This can be confirmed by
using the TYPE function, like this: SELECT TYPE(null);

Since the data type of the null differs from that of its corresponding column in the first
SELECT, leaving out the explicit conversion would force an implicit conversion on the null and
cause a mismatch in data types. The query would then fail.
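As a hedged illustration, the second SELECT of the simulated outer join can also be written with
the ANSI CAST function instead of the Teradata-style NULL (CHAR(30)). The CHAR(30) length is
taken from the facing page and is assumed to match the definition of Department_Name.

```sql
-- Outer (non-matching) half of the simulated outer join.
-- CAST(NULL AS CHAR(30)) is the ANSI spelling of NULL (CHAR(30)); the length
-- of 30 is an assumption about how Department_Name is defined.
SELECT Last_Name,
       Department_Number,
       CAST(NULL AS CHAR(30)) AS Department_Name
FROM Employee
WHERE Department_Number NOT IN
      (SELECT Department_Number FROM Department);
```

With the sample rows on the facing page, this half returns Brown (300) and Adams (400), each
with a null department name.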

Page 11-8 Outer Join


Simulating an Outer Join

To better understand outer join results, note these results.

The default data type for NULL is INTEGER.

SELECT e.Last_Name,
       e.Department_Number,
       d.Department_Name
FROM Employee e, Department d
WHERE e.Department_Number = d.Department_Number
UNION ALL
SELECT Last_Name,
       Department_Number,
       NULL (CHAR(30))
FROM Employee
WHERE Department_Number NOT IN
      (SELECT Department_Number FROM Department);

The first SELECT returns employees with valid departments (inner results).
The second SELECT returns employees with invalid departments (outer results).

Employee
Last Name   Department Number
Jones       100
Smith       200
Brown       300
Adams       400

Department
Department Number (Unique)   Department Name
100                          Sales
200                          Support
600                          Finance

Result
Last Name   Department Number   Department Name
Jones       100                 Sales
Smith       200                 Support
Brown       300                 ?
Adams       400                 ?

Outer Join Page 11-9


Outer Join Syntax
Before looking at some results (which would be like those for the previous UNION example
anyway), we show the syntax. In the syntax, the outer join will “point” to an outer table by using
either “LEFT”, “RIGHT” or “FULL” as a keyword. The keyword is chosen by the author to
describe which table is the outer table. To understand which table is outer, simply rewrite the
query on a single line (no matter how long – SQL is a free-form language after all). The table to
the left of the join syntax is the “left” table, and the table to the right is the “right” table.

You can now contrast this syntax to that of the previous query. Note that, in the second query,
the subquery actually determines which table is inner and which is outer, while in the first query
it is the LEFT keyword that determines the outer table and, hence, by process of elimination,
which is the inner table.

SELECT e.Last_Name, e.Department_Number
FROM Employee e LEFT OUTER JOIN Department d
  ON e.Department_Number = d.Department_Number;

Vs.

SELECT e.Last_Name,
e.Department_Number,
d.Department_Name
FROM Employee e, Department d
WHERE e.Department_Number = d.Department_Number
UNION ALL
SELECT Last_Name,
Department_Number,
NULL (CHAR(30))
FROM Employee
WHERE Department_Number NOT IN
(SELECT Department_Number FROM Department);

Page 11-10 Outer Join


Outer Join Syntax

An example of an outer join is shown below.


Things to note.
• The keyword "OUTER" is optional.
• The keyword "LEFT" tells us which table is the OUTER table.
  (If the query were written on a single line, in paragraph form, it is the table to the
  LEFT of the join keyword.)
• Use of the ON keyword is the only form available. (ANSI Standard)

SELECT e.Last_Name, e.Department_Number
FROM Employee e LEFT OUTER JOIN Department d
  ON e.Department_Number = d.Department_Number;

Employee
Last Name   Department Number
Jones       100
Smith       200
Brown       300
Adams       400

Department
Department Number (Unique)   Department Name
100                          Sales
200                          Support
600                          Finance

[Venn diagram of Emp and Dept: the overlap is the IN (matching) area; the non-overlapping
part is the NOT IN area.]

Outer Join Page 11-11


Types of Outer Joins
The facing page contrasts the different styles used for writing outer joins. Each syntax
addresses a different business concern. Note that the inner portion of each consists of the same
result set, but with a different emphasis. For instance:

Left Join Result

“Employees and the departments in which they work.”

Right Join Result

“Departments and the people that work in them.”

The inner portions of both return the same rows, but from different perspectives.

One more thing to note is that the keyword “OUTER” is optional. That is to say, there is no
such thing as a RIGHT or LEFT or FULL inner join.

Page 11-12 Outer Join


Types of Outer Joins

LEFT OUTER JOIN
Employees and the departments in which they work (IN), plus those with unmatched
departments (NOT IN).

SELECT e.Last_Name, e.First_Name, d.Dept_Name
FROM Employee e LEFT JOIN Department d
  ON e.Dept# = d.Dept#;

RIGHT OUTER JOIN
Departments with people assigned to them (IN), plus departments with no assigned
people (NOT IN).

SELECT e.Last_Name, e.First_Name, d.Dept_Name
FROM Employee e RIGHT JOIN Department d
  ON e.Dept# = d.Dept#;

FULL OUTER JOIN
Inner, Right Outer, and Left Outer.

SELECT e.Last_Name, e.First_Name, d.Dept_Name
FROM Employee e FULL JOIN Department d
  ON e.Dept# = d.Dept#;

[Each query is pictured with a Venn diagram of Emp and Dept.]

Outer Join Page 11-13


Employee as Left Outer
We now consider an outer join result set. Later we will consider the department table as outer.

Notice that this result is exactly the same as that shown earlier for the UNION ALL technique.
No surprises here. Of the two, this one will generally perform better and is generally easier to
write. Keep in mind that this example is a simple one and that outer joins can become much more
complex! Try to become familiar with this syntax and use it instead of the UNION ALL. The
UNION ALL was shown simply to help us understand terms and concepts, and is not meant as an
alternative for practical usage.

Page 11-14 Outer Join


Employee as Left Outer

SELECT e.Last_Name,
       e.Department_Number AS Dept#,
       d.Department_Name AS DName
FROM Employee e LEFT OUTER JOIN
     Department d
  ON e.Department_Number =
     d.Department_Number

Result
Last Name   Dept#   DName
Jones       100     Sales
Smith       200     Support
Brown       300     ?
Adams       400     ?

Employee
Last Name   Department Number
Jones       100
Smith       200
Brown       300
Adams       400

Department
Department Number (Unique)   Department Name
100                          Sales
200                          Support
600                          Finance

[Venn diagram of Emp and Dept: the overlap is the IN (matching) area; the non-overlapping
part is the NOT IN area.]

Outer Join Page 11-15


Nulls and the Inner Table
Several situations can add complexity to understanding outer join results. Here we see how a
null value in the inner table can cause confusion.

In the result set, a null returned for an inner table column can belong to an inner result row
and does not necessarily indicate an outer result! If the department name column were defined
as NOT NULL, a null value for department name could not exist in the table, and so a null for
this column in the result set could then be interpreted as an outer result row.
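One way to remove the ambiguity, sketched here under the assumption that Department_Number in
the Department table is never null (the facing page marks it as unique), is to project the inner
table's join column as well. A null in that column can only come from a non-matching row. The
alias DeptMatch is hypothetical.

```sql
-- d.Department_Number is the inner table's join column. Because the join
-- condition is an equality, it is null only when no department matched,
-- even if d.Department_Name itself contains nulls.
SELECT e.Last_Name,
       e.Department_Number AS Dept#,
       d.Department_Number AS DeptMatch,   -- hypothetical alias
       d.Department_Name   AS DName
FROM Employee e LEFT OUTER JOIN Department d
  ON e.Department_Number = d.Department_Number;
```

With the sample rows on the facing page, Smith would show a DeptMatch of 200 with a null DName
(an inner result), while Brown and Adams would show a null DeptMatch (outer results).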

Page 11-16 Outer Join


Nulls and the Inner Table

SELECT e.Last_Name,
       e.Department_Number AS Dept#,
       d.Department_Name AS DName
FROM Employee e LEFT OUTER JOIN
     Department d
  ON e.Department_Number =
     d.Department_Number

Result
Last Name   Dept#   DName
Jones       100     Sales
Smith       200     ?
Brown       300     ?
Adams       400     ?

Employee
Last Name   Department Number
Jones       100
Smith       200
Brown       300
Adams       400

Department
Department Number (Unique)   Department Name
100                          Sales
200                          ?
600                          Finance

[Venn diagram of Emp and Dept: the overlap is the IN (matching) area; the non-overlapping
part is the NOT IN area.]

Outer Join Page 11-17


Department as Outer
The query on the facing page could also have been written by making the department table a left
outer table and switching the table names as shown below.

SELECT d.Department_Name AS DName,
       d.Department_Number AS Dept#,
       e.Last_Name AS LName
FROM Department d LEFT JOIN Employee e
  ON e.Department_Number = d.Department_Number;

Page 11-18 Outer Join


Department as Outer

SELECT d.Department_Name AS DName,
       d.Department_Number AS Dept#,
       e.Last_Name AS LName
FROM Employee e RIGHT JOIN
     Department d
  ON e.Department_Number =
     d.Department_Number;

Result
DName     Dept#   LName
Sales     100     Jones
Support   200     Smith
Finance   600     ?

Department
Department Number (Unique)   Department Name
100                          Sales
200                          Support
600                          Finance

Employee
Last Name   Department Number
Jones       100
Smith       200
Brown       300
Adams       400

[Venn diagram of Emp and Dept: the overlap is the IN (matching) area; the non-overlapping
part is the NOT IN area.]

Outer Join Page 11-19


Outer Joins and WHERE
When writing outer joins, you may add conditions in a WHERE clause. In general, conditions
against the outer table are placed in the WHERE clause, while conditions against the inner table
are added to the join condition (the ON clause). The logic behind why these conditions are
placed in these locations is considered too advanced for this course, and these are
recommendations that are typically followed rather than rules. This means that you may do
otherwise, but unexpected results may occur. An example of an inner table qualification follows.

SELECT e.Last_Name,
       d.Department_Number,
       e.Department_Number
FROM Employee e LEFT OUTER JOIN Department d
  ON e.Department_Number = d.Department_Number
 AND d.Department_Name LIKE '%support%'
WHERE e.Last_Name = 'smith'
   OR e.Last_Name = 'brown';

In general, WHERE clause qualifications on the columns of the inner table don’t make sense. Such
qualifications might make sense if the query were an inner join, since the result contains rows
only for matching conditions, where all column values, from each table, are available for
referencing. In an outer join the inner table's column values may not be available for
qualification in that they may be null. Such usage would make the query equivalent to an inner
join.
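The effect of placement can be sketched with two otherwise identical queries. The table and
column names are those used on this page; whether LIKE matches a given department name also
depends on how the names are stored and on session settings, so no specific output is claimed.

```sql
-- Inner table condition in the ON clause: every employee is still returned.
-- Employees whose department does not satisfy the condition simply get
-- nulls for the Department columns.
SELECT e.Last_Name, d.Department_Name
FROM Employee e LEFT OUTER JOIN Department d
  ON e.Department_Number = d.Department_Number
 AND d.Department_Name LIKE '%support%';

-- Same condition moved to the WHERE clause: rows whose Department columns
-- are null fail the test and are discarded, so only matching employees
-- remain, as they would with an inner join.
SELECT e.Last_Name, d.Department_Name
FROM Employee e LEFT OUTER JOIN Department d
  ON e.Department_Number = d.Department_Number
WHERE d.Department_Name LIKE '%support%';
```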

Page 11-20 Outer Join


Outer Joins and WHERE

Note table name qualifications.

SELECT e.Last_Name,
       d.Department_Number,
       e.Department_Number
FROM Employee e LEFT OUTER JOIN
     Department d
  ON e.Department_Number =
     d.Department_Number
WHERE e.Last_Name = 'smith'
   OR e.Last_Name = 'brown';

Result
Last Name   d.Department Number   e.Department Number
Smith       200                   200
Brown       ?                     300

Employee
Last Name   Department Number
Jones       100
Smith       200
Brown       300
Adams       400

Department
Department Number (Unique)   Department Name
100                          Sales
200                          Support
600                          Finance

[Venn diagram of Emp and Dept: the overlap is the IN (matching) area; the non-overlapping
part is the NOT IN area.]

Outer Join Page 11-21


Syntax Variations
As one can readily see, there are a number of syntax variations available for writing outer
joins. The answers to the questions posed on the facing page follow, from top to bottom.

• Areas 1 and 2
• Areas 2 and 3
• Areas 2 and 3
• Areas 1 and 2

Page 11-22 Outer Join


Syntax Variations

Match these queries to their result areas by number.

SELECT e.Last_Name, d.Department_Name
FROM Employee e LEFT JOIN Department d
  ON e.Department_Number = d.Department_Number;

SELECT e.Last_Name, d.Department_Name
FROM Employee e RIGHT JOIN Department d
  ON e.Department_Number = d.Department_Number;

SELECT e.Last_Name, d.Department_Name
FROM Department d LEFT JOIN Employee e
  ON e.Department_Number = d.Department_Number;

SELECT e.Last_Name, d.Department_Name
FROM Department d RIGHT JOIN Employee e
  ON e.Department_Number = d.Department_Number;

[Venn diagram: overlapping Emp and Dept circles. Area 1 = Emp only, area 2 = the overlap,
area 3 = Dept only.]

Outer Join Page 11-23


Parts of Speech
On the facing page, notice how terms are basically separated into “inner” and “outer”, with
“inner” references residing in the join conditions and “outer” references residing in the
“where” conditions.

Relationally, an outer join is defined as being performed in three steps.

• Get the inner result according to the inner join conditions.
• Return the NOT IN result (as in the UNION ALL).
• Apply the WHERE conditions.

Page 11-24 Outer Join


Parts of Speech

For Outer Joins:

• OUTER is an optional keyword.
• LEFT, RIGHT, and FULL define the outer table.

SELECT e.Last_Name,
       e.First_Name,
       d.Department_Name
FROM Employee e                       -- Left Table (Outer Table in this example)
LEFT [OUTER] JOIN
     Department d                     -- Right Table (Inner Table in this example)
  ON e.Dept# = d.Dept#                -- Inner Join Condition
 AND d.Dept_Name LIKE '%support%'     -- Inner Join Search Condition
                                      --   (a residual condition to the inner result)
WHERE Salary < 50000                  -- Outer Search Condition
                                      --   (a residual condition to the outer result)

Outer Join Page 11-25


Three Table Inner Join - Review
Before looking at a three table outer join, we need to review a three table inner join. Recall:

• Only column values for matching conditions are projected.
• One-to-many conditions exist with uniqueness in the Job and Department tables.
• There is no relationship between Department and Job.
• INNER is an optional keyword.

Also recall the rules for placing parentheses. With or without the parentheses, the query
returns the same result. How they can be placed was discussed in the module on inner joins.

Page 11-26 Outer Join


Three Table Inner Join – Review

Recall this inner join and its result.

SELECT e.Last_Name,
       d.Department_Number,
       j.Job_Code
FROM ((Employee e JOIN
       Department d
       ON e.Department_Number = d.Department_Number)
      JOIN
      Job j
      ON e.Job_Code = j.Job_Code);

Employee
Last Name   Department Number   Job Code
Jones       100                 6666
Smith       200                 7777
Brown       300                 8888
Adams       400                 9999

Department
Department Number (Unique)
100
200
600

Job
Job Code (Unique)
6666
5555
8888

Outer Join Page 11-27


Three Table Outer Join
The syntax for this three table outer join is quite similar to that for the inner join, with the use of
parenthesis being optional and following the very same rules as for the inner join. For the inner
join we had no concern about outer results as we do here. Where we place the outer table
(employee) is important.

• The result of the first outer join (between Employee and Department) is obtained first.
Employee is the outer table in this join.
• The result of the first outer join then gets joined to the Job table. The outer table from the
previous join (Employee) is maintained as the outer table in the second join due to its
placement as the LEFT table.

In this join it is all about the employee.

• Employees with valid or invalid departments.
• Employees with valid or invalid jobs.

We shall discuss this result set later in this module.
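As a hedged sketch of what it means for the join to be all about the employee, the query below
keeps Employee as the left table in both joins and labels each employee's department and job as
valid or invalid. The status column names and labels are hypothetical, and the test assumes the
join columns of Department and Job are never null.

```sql
-- Employee stays the outer (left) table in both joins, so every employee
-- is returned. A null in an inner table's join column means no match.
SELECT e.Last_Name,
       CASE WHEN d.Department_Number IS NULL
            THEN 'invalid' ELSE 'valid' END AS Dept_Status,  -- hypothetical
       CASE WHEN j.Job_Code IS NULL
            THEN 'invalid' ELSE 'valid' END AS Job_Status    -- hypothetical
FROM Employee e LEFT JOIN Department d
  ON e.Department_Number = d.Department_Number
LEFT JOIN Job j
  ON e.Job_Code = j.Job_Code;
```

Each employee appears exactly once in the result as long as Department_Number and Job_Code are
unique in their own tables.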

Page 11-28 Outer Join


Three Table Outer Join

These two queries return equivalent results because Employee remains the outermost table.

There are many other ways in which one may write this and get the same result.
SELECT e.Last_Name,
d.Department_Name,
j.Description
FROM ( ( Employee e LEFT JOIN
Department d
ON e.Department_Number = d.Department_Number )
LEFT JOIN
Job j
ON e.Job_Code = j.Job_Code )

SELECT e.Last_Name,
d.Department_Name,
j.Description
FROM ( ( Department d RIGHT JOIN
Employee e
ON e.Department_Number = d.Department_Number )
LEFT JOIN
Job j
ON e.Job_Code = j.Job_Code )

Outer Join Page 11-29


Multiple Table Variations
As you might expect, multiple syntax variations of this outer join exist. Only two of them
are examined here.

Page 11-30 Outer Join


Multiple Table Variations

These two outer join queries also return the same result.

SELECT e.Last_Name,
       d.Department_Name,
       j.Description
FROM (Department d RIGHT JOIN
      (Job j RIGHT JOIN
       Employee e
       ON e.Job_Code = j.Job_Code )
      ON e.Department_Number = d.Department_Number );

SELECT e.Last_Name,
d.Department_Name,
j.Description
FROM (Job j RIGHT JOIN
(Department d RIGHT JOIN
Employee e
ON e.Department_Number = d.Department_Number )
ON e.Job_Code = j.Job_Code );

Outer Join Page 11-31


Three Table Outer Join Results
The completed result follows. Notice that we are not projecting the join columns from the outer
table, something we were told not to do earlier, but in a class, we are not bound strictly by these
rules.

Last Name   Department Number   Job Code
Jones       100                 6666
Smith       200                 ?
Brown       ?                   8888
Adams       ?                   ?

Page 11-32 Outer Join


Three Table Outer Join Results

Complete the result.

SELECT e.Last_Name,
       d.Department_Number,
       j.Job_Code
FROM Employee e LEFT JOIN
     Department d
  ON e.Department_Number =
     d.Department_Number
LEFT JOIN
     Job j
  ON e.Job_Code = j.Job_Code

Result (to be completed)
Last Name   Department Number   Job Code
Jones
Smith
Brown
Adams

Employee
Last Name   Department Number   Job Code
Jones       100                 6666
Smith       200                 7777
Brown       300                 8888
Adams       400                 9999

Department
Department Number (Unique)
100
200
600

Job
Job Code (Unique)
6666
5555
8888

Outer Join Page 11-33


Uncharacteristic Data and Outer Join
The completed result follows. Notice that we are now projecting the join columns from the
outer table; however, we are also projecting other, non-join columns from the inner tables as
well.

Last Name   Dept#   JCD    DName     Desc
Jones       100     6666   Sales     Manager
Smith       200     7777   Support   ?
Brown       300     8888   ?         President
Adams       400     9999   ?         ?

Page 11-34 Outer Join


Uncharacteristic Data and Outer Join

Complete the result.


SELECT e.Last_Name,
       e.Department_Number AS Dept#,
       e.Job_Code AS JCD,
       d.Department_Name AS DName,
       j.Description AS "Desc"
FROM Employee e LEFT JOIN
     Department d
  ON e.Department_Number =
     d.Department_Number
LEFT JOIN
     Job j
  ON e.Job_Code = j.Job_Code

Result (to be completed)
Last Name   Dept#   JCD    DName   Desc
Jones       100     6666
Smith       200     7777
Brown       300     8888
Adams       400     9999

Employee
Last Name   Department Number   Job Code
Jones       100                 6666
Smith       200                 7777
Brown       300                 8888
Adams       400                 9999

Department
Dept Number (Unique)   Dept Name
100                    Sales
200                    Support
600                    Admin

Job
Job Code (Unique)   Desc
6666                Manager
5555                Director
8888                President

Outer Join Page 11-35


Considering Nulls
The completed result follows. Notice that we are putting a new twist on the source tables by
placing nulls in some column values.

Last Name   Dept#   JCD    DName   Desc
Jones       100     6666   Sales   Manager
Smith       200     7777   ?       ?
Brown       300     8888   ?       ?
Adams       400     9999   ?       ?

Page 11-36 Outer Join


Considering Nulls

Complete the result.


SELECT e.Last_Name,
       e.Department_Number AS Dept#,
       e.Job_Code AS JCD,
       d.Department_Name AS DName,
       j.Description AS "Desc"
FROM Employee e LEFT JOIN
     Department d
  ON e.Department_Number =
     d.Department_Number
LEFT JOIN
     Job j
  ON e.Job_Code = j.Job_Code

Result (to be completed)
Last Name   Dept#   JCD    DName   Desc
Jones       100     6666
Smith       200     7777
Brown       300     8888
Adams       400     9999

Employee
Last Name   Department Number   Job Code
Jones       100                 6666
Smith       200                 7777
Brown       300                 8888
Adams       400                 9999

Department
Dept Number (Unique)   Dept Name
100                    Sales
200                    ?
600                    Finance

Job
Job Code (Unique)   Desc
6666                Manager
5555                Director
8888                ?

Outer Join Page 11-37


Full Outer Join
What separates this result from all of the others that we have seen so far is that we are
projecting both of the join columns, so that there should be no confusion as to which is an
inner result vs. which is an outer result.

Dept  Department_Name           Last_Name  EmpDept  Result
----  ------------------------  ---------  -------  ------
600   new department            ?          ?        Left
402   software support          Crane      402      Inner
?     ?                         James      111      Right
501   marketing sales           Runyon     501      Inner
301   research and development  Kanieski   301      Inner
?     ?                         Green      ?        Right
100   president                 Trainer    100      Inner
301   research and development  Stein      301      Inner
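The classification shown above can also be computed in the query itself. This is a sketch and
not part of the course material: the Row_Type column is hypothetical, and the logic assumes
Department_Number is never null in the Department table.

```sql
-- Department is the left table and Employee the right table.
-- A null Dept means no department matched (right outer row); otherwise a
-- null EmpDept means no employee matched (left outer row).
SELECT d.Department_Number AS Dept,
       d.Department_Name,
       e.Last_Name,
       e.Department_Number AS EmpDept,
       CASE
         WHEN d.Department_Number IS NULL THEN 'Right Outer'
         WHEN e.Department_Number IS NULL THEN 'Left Outer'
         ELSE 'Inner'
       END AS Row_Type                       -- hypothetical column
FROM Department d FULL OUTER JOIN Employee e
  ON e.Department_Number = d.Department_Number;
```

The order of the WHEN branches matters: a new employee with a null department number (such as
Green above) has nulls in both join columns and is caught by the first branch.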

Page 11-38 Outer Join


Full Outer Join

The Full Outer Join returns the:

• Left outer result set.
• Right outer result set.
• Inner result set.

SELECT d.Department_Number AS Dept, d.Department_Name,
       e.Last_Name, e.Department_Number AS EmpDept
FROM Department D FULL OUTER JOIN Employee E
  ON E.Department_Number = D.Department_Number;

Department (LEFT Table)                   Employee (RIGHT Table)

Specify whether each result row is:
• Inner
• Left Outer
• Right Outer

Dept  Department_Name           Last_Name  EmpDept
600   new department            ?          ?
402   software support          Crane      402
?     ?                         James      111
501   marketing sales           Runyon     501
301   research and development  Kanieski   301
?     ?                         Green      ?
100   president                 Trainer    100
301   research and development  Stein      301

Outer Join Page 11-39


Module 11: Summary
A summary of this module is discussed.

Page 11-40 Outer Join


Module 11: Summary

• Outer joins share a similar syntax to the explicit form of inner join.

• Outer join results can be obtained by using UNION with inner join and
NOT IN.

• There are many ways one can write outer joins to achieve the same
result.

• Outer joins return both inner and outer result sets.

• Teradata uses only the ANSI standard syntax for outer join.

• Parentheses can be used to aid in determining syntax usage.

• Inner and Outer joins may be used together in a single query.

• Outer joins require the use of an “ON” clause establishing a join condition.

Outer Join Page 11-41


Module 11: Review Questions
Check your understanding of the concepts discussed in this module by completing the review
questions as directed by your instructor.

Page 11-42 Outer Join


Module 11: Review Questions

True or False:

1. All outer joins require use of either LEFT, RIGHT or FULL keywords.
True
2. Outer joins can return more rows than can inner joins.
True
3. Nulls returned from the inner table mean the result row is an outer
result.
False – Only for the join column, or if the column is defined as NOT NULL
4. The use of a WHERE clause is not allowed in an outer join.
False – WHERE can be used for writing residual conditions
5. The use of an ON clause is required when writing an outer join.
True
6. The keyword OUTER is required when writing outer joins.
False
7. The FULL outer join returns LEFT and RIGHT outer results.
True – It also returns inner results

Outer Join Page 11-43


Module 11: Lab Exercise
Check your understanding of the concepts discussed in this module by completing the lab
exercise as directed by your instructor.

Page 11-44 Outer Join


Module 11: Lab Exercise

1. From the employee and department tables, list employee last names,
first names, the department names and the employees' department
numbers only, for all employees. Compare this to the number of rows
returned by the inner join.

2. For #1, include the department number from the department table to
the projection to see which rows are actually outer results.

3. From the employee, department, and job tables, list employee last
names, first names, and the join columns from all three tables for all
employees having salaries between $34,000.00 and $58,000.00. Do any
of these employees have both an invalid department and invalid job?

Outer Join Page 11-45


Notes:

Page 11-46 Outer Join


Module 12

Correlated Subqueries

After completing this module, you will be able to:

• Identify correlated subqueries.
• Contrast subqueries with correlated subqueries.
• Use EXISTS and NOT EXISTS with correlated subqueries.
• Use aggregation with correlated subqueries.
• Contrast IN with EXISTS.
• Contrast NOT IN with NOT EXISTS.
• Use multiple correlations in a query.
• Incorporate joins into correlated subquery usage.

Correlated Subqueries Page 12-1


Notes:

Page 12-2 Correlated Subqueries


Table of Contents
Subquery Review ....................................................................................................................... 12-4
Correlated Subquery Terminology............................................................................................. 12-6
Correlated Subquery Processing ................................................................................................ 12-8
NOT IN vs. NOT EXISTS ....................................................................................................... 12-10
NOT IN Review ....................................................................................................................... 12-12
NOT EXISTS vs. NOT IN Logic............................................................................................ 12-14
Multiple Correlations ............................................................................................................... 12-16
Module 12: Summary............................................................................................................... 12-18
Module 12: Review Questions ................................................................................................. 12-20
Module 12: Lab Exercise ......................................................................................................... 12-22

Correlated Subqueries Page 12-3


Subquery Review
Before getting into correlated subqueries, a quick review of standard subqueries is in order. For
comparison purposes, recall that:

• Subqueries are a set process.
• You can only project columns from the outer table.
• To understand the business question, start from the inner query and work out.

The correlated subquery differs in that:

• Correlated subqueries are a row-at-a-time process.
• You can still only project from the outer table.
• To understand the business question, start from the outer table and work in.

Page 12-4 Correlated Subqueries


Subquery Review

Before we begin looking at the correlated subquery, let’s review the
normal subquery from earlier.

Recall that subqueries are considered a set process.

Subquery processing:
• Resolve bottom-most level query first
• Use results as input to next level up
• Continue until top level reached

Find all employees who work in any Research department.

SELECT last_name
FROM employee                        -- outer table
WHERE department_number IN
   (SELECT department_number
    FROM department                  -- inner table
    WHERE department_name
          LIKE ('%research%'));

Correlated Subqueries Page 12-5


Correlated Subquery Terminology
The facing page repeats the points made on the earlier left-hand page and adds some new ones.
How a correlated subquery is processed is described on the next right-hand page. Here we are
interested only in the terminology.

Page 12-6 Correlated Subqueries


Correlated Subquery Terminology

Correlated subqueries:

• are subqueries.
• can only project from outer table.
• have the inner table “inter-connected” to outer table via a join condition
(correlated).
• are considered a row-at-a-time process as opposed to a set process.
• most typically reference EXISTS or NOT EXISTS.

SELECT last_name
,department_number AS deptno
,salary_amount
FROM employee ee                     -- outer table
WHERE [ NOT ] EXISTS
   (SELECT *
    FROM department dd               -- inner table
    WHERE ee.department_number =     -- correlation (inter-connection)
          dd.department_number);

Correlated Subqueries Page 12-7


Correlated Subquery Processing
It has already been mentioned that correlated subqueries are, relationally, a row-at-a-time process.
That is how they are defined to work, so any database must return a result set that follows this
logic. Row-at-a-time logic is quite different from set logic.

As it turns out, the database can return a result set according to this logic quite efficiently. The
database is not required to execute in the manner defined; it only has to return the result that the
defined process would produce. The result differs from that of the ordinary subquery mainly for
NOT IN (vs. NOT EXISTS) rather than for IN (vs. EXISTS). This is discussed later in
the module.
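
To make the IN vs. EXISTS comparison concrete, here is a minimal sketch, assuming the course's employee and department tables. Both requests ask for employees whose department number appears in the department table, and for this positive form the set logic and the row-at-a-time logic should return the same rows.

```sql
-- Subquery using IN (set logic)
SELECT last_name
FROM employee
WHERE department_number IN
   (SELECT department_number
    FROM department);

-- Correlated subquery using EXISTS (row-at-a-time logic)
SELECT last_name
FROM employee ee
WHERE EXISTS
   (SELECT *
    FROM department dd
    WHERE ee.department_number = dd.department_number);
```

An employee with a null department number is returned by neither request, because a comparison against a null never evaluates to true.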

Page 12-8 Correlated Subqueries


Correlated Subquery Processing

• Correlated subqueries provide an implicit loop function within any standard SQL
DML statement.
• The logic defining the relational processing is different from the internal database
processing.

Correlated subqueries are ANSI SQL-2003-compliant.

Outer references behave as described in the following process.


This process does not mirror the query plan generated by the Optimizer.
It is meant only to describe how a correlated subquery works at a conceptual level.

SELECT last_name
,department_number
,salary_amount
FROM employee ee
WHERE EXISTS
   (SELECT *
    FROM department dd
    WHERE ee.department_number =
          dd.department_number);

1. Retrieve an arbitrary row (e.g. an employee) from the outer table (employee).
2. See if this row’s department number finds a match in the inner table (department).
3. If a match is found, the condition exists, so project the columns for this row.

Continue until EOF.

Correlated Subqueries Page 12-9


NOT IN vs. NOT EXISTS
The facing page contrasts NOT IN with NOT EXISTS, focusing mainly on the different results
returned. As mentioned earlier, this difference in results is the main difference between subquery
processing and correlated subquery processing.

Page 12-10 Correlated Subqueries


NOT IN vs. NOT EXISTS

NOT IN can have issues when nulls are involved.


NOT EXISTS does not share those issues due to row-at-a-time logic.

Which job codes are not assigned to any employee?

Subquery using NOT IN (set logic):

SELECT job_code
FROM job
WHERE job_code NOT IN
   (SELECT job_code
    FROM employee) ;

*** No rows found

Recall our earlier discussion where column job_code returned null.

Correlated Subquery using NOT EXISTS (row-at-a-time logic):

SELECT job_code
FROM job
WHERE NOT EXISTS
   (SELECT *
    FROM employee ee
    WHERE ee.job_code = job.job_code);

job_code
104202
104201
412103
322101
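
If NOT IN must be used, one common workaround is to keep nulls out of the subquery result. The following is a sketch only, assuming the course's job and employee tables and assuming that job.job_code itself is never null.

```sql
-- Filtering nulls out of the inner result removes the value that would
-- otherwise make every NOT IN comparison evaluate to unknown.
SELECT job_code
FROM job
WHERE job_code NOT IN
   (SELECT job_code
    FROM employee
    WHERE job_code IS NOT NULL);
```

Under those assumptions this request should list the same unassigned job codes as the NOT EXISTS version.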

Correlated Subqueries Page 12-11


NOT IN Review
Again, a review of an earlier concept is needed before looking at the correlated subquery
alternative on the next page.

Page 12-12 Correlated Subqueries


NOT IN Review

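Find all empty departments

Tables

   Dept           Emp
   Dept#          Emp#    Dept#
   -----          -----   -----
   401            1018    401
   403            1020    403
   600            1030    NULL
   NULL

SELECT Dept# FROM Dept
WHERE Dept# NOT IN
   (SELECT Dept# FROM Emp);

Becomes

SELECT Dept# FROM Dept
WHERE Dept# NOT IN (401, 403, NULL);

Becomes

SELECT Dept# FROM Dept
WHERE Dept# NOT = 401
AND Dept# NOT = 403
AND Dept# NOT = NULL;

Evaluation for each Dept row (T = true, F = false, ? = unknown):

   Dept# value            401   403   600   NULL
   --------------------   ---   ---   ---   ----
   Dept# NOT = 401         F     T     T     ?
   Dept# NOT = 403         T     F     T     ?
   Dept# NOT = NULL        ?     ?     ?     ?
   --------------------   ---   ---   ---   ----
   Combined (AND)          F     F     ?     ?

Only ‘true’ qualifies for output.

*** No rows returned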
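
The behavior can be reproduced with a small script. This is a sketch under stated assumptions: the table names dept_demo and emp_demo and the column names dept_no and emp_no are hypothetical stand-ins for the Dept and Emp tables pictured above.

```sql
-- Hypothetical demo tables mirroring the Dept and Emp data shown above.
CREATE TABLE dept_demo (dept_no INTEGER);
CREATE TABLE emp_demo (emp_no INTEGER, dept_no INTEGER);

INSERT INTO dept_demo VALUES (401);
INSERT INTO dept_demo VALUES (403);
INSERT INTO dept_demo VALUES (600);
INSERT INTO dept_demo VALUES (NULL);

INSERT INTO emp_demo VALUES (1018, 401);
INSERT INTO emp_demo VALUES (1020, 403);
INSERT INTO emp_demo VALUES (1030, NULL);

-- The inner result contains a null, so no outer row evaluates to true.
-- Returns no rows.
SELECT dept_no
FROM dept_demo
WHERE dept_no NOT IN
   (SELECT dept_no FROM emp_demo);

-- With the null removed from the list, department 600 qualifies.
-- The NULL department still does not: its comparisons are unknown.
-- Returns 600.
SELECT dept_no
FROM dept_demo
WHERE dept_no NOT IN (401, 403);
```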

Correlated Subqueries Page 12-13


NOT EXISTS vs. NOT IN Logic
The facing page describes the logic behind the result that is obtained. Of the rows in the
department table, only department 600 and the NULL department do not find matches in the
employee table. So the columns for these rows will be projected.

Page 12-14 Correlated Subqueries


NOT EXISTS vs. NOT IN Logic

Find departments in which no one works.

SELECT Dept#
FROM Dept D
WHERE NOT EXISTS
   (SELECT * FROM Emp E
    WHERE E.Dept# = D.Dept#);

   Dept (outer)                Emp (inner)
   Dept#    Result             Emp#    Dept#
   -----    ----------         -----   -----
   401      EXISTS             1018    401
   403      EXISTS             1020    403
   600      NOT EXISTS         1030    NULL
   NULL     NOT EXISTS

Answer: 600, null
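
The same answer can also be reached with an outer join, as covered in Module 11. This is a sketch only, assuming the Dept and Emp tables pictured above and assuming that Emp# is never null in a real Emp row, so that a null Emp# can only mean "no matching employee".

```sql
-- Keep every Dept row, then retain only those that found no Emp match.
SELECT D.Dept#
FROM Dept D LEFT OUTER JOIN Emp E
  ON E.Dept# = D.Dept#
WHERE E.Emp# IS NULL;
```

For the data shown, departments 401 and 403 find matches and drop out, leaving 600 and the NULL department, the same rows that NOT EXISTS returns.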

Correlated Subqueries Page 12-15


Multiple Correlations
The query on the facing page answers: “List employees with invalid departments and invalid jobs.”
The result is:

manager_employee_number department_number job_code
----------------------- ----------------- -----------
                    801                 ?      211100

It can be verified with the following two queries.

Employees with invalid departments:

SELECT Manager_Employee_Number, Department_Number, Job_Code
FROM Employee ee
WHERE NOT EXISTS
   (SELECT * FROM Department d
    WHERE ee.Department_Number = d.Department_Number);

manager_employee_number department_number job_code
----------------------- ----------------- -----------
                    801               999      111100
                    801                 ?      211100
                   1025                 ?      222101
                    801                 ?      321100

And employees with invalid jobs:

SELECT Manager_Employee_Number, Department_Number, Job_Code
FROM Employee ee
WHERE NOT EXISTS
   (SELECT * FROM Job j
    WHERE ee.Job_code = j.Job_Code);

manager_employee_number department_number job_code
----------------------- ----------------- -----------
                    801                 ?      211100
                   1005               403           ?
                   1005               403           ?
                   1005               403           ?
                   1005               403           ?
                   1005               403           ?

Of these two results, the only row that satisfies the “AND” condition between them is the one
shown at the top of this page.

Page 12-16 Correlated Subqueries


Multiple Correlations

Derive the business question for the following query.

SELECT Manager_Employee_Number
,Department_Number
,Job_Code
FROM Employee ee
WHERE
NOT EXISTS
(SELECT *
FROM Department d
WHERE ee.Department_Number = d.Department_Number)
AND
NOT EXISTS
(SELECT *
FROM Job j
WHERE ee.Job_code = j.Job_Code);

Correlated Subqueries Page 12-17


Module 12: Summary
A summary of this module is discussed.

Page 12-18 Correlated Subqueries


Module 12: Summary

Correlated subqueries:

• Can be used in place of IN and NOT IN.


• Can reference EXISTS or NOT EXISTS.
• Use row-at-a-time logic.
• May outperform traditional subqueries.
• Involve INNER and OUTER portions like traditional subqueries.
• May only project from the outer table.

Correlated Subqueries Page 12-19


Module 12: Review Questions
Check your understanding of the concepts discussed in this module by completing the review
questions as directed by your instructor.

Page 12-20 Correlated Subqueries


Module 12: Review Questions

True or False:

1. EXISTS and NOT EXISTS can be used in traditional subqueries.


True
2. When using EXISTS in a correlated subquery, the projected list of the
subquery is irrelevant.
True
3. Correlated subqueries are an ANSI standard.
True
4. Correlated subqueries process sets of data.
True – but they process row-at-a-time
5. Correlated subqueries cannot project columns from the inner table.
True
6. You cannot nest correlated subqueries.
False
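
Questions 2 and 6 can be illustrated with one request. The following is a sketch, assuming the course's employee and department tables with the employee_number and manager_employee_number columns used elsewhere in this guide. It lists employees whose department has no valid manager.

```sql
SELECT ee.last_name
FROM employee ee
WHERE EXISTS
   (SELECT 1                      -- the select list of an EXISTS subquery is irrelevant
    FROM department dd
    WHERE dd.department_number = ee.department_number
      AND NOT EXISTS              -- a correlated subquery nested inside another
         (SELECT 1
          FROM employee mgr
          WHERE mgr.employee_number = dd.manager_employee_number));
```

The literal 1 is used in place of * only to show that EXISTS tests for the presence of a row, not for any projected value.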

Correlated Subqueries Page 12-21


Module 12: Lab Exercise
Check your understanding of the concepts discussed in this module by completing the lab
exercise as directed by your instructor.

Page 12-22 Correlated Subqueries


Module 12: Lab Exercise

1. Write a correlated subquery to find employees with invalid job_codes.

2. For query #1, include their department name.

Correlated Subqueries Page 12-23


Notes:

Page 12-24 Correlated Subqueries


Module 13

Aggregation

After completing this module, you will be able to:

• Identify the 5 aggregate functions.


• Use GROUP BY to aggregate groups.
• Use HAVING to qualify on aggregated values.
• Distinguish between the effects of WHERE vs. HAVING.
• Aggregate the results of joins.
• Determine the impact of nulls as aggregated values.
• Determine the impact of nulls as grouped values.
• Use COUNT(*) to count numbers of qualifying rows.

Aggregation Page 13-1


Notes:

Page 13-2 Aggregation


Table of Contents
The Aggregate Functions ........................................................................................................... 13-4
Aggregate Functionality ............................................................................................................. 13-6
COUNT(*) ................................................................................................................................. 13-8
Getting Department Sums ........................................................................................................ 13-10
Aggregating Groups ................................................................................................................. 13-12
Using GROUP BY ................................................................................................................... 13-14
The HAVING Clause ............................................................................................................... 13-16
WHERE Clause Explain .......................................................................................................... 13-18
HAVING on Non-Aggregates ................................................................................................. 13-20
Aggregation and Joins .............................................................................................................. 13-22
Correlated Subqueries and Aggregation .................................................................................. 13-24
A Complex Example ................................................................................................................ 13-26
COUNT DISTINCT ................................................................................................................. 13-28
Module 13: Summary............................................................................................................... 13-30
Module 13: Review Questions ................................................................................................. 13-32
Module 13: Lab Exercise ......................................................................................................... 13-34

Aggregation Page 13-3


The Aggregate Functions
As the facing page states, aggregate functions perform operations that summarize information
found in tables. That is, one can expect that a certain amount of detailed information will be lost
when performing aggregations. In our first example we see that a single value, representing the
sum of all salary amounts for all employees, is returned.

Page 13-4 Aggregation


The Aggregate Functions

The aggregate functions produce an arithmetic summarization of values.

They produce sums, averages and counts as well as minimum and maximum
values.

{ SUM
| AVERAGE | AVG
| MINIMUM | MIN
| MAXIMUM | MAX
| COUNT } ( expression )

SELECT SUM(Salary_Amount) AS SumSal FROM Employee;

SumSal
------------
  1102050.00

Aggregation Page 13-5


Aggregate Functionality
The most important thing to understand about aggregation is that nulls are ignored! When
saying that they are ignored, we mean just that. They are not even considered as existing. This
is underscored when we look closely at the result for the query at the bottom of the facing page.
In the normal sorting sequence, nulls sort first, so they might be expected to count as the
minimum value. Because they are ignored, however, they do not appear as the minimum value.
Notice that the average is 60 (240 / 4) and not 40 (240 / 6).

Page 13-6 Aggregation


Aggregate Functionality

All aggregate functions ignore nulls.


Given table “t1” with column “c1” and its values, note:
• the default headings.
• that nulls are ignored.
• that a single row is returned (i.e. there is no detail)

T1
C1
-----
0
0
200
40
NULL
NULL

SELECT SUM(c1), COUNT(c1), AVG(c1), MAX(c1), MIN(c1)
FROM T1;

Sum(c1)     Count(c1)   Average(c1) Maximum(c1) Minimum(c1)
----------- ----------- ----------- ----------- -----------
        240           4          60         200           0
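
The result above can be reproduced, and the effect of the ignored nulls made visible, with a small script. This is a sketch under stated assumptions: the table name t1_demo and the id column are hypothetical, with id added only so that each row is distinct.

```sql
-- Hypothetical demo table holding the six c1 values shown above.
CREATE TABLE t1_demo (id INTEGER, c1 INTEGER);

INSERT INTO t1_demo VALUES (1, 0);
INSERT INTO t1_demo VALUES (2, 0);
INSERT INTO t1_demo VALUES (3, 200);
INSERT INTO t1_demo VALUES (4, 40);
INSERT INTO t1_demo VALUES (5, NULL);
INSERT INTO t1_demo VALUES (6, NULL);

-- Nulls are ignored: sum 240, count 4, average 60, maximum 200, minimum 0.
SELECT SUM(c1), COUNT(c1), AVG(c1), MAX(c1), MIN(c1)
FROM t1_demo;

-- Converting nulls to zero first makes all six rows participate,
-- so the average becomes 240 / 6 = 40.
SELECT AVG(COALESCE(c1, 0)) AS AvgWithNullsAsZero
FROM t1_demo;
```

COALESCE is covered in Module 14. It is used here only to show that treating a null as zero is a deliberate choice and not the default behavior.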

Aggregation Page 13-7


COUNT(*)
COUNT(*) provides us with a mechanism that can be used to count actual rows. It doesn’t
matter if nulls appear for each and every column value in a row; it still gets counted!

The query illustrates how COUNT(*) can be used to derive an average, and how that average
differs from the one AVG returns, which is itself replicated by dividing SUM(c1) by
COUNT(c1).

Page 13-8 Aggregation


COUNT(*)

An optional argument for COUNT is to use a “*” or “star”.

The COUNT(*) is often referred to as a “Count-Star”.

The “*” means to count the rows.

T1
C1
-----
0
0
200
40
NULL
NULL

SELECT SUM(c1),
COUNT(c1),
COUNT(*),
AVG(c1) AS AvgFunc,                -- (1)
SUM(c1)/COUNT(c1) AS ByCount,      -- (2)
SUM(c1)/COUNT(*) AS ByStar         -- (3)
FROM t1;

Sum(c1)     Count(c1)   Count(*)    AvgFunc     ByCount     ByStar
----------- ----------- ----------- ----------- ----------- -----------
        240           4           6          60          60          40
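
Because COUNT(*) counts rows and COUNT(c1) counts only non-null values, the difference between the two gives the number of nulls in a column. A minimal sketch, assuming the table t1 shown above; the alias NullCount is illustrative.

```sql
-- Rows in the table minus rows where c1 is not null.
SELECT COUNT(*) - COUNT(c1) AS NullCount
FROM t1;
```

For the six rows shown, this works out to 6 - 4 = 2.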

Aggregation Page 13-9


Getting Department Sums
The facing page shows how one can, through the use of repetitive aggregations, obtain sums for
each of the different departments, by department number.

Page 13-10 Aggregation


Getting Department Sums

To find the total salary for all employees in each department you could do this.

However:
• A separate SELECT is required for each department.
• Impractical for large numbers of departments.

SELECT SUM (salary_amount)
FROM employee
WHERE department_number = 401;

Sum(salary_amount)
------------------
         245575.00

SELECT SUM (salary_amount)
FROM employee
WHERE department_number = 403;

Sum(salary_amount)
------------------
         233000.00

SELECT SUM (salary_amount)
FROM employee
WHERE department_number = 301;

Sum(salary_amount)
------------------
         116400.00

Aggregation Page 13-11


Aggregating Groups
Another very important piece of syntax is the GROUP BY clause. This clause can be used, as
shown, to find totals for each department, or group. Columns not involving an aggregation are
termed “non-aggregates”, while those referenced by any of the five functions are termed
“aggregates”.

For the GROUP BY Clause:

• Every projected non-aggregate column must be included in the GROUP BY.


• GROUP BY may reference a column by name or by positional value.
• Nulls, though ignored during aggregation, do form a GROUP.
• Grouping does not imply order. (i.e. ORDER BY may be required)
• You may group by column(s) that are not projected.

An example of the last bullet follows. The result differs from the facing page due to the change
in the grouping.

SELECT SUM(Salary_Amount)
FROM Employee
GROUP BY Department_Number, Manager_Employee_Number;

Sum(salary_amount)
------------------
37850.00
201800.00
38750.00
91200.00
57700.00
58700.00
66000.00
24500.00
134125.00
52500.00
31200.00
100000.00
207725.00
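
Because the grouping columns are not projected, the sums above cannot be tied back to a department or manager. The following sketch, assuming the same Employee table, projects the grouping columns as well; the alias SumSal and the ORDER BY are additions for readability.

```sql
SELECT Department_Number
      ,Manager_Employee_Number
      ,SUM(Salary_Amount) AS SumSal
FROM Employee
GROUP BY Department_Number, Manager_Employee_Number
ORDER BY 1, 2;   -- grouping does not imply order
```

The number of rows returned should be the same as before, since the groups themselves are unchanged.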

Page 13-12 Aggregation


Aggregating Groups

Create a report showing the total salary for each department.

• The GROUP BY clause allows ‘groups’ to be defined.


• Aggregate operations are performed on these groups.
• A row of information is output for each group.

SELECT department_number,
SUM (salary_amount)
FROM employee
GROUP BY department_number;

department_number Sum(salary_amount)
----------------- ------------------
              403          233000.00
              402           77000.00
              301          116400.00
              999          100000.00
                ?          129950.00
              401          245575.00
              501          200125.00

Note that NULL departments aggregate as a NULL group.

• All projected non-aggregates must appear in the GROUP BY clause.


• If you violate this rule, you will get this message:
“3504 Selected non-aggregate values must be part of the associated group.”
• As before, aggregation eliminates detail rows from the answer set.
• A GROUP BY does not imply an ordering of result rows.
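
The grouping rule can be illustrated with a request that breaks it and one way to correct it. This is a sketch, assuming the course's employee table; the failing request is left commented out so that the block runs as written.

```sql
-- Violates the rule: last_name is projected but is neither aggregated
-- nor grouped, which is the situation message 3504 describes.
-- SELECT department_number, last_name, SUM(salary_amount)
-- FROM employee
-- GROUP BY department_number;

-- One correction: add the non-aggregate to the GROUP BY.
-- This changes the groups, so expect more (and smaller) groups.
SELECT department_number, last_name, SUM(salary_amount)
FROM employee
GROUP BY department_number, last_name;
```

The other correction is simply to remove last_name from the projection, which keeps one row per department.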

Aggregation Page 13-13


Using GROUP BY
Adding columns to the grouping increases the number of groups and, hence, the number of rows
returned, up to the point where the GROUP BY combination becomes unique. With unique
groups, the detail and the aggregation coincide.

Page 13-14 Aggregation


Using GROUP BY

Adding columns to the GROUP BY creates more groups.


This could add more groups and result in little or no aggregation happening.

In this example, the GROUP BY combination is nearly unique.

SELECT Manager_Employee_Number AS Mgr,
Department_Number AS Dept,
Job_Code AS JCd,
SUM(Salary_Amount) AS SumSal
FROM employee_sales.employee
GROUP BY 1, 2, 3
ORDER BY 1, 2, 3;

   Mgr   Dept         JCd     SumSal
------ ------ ----------- ----------
   801      ?      211100   34700.00
   801      ?      321100   56500.00
   801    301      311100          ?
   801    401      411100   37850.00
   801    402      421100   52500.00
   801    403      431100   31200.00
   801    501      511100   66000.00
   801    999      111100  100000.00
  1003    401      412101  107825.00
  1003    401      412102   24500.00
  1003    401      413201   43100.00
  1005    403           ?  162300.00
  1011    402      422101          ?
  1017    501      512101  134125.00
  1019    301      312101   29450.00
  1019    301      312102   29250.00
  1025      ?      222101   38750.00

Examples of valid GROUP BY clauses for this query:
• GROUP BY 1, 2, 3
• GROUP BY 3, 1, 2
• GROUP BY Mgr, Dept, JCd
• GROUP BY 1, Department_Number, JCd

Aggregation Page 13-15


The HAVING Clause
The facing page shows how the HAVING clause works with aggregation. This clause is used to
eliminate result rows based upon qualifications against the aggregate values. In the scheme of
things the WHERE clause is performed prior to the select so that only rows satisfying the
WHERE conditions participate in the aggregation.

The HAVING clause can reference a column either by its alias or by its actual aggregation as it
appears in the projection. A HAVING that attempts to reference a column by numeric position
interprets the number as a literal and not as a column position, so such a clause is always
false and no rows are returned. An EXPLAIN will reveal this fact.

In this example, the value “2” is treated as a literal. Since the condition “2 > 100000” can
never be true, no rows are ever returned. Observe the EXPLAIN plan.

SELECT Department_Number,
SUM(Salary_Amount) AS SumSal
FROM Employee
GROUP BY 1
HAVING 2 > 100000;

We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by way of an all-rows scan with a
condition of ("2 > 100000") into Spool 1 (group_amps), which is built locally on the AMPs. The
size of Spool 1 is estimated with no confidence to be 15 rows (660 bytes). The estimated time
for this step is 0.02 seconds.

The EXPLAIN plan should read like the one for this query, referencing a “field”.

SELECT Department_Number,
SUM(Salary_Amount) AS SumSal
FROM Employee
GROUP BY 1
HAVING SumSal > 100000;

We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by way of an all-rows scan with a
condition of ("Field_3 > 100000.00") into Spool 1 (group_amps), which is built locally on the
AMPs. The size of Spool 1 is estimated with no confidence to be 15 rows (660 bytes). The
estimated time for this step is 0.02 seconds.

Page 13-16 Aggregation


The HAVING Clause

You can use a HAVING clause for qualifying on an aggregated value.


The HAVING clause should reference an aggregated value.
The WHERE clause must reference a non-aggregated value.
SELECT department_number, SUM(Salary_Amount)
FROM Employee
GROUP BY 1
WHERE Department_Number IN (401, 402)
HAVING SUM(Salary_Amount) > 100000;

department_number Sum(salary_amount)
----------------- ------------------
401 245575.00

HAVING, like WHERE and GROUP BY, may reference column names or alias names.
SELECT department_number AS d#, SUM(Salary_Amount) AS SumSal
FROM Employee
GROUP BY d#
WHERE d# IN (401, 402)
HAVING SumSal > 100000;
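
The slide places GROUP BY ahead of WHERE. The same request can also be written in the conventional ANSI clause order of WHERE, GROUP BY, HAVING, which is the order standard SQL defines. This is a sketch, assuming the course's Employee table; it uses column names instead of aliases in WHERE and GROUP BY for the same portability reason.

```sql
SELECT department_number
      ,SUM(Salary_Amount) AS SumSal
FROM Employee
WHERE department_number IN (401, 402)   -- qualifies detail rows before aggregation
GROUP BY department_number
HAVING SUM(Salary_Amount) > 100000;     -- qualifies groups after aggregation
```

Written this way, the clause order also mirrors the order in which the conditions are applied.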

Aggregation Page 13-17


WHERE Clause Explain
The EXPLAIN plan on the facing page shows that the WHERE condition is applied during
the aggregation, so that only the qualifying rows are aggregated. This performs better
than checking afterwards, which is when the HAVING is performed.

The order in which the clauses are executed is important. It implies that the HAVING
clause can also reference a non-aggregate value. We will show an EXPLAIN plan of this
on the next page.

Page 13-18 Aggregation


WHERE Clause Explain

Below is a partial list showing the order in which certain clauses take
place during a query’s execution.

1. WHERE {join conditions}
2. AGGREGATION (w or w/o group by)
3. HAVING
4. ORDER BY
5. FORMAT

SELECT department_number,
SUM(Salary_Amount) AS SumSal
FROM Employee
GROUP BY 1
WHERE Department_Number IN (401, 402);

Only salaries for departments 401 and 402 are summed.


. . . .
3) We do an all-AMPs SUM step to aggregate from DLM.Employee by way of an all-rows scan with
a condition of ("(DLM.Employee.department_number = 402) OR
(DLM.Employee.department_number = 401)") , grouping by field1 (
DLM.Employee.department_number). Aggregate Intermediate Results are computed globally,
then placed in Spool 3. The size of Spool 3 is estimated with no confidence to be 2 rows (74
bytes). The estimated time for this step is 0.03 seconds.
4) We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by way of an all-rows scan into
Spool 1 (group_amps), which is built locally on the AMPs. The size of Spool 1 is estimated with
no confidence to be 2 rows (82 bytes). The estimated time for this step is 0.02 seconds.
. . . .

Aggregation Page 13-19


HAVING on Non-Aggregates
Here we see that the HAVING clause can reference a non-aggregate value, but the fact that this
condition appears after the aggregation is performed means that all of the employee rows will
participate in the aggregation. Not good. This condition should be placed as a WHERE
condition for performance reasons, if not for clarity.

The answer to the question posed at the bottom of the facing page is that it would fail, because
the aggregation is not performed until after the WHERE conditions have been satisfied.

Page 13-20 Aggregation


HAVING on Non-Aggregates

The explanation for the following query shows that the HAVING clause
occurs after the aggregation.

1. WHERE {join conditions}
2. AGGREGATION (w or w/o group by)
3. HAVING
4. ORDER BY
5. FORMAT

SELECT Department_Number, SUM(Salary_Amount) AS SumSal
FROM Employee
GROUP BY 1
HAVING Department_Number IN (401, 402);

Here the sums are performed on ALL departments, with the HAVING clause
happening AFTER the aggregation of ALL departments.

. . . .
3) We do an all-AMPs SUM step to aggregate from DLM.Employee by way of an all-rows scan
with no residual conditions, grouping by field1 ( DLM.Employee.department_number).
Aggregate Intermediate Results are computed globally, then placed in Spool 3. The size of
Spool 3 is estimated with no confidence to be 15 rows (555 bytes). The estimated time for this
step is 0.03 seconds.
4) We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by way of an all-rows scan with a
condition of ("(department_number = 401) OR (department_number = 402)") into Spool 1
(group_amps), which is built locally on the AMPs. The size of Spool 1 is estimated with no
confidence to be 15 rows (615 bytes). The estimated time for this step is 0.02 seconds.
. . . .

Aggregation Page 13-21


Aggregation and Joins
Suppose that you would like to find the sums of the salary amounts, not by department number,
but by department name. This requires a join so that we may project the department names.
Note that, since the join condition is treated as a WHERE condition, only the rows resulting from
the inner join are aggregated. There is an interesting question at the bottom of the facing page.

“What does SumSal for the null department name represent?”

The value for “SumSal” represents the two employees in department 402 that have a null
department name in the department table. Observe the result if this query were an outer join
instead:

SELECT d.Department_Name, SUM(e.Salary_Amount) AS SumSal


FROM Employee e LEFT JOIN Department d
ON e.Department_Number = d.Department_Number
GROUP BY 1;

department_name SumSal
------------------------------ ------------
education 233000.00
research and development 116400.00
customer support 245575.00
? 306950.00
marketing sales 200125.00

Now the value for “SumSal” represents:

• The two employees in department 402 that have a null department name as before.
• The one employee in department 999 that has no match in department.
• The two employees that have a null department number (new hires?).
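
The three sources that collapse into the single null row can be separated by grouping on the employee's department number as well as on the department name. This is a sketch, assuming the same Employee and Department tables.

```sql
SELECT e.Department_Number
      ,d.Department_Name
      ,SUM(e.Salary_Amount) AS SumSal
FROM Employee e LEFT JOIN Department d
  ON e.Department_Number = d.Department_Number
GROUP BY 1, 2
ORDER BY 1;
```

With this grouping, the 306950.00 total should split into separate rows for department 402, department 999, and the null department number. Going by the figures shown in this module, those would be 77000.00, 100000.00, and 129950.00.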

Page 13-22 Aggregation


Aggregation and Joins

You can join tables to group by other columns as well.

In this example we are:


• Summing salary amounts from Employee.
• Grouping by department names from Department.

SELECT d.Department_Name, SUM(e.Salary_Amount) AS SumSal


FROM Employee e JOIN Department d
ON e.Department_Number = d.Department_Number
GROUP BY 1;

department_name SumSal
------------------------------ ------------
education 233000.00
research and development 116400.00
customer support 245575.00
? 77000.00
marketing sales 200125.00

The row with the null department name represents the sum of the salaries for those
employees working in a department that does not have a name for it in the
department table.

Aggregation Page 13-23


Correlated Subqueries and Aggregation
Correlated subqueries can also involve aggregation. The answer to the question at the bottom of
the facing page is:

“No. As a subquery you may only project columns from the outer table.”

Page 13-24 Aggregation


Correlated Subqueries and Aggregation

List employee information for those whose salary is greater than their
department average.

SELECT last_name
,department_number
,salary_amount
FROM employee ee
WHERE salary_amount >
   (SELECT AVG(Salary_Amount)
    FROM employee dd
    WHERE ee.department_number =
          dd.department_number);

1. Retrieve an arbitrary row (e.g. an employee) from the outer table (employee).
2. Retrieve the average salary amount for the matching department number from the
same table.
3. If the row’s salary is greater than that average, project the columns from this
outer row and continue to another row.

Continue until EOF.

Would it be possible to add a column to this report showing the
average department salary being used for the comparison?
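
Since a subquery can only project from the outer table, showing the average requires a different construction. One alternative, not covered in this module, is to join to a derived table that holds one average per department. This is a sketch only, assuming the course's employee table; the names dept_avg and AvgSal are illustrative.

```sql
SELECT ee.last_name
      ,ee.department_number
      ,ee.salary_amount
      ,dept_avg.AvgSal             -- the average used for the comparison
FROM employee ee
INNER JOIN
   (SELECT department_number
          ,AVG(salary_amount) AS AvgSal
    FROM employee
    GROUP BY department_number) AS dept_avg
  ON ee.department_number = dept_avg.department_number
WHERE ee.salary_amount > dept_avg.AvgSal;
```

As in the correlated version, employees with a null department number are not returned, because a null never satisfies the join condition.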

Aggregation Page 13-25


A Complex Example
In this complex example we are adding a join to the correlated subquery.

Think of it this way:


• The correlated subquery is used to find employees having a salary that is the largest
salary for their department.
• The join is to qualify that, of all employees, these are only the department managers.

The following example shows how to find those department managers with salaries greater than
their department average. Except for the “>” symbol and “AVG” (instead of the “=” and
“MAX”), it is identical to the one on the facing page.

SELECT d.manager_employee_number AS mgr_emp_#


,d.department_number
,e.salary_amount
FROM department d INNER JOIN
employee e
ON e.employee_number =
d.manager_employee_number
WHERE e.salary_amount >
(SELECT AVG (salary_amount)
FROM employee em
WHERE d.department_number
= em.department_number);

Page 13-26 Aggregation


A Complex Example

Show the employee number, department number, and salary of all
department managers who have the highest salary in their department.

In this example the join is a qualification prior to the correlation.

SELECT d.manager_employee_number AS mgr_emp_#
,d.department_number
,e.salary_amount
FROM department d INNER JOIN
employee e                             -- Employees who are department
ON e.employee_number =                 -- managers . . .
d.manager_employee_number
WHERE e.salary_amount =
   (SELECT MAX (salary_amount)         -- . . . and have the highest
    FROM employee em                   -- salary in that department.
    WHERE d.department_number
          = em.department_number);

Aggregation Page 13-27


COUNT DISTINCT
Examples of the COUNT DISTINCT appear on the facing page for reference.

Page 13-28 Aggregation


COUNT DISTINCT

You can use aggregation and DISTINCT together.


Typically this is done using COUNT.

VALID (remember that nulls are ignored):

SELECT COUNT (DISTINCT Department_Number)
FROM Employee;

INVALID:

SELECT COUNT DISTINCT Department_Number
FROM Employee;

INVALID:

SELECT COUNT (DISTINCT Department_Number, Job_Code)
FROM Employee;

VALID (Note: any null, in the concatenated string, results in a null string and,
therefore, a null result):

SELECT COUNT (DISTINCT Department_Number || Job_Code)
FROM Employee;

Both of the following are VALID:

SELECT COUNT (DISTINCT Employee_Number), COUNT(DISTINCT Department_Number)
FROM Employee;

SELECT COUNT (DISTINCT Employee_Number), COUNT(department_Number)
FROM Employee;
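
When the goal is to count distinct combinations of two columns, an alternative to concatenation is to build the distinct pairs first and then count the rows. This is a sketch, assuming the course's Employee table; the names pairs and PairCount are illustrative, and derived tables are not covered in this module.

```sql
-- Count the distinct (Department_Number, Job_Code) combinations.
SELECT COUNT(*) AS PairCount
FROM (SELECT DISTINCT Department_Number, Job_Code
      FROM Employee) AS pairs;
```

Unlike the concatenated form, this count includes combinations in which one or both columns are null, because COUNT(*) counts rows.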

Aggregation Page 13-29


Module 13: Summary
A summary of this module is discussed.

Page 13-30 Aggregation


Module 13: Summary

• Aggregation summarizes detail data into fewer rows.

• You can use GROUP BY to summarize by groups.

• WHERE conditions eliminate rows prior to performing the aggregation.

• HAVING eliminates groups after aggregations are performed.

• GROUP BY can reference by name or by numeric position.

• HAVING must refer to only a column or alias, not a numeric position.

• GROUP BY may be used to perform a DISTINCT operation.

• Aggregations may be performed on the results of Inner and Outer joins.
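
The points above can be seen together in a short sketch. It assumes the employee table used throughout this module; the aliases emp_count and avg_salary and the threshold of 5 are illustrative only.

```sql
-- WHERE removes rows before aggregation; HAVING removes groups after it.
SELECT department_number
      ,COUNT(*)           AS emp_count
      ,AVG(salary_amount) AS avg_salary
FROM employee
WHERE department_number IS NOT NULL      -- row-level filter, applied first
GROUP BY department_number               -- could also be written GROUP BY 1
HAVING COUNT(*) > 5                      -- group-level filter, applied last
ORDER BY department_number;

-- GROUP BY used to perform a DISTINCT operation.
SELECT department_number
FROM employee
GROUP BY department_number;
```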

Aggregation Page 13-31


Module 13: Review Questions
Check your understanding of the concepts discussed in this module by completing the review
questions as directed by your instructor.

Page 13-32 Aggregation


Module 13: Review Questions

True or False:

1. The value zero (0) is ignored in the aggregate process.


False – Nulls are ignored, and a zero is not the same thing as a null
2. The following is a valid request: SELECT * FROM Employee GROUP BY 1;
False
3. DISTINCT can be replicated by GROUP BY.
True
4. The HAVING clause gets applied after the aggregation is performed.
True
5. Qualifying non-aggregates with the HAVING clause can impact performance.
True
6. Qualifying aggregate values with WHERE is allowed, but impacts
performance.
False
7. You can aggregate joined columns.
True

Aggregation Page 13-33


Module 13: Lab Exercise
Check your understanding of the concepts discussed in this module by completing the lab
exercise as directed by your instructor.

Page 13-34 Aggregation


Module 13: Lab Exercise

1. Display the salary sums, by job code within department, for all
employees who work for manager 1003, 1004, or 1017. Order by job
code within department.

2. Find the average budget amount for each department, and another
average that includes a 50% budget increase.

3. Count the number of distinct manager numbers and distinct
departments from the employee table.

4. Find the minimum and maximum salaries within each department by
department name with a count of the number of employees for each
department. Return only where there are more than 5 employees in the
department.

Aggregation Page 13-35


Notes:

Page 13-36 Aggregation


Module 14

CASE

After completing this module, you should be able to:

• Use CASE to display values based upon conditions.

• Use NULLIF to replace a non-null value with a null.

• Use COALESCE to replace a null value with a non-null value.

• Use CASE in aggregations.

• Discern when to use CASE for performance reasons.

• Use NULLIF and COALESCE in arithmetic expressions.

CASE Page 14-1


Notes:

Page 14-2 CASE


Table of Contents
CASE ......................................................................................................................................... 14-4
Valued Form (Projection List) ................................................................................................... 14-6
Valued Form and Null................................................................................................................ 14-8
Searched Form ......................................................................................................................... 14-10
Searched Form (Complex Example) ........................................................................................ 14-12
CASE and Aggregation ............................................................................................................ 14-14
NULLIF Function .................................................................................................................... 14-16
NULLIF for Division ............................................................................................................... 14-18
COALESCE Function .............................................................................................................. 14-20
COALESCE and Multiple Arguments ..................................................................................... 14-22
NULLIF and COALESCE Aggregation Quiz ......................................................................... 14-24
Module 14: Summary............................................................................................................... 14-26
Module 14: Review Questions ................................................................................................. 14-28
Module 14: Lab Exercise ......................................................................................................... 14-30

CASE Page 14-3


CASE
The CASE syntax is a major player in writing SQL. It can greatly simplify SQL that
would otherwise be very difficult to write. It can also help with performance. Although
performance is not necessarily a major concern for this class, we would be remiss if we
failed to mention the following.

“CASE is typically used to improve performance of a query when you can reduce multiple
passes of a table to just a single pass.”
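
As a minimal sketch of that idea, assuming the employee table used throughout this course: the first two requests each read the table once, while the third returns both totals from a single read. The aliases Sal_401 and Sal_501 are illustrative.

```sql
-- Two passes: one read of employee per department total.
SELECT SUM(salary_amount) FROM employee WHERE department_number = 401;
SELECT SUM(salary_amount) FROM employee WHERE department_number = 501;

-- One pass: CASE routes each row's salary to the matching total.
SELECT SUM(CASE WHEN department_number = 401 THEN salary_amount ELSE 0 END) AS Sal_401
      ,SUM(CASE WHEN department_number = 501 THEN salary_amount ELSE 0 END) AS Sal_501
FROM employee;
```

Because of the ELSE 0, the single-pass form returns a zero rather than a null for a department that has no rows.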

As for the terms used to describe CASE (such as “valued form” and “searched form”), they are
normally seen only in reference material, or when taking a class like this one.

Page 14-4 CASE


CASE

CASE:
Specifies alternate values for a conditional expression or expressions
based on various equality and TRUTH conditions.

CASE is ANSI Compliant.

There are two forms for CASE:

Valued Form – Based only on equality for a single column.
(Implication: There is no facility for “IS [ NOT ] NULL”)

Searched Form – Can reference multiple columns and multiple
operators, including “IS [ NOT ] NULL”.
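
To make the difference concrete, here is the same labeling request written in both forms. It is a sketch only: it assumes the employee table, and the labels and the alias Dept_Label are made up for illustration.

```sql
-- Valued form: one expression, compared for equality with each WHEN value.
SELECT last_name
      ,CASE department_number
         WHEN 401 THEN 'Dept 401'
         WHEN 501 THEN 'Dept 501'
         ELSE 'Other'
       END AS Dept_Label
FROM employee;

-- Searched form: each WHEN carries its own complete condition,
-- so a test such as IS NULL becomes possible.
SELECT last_name
      ,CASE
         WHEN department_number = 401   THEN 'Dept 401'
         WHEN department_number = 501   THEN 'Dept 501'
         WHEN department_number IS NULL THEN 'No Dept'
         ELSE 'Other'
       END AS Dept_Label
FROM employee;
```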

CASE Page 14-5


Valued Form (Projection List)
The first form we shall look at is termed the “valued” form. This form is based on equality:
the expression that follows CASE is compared, for equality, with the value in each WHEN clause.

Page 14-6 CASE


Valued Form (Projection List)

For the following example:

• Each WHEN condition is tested in order for a true response.
• The first “True” halts further CASE evaluations.
• ELSE is performed if all of the preceding WHEN clauses fail.
• ELSE NULL is the default (it is not specified in this example)
• CASE is terminated by an END clause.

SELECT Last_Name, Department_Number AS D#, Salary_Amount,
CASE Department_Number
WHEN 301 THEN Salary_Amount
WHEN 999 THEN 'Wow'
END
FROM Employee
WHERE Department_Number IN (301, 501, 999)
ORDER BY D#;

Note the default column heading for the CASE column.
Note the alignment of these values.

last_name D# salary_amount <CASE expression>
-------------------- ----------- ------------- ------------------
Stein 301 29450.00 29450.00
Kubic 301 ? ?
Kanieski 301 29250.00 29250.00
Runyon 501 66000.00 ?
Wilson 501 53625.00 ?
Ratzlaff 501 54000.00 ?
Trainer 999 100000.00 WOW

CASE Page 14-7


Valued Form and Null
Here we finally show what happens when NULL is referenced as a condition. The answer to the
question on the facing page is that ELSE is not a conditional as is WHEN. It is a value that is to
be returned when all of the other conditions fail to evaluate true.

Page 14-8 CASE


Valued Form and Null

For the valued form the following example is illegal.
Recall that this test is based on equality.
WHEN NULL is interpreted as: WHERE Department_Number = NULL

SELECT Last_Name, Department_Number, Salary_Amount,
CASE Department_Number
WHEN 401 THEN salary_amount * 1.1
WHEN NULL THEN salary_amount * 0.85      -- This is invalid
ELSE NULL                                -- This is Ok, Why?
END
FROM Employee;

3731: The user must use IS NULL or IS NOT NULL to test for NULL values.
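
One way to express the intended request is to switch to the searched form, where the null test can be written with IS NULL. The following is a sketch under that assumption; the alias Adjusted_Salary is illustrative and is not part of the original example.

```sql
-- Searched-form rewrite: the null test uses IS NULL instead of an
-- equality comparison with NULL, which is what error 3731 asks for.
SELECT Last_Name, Department_Number, Salary_Amount,
       CASE
         WHEN Department_Number = 401   THEN Salary_Amount * 1.1
         WHEN Department_Number IS NULL THEN Salary_Amount * 0.85
         ELSE NULL
       END AS Adjusted_Salary
FROM Employee;
```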

CASE Page 14-9


Searched Form
Notice that, in the searched form of CASE, conditions other than equality are being tested. In fact,
one references LIKE while another references “IN”. There is even a test for null, which is
allowed with the searched form.

Page 14-10 CASE


Searched Form
The Searched Form can:
• reference any operator.
• reference multiple columns.
• be used to test NULL conditions.

SELECT Department_Name, Budget_Amount, Manager_Employee_Number AS M#,
CASE
WHEN Department_Name LIKE '%support%' THEN Budget_Amount * 3
WHEN Manager_Employee_Number IN (1011, 1017, 1019)
THEN Budget_Amount * 2
WHEN Department_Name IS NULL THEN 1000.00
ELSE Budget_Amount
END (DEC(10,2)) AS NewBudget
FROM Department ORDER BY 1 DESC;

department_name                 budget_amount           M#     NewBudget
------------------------------  -------------  -----------  ------------
technical operations                293800.00         1025     293800.00
research and development            465600.00         1019     931200.00
product planning                    226000.00         1016     226000.00
president                           400000.00          801     400000.00
marketing sales                     308000.00         1017     616000.00
education                           932000.00         1005     932000.00
customer support                    982300.00         1003    2946900.00
?                                           ?         1099       1000.00
?                                   308000.00         1011     616000.00

CASE Page 14-11


Searched Form (Complex Example)
There are more rules that apply to the WHEN conditions for the CASE statement. They have
been deferred until now so that the reader would have some time to absorb the many intricacies
associated with CASE.

Rules for WHEN Search Conditions


WHEN search conditions have the following properties:
• Can take the form of any comparison operator, such as LIKE, =, or <>.
• Can be a quantified predicate, such as ALL or ANY.
• Can contain joins of two tables.
For example:
SELECT CASE
WHEN t1.x=t2.x THEN t1.y
ELSE t2.y
END FROM t1,t2;
• Cannot contain SELECT statements.
• Can nest (reference) another CASE construct.

An example of a nested CASE expression follows.

SELECT Last_Name,First_Name,
CASE Last_Name
WHEN 'Brown' THEN CASE First_Name
WHEN 'Alan' THEN 'Allan'
WHEN 'Allen' THEN 'Alen'
END
WHEN 'Trainer' THEN 'Ethel'
END
FROM employee
WHERE Last_Name IN ('brown', 'trainer');

last_name    first_name       <CASE expression>
-----------  ---------------  ------------------
Brown        Allen            Alen
Brown        Alan             Allan
Trainer      I.B.             Ethel

Page 14-12 CASE


Searched Form (Complex Example)

Based upon the information found in the chart for “Plan Levels” (below):
Find the people who qualify for early retirement and the plan for which they qualify.

SELECT last_name (CHAR(11))
,(date - hire_date)/365.25 AS On_The_Job
,(date - birthdate)/365.25 AS Age
,CASE
WHEN Age > 60 AND On_The_Job > 20
THEN 'Gold Plan'
WHEN Age > 55 AND On_The_Job > 15
THEN 'Silver Plan'
ELSE 'Bronze Plan'
END AS Plan
FROM employee
WHERE Age > 50 AND On_The_Job > 10
ORDER BY 4 DESC;

Plan Levels

Plan    Age      Yrs Serv
Gold    Over 60  Over 20
Silver  Over 55  Over 15
Bronze  Over 50  Over 10

Result

last_name    On_The_Job       Age  Plan
-----------  ----------  --------  -------------
Hopkins           24.84     59.90  Silver Plan
Rogers            24.87     66.73  Gold Plan
Villegas          25.03     64.95  Gold Plan
Trader            25.46     54.57  Bronze Plan
:                     :         :  :

CASE Page 14-13


CASE and Aggregation
The examples on the facing page (they are the same query; the second one only cleans up the
result of the first) involve CASE with aggregation. Note that there is only a single column value
projected, but there is a great deal of syntax involved.

It sometimes helps if you break up a query to better understand it. For instance:

SELECT
CAST (                           -- 4
SUM (                            -- 1
CASE department_number           -- 2
WHEN 401 THEN salary_amount
ELSE 0
END)
/ SUM(salary_amount)             -- 3
AS DECIMAL(2,2)
)
AS Sal_Ratio
FROM employee;

1) Sum the result of the CASE.
2) CASE used to decide which salary to sum, i.e. only for dept. 401.
3) Divide the dept. 401 salary total by the total for all departments.
4) CAST the result as a decimal.

Page 14-14 CASE


CASE and Aggregation

Get the ratio of Dept 401 salaries to the salaries of all employees.

SELECT SUM ( CASE department_number
WHEN 401 THEN salary_amount
ELSE 0
END ) / SUM(salary_amount)
FROM employee;

(Sum(<CASE expression>)/Sum(salary_amount))
--------------------------------------------
.22

• Default result is DECIMAL.
• Default headings use generic CASE expression.

Get the ratio of Dept 401 salaries to the salaries of all employees.

SELECT CAST (SUM (
CASE department_number
WHEN 401 THEN salary_amount
ELSE 0
END ) / SUM(salary_amount) AS DECIMAL(2,2))
AS Sal_Ratio
FROM employee;

Sal_Ratio
---------
.22

CASE Page 14-15


NULLIF Function
NULLIF is another CASE construct. It can be rewritten using CASE. Like CASE, it too is
ANSI compliant. Although it provides an easier way of converting any value to null, it is
typically used to convert a zero (0) to a null as shown on the facing page.

Page 14-16 CASE


NULLIF Function

NULLIF returns NULL if its arguments are equal; otherwise, it returns its first argument.

Return a NULL if expression1 = expression2:

NULLIF ( expression1 , expression2 );
Example: NULLIF(C1, 0)

NULLIF is ANSI compliant.

Without NULLIF:

SELECT job_code AS Job
,hourly_billing_rate AS Rate
FROM job
WHERE job_code < 200000;

        Job      Rate
-----------  --------
     104202       .00
     104201       .00
     111100       .00

With NULLIF:

SELECT job_code AS Job
,NULLIF(hourly_billing_rate,0) AS Rate
FROM job
WHERE job_code < 200000;

        Job      Rate
-----------  --------
     104202         ?
     104201         ?
     111100         ?
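
Since NULLIF is shorthand for a CASE expression, the request above can also be written out in long form. The following sketch shows the searched CASE that corresponds to NULLIF(hourly_billing_rate, 0); it uses only the table and columns from this page.

```sql
-- NULLIF(hourly_billing_rate, 0) written out as a searched CASE.
SELECT job_code AS Job
      ,CASE
         WHEN hourly_billing_rate = 0 THEN NULL
         ELSE hourly_billing_rate
       END AS Rate
FROM job
WHERE job_code < 200000;
```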

CASE Page 14-17


NULLIF for Division
One very good reason for converting a zero to a null is to avoid a division-by-zero error. Such
an error, for even just a single row of the entire result set, aborts the request. The great thing
about dividing by a null is that the division can take place without error because a null, in an
arithmetic expression, returns a null.

Page 14-18 CASE


NULLIF for Division

Find the ratio of hourly billing rate to hourly cost rate for all "analyst" jobs.

Without NULLIF:
SELECT description
,hourly_billing_rate / hourly_cost_rate AS "Billing to Cost Ratio"
FROM job
WHERE description like '%analyst%';

Error Message: Division by zero in an expression involving job.hourly_cost_rate.

With NULLIF:
SELECT description
,hourly_billing_rate / (NULLIF(hourly_cost_rate, 0))
AS "Billing to Cost Ratio"
FROM job
WHERE description LIKE '%analyst%';

description                 Billing to Cost Ratio
--------------------------  ---------------------
Software Analyst                             1.29
System Support Analyst                          ?
System Analyst                               1.14
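
The same guard applies when the divisor is itself an aggregate. As a sketch, the department 401 salary-ratio request from the CASE and Aggregation topic could be protected as shown below: if the salary total were zero, NULLIF would turn the divisor into a null and the ratio would come back as a null instead of raising an error. The alias Sal_Ratio is carried over from that topic.

```sql
-- Guard an aggregate divisor against division by zero.
SELECT SUM(CASE department_number
             WHEN 401 THEN salary_amount
             ELSE 0
           END)
       / NULLIF(SUM(salary_amount), 0) AS Sal_Ratio
FROM employee;
```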

CASE Page 14-19


COALESCE Function
If NULLIF returns a null for some other value, then think of COALESCE as returning some
other value for a null.

A multi-argument example will be illustrated on the next page.

Page 14-20 CASE


COALESCE Function

COALESCE returns NULL if all its arguments evaluate to null; otherwise, it returns the
value of the first non-null argument in the list.

COALESCE ( expression1, expression2 [ , expression list ] )

If the first argument IS NOT NULL, then return it.
If the first argument IS NULL, return the first non-NULL in the following list, else return NULL.

Example: COALESCE( Col1, Col2, Col3, Col4, 'Positive Literal Response');

Without COALESCE:

SELECT budget_amount
FROM department
WHERE department_number = 600;

budget_amount
-------------
            ?

With COALESCE:

SELECT COALESCE(budget_amount,0)
FROM department
WHERE department_number = 600;

<CASE expression>
------------------
               .00
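
Because a null in an arithmetic expression yields a null, COALESCE is often applied before the arithmetic takes place. A minimal sketch against the department table follows; the 10% increase and the aliases Raw_Increase and New_Budget are made up for illustration.

```sql
SELECT department_number
      ,budget_amount * 1.10               AS Raw_Increase   -- null when budget_amount is null
      ,COALESCE(budget_amount, 0) * 1.10  AS New_Budget     -- zero when budget_amount is null
FROM department;
```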

CASE Page 14-21


COALESCE and Multiple Arguments
The facing page illustrates how to reference multiple arguments using COALESCE. The
example pretty much says it all. One important thing to consider is to end the argument list
with a literal, so that the expression returns a positive response rather than a null when
every other argument is null.

One other item worth mentioning is that CASE (which includes COALESCE and NULLIF) can
potentially project multiple data types into a single projected column. Each value will
occupy the character space appropriate to its data type, with character values left justified
and numbers right justified within their respective space.

Page 14-22 CASE


COALESCE and Multiple Arguments

Prioritize a search for a phone number if first choice is NULL.

Using COALESCE:
SELECT last_name
,COALESCE( office_phone,
cell_phone,
pager_number,
home_phone,
fax_number,
'No Number Found') AS Phone_Number
FROM phone_table;

Using CASE:
SELECT last_name,
CASE WHEN office_phone IS NOT NULL THEN office_phone
WHEN cell_phone IS NOT NULL THEN cell_phone
WHEN pager_number IS NOT NULL THEN pager_number
WHEN home_phone IS NOT NULL THEN home_phone
WHEN fax_number IS NOT NULL THEN fax_number
ELSE 'No Number Found'
END AS Phone_Number
FROM phone_table;

CASE Page 14-23


NULLIF and COALESCE Aggregation Quiz
A quiz is posed on the facing page.

Page 14-24 CASE


NULLIF and COALESCE Aggregation Quiz

Determine the result for each select below given the data in table T1.

T1
C1
---
0
0
200
40
?
?

Request                                    Result
SELECT COUNT(c1) FROM T1;                  4
SELECT SUM(c1) FROM T1;                    240
SELECT AVG(c1) FROM T1;                    60

SELECT COUNT(NULLIF(c1,0)) FROM T1;        2
SELECT SUM(NULLIF(c1,0)) FROM T1;          240
SELECT AVG(NULLIF(c1,0)) FROM T1;          120

SELECT COUNT(COALESCE(c1,0)) FROM T1;      6
SELECT SUM(COALESCE(c1,0)) FROM T1;        240
SELECT AVG(COALESCE(c1,0)) FROM T1;        40
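
To try the quiz on a system, the data could be set up roughly as follows. This is a sketch and not part of the course database: the table is declared MULTISET because T1 holds duplicate rows (two zeros and two nulls), which a SET table does not allow, and the column aliases are illustrative.

```sql
CREATE MULTISET TABLE T1 (c1 INTEGER);

INSERT INTO T1 VALUES (0);
INSERT INTO T1 VALUES (0);
INSERT INTO T1 VALUES (200);
INSERT INTO T1 VALUES (40);
INSERT INTO T1 VALUES (NULL);
INSERT INTO T1 VALUES (NULL);

-- All nine quiz aggregates in a single request.
SELECT COUNT(c1)               AS Cnt_Raw
      ,SUM(c1)                 AS Sum_Raw
      ,AVG(c1)                 AS Avg_Raw
      ,COUNT(NULLIF(c1,0))     AS Cnt_Nullif
      ,SUM(NULLIF(c1,0))       AS Sum_Nullif
      ,AVG(NULLIF(c1,0))       AS Avg_Nullif
      ,COUNT(COALESCE(c1,0))   AS Cnt_Coalesce
      ,SUM(COALESCE(c1,0))     AS Sum_Coalesce
      ,AVG(COALESCE(c1,0))     AS Avg_Coalesce
FROM T1;

-- Remove the practice table when finished.
DROP TABLE T1;
```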

CASE Page 14-25


Module 14: Summary
A summary of this module is discussed.

Page 14-26 CASE


Module 14: Summary

• CASE specifies alternate values for a conditional expression or
expressions based on various equality and TRUTH conditions.

• There are two basic forms for using CASE: Valued and Searched.

• The valued form is based on equality and can only reference a single
column or expression.

• The searched form can reference more than one column or expression
based on other than equality.

• NULLIF is an abbreviated form of CASE that can change a value to null.

• COALESCE is an abbreviated form of CASE that can change a null to
another value.

• CASE can improve query performance when it can replace multiple
passes of a table with a single pass.

CASE Page 14-27


Module 14: Review Questions
Check your understanding of the concepts discussed in this module by completing the review
questions as directed by your instructor.

Page 14-28 CASE


Module 14: Review Questions

True or False:

1. CASE can be used to replace a value with a null.


True
2. NULLIF changes a null to a value if the first argument equals the second
argument.
False – It changes a value to a null
3. COALESCE can reference many arguments.
True
4. CASE, NULLIF, and COALESCE can be referenced in the predicate or the
projection.
True
5. COALESCE can be used with aggregation for including a null into an average.
True – Typically a zero
6. NULLIF can be used with aggregation for removing a non-zero number from an
average.
True – e.g. NULLIF(c1, 1) will return a null for the value 1
7. In a SELECT, COALESCE(NULLIF(C1, 0), 1) will replace a “0” with a “1” for
column C1.
True – It will also replace a null with a 1

CASE Page 14-29


Module 14: Lab Exercise
Check your understanding of the concepts discussed in this module by completing the lab
exercise as directed by your instructor.

Page 14-30 CASE


Module 14: Lab Exercise

1. Display whatever employee information you deem necessary to compare salary
changes for the people in their respective departments as shown in the chart
below.

Where their salary amount is null, make it equal to their job code BEFORE
DOING THE CHANGE.

Department   Change in Salary
301          10%
NULL         use job code for salary
501          20%

2. Display whatever employee information you deem necessary to return the
sums and two averages for department salaries. One average should be that
where a null salary is made into a zero, and another where a zero salary is
made into a null.

Verify that, where different averages occur, the zero-to-null salary averages
should be larger.

CASE Page 14-31


Notes:

Page 14-32 CASE


Module 15

Permanent and Derived Tables

After completing this module, you will be able to:

• Distinguish between SET and MULTISET tables.


• Identify table-level options in a CREATE TABLE.
• Identify column-level options in a CREATE TABLE.
• Identify index-level options in a CREATE TABLE.
• Create and drop secondary indexes on existing tables.
• Distinguish between deleting tables and dropping tables.
• Return help information for a table’s indexes.
• Use permanent tables for temporary use.
• Use derived tables for temporary use.
• Distinguish between the various forms used to define a derived
table.
• Involve derived tables in multiple table joins.

Permanent and Derived Tables Page 15-1


Notes:

Page 15-2 Permanent and Derived Tables


Table of Contents
Data Definition Language .......................................................................................................... 15-4
Set vs. Multiset ........................................................................................................................... 15-6
Column Level Options ............................................................................................................... 15-8
Index Level Options ................................................................................................................. 15-10
Deleting vs. Dropping Tables .................................................................................................. 15-12
Creating and Dropping Secondary Indexes.............................................................................. 15-14
Help Index ................................................................................................................................ 15-16
Using “Real” Tables - “Temporarily” ...................................................................................... 15-18
“Derived” Tables...................................................................................................................... 15-20
Some Derived Table Examples ................................................................................................ 15-22
Complex Derived Table Join ................................................................................................... 15-24
Module 15: Summary............................................................................................................... 15-26
Module 15: Review Questions ................................................................................................. 15-28
Module 15: Lab Exercise ......................................................................................................... 15-30

Permanent and Derived Tables Page 15-3


Data Definition Language
As stated on the facing page, DDL requests are those which require updates to the dictionary. In
those cases, the database must place a “write” lock on a dictionary table. This can be observed in
an EXPLAIN of the request.

While the dictionary is write locked on an object (e.g. a table, a view etc), attempts by the parser
to “resolve” the object for concurrent accesses block on the “write” lock, preventing those
queries from being parsed until the DDL is finished. This is normally an issue with explicit
transaction processing and not with implicit transaction processing.

Page 15-4 Permanent and Derived Tables


Data Definition Language

CREATE TABLE is a DDL request.


Data Definition Language (DDL) is used by SQL to create, modify, and
remove object definitions.

These definitions are stored in the dictionary, found in database DBC.

Changes to dictionary tables require a write lock, and can block the
database's attempts to access this locked information during parsing.

CREATE < SET/MULTISET > TABLE tablename, < Table Level Attributes >
( column name < Column Level Data Types and Attributes >
. . . )
< Primary and Secondary Index Level Attributes >;
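
As an illustration of how the skeleton above might be filled in, consider the following sketch. The table name emp_data_demo and its columns are hypothetical and are not part of the course database; the options used (FALLBACK, NOT CASESPECIFIC, a unique primary index, and a secondary index) are ones that appear elsewhere in this module.

```sql
CREATE MULTISET TABLE emp_data_demo ,FALLBACK      -- table kind and a table-level attribute
( employee_number  INTEGER NOT NULL                -- column name, data type, attribute
 ,last_name        CHAR(20) NOT CASESPECIFIC
 ,salary_amount    DECIMAL(10,2)
)
UNIQUE PRIMARY INDEX (employee_number)             -- primary index
INDEX (last_name);                                 -- non-unique secondary index
```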

Permanent and Derived Tables Page 15-5


Set vs. Multiset
The keywords MULTISET and SET specify whether duplicate rows are allowed or not allowed, respectively.
These keywords appear in the CREATE TABLE as shown in the example.

CREATE [SET | MULTISET] TABLE abc . .

The defaults are SET (in Teradata mode) and MULTISET (in ANSI mode).

The concept of a duplicate row is a very important one in database theory, and case sensitivity
plays an important role in this discussion. Since Teradata is, by default, not case sensitive,
uppercase and lowercase versions of the same character evaluate as being the same character
value. With case specificity, by contrast, uppercase and lowercase versions of the same character
evaluate as being different. The biggest concern between the two, however, is beyond
the scope of this class and has to do with the following question:
“Why allow duplicate rows at all?”

As far as SQL goes, we simply discuss SET and MULTISET concepts and syntax, not strategies.

Page 15-6 Permanent and Derived Tables


SET vs. MULTISET

The default for Teradata Mode tables is “SET”. (No duplicate rows allowed.)
The default for ANSI Mode tables is “MULTISET”. (Duplicate rows allowed.)
A duplicate row is where each and every column value for one row is equal
to its corresponding column value in another row.
For values that are defined as “case sensitive”, uppercase values differ
from corresponding lowercase values for the same character value.
For values defined as “not case sensitive”, equal character values are the
same whether uppercase or lowercase.

Dept#  Last Name  First Name
100    'Smith'    'Mary'
100    'Smith'    'Mary '
100    'smith'    'Mary'
100    'Smith'    'mary'

If both character columns are not case sensitive, which rows would be duplicates?
If both character columns are case sensitive, which rows would be duplicates?
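
To explore these questions on a system, a practice table along the following lines could be used. This is only a sketch: the table name name_demo and its column names are hypothetical, and the outcome of each INSERT is deliberately not stated so that you can check your own answers. Changing NOT CASESPECIFIC to CASESPECIFIC on both character columns gives the second scenario.

```sql
-- A SET table rejects an INSERT that would create a duplicate row.
CREATE SET TABLE name_demo
( dept_number  SMALLINT
 ,last_name    CHAR(10) NOT CASESPECIFIC
 ,first_name   CHAR(10) NOT CASESPECIFIC
);

INSERT INTO name_demo VALUES (100, 'Smith', 'Mary');
INSERT INTO name_demo VALUES (100, 'Smith', 'Mary ');
INSERT INTO name_demo VALUES (100, 'smith', 'Mary');
INSERT INTO name_demo VALUES (100, 'Smith', 'mary');

-- See which rows were accepted.
SELECT * FROM name_demo;

-- Remove the practice table when finished.
DROP TABLE name_demo;
```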

Permanent and Derived Tables Page 15-7


Column Level Options
The column-level options are those that have an effect on a table’s columns as a whole. These
options are better discussed in a course on Physical Design rather than in a SQL course. The
following is a list of column-level options only, and can be obtained by issuing the following
SQL command:

HELP 'SQL CREATE TABLE'


. . . .
COLUMN_DECLARATION IS
cname data_type_declaration [column_attribute [...,
column_attribute]]

COLUMN_ATTRIBUTE IS ONE OF THE FOLLOWING:


COMPRESS [constant | ({NULL | constant} [.. ,{NULL | constant}])
{ UNIQUE }
[CONSTRAINT name]{ PRIMARY KEY }
{ CHECK (boolean_condition) }
{ references option }

TABLE LEVEL OPTION IS ONE OF THE FOLLOWING:


[CONSTRAINT name] [UNIQUE | PRIMARY KEY] (cname[ ..., cname])
[CONSTRAINT name] FOREIGN KEY (cname [ ..., cname])
references_option
[CONSTRAINT name] CHECK (boolean condition)

REFERENCES_OPTION IS
REFERENCES [WITH [NO] CHECK OPTION] tname [(cname [..., cname])]

DATA TYPE ATTRIBUTES FOLLOW:


NOT NULL
UPPERCASE
[ NOT ] CASESPECIFIC
{ FORMAT | TITLE } quotestring
NAMED name
WITH DEFAULT character_data_type
{ number }
{ USER }
DEFAULT { DATE }
{ TIME }
{ NULL }
GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY
[(optional_idcol_parameters)]

Page 15-8 Permanent and Derived Tables


Column Level Options

Columns may be assigned:

• a name
• a column attribute
• a data type
• a data type attribute
• to a constraint

For a more complete list see the left-hand page.

SHOW TABLE Department;

CREATE SET TABLE DLM.department ,FALLBACK ,
NO BEFORE JOURNAL,
NO AFTER JOURNAL,
CHECKSUM = DEFAULT
(
department_number SMALLINT,
department_name CHAR(30) CHARACTER SET LATIN NOT CASESPECIFIC NOT NULL,
budget_amount DECIMAL(10,2),
manager_employee_number INTEGER,
location_name VARCHAR(100)
)
UNIQUE PRIMARY INDEX ( department_number )
UNIQUE INDEX ( department_name );

Callouts on the slide identify the column name, the column data type, and the
column data type attribute (defaulted assignment).

Permanent and Derived Tables Page 15-9


Index Level Options
The index-level options are those that have an effect on the table’s indexes as a whole. These
options are better discussed in a course on Physical Design rather than in a SQL course. The
following is a list of create table index options only, and can be obtained by issuing the following
SQL command:

HELP 'SQL CREATE TABLE'


. . . .
[ [UNIQUE] PRIMARY INDEX [name] [ALL] (cname [ ...,
cname])

[ { partitioning_expression } ]
[ { } ]
[ PARTITION BY { (partitioning_expression, } ]
[ { partitioning_expression } ]
[ { [..., partitioning_expression]) } ]

[ [,] [UNIQUE] INDEX [name] [ALL] (cname [ ... ,cname]) ]


[ ... [,] [UNIQUE] INDEX [name] [ALL] (cname [ ...,
cname]) ] ]

[ [,] INDEX [name] [ALL] (cname [..., cname])


ORDER BY [VALUES | HASH] (cname) ]

Page 15-10 Permanent and Derived Tables


Index Level Options

For indexes:
• Only one primary index allowed per table.
• Up to 32 secondary indexes are allowed per table.
• Up to 64 columns are allowed per index.
• Indexes may be unique or non-unique.

For a more complete list see the left-hand page.

SHOW TABLE Department;

CREATE SET TABLE DLM.department ,FALLBACK ,
NO BEFORE JOURNAL,
NO AFTER JOURNAL,
CHECKSUM = DEFAULT
(
department_number SMALLINT,
department_name CHAR(30) CHARACTER SET LATIN NOT CASESPECIFIC NOT NULL,
budget_amount DECIMAL(10,2),
manager_employee_number INTEGER
)
UNIQUE PRIMARY INDEX ( department_number )      -- Unique Primary Index
UNIQUE INDEX ( department_name )                -- Unique Secondary Index
INDEX ( manager_employee_number);               -- Non-Unique Secondary Index

Permanent and Derived Tables Page 15-11


Deleting vs. Dropping Tables
The three delete commands at the top of the facing page all perform equally fast. They each
remove all of the rows from the table, leaving only the table definition. No transient journaling
is performed on such a delete. Transient journaling provides a log of all of the changes taking
place against a table during a transaction and provides rollback capability for a failed
transaction. This delete performs fast enough that it is unlikely that one could abort it once it
begins. Having said this, if one could abort it in time, a rollback would occur without issue.

The drop command, at the bottom of the facing page, removes not only the rows, but the table
definition as well. It performs slightly slower than the preceding delete commands because of
the additional time that it takes to remove the dictionary entries.

Page 15-12 Permanent and Derived Tables


Deleting vs. Dropping Tables

To remove all data associated with a table, without dropping the
table definition from the Data Dictionary, use the DELETE statement.

Examples (all three are synonymous):

DELETE FROM emp_data ALL;
DELETE FROM emp_data;
DELETE emp_data;

• Deletes all data in emp_data.
• Table definition remains in the Data Dictionary.
• Access rights remain unchanged.

To remove all data associated with a table, as well as the table structure
definition from the Data Dictionary, use the DROP TABLE statement.

Example:

DROP TABLE emp_data;

• Deletes all data in emp_data.
• Removes table headers for emp_data.
• Removes the emp_data definition from the Data Dictionary.
• Removes all explicit access rights on the table.

Permanent and Derived Tables Page 15-13


Creating and Dropping Secondary Indexes
Although secondary indexes may be defined within a CREATE TABLE, generally these are
defined after the table has been created and populated with data. Once they are created,
however, they may be repeatedly dropped (e.g. prior to loading data from a file) and recreated.

Indexes may be provided with a name. This may be useful for dropping an index later, by name.
Some may find it easier simply to replace the “CREATE” keyword (used when creating
the index) with the “DROP” keyword. In this case, if the index is unique, do not provide the
“UNIQUE” keyword. Examples are shown on the facing page.

Page 15-14 Permanent and Derived Tables


Creating and Dropping Secondary
Indexes

Secondary indexes may be created on existing tables.

• They may be defined as unique (USI) or non-unique (NUSI).
• They may be optionally named, or left unnamed.
• They may include up to 64 columns.

Named unique secondary index (USI) on employee name:

CREATE UNIQUE INDEX fullname (last_name, first_name) ON emp_data;
DROP INDEX fullname ON emp_data;                     -- drop by name, or
DROP INDEX (last_name, first_name) ON emp_data;      -- drop by column list

Unnamed, non-unique secondary index (NUSI) on job code:

CREATE INDEX (job_code) ON emp_data;
DROP INDEX (job_code) ON emp_data;

Permanent and Derived Tables Page 15-15


Help Index
The HELP INDEX command allows one to view information about table indexes. Except for the
value for “approximate count”, each piece of information should be fairly straightforward. The
approximate count value is obtained via a sampling of data and represents an approximate count
of the number of distinct values of the sample. Normally this number should be fairly close to
the number of unique values shown via a HELP STATS command. If the two values differ
widely, this could be an indication that the statistics collected are stale (i.e. old) and
need to be refreshed.
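
A sketch of that comparison follows. It assumes the emp_data table from the preceding pages and assumes that statistics were previously collected on the job_code index; the course material does not show that step.

```sql
HELP INDEX emp_data;        -- includes the sampled "Approximate Count" for each index
HELP STATISTICS emp_data;   -- shows the unique values recorded when statistics were last collected

-- If the two figures differ widely, the statistics can be refreshed.
COLLECT STATISTICS ON emp_data INDEX (job_code);
```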

Page 15-16 Permanent and Derived Tables


Help Index

HELP INDEX emp_data;

HELP INDEX shows information on all indexes defined for a table.

The values in the Index Id column correlate to the index numbers referenced
in EXPLAIN text.

Unique?                  Y
Primary//or//Secondary?  P
Column Names             employee_number
Index Id                 1
Approximate Count        0
Index Name               Emp_Key

Unique?                  N
Primary//or//Secondary?  S
Column Names             department_number
Index Id                 4
Approximate Count        0
Index Name               ?

Unique?                  Y
Primary//or//Secondary?  S
Column Names             last_name,first_name
Index Id                 8
Approximate Count        0
Index Name               Full_Name

Unique?                  N
Primary//or//Secondary?  S
Column Names             job_code
Index Id                 12
Approximate Count        0
Index Name               ?

Permanent and Derived Tables Page 15-17


Using “Real” Tables - “Temporarily”
The facing page discusses using permanent tables in an interim or temporary fashion. In the
example we create a real table to store a piece of information (i.e. the average salary for all
employees). We then write a cross join (no equality join condition) that qualifies each
employee whose salary is greater than the value stored in the new table (a single row). Since
the cross join compares each row to a single value (the average salary for all employees), it
becomes trivial: only a single row comparison is done for each employee.

One advantage of this strategy is that the average salary itself can be displayed, which
cannot be accomplished via a subquery. The disadvantage is that you need to create the table
and then drop it after using it. This requires more steps, and it requires DDL, which is less
than ideal.

Page 15-18 Permanent and Derived Tables


Using “Real” Tables - “Temporarily”

• The strategy below can be used to return the result for the business concern shown.
• A subquery can return the same rows, but without projecting the average salary!
• Both queries are cross joins because the join condition contains no equality condition.
• This table will eventually have to be dropped.

Find employees whose salaries are greater than the average salary for all employees.

CREATE TABLE deptsal
( avgsal DEC(10,2) )
UNIQUE PRIMARY INDEX (avgsal);

INSERT INTO deptsal
SELECT AVG(salary_amount)
FROM Employee;

SELECT Last_Name, Salary_Amount, AvgSal
FROM Employee e, DeptSal d
WHERE e.Salary_Amount > d.AvgSal;

SELECT Last_Name, Salary_Amount, AvgSal
FROM Employee e CROSS JOIN DeptSal d
WHERE e.Salary_Amount > d.AvgSal;
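
For comparison, here is a minimal sketch of the subquery alternative, followed by the cleanup step the interim table requires. It assumes the Employee and deptsal tables shown on this page. The subquery form qualifies the same employees, but the average is not available to the SELECT list.

```sql
-- Subquery form: same qualifying rows, but the average itself cannot be projected.
SELECT Last_Name, Salary_Amount
FROM Employee
WHERE Salary_Amount > (SELECT AVG(Salary_Amount)
                       FROM Employee);

-- Cleanup for the interim-table strategy: drop the table once it is no longer needed.
DROP TABLE deptsal;
```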

Permanent and Derived Tables Page 15-19


“Derived” Tables
“Derived” tables are tables that the database creates automatically for the life of a single
query. The database creates the table for us and drops it when it is no longer required. To
do this, we must provide the database with the information necessary to build and use it,
namely: a table name; column names; and a query that determines what to put into it. The
example on the facing page illustrates how to accomplish this. The table is materialized in
spool and dropped when no longer needed.

Page 15-20 Permanent and Derived Tables


“Derived” Tables

Derived tables -
• are “database created” tables that are only available to a single query.
• are discarded by the database when they are no longer required.
• are materialized into spool.
• are referenced and treated as any “real” table.
• must be defined by the author of the query, and require -
o a table name
o columns and their names
o a SELECT that is used to populate the table

SELECT Last_Name,
Salary_Amount,
AvgSal
FROM Employee e,
(SELECT AVG(Salary_Amount)
FROM Employee) AS AvgT (AvgSal)
WHERE e.Salary_Amount > AvgT.AvgSal;

In this query:
• Query to populate AvgT – the SELECT in parentheses
• Table Name – “AvgT”
• Column Name(s) – “AvgSal”

Permanent and Derived Tables Page 15-21


Some Derived Table Examples
There are various forms (or styles) one may use when writing derived tables. The facing page
illustrates the most common forms. Many other variations are possible, all of them based on
those shown.

Page 15-22 Permanent and Derived Tables


Some Derived Table Examples

Each of these queries returns the same result. They each involve a different form of derived
table usage.

SELECT Last_Name,
Salary_Amount,
AvgSal
FROM Employee e,
(SELECT AVG(Salary_Amount)
FROM Employee) AS AvgT (AvgSal)
WHERE e.Salary_Amount > AvgT.AvgSal;

SELECT Last_Name,
Salary_Amount,
AvgSal
FROM Employee e,
(SELECT AVG(Salary_Amount) AS AvgSal
FROM Employee) AS AvgT
WHERE e.Salary_Amount > AvgT.AvgSal;

SELECT Last_Name,
Salary_Amount,
AvgSal
FROM Employee e JOIN
(SELECT AVG(Salary_Amount) AS AvgSal
FROM Employee) AvgT
ON e.Salary_Amount > AvgT.AvgSal;

Permanent and Derived Tables Page 15-23


Complex Derived Table Join
The query on the facing page simply adds another table to the mix. To show the department
name, we must involve the department table. So there are now three tables being joined
together:
• Employee (for getting last name and salary amount)
• Department (for getting department name)
• AvgT (for deriving the average salary for each department).

Each join is a one-to-many join, as needed for correctly obtaining the result.
• Department number is unique in Department (one-to-many to Employee)
• Department number is unique in AvgT (one-to-many to Employee)

Page 15-24 Permanent and Derived Tables


Complex Derived Table Join

Show the department name for those having a salary larger than their department average.

SELECT d.Department_Name, e.Last_Name, e.Salary_Amount, AvgT.AvgSal
FROM Employee e,
Department d,
(SELECT Department_Number, AVG(Salary_Amount) AS AvgSal
FROM Employee
GROUP BY 1) AS AvgT
WHERE e.Department_Number = d.Department_Number
AND e.Department_Number = AvgT.Department_Number
AND e.Salary_Amount > AvgT.AvgSal
ORDER BY 1;

department_name                last_name            salary_amount       AvgSal
------------------------------ -------------------- ------------- ------------
customer support               Brown                     43100.00     35545.83
customer support               Trader                    37850.00     35545.83
customer support               Rogers                    46000.00     35545.83
customer support               Johnson                   36300.00     35545.83
education                      Villegas                  49700.00     38700.00
education                      Brown                     43700.00     38700.00
marketing sales                Wilson                    53625.00     50031.25
marketing sales                Ratzlaff                  54000.00     50031.25
marketing sales                Runyon                    66000.00     50031.25
research and development       Stein                     29450.00     29350.00
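
As one more variation, the same request can be sketched with explicit JOIN ... ON syntax instead of the comma-separated FROM list. This is only an alternative spelling under the same table and column names used above. The join conditions move into ON clauses and the salary comparison stays in WHERE.

```sql
SELECT d.Department_Name, e.Last_Name, e.Salary_Amount, AvgT.AvgSal
FROM Employee e
JOIN Department d
  ON e.Department_Number = d.Department_Number
JOIN (SELECT Department_Number, AVG(Salary_Amount) AS AvgSal
      FROM Employee
      GROUP BY 1) AS AvgT                      -- one row per department
  ON e.Department_Number = AvgT.Department_Number
WHERE e.Salary_Amount > AvgT.AvgSal            -- keep only above-average salaries
ORDER BY 1;
```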

Permanent and Derived Tables Page 15-25


Module 15: Summary
A summary of this module is discussed.

Page 15-26 Permanent and Derived Tables


Module 15: Summary

• Permanent tables use permanent space and must be physically created
and dropped.
• Derived tables are created and dropped by the database.
• Table-level attributes have default values.
• Some column-level attributes are defaulted while most must be
specified.
• Secondary indexes may be defined in the CREATE TABLE statement or
created separately after the table has been created and/or loaded.
• Derived tables are specified in the FROM clause of a SQL request.
• Derived tables must specify a table name and a column name list, and must
tell the database what to load into the table.

Permanent and Derived Tables Page 15-27


Module 15: Review Questions
Check your understanding of the concepts discussed in this module by completing the review
questions as directed by your instructor.

Page 15-28 Permanent and Derived Tables


Module 15: Review Questions

True or False:

1. In a CREATE TABLE, every column must be given a data type.
True
2. A USI cannot be defined as NULL.
False
3. The HELP INDEX command references only a specific table.
True
4. MULTISET tables can be created with a unique index.
True
5. Derived tables need not specify column names.
False – Either after the table name, or inside the query using aliases
6. The DELETE TABLE syntax removes the table definition from the
dictionary.
False – It only removes rows
7. A secondary index must be provided a name.
False

Permanent and Derived Tables Page 15-29


Module 15: Lab Exercise
Check your understanding of the concepts discussed in this module by completing the lab
exercise as directed by your instructor.

Page 15-30 Permanent and Derived Tables


Module 15: Lab Exercise

1. Use a derived table to list those departments whose budgets are
greater than the average budget for all departments.

2. Add to exercise #1 those employees who work in those departments.

3. Modify #1 to add the differences between the department’s budget and
the average.

Permanent and Derived Tables Page 15-31


Notes:

Page 15-32 Permanent and Derived Tables


Module 16

SAMPLE and RANDOM

After completing this module, you will be able to:

• Return a sample as a number of rows or as a percentage of table
rows.
• Discern the difference between sampling with replacement and sampling
without replacement.
• Use SAMPLEID while retrieving a sample.
• Discern the difference between AMP proportional sampling and
sampling using RANDOMIZED ALLOCATION.
• Relate sampling to its place in the order of operations for a request.
• Stratify a sample across various levels.
• Use the RANDOM function to generate random values.
• Use RANDOM to generate test data.

SAMPLE and RANDOM Page 16-1


Notes:

Page 16-2 SAMPLE and RANDOM


Table of Contents
SAMPLE - Introduction ............................................................................................................. 16-4
SAMPLE - Syntax ..................................................................................................................... 16-6
Multiple Samples (Number of Rows) ........................................................................................ 16-8
Multiple Samples (Percentage of Rows) .................................................................................. 16-10
SAMPLE WITH REPLACEMENT ........................................................................................ 16-12
WITH REPLACEMENT (Multiple Samples) ......................................................................... 16-14
Other Considerations................................................................................................................ 16-16
Significance of the Order of Operations .................................................................................. 16-18
Using Derived Tables............................................................................................................... 16-20
Stratified Sampling – What is it? ............................................................................................. 16-22
Stratified Sampling (No Replacement) .................................................................................... 16-24
Stratified Sampling (With Replacement) ................................................................................. 16-26
RANDOMIZED ALLOCATION ............................................................................................ 16-28
The RANDOM Function ......................................................................................................... 16-30
RANDOM and Limitations ...................................................................................................... 16-32
Module 16: Summary............................................................................................................... 16-34
Module 16: Review Questions ................................................................................................. 16-36
Module 16: Lab Exercise ......................................................................................................... 16-38

SAMPLE and RANDOM Page 16-3


SAMPLE - Introduction
The sample feature is used to randomly retrieve a specified amount of data from a table. It is
used when a smaller, more manageable amount of data is preferable to the entire table, for
instance, when a survey of a certain demographic of customers would aid in product
development.

Notice that the facing page carefully states that the samples are considered to be random. It
should be understood that nothing is “truly” random. With this in mind, there are two different
methods the database can use for retrieving samples from tables: AMP proportional allocation
and a more randomized allocation. Either can be specified by the user, as will be shown in this
module.

One may also use the sample feature to retrieve multiple samples within the same projection.
Each sample is associated with a unique sample identification number, generated by the
database, that relates rows to their sample.

Another option is whether or not rows, once chosen for a sample, are returned to the source
and made available for re-sampling (and can therefore occur again within the sample), or
whether they are withheld and hence occur no more than once within the sample.

Page 16-4 SAMPLE and RANDOM


SAMPLE – Introduction

The SAMPLE clause is a Teradata extension to the ANSI SQL-2003 standard.

SAMPLE reduces the number of rows to be considered for further processing by
returning one or more samples of rows specified either as a list of fractions of
the total number of rows or as a list of whole numbers of rows from the SELECT
query.

By default, no replacement of values is performed, so that any row in the sample
will not re-occur in the same sample, nor will it re-occur across the various other
samples for the same query.

Samples are considered to be random.

The degree to which different sample queries produce different results is
dependent on the size of the sample relative to the number of rows in the table.

By default (i.e. this can be altered via SQL), the sample is generated AMP
proportional, so that each AMP is responsible for a proportional fraction of the
rows in the sample.

SAMPLE and RANDOM Page 16-5


SAMPLE - Syntax
The facing page illustrates how to retrieve samples based upon either a number of the table’s
rows or a percentage of the table’s rows. By default:

• The sample is “AMP proportional” – that is, the database seeks to have each AMP
contribute a proportional share of the sample.
• The sample is performed “without replacement” – that is, no row will appear more than
once within the sample.

The fact that the employee numbers are unique reinforces the notion of “no replacement.” No
employee number should appear more than once within the sample.

Page 16-6 SAMPLE and RANDOM


SAMPLE – Syntax

All employee numbers are unique.

By default –
• There is “no replacement” of rows.
• The sample is performed AMP proportionally.

To retrieve a sampled number of rows.

SELECT Employee_Number
FROM Employee
SAMPLE 10
ORDER BY 1;

employee_number
---------------
           1001
           1002
           1003
           1004
           1006
           1011
           1014
           1016
           1019
           1024

To retrieve a sampled percentage of rows.

SELECT Employee_Number
FROM Employee
SAMPLE .25
ORDER BY 1;

employee_number
---------------
           1003
           1004
           1006
           1007
           1011
           1016
           1019

0.25 * 26 = 6.5
Fractional results greater than .4999 generate an added row.
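
The rounding rule can be checked with a small sketch that counts the rows a fractional sample returns. It assumes the 26-row Employee table used throughout this module and relies on the rule, stated later in this module, that SAMPLE may be used inside a derived table. On that table, 0.25 * 26 = 6.5 would be expected to yield 7 sampled rows, as in the output above.

```sql
-- Count the rows returned by a 25% sample.
-- The sample is taken inside the derived table, then counted outside it.
SELECT COUNT(*) AS Sampled_Rows
FROM (SELECT Employee_Number
      FROM Employee
      SAMPLE .25) AS s;
```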

SAMPLE and RANDOM Page 16-7


Multiple Samples (Number of Rows)
You can also request that the database retrieve multiple samples from the same projection. When
projecting such a result, one may reference the keyword SAMPLEID to associate rows with
their respective sample.

The default of “no replacement” is still in effect: no row will appear within the same
sample more than once and, moreover, no row will appear in more than one sample! Think of “no
replacement” as meaning that once a row is selected for a sample, it is not “replaced” back
into the set, so it is not available to be selected again, either by the same sample or by
another sample within the same projection.

The employee numbers being unique reinforces this notion, but so does the fact that we run
out of rows to sample! There are only 26 employees in our table. Since the available rows are
exhausted during the final 10-row sample (only 6 rows are left), the database returns the
remaining 6 rows along with a warning describing the event.

Page 16-8 SAMPLE and RANDOM


Multiple Samples (Number of Rows)

To retrieve a sampled number of rows for multiple samples in the same query. (“SAMPLEID” is an
optional keyword)

“No replacement” means that rows do not repeat within a sample, nor across samples.

After all 26 employees have been sampled, no rows remain for sampling.

SELECT Employee_Number, SAMPLEID
FROM Employee
SAMPLE 10, 10, 10
ORDER BY 2, 1;

*** Warning: 7473 Requested sample is larger than table rows. All rows returned

employee_number    SampleId
--------------- -----------
            801           1
           1002           1
           1005           1
           1011           1
           1014           1
           1015           1
           1016           1
           1018           1
           1019           1
           1021           1
           1003           2
           1004           2
           1006           2
           1007           2
           1008           2
           1010           2
           1012           2
           1013           2
           1020           2
           1025           2
           1001           3
           1009           3
           1017           3
           1022           3
           1023           3
           1024           3

Sample 3 contains only 6 rows.
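
To see how many rows each sample actually received, the sampled result can be wrapped in a derived table and grouped by sample. This is a sketch only: the derived table name s and the column names Emp_Nbr and SampId are illustrative assumptions, and naming the SAMPLEID column through the derived table's column list is assumed to be accepted. With the 26-row table described above, the counts would be expected to be 10, 10 and 6.

```sql
-- The sample is taken inside the derived table; the aggregation runs on its result.
SELECT SampId, COUNT(*) AS Row_Count
FROM (SELECT Employee_Number, SAMPLEID
      FROM Employee
      SAMPLE 10, 10, 10) AS s (Emp_Nbr, SampId)
GROUP BY 1
ORDER BY 1;
```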

SAMPLE and RANDOM Page 16-9


Multiple Samples (Percentage of Rows)
Percentages of a table’s rows may be sampled as well as numbers of rows. Unlike sampling
numbers of rows (where one can specify a number greater than the number of rows in the table),
the percentages requested may not total more than 100%. The database can easily determine
when the total percentage being sampled exceeds 100. For expediency, however, it does not
count the rows in a table beforehand to determine whether the total requested by row-count
samples exceeds the number of rows in the table.

Note also that one cannot mix percentage samples with whole number samples in the same
projection.

Page 16-10 SAMPLE and RANDOM


Multiple Samples (Percentage of Rows)

To retrieve a sampled percentage of rows for multiple samples in the same query.

SELECT Employee_Number, SAMPLEID
FROM Employee
SAMPLE .25, .25, .25
ORDER BY 2, 1;

employee_number    SampleId
--------------- -----------
           1005           1
           1006           1
           1011           1
           1014           1
           1016           1
           1019           1
           1021           1
           1003           2
           1004           2
           1007           2
           1012           2
           1015           2
           1018           2
           1020           2
           1001           3
           1009           3
           1010           3
           1013           3
           1017           3
           1022           3
           1025           3

The following are invalid.

SELECT Employee_Number, SAMPLEID
FROM Employee
SAMPLE .25, .25, .25, .50
ORDER BY 2, 1;

SELECT Employee_Number
FROM Employee
SAMPLE 5, .25
ORDER BY 2, 1;

*** Failure 5473 SAMPLE clause has invalid set of arguments.

SAMPLE and RANDOM Page 16-11


SAMPLE WITH REPLACEMENT
The replacement feature can be used to allow a row, once sampled, to be “replaced” back into
the sampling pool and, hence, be made available for re-sampling.

Note that the default of “AMP proportional” is still in effect.

Page 16-12 SAMPLE and RANDOM


SAMPLE WITH REPLACEMENT

When sampling with replacement, a sampled row, once sampled, is returned to the
sampling pool. As a result, a row might be sampled multiple times.

Recall that employee numbers are unique.

SELECT Employee_Number
FROM Employee
SAMPLE WITH REPLACEMENT 10
ORDER BY 1;

employee_number
---------------
           1002
           1004
           1008
           1010
           1012
           1013
           1013
           1016
           1019
           1025

Recall that department numbers are unique.

SELECT Department_Number
FROM Department
WHERE Department_Number IN (401, 402)
SAMPLE WITH REPLACEMENT 7
ORDER BY 1;

department_number
-----------------
              401
              401
              401
              401
              402
              402
              402

SAMPLE and RANDOM Page 16-13


WITH REPLACEMENT (Multiple Samples)
With respect to multiple samples within the same projection, replacement also means that rows
can appear across samples as well. You can use this capability, in combination with other
functionality not yet discussed, to generate test data. For instance, you could get a sample of a
million rows from the employee table, which has only 26 rows.
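
A minimal sketch of the million-row case mentioned above, assuming the 26-row Employee table. Because sampled rows are returned to the pool, the requested count may exceed the number of rows in the table, and every employee number would be expected to repeat many times.

```sql
-- Request far more rows than the table holds.
-- WITH REPLACEMENT lets the same row be drawn again and again.
SELECT Employee_Number
FROM Employee
SAMPLE WITH REPLACEMENT 1000000;
```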

Page 16-14 SAMPLE and RANDOM


WITH REPLACEMENT (Multiple Samples)

Note the replacement both within and across samples.

SELECT Employee_Number, SAMPLEID
FROM Employee
SAMPLE WITH REPLACEMENT 10, 10
ORDER BY 2, 1;

employee_number    SampleId
--------------- -----------
           1001           1
           1004           1
           1012           1
           1013           1
           1014           1
           1016           1
           1018           1
           1019           1
           1019           1
           1025           1
           1001           2
           1002           2
           1002           2
           1004           2
           1005           2
           1012           2
           1014           2
           1016           2
           1019           2
           1019           2

SAMPLE and RANDOM Page 16-15


Other Considerations
The chart outlining the order of operations of certain features within the life of a single query
will become increasingly important as one learns more about SQL, both in this course and
in the Advanced SQL course. The list is by no means complete (and may never become fully
complete), but that does not diminish its importance.

So far, the chart illustrates that one may write a single SQL request, for a single projection, that
includes:
• Many joins and other WHERE conditions
• Aggregations with or without a HAVING condition
• Samples
• A specified ordering
• Formatting

For all such requests, these features will be performed in the order described in the chart. That
is:
• The WHERE condition restricts the number of qualifying rows that participate
in the activities following it in the list.
• Aggregation is performed on the rows qualified by the WHERE condition.
• HAVING is performed on the aggregate result.
• Samples are obtained from what results from the earlier steps.
• ORDER BY orders the result.
• Formatting is performed on the final spool as it is returned to the user (i.e. the
spool itself is not formatted).

Page 16-16 SAMPLE and RANDOM


Other Considerations

The following rules apply to SAMPLE.

1. No more than 16 samples can be requested per fraction description or
count description.
2. A sampled result set cannot be guaranteed to be repeated.
3. Sampling can be used in a derived table, view (discussed in a later module), or
INSERT-SELECT to reduce the number of rows to be considered for further
computation.
4. You cannot use a SAMPLE clause in a subquery.
5. You cannot specify the SAMPLE clause in a SELECT statement that uses the set
operators UNION, INTERSECT, or MINUS.

Where SAMPLE falls into the order of operations.

1. WHERE {join conditions}
2. AGGREGATION
3. HAVING
4. SAMPLE
5. ORDER BY
6. FORMAT

SAMPLE and RANDOM Page 16-17


Significance of the Order of Operations
The role that the order of operations plays is better illustrated through an example.

The facing page shows how understanding the order of operations for a request can help one
determine why, exactly, a request returns the result that it does. In the example on the facing
page, the result is the same result as for the aggregation! Since the aggregation returns a single
row result, the SAMPLE, which follows in the order of operations, only has one row from which
to sample. As we have learned earlier in this module, requesting a 10 row sample from a one
row set (the aggregation result) results in a warning.

Page 16-18 SAMPLE and RANDOM


Significance of the Order of Operations

Retrieve a sample of 10 employees, and sum their salary amounts.

SELECT SUM(salary_amount)
FROM Employee
SAMPLE 10;

Sum(salary_amount)
------------------
        1134307.50

*** Warning: 7473 Requested sample is larger than table rows. All rows returned

Order of operations:

1. WHERE {join conditions}
2. AGGREGATION
3. HAVING
4. SAMPLE
5. ORDER BY
6. FORMAT

According to the order of operations:
• The sum retrieved 1134307.50 (single row sum of all salary amounts)
• SAMPLE only had one row to sample (row containing 1134307.50)
• A warning resulted to indicate this happened!

SAMPLE and RANDOM Page 16-19


Using Derived Tables
Derived tables are very effective in changing the order of operations for a request.

Here we see a derived table used to force a sample to be taken before an aggregation is
performed.
• The derived table result must be obtained first, to create the desired set of data (the sample).
• The aggregation is then performed on the result “derived” by the “derived table.”

Each time one submits the request to the database, a (potentially) different result occurs.

Page 16-20 SAMPLE and RANDOM


Using Derived Tables

Use a derived table to alter the order of operations like this:

Retrieve a sample of 10 employees, and sum their salary amounts.

SELECT SUM(salary_amount)
FROM
(SELECT salary_amount
FROM employee
SAMPLE 10) temp;

Order of operations:

1. WHERE {join conditions}
2. AGGREGATION
3. HAVING
4. SAMPLE
5. ORDER BY
6. FORMAT

First run:

Sum(salary_amount)
------------------
         409845.00

Second run:

Sum(salary_amount)
------------------
         431615.00

Third run:

Sum(salary_amount)
------------------
         453765.00

Each run takes a sample of 10 employees and sums their salary amounts.

SAMPLE and RANDOM Page 16-21


Stratified Sampling – What is it?
Stratified sampling is the ability to arrange data into levels, or strata (as in a hierarchy), in
order to draw samples according to those defined strata. The facing page gives a definition;
further discussion is deferred to the examples on subsequent pages.

Page 16-22 SAMPLE and RANDOM


Stratified Sampling – What is it?

Stratified random sampling is sometimes called proportional or quota
random sampling.

It is a sampling method that divides a heterogeneous population of
interest into homogeneous subgroups, or strata, and then takes a random
sample from each of those subgroups.

The result of this homogeneous stratification of the population is a
stratified random sample that represents, not only the overall population,
but also key subgroups.

Example:
A retail application might divide a customer population into subgroups
composed of customers who pay for their purchases with cash, those who
pay by check, and those who buy on credit.

SAMPLE and RANDOM Page 16-23


Stratified Sampling (No Replacement)
The structure of the SAMPLE phrase shown on the facing page should already be somewhat
familiar to the reader: it follows the same structure as the CASE construct. “No
replacement” and “AMP proportional” are still the defaults. The order of the WHEN
conditions is as important here as it is for CASE, in that the first WHEN that
evaluates true for a row terminates further evaluation. No replacement is performed, either
within or across levels. Sample Ids are generated from left to right and then top to bottom.

Page 16-24 SAMPLE and RANDOM


Stratified Sampling (No Replacement)

SELECT Last_Name,
Department_Number AS Dept#,
SAMPLEID
FROM employee
SAMPLE
WHEN department_number < 401 THEN 2, 2, 2
WHEN department_number < 501 THEN 3, 1
ELSE 2, 2
END
ORDER BY 3, 1, 2;

last_name    Dept#  SampleId
-----------  -----  --------
Short          201         1
Stein          301         1
Kanieski       301         2
Trainer        100         2
Kubic          301         3
Morrissey      201         3
Crane          402         4
Daly           402         4
Phillips       401         4
Hopkins        403         5
Rabbit         501         6
Runyon         501         6
Ratzlaff       501         7
Wilson         501         7

The reference to “ELSE” is optional.

Note that replacement cannot occur:
• Within any level across samples.
e.g. across 1, 2 and 3
• Across levels.
e.g. No crossing level 1 (samples 1, 2 and 3) with level 2 (samples 4 and 5) or level 3
(samples 6 and 7)

SAMPLE and RANDOM Page 16-25


Stratified Sampling (With Replacement)
The facing page illustrates the SAMPLE WITH REPLACEMENT option. Replacement can
be used with any sample construct. Notice, in the example on the facing page, that replacement
occurs both within samples and across samples within a stratified level. Due to the nature
of stratification, row replacement cannot occur across stratified levels, since satisfying an earlier
WHEN condition precludes the same row from satisfying a subsequent WHEN condition.

Page 16-26 SAMPLE and RANDOM


Stratified Sampling (With Replacement)

SELECT Last_Name,
Department_Number AS Dept#,
SAMPLEID
FROM Employee
SAMPLE WITH REPLACEMENT
WHEN Dept# < 402 THEN .25, .25
WHEN Dept# < 501 THEN .50, .50
ELSE .25
END
ORDER BY 3, 1, 2;

last_name    Dept#  SampleId
-----------  -----  --------
Hoover         401         1
Johnson        401         1
Phillips       401         1
Stein          301         1
Hoover         401         2
Johnson        401         2
Phillips       401         2
Stein          301         2
Brown          403         3
Crane          402         3
Hopkins        403         3
Lombardo       403         3
Brown          403         4
Crane          402         4
Hopkins        403         4
Lombardo       403         4
Ratzlaff       501         5

Note that replacement can occur:
• Within any level across samples.
e.g. across 1 and 2

Note that replacement cannot occur:
• Across levels.
e.g. No crossing level 1 (samples 1 and 2) with level 2 (samples 3 and 4) or level 3
(sample 5)

SAMPLE and RANDOM Page 16-27


RANDOMIZED ALLOCATION
Randomized allocation means that the requested rows are allocated across the AMPs by
simulating random sampling. This is a slow process, especially for large sample sizes, but it
provides a simple random sample for the system as a whole.

The default row allocation method is proportional. This means that the requested rows are
allocated across the AMPs as a function of the number of rows on each AMP. This method is
much faster than randomized allocation, especially for large sample sizes. Because proportional
allocation does not include all possible sample sets, the resulting sample is not a simple random
sample, but it has sufficient randomness to suit the needs of most applications.

Note that simple random sampling, meaning that each element in the population has an equal and
independent probability of being sampled, is employed for each AMP in the system regardless of
the specified allocation method.

One way to decide on the appropriate allocation method for your application is to determine
whether it is acceptable to “stratify” (not to be confused with “stratified” sampling of data rows)
the sampling input across the AMPs to achieve the corresponding performance gain, or whether
you need to consider the table as a whole.
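
The two allocation methods differ only in the keywords that follow SAMPLE. The sketch below, assuming the Employee table used throughout this module, requests the same 10-row sample both ways. Neither request uses replacement, and the rows returned will generally differ from run to run.

```sql
-- Default: AMP proportional allocation (faster, per-AMP share of the sample).
SELECT Employee_Number
FROM Employee
SAMPLE 10
ORDER BY 1;

-- Randomized allocation: slower, but a simple random sample of the whole table.
SELECT Employee_Number
FROM Employee
SAMPLE RANDOMIZED ALLOCATION 10
ORDER BY 1;
```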

Page 16-28 SAMPLE and RANDOM


RANDOMIZED ALLOCATION

Randomized allocation means that the requested rows are allocated across the
AMPs by simulating random sampling (without regard for AMP proportionality).

The result is not discernibly different than that for the default.

SELECT Employee_Number, SAMPLEID
FROM Employee
SAMPLE WITH REPLACEMENT RANDOMIZED ALLOCATION 4, 4
ORDER BY 2, 1;

employee_number SampleId
--------------- -----------
1002 1
1010 1
1012 1
1014 1
1001 2
1005 2
1012 2
1012 2

SAMPLE and RANDOM Page 16-29


The RANDOM Function
In addition to the statements on the facing page, the following rules and restrictions apply to the
use of the RANDOM function.

• RANDOM can only be called in one of the following SELECT query clauses:
o WHERE
(Using RANDOM in a WHERE condition is much like requesting a sampled percentage. Due
to the nature of the RANDOM function, however, it cannot guarantee the requested
percentage, e.g. using the following to obtain an approximately 66% sample.)

WHERE RANDOM(1, 3) < 3;

o GROUP BY
e.g. like this only:

SEL SUM(salary_amount)
FROM employee
GROUP BY RANDOM(1, 6);

o ORDER BY
e.g. like this only:

SEL last_name
FROM employee
ORDER BY RANDOM(1,6);

o HAVING
e.g. like this only:

SEL SUM(salary_amount)
FROM employee
HAVING RANDOM(1, 6) > 2;

• RANDOM cannot be referenced by position in a GROUP BY or ORDER BY clause.
• RANDOM cannot be nested inside aggregate or ordered analytical functions.
• RANDOM cannot be used in the expression list of an INSERT statement to create a
primary index or partitioning column value.
For example:

INSERT t1 (RANDOM(1,10),...)

Here, RANDOM causes an error to be reported if the first column in the table is a primary
index or partitioning column.
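
The 66% figure in the WHERE example follows from RANDOM(1, 3) returning 1, 2 or 3: two of the three outcomes are less than 3, so each row has roughly a two-in-three chance of qualifying. The sketch below contrasts that with a fractional SAMPLE on the Employee table used in this module. The row count of the first query is expected to vary from run to run, while the second requests a fixed fraction.

```sql
-- Each row is kept when RANDOM returns 1 or 2 (about 2 chances in 3),
-- so the number of rows returned is not guaranteed.
SELECT Last_Name
FROM Employee
WHERE RANDOM(1, 3) < 3;

-- SAMPLE, by contrast, requests a fixed fraction of the rows.
SELECT Last_Name
FROM Employee
SAMPLE .66;
```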

Page 16-30 SAMPLE and RANDOM


The RANDOM Function

The RANDOM function generates a random integer within a specified range.

Both limits must be specified and both must be of data type integer.

RANDOM is a Teradata extension to the ANSI SQL-2003 standard.

Assign a random number between 1 and 9 to each department.

SELECT department_number
,RANDOM(1,9)
FROM department;

department_number Random(1,9)
----------------- -----------
              403           9
              600           5
              402           1
              201           3
              100           3
              302           5
              301           6
              501           5
              401           3

It is possible for random numbers to repeat.

The RANDOM function is evaluated anew for each row processed, thus duplicate
random values are possible.
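
As a small illustration of generating test data, RANDOM can supply a made-up value for each row of an existing table. In this sketch the column alias Test_Salary and the range 30000 to 60000 are illustrative assumptions, not values from the course database.

```sql
-- One random whole-number test value per employee row.
-- RANDOM is evaluated per row, so values may repeat across rows.
SELECT Employee_Number,
       RANDOM(30000, 60000) AS Test_Salary
FROM Employee;
```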

SAMPLE and RANDOM Page 16-31


RANDOM and Limitations
The limitations of the RANDOM function are stated on the facing page.

Page 16-32 SAMPLE and RANDOM


RANDOM and Limitations

• RANDOM is a Teradata extension to the ANSI standard.

• It supports integers between -2147483648 and +2147483647 only.

• RANDOM may be used in a SELECT list or a WHERE clause, but not both.

• RANDOM may be used in Updating, Inserting or Deleting rows.

• RANDOM may not be used to generate a primary index value.

• RANDOM may not be used with aggregate or OLAP functions.


(e.g.  SELECT SUM(RANDOM(1, 9)); is invalid.)
(e.g.  SELECT SUM(Salary_Amount) + RANDOM(1, 9); is valid.)

• RANDOM cannot be referenced by numeric position in a GROUP BY or ORDER BY
clause.

SAMPLE and RANDOM Page 16-33


Module 16: Summary
A summary of this module is discussed.

Page 16-34 SAMPLE and RANDOM


Module 16: Summary

• The SAMPLE feature can be used to “randomly” return rows for various
business purposes.

• You can choose between AMP proportional or RANDOMIZED ALLOCATION.

• You can choose between replacement or no replacement of sampled rows.

• You can use SAMPLEID to associate rows to their sample.

• You can “stratify” the sample across many different levels.

• You can sample numbers of rows or as a percentage of the table.

• The RANDOM function can be used to derive a random value.

• The RANDOM function can be used to generate test data.

SAMPLE and RANDOM Page 16-35


Module 16: Review Questions
Check your understanding of the concepts discussed in this module by completing the review
questions as directed by your instructor.

Page 16-36 SAMPLE and RANDOM


Module 16: Review Questions

True or False:

1. By default, sampling is considered random.
True
2. The RANDOM function is ANSI standard.
False – it is a Teradata extension to the ANSI standard.
3. You can perform a sample on an aggregated result.
True – by using a derived table
4. A subquery can perform a sample.
False
5. Derived tables can perform a sample.
True
6. By default, rows can appear more than once within a sample.
False – the default is “no replacement”
7. By default, random values, using the RANDOM function, are
“replaced”.
True – any value can be returned more than once.

SAMPLE and RANDOM Page 16-37


Module 16: Lab Exercise
Check your understanding of the concepts discussed in this module by completing the lab
exercise as directed by your instructor.

Page 16-38 SAMPLE and RANDOM


Module 16: Lab Exercise

1. Return two 15-row samples from the employee table. Anything less
than 15 rows per sample is unacceptable. Project last and first names
plus hire dates and birth dates.

2. Return two 50% row samples of employees for each of departments
401 and 501.

3. Using the same sampling as #2, SUM the salaries for each sample.

SAMPLE and RANDOM Page 16-39




Module 17

TOP N

After completing this module, you will be able to:

• Use TOP N to retrieve the top “N” values according to a sorting
order.
• Use TOP N to retrieve the top “N” percent according to a sorting
order.
• Determine what occurs when no sorting order is specified.
• Use the WITH TIES option to return all rows with tied sorting order
values.
• Compare TOP N with SAMPLE for returning rows more quickly
when unordered.

TOP N Page 17-1




Table of Contents
TOP N Defined .......................................................................................................................... 17-4
TOP N Limitations ..................................................................................................................... 17-6
TOP N Example ......................................................................................................................... 17-8
TOP N WITH TIES ................................................................................................................. 17-10
Without Ties – Same Result..................................................................................................... 17-12
WITH TIES – Same Result ...................................................................................................... 17-14
Getting Bottom Results ............................................................................................................ 17-16
Bottom Results – WITH TIES ................................................................................................. 17-18
Unordered Rows ...................................................................................................................... 17-20
The PERCENT Option............................................................................................................. 17-22
PERCENT Option – WITH TIES ............................................................................................ 17-24
PERCENT Option and Millions of Rows ................................................................................ 17-26
Module 17: Summary............................................................................................................... 17-28
Module 17: Review Questions ................................................................................................. 17-30
Module 17: Lab Exercise ......................................................................................................... 17-32

TOP N Page 17-3


TOP N Defined
The general syntax form for the “TOP N” feature is provided on the facing page. Limitations are
listed on the next page.

Page 17-4 TOP N


TOP N Defined

“TOP N” is a Teradata extension to the ANSI SQL-2003 standard.

Syntax:
TOP { [ INTEGER | DECIMAL ] } [ PERCENT ] [ WITH TIES ]

• TOP N is mutually exclusive with SAMPLE.
• It can be used with ORDER BY to produce ranked rows, or without it to
return unranked rows.
• It can be used to replace RANK (Advanced SQL) or SAMPLE, but, unlike
SAMPLE, results are not randomly generated.

Order of operations:
1. WHERE {join conditions}
2. AGGREGATION
3. HAVING
4. SAMPLE / TOP N
5. ORDER BY
6. FORMAT

TOP N Page 17-5


TOP N Limitations
The limitations for the “TOP N” feature listed on the facing page are repeated below. For an
explanation of the WITH/BY syntax, see Appendix A.

The following options cannot appear in a SELECT statement that specifies the TOP N operator:
• DISTINCT option
• QUALIFY clause
• SAMPLE clause
• WITH clause (WITH/BY – see Appendix A)
• ORDER BY clause where the sort expression is an ordered analytical function
• Sub-selects of set operations

You cannot specify the TOP N operator in any of the following SQL statements or statement
components:
• Correlated subquery
• Subquery in a search condition
• CREATE JOIN INDEX statement
• CREATE HASH INDEX statement
• Seed statement or recursive statement in a CREATE RECURSIVE VIEW statement
or WITH RECURSIVE clause

Page 17-6 TOP N


TOP N Limitations

The following options cannot appear in a SELECT statement that specifies the TOP
option:
• DISTINCT option
• QUALIFY clause
• SAMPLE clause
• WITH clause (See appendix)
• ORDER BY clause where the sort expression is an ordered analytical
function
• Sub-selects of set operations

The TOP option cannot appear in the following:


• Correlated subquery
• Subquery in a search condition
• CREATE JOIN INDEX statement (“Physical Database Tuning” class)
• CREATE HASH INDEX statement (“Physical Database Tuning” class)
• Seed statement or recursive statement in a CREATE RECURSIVE VIEW
statement or WITH RECURSIVE clause (“Advanced SQL” class)

TOP N Page 17-7


TOP N Example
Although the example on the facing page appears to be a ranking, do not confuse the
“TOP N” feature with the RANK function, which is taught in the Advanced SQL course. TOP N
can return similar results, but it does not return a ranking value, and it treats tied values
very differently. Also, RANK is considered a “Window Aggregate” function and is ANSI
standard, while TOP N is neither. Another difference is that “TOP N” has a PERCENT option that
RANK does not. The PERCENT option will be discussed later in this module.

The QUALIFY clause with the RANK or ROW_NUMBER ordered analytical functions (from
the Advanced SQL class) returns the same results as the TOP N operator.

The following is an excerpt from the SQL reference manual. It is placed here for reference only,
and only to show another SQL construct capable of providing similar results.

The following two requests have the same semantics:

SELECT TOP 10 *
FROM sales ORDER BY county;

SELECT *
FROM sales QUALIFY ROW_NUMBER() OVER (ORDER BY COUNTY) <= 10;

Similarly, the following two requests have the same semantics:

SELECT TOP 10 WITH TIES *
FROM sales ORDER BY county;

SELECT *
FROM sales QUALIFY RANK() OVER (ORDER BY county) <= 10;

For best performance, use the TOP option instead of the QUALIFY clause with RANK or
ROW_NUMBER. In best-case scenarios, the TOP option provides better performance; in worst-
case scenarios, the TOP option provides equivalent performance.

Page 17-8 TOP N


TOP N Example

Show the top five budget amounts.

SELECT TOP 5
department_number
, budget_amount
FROM department
ORDER BY 2 DESC;

• ORDER BY defines the sequencing of the result set.
• It therefore defines the ranking criteria.
• To get the TOP highest amounts, you must use ORDER BY with DESC.
• In TOP N, N is an integer up to 18 digits in length.

department_number budget_amount
----------------- -------------
401 982300.00
403 932000.00
301 465600.00
100 400000.00
501 308000.00

TOP N Page 17-9


TOP N WITH TIES
The TOP N feature uses the facilities of the same ORDER BY as taught previously. Other than
FORMAT, “TOP N” is the very last feature performed in a query. “TOP N” is also mutually
exclusive with SAMPLE. That is, within the same projection, one may use either SAMPLE or
TOP N, but not both.

The WITH TIES option only applies to those values that are tied at the bottom of the list. This is
true whether the order is ascending or descending. In our example it can be deduced that there
are no more values of $308,000.00 because the WITH TIES option will return all values at the
bottom that are the same. Without WITH TIES, one can never be certain that no more such
values exist.

Page 17-10 TOP N


TOP N WITH TIES

Show the top five budget amounts, allowing for ties.

SELECT TOP 5 WITH TIES
department_number
,budget_amount
FROM department
ORDER BY 2 DESC;

department_number budget_amount
----------------- -------------
401 982300.00
403 932000.00
301 465600.00
100 400000.00
501 308000.00
402 308000.00

• Even though TOP 5 is specified, six rows are returned.
• Because there is a tie for the fifth position, both rows are returned.
• This only occurs when WITH TIES is specified.
• WITH TIES returns multiple tied rows when there is a tie for the 'last' position.
• It will return all rows containing the 'tied' value, but it will only count it as one
row.
• Tied rows which are not in the last position, are each counted separately toward
the N total.
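
The tie rules above can be modelled in a few lines of Python. This is a sketch of the described behavior, not of how the database implements it. The function name top_n is hypothetical, and the budget amounts for departments 201, 302, and 600 are invented for illustration; only the six largest amounts come from the example.

```python
def top_n(rows, n, key, descending=True, with_ties=False):
    """Return the first n rows in sort order.

    With with_ties=True, rows whose sort value equals that of the
    n-th row are returned as well.
    """
    ordered = sorted(rows, key=key, reverse=descending)
    result = ordered[:n]
    if with_ties and result:
        last_value = key(result[-1])
        for row in ordered[n:]:
            if key(row) != last_value:
                break
            result.append(row)
    return result

# (department_number, budget_amount)
BUDGETS = [
    (100, 400000.00), (201, 293800.00),   # 201: hypothetical amount
    (301, 465600.00), (302, 226000.00),   # 302: hypothetical amount
    (401, 982300.00), (403, 932000.00),
    (501, 308000.00), (402, 308000.00),
    (600, 150000.00),                     # 600: hypothetical amount
]

def budget(row):
    return row[1]

plain = top_n(BUDGETS, 5, key=budget)
tied = top_n(BUDGETS, 5, key=budget, with_ties=True)
print(len(plain), len(tied))
```

In this model, a tie that falls above the last position changes nothing, because each of those rows already counts toward N.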

TOP N Page 17-11


Without Ties – Same Result
From the previous result, we know that no other rows share the bottom value. This query
does not use the WITH TIES option, so without that prior knowledge we would not be able to
tell whether more budget amounts share the same value.

Page 17-12 TOP N


Without Ties – Same Result

Show the top six budget amounts.

SELECT TOP 6
department_number
,budget_amount
FROM department
ORDER BY 2 DESC;

department_number budget_amount
----------------- -------------
401 982300.00
403 932000.00
301 465600.00
100 400000.00
501 308000.00
402 308000.00

• This is the same output as the example using TOP 5 WITH TIES.
• Each is counted as a separate row for the top six.
• By default, rows with the same amount are counted as separate rows toward
the total.
• If the WITH TIES option had been specified in this example, how would the
result set differ?

TOP N Page 17-13


WITH TIES – Same Result
Lastly, note that this example produces the very same result set as the previous two examples,
TOP 5 WITH TIES and TOP 6. (TOP 5 without WITH TIES returned only five rows.)

Page 17-14 TOP N


WITH TIES – Same Result

Get the top six budget amounts with ties.

SELECT TOP 6 WITH TIES
department_number
,budget_amount
FROM department
ORDER BY 2 DESC;

department_number budget_amount
----------------- -------------
401 982300.00
403 932000.00
301 465600.00
100 400000.00
501 308000.00
402 308000.00

• There is no difference between this example and the previous example.
• WITH TIES has no effect because the top six rows are requested.
• If a seventh row existed with the same amount as the sixth, it would be returned.
• Note that the same result is returned using either TOP 5 or 6 if ties are requested.

TOP N Page 17-15


Getting Bottom Results
You can obtain a list of the lowest values by changing the ORDER BY to ascending.
Whether ascending or descending, the very same rules apply with respect to WITH TIES or the
default where ties are ignored. Here, ties are ignored, so we cannot be certain whether the last
value returned occurs more than once.

Page 17-16 TOP N


Getting Bottom Results

Show the bottom three employees by salary.

SELECT TOP 3
employee_number
,salary_amount
FROM employee
ORDER BY salary_amount ASC;

employee_number salary_amount
--------------- -------------
1014 24500.00
1013 24500.00
1001 25525.00

• ORDER BY ASC reverses the ranking sequence and shows the bottom rankings.
• Two rows with the same salary are treated as two rows of output.

TOP N Page 17-17


Bottom Results – WITH TIES
As before, with ORDER BY DESC and WITH TIES, we can see that no more salary amounts
share the value of $25,525.00.

Page 17-18 TOP N


Bottom Results – WITH TIES

Get the three lowest salaried employees, allowing for ties.

SELECT TOP 3 WITH TIES
employee_number
,salary_amount
FROM employee
ORDER BY salary_amount ASC;

employee_number salary_amount
--------------- -------------
1014 24500.00
1013 24500.00
1001 25525.00

• The WITH TIES option has no effect.
• Ties must occur in the last row returned to have any effect.
• If another row tied with the last row (25525.00), a fourth row would be
returned.

TOP N Page 17-19


Unordered Rows
When no ORDER BY is referenced, TOP N returns the requested number of rows or percentage
of rows very quickly because, unlike SAMPLE, it is not a random process. SAMPLE may take
minutes to return only 1 or 2 rows from a very large table, whereas “TOP N” can retrieve them in
about 2 seconds. It goes to the first data block it finds and gets the first 2 rows it finds from
that data block. Since this process is repeated for each submission of the request, it usually
returns the very same results each time, although, as the facing page notes, this is not guaranteed.

Page 17-20 TOP N


Unordered Rows

The TOP N function can be used effectively to quickly select rows from a table.

Select two rows from the employee table.

SELECT TOP 2
employee_number
,salary_amount
FROM employee;

employee_number salary_amount
--------------- -------------
1008 29250.00
1018 54000.00

• No ORDER BY clause means rows are returned without regard to ranking.
• Different rows could be returned with another execution of this query.
• This is similar to using the SAMPLE function.
• The SAMPLE function, however, produces a more truly randomized result set.
• WITH TIES, if specified, is ignored. (doesn’t fail the query)
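
The contrast between an unordered TOP N and SAMPLE can be pictured with a small Python model. It is only an analogy: the storage order of the list, and the names top_n_unordered and sample_rows, are assumptions made for illustration, and the real database decides which rows it reaches first.

```python
import random
from itertools import islice

# (employee_number, salary_amount) pairs that appear in this module's examples.
# The order of this list stands in for storage order and is an assumption.
EMPLOYEES = [
    (1008, 29250.00), (1018, 54000.00), (1014, 24500.00), (1013, 24500.00),
    (1001, 25525.00), (801, 100000.00), (1017, 66000.00), (1019, 57700.00),
]

def top_n_unordered(rows, n):
    """Stop as soon as n rows have been read; no sorting, no randomness."""
    return list(islice(rows, n))

def sample_rows(rows, n, seed=None):
    """Draw n distinct rows at random (sampling without replacement)."""
    return random.Random(seed).sample(list(rows), n)

print(top_n_unordered(EMPLOYEES, 2))
print(sample_rows(EMPLOYEES, 2))
```

The first function never looks past the rows it needs, which is why it is fast. The second must be able to pick any row, which is why it costs more but gives a randomized result.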

TOP N Page 17-21


The PERCENT Option
A percent option may be used with “TOP N”. The percentage value is expressed as a decimal
fraction between 0 and 100. When used with an ORDER BY, it returns the top so many percent
according to the order specified.

Page 17-22 TOP N


The PERCENT Option

The TOP N function can also produce a percentage of rows in addition to an
absolute number of rows.

Return employees whose salaries represent the top ten percent.

SELECT TOP 10 PERCENT
employee_number
,salary_amount
FROM employee
ORDER BY salary_amount DESC;

employee_number salary_amount
--------------- -------------
801 100000.00
1017 66000.00
1019 57700.00

• 10% of 26 rows is 2.6 rows rounded to 3.
• PERCENT must be a number between 0 and 100.
• At least one row is always returned (if there is at least one row in the table).
• A percentage resulting in a fractional number of rows is always rounded up:
  • 10% of 6 rows = .6 rows = 1 row output
  • 20% of 6 rows = 1.2 rows = 2 rows output
  • 30% of 6 rows = 1.8 rows = 2 rows output
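
The rounding rule above amounts to a ceiling calculation. The Python sketch below reproduces the arithmetic only; the function name rows_for_percent is hypothetical, and exact fractions are used so that values such as 0.01 are not affected by floating-point error.

```python
import math
from fractions import Fraction

def rows_for_percent(total_rows, percent):
    """Rows returned by TOP <percent> PERCENT under the rules stated above.

    A fractional row count is rounded up, and at least one row is
    returned whenever the table is not empty.
    """
    if total_rows == 0:
        return 0
    exact = Fraction(str(percent)) * total_rows / 100
    return max(1, math.ceil(exact))

for total, pct in [(26, 10), (6, 10), (6, 20), (6, 30)]:
    print(f"{pct}% of {total} rows -> {rows_for_percent(total, pct)}")
```

The same function shows why a small decimal percentage is useful on a very large table: 1 percent of 30 million rows is 300,000 rows, while 0.01 percent is 3,000.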

TOP N Page 17-23


PERCENT Option – WITH TIES
The WITH TIES option works the same way with percents as it does with whole numbers.

Page 17-24 TOP N


PERCENT Option – WITH TIES

Get the top 45 percent of department budgets and allow for ties.

SELECT TOP 45 PERCENT WITH TIES
department_number
, budget_amount
FROM department
ORDER BY 2 DESC;

department_number budget_amount
----------------- -------------
401 982300.00
403 932000.00
301 465600.00
100 400000.00
402 308000.00
501 308000.00

• 45% of 9 rows is 4.05 rows, rounded up to 5.
• Because there is a tie for the fifth position, six rows are returned.
• Omitting ORDER BY works the same way with PERCENT as without it.
  • WITH TIES is also ignored.

TOP N Page 17-25


PERCENT Option and Millions of Rows
If the table contains a very large number of rows and “TOP N PERCENT” is being used, the
facing page illustrates how to use this feature without retrieving many thousands of rows. The
key is to remember that the percentage may be a decimal fraction between 0.0000 and 100.0000.

Page 17-26 TOP N


PERCENT Option and Millions of Rows

Recall that a percent may be any integer or decimal number between 0 and 100.

SELECT TOP 1 PERCENT
department_number
, budget_amount
FROM department
ORDER BY 2 DESC;

1% of 30 million rows would be 300,000 rows.

SELECT TOP 0.01 PERCENT
department_number
, budget_amount
FROM department
ORDER BY 2 DESC;

0.01% of 30 million rows would be 3000 rows.

TOP N Page 17-27


Module 17: Summary
A summary of this module is discussed.

Page 17-28 TOP N


Module 17: Summary

• TOP N can be used to return the top values by number or percentage
according to a specified order.

• TOP N can be used to quickly return a number or percentage of rows
without regard to a specified order.

• The WITH TIES option can be used to return all rows at the bottom of an
ordered list that share the same value.

• Without a specified order, the TOP N feature is not considered to be
random as is SAMPLE.

TOP N Page 17-29


Module 17: Review Questions
Check your understanding of the concepts discussed in this module by completing the review
questions as directed by your instructor.

Page 17-30 TOP N


Module 17: Review Questions

True or False:

1. Values may be replaced when using TOP N.
False - don’t confuse this option with SAMPLE
2. TOP N may be referenced in the same projection as SAMPLE.
False
3. TOP N may be referenced within a subquery.
False
4. The WITH TIES option will return only a specified number of rows.
False – it can return a percentage as well
5. ORDER BY returns the same thing as a ranked result.
False
6. The WITH TIES option is invalid if no ORDER BY clause is included in the
query.
True - but it is ignored
7. ORDER BY can be referenced along with the PERCENT option.
True

TOP N Page 17-31


Module 17: Lab Exercise
Check your understanding of the concepts discussed in this module by completing the lab
exercise questions as directed by your instructor.

Page 17-32 TOP N


Module 17: Lab Exercise

1. List the top 5 salary amount VALUES in the employee table along with
the last name and first name of each employee.

2. To see if it can be used on character data values, list the top 3
department names from the department table by VALUE (in descending
order).

3. Retrieve half of the job descriptions that have “manager” in the
description using the TOP N feature. Verify the result by doing a
COUNT(*) from job.

4. Add employee names to those descriptions for #3.

5. Retrieve the top 3 department salary sums by VALUE - descending.

TOP N Page 17-33




Module 18

Window Aggregates – Part 1

After completing this module, you will be able to:

• Use Window Aggregate functions to perform GROUP functions.


• Explain how the term “OVER” is used.
• Use PARTITION to group results into windows by one or more
columns.
• Explain what the default of “ROWS BETWEEN UNBOUNDED
PRECEDING AND UNBOUNDED FOLLOWING” refers to.
• Determine the correct order of the terms PARTITION, ORDER, and
ROWS within a window aggregate expression.
• Use QUALIFY to qualify on Window Aggregate result sets.

Window Aggregates - Part 1 Page 18-1




Table of Contents
Window Aggregate Functions ................................................................................................... 18-4
The GROUP COUNT Window ................................................................................................. 18-6
Relating the Result to the Syntax ............................................................................................... 18-8
GROUP COUNT and Null ...................................................................................................... 18-10
GROUP COUNT(*) ................................................................................................................. 18-12
Group SUM and AVG Window............................................................................................... 18-14
Group AVG and QUALIFY .................................................................................................... 18-16
GROUP COUNT and PARTITION ........................................................................................ 18-18
GROUP COUNT, PARTITION, and Null .............................................................................. 18-20
GROUP COUNT and Null Partitions ...................................................................................... 18-22
GROUP SUM and Partition ..................................................................................................... 18-24
GROUP SUM and Reordering ................................................................................................. 18-26
SQL ORDER BY to Preserve Order ........................................................................................ 18-28
Window ORDER BY to Preserve Order.................................................................................. 18-30
Qualifying on a Windowed Non-Aggregated .......................................................................... 18-32
WHERE vs. QUALIFY ........................................................................................................... 18-34
Order of Group SUM and Aggregation ................................................................................... 18-36
Module 18: Summary............................................................................................................... 18-38
Module 18: Review Questions ................................................................................................. 18-40
Module 18: Lab Exercise ......................................................................................................... 18-42

Window Aggregates - Part 1 Page 18-3


Window Aggregate Functions
The following is a complete list of the Window Aggregate functions as of TD13. The underlined
are ANSI 2003 compliant.
AVG, CORR, COUNT, COVAR_POP, COVAR_SAMP, MAX, MIN, REGR_AVGX,
REGR_AVGY, REGR_COUNT, REGR_INTERCEPT, REGR_R2, REGR_SLOPE,
REGR_SXX, REGR_SXY, REGR_SYY, STDDEV_POP, STDDEV_SAMP, SUM, VAR_POP,
VAR_SAMP, PERCENT_RANK, RANK, ROW_NUMBER

Teradata SQL-specific functions (that should be avoided in favor of the preceding
functions):
• CSUM
• MAVG
• MDIFF
• MSUM
• QUANTILE
• RANK (different syntax than the one mentioned above.)

Performing ordered analytical computations at the SQL level rather than through a higher level
OLAP calculation engine provides four distinct advantages.
• Reduced programming effort.
• Elimination of the need for external sort routines.
• Elimination of the need to export large data sets to external tools because ordered
analytical functions enable you to target the specific data for analysis within the
warehouse itself by specifying conditions in the query.
• Marked enhancement of analysis performance

The use of Teradata-specific functions is strongly discouraged. These functions are retained only
for backward compatibility with existing applications.

Page 18-4 Window Aggregates - Part 1


Window Aggregate Functions

The Window feature is ANSI SQL-2003 compliant and provides a way to
dynamically define a subset of data, or window, in an ordered relational
database table.

About Window Aggregate Functionality:


• They share the standard aggregate syntax of COUNT, SUM, MIN, MAX
and AVG.
• They also apply to the ANSI SQL-2003-compliant window functions of
RANK, PERCENT_RANK, and ROW_NUMBER.
• Additional syntax expands the usefulness of the normal aggregate
syntax.
• This expanded usefulness includes an ability to retain the detail lost in
normal aggregation.
• They may be used within the same projection as aggregation.
• They may be used together with Views, all temp-tables, and insert-select.

This module will focus only on a specific type of Window – the Group Window

Window Aggregates - Part 1 Page 18-5


The GROUP COUNT Window
The facing page illustrates our first usage of a Window Aggregate function. It uses the COUNT
function that is also used for (normal) aggregation. It is important to note that the result set
retains the detail for each row. This is the main feature that distinguishes it from typical
aggregation.

You need not directly code SQL queries to take advantage of ordered analytical functions. Both
Teradata Database and many third-party query management and analytical tools have full access
to the Teradata SQL ordered analytical functions. Teradata Warehouse Miner, for example, a
tool that performs data mining preprocessing inside the database engine, relies on these features
to perform functions in the database itself rather than requiring data extraction.

Teradata Warehouse Miner includes approximately 40 predefined data mining functions in SQL,
based on the Teradata SQL-specific functions. For example, the Teradata Warehouse Miner
FREQ function uses the Teradata SQL-specific functions CSUM, RANK, and QUALIFY
to determine frequencies.

Page 18-6 Window Aggregates - Part 1


The GROUP COUNT Window

Show employee details and total number of employees:


SELECT last_name AS Name
,salary_amount AS Salary
,department_number AS Dept
,COUNT(salary) OVER (ROWS BETWEEN UNBOUNDED PRECEDING
AND UNBOUNDED FOLLOWING) AS Total_Count
FROM Employee
WHERE Department_Number = 401
ORDER BY 1;

In the result:
• Note that the count is projected as a column in each row.
• There are 7 non-null salaries.
• The format for every row is the same so this can be projected from SQL Assistant.

Name Salary Dept Total_Count


---------- -------- ------ -----------
Brown 43100.00 401 7
Hoover 25525.00 401 7
Johnson 36300.00 401 7
Machado 32300.00 401 7
Phillips 24500.00 401 7
Rogers 46000.00 401 7
Trader 37850.00 401 7

Window Aggregates - Part 1 Page 18-7


Relating the Result to the Syntax
The challenging part of learning this syntax is to first get past the semantics. The facing page
explains it by reading the syntax literally. The phrase:

ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING

is the default for this expression, meaning an empty pair of parentheses accomplishes the very
same thing.

Page 18-8 Window Aggregates - Part 1


Relating the Result to the Syntax

ANSI syntax is typically very explicit.
Knowing this can help when trying to determine what it does.

,COUNT(salary)
OVER (ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)

Name Salary Dept Total_Count


---------- -------- ------ -----------
Brown 43100.00 401 7
Hoover 25525.00 401 7
Johnson 36300.00 401 7
Machado 32300.00 401 7
Phillips 24500.00 401 7
Rogers 46000.00 401 7
Trader 37850.00 401 7

To interpret the syntax, read it as it appears.
We want to count the salaries OVER the criteria defined within the parentheses.
1. Take ANY salary amount.
2. Count it with all the non-nulls preceding it and all the non-nulls
following it.
3. Project that count in this column for that row.

Window Aggregates - Part 1 Page 18-9


GROUP COUNT and Null
This window is termed a GROUP window because it is projecting a count for a “group” of rows.
The “group” for this particular query is the entire set of rows after qualifying on the WHERE
clause. We will show how to provide further qualification on groups later.

It should come as no surprise that window aggregates follow the very same rules as the
standard aggregate functions; that is, all nulls are ignored.

Page 18-10 Window Aggregates - Part 1


GROUP COUNT and Null

Show employee details and total number of employees:

SELECT last_name AS Name
,salary_amount AS Salary
,department_number AS Dept
,COUNT(salary) OVER ( ) AS Total_Count
FROM Employee
WHERE Department_Number = 401
ORDER BY 1;

The phrase “UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING” is the default, so the
syntax shown is equivalent. This phrase makes it a GROUP WINDOW.

For the result:
• Note that nulls are not counted (same as in aggregation).
• There are still 7 rows and the count has changed to 5.
• Use COALESCE to add the nulls to the group's count.

Name Salary Dept Total_Count


-------------------- ------------ ----------- -----------
Brown 43100.00 401 5
Hoover 25525.00 401 5
Johnson 36300.00 401 5
Machado ? 401 5
Phillips 24500.00 401 5
Rogers 46000.00 401 5
Trader ? 401 5
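
The effect of nulls on a group count can be traced with a short Python model of the result above. This is a sketch under stated assumptions: None stands in for NULL, the helper name group_count is hypothetical, and replacing a null with 0 is used as a stand-in for what COALESCE would do.

```python
# Department 401 rows from the result above; None stands in for SQL NULL.
EMPLOYEES = [
    ("Brown", 43100.00, 401), ("Hoover", 25525.00, 401),
    ("Johnson", 36300.00, 401), ("Machado", None, 401),
    ("Phillips", 24500.00, 401), ("Rogers", 46000.00, 401),
    ("Trader", None, 401),
]

def group_count(rows, value_of=None):
    """Append one group-wide count to every row, keeping the row detail.

    With value_of, only rows whose value is not None are counted.
    Without it, every row is counted.
    """
    if value_of is None:
        total = len(rows)
    else:
        total = sum(1 for row in rows if value_of(row) is not None)
    return [row + (total,) for row in rows]

def salary(row):
    return row[1]

def salary_or_zero(row):
    # Stand-in for COALESCE(salary, 0): a null becomes a countable value.
    return 0 if row[1] is None else row[1]

by_salary = group_count(EMPLOYEES, value_of=salary)
by_coalesce = group_count(EMPLOYEES, value_of=salary_or_zero)
for row in by_salary:
    print(row)
```

Every row keeps its detail and carries the same count, which is 5 when nulls are skipped and 7 once each null has been replaced by a value.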

Window Aggregates - Part 1 Page 18-11


GROUP COUNT(*)
Following the very same logic on the preceding page, it should also be no surprise to see that a
COUNT(*) may be used to count the rows within each group – the set of all rows. As an
alternative, this very same result could be obtained by issuing the following.

SELECT last_name AS Name
,salary_amount AS Salary
,department_number AS Dept
,total_count
FROM Employee,
(SELECT COUNT(*) AS total_count
FROM employee
WHERE Department_Number = 401) AS t1
WHERE department_number = 401
ORDER BY 1;

Name Salary Dept total_count


------------------- ------------ ----------- -----------
Brown 43100.00 401 7
Hoover 25525.00 401 7
Johnson 36300.00 401 7
Machado ? 401 7
Phillips 24500.00 401 7
Rogers 46000.00 401 7
Trader 37850.00 401 7

Page 18-12 Window Aggregates - Part 1


GROUP COUNT(*)

You can use COUNT(*) to count rows instead of values.

SELECT last_name AS Name


,salary_amount AS Salary
,department_number AS Dept
,COUNT(*) OVER ( ) AS Total_Count
FROM Employee
WHERE Department_Number = 401
ORDER BY 1;

Name Salary Dept Total_Count


-------------------- ------------ ----------- -----------
Brown 43100.00 401 7
Hoover 25525.00 401 7
Johnson 36300.00 401 7
Machado ? 401 7
Phillips 24500.00 401 7
Rogers 46000.00 401 7
Trader 37850.00 401 7

Window Aggregates - Part 1 Page 18-13


Group SUM and AVG Window
The facing page shows examples of GROUP SUM and GROUP AVG results. The column headings
were left to default on purpose so that you can see that the database returns headings reflecting
the window type.

Of the four available window types, only the GROUP window is taught in this module. The
other windows will be taught in the following module. It is in those windows that the ORDER
BY becomes important. The ORDER BY can also be an important consideration from a
performance point of view. This will be discussed later in this module.

Page 18-14 Window Aggregates - Part 1


Group SUM and AVG Window

Show employee details with their group SUM and AVG.

SELECT last_name AS Name
,salary_amount AS Salary
,department_number AS Dept
,SUM(salary) OVER ( )
,AVG(Salary) OVER ( )
FROM Employee
WHERE Department_Number = 401
ORDER BY 1;

In this example we are getting both the group sum and average.

Notice that the default headings describe the window as a Group Window.

There are 4 aggregate windows altogether. The bottom 3 will be taught in the next module.
1. Group
2. Cumulative
3. Moving
4. Remaining

Name Salary Dept Group Sum(Salary) Group Avg(Salary)


------------- ---------- ----------- ----------------- -----------------
Brown 43100.00 401 245575.00 35082.14
Hoover 25525.00 401 245575.00 35082.14
Johnson 36300.00 401 245575.00 35082.14
Machado 32300.00 401 245575.00 35082.14
Phillips 24500.00 401 245575.00 35082.14
Rogers 46000.00 401 245575.00 35082.14
Trader 37850.00 401 245575.00 35082.14
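
To see what the group window does outside the database, here is a minimal Python sketch. It is not Teradata code: the rows list simply copies the seven result rows above, and the variable names are our own. The aggregates are computed once over all rows and then repeated on every detail row, which is the behavior of OVER ( ).

```python
# (last_name, salary_amount) copied from the department 401 result above.
rows = [
    ("Brown", 43100.00), ("Hoover", 25525.00), ("Johnson", 36300.00),
    ("Machado", 32300.00), ("Phillips", 24500.00), ("Rogers", 46000.00),
    ("Trader", 37850.00),
]

# The group window spans every row, so each aggregate is computed once...
group_sum = sum(salary for _, salary in rows)
group_avg = round(group_sum / len(rows), 2)

# ...and then attached to every detail row, so no detail is lost.
result = [(name, salary, group_sum, group_avg) for name, salary in rows]

for row in result:
    print(*row)
```

Every printed row carries the same sum and average, just as in the result table.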



Group AVG and QUALIFY
QUALIFY is to window aggregates as HAVING is to aggregates. The list below shows the “Order of
Operations”. Notice in the list that window aggregates fall into a broader class of SQL referred
to as OLAP (On-Line Analytical Processing). This chart can be very helpful in understanding
result sets. Notice that window aggregates are performed after aggregations. In general:
• WHERE restricts rows that participate in those processes that follow it.
• HAVING gets applied after aggregation and, therefore, restricts rows participating in
those processes following it.
• QUALIFY gets applied after window aggregates and, therefore, restricts rows
participating in those processes following it.
• Although HAVING may be used to reference aggregates as well as non-aggregates,
WHERE may not reference aggregates because WHERE occurs prior to aggregation in
the chart.

1. WHERE
2. AGGREGATION
3. HAVING
4. OLAP { [ PARTITION BY ]
[ ORDER BY ] [ rows ] }
5. QUALIFY [ ORDER BY ]
6. RANDOM
7. SAMPLE | TOP N
8. ORDER BY
9. FORMAT



Group AVG and QUALIFY

QUALIFY, on window aggregates, works much the same way as HAVING does on
standard aggregates.

It is available for all window aggregates.

Find employees in department 401 whose salary is greater than their department average.

1. WHERE
2. AGGREGATION
3. HAVING
4. OLAP { [ PARTITION BY ]
[ ORDER BY ] [ rows ] }
5. QUALIFY [ ORDER BY ]
6. RANDOM
7. SAMPLE | TOP N
8. ORDER BY
9. FORMAT

SELECT last_name AS Name
,salary_amount AS Salary
,department_number AS Dept
,AVG(Salary) OVER ( ) AS GrpAvg
FROM Employee
WHERE Department_Number = 401
QUALIFY Salary > GrpAvg;

Name Salary Dept GrpAvg
-------------------- ------------ ----------- ------------
Brown 43100.00 401 35082.14
Johnson 36300.00 401 35082.14
Rogers 46000.00 401 35082.14
Trader 37850.00 401 35082.14
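
The two steps involved here, the window aggregate followed by QUALIFY, can be sketched in plain Python. This is an illustration only, not Teradata code: the rows list copies the department 401 salaries shown earlier in this module, and the names windowed and qualified are our own.

```python
# (last_name, salary_amount) for department 401, copied from the earlier result.
rows = [
    ("Brown", 43100.00), ("Hoover", 25525.00), ("Johnson", 36300.00),
    ("Machado", 32300.00), ("Phillips", 24500.00), ("Rogers", 46000.00),
    ("Trader", 37850.00),
]

# Step 4 of the order of operations: compute the window aggregate for every row.
grp_avg = round(sum(salary for _, salary in rows) / len(rows), 2)
windowed = [(name, salary, grp_avg) for name, salary in rows]

# Step 5: QUALIFY filters on the window value that now exists on each row.
qualified = [row for row in windowed if row[1] > row[2]]

for row in qualified:
    print(*row)
```

The filter can only run after the average exists on each row, which is why it belongs in QUALIFY.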



GROUP COUNT and PARTITION
PARTITION BY breaks results down by groupings. Do not confuse PARTITION with GROUP: the
result in the example is still a GROUP COUNT because it acts on all rows within a partition.
In this case the partition is the department number. Everything works as described earlier,
except that the aggregate is computed within each partition rather than across the entire set
of rows selected from the table.



GROUP COUNT and PARTITION

The keyword PARTITION may be used to return window aggregates within a partition or group.

This is similar to GROUP BY in aggregation, but retains the detail.

Unlike GROUP BY, PARTITION performs an ordering as well.

SELECT last_name AS Name
,salary_amount AS Salary
,department_number AS Dept
,COUNT(salary) OVER (PARTITION BY Dept)
FROM Employee
WHERE Department_Number IN (301, 501);

Name Salary Dept Group Count(Salary)
-------------------- ------------ ----------- -------------------
Kubic 57700.00 301 3
Stein 29450.00 301 3
Kanieski 29250.00 301 3
Rabbit 26500.00 501 4
Wilson 53625.00 501 4
Runyon 66000.00 501 4
Ratzlaff 54000.00 501 4



GROUP COUNT, PARTITION, and Null
Counting non-null salaries within a partition works the same as it does on a larger set. Nulls are
still ignored.



GROUP COUNT, PARTITION, and Null

Recall that the default for ROWS is:
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
So the expression below is equivalent to:
. . . OVER (PARTITION BY Dept
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)

SELECT last_name AS Name
,salary_amount AS Salary
,department_number AS Dept
,COUNT(salary) OVER (PARTITION BY Dept)
FROM Employee
WHERE Department_Number IN (301, 501);

Name Salary Dept Group Count(Salary)
-------------------- ------------ ----------- -------------------
Kubic 57700.00 301 2
Kanieski ? 301 2
Stein 29450.00 301 2
Rabbit ? 501 2
Wilson 53625.00 501 2
Runyon 66000.00 501 2
Ratzlaff ? 501 2



GROUP COUNT and Null Partitions
Notice how window aggregates follow the same basic rules as aggregation. In aggregation,
nulls are ignored, but they do aggregate as a group. Here we see the very same concept played
out with window aggregates and partitions. Nulls form a partition just as they form a group in
aggregation. As always, however, COUNT(*) counts rows, in this case within each partition.



GROUP COUNT and Null Partitions

Remember, window aggregates use aggregate functionality, but retain detail.

As in aggregation, nulls are ignored by the aggregate function, but rows with a null
partitioning value do aggregate as a group.

SELECT last_name AS Name
,salary_amount AS Salary
,department_number AS Dept
,COUNT(*) OVER (PARTITION BY Dept)
FROM Employee
WHERE Department_Number IS NULL
OR Department_Number IN (301, 501);

Name Salary Dept Group Count(*)
-------------------- ------------ ----------- --------------
Morrissey 38750.00 ? 2
Short 34700.00 ? 2
Kubic 57700.00 301 3
Kanieski ? 301 3
Stein 29450.00 301 3
Rabbit ? 501 4
Wilson 53625.00 501 4
Runyon 66000.00 501 4
Ratzlaff ? 501 4
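
The difference between COUNT(*) and COUNT(salary) within a partition, including the null partition, can be checked with a short Python sketch. This is an illustration under our own assumptions: None stands in for SQL NULL, the rows copy the result above, and the dictionary names are hypothetical.

```python
from collections import defaultdict

# (last_name, salary_amount, department_number); None stands in for SQL NULL.
rows = [
    ("Morrissey", 38750.00, None), ("Short", 34700.00, None),
    ("Kubic", 57700.00, 301), ("Kanieski", None, 301), ("Stein", 29450.00, 301),
    ("Rabbit", None, 501), ("Wilson", 53625.00, 501),
    ("Runyon", 66000.00, 501), ("Ratzlaff", None, 501),
]

count_star = defaultdict(int)    # COUNT(*): every row in the partition
count_salary = defaultdict(int)  # COUNT(salary): only non-null salaries
for _, salary, dept in rows:
    count_star[dept] += 1        # a null department forms its own partition
    if salary is not None:
        count_salary[dept] += 1

# Attach both partition counts to every detail row.
for name, salary, dept in rows:
    print(name, salary, dept, count_star[dept], count_salary[dept])
```

The COUNT(*) figures match the result above, and the COUNT(salary) figures for departments 301 and 501 match the result on the earlier page where null salaries were ignored.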



GROUP SUM and Partition
Whether sums or counts, window aggregates work similarly to aggregation as well as to other
window aggregates.



GROUP SUM and Partition

The keyword PARTITION may be used to return window aggregates within a partition or group.

SELECT last_name AS Name
,salary_amount AS Salary
,department_number AS Dept
,SUM(Salary) OVER (PARTITION BY Dept)
FROM Employee
WHERE Department_Number IS NULL
OR Department_Number IN (301, 501);

Name Salary Dept Group Sum(Salary)
-------------------- ------------ ----------- -----------------
Morrissey 38750.00 ? 73450.00
Short 34700.00 ? 73450.00
Kubic 57700.00 301 87150.00
Kanieski ? 301 87150.00
Stein 29450.00 301 87150.00
Rabbit ? 501 119625.00
Wilson 53625.00 501 119625.00
Runyon 66000.00 501 119625.00
Ratzlaff ? 501 119625.00



GROUP SUM and Reordering
The chart depicting the order of operations for a query was first shown on a left-hand page earlier
in the course. Now it appears on the facing page. An explanation of what is happening follows
on the next page. Here we simply point out that there is more than one ordering of data in this
query. Any ordering may be overridden by a subsequent ordering, so a result computed
according to one ordering can be displayed in a different order, which can make it hard to
interpret. This is not a problem with Group Windows, since the values returned are not
affected by an ordering. True, the ORDER BY reorders the department numbers, making it
difficult to associate each row with its partition, but the values are the same for each row.
Additional orderings will be of much more significance in the following module, where order
plays a very important role in the values returned for the result.



GROUP SUM and Reordering

The standard ORDER BY (shown) occurs in step 8 of the order of operations (also
shown). Notice its effect on the result.

1. WHERE
2. AGGREGATION
3. HAVING
4. OLAP { [ PARTITION BY ]
[ ORDER BY ] [ rows ] }
5. QUALIFY [ ORDER BY ]
6. RANDOM
7. SAMPLE | TOP N
8. ORDER BY
9. FORMAT

SELECT last_name AS Name
,salary_amount AS Salary
,department_number AS Dept
,SUM(Salary) OVER (PARTITION BY Dept)
FROM Employee
WHERE Department_Number IN (301, 501)
ORDER BY 1;

Name Salary Dept Group Sum(Salary)
-------------------- ------------ ----------- -----------------
Kanieski 29250.00 301 116400.00
Kubic 57700.00 301 116400.00
Rabbit 26500.00 501 200125.00
Ratzlaff 54000.00 501 200125.00
Runyon 66000.00 501 200125.00
Stein 29450.00 301 116400.00
Wilson 53625.00 501 200125.00



SQL ORDER BY to Preserve Order
The request on the facing page requires two separate orderings. Partitioning is not only a way to
separate rows into groups; as you may have noticed, it is an ordering as well. In fact, it is
the primary sort order for a window aggregate. The subsequent ORDER BY is an attempt to
maintain the order of the partitioning column and to order by last name within each
department number partition.



SQL ORDER BY to Preserve Order

The standard ORDER BY (shown) occurs in step 8 of the order of operations.
Notice its effect on the result.

SELECT last_name AS Name
,salary_amount AS Salary
,department_number AS Dept
,SUM(Salary) OVER (PARTITION BY Dept)
FROM Employee
WHERE Department_Number IS NULL
OR Department_Number IN (301, 501)
ORDER BY 3, 1;

Name Salary Dept Group Sum(Salary)
----------- ------------ ----------- -----------------
Kanieski 29250.00 301 116400.00
Kubic 57700.00 301 116400.00
Stein 29450.00 301 116400.00
Rabbit 26500.00 501 200125.00
Ratzlaff 54000.00 501 200125.00
Runyon 66000.00 501 200125.00
Wilson 53625.00 501 200125.00

The result is ordered by last name within partition of department number.



Window ORDER BY to Preserve Order
The example on the facing page shows both PARTITION and ORDER within the same OVER
clause. When used together like this, they must be placed in the order shown, that is,
PARTITION BY first, followed by ORDER BY. Besides partitioning, the sort now becomes the
partitioning column(s) first, followed by the ordering column(s) within each partition.

Note that this example contains a single ordering instead of the two found on the previous page.
The net result, however, is the same.



Window ORDER BY to Preserve Order

ORDER BY is also an option for window aggregates.

It is not typically used with Group Windows because it does not affect the group
sum. It will be quite important in the next module on Window Aggregates.
When used with PARTITION, it must follow the PARTITION keyword.
PARTITION is both a grouping and a major sort, while ORDER BY is a minor sort.

SELECT last_name AS Name
,salary_amount AS Salary
,department_number AS Dept
,SUM(Salary) OVER (PARTITION BY Dept ORDER BY Name)
FROM Employee
WHERE Department_Number IS NULL
OR Department_Number IN (301, 501);

Name Salary Dept Group Sum(Salary)
-------------------- ------------ ----------- -----------------
Kanieski 29250.00 301 116400.00
Kubic 57700.00 301 116400.00
Stein 29450.00 301 116400.00
Rabbit 26500.00 501 200125.00
Ratzlaff 54000.00 501 200125.00
Runyon 66000.00 501 200125.00
Wilson 53625.00 501 200125.00



Qualifying on a Windowed Non-Aggregated
The example on the facing page illustrates how one may qualify on the window expression
without projecting it in the result.



Qualifying on a Windowed Non-Aggregated
You can qualify on the window aggregate expression without having to project the
values themselves.

Note how the order of the partitioned column is still maintained.

Find employees whose salary is the maximum in their department.

SELECT last_name AS Name
,salary_amount AS Salary
,department_number AS Dept
FROM Employee
QUALIFY Salary = MAX(Salary) OVER (PARTITION BY DEPT) ;

Name Salary Dept
-------------------- ------------ -----------
Trainer 100000.00 100
Morrissey 38750.00 201
Kubic 57700.00 301
Rogers 56500.00 302
Rogers 46000.00 401
Daly 52500.00 402
Villegas 49700.00 403
Runyon 66000.00 501



WHERE vs. QUALIFY
Although mentioned briefly on a previous page, we now show how WHERE and QUALIFY
differ from one another with respect to when they are performed within the order of operations of
a query. The condition fails as a WHERE condition because it tests a window aggregate
value that has yet to be determined.



WHERE vs. QUALIFY

According to our order of operations, this window aggregate result is not
available for WHERE conditioning.

1. WHERE
2. AGGREGATION
3. HAVING
4. OLAP { [ PARTITION BY ]
[ ORDER BY ] [ rows ] }
5. QUALIFY [ ORDER BY ]
6. RANDOM
7. SAMPLE | TOP N
8. ORDER BY
9. FORMAT

SELECT last_name AS Name
,salary_amount AS Salary
,department_number AS Dept
,SUM(Salary) OVER ( ) AS TotSum
FROM Employee
WHERE Dept = 401
AND totsum / salary > 3
ORDER BY 3, 2;

*** Failure 5479 Ordered Analytical Functions not allowed in WHERE Clause.

SELECT last_name AS Name
,salary_amount AS Salary
,department_number AS Dept
,SUM(Salary) OVER ( ) AS TotSum
FROM Employee
WHERE Dept = 401
QUALIFY totsum / salary > 3
ORDER BY 3, 2;

Name Salary Dept TotSum
-------------------- ------------ ----------- ------------
Phillips 24500.00 401 245575.00
Hoover 25525.00 401 245575.00



Order of Group SUM and Aggregation
When performing aggregation and a window aggregate in the same projection, the order of
operations shows that the aggregate sum will be performed first.

The result of the sum is:

SELECT department_number AS DeptNbr
,SUM(Salary_Amount) AS SumSal
FROM Employee
WHERE DeptNbr IN (401, 403, 501)
GROUP BY 1;

DeptNbr SumSal
----------- ------------
401 213275.00
501 200125.00
403 193500.00

Next, the window aggregate is performed on this result (shown without the QUALIFY):

DeptNbr SumSal AvgSal
----------- ------------ ------------
401 213275.00 202300.00
403 193500.00 202300.00
501 200125.00 202300.00



Order of Group SUM and Aggregation

For the following query, the sum is performed first, followed by the window
aggregate that averages those sums (for the departments in the list).

1. WHERE
2. AGGREGATION
3. HAVING
4. OLAP { [ PARTITION BY ]
[ ORDER BY ] [ rows ] }
5. QUALIFY [ ORDER BY ]
6. RANDOM
7. SAMPLE | TOP N
8. ORDER BY
9. FORMAT

SELECT department_number AS DeptNbr
,SUM(Salary_Amount) AS SumSal
,AVG(SumSal) OVER ( ) AS AvgSal
FROM Employee
WHERE DeptNbr IN (401, 403, 501)
GROUP BY 1;

DeptNbr SumSal AvgSal
----------- ------------ ------------
401 213275.00 202300.00
403 193500.00 202300.00
501 200125.00 202300.00
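
The two-stage evaluation can be traced in a small Python sketch. It is an illustration only: the detail rows are assumptions assembled from salaries shown in result tables elsewhere in this guide and chosen so that the department sums match the result above, and None stands in for a null salary.

```python
from collections import defaultdict

# (department_number, salary_amount); illustrative rows, None means NULL.
rows = [
    (401, 43100.00), (401, 25525.00), (401, 36300.00), (401, None),
    (401, 24500.00), (401, 46000.00), (401, 37850.00),
    (403, 31200.00), (403, 49700.00), (403, 31000.00),
    (403, 37900.00), (403, None), (403, 43700.00),
    (501, 26500.00), (501, 53625.00), (501, 66000.00), (501, 54000.00),
]

# Step 2 (AGGREGATION): SUM ... GROUP BY department; nulls are ignored.
sum_sal = defaultdict(float)
for dept, salary in rows:
    if salary is not None:
        sum_sal[dept] += salary

# Step 4 (OLAP): AVG(SumSal) OVER ( ) averages the grouped sums, not the salaries.
avg_sal = sum(sum_sal.values()) / len(sum_sal)

result = [(dept, total, avg_sal) for dept, total in sorted(sum_sal.items())]
for row in result:
    print(*row)
```

Because the window aggregate sees only the three grouped rows, the average is taken over three sums rather than over the individual salaries.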



Module 18: Summary
A summary of this module is discussed.



Module 18: Summary

To summarize, group window aggregates:

• Use SUM, COUNT, MIN, MAX, and AVG, as aggregation does.
• Retain the detail data of each row, unlike aggregation.
• Can be partitioned into groups.
• Can be ordered.
• Can be used with QUALIFY.
• Occur after aggregation in the order of operations.



Module 18: Review Questions
Check your understanding of the concepts discussed in this module by completing the review
questions as directed by your instructor.



Module 18: Review Questions

True or False:

1. Of the four Windows, this module only discussed the GROUP Window.
True
2. With the Group Window, ORDER BY, in the OVER, will not change result
values.
True
3. In the OVER clause, ORDER BY must be after PARTITION, if both are used.
True
4. PARTITION and GROUP BY may both be present within the same
projection.
True – they may be alone or together or not present at all
5. QUALIFY need not reference a projected value.
True
6. HAVING must reference a projected value.
False
7. PARTITION may return a null value.
True



Module 18: Lab Exercise
Check your understanding of the concepts discussed in this module by completing the lab
exercise as directed by your instructor.



Module 18: Lab Exercise

1. From the “SalesTbl”, list the store id, product id, and sales for each row.
Include with each projected row the sum of the sales for each product
across all stores, and order by store id within product id.

2. Add the minimum and maximum sales to #1 and order by storeid.

3. Use a GROUP function to find employees having a salary greater than
their department average and display the average.



Module 19

Window Aggregates – Part 2

After completing this module, you will be able to:


• Write Cumulative Window Aggregate queries.
• Write Moving Window Aggregate queries.
• Write Remaining Window Aggregate queries.
• Write Moving Difference queries.
• Use RESET WHEN to dynamically condition results.
• Use variations of PARTITION, ORDER, and ROWS for window functions.

Window Aggregates - Part 2 Page 19-1




Table of Contents
What’s in this Module?
Cumulative Sum
Cumulative Sum with Partitioning
Moving Sum
Moving AVG – Not in Range
Moving Difference
Moving Difference and QUALIFY
Moving Difference and Partition
Remaining Window
Remaining Window and Partition
RESET WHEN
Module 19: Summary
Module 19: Review Questions
Module 19: Lab Exercise



What’s in this Module?
Whereas the previous module covered only the GROUP window, this module covers the rest
of the window aggregate functions. By now you should be familiar with the following
constructs, which this module builds upon.

• PARTITION BY
• ORDER BY
• ROWS [BETWEEN]
• QUALIFY

In this module we will discuss the following:

• Cumulative Windows
• Moving Windows
• Remaining Windows
• RESET WHEN
• Moving Differences



What’s in this Module?

• The preceding module taught how the use of "UNBOUNDED" determined the group window.
• The expression for a Group Window is:
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
• A GROUP includes either all rows in a PARTITION, or all rows in the result.
• ORDER BY, inside the window expression, did not affect the aggregate values within the
group.

There are 4 different windows.
1. Group Window
2. Cumulative Window (covered in this module)
3. Moving Window (covered in this module)
4. Remaining Window (covered in this module)

Regarding the 3 windows covered in this module:

• A Cumulative Window is UNBOUNDED PRECEDING.
  e.g., ROWS UNBOUNDED PRECEDING means the current row with all preceding rows and none following
• A Moving Window is bounded discretely on both the PRECEDING and FOLLOWING values.
  e.g., ROWS 2 PRECEDING means the current row with the 2 preceding rows and none following
• The Remaining Window is UNBOUNDED FOLLOWING, which is similar to a Cumulative Window.
  e.g., ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING means the current row with all following rows and none preceding
• Different ORDER BY columns can affect the aggregate values within the result.
• Each new value can be affected by the ever-changing list of preceding or following values.
• We will also discuss both ANSI and non-ANSI forms for obtaining moving differences.



Cumulative Sum
An alternate syntax to that on the facing page is shown below. Notice that if this syntax is used,
the clause “CURRENT ROW” must appear after the UNBOUNDED clause. This is because,
logically, we move downward to the current row, not upward from the current row.

SELECT
ItemID
,SalesDate
,Sales
,SUM(Sales) OVER (ORDER BY SalesDate
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
FROM SalesHist
WHERE itemid IN (4, 6);

Also note that, unlike the GROUP window, ORDER BY plays an extremely important role in a
cumulative window. A different ordering provides an entirely different result. By different
results, we mean different values for the very same table rows. Consider the following, where
the ordering is changed to item id and sales date.

SELECT
ItemID
,SalesDate
,Sales
,SUM(Sales) OVER (ORDER BY itemid, SalesDate ROWS
UNBOUNDED PRECEDING)
FROM SalesHist
WHERE itemid IN (4, 6);

itemid salesdate sales Cumulative Sum(sales)
------ --------- --------- ---------------------
4 08/05/24 562.00 562.00
4 08/05/25 395.00 957.00
4 08/05/26 548.00 1505.00
4 08/05/27 387.00 1892.00
6 08/05/24 465.00 2357.00
6 08/05/25 283.00 2640.00
6 08/05/26 379.00 3019.00
6 08/05/27 224.00 3243.00

Without an ORDER BY clause, the default ordering would be by each of the un-aggregated
values, each within the previous one, in ascending order.
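
The effect of the window ordering on a cumulative sum can be tried outside the database with a short Python sketch. It is an illustration, not Teradata code: the rows copy the result above, cumulative_sum is a hypothetical helper, and the second ordering breaks ties on salesdate by itemid, which is our assumption (the database may return tied rows in any order).

```python
from itertools import accumulate

# (itemid, salesdate, sales) copied from the result above.
rows = [
    (4, "08/05/24", 562.00), (4, "08/05/25", 395.00),
    (4, "08/05/26", 548.00), (4, "08/05/27", 387.00),
    (6, "08/05/24", 465.00), (6, "08/05/25", 283.00),
    (6, "08/05/26", 379.00), (6, "08/05/27", 224.00),
]

def cumulative_sum(rows, sort_key):
    """Order the rows, then total each row with every row before it."""
    ordered = sorted(rows, key=sort_key)
    running = accumulate(sales for _, _, sales in ordered)
    return [row + (total,) for row, total in zip(ordered, running)]

# ORDER BY itemid, SalesDate
by_item_then_date = cumulative_sum(rows, lambda r: (r[0], r[1]))
# ORDER BY SalesDate (ties broken by itemid in this sketch only)
by_date_then_item = cumulative_sum(rows, lambda r: (r[1], r[0]))

for row in by_item_then_date:
    print(*row)
```

Both orderings end at the same grand total, but the intermediate values attached to the same table rows differ.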



Cumulative Sum

The following is an example of a Cumulative Window.

Note the default heading for the window aggregate column.
Interpret the result like this:
1. Pick any sales amount
2. Sum it with all non-nulls preceding it
3. Project the sum

Without an ORDER BY clause, the default ordering would be by each of the un-aggregated
values, each within the previous one, in ascending order.

SELECT ItemID
,SalesDate
,Sales
,SUM(Sales) OVER (ORDER BY SalesDate ROWS UNBOUNDED PRECEDING)
FROM SalesHist
WHERE itemid IN (4, 6);

itemid salesdate sales Cumulative Sum(sales)
------- --------- ------------ ---------------------
6 08/05/24 465.00 465.00
4 08/05/24 562.00 1027.00
4 08/05/25 395.00 1422.00
6 08/05/25 283.00 1705.00
6 08/05/26 379.00 2084.00
4 08/05/26 548.00 2632.00
6 08/05/27 224.00 2856.00
4 08/05/27 387.00 3243.00

For example, pick the sales amount 379.00 (item 6 on 08/05/26), sum it with the sales amounts
preceding it, and the projected sum is 2084.00.

What could we do to obtain the cumulative sum for each item-id, by sales date, instead
of mixing them into the same result?



Cumulative Sum with Partitioning
As discussed in the previous module, using PARTITION BY breaks the results out into partitions,
two in this case, and performs the cumulative sum within each partition of “itemid.”



Cumulative Sum with Partitioning

SELECT ItemID
,SalesDate
,Sales
,SUM(Sales) OVER (PARTITION BY itemid
ORDER BY SalesDate
ROWS UNBOUNDED PRECEDING)
FROM SalesHist
WHERE itemid IN (4, 6);

• Note the default heading for the window aggregate column.
• Interpret the result like this:
1. Pick any sales amount
2. Sum it with all non-nulls preceding it in that partition
3. Project the sum

itemid salesdate sales Cumulative Sum(sales)
----------- --------- ------------ ------------------
4 08/05/24 562.00 562.00
4 08/05/25 395.00 957.00
4 08/05/26 548.00 1505.00
4 08/05/27 387.00 1892.00
6 08/05/24 465.00 465.00
6 08/05/25 283.00 748.00
6 08/05/26 379.00 1127.00
6 08/05/27 224.00 1351.00

The cumulative sum is reset at the first row of each new partition (itemid 6 above).
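
The reset at each partition boundary can be reproduced with a minimal Python sketch. It is an illustration only: the rows copy the sales values shown for items 4 and 6, and the use of groupby to stand in for PARTITION BY is our own choice.

```python
from itertools import accumulate, groupby

# (itemid, salesdate, sales), deliberately listed in mixed order.
rows = [
    (6, "08/05/24", 465.00), (4, "08/05/24", 562.00),
    (4, "08/05/25", 395.00), (6, "08/05/25", 283.00),
    (6, "08/05/26", 379.00), (4, "08/05/26", 548.00),
    (6, "08/05/27", 224.00), (4, "08/05/27", 387.00),
]

# PARTITION BY itemid ORDER BY salesdate: partition is the major sort.
ordered = sorted(rows, key=lambda r: (r[0], r[1]))

result = []
for itemid, partition in groupby(ordered, key=lambda r: r[0]):
    partition = list(partition)
    # The running total restarts at the first row of every partition.
    totals = accumulate(sales for _, _, sales in partition)
    result.extend(row + (total,) for row, total in zip(partition, totals))

for row in result:
    print(*row)
```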



Moving Sum
The facing page illustrates how to obtain a “moving sum.” By moving sum, we mean that the
sums actually “slide” through the data according to the value specified for the ROWS value.
Notice that we have switched from an unbounded ROWS value to a discrete ROWS value.
This is what makes this a “moving” window.

The answer to the question on the facing page is shown in the following query.

SELECT itemid
,salesdate
,sales
,SUM(sales) OVER (PARTITION BY itemid
ORDER BY salesdate
ROWS 2 PRECEDING)
FROM saleshist
WHERE salesdate BETWEEN DATE'2008-05-24' AND DATE'2008-05-31';



Moving Sum

Provide a 3-day moving sum for the sales of item 1 for the last week.

SELECT itemid
,salesdate
,sales
,SUM(sales) OVER (ORDER BY salesdate ROWS 2 PRECEDING)
FROM saleshist
WHERE itemid = 1
AND salesdate BETWEEN DATE'2008-05-24' AND DATE'2008-05-31';

How could we change this query to include all items, by item, for that same week,
keeping this at the item level?

Notice the default heading for the Window Aggregate column.

Interpret the results like this:
1. Choose any sales amount.
2. Sum it with the 2 preceding it.
3. Project the sum.

itemid salesdate sales Moving Sum(sales)
----------- --------- ------------ -----------------
1 08/05/24 375.00 375.00
1 08/05/25 549.00 924.00
1 08/05/26 464.00 1388.00
1 08/05/27 534.00 1547.00
1 08/05/28 279.00 1277.00
1 08/05/29 582.00 1395.00
1 08/05/30 423.00 1284.00
1 08/05/31 545.00 1550.00
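
The sliding behavior of ROWS 2 PRECEDING can be verified with a few lines of Python. This is a sketch rather than Teradata code: the sales list copies the item 1 values above in salesdate order, and moving_sum is a hypothetical helper name.

```python
# Item 1 sales from 08/05/24 through 08/05/31, in salesdate order.
sales = [375.00, 549.00, 464.00, 534.00, 279.00, 582.00, 423.00, 545.00]

def moving_sum(values, preceding):
    """Sum each value with up to `preceding` values before it."""
    return [
        sum(values[max(0, i - preceding): i + 1])
        for i in range(len(values))
    ]

moving = moving_sum(sales, preceding=2)
print(moving)
```

The first two rows have fewer than two preceding rows, so their sums cover only the rows that exist.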



Moving AVG – Not in Range
Many variations are possible using window aggregate functionality.
Consider the variation on the facing page. It illustrates another basic concept:
the rows referenced by the ROWS clause need not include the current row.



Moving AVG - Not in Range

Project a 2-day moving average of the prior 2 days onto each row for comparison.

SELECT itemid
,salesdate
,sales
,AVG(sales) OVER (ORDER BY salesdate
ROWS BETWEEN 2 PRECEDING AND 1 PRECEDING)
FROM saleshist
WHERE itemid = 1
AND salesdate BETWEEN DATE'2008-05-24' AND DATE'2008-05-31';

The order of the values here is important. Logically reading downward, we hit the second
previous one first. The query would fail if the values were switched.

Since the current row is not included in the window's range, it is not included in
the average.

itemid salesdate sales Moving Avg(sales)
----------- --------- ------------ -----------------
1 08/05/24 375.00 ?
1 08/05/25 549.00 375.00
1 08/05/26 464.00 462.00
1 08/05/27 534.00 506.50
1 08/05/28 279.00 499.00
1 08/05/29 582.00 406.50
1 08/05/30 423.00 430.50
1 08/05/31 545.00 502.50
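
A window that excludes the current row can be sketched in Python as follows. This is an illustration only: window_avg and its offset parameters are hypothetical names, None stands in for the null shown as ? above, and the sales list copies the item 1 values.

```python
# Item 1 sales from 08/05/24 through 08/05/31, in salesdate order.
sales = [375.00, 549.00, 464.00, 534.00, 279.00, 582.00, 423.00, 545.00]

def window_avg(values, start_offset, end_offset):
    """Average the rows from start_offset back to end_offset back; None if empty."""
    result = []
    for i in range(len(values)):
        lo = max(0, i - start_offset)   # 2 PRECEDING
        hi = i - end_offset             # 1 PRECEDING (inclusive)
        frame = values[lo: hi + 1] if hi >= lo else []
        result.append(sum(frame) / len(frame) if frame else None)
    return result

prior_avg = window_avg(sales, start_offset=2, end_offset=1)
print(prior_avg)
```

The first row has no preceding rows at all, so its window is empty and the average is null.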



Moving Difference
For the query on the facing page, the expression calculating the DayOfWeek was obtained by
performing an EXPLAIN of a query, in SQL Assistant, where it was used in a WHERE
constraint. We pulled the re-written expression from the explanation and used it in the projection
of this query. For the day of week, Sunday is day 1.
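
The day-of-week arithmetic can be checked with Python's standard datetime module. This is a sketch of the same calculation, not Teradata code; day_of_week and ANCHOR_SUNDAY are hypothetical names, and the anchor date is the one that appears in the expression on the facing page.

```python
from datetime import date

ANCHOR_SUNDAY = date(1901, 1, 6)  # reference date used in the expression

def day_of_week(d):
    """Return 1 for Sunday through 7 for Saturday."""
    return (d - ANCHOR_SUNDAY).days % 7 + 1

for d in (date(2008, 5, 25), date(2008, 5, 31), date(2008, 6, 1)):
    print(d, day_of_week(d))
```

Counting whole days from a known Sunday and taking the remainder modulo 7 is what makes Sunday come out as day 1.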

We could have added the following QUALIFY to limit the result to just the rows for the
week of concern. The result of including this QUALIFY will be shown on the next page.

QUALIFY salesdate BETWEEN DATE'2008-06-01' AND DATE'2008-06-07'



Moving Difference

Find the daily differences of sales from one week to the next for item 1.

SELECT salesdate, ((((salesdate) - (DATE'1901-01-06')) MOD 7 ) + 1 ) AS DayOfWeek, sales,
sales - MIN(sales) OVER (ORDER BY salesdate
ROWS BETWEEN 7 PRECEDING AND 7 PRECEDING) AS Diff
FROM saleshist
WHERE itemid = 1
AND salesdate BETWEEN DATE'2008-05-25' AND DATE'2008-06-07';

salesdate DayOfWeek sales Diff
--------- ----------- ------------ -----------
08/05/25 1 549.00 ?
08/05/26 2 464.00 ?
08/05/27 3 534.00 ?
08/05/28 4 279.00 ?
08/05/29 5 582.00 ?
08/05/30 6 423.00 ?
08/05/31 7 545.00 ?
08/06/01 1 383.00 -166.00
08/06/02 2 563.00 99.00
08/06/03 3 471.00 -63.00
08/06/04 4 537.00 258.00
08/06/05 5 280.00 -302.00
08/06/06 6 588.00 165.00
08/06/07 7 424.00 -121.00



Moving Difference and QUALIFY
The WHERE condition helps performance because it is applied in step 1, so only the rows for
the required 2 weeks participate in the window aggregate operation. It is also necessary: the
prior week's rows must be present for the 7 PRECEDING row to exist. The QUALIFY then limits
the result to only the week of concern.



Moving Difference and QUALIFY

Find the daily differences of sales from one week to the next for item 1.

SELECT salesdate, ((((salesdate) - (DATE'1901-01-06')) MOD 7 ) + 1 ) AS DayOfWeek, sales,
MIN(sales) OVER (ORDER BY salesdate
ROWS BETWEEN 7 PRECEDING AND 7 PRECEDING) AS PrevWeek,
sales AS CurrWeek,
CurrWeek - PrevWeek AS Diff
FROM saleshist
WHERE itemid = 1
AND salesdate BETWEEN DATE'2008-05-25' AND DATE'2008-06-07'
QUALIFY salesdate BETWEEN DATE'2008-06-01' AND DATE'2008-06-07';

Notice the differing range constraints for WHERE vs. QUALIFY.

Discuss their function with respect to performance and necessity.

salesdate DayOfWeek sales PrevWeek CurrWeek Diff
--------- ----------- ------------ ------------ ------------ -------------
08/06/01 1 383.00 549.00 383.00 -166.00
08/06/02 2 563.00 464.00 563.00 99.00
08/06/03 3 471.00 534.00 471.00 -63.00
08/06/04 4 537.00 279.00 537.00 258.00
08/06/05 5 280.00 582.00 280.00 -302.00
08/06/06 6 588.00 423.00 588.00 165.00
08/06/07 7 424.00 545.00 424.00 -121.00
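
The interplay of the two date ranges can be traced in a Python sketch. It is an illustration only: the sales list copies the two weeks of item 1 values shown in this module, and the names windowed and qualified are our own. The wider range feeds the window step; the narrower one trims the output afterward.

```python
from datetime import date, timedelta

# Daily sales for item 1 from 2008-05-25 through 2008-06-07 (the WHERE range).
first_day = date(2008, 5, 25)
sales = [549.00, 464.00, 534.00, 279.00, 582.00, 423.00, 545.00,
         383.00, 563.00, 471.00, 537.00, 280.00, 588.00, 424.00]
rows = [(first_day + timedelta(days=i), amount) for i, amount in enumerate(sales)]

# ROWS BETWEEN 7 PRECEDING AND 7 PRECEDING: the single row seven positions back.
windowed = []
for i, (day, curr) in enumerate(rows):
    prev = rows[i - 7][1] if i >= 7 else None
    diff = curr - prev if prev is not None else None
    windowed.append((day, prev, curr, diff))

# QUALIFY runs after the window step, so it can drop the first week
# without removing the rows the window needed.
qualified = [r for r in windowed if date(2008, 6, 1) <= r[0] <= date(2008, 6, 7)]

for row in qualified:
    print(*row)
```

Had the first week been filtered out before the window step, every difference would have been null.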



Moving Difference and Partition



Moving Difference and Partition

The following illustrates how to retrieve a moving difference within a PARTITION.

Our example finds the moving difference within departments.

SELECT last_name, department_number AS Dept#, salary_amount, hire_date,
salary_amount - MIN(salary_amount) OVER
(PARTITION BY department_number
ORDER BY hire_date
ROWS BETWEEN 3 PRECEDING AND 3 PRECEDING) AS mdf
FROM customer_service.employee
WHERE department_number IN (401, 403);

See left-hand page to see how MDIFF does partitioning.

last_name Dept# salary_amount hire_date mdf
-------------------- ----------- ------------- --------- -------------
Hoover 401 25525.00 06/06/18 ?
Trader 401 37850.00 06/07/31 ?
Brown 401 43100.00 06/07/31 ?
Johnson 401 36300.00 06/10/15 10775.00
Rogers 401 46000.00 07/03/01 8150.00
Phillips 401 24500.00 07/04/01 -18600.00
Machado 401 ? 09/03/01 ?
Ryan 403 31200.00 06/10/15 ?
Villegas 403 49700.00 07/01/02 ?
Lombardo 403 31000.00 07/02/01 ?
Hopkins 403 37900.00 07/03/15 6700.00
Charles 403 ? 08/10/01 ?
Brown 403 43700.00 09/05/01 12700.00



Remaining Window
The sort order that you specify in the window specification defines the order of the rows
over which the function is applied; it does not define the ordering of the results. In other
words, logically, the result is obtained in reverse order, so it reads as intended if
interpreted from the bottom value upward. To order the results, use an ORDER BY phrase in the
SELECT statement.

The very same result could have been obtained using a cumulative window. In a normal query
it is likely that the default heading would be replaced with an alias name. Since it is the
default heading that identifies this as a remaining window result, without it no one could
tell the difference between this result and that of a cumulative window in ascending order.
An ORDER BY 1 DESC would add a bit of sense to this query.



Remaining Window

• Permits computed aggregates based on the remaining rows in a defined window.
• Remaining rows are defined as relative to the current row.
• The remaining function is activated by the following two conditions:
    • Presence of the keywords UNBOUNDED FOLLOWING.
    • Absence of the keywords UNBOUNDED PRECEDING.

Show sales of all products in all stores in ascending sequence.
Also show the sales sum of all products below the current row in the order.

SELECT storeid, prodid, sales,
       SUM(sales) OVER (ORDER BY sales DESC
                        ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
FROM salestbl;

storeid prodid sales       Remaining Sum(sales)
------- ------ ----------- --------------------
1003    C         20000.00             20000.00
1002    D         25000.00             45000.00
1003    A         30000.00             75000.00
1001    D         35000.00            110000.00
1002    C         35000.00            145000.00
1002    A         40000.00            185000.00
1003    D         50000.00            235000.00
1001    C         60000.00            295000.00
1003    B         65000.00            360000.00
1001    A        100000.00            460000.00
1001    F        150000.00            610000.00
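
As a rough illustration of what ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING computes, the Python sketch below sums each value with everything that follows it in an already ordered list. The function name remaining_sums is hypothetical; the sales figures are those of the slide, listed in the window's DESC order.

```python
def remaining_sums(values):
    """For each position, sum the current value and all values after it."""
    out = []
    running = 0
    # Walk from the last row back to the first, accumulating as we go.
    for v in reversed(values):
        running += v
        out.append(running)
    out.reverse()
    return out

# Sales in the window's sort order (ORDER BY sales DESC).
sales_desc = [150000, 100000, 65000, 60000, 50000, 40000,
              35000, 35000, 30000, 25000, 20000]
sums = remaining_sums(sales_desc)
for sale, total in zip(sales_desc, sums):
    print(sale, total)
```

The first row in DESC order (150000) carries the grand total, and the last row (20000) carries only itself, which is why the result reads like a cumulative sum in reverse.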



Remaining Window and Partition
Partitioning a remaining window works the same as with any other window. However, note that
the sorting order is still the reverse of that specified in the ORDER BY for the window
aggregate. This is why the additional ORDER BY was performed.



Remaining Window and Partition

Sum each sales amount with all of the ones following it for 2008-05-25 to 2008-05-28, by item.

SELECT itemid, salesdate, sales,
       SUM(sales) OVER (PARTITION BY itemid ORDER BY salesdate
                        ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
       AS SumCol
FROM saleshist
WHERE salesdate BETWEEN DATE'2008-05-25' AND DATE'2008-05-28'
ORDER BY itemid, salesdate;

In this example, we added partitioning.

Since the Remaining Window is sorted as a Cumulative Window in reverse
order, we added an ORDER BY to view it as intended.

itemid salesdate sales    SumCol
------ --------- -------- ---------
1      08/05/25    549.00   1826.00
1      08/05/26    464.00   1277.00
1      08/05/27    534.00    813.00
1      08/05/28    279.00    279.00
2      08/05/25    269.00   1672.00
2      08/05/26    461.00   1403.00
2      08/05/27    586.00    942.00
2      08/05/28    356.00    356.00
- - -
- - -



RESET WHEN
RESET WHEN is used to dynamically partition a window aggregate result set.

This feature is an enhancement to the Teradata window aggregate functions. A RESET WHEN
clause has been added to the window specification to attach a dynamic condition to the query.
It provides applications with an easy-to-use method of creating “conditional partitions” as
part of window aggregate processing, based on a user-specified RESET WHEN condition.

At run time, the RESET WHEN condition is evaluated for each row. If the condition is TRUE,
a new partition is created for statistical function evaluation. This allows more dynamic
handling than the window PARTITION BY clause, which is more limited in the kind of analysis
that can be performed on the data. RESET WHEN is added to the ORDER BY clause as used in
analytic functions.

Limitations
This feature imposes the following limitations:
• There must be an ORDER BY specification in the window function.
• The RESET WHEN condition cannot contain a SELECT clause.
• A nested RESET WHEN condition is not permitted; that is, a window function containing a
RESET WHEN clause cannot be a sub-expression of a RESET WHEN condition.

Considerations:
• RESET is not a reserved word.
• The RESET WHEN condition is equivalent in its scope to the condition allowed in the
QUALIFY clause.
• This is a Teradata extension to the ANSI SQL standard for window aggregate functions.



RESET WHEN

• This feature is an enhancement to the Teradata window aggregate functions.
• RESET WHEN adds dynamic condition handling to the query.
• The RESET WHEN condition is evaluated at run time.
• If the condition is TRUE, a new partition is created for statistical function evaluation.
• RESET WHEN is added to ORDER BY (i.e. ORDER BY is required).

SELECT BirthDate,
       MIN(Salary_Amount) OVER (ORDER BY Birthdate DESC
           ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) AS PrevSal,
       Salary_Amount,
       SUM(Salary_Amount) OVER (ORDER BY Birthdate DESC
           RESET WHEN Salary_Amount IS NULL OR Salary_Amount < PrevSal
           ROWS UNBOUNDED PRECEDING) AS Growth
FROM Employee
WHERE birthdate BETWEEN DATE'1976-01-01' AND DATE'1983-01-01';

We will revisit this syntax in a later module.

birthdate PrevSal      salary_amount Growth
--------- ------------ ------------- ------------
81/11/10             ?      66000.00     66000.00
80/01/14      66000.00      25525.00     25525.00  RESET
79/12/11      25525.00      52500.00     78025.00
79/06/21      52500.00             ?            ?  RESET
77/07/07             ?      34700.00     34700.00
77/06/19      34700.00      37850.00     72550.00
76/04/23      37850.00      36300.00     36300.00  RESET
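
To make the reset behavior concrete, here is a small Python sketch of a cumulative sum that restarts whenever the current salary is missing or lower than the previous row's salary. It is an approximation written for this guide, not Teradata code: the function name sum_reset_when is hypothetical, None stands in for NULL, and a comparison against a missing previous value is treated as "not TRUE" (no reset), which is how the result above behaves.

```python
def sum_reset_when(salaries):
    """Cumulative sum that restarts when salary is None or below the previous salary."""
    out = []
    total = None   # running sum of the current conditional partition
    prev = None    # salary of the previous row
    for i, sal in enumerate(salaries):
        # The reset condition is TRUE only when it can actually be evaluated as true.
        reset = sal is None or (prev is not None and sal < prev)
        if i == 0 or reset:
            total = None  # start a new conditional partition
        if sal is not None:
            total = sal if total is None else total + sal
        out.append(total)
        prev = sal
    return out

# Salaries in the window's order (ORDER BY Birthdate DESC), taken from the result above.
salaries = [66000.00, 25525.00, 52500.00, None, 34700.00, 37850.00, 36300.00]
growth = sum_reset_when(salaries)
print(growth)
```

Note that the row after the missing salary does not reset: its previous value is unknown, so the comparison is not TRUE and the sum simply continues from the empty partition.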



Module 19: Summary
A summary of this module is discussed.



Module 19: Summary

• A group window is defined by clauses – UNBOUNDED PRECEDING AND
UNBOUNDED FOLLOWING.
• A cumulative window is defined by the clause – UNBOUNDED
PRECEDING.
• A moving window is defined by a discrete value – UNBOUNDED is not
referenced.
• A remaining window is defined by the clause – UNBOUNDED
FOLLOWING.
• A moving difference function (MDIFF) may be used to obtain a moving
difference.
• A moving difference can also be computed using standard window
aggregates.
• The RESET WHEN can be used to dynamically establish partitioning.



Module 19: Review Questions
Check your understanding of the concepts discussed in this module by completing the review
questions as directed by your instructor.



Module 19: Review Questions

True or False:

1. RANK is considered a window aggregate function.
   True
2. MDIFF can be partitioned even though it’s not ANSI standard.
   True – by using a GROUP BY
3. A remaining window must use ORDER BY.
   False – although an ORDER BY is typically found in one
4. A moving window cannot contain an UNBOUNDED clause.
   True
5. The RESET WHEN feature must be accompanied by an ORDER BY.
   True
6. A moving window cannot contain a FOLLOWING clause.
   False
7. A cumulative COUNT(*) is similar to ranking values.
   True



Module 19: Lab Exercise
Check your understanding of the concepts discussed in this module by completing the lab
exercise as directed by your instructor.



Module 19: Lab Exercise

1. Write a window aggregate that provides a rank of salary amounts for all
employees using a cumulative window.

2. Change #1 to perform the rank within department.

3. Write a moving window aggregate from the daily_sales table that
compares each sales amount to the average of the two preceding
days.





Module 20

RANK

After completing this module, you will be able to:


• Use RANK to rank values.
• Show bottom values and their rank.
• Rank within a partition.
• Contrast differences between ROW_NUMBER and RANK.
• Assign sequence values within a partition.

RANK Page 20-1




Table of Contents
Ranking Values
QUALIFY With no Tied Values
QUALIFY With Tied Ending Values
Qualifying Without Rank Projection
Bottom Values by ASC Rank
Bottom Values by DESC Rank
RANK and PARTITION
ROW_NUMBER
ROW_NUMBER vs. RANK
ROW_NUMBER and PARTITION
ROW_NUMBER and RESET WHEN
Finding Median Values
Module 20: Summary
Module 20: Review Questions
Module 20: Lab Exercise



Ranking Values
We begin this module by reviewing a page from an earlier module that shows how to use a
cumulative count to retrieve a ranking value. Granted, this probably should have been a
COUNT(*), but since this column was defined as NOT NULL, it didn’t matter. The problem,
however, is that RANK would not treat a tie in quite the same fashion as it was treated here.
RANK would have given the tied values the same rank value.

It should be noted that RANK can operate on character data values as well as numeric, though it
is likely a rare instance where this would be needed.



Ranking Values

In the previous module, we saw how a Cumulative Count Window could act as a
method for ranking values.
What we didn't discuss is what happens for tied values.
Here is how the COUNT strategy works when ties occur.

SELECT itemid
,sales
,COUNT(itemid) OVER (ORDER BY sales DESC ROWS UNBOUNDED PRECEDING)
FROM saleshist
WHERE salesdate = date '2008-05-24';

itemid sales Cumulative Count(itemid)
----------- ------------ ------------------------
5 690.00 1
4 562.00 2
8 489.00 3
6 465.00 4
7 449.00 5
2 449.00 6
10 383.00 7
1 375.00 8
3 309.00 9
9 271.00 10



QUALIFY With no Tied Values
There are only a few small items to notice here. First, there are no duplicate values.
Secondly, we are using the keyword RANK as an alias. When aliasing with a keyword, we must
enclose the name in double quotes. Thereafter, we must always refer to it by including the
double quotes. Lastly, we are using a QUALIFY to show the top three values.

As we stated on the previous page, a tied value gets the same rank value, and the result
below shows the correct output for RANK. Notice that this function works quite differently
than the window aggregates that we have seen earlier. Here we do not put a value into the
rank function. To be more precise, we are not allowed to place anything into the function
(i.e. RANK(sales) is not allowed with the window aggregate version of rank). Instead, what
determines the ranking is the ORDER BY used in the OVER portion of the syntax. The default
for RANK is ASC.

SELECT itemid
,sales
,RANK() OVER (ORDER BY sales DESC)
FROM saleshist
WHERE salesdate = date '2008-05-24';

itemid sales Rank(sales)
----------- ------------ -----------
5 690.00 1
4 562.00 2
8 489.00 3
6 465.00 4
2 449.00 5
7 449.00 5
10 383.00 7
1 375.00 8
3 309.00 9
9 271.00 10
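
The difference between the cumulative count and RANK shows up only at ties. The Python sketch below imitates both on the sales values above; the function names rank_desc and cumulative_count_desc are hypothetical and the code is an illustration of the semantics, not of how Teradata evaluates the functions.

```python
def cumulative_count_desc(values):
    """Cumulative count over values sorted descending: every row gets a new number."""
    ordered = sorted(values, reverse=True)
    return [(v, i + 1) for i, v in enumerate(ordered)]

def rank_desc(values):
    """RANK-style numbering: tied values share a rank and the following rank is skipped."""
    ordered = sorted(values, reverse=True)
    ranks = []
    for i, v in enumerate(ordered):
        if i > 0 and v == ordered[i - 1]:
            ranks.append(ranks[-1])   # tie: reuse the previous rank
        else:
            ranks.append(i + 1)       # no tie: rank equals the row position
    return list(zip(ordered, ranks))

sales = [690.00, 562.00, 489.00, 465.00, 449.00, 449.00,
         383.00, 375.00, 309.00, 271.00]
counted = cumulative_count_desc(sales)
ranked = rank_desc(sales)
for (value, count), (_, rank) in zip(counted, ranked):
    print(value, count, rank)
```

For the two rows with 449.00 the count gives 5 and 6, while the rank gives 5 and 5; both continue with 7 for the next row.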



QUALIFY With no Tied Values

Show the top 3 selling items for "2008-05-24"

SELECT itemid
,sales
,RANK( ) OVER (ORDER BY sales DESC) AS "Rank"
FROM saleshist
WHERE salesdate = date '2008-05-24'
QUALIFY "Rank" < 4;

Note that the reference to the alias, in the qualify, must include the double-quotes.

itemid sales Rank
----------- ------------ -----------
5 690.00 1
4 562.00 2
8 489.00 3



QUALIFY With Tied Ending Values
In the example on the facing page, note that 6 rows are returned even though only the top
5 ranking values were requested. You may also have noted that there are two sets of tied
values. Note how the result set skipped from a rank value of “1” to a rank value of “3”. The
next rank value after “5” will be “7”. Another value of “5” in the result (bringing the number
of “5”s to 3) would raise the next ranking value from “7” to “8”.

SELECT itemid
,sales
,RANK() OVER (ORDER BY sales DESC) AS "Rank"
FROM saleshist
WHERE salesdate = date '2008-01-01'
QUALIFY "Rank" < 7;

itemid sales Rank
----------- ------------ -----------
5 690.00 1
4 690.00 1
8 489.00 3
6 465.00 4
2 449.00 5
7 449.00 5



QUALIFY With Tied Ending Values

Show the top 5 selling items for "2008-01-01"

SELECT itemid
,sales
,RANK( ) OVER (ORDER BY sales DESC) AS "Rank"
FROM saleshist
WHERE salesdate = date '2008-01-01'
QUALIFY "Rank" < 6;

itemid sales Rank
----------- ------------ -----------
4 690.00 1
5 690.00 1
8 489.00 3
6 465.00 4
7 449.00 5
2 449.00 5

Because there is a tie for the last (5th ranked) value, there are 2 rows in
the result for that value.

Note that there are 6 ranked values altogether.

The extra rows are only projected when there is a tie for the very last
value.

What will the next ranking value be?



Qualifying Without Rank Projection
The example on the facing page illustrates how one may project a specified number of rows from
the “TOP” of a list (in either ascending or descending fashion) without projecting the ranking
value. This result is quite similar to that retrieved from the “TOP N” feature taught in the
“Teradata Intro to SQL” course.



Qualifying Without a Rank Projection

Show the top 3 selling items for "2008-05-24"

SELECT itemid
,sales
FROM saleshist
WHERE salesdate = date '2008-05-24'
QUALIFY RANK( ) OVER (ORDER BY sales DESC) < 4;

This shows how it is not necessary to project the rank value, and still qualify.

itemid sales
----------- ------------
5 690.00
4 562.00
8 489.00



Bottom Values by ASC Rank
Here we see how to retrieve a number of rows with the lowest values (e.g. 3) by using
QUALIFY. A possible requirement might be to find not only those 3 values, but also their rank
as they would have occurred in descending fashion. For instance, the result might have
been more useful if it showed that the value of 174.00 ranked 160th in descending order,
followed by 183.00 at 159th and 186.00 at 157th. How to retrieve this information is discussed
next.



Bottom Values by ASC Rank

Show the bottom 3 selling items across all dates and all items.

SELECT itemid
,sales
,RANK( ) OVER (ORDER BY sales ASC) AS "Rank"
FROM saleshist
QUALIFY "Rank" < 4;

Here we changed DESC, in the window expression, to ASC.

Now the lowest values are shown.

Unfortunately, we can't see how they ranked in descending order.

itemid sales Rank
----------- ------------ -----------
5 174.00 1
6 183.00 2
5 186.00 3



Bottom Values by DESC Rank
When attempting to retrieve the lowest values along with their bottom ranking values, one
needs to follow the process on the facing page. Think of the process like this:

• Perform the rank according to
RANK() OVER (ORDER BY sales DESC)
• From that result, qualify the rows according to
RANK() OVER (ORDER BY sales ASC)

Perhaps keeping in mind the order of operations makes this easier to understand.

1. WHERE
2. AGGREGATION
3. HAVING
4. OLAP { [ PARTITION BY ] [ ORDER BY ] [ rows ] }
5. QUALIFY [ ORDER BY ]
6. SAMPLE | TOP N
7. ORDER BY
8. FORMAT

Watch what happens if you attempt to shortcut this as shown in the following.

SELECT itemid
,sales
,RANK() OVER (ORDER BY sales DESC) AS "Rank"
FROM saleshist
QUALIFY "Rank" ASC < 4;

QUALIFY "Rank" ASC < 4;
$
*** Failure 3706 Syntax error: expected something between
the 'QUALIFY' key word and the end of the request.



Bottom Values by DESC Rank

Show the bottom 3 selling items across all dates and all items
showing descending rank value.

SELECT itemid
,sales
,RANK( ) OVER (ORDER BY sales DESC) AS "Rank"
FROM saleshist
QUALIFY RANK( ) OVER (ORDER BY sales ASC) < 4;

In the SELECT list, we get the Rank Descending.
We now flip to ascending prior to Qualifying.

Here we changed DESC, in the window expression, to ASC, in the QUALIFY.

Now the actual lowest Rank values are shown.

itemid sales Rank
----------- ------------ -----------
5 174.00 160
6 183.00 159
5 186.00 157
6 186.00 157
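
The two-rank approach can be mimicked in a few lines of Python. This is only a sketch under stated assumptions: the helper name rank and the short sales list are made up for illustration and are not the contents of saleshist. One rank (descending) is projected, while a second rank (ascending) is used purely as the filter.

```python
def rank(values, descending=False):
    """Map each value to its RANK-style position; tied values share the first position."""
    ordered = sorted(values, reverse=descending)
    ranks = {}
    for position, v in enumerate(ordered, start=1):
        ranks.setdefault(v, position)  # keep the first (lowest) position for ties
    return ranks

# Hypothetical sales amounts, including one tie at 186.00.
sales = [690.00, 449.00, 186.00, 174.00, 183.00, 186.00, 562.00]

desc_rank = rank(sales, descending=True)   # the value that is projected
asc_rank = rank(sales)                     # the value that is qualified on

# Keep rows whose ascending rank is below 4, but show the descending rank.
bottom = sorted((v, desc_rank[v]) for v in sales if asc_rank[v] < 4)
for row in bottom:
    print(row)
```

As on the slide, the tie for the third-lowest value brings back four rows rather than three, and the tied rows share the same descending rank.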



RANK and PARTITION
Partitioning a ranked result is no different than partitioning other results. In our example,
however, we are not projecting the ranking value in each partition. This is not necessarily
important since we can visually determine which is 1, 2, and 3 within each partition.



RANK and PARTITION

Show the top 3 selling items for each day for a 4-day period, without
showing their rank value.

SELECT itemid
,salesdate
,sales
FROM saleshist
WHERE salesdate BETWEEN DATE'2008-05-24' AND DATE'2008-05-27'
QUALIFY RANK( ) OVER (PARTITION BY salesdate
                      ORDER BY sales DESC) < 4;

This is based upon what we should already know about using Partition,
only now with Rank.

itemid salesdate sales
----------- --------- ------------
5 08/05/24 690.00
4 08/05/24 690.00
8 08/05/24 489.00
1 08/05/25 549.00
10 08/05/25 522.00
9 08/05/25 474.00
5 08/05/26 729.00
8 08/05/26 629.00
4 08/05/26 548.00
8 08/05/27 729.00
7 08/05/27 674.00
2 08/05/27 586.00



ROW_NUMBER
This function assigns a sequence number to each row of a result set based upon a specified
ordering. This is very much like performing a ranking of data. As a matter of fact, if the data
values referenced by this function are unique, then the results would be the same. If this function
operates on a non-unique column, ROW_NUMBER, unlike RANK, ignores ties and assigns a
different sequence value for each row, tied or not. This function is also like RANK in that it can
operate on character data values as well as numeric.



ROW_NUMBER

The ROW_NUMBER function is an ANSI Standard function that assigns a sequence
number to rows based on the ordering of one or more columns, which may be
either character or numeric.
ROW_NUMBER is ANSI 2003-compliant.

SELECT Last_Name, First_Name, Department_Number,
       ROW_NUMBER( ) OVER (ORDER BY Last_Name)
FROM Employee
WHERE Department_Number IN (401, 501);

As always, note the default heading for ROW_NUMBER.

last_name first_name department_number Row_Number()
----------- -------------- ----------------- ------------
Brown Alan 401 1
Hoover William 401 2
Johnson Darlene 401 3
Machado Albert 401 4
Phillips Charles 401 5
Rabbit Peter 501 6
Ratzlaff Larry 501 7
Rogers Frank 401 8
Runyon Irene 501 9
Trader James 401 10
Wilson Edward 501 11



ROW_NUMBER vs. RANK
The facing page illustrates something about RANK vs. ROW_NUMBER that you should already
know and understand. It is always good to see an example. And don’t be afraid of trying things
to see how they work!



ROW_NUMBER vs. RANK

Unlike RANK, the ROW_NUMBER function disregards any tied values.
If the object of ROW_NUMBER is unique, then it could be used to Rank as well.

SELECT Last_Name,
ROW_NUMBER( ) OVER (ORDER BY Last_Name),
RANK( ) OVER (ORDER BY Last_Name)
FROM Employee
WHERE Department_Number IN (401, 302);

last_name Row_Number() Rank(last_name ASC)
-------------------- ------------ -------------------
Brown 1 1
Hoover 2 2
Johnson 3 3
Machado 4 4
Phillips 5 5
Rogers 6 6
Rogers 7 6
Trader 8 8



ROW_NUMBER and PARTITION
Just like partitioning ranking values, you can partition sequence values as well. The example on
the facing page should help to reinforce the similarities between the two functions.



ROW_NUMBER and PARTITION

As always, PARTITION is available for assigning row numbers.
QUALIFY can be used to control the scope of the result.

SELECT Last_Name (CHAR(15)),
       Department_Number AS Dept#,
       Salary_Amount AS Sal,
       ROW_NUMBER( )
           OVER (PARTITION BY Dept# ORDER BY Sal DESC) AS Row#,
       RANK( )
           OVER (PARTITION BY Dept# ORDER BY Sal DESC) AS Rnk#
FROM Employee
WHERE Department_Number IN (301, 501)
QUALIFY Row# < 4;

last_name Dept# Sal Row# Rnk#
--------------- ----------- ------------ ----------- -----------
Kubic 301 57700.00 1 1
Stein 301 29450.00 2 2
Kanieski 301 29250.00 3 3
Runyon 501 66000.00 1 1
Ratzlaff 501 54000.00 2 2
Wilson 501 53625.00 3 3
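
The "top N per partition" pattern (PARTITION BY plus QUALIFY on the row number) can be sketched in Python as follows. The function name top_n_per_partition and the two extra low-salary rows are assumptions added for illustration; the remaining rows come from the result above.

```python
from collections import defaultdict

def top_n_per_partition(rows, n=3):
    """rows: (name, dept, salary). Number rows per dept by salary DESC, keep the first n."""
    parts = defaultdict(list)
    for row in rows:
        parts[row[1]].append(row)          # PARTITION BY dept
    out = []
    for dept in sorted(parts):
        ordered = sorted(parts[dept], key=lambda r: r[2], reverse=True)  # ORDER BY salary DESC
        for row_number, (name, _, salary) in enumerate(ordered, start=1):
            if row_number <= n:            # same effect as QUALIFY Row# < n + 1
                out.append((name, dept, salary, row_number))
    return out

rows = [
    ("Stein", 301, 29450.00),
    ("Kubic", 301, 57700.00),
    ("Kanieski", 301, 29250.00),
    ("Extra301", 301, 20000.00),   # hypothetical row that should be filtered out
    ("Wilson", 501, 53625.00),
    ("Runyon", 501, 66000.00),
    ("Ratzlaff", 501, 54000.00),
    ("Extra501", 501, 21000.00),   # hypothetical row that should be filtered out
]
top = top_n_per_partition(rows)
for row in top:
    print(row)
```

The row number restarts at 1 for each department, so the filter keeps three rows from each partition rather than three rows overall.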



ROW_NUMBER and RESET WHEN
The use of the RESET WHEN function shown here is very similar to what was illustrated in a
previous module. In this example, however, a “growth” factor is derived to indicate increases
from previous amounts.

SELECT BirthDate,
       MIN(Salary_Amount) OVER (ORDER BY Birthdate DESC
           ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) AS PrevSal,
       Salary_Amount,
       SUM(Salary_Amount) OVER (ORDER BY Birthdate DESC
           RESET WHEN Salary_Amount IS NULL OR Salary_Amount < PrevSal
           ROWS UNBOUNDED PRECEDING) AS Growth
FROM Employee
WHERE birthdate BETWEEN DATE'1976-01-01' AND DATE'1983-01-01';

Result from earlier module:

birthdate PrevSal salary_amount Growth
--------- ------------ ------------- ------------
81/11/10 ? 66000.00 66000.00
80/01/14 66000.00 25525.00 25525.00
79/12/11 25525.00 52500.00 78025.00
79/06/21 52500.00 ? ?
77/07/07 ? 34700.00 34700.00
77/06/19 34700.00 37850.00 72550.00
76/04/23 37850.00 36300.00 36300.00

Vs.

birthdate PrevSal salary_amount Growth
--------- ------------ ------------- -----------
81/11/10 ? 66000.00 1
80/01/14 66000.00 25525.00 1
79/12/11 25525.00 52500.00 2
79/06/21 52500.00 ? 1
77/07/07 ? 34700.00 2
77/06/19 34700.00 37850.00 3
76/04/23 37850.00 36300.00 1



ROW_NUMBER and RESET WHEN

• This feature is an enhancement to the Teradata window aggregate functions.
• RESET WHEN adds dynamic condition handling to the query.
• The RESET WHEN condition is evaluated at run time.
• If the condition is TRUE, a new partition is created for statistical function evaluation.
• RESET WHEN is added to ORDER BY (i.e. ORDER BY is required).

SELECT BirthDate,
MIN(Salary_Amount) OVER (ORDER BY Birthdate DESC
ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) AS PrevSal,
Salary_Amount,
ROW_NUMBER() OVER (ORDER BY Birthdate DESC
RESET WHEN Salary_Amount IS NULL OR Salary_Amount < PrevSal
ROWS UNBOUNDED PRECEDING) AS Growth
FROM Employee
WHERE birthdate BETWEEN DATE'1976-01-01' AND DATE'1983-01-01';

birthdate PrevSal salary_amount Growth
--------- ------------ ------------- -----------
81/11/10 ? 66000.00 1
80/01/14 66000.00 25525.00 1
79/12/11 25525.00 52500.00 2
79/06/21 52500.00 ? 1
77/07/07 ? 34700.00 2
77/06/19 34700.00 37850.00 3
76/04/23 37850.00 36300.00 1



Finding Median Values
As mentioned earlier, there are a great many useful strategies that involve the
ROW_NUMBER function. Here is yet another one, shown because median values
are often sought but there is no median function.



Finding Median Values

For an odd number of rows, the median is the middle number.

SalesAmt rownum rowcount
-------- ------ --------
100.00   1      7
125.30   2      7
127.95   3      7
210.45   4      7
222.22   5      7
300.75   6      7
340.10   7      7

For an even number of rows, the median is one of two numbers.

SalesAmt rownum rowcount
-------- ------ --------
100.00   1      6
125.30   2      6
127.95   3      6
210.45   4      6
222.22   5      6
300.75   6      6

SELECT SalesAmt,
       ROW_NUMBER( ) OVER (ORDER BY SalesAmt) AS rownum,
       COUNT(*) OVER ( ) AS rowcount
FROM saleshist2
QUALIFY rownum = (rowcount + 1)/2     -- Lower median value if even, or middle value if odd.
     OR rownum = (rowcount/2) + 1;    -- Higher median value if even, or middle value if odd.
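
The two QUALIFY conditions rely on integer division. The Python sketch below reproduces the arithmetic with the // operator so that the row positions can be checked by hand; the function name median_rows is hypothetical and the code is meant only to illustrate the technique.

```python
def median_rows(values):
    """Return the middle value (odd count) or the two middle values (even count)."""
    ordered = sorted(values)
    rowcount = len(ordered)
    lower = (rowcount + 1) // 2   # lower median position if even, middle position if odd
    upper = rowcount // 2 + 1     # higher median position if even, middle position if odd
    return [v for rownum, v in enumerate(ordered, start=1)
            if rownum in (lower, upper)]

odd = [100.00, 125.30, 127.95, 210.45, 222.22, 300.75, 340.10]
even = odd[:-1]
print(median_rows(odd))    # 7 rows: positions 4 and 4 coincide
print(median_rows(even))   # 6 rows: positions 3 and 4
```

With 7 rows both expressions give position 4, so one row qualifies; with 6 rows they give positions 3 and 4, so two rows qualify.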



Module 20: Summary
A summary of this module is discussed.



Module 20: Summary

• A RANK function is available for assigning ranking values based on an
ordering.

• RANK is a window aggregate and can reference PARTITION.

• ROW_NUMBER can be used to assign a sequence number to a result
set.



Module 20: Review Questions
Check your understanding of the concepts discussed in this module by completing the review
questions as directed by your instructor.



Module 20: Review Questions

True or False:

1. When performing a windowed rank, an ORDER BY is not required.
   False
2. The ROW_NUMBER function can work on character or numeric data
values.
   True
3. RANK can work on only numeric data values.
   False – it works on character data as well, in both the ANSI and non-ANSI forms
4. RANK cannot reference PARTITION.
   False
5. The default sorting order for a windowed rank is ascending.
   True
6. Sorting on multiple columns may result in a rank value of 1 for each row.
   False – don’t confuse ORDER BY with PARTITION BY
7. The following is valid within a SELECT
RANK(sales) OVER (ORDER BY Col1)
   False – there can be no reference within the parentheses



Module 20: Lab Exercise
Check your understanding of the concepts discussed in this module by completing the lab
exercise as directed by your instructor.



Module 20: Lab Exercise

1. Retrieve a ranking of salary amounts from the employee table. Next
change the rank to partition by department number.

2. Rank all sales from the “salestbl” showing only the bottom three sales
amounts along with their bottom ranking values.





Module 21

QUANTILE

After completing this module, you will be able to:

• Use the QUANTILE function to display percentiles, quartiles,
deciles, etc.
• Use QUALIFY to eliminate rows from the initial result.
• Distinguish between uses for GROUP BY of QUANTILE vs.
Aggregation.

QUANTILE and WIDTH_BUCKET Page 21-1




Table of Contents
QUANTILE
QUANTILE and QUALIFY
QUANTILE with no Projected Value
Aggregation and QUANTILE
OLAP vs. Window Aggregates
QUANTILE and GROUP BY
Varying a QUANTILE
Ordering a QUANTILE
Module 21: Summary
Module 21: Review Questions
Module 21: Lab Exercise



QUANTILE
A quantile is a generic interval of user-defined width. For example, percentiles divide data
amongst 100 evenly spaced intervals, deciles among 10 evenly spaced intervals, quartiles among
4, and so on. A quantile score indicates the fraction of rows having a sort_expression value
lower than the current value. For example, a percentile score of 98 means that 98 percent of
the rows in the list have a sort_expression value lower than the current value.

Although the use of QUANTILE is discouraged in favor of deriving the result with
known ANSI functionality, the function is easier to use since it can intuitively perform some of
the more complex variations of this capability, and with less syntax. It is a Teradata extension to
the ANSI SQL-2003 standard and is retained only for backward compatibility with existing
applications.

To compute QUANTILE(q, s) using ANSI window functions, use the following:

(RANK() OVER (ORDER BY s) - 1) * q / COUNT(*) OVER()

Example – This non-ANSI syntax, which produces a “decile” result:

SELECT salary_amount, QUANTILE(10, salary_amount) AS Quant
FROM employee;

Can be written with this ANSI syntax:

SELECT salary_amount,
       (RANK() OVER (ORDER BY Salary_Amount) - 1) * 10 / COUNT(*) OVER()
       AS Quant
FROM employee;

The result of both is this:
salary_amount Quant
------------- -----------
52500.00 7
53625.00 7
54000.00 8
56500.00 8
57700.00 8
66000.00 9
100000.00 9
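
The ANSI expression can be checked by hand with a short Python sketch. The function name quantile is hypothetical, and the sketch assumes that the division in the SQL expression is integer division (for non-negative operands, Python's // gives the same truncated result). The sample salaries are the seven department 401 values used elsewhere in this module, with q = 100 to produce percentiles.

```python
def quantile(values, q):
    """Compute (RANK - 1) * q / COUNT for each value, using integer division."""
    ordered = sorted(values)
    count = len(ordered)
    first_pos = {}
    for pos, v in enumerate(ordered, start=1):
        first_pos.setdefault(v, pos)   # RANK: ties share the first position
    return [(v, (first_pos[v] - 1) * q // count) for v in ordered]

salaries = [24500, 25525, 32300, 36300, 37850, 43100, 46000]
percentiles = quantile(salaries, 100)
for salary, score in percentiles:
    print(salary, score)
```

With 7 rows and q = 100, the scores step by 100/7, truncated, which gives 0, 14, 28, 42, 57, 71 and 85.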



QUANTILE

Quantile is used to divide rows into a number of evenly spaced intervals:

QUANTILE (n, colname)
where n (width) can be any integer

Show the salaries for department 401 and their percentile.

SELECT employee_number, salary_amount
,QUANTILE (100, salary_amount) AS Quant
FROM employee
WHERE department_number = 401;

Since “n” is 100, this is an example of using the Quantile function to derive
percentiles.

employee_number salary_amount Quant
--------------- ------------- -----------
1013 24500.00 0
1001 25525.00 14
1022 32300.00 28
1004 36300.00 42
1003 37850.00 57
1002 43100.00 71
1010 46000.00 85

A percentile is a number between 0 and 99.
Since there are only 7 rows returned, there are not enough to fill
each of the 100 percentiles.
The percentiles are then evenly spaced between 0 and 99.
Each is about 14 points apart.



QUANTILE and QUALIFY
Although the Quantile function is a Teradata extension to ANSI, it may be used in conjunction
with QUALIFY, as shown on the facing page. Because QUANTILE uses equal-width
histograms to partition the specified data, it does not partition the data equally using equal-height
histograms. In other words, do not expect equal row counts per specified quantile. Expect
empty quantile histograms when, for example, duplicate values for sort expression are found in
the data.



QUANTILE and QUALIFY

Quantile is an OLAP function and can be used with QUALIFY to return qualified
result sets.

Show the salaries for those employees in department 401 whose salary is in the top
20th percentile for that department.

SELECT employee_number, salary_amount
,QUANTILE (100, salary_amount) AS Quant
FROM employee
WHERE department_number = 401
QUALIFY Quant > 80;

employee_number salary_amount Quant
--------------- ------------- -----------
1010 46000.00 85



QUANTILE with no Projected Value
Just as with the other window aggregate functions, when qualifying a quantile result, you may
elect to project the quantile value or not. In our example, we elected to project the qualified
result without projecting the quantile value.



QUANTILE with no Projected Value

As with the other functions, you need not project the quantile value when using QUALIFY.

Show the salaries for those employees in department 401 whose salary is in the top
20th percentile for that department.

SELECT employee_number, salary_amount
FROM employee
WHERE department_number = 401
QUALIFY QUANTILE (100, salary_amount) > 80;

employee_number salary_amount
--------------- -------------
1010 46000.00



Aggregation and QUANTILE
Although the QUANTILE function is an OLAP function, it is not considered to be a Window
Aggregate function. OLAP is a broad category that includes window aggregates. QUANTILE is,
however, mutually exclusive to standard aggregate functions. This will be shown on the next page.



Aggregation and QUANTILE

As an OLAP function, use of QUANTILE falls into steps 4 and 5 in the order of
operations, but is mutually exclusive to aggregation.

1. WHERE
2. AGGREGATION
3. HAVING
4. OLAP { [ PARTITION BY ] [ ORDER BY ] [ rows ] }
5. QUALIFY [ ORDER BY ]
6. SAMPLE | TOP N
7. ORDER BY
8. FORMAT

Get the sum of salaries of the top 25% of the company.

SELECT SUM(sals)
FROM
(SELECT salary_amount
 FROM employee
 QUALIFY QUANTILE(100, salary_amount) >= 75)
AS temp(sals);

The quantile result is placed into a derived table.
The sum is then performed on the quantile result.

Sum(sals)
------------
387825.00



OLAP vs. Window Aggregates
The facing page shows the use of QUANTILE to be mutually exclusive to aggregation. This is
also true with respect to ANSI Window Aggregates.

Note that the following request fails:

SELECT department_number,
       SUM(salary_amount) OVER () AS Sumsal,
       QUANTILE(100, sumsal)
FROM employee;

*** Failure 5480 Ordered Analytical Functions cannot be nested.



OLAP vs. Window Aggregates

Trying to get the previous result in the same manner as a Window Aggregate will fail.
QUANTILE is a Teradata extension that is mutually exclusive to both Window Aggregates
and standard aggregation.

Get the sum of salaries of the top 25% of the company.

SELECT SUM(Quant), QUANTILE (100, salary_amount) AS Quant
FROM employee
QUALIFY QUANTILE(100, salary_amount) >= 75;

*** Failure 5478 Aggregates are allowed only with Window Functions.



QUANTILE and GROUP BY
QUANTILE is mutually exclusive to both window aggregates and standard aggregates largely
because it uses GROUP BY to partition. This use of GROUP BY conflicts with standard
aggregation and is not consistent with how partitioning is done for window aggregates.
The other Teradata extensions use GROUP BY in the same way. That is to say, MAVG, CSUM,
MSUM, RANK(col) (as opposed to RANK() OVER) and MDIFF may all use GROUP BY for
partitioning, just as QUANTILE does. Note, however, that these functions are all mutually
exclusive to the same things as QUANTILE.



QUANTILE and GROUP BY

Unlike Window Aggregates, Quantile is a non-ANSI function that uses the facilities
of GROUP BY for Partitioning.

Get the employees who represent the top 25% of their respective department by salary.

SELECT Department_Number,
       Salary_Amount,
       QUANTILE (100, salary_amount) AS Quant
FROM employee
WHERE Department_Number IN (401, 403, 501)
QUALIFY QUANTILE(100, salary_amount) >= 75
GROUP BY Department_Number;

• Note that the GROUP BY need not reference all non-aggregates as with aggregation.
• GROUP BY operates as does PARTITION with Window Aggregates.
• This makes Quantile mutually exclusive to aggregation.

department_number  salary_amount        Quant
-----------------  -------------  -----------
              401       46000.00           85
              403       49700.00           83
              501       66000.00           75
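
For comparison, the partitioning that QUANTILE obtains through GROUP BY can be sketched with ANSI window functions, where PARTITION BY plays the same role. The request below is an illustration only: it assumes that RANK and COUNT both return integers (so that the division truncates), and it has not been checked against the course tables.

```sql
-- Sketch: per-department percentile written with ANSI window functions.
-- PARTITION BY takes the place of the GROUP BY used with QUANTILE.
SELECT department_number
     , salary_amount
     , (RANK() OVER (PARTITION BY department_number
                     ORDER BY salary_amount ASC) - 1) * 100
       / COUNT(*) OVER (PARTITION BY department_number) AS Quant
FROM employee
WHERE department_number IN (401, 403, 501)
QUALIFY Quant >= 75;
```

Because this form contains no GROUP BY, it does not run into the conflict with standard aggregation syntax described for QUANTILE.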



Varying a QUANTILE
Two more examples of the use of QUANTILE are illustrated on the facing page. There are, of
course, many others that may be performed depending upon need.



Varying a QUANTILE

To retrieve deciles, use 10 as the quantile.
This will return a number between 0 and 9.

SELECT salary_amount, QUANTILE(10, salary_amount) AS Quant
FROM employee
QUALIFY Quant >= 7;

salary_amount        Quant
-------------  -----------
     52500.00            7
     53625.00            7
     54000.00            8
     56500.00            8
     57700.00            8
     66000.00            9
    100000.00            9

To retrieve quartiles, use 4 as the quantile.
This will return a number between 0 and 3.

SELECT salary_amount, QUANTILE(4, salary_amount) AS Quant
FROM employee
QUALIFY Quant = 3;

salary_amount        Quant
-------------  -----------
     53625.00            3
     54000.00            3
     56500.00            3
     57700.00            3
     66000.00            3
    100000.00            3



Ordering a QUANTILE
As the facing page illustrates, the default ordering for any use of QUANTILE is descending.
However, the displayed order is ascending on the quantile value. In other words, there is a
distinct difference between computational order and displayed order. The computational
order is the relationship of the quantile value with the object of the function itself (as the
quantile value descends, so does the object of the quantile). The displayed order is how the
result is sorted for reporting (ascending on quantile value).

To order on the quantile value descending, use an ORDER BY as shown.

SELECT employee_number AS Emp#,
       salary_amount AS SalAmt,
       QUANTILE (100, salary_amount) AS Quant
FROM employee
QUALIFY QUANTILE(100, salary_amount) < 25
ORDER BY 3 DESC;

Emp# SalAmt Quant
----------- ------------ -----------
1009 31000.00 23
1006 29450.00 19
1008 29250.00 15
1023 26500.00 11
1001 25525.00 7
1014 24500.00 0
1013 24500.00 0
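
The quantile values above are consistent with a rank-based formula: rank the rows in ascending order of the sort column, subtract one, multiply by the number of quantiles, and divide by the row count. The request below is a sketch of that rewrite using ANSI window functions. It is offered as an illustration rather than as a description of how QUANTILE is implemented, and it assumes that RANK and COUNT both return integers so that the division truncates.

```sql
-- Sketch: ANSI-style equivalent of QUANTILE(100, salary_amount).
-- Assumes integer arithmetic, so the division truncates.
SELECT employee_number AS Emp#
     , salary_amount   AS SalAmt
     , (RANK() OVER (ORDER BY salary_amount ASC) - 1) * 100
       / COUNT(*) OVER () AS Quant
FROM employee
QUALIFY Quant < 25
ORDER BY 3 DESC;
```

Tied salaries receive the same rank and therefore the same quantile, which matches employees 1014 and 1013 in the output above. For instance, if the employee table holds 26 rows (an inference from the output, not a figure stated in this module), the row ranked third gets (3 - 1) * 100 / 26 = 7 after truncation, as shown for employee 1001.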



Ordering a QUANTILE

Show all employees in lowest 25th percentile of salaries:

SELECT employee_number AS Emp#,
       salary_amount AS SalAmt,
       QUANTILE (100, salary_amount) AS Quant
FROM employee
QUALIFY QUANTILE(100, salary_amount) < 25;

  Emp#     SalAmt   Quant
------  ---------  ------
  1014   24500.00       0
  1013   24500.00       0
  1001   25525.00       7
  1023   26500.00      11
  1008   29250.00      15
  1006   29450.00      19
  1009   31000.00      23

Note that Emp# 1014 and 1013 share the same quantile.

• Reporting output default is ascending on quantile.
• Default for object of quantile: Column descends as quantile descends.
• Explicit ORDER BY may be used to reorder results.

Same as:

SELECT employee_number AS Emp#,
       salary_amount AS SalAmt,
       QUANTILE (100, salary_amount DESC) AS Quant
FROM employee
QUALIFY QUANTILE(100, salary_amount) < 25;



Module 21: Summary
A summary of this module is discussed.



Module 21: Summary

• The QUANTILE function can be used to return a percentile, decile, quartile, etc.
• QUALIFY can be used to eliminate quantile result rows.
• GROUP BY can be used to partition quantile result rows.



Module 21: Review Questions
Check your understanding of the concepts discussed in this module by completing the review
questions as directed by your instructor.



Module 21: Review Questions

True or False:

1. The QUANTILE function is considered to be a Window Aggregate.
False – They are, however, both OLAP features

2. QUANTILE and GROUP BY are mutually exclusive.
False – QUANTILE uses GROUP BY for partitioning; it is standard aggregation that is excluded

3. QUANTILE and QUALIFY are mutually exclusive.
False

4. QUALIFY and GROUP BY are mutually exclusive.
False



Module 21: Lab Exercise
Check your understanding of the concepts discussed in this module by completing the lab
exercise as directed by your instructor.



Module 21: Lab Exercise

1. Display tertiles (values from 0 to 2) of employee salary amounts. Then return deciles
(0 to 9) followed by percentiles (0 to 99). Do each separately and note how the rows per
quantile thin out as the number of quantiles increases. Also note how the number of rows
per quantile value changes from the lower values to the higher ones.

2. Repeat #1 by performing all in a single projection.





Module 22

Extended Grouping Functions

After completing this module, you will be able to:

• Use ROLLUP to obtain summary information.
• Use CUBE to obtain more summary information.
• Use GROUPING SETS to obtain even more summary information.
• Use GROUPING to discern between a null total and a grand total.
• Create various kinds of groupings through the use of parentheses.

Extended Grouping Functions Page 22-1




Table of Contents
Extended Grouping Functions
Aggregation Review
ROLLUP
Two-Level Rollup
Switching Rollup Column Order
Null Group vs. Total
The GROUPING Function
CUBE vs. ROLLUP
CUBE Result
CUBE and GROUPING Function
The GROUPING SETS Function
Adding Grand Totals
Combining Grouping Sets
Module 22: Summary
Module 22: Review Questions
Module 22: Lab Exercise



Extended Grouping Functions
So far we have seen some variations on aggregate constructs, each of them using the same
functions, namely AVG, SUM, MIN, MAX, and COUNT.

• Standard Aggregation – Summarize and lose detail.
• Window Aggregates – Summarize but retain detail.

“Extended Grouping” functions share the same five functions and also summarize data but,
like standard aggregation, they lose detail. They are, therefore, more closely related to standard
aggregation than are window aggregates. For instance, for these functions all selected
non-aggregates must be part of the associated grouping construct. We say “grouping construct”
(instead of GROUP BY) because an “extended grouping” feature extends the capabilities of
GROUP BY by adding more to the syntax, so that it looks like the following:

• GROUP BY ROLLUP
• GROUP BY CUBE
• GROUP BY GROUPING SETS



Extended Grouping Functions

Extended Grouping Functions are aptly named in that:

• They “extend” the capabilities of aggregate functionality by performing levels
  of summarization unavailable to the typical flavor of aggregation.
• They are similar to standard aggregation in that they result in a loss of row detail due
  to grouping by all non-aggregate columns.

There are 3 basic flavors of Extended Grouping functions:

ROLLUP
  Creates a hierarchy of results along a single dimension such as time (e.g. year,
  month, week) or geography (country, state, province).
CUBE
  Supports multidimensional processing such as product by time by geography.
GROUPING SETS
  Allows aggregations in single or multiple dimensions. It is to this functionality as
  CASE is to NULLIF and COALESCE: the general form behind the shorthand forms.

A GROUPING function is available with any of these functions to identify when
a group represents a grand total versus a null group.



Aggregation Review
We need to take a slight step backward before moving forward. The facing page illustrates how
standard aggregation summarizes data in a way that results in a loss of detailed information.
Notice the obvious:
• We use GROUP BY to reference any and all non-aggregate columns.
• We are grouping by, in this case, a positional reference - the number 1 means that we are
grouping by the first projected column, which is a non-aggregate.

Suppose that we would like to also obtain a grand total for the 5 departments shown on the
facing page? The extended grouping functionality would allow us to obtain this in addition to
the data shown.

In another example, suppose that we would like to further summarize the following, which
retrieves a summary of salary amount by manager and department, so that it also includes a sum
by manager, or a sum by department, or both – and with grand totals? Extended grouping
functions can be used to do this as well.

SELECT department_number AS Dept#,
       manager_employee_number AS Mgr#,
       SUM(salary_amount) AS SumSal
FROM employee
WHERE department_number < 402
GROUP BY 1, 2
ORDER BY 1, 2;

Dept# Mgr# SumSal
----------- ----------- ------------
100 801 100000.00
201 801 34700.00
201 1025 38750.00
301 801 57700.00
301 1019 58700.00
302 801 56500.00
401 801 37850.00
401 1003 207725.00



Aggregation Review

Produce a total of salaries by department for department numbers less than 402.

SELECT department_number AS Dept#,
       SUM(salary_amount) AS SumSal
FROM employee
WHERE department_number < 402
GROUP BY 1
ORDER BY 1;

      Dept#        SumSal
-----------  ------------
        100     100000.00
        201      73450.00
        301     116400.00
        302      56500.00
        401     245575.00

The first two flavors of extended grouping functions we shall discuss return the result
of the GROUP BY. In addition to this, they have “extended” summarization (i.e., grouping)
capabilities.

These 2 functions are named:
• ROLLUP
• CUBE



ROLLUP
Our first example is a rollup of all department salary sums into a grand total. Except for the use
of the key word “ROLLUP”, this request would look exactly like (and perform like) a standard
aggregation. The ROLLUP key word is the only differentiator. That should help to explain what
is meant by an “extended grouping function.”
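
One way to picture what ROLLUP adds is to write the same result as two standard aggregations combined: the ordinary GROUP BY, plus one ungrouped total. The sketch below assumes that department_number is an INTEGER column (hence the CAST); it is meant to illustrate the idea, not to suggest how the database evaluates ROLLUP.

```sql
-- Detail groups: the same rows a plain GROUP BY returns.
SELECT department_number AS DeptNum
     , SUM(salary_amount) AS SumSal
FROM employee
WHERE department_number < 402
GROUP BY 1
UNION ALL
-- Total line: no GROUP BY, so a single row with a null department.
SELECT CAST(NULL AS INTEGER)
     , SUM(salary_amount)
FROM employee
WHERE department_number < 402
ORDER BY 1;
```

GROUP BY ROLLUP (department_number) asks for both parts in a single request.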

The question mark represents a total line (in this case for all departments). If we did not know
that we were doing a rollup of salaries, the “?” could be mistaken for the sum of salaries for a
null department. Later we will look at ways of changing this value to something more
descriptive.



ROLLUP

The ROLLUP function is used when aggregation is desired across all levels of a
hierarchy within a single dimension (with rollup, this single dimension has to do
with the aspect of “direction”).
Note, below, that we have an extended grouping capability – “GROUP BY ROLLUP”.
As in aggregation, “All selected non-aggregates must be part of the associated group.”

SELECT department_number AS DeptNum
      ,SUM(salary_amount) AS SumSal
FROM employee
WHERE department_number < 402
GROUP BY ROLLUP (department_number)
ORDER BY 1;

    DeptNum        SumSal
-----------  ------------
          ?     591925.00   Total sum of all groups is added.
        100     100000.00
        201      73450.00
        301     116400.00   Result of normal aggregation (i.e., GROUP BY)
        302      56500.00
        401     245575.00

Note that the '?' does not represent a null department number, it represents the 'total'
of all department salaries. The GROUPING function (discussed later) will allow us to
differentiate the two.



Two-Level Rollup
Contrast the query on the facing page to the query below. The following query returns only the
result of the normal group by. The only difference between the two queries is the following
expression, found in the query on the facing page:

GROUP BY ROLLUP (manager_employee_number, department_number)

The query below is that of one performing standard aggregation. In both queries, all selected
non-aggregates must be part of the associated group! And yes, the parentheses are valid, though
rarely used.

SELECT manager_employee_number AS Mgr,
       department_number AS Dept,
       SUM(salary_amount) AS SumSal
FROM employee
WHERE department_number < 402
GROUP BY (manager_employee_number, department_number)
ORDER BY 1,2;

Mgr Dept SumSal
----------- ----------- ------------
801 100 100000.00
801 201 34700.00
801 301 57700.00
801 302 56500.00
801 401 37850.00
1003 401 207725.00
1019 301 58700.00
1025 201 38750.00



Two-Level Rollup

Use ROLLUP to produce a hierarchy of total salaries by department within manager.

SELECT manager_employee_number AS Mgr, department_number AS Dept,
       SUM(salary_amount) AS SumSal
FROM employee
WHERE department_number < 402
GROUP BY ROLLUP (manager_employee_number, department_number)
ORDER BY 1,2;

This rolls up departments into managers.

        Mgr         Dept        SumSal
-----------  -----------  ------------
          ?            ?     591925.00   Grand Total Line
        801            ?     230250.00   Manager Total (rollup sum)
        801          100     100000.00
        801          201      34700.00
        801          301      57700.00
        801          401      37850.00
       1003            ?     207725.00   Manager Total (rollup sum)
       1003          401     207725.00
       1019            ?     115200.00   Manager Total (rollup sum)
       1019          301      58700.00
       1019          302      56500.00
       1025            ?      38750.00   Manager Total (rollup sum)
       1025          201      38750.00

The rows that show both a manager and a department are the result of the normal GROUP BY.
The grouping function (later) will separate null departments from manager totals.



Switching Rollup Column Order
The direction of a rollup is an important issue. The rollup works right-to-left. The previous
rollup was (manager_employee_number, department_number), which meant, from right-to-left,
“rollup into manager totals”. The facing page reverses the column order and therefore rolls up
into department totals instead. Grand totals also get included when using rollup. So,
basically, the previous rollup returned:
• GROUP BY totals
• Manager totals
• Grand total
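
The three kinds of rows listed above can be written out explicitly with GROUPING SETS, which is covered later in this module. The following sketch spells out the grouping sets that the previous two-column ROLLUP stands for; the comments label each set, and the request is an illustration that has not been run against the course tables.

```sql
-- ROLLUP (manager_employee_number, department_number) written out
-- as the grouping sets it stands for, from most to least detailed.
SELECT manager_employee_number AS Mgr
     , department_number       AS Dept
     , SUM(salary_amount)      AS SumSal
FROM employee
WHERE department_number < 402
GROUP BY GROUPING SETS
      ( (manager_employee_number, department_number)  -- GROUP BY totals
      , (manager_employee_number)                     -- manager totals
      , ()                                            -- grand total
      )
ORDER BY 1, 2;
```

Each step drops the rightmost remaining column, which is what "right-to-left" means here. Reversing the column order in the ROLLUP changes the middle set to (department_number), giving department totals instead of manager totals.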



Switching Rollup Column Order

Use ROLLUP to produce a hierarchy of total salaries by manager within department.

SELECT manager_employee_number AS Mgr, department_number AS Dept,
       SUM(salary_amount) AS SumSal
FROM employee
WHERE department_number < 402
GROUP BY ROLLUP (department_number, manager_employee_number)
ORDER BY 1,2;

This rolls up managers into departments.

        Mgr         Dept        SumSal
-----------  -----------  ------------
          ?            ?     591925.00   Grand Total
          ?          100     100000.00   Department Total
          ?          201      73450.00   Department Total
          ?          301     116400.00   Department Total
          ?          302      56500.00   Department Total
          ?          401     245575.00   Department Total
        801          100     100000.00   GROUP BY Total
        801          201      34700.00   GROUP BY Total
        801          301      57700.00   GROUP BY Total
        801          401      37850.00   GROUP BY Total
       1003          401     207725.00   GROUP BY Total
       1019          301      58700.00   GROUP BY Total
       1019          302      56500.00   GROUP BY Total
       1025          201      38750.00   GROUP BY Total



Null Group vs. Total
Distinguishing between a null group and a total line can become a necessity at times. This can
often be done quite easily by eye; however, determining the difference “visually” isn’t usually
acceptable. What is needed is a mechanism that makes the distinction conclusively.
Luckily, there is such a mechanism, and it will be discussed next.



Null Group vs. Total

Sometimes a situation arises where there are both null groups and total lines.
Distinguishing between the two could become necessary.

SELECT department_number,
       SUM(salary_amount)
FROM employee
GROUP BY ROLLUP (department_number)
ORDER BY 1;

department_number  Sum(salary_amount)
-----------------  ------------------
                ?           129950.00
                ?           948050.00
              301            58700.00
              401           213275.00
              402            52500.00
              403           193500.00
              501           200125.00
              999           100000.00

In this example you can determine which is which by process of elimination.



The GROUPING Function
To distinguish between null groups and total lines one may use the GROUPING function as
illustrated on the facing page. Although the example is a simple one (a more complex example
will be shown later), it is important for one to fully understand the sorting sequence when using
this functionality.

Note:
• That the heading of “Deptno” is left justified, signifying a character field.
• That, within this character field, there are both numeric (right-justified) and character
(left-justified) values.
• That numbers sort before characters.

As one gets to work with this functionality one will begin to see the attention to detail that is
required to carefully choose values that sort in a logical manner, so that, for instance, grand total
lines sort last, or that (seen later) subtotal lines sort IMMEDIATELY after the group they are
subtotaling.
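
An alternative to choosing label values that happen to sort well is to project the GROUPING result itself and sort on it. The sketch below is one possible arrangement, not the course's own solution; the alias IsTotal is introduced here purely for illustration.

```sql
-- Sketch: use the GROUPING flag as the leading sort key so that
-- the rollup total (flag = 1) sorts after every data row (flag = 0).
SELECT department_number
     , GROUPING(department_number) AS IsTotal  -- 1 only on the total row
     , SUM(salary_amount)          AS SumSal
FROM employee
GROUP BY ROLLUP (department_number)
ORDER BY 2, 1;
```

The flag also tells the two null rows apart without relying on their position: the row for a department number that is really null in the data carries 0, and the total line carries 1.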



The GROUPING Function

The GROUPING function distinguishes rows with nulls from rows with aggregates.
GROUPING returns:
• A “0” (zero) if the value in the column comes from the actual data, including a null
  that is really in the data.
• A “1” (one) if it represents a total or subtotal value for the column.

SELECT CASE GROUPING (department_number)
          WHEN 1 THEN 'Total'
          ELSE COALESCE(department_number, 'Null Dept')
       END AS Deptno,
       SUM(salary_amount)
FROM employee
GROUP BY ROLLUP (department_number)
ORDER BY 1;

Deptno       Sum(salary_amount)
-----------  ------------------
        301            58700.00
        401           213275.00
        402            52500.00
        403           193500.00
        501           200125.00
        999           100000.00
Null Dept             129950.00
Total                 948050.00

In the column “deptno”:
• Character values are left-justified in the column space.
• Numeric values are right-justified in the column space.
• Numeric values sort first.



CUBE vs. ROLLUP
As discussed on the facing page, a Cube:
• Provides the results of a standard aggregation (GROUP BY)
• Provides all rollup combinations.
• Provides a grand total.

The result set for this query is provided on the following page.
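
The three items above can also be read as a list of grouping sets. As a sketch (using GROUPING SETS, which is covered later in this module, and not verified against the course tables), a two-column CUBE corresponds to every combination of its columns:

```sql
-- CUBE (manager_employee_number, department_number) written out
-- as the grouping sets it stands for.
SELECT manager_employee_number AS Mgr
     , department_number       AS Dept
     , SUM(salary_amount)      AS SumSal
FROM employee
WHERE department_number < 402
GROUP BY GROUPING SETS
      ( (manager_employee_number, department_number)  -- standard aggregation
      , (manager_employee_number)                     -- manager totals
      , (department_number)                           -- department totals
      , ()                                            -- grand total
      )
ORDER BY 1, 2;
```

Compared with a two-column ROLLUP, the only addition is the set for the second column on its own, here (department_number).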



CUBE vs. ROLLUP

Where a ROLLUP summarizes from right-to-left -

SELECT manager_employee_number AS Mgr, department_number AS Dept,
       SUM(salary_amount) AS SumSal
FROM employee
WHERE department_number < 402
GROUP BY ROLLUP (manager_employee_number, department_number)
ORDER BY 1,2;

(Rollup departments into managers.)

A CUBE summarizes from right-to-left and from left-to-right -

SELECT manager_employee_number AS Mgr, department_number AS Dept,
       SUM(salary_amount) AS SumSal
FROM employee
WHERE department_number < 402
GROUP BY CUBE (manager_employee_number, department_number)
ORDER BY 1,2;

(Rollup departments into managers, and rollup managers into departments.)

Both produce a grand total.



CUBE Result
A CUBE result is like getting rollups of all levels for the columns in the group. In other words,
the following rollup returns everything but the “Dept Totals” portion shown on the facing
page.

SELECT manager_employee_number AS Mgr,
       department_number AS Dept,
       SUM(salary_amount) AS SumSal
FROM employee
WHERE department_number < 402
GROUP BY ROLLUP (mgr, dept)
ORDER BY 1,2;

By changing ROLLUP to CUBE, we add the extra rollup level of “Dept Totals”, thus providing all
rollups for the group, including the normal aggregate result, and including the grand total.

Here is what happens to a CUBE of one level. The result should look similar to the one
shown earlier for a ROLLUP of one level.

SELECT manager_employee_number AS Mgr,
       SUM(salary_amount) AS SumSal
FROM employee
WHERE department_number < 402
GROUP BY CUBE (mgr)
ORDER BY 1;

Mgr SumSal
----------- ------------
? 271975.00
801 37850.00
1003 175425.00
1019 58700.00



CUBE Result

This CUBE produces:
• GROUP BY results.
• Manager result totals.
• Department result totals.
• Grand totals.

SELECT manager_employee_number AS Mgr,
       department_number AS Dept,
       SUM(salary_amount) AS SumSal
FROM employee
WHERE department_number < 402
GROUP BY CUBE (mgr, dept)
ORDER BY 1,2;

        Mgr         Dept        SumSal
-----------  -----------  ------------
          ?            ?     475525.00   Grand Total
          ?          100     100000.00   Dept Total
          ?          201      73450.00   Dept Total
          ?          302      56500.00   Dept Total
          ?          401     245575.00   Dept Total
        801            ?     172550.00   Mgr Total
        801          100     100000.00   GROUP BY Total
        801          201      34700.00   GROUP BY Total
        801          401      37850.00   GROUP BY Total
       1003            ?     207725.00   Mgr Total
       1003          401     207725.00   GROUP BY Total
       1019            ?      56500.00   Mgr Total
       1019          302      56500.00   GROUP BY Total
       1025            ?      38750.00   Mgr Total
       1025          201      38750.00   GROUP BY Total



CUBE and GROUPING Function
The facing page shows an example of the grouping function for a cube.



CUBE and GROUPING Function

SELECT CASE GROUPING(manager_employee_number)
          WHEN 1 THEN 'All Mgrs'
          ELSE COALESCE(manager_employee_number, 'null mgr')
       END AS Mgr,
       CASE GROUPING(department_number)
          WHEN 1 THEN 'Total Depts'
          ELSE COALESCE(department_number, 'Null Dept')
       END AS dept,
       SUM(salary_amount) AS SumSal
FROM employee
WHERE department_number IN (401, 402, 501)
GROUP BY CUBE (manager_employee_number, department_number)
ORDER BY 1, 2;

Mgr          dept               SumSal
-----------  -----------  ------------
        801          401      37850.00
        801          402      52500.00
        801          501      66000.00
        801  Total Depts     156350.00
       1003          401     207725.00
       1003  Total Depts     207725.00
       1011          402      24500.00
       1011  Total Depts      24500.00
       1017          501     134125.00
       1017  Total Depts     134125.00
All Mgrs             401     245575.00
All Mgrs             402      77000.00
All Mgrs             501     200125.00
All Mgrs     Total Depts     522700.00

When using the GROUPING function, you must use the column names in both the
GROUP BY CUBE and GROUP BY ROLLUP – i.e., no positional values nor aliases are
allowed.



The GROUPING SETS Function
The GROUPING SETS function is the “all purpose” extended grouping function in that one can
use it to do everything shown earlier, plus operations that ROLLUP or CUBE may not be able
to perform. What differentiates this function from the others is that we tell the database which
totals we wish. So much so that it won’t even return the normal aggregate result, nor will it
return a grand total, unless we ask for it specifically.

For the query on the facing page, we are requesting sums for only department and manager. We
will not return department-manager sums nor will we return a grand total.



The GROUPING SETS Function

Use GROUPING SETS to produce a report showing employee salaries aggregated
by manager and also by department for department numbers less than 402.

SELECT department_number AS deptnum
      ,manager_employee_number AS manager
      ,SUM(salary_amount) AS Sumsal
FROM employee
WHERE department_number < 402
GROUP BY GROUPING SETS (department_number, manager_employee_number)
ORDER BY 1,2;

    deptnum      manager        Sumsal
-----------  -----------  ------------
          ?          801     286750.00
          ?         1003     207725.00
          ?         1019      58700.00
          ?         1025      38750.00
        100            ?     100000.00
        201            ?      73450.00
        301            ?     116400.00
        302            ?      56500.00
        401            ?     245575.00

GROUPING SETS returns only sums for the requested groups.
In our example we want only manager sums and department sums.
As always, all non-aggregates must be included somewhere in the grouping.



Adding Grand Totals
To add a grand total to the previous result, we use a set of open and close parentheses. This is
the notation for a grand total.



Adding Grand Totals

Use GROUPING SETS to make a report of employee salaries aggregated by manager
and by department for employees in departments less than 402.
In addition, show the grand total of all salaries.

SELECT department_number AS deptnum
      ,manager_employee_number AS manager
      ,SUM(salary_amount) AS Sumsal
FROM employee
WHERE department_number < 402
GROUP BY GROUPING SETS (department_number, manager_employee_number, ( ) )
ORDER BY 1,2;

Add “( )” to get a grand total.

    deptnum      manager        Sumsal
-----------  -----------  ------------
          ?            ?     591925.00
          ?          801     286750.00   Manager Total
          ?         1003     207725.00   Manager Total
          ?         1019      58700.00   Manager Total
          ?         1025      38750.00   Manager Total
        100            ?     100000.00   Department Total
        201            ?      73450.00   Department Total
        301            ?     116400.00   Department Total
        302            ?      56500.00   Department Total
        401            ?     245575.00   Department Total



Combining Grouping Sets
If we wish to combine two or more columns into a single grouping, we can enclose them with
parentheses as shown on the facing page. This technique should be familiar to us from earlier
examples.

Remember: “All selected non-aggregates must be accounted for as being in the GROUP BY.”
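
The effect of the extra parentheses is easiest to see by comparing two requests that differ only in how the columns are parenthesized. Both are sketches for illustration and have not been run against the course tables.

```sql
-- One grouping set made of two columns: one row per
-- (department, manager) combination, like a plain GROUP BY.
SELECT department_number
     , manager_employee_number
     , SUM(salary_amount)
FROM employee
WHERE department_number < 402
GROUP BY GROUPING SETS ( (department_number, manager_employee_number) );

-- Two grouping sets of one column each: department totals and
-- manager totals, with no combined rows.
SELECT department_number
     , manager_employee_number
     , SUM(salary_amount)
FROM employee
WHERE department_number < 402
GROUP BY GROUPING SETS ( department_number, manager_employee_number );
```

In both cases every selected non-aggregate appears somewhere in the grouping, so the rule quoted above is satisfied.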



Combining Grouping Sets

As shown earlier in this module, you can parenthesize a combination of columns to
form a new group.

SELECT department_number AS deptnum
      ,manager_employee_number AS manager
      ,SUM(salary_amount) AS Sumsal
FROM employee
WHERE department_number < 402
GROUP BY GROUPING SETS ( (department_number,manager_employee_number), ( ) )
ORDER BY 1,2;

    deptnum      manager        Sumsal
-----------  -----------  ------------
          ?            ?     591925.00
        100          801     100000.00
        201          801      34700.00
        201         1025      38750.00
        301          801      57700.00
        301         1019      58700.00
        302          801      56500.00
        401          801      37850.00
        401         1003     207725.00

Note that, apart from the order of the projected columns, this result is the same as
doing the following.

SELECT manager_employee_number,
       department_number,
       SUM(salary_amount)
FROM employee
WHERE department_number < 402
GROUP BY ROLLUP ( (department_number, manager_employee_number) )
ORDER BY 1,2;



Module 22: Summary
A summary of this module is discussed.



Module 22: Summary

• Extended Grouping functions include: ROLLUP; CUBE; and GROUPING SETS.
• GROUPING SETS can use a “multiplier” feature.
• ROLLUP and CUBE:
  • In addition to rolling-up summary information, ROLLUP provides “detailed”
    aggregation and grand totals.
  • CUBE provides rollup information for all permutations plus “detailed”
    aggregation and grand totals.
• GROUPING SETS provides only the totals presented for it.



Module 22: Review Questions
Check your understanding of the concepts discussed in this module by completing the review
questions as directed by your instructor.



Module 22: Review Questions

True or False:

1. You can retain detailed information using Extended Grouping functions.
False
2. Parentheses can be nested within a CUBE.
True
3. For Extended Grouping functions, all non-aggregates must be enclosed
within parentheses.
False – but they must be included in the grouping
4. The following are equivalent for GROUPING SETS: ((m#, d#)) vs. (m#, d#)
False – 1 returns m# and d# totals as a group. 2 returns each individually
5. GROUPING SETS is the more explicit form of the Extended Grouping
functions.
True
6. GROUPING SETS always returns a grand total.
False
7. Grand totals are always retrieved when using CUBE and ROLLUP.
True



Module 22: Lab Exercise
Check your understanding of the concepts discussed in this module by completing the lab
exercise as directed by your instructor.



Module 22: Lab Exercise

1. Display manager numbers, department numbers, and salary sums for employees by
manager and by department – only – (do not project a grand total).

2. Add to #1 the grand totals plus salary sums by department and manager.

3. Add to #2 the employee counts at all levels of that display.





Module 23

Views

After completing this module, you will be able to:

• Create a view.
• Use a view to provide secure access to data.
• Drop or modify a view.
• List several reasons for using views.

Views Page 23-1




Table of Contents
What is a View?
Creating and Using Views
Replacing a View via SQL Assistant
Using Views to Rename Columns
Join View
Joining Views
Using View to Format for SQL Assistant
Views with Aggregates
Aggregates and HAVING
Views and TOP N
Restrictions on Views
Advantages and Suggestions
Module 23: Review Questions
Module 23: Lab Exercise



What is a View?
A view is a database object that requires no permanent space from the database that owns it.

A view can filter the data of a table so that the referencing user sees only the rows and
columns that the view projects. A view can also reformat the table data to give it a different
look and feel.

A view is virtual in that no physical data exists for it. Hence, no index can be created on a
view.

A view can also be considered logical because it can present the table from many different
viewpoints without affecting the actual data in the table that it references.

Views can spare a referencing user from writing complex SQL, since that complexity can be
embedded in the view.

Finally, a view may also be considered a “derived table” in that the spool generated from the
view can stand in for the spool generated by a functionally equivalent derived table, as
discussed earlier in this course.



What is a View?

• A view is a ‘window’ into the data contained in relational tables.
• A view is sometimes called a ‘virtual table’.
  - That is to say, users may think that it is an actual table.
• It may behave as a “filter” by:
  - Qualifying to retrieve a subset of rows and/or columns.
• It may reference more than one table.
• Data is neither duplicated nor stored separately for a view. For this
reason:
  - You cannot create an index on a view.
  - The view’s only space requirements are for Data Dictionary space
    to store the definition.
• Data can be accessed directly via a table or indirectly via a view, based
on privileges.
• Views can contain complex SQL, thereby hiding it from users.



Creating and Using Views
The example on the facing page is a simple view. Views can become much more complex than
the one shown, and much more complex than any discussed within this module. Note how the
syntax involves a SELECT, thus emphasizing the fact that the view does not store data. Instead,
it pulls data from the sourcing tables.



Creating and Using Views

Create a view of the employees in department 403 to be used for both
read and update purposes. Limit the view to an employee’s number, last
name, and salary.

CREATE VIEW emp_403 AS
SELECT employee_number
,last_name
,salary_amount
FROM employee
WHERE department_number = 403;

To read all rows and columns from this view:

SELECT * FROM emp_403;

employee_number last_name   salary_amount
--------------- ----------- -------------
           1024 Brown            43700.00
           1020 Charles          39500.00
           1007 Villegas         49700.00
           1005 Ryan             31200.00
           1012 Hopkins          37900.00
           1009 Lombardo         31000.00
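
The slide states that emp_403 is meant for update as well as read purposes. A minimal sketch of an update issued through the view follows. The 5 percent raise and the choice of employee 1005 are hypothetical values used only for illustration, and the statement assumes the user holds update privileges on the view.

```sql
-- Hypothetical example: raise the salary of employee 1005 through the view.
-- The change is applied to the underlying employee table, because the view
-- itself stores no data.
UPDATE emp_403
SET salary_amount = salary_amount * 1.05
WHERE employee_number = 1005;

-- Read the row back through the same view.
SELECT employee_number, last_name, salary_amount
FROM emp_403
WHERE employee_number = 1005;
```

Because the view projects only three columns and qualifies on department 403, a user working through it cannot see or change other columns or other departments' rows.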



Replacing a View via SQL Assistant
Replacing a view via SQL Assistant is relatively simple. The same can be accomplished through
BTEQ with a process like this one:

SHOW VIEW emp_403;

Result:

CREATE VIEW emp_403 AS
SELECT employee_number
,last_name
,salary_amount
FROM employee
WHERE department_number = 403;

Copy this DDL to an editor (such as Notepad) and make the changes needed, including changing
“CREATE” to “REPLACE”.

REPLACE VIEW emp_403 AS
SELECT employee_number
,last_name
,salary_amount
,manager_employee_number
FROM employee
WHERE department_number = 403;

Then submit the REPLACE VIEW statement.



Replacing a View via SQL Assistant

1. Right-click on the view name.
2. Change CREATE to REPLACE and make any changes to it.
3. Submit.



Using Views to Rename Columns
There are two methods for providing aliases for the actual table column names. The original
method lists the alias names in parentheses after the view name, and it remains available today.
The method using “AS” came later, and is often the preferred way to alias names. Recall that the
“AS” keyword itself is optional.



Using Views to Rename Columns

Create a view which contains a subset of the employee table consisting of
employees in department 401, and simplify the names.

Using an alias list:

CREATE VIEW emp_view
(emp, dept, lname, fname, sal)
AS SELECT employee_number
,department_number
,last_name
,first_name
,salary_amount
FROM employee
WHERE department_number = 401;

vs. using AS:

CREATE VIEW emp_view
AS SELECT employee_number AS Emp
,department_number AS Dept
,last_name AS Lname
,first_name AS Fname
,salary_amount AS Sal
FROM employee
WHERE department_number = 401;

Either way, the view is queried by its simplified names:

SELECT emp, dept, sal
FROM emp_view
ORDER BY lname;

   Emp   Dept          Sal
------ ------ ------------
  1002    401     43100.00
  1001    401     25525.00
  1004    401     36300.00
  1022    401     32300.00
  1013    401     24500.00
  1010    401     46000.00
  1003    401     37850.00



Join View
View definitions can be made more complex by including joins. These may be either inner or
outer joins as needed, and inner joins may be written in either the explicit form (INNER JOIN
with ON) or the implicit form (a comma-separated FROM list with a WHERE condition).
Placing the join in a view definition can be extremely helpful because it hides from the user
the need to know the one-to-many relationship between the tables, which helps guarantee that a
correct join is performed. Note that table aliases can also make the join syntax easier to
read.



Join Views

A join view consists of columns from more than one table.

Create a join view of the employee and department tables for human resources.

CREATE VIEW sql01.emp_dept AS
SELECT e.employee_number
,e.last_name
,e.first_name
,e.salary_amount
,d.department_name
,d.manager_employee_number
FROM Employee_Sales.employee e
INNER JOIN
Employee_Sales.Department d
ON d.department_number = e.department_number;

Who is employee 1002 and in which department do they work?

SELECT last_name
,first_name
,department_name
FROM emp_dept
WHERE employee_number = 1002;

last_name  first_name  department_name
---------- ----------- -----------------
Brown      Alan        customer support



Joining Views
Just as tables may be joined, views may also be joined. The optimizer resolves the views to
their underlying tables and plans the join much as it would if the views were not involved.
Both the EXPLAIN feature and the “SHOW SELECT” feature (using the SHOW keyword in place of the
EXPLAIN keyword) can be used to identify the underlying tables by name. Because every join
hidden inside the views takes part in the final query, establishing the various one-to-many
relationships correctly is extremely important.
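
As a rough sketch of the two features just mentioned, the same query can be prefixed with either keyword. The example assumes the simple emp and dept views defined on the facing page already exist; the output of both requests depends on the system and is not shown here.

```sql
-- SHOW in place of EXPLAIN: returns the definitions of the views and
-- underlying tables that the query references.
SHOW SELECT Employee_Number, Last_Name, First_Name
FROM emp, dept
WHERE emp.Department_Number = dept.Department_Number
AND Department_Name LIKE '%support%';

-- EXPLAIN: returns the optimizer's plan, whose steps name the
-- underlying tables rather than the views.
EXPLAIN SELECT Employee_Number, Last_Name, First_Name
FROM emp, dept
WHERE emp.Department_Number = dept.Department_Number
AND Department_Name LIKE '%support%';
```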



Joining Views

Views may be joined together whether simple views like these or more complex
views that contain joins themselves.

Join these:

REPLACE VIEW emp
AS
SELECT *
FROM Employee;

REPLACE VIEW dept
AS
SELECT *
FROM Department;

SELECT Employee_Number, Last_Name, First_Name
FROM emp, dept
WHERE emp.Department_Number = dept.Department_Number
AND Department_Name LIKE '%support%';

employee_number last_name            first_name
--------------- -------------------- ------------------------------
           1014 Crane                Robert
           1003 Trader               James
           1011 Daly                 James
           1022 Machado              Albert
           1010 Rogers               Frank
           1002 Brown                Alan
           1001 Hoover               William
           1013 Phillips             Charles
           1004 Johnson              Darlene



Using View to Format for SQL Assistant
You can create views to provide needed and complex formatting of data values for SQL
Assistant. These formatting techniques also work for BTEQ, but should be restricted to views
that do not involve further data type conversions from the user. You should also consider how
SQL Assistant may be used to format within the tool itself.



Using Views to Format in SQL Assistant

This view can be used to provide formatting for use with SQL Assistant.
This view will work fine with BTEQ as well; however, the conversion strategies are
not necessary for it.

CREATE VIEW SQLA_View AS
SELECT (employee_number (FORMAT '9999') ) (CHAR(4)) AS Emp
,(department_number (FORMAT '999')) (VARCHAR(3)) AS Dept
,last_name AS Name
,first_name
,(salary_amount / 12 (FORMAT '$$$$,$$9.99')) (CHAR(11))
AS Mon_Salary
FROM Employee;

Aggregate and derived columns in views must be assigned aliases!

SELECT *
FROM SQLA_View;



Views with Aggregates
Views may be used to provide summary information by including aggregations. Aggregations and
other forms of derived data (e.g. calculations) must be given names so that they can be
referenced in queries. Aggregations inside views also provide a way to nest aggregation, that
is, to perform aggregations on other aggregations. For instance, the following is not
permissible in the form shown.

SELECT AVG(SUM(C1)) FROM T1;

The facing page illustrates how this can be accomplished through the use of a view that
computes the initial aggregate result.



Views with Aggregates

REPLACE VIEW deptsals AS
SELECT Department_Number AS deptnum
,SUM(salary_amount) AS sumsal
,AVG(salary_amount) AS avgsal
,MAX(salary_amount) AS maxsal
,MIN(salary_amount) AS minsal
FROM employee
GROUP BY 1;

Recall: Aggregate and other derived columns in views must be assigned aliases.

Use this view to:

Show the total salary for each department:

SELECT deptnum
,sumsal
FROM deptsals
ORDER BY 1;

Nest aggregations:

SELECT AVG(sumsal)
FROM deptsals
ORDER BY 1;

Whereas this will fail:

SEL AVG(SUM(Salary_Amount))
FROM Employee;

Join to other tables or views:

SELECT d.Department_Name,
a.sumsal,
a.avgsal
FROM Deptsals a, Department d
WHERE a.deptnum =
d.department_number;
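
The earlier "What is a View?" page describes a view as functionally equivalent to a derived table, so the nested aggregation can also be sketched without a stored view. In the example below the derived table name dt is a hypothetical choice, and deptnum and sumsal simply reuse the alias names of the deptsals view.

```sql
-- Inner query: one salary sum per department (the first aggregation).
-- Outer query: the average of those sums (the second aggregation).
SELECT AVG(sumsal)
FROM (SELECT department_number
            ,SUM(salary_amount)
      FROM employee
      GROUP BY 1) AS dt (deptnum, sumsal);
```

The view is the better fit when many users need the same first-level aggregate; the derived table suits a one-off query because nothing is stored in the Data Dictionary.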



Aggregates and HAVING
A HAVING clause may be used to restrict what a view may project based upon aggregated
values.



Aggregates and HAVING

A HAVING clause is used to restrict which groups participate in the view.

Modify the deptsals view to include only those
departments with an average salary of less than $36,000.

REPLACE VIEW deptsals AS
SELECT department_number AS department
,SUM (salary_amount) AS salary_total
,AVG (salary_amount) AS salary_average
,MAX (salary_amount) AS salary_max
,MIN (salary_amount) AS salary_min
FROM employee
GROUP BY department_number
HAVING AVG (salary_amount) < 36000;

Select all departments with average salaries less
than $36,000, using the view 'deptsals':

SELECT department
,salary_average
FROM deptsals;

department  salary_average
----------- --------------
        401       35082.14



Views and TOP N
Here a view uses the “TOP N” feature. Although a view definition is normally not allowed to
contain an ORDER BY, this case is an exception: “TOP N” may rely on the ORDER BY clause to rank
the rows it qualifies, so the clause is allowed here. An added recommendation is to give the
view a name that aptly describes it, as shown.



Views and TOP N

Although ORDER BY is not allowed in a view definition, it is allowed when used in
conjunction with the “TOP N” feature.

REPLACE VIEW Top10Emps
AS
SELECT TOP 10 *
FROM Employee
ORDER BY Salary_Amount DESC;

SELECT Last_Name, Salary_Amount FROM Top10Emps ORDER BY 2 DESC;

last_name            salary_amount
-------------------- -------------
Trainer                  100000.00
Runyon                    66000.00
Kubic                     57700.00
Rogers                    56500.00
Ratzlaff                  54000.00
Wilson                    53625.00
Daly                      52500.00
Villegas                  49700.00
Rogers                    46000.00
Brown                     43700.00



Restrictions on Views
The facing page lists the restrictions involved in the creation and usage of views.



Restrictions on Views

• An index cannot be created on a view.
• A view cannot contain an ORDER BY clause (except in conjunction with TOP N).
• The WHERE clause of a SELECT against a view can reference all
aggregated columns of that view.
• Derived and aggregated columns must be assigned a name.
• A view cannot be used for UPDATE operations if it contains:
- Data from more than one table (i.e., join views)
- The same column specified twice
- Derived columns (e.g., salary_amount/12)
- A DISTINCT clause
- A GROUP BY clause
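
The third point above can be sketched as follows. Against the base table, a condition on SUM(salary_amount) would have to go in a HAVING clause; against the view, the aggregate is an ordinary named column and may appear in WHERE. The example assumes the deptsals view as last replaced in the "Aggregates and HAVING" section, and the 200000 threshold is a hypothetical value.

```sql
-- salary_total is an aggregated column of the view, yet it can be
-- referenced in the WHERE clause of a query against the view.
SELECT department
,salary_total
FROM deptsals
WHERE salary_total > 200000;
```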



Advantages and Suggestions
The facing page shows various advantages for using views. It is safe to say that views are
widely used in all applications, and can be a very effective way for controlling access to table
data from both performance and integrity perspectives.



Advantages and Suggestions

Advantages of Using Views:
• An additional level of security.
• Controls read and update privileges.
• Simplify end-user access to data.
• Are unaffected if a column is added to a table.
• Are unaffected if a column is dropped, unless the dropped column is
referenced by the view.

Suggestions for Using Views:
• Create views to ensure that all user access to tables is via views.
• Create views which do complex joins or aggregations to simplify
end-user coding requirements.
• Use Access Locks when creating views to maximize data availability to
users.
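
The last suggestion can be sketched with a LOCKING modifier placed ahead of the SELECT in the view definition. The view name emp_403_access is hypothetical; the rest mirrors the emp_403 view from earlier in this module. Keep in mind that an access lock lets readers proceed while the table is being changed, so a query through this view may see uncommitted data.

```sql
-- Hypothetical read-oriented variant of emp_403 that requests an
-- access lock, so readers are not blocked by concurrent writers.
CREATE VIEW emp_403_access AS
LOCKING ROW FOR ACCESS
SELECT employee_number
,last_name
,salary_amount
FROM employee
WHERE department_number = 403;
```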



Module 23: Review Questions
Check your understanding of the concepts discussed in this module by completing the review
questions as directed by your instructor.



Module 23: Review Questions

True or False:

1. Views don’t require permanent space from their database.
   True
2. You cannot create an index on a view.
   True
3. You can create a view of a view.
   True
4. You cannot drop a table that has a view referencing it.
   False
5. HAVING and GROUP BY may both be specified within a view
   definition.
   True
6. You can aggregate a view column that is already being aggregated.
   True



Module 23: Lab Exercise
Check your understanding of the concepts discussed in this module by completing the lab
exercise as directed by your instructor.



Module 23: Lab Exercise

1. Create a view called “SumView” that performs a sum, average, max,
and min on salary amounts for each department number from
employee. Use the view to join to employee to find those employees
whose salaries are greater than their department average.

2. Add a HAVING clause to SumView to include only departments having
averages less than $30,000.00 and repeat the query for exercise #1.



Module 24

Derived Tables and Volatile Tables

After completing this module, you will be able to:

• Use permanent tables for ad-hoc queries.
• Use both forms of Derived table syntax.
• Recognize variations for each form of derived table.
• Create volatile tables for session use.
• Distinguish between various ad-hoc strategies.
• Identify volatile table limitations.
• Distinguish between the two ON COMMIT options.
• Use volatile tables within views and macros.



Table of Contents
Temporary Table Choices
Another Derived Table Syntax Form
Volatile Table Syntax
Volatile Table Restrictions
HELP and SHOW (Volatile) TABLE
ON COMMIT DELETE ROWS (Implicit Transactions)
ON COMMIT PRESERVE ROWS (Implicit Transactions)
ON COMMIT DELETE ROWS (Explicit Transactions)
ON COMMIT PRESERVE ROWS (Explicit Transactions)
Limitations
Using INSERT-SELECT
Inserting a Single Row
UPDATE
Updating with Joins
DELETE
Deleting with Joins
Module 24: Summary
Module 24: Review Questions
Module 24: Lab Exercise



Temporary Table Choices
When comparing the choices on the facing page, remember that spool files cannot survive a
restart. This means that, upon a database restart, a volatile table goes away, and so one must
create the table again and repeat the process that led to the point of failure. This implies
ad-hoc usage, in which a person is actively watching the process while it runs.

Global Temporary tables use “Temp” space, which can survive a restart, and can therefore be
used in a scripted application that runs unattended on a scheduled basis. Global Temporary
tables are significantly more complex than Volatile tables and are typically considered more a
concern of application development than of SQL.

Derived tables also use spool and, as such, do not survive a database restart, nor do they
survive a request failure. Volatile and Global Temporary tables can be created to survive
request failures but, by default, do not.



Temporary Table Choices

Views
• Local to a query
• Uses Spool
• May be replaced with derived tables.

Derived Tables
• Local to the query
• Incorporated into SQL query syntax
• Discarded when query finishes
• No Data Dictionary involvement
• May be replaced with views.

Volatile Tables
• Local to a session (are available to all queries during the session)
• Uses CREATE VOLATILE TABLE syntax
• Discarded automatically at session end
• No Data Dictionary involvement

Global Temporary Tables (not taught in this course)
• Local to a session (like volatile tables)
• Uses CREATE GLOBAL TEMPORARY TABLE syntax (e.g. a DBA creates the definition)
• Materialized instance of table discarded at session end (like volatile tables)
• Creates and keeps table definition in Data Dictionary (i.e. the DBA created table)



Another Derived Table Syntax Form
We now introduce the second form of derived table and contrast it with the previous form. This
second form is often referred to as the “WITH” form. A variation of it becomes very important
in a later module on “Recursive Queries”, which uses a recursive structure for the WITH form.
Recursive queries cannot be written with the (perhaps more common) “FROM” form.

Notice that the WITH form is structured completely “upside-down” from its counterpart.

“WITH” form:
• Definition appears at the top of the query, prior to the SELECT portion.

“FROM” form:
• Definition appears in the FROM portion, after the SELECT portion.



Another Derived Table Syntax Form

FROM Form (more common usage?)

SELECT Last_Name,
Salary_Amount,
AvgSal
FROM Employee e,
(SELECT AVG(Salary_Amount)
FROM Employee) AS AvgT (AvgSal)
WHERE e.Salary_Amount > AvgT.AvgSal;

• Table name: “AvgT”
• Column name(s): “AvgSal”
• Query to populate AvgT: the parenthesized SELECT in the FROM clause

WITH Form

WITH AvgT (AvgSal) AS
(SELECT AVG(Salary_Amount)
FROM Employee)
SELECT Last_Name,
Salary_Amount,
AvgSal
FROM AvgT t, Employee e
WHERE e.Salary_Amount > t.AvgSal;

• Table name: “AvgT”
• Column name(s): “AvgSal”
• Query to populate AvgT: the parenthesized SELECT following AS
• The SELECT (projection) appears after the derived table definition.



Volatile Table Syntax
Volatile tables do not have a persistent definition; they must be newly created each time you
need to use them. The table definition is cached only for the duration of the session in which it is
created.

If you frequently reuse particular volatile table definitions, consider writing a macro that contains
the CREATE TABLE text for those volatile tables. Because volatile tables are private to the
session that creates them, the system does not check their creation, access, modification, and
drop privileges. Any user that has spool can create them. Understand that by holding on to the
spool for the life of the session, the user has less spool available to them for other queries.

The following list details the general characteristics of volatile tables:


• Both the contents and the definition of a volatile table are destroyed when a system reset
occurs.
• Space usage is charged to the login user spool space.
• A single session can materialize up to 1,000 volatile tables at one time.
• The primary index for a volatile table can be either an NPPI or a PPI.
• You cannot create secondary, hash, or join indexes on a volatile table.
• You cannot collect statistics on volatile table columns, including the PARTITION
column of a PPI volatile table.
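
The macro suggestion above might be sketched as follows. The macro name make_vt_deptsal is hypothetical, and the table definition simply repeats the vt_deptsal example used throughout this module. The sketch keeps the CREATE VOLATILE TABLE as the only statement in the macro, on the assumption that a macro holding DDL should contain nothing else.

```sql
-- Store the volatile table definition once, as a macro.
CREATE MACRO make_vt_deptsal AS (
CREATE VOLATILE TABLE vt_deptsal
(deptno SMALLINT
,avgsal DEC(9,2)
,maxsal DEC(9,2)
,minsal DEC(9,2)
,sumsal DEC(9,2)
,empcnt SMALLINT);
);

-- In any later session, materialize the table with one short request.
EXEC make_vt_deptsal;
```

The macro definition persists in the Data Dictionary, while the volatile table it creates still lasts only for the session that executes it.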



Volatile Table Syntax

CREATE VOLATILE TABLE vt_deptsal
(deptno SMALLINT
,avgsal DEC(9,2)
,maxsal DEC(9,2)
,minsal DEC(9,2)
,sumsal DEC(9,2)
,empcnt SMALLINT);

Volatile tables are not defined in ANSI.

SHOW TABLE vt_deptsal;

CREATE SET VOLATILE TABLE DLM.vt_deptsal ,
FALLBACK ,
CHECKSUM = DEFAULT,
LOG
(
deptno SMALLINT,
avgsal DECIMAL(9,2),
maxsal DECIMAL(9,2),
minsal DECIMAL(9,2),
sumsal DECIMAL(9,2),
empcnt SMALLINT)
PRIMARY INDEX ( deptno )
ON COMMIT DELETE ROWS;

• LOG indicates that a transaction journal is maintained. This is the
default.
• NO LOG allows for better performance.
• ON COMMIT DELETE ROWS indicates to delete all table rows
after a commit (end transaction). This is the default.
• ON COMMIT PRESERVE ROWS indicates to keep table rows at TXN
end.



Volatile Table Restrictions
The following options are not permitted for volatile tables:
• Referential integrity constraints
• CHECK constraints
• Permanent journaling
• Compressed column values
• DEFAULT clause
• TITLE clause
• Named indexes

Volatile tables always use spool taken directly from the creating user’s spool allocation. For
this reason, the CREATE fails if you qualify the table with another database name. If, however,
your default database is set to some database other than your user name and you do not qualify
the table name in the CREATE, the creation of the volatile table succeeds because the default
database is ignored.



Volatile Table Restrictions

• Up to 1000 volatile tables are allowed for a single session.
• At the time you create a volatile table, the name must be unique among all
global and permanent object names in the database that has the name of the
login user.

CREATE VOLATILE TABLE username.table1
  Explicit username must be that of logon.
CREATE VOLATILE TABLE table1
  Default database is ignored.
CREATE VOLATILE TABLE databasename.table1
  Explicit database name cannot use spool of another database or user.

Each session can use the same VT name (local to session).

VT name cannot duplicate existing object name for this user:
- Perm or Temp table names
- View names
- Macro names
- Trigger names, etc.



HELP and SHOW (Volatile) TABLE
When creating a volatile table, the default is ON COMMIT DELETE ROWS. This means that
when a transaction ends, the rows are automatically deleted from the table. Creating the
volatile table on the facing page is therefore equivalent to creating it with this syntax.

CREATE VOLATILE TABLE vt_deptsal1
(deptno SMALLINT,
avgsal DECIMAL(9,2),
maxsal DECIMAL(9,2),
minsal DECIMAL(9,2),
sumsal DECIMAL(9,2),
empcnt SMALLINT)
PRIMARY INDEX (deptno)
ON COMMIT DELETE ROWS;

The defaults also make it a SET table with LOG on. When the primary index is defaulted, the
first column of the table becomes a NUPI (non-unique primary index).



HELP and SHOW (Volatile) TABLE

CREATE VOLATILE TABLE vt_deptsal1
( deptno SMALLINT,
avgsal DECIMAL(9,2),
maxsal DECIMAL(9,2),
minsal DECIMAL(9,2),
sumsal DECIMAL(9,2),
empcnt SMALLINT );

The HELP DATABASE command does not show VT’s, and they do not appear in the Explorer
Tree window in SQL Assistant.

SHOW TABLE vt_deptsal1;

CREATE SET VOLATILE TABLE DLM.vt_deptsal1 ,
FALLBACK ,
CHECKSUM = DEFAULT,
LOG
(
deptno SMALLINT,
avgsal DECIMAL(9,2),
maxsal DECIMAL(9,2),
minsal DECIMAL(9,2),
sumsal DECIMAL(9,2),
empcnt SMALLINT)
PRIMARY INDEX ( deptno )
ON COMMIT DELETE ROWS;

HELP VOLATILE TABLE;

Table Name    Table Id
------------- ------------
vt_deptsal1   30C0BC140000
vt_deptsal2   30C0BD140000

DROP TABLE vt_deptsal1;



ON COMMIT DELETE ROWS (Implicit Transactions)
For Teradata mode, transaction processing is implicit. This means that each request is
automatically a commit point (or an “implied” transaction).

Since the default is ON COMMIT DELETE ROWS, and the request is an implied commit, the
moment the rows are inserted they are deleted due to the commit.



ON COMMIT DELETE ROWS (Implicit Transactions)

Create a volatile table.

CREATE VOLATILE TABLE vt_deptsal
(deptno SMALLINT
,avgsal DEC(9,2)
,maxsal DEC(9,2)
,minsal DEC(9,2)
,sumsal DEC(9,2)
,empcnt SMALLINT);

Populate the volatile table with computed aggregates.

INSERT INTO vt_deptsal
SELECT dept ,AVG(sal) ,MAX(sal) ,MIN(sal), SUM(sal), COUNT(emp)
FROM emp
GROUP BY 1;

SELECT * FROM vt_deptsal
ORDER BY 3;

*** Query completed. No rows found.

Remember: The default is ON COMMIT DELETE ROWS.
Rows are deleted immediately after the insert for implicit transactions!



ON COMMIT PRESERVE ROWS (Implicit Transactions)
Alternatively, you can keep the rows of a volatile table after each commit by specifying ON
COMMIT PRESERVE ROWS in the CREATE VOLATILE TABLE statement, as shown on the facing page.



ON COMMIT PRESERVE ROWS (Implicit Transactions)

1) Create a volatile table.

CREATE VOLATILE TABLE vt_deptsal
(deptno SMALLINT
,avgsal DEC(9,2)
,maxsal DEC(9,2)
,minsal DEC(9,2)
,sumsal DEC(9,2)
,empcnt SMALLINT)
ON COMMIT PRESERVE ROWS;

2) Populate the volatile table with computed aggregates.

INSERT INTO vt_deptsal
SELECT dept ,
AVG(sal) ,
MAX(sal) ,
MIN(sal),
SUM(sal),
COUNT(emp)
FROM emp_views.emp
GROUP BY 1;

3) SELECT * FROM vt_deptsal ORDER BY 3;

deptno      avgsal      maxsal      minsal      sumsal empcnt
------ ----------- ----------- ----------- ----------- ------
   301    29350.00    29450.00    29250.00    58700.00      3
   401    35545.83    46000.00    24500.00   213275.00      7
   403    38700.00    49700.00    31000.00   193500.00      6
   402    52500.00    52500.00    52500.00    52500.00      2
     ?    43316.67    56500.00    34700.00   129950.00      3
   501    50031.25    66000.00    26500.00   200125.00      4
   999   100000.00   100000.00   100000.00   100000.00      1



ON COMMIT DELETE ROWS (Explicit Transactions)
For “explicit” transactions, we tell the database when to perform the commit by issuing a BEGIN
TRANSACTION and following that with an END TRANSACTION. These two statements
bound a number of requests making them commit when we decide – explicitly!

Since the commit is withheld until an ET statement is encountered, the rows will be deleted on
our terms, and then the table will become empty.



ON COMMIT DELETE ROWS (Explicit Transactions)

Create a volatile table.

CREATE VOLATILE TABLE vt_deptsal
(deptno SMALLINT
,avgsal DEC(9,2)
,maxsal DEC(9,2)
,minsal DEC(9,2)
,sumsal DEC(9,2)
,empcnt SMALLINT);

BT;
INSERT INTO vt_deptsal (1, 2, 3, 4, 5, 6);
SELECT * FROM vt_deptsal;

deptno      avgsal      maxsal      minsal      sumsal empcnt
------ ----------- ----------- ----------- ----------- ------
     1        2.00        3.00        4.00        5.00      6

ET;
SELECT * FROM vt_deptsal;

*** Query completed. No rows found.

The default of ON COMMIT DELETE ROWS deleted the rows immediately after the ET;

This would work the same for ANSI mode explicit transactions.



ON COMMIT PRESERVE ROWS (Explicit Transactions)
We may also elect to have the database never delete the rows of the volatile table until we issue a
DELETE DML statement. With this approach the rows of the table will never be
“automatically” deleted, on a commit, but instead the rows will be deleted under our control via a
DELETE command.



ON COMMIT PRESERVE ROWS (Explicit Transactions)

Create a volatile table.

CREATE VOLATILE TABLE vt_deptsal
(deptno SMALLINT
,avgsal DEC(9,2)
,maxsal DEC(9,2)
,minsal DEC(9,2)
,sumsal DEC(9,2)
,empcnt SMALLINT)
ON COMMIT PRESERVE ROWS;

BT;
INSERT INTO vt_deptsal (1, 2, 3, 4, 5, 6);
SELECT * FROM vt_deptsal;

deptno      avgsal      maxsal      minsal      sumsal empcnt
------ ----------- ----------- ----------- ----------- ------
     1        2.00        3.00        4.00        5.00      6

ET;
SELECT * FROM vt_deptsal;

deptno      avgsal      maxsal      minsal      sumsal empcnt
------ ----------- ----------- ----------- ----------- ------
     1        2.00        3.00        4.00        5.00      6



Limitations
Along with the limitations on the facing page, recall the restrictions from an earlier page.

The following options are not permitted for volatile tables:


• Referential integrity constraints
• CHECK constraints
• Permanent journaling
• Compressed column values
• DEFAULT clause
• TITLE clause
• Named indexes



Limitations

The following commands are not applicable to VT’s:
• COLLECT/DROP/HELP STATISTICS
• CREATE/DROP INDEX
• ALTER TABLE
• GRANT/REVOKE privileges
• DELETE DATABASE/USER (does not drop VT’s)

VT’s may not:
• Use ACCESS LOGGING.
• Be RENAMEd.
• Be loaded with MultiLoad or FastLoad utilities.





Using INSERT-SELECT

You can populate a table using INSERT-SELECT.

There are two basic forms of INSERT-SELECT.

1) INSERT INTO targettable SELECT * FROM sourcetable;

2) INSERT INTO targettable
SELECT column1, column2, ...
FROM sourcetable;

The SELECT may include derived data values or involve any number
of features and functions as expressions.

Things to consider are:
• Match the number of source and target columns.
• Match the data types or implicit conversions may occur.

The target table may be populated or empty.
Empty target tables populate faster than populated ones.
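
As a sketch of the second form applied to this module's running example, the statement below fills the vt_deptsal volatile table from the employee table. Naming the target columns is optional, but it makes the column count and order explicit. The example assumes vt_deptsal was created with ON COMMIT PRESERVE ROWS, since otherwise the rows disappear at the implicit commit.

```sql
-- Six target columns, six source expressions, in matching order.
INSERT INTO vt_deptsal
(deptno, avgsal, maxsal, minsal, sumsal, empcnt)
SELECT department_number
,AVG(salary_amount)
,MAX(salary_amount)
,MIN(salary_amount)
,SUM(salary_amount)
,COUNT(employee_number)
FROM employee
GROUP BY 1;
```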



Inserting a Single Row
The facing page shows two ways to insert a single row into a table. The first form is used when
you know the order of the columns in the CREATE TABLE statement, and it assumes that you know
their data types as well. The second form also assumes that you know the data type of each column,
but rather than relying on the table’s column order, you reference the columns by name. You need
not know the column order, because the database matches the columns by name. You must,
however, list the values in the same order as the names in the column list.



Inserting a Single Row

Insert a new employee into the employee table.

Value order assumes table column order.

INSERT INTO employee
VALUES (1210, NULL, 401, 412101, 'Smith', 'James', 890303, 460421, 41000);

Insert a new employee with only partial data.

Column list may be any order.
Value list must match order of column list.
Inserts default values (discussed later) for missing columns.

INSERT INTO employee
(last_name, first_name, hire_date, birthdate, salary_amount, employee_number)
VALUES ('Garcia', 'Maria', 861027, 541110, 76500.00, 1291);

As a Teradata Extension to ANSI:


• INSERT can be abbreviated as INS.
• INTO and VALUES are optional keywords.

There is no typing shortcut facility for explicitly inserting multiple rows with this form.
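
The abbreviations above can be sketched as follows, using the positional form from the first example. The second and third rows use made-up employee numbers and names, purely for illustration.

```sql
-- Same row as the first example, written with the Teradata abbreviations:
-- INS for INSERT, with INTO and VALUES omitted.
INS employee (1210, NULL, 401, 412101, 'Smith', 'James', 890303, 460421, 41000);

-- Several rows are inserted by repeating the statement, one request per row.
-- These two rows are hypothetical.
INS employee (1211, NULL, 401, 412101, 'Jones', 'Ann', 890303, 460421, 41000);
INS employee (1212, NULL, 401, 412101, 'Lee', 'Pat', 890303, 460421, 41000);
```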



UPDATE
The UPDATE statement is used to change data values in columns. The example shown is simple,
since the predicate restricts the update to a single row (referencing a UPI value such as
Employee_Number guarantees this).

Updating may also be done in bulk. In this case “bulk” means more than one row. For instance,
if the WHERE condition were removed from this update, all rows would be updated. Another
example of a bulk operation is provided in the discussion on the following page.

The FROM clause is optional. Optional keywords are often referred to as “noise”
(something that can be disregarded or ignored).

Examples of using the DEFAULT keyword in an update follow.

UPDATE EMPLOYEE
SET Last_Name = DEFAULT
WHERE Salary_Amount = DEFAULT;

*** Failure 3811 Column 'last_name' is NOT NULL. Give the column a value.

UPDATE EMPLOYEE
SET Department_Number = DEFAULT
WHERE Salary_Amount = DEFAULT;



UPDATE

UPDATE modifies one or more rows in a single table.

EMPLOYEE
      MGR
EMP   EMP   DEPT  JOB     LAST      FIRST    HIRE    BIRTH   SAL
NUM   NUM   NUM   CODE    NAME      NAME     DATE    DATE    AMT
PK    FK    FK    FK
1006  1019  301   312101  Stein     John     761015  531015  2945000
1008  1019  301   312102  Kanieski  Carol    770201  580517  2925000
1005  0801  403   431100  Ryan      Loretta  761015  550910  3120000
1004  1003  401   412101  Johnson   Darlene  761015  460423  3630000
1007  1005  403   432101  Villegas  Arnando  770102  370131  4970000
1003  0801  401   411100  Trader    James    760731  470619  3785000

For employee 1004, change their:
• Department to 403
• Job to 432101
• Manager to 1005

UPDATE Employee [ FROM Employee ]
SET Department_Number = 403
   ,Job_Code = 432101
   ,Manager_Employee_Number = 1005
WHERE Employee_Number = 1004;

EMPLOYEE
      MGR
EMP   EMP   DEPT  JOB     LAST      FIRST    HIRE    BIRTH   SAL
NUM   NUM   NUM   CODE    NAME      NAME     DATE    DATE    AMT
PK    FK    FK    FK
1006  1019  301   312101  Stein     John     761015  531015  2945000
1008  1019  301   312102  Kanieski  Carol    770201  580517  2925000
1005  0801  403   431100  Ryan      Loretta  761015  550910  3120000
1004  1005  403   432101  Johnson   Darlene  761015  460423  3630000
1007  1005  403   432101  Villegas  Arnando  770102  370131  4970000
1003  0801  401   411100  Trader    James    760731  470619  3785000
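
The single-row update can be contrasted with a bulk update. The sketch below is illustrative only: the 5% figure and the choice of department 403 are arbitrary assumptions, not part of the course exercises.

```sql
-- Bulk update sketch: every row in department 403 is changed in one request.
-- The 5% figure is an arbitrary value chosen for illustration.
UPDATE Employee
SET Salary_Amount = Salary_Amount * 1.05
WHERE Department_Number = 403;

-- Omitting the WHERE clause would apply the change to all rows in the table.
UPDATE Employee
SET Salary_Amount = Salary_Amount * 1.05;
```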



Updating with Joins
Updating by using joins (subqueries are joins) is considered a bulk operation even though it may
affect only a single row, since the process for updating that row is not necessarily a direct
one.



Updating with Joins

Updates with joins and subqueries allow a table's rows
to be updated based on information in another table.

Give everyone in all the support departments a 10% raise.
(Assume we don't know the department numbers for all of the support departments.)

Using a subquery:

UPDATE employee
SET salary_amount = salary_amount * 1.10
WHERE department_number IN
 (SELECT department_number
  FROM department
  WHERE department_name LIKE '%Support%');

Using a correlated subquery:

UPDATE employee e
SET salary_amount = salary_amount * 1.10
WHERE department_number =
 (SELECT department_number
  FROM department d
  WHERE e.department_number = d.department_number
  AND department_name LIKE '%Support%');

Using an inner join:

UPDATE employee [ FROM Department ]
SET salary_amount = salary_amount * 1.10
WHERE employee.department_number = department.department_number
AND department_name LIKE '%Support%';
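
Before running any of the three updates, it can help to preview the rows that would qualify. The following is a minimal sketch that reuses the same join condition and filter; the new_salary alias is an assumption introduced for the example.

```sql
-- Preview sketch: list the employees the raise would touch, with the
-- current and proposed salary side by side. Nothing is modified.
SELECT e.employee_number
      ,e.salary_amount
      ,e.salary_amount * 1.10 AS new_salary
FROM employee e INNER JOIN department d
ON e.department_number = d.department_number
WHERE d.department_name LIKE '%Support%';
```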



DELETE
The DELETE statement is used to remove rows from a table. When used conditionally, as in
the first example, it removes targeted rows based upon the explicit values referenced in the
predicate. The examples at the bottom are repeated from an earlier module and remove all rows
from the table very quickly. The keyword “FROM” is noise and is not required, and the keyword
DELETE may be abbreviated to “DEL”.

You may also use the DEFAULT keyword in a delete like this.

DELETE FROM EMPLOYEE WHERE Salary_Amount = DEFAULT;



DELETE

DELETE removes one or more rows from a table.

EMPLOYEE
      MGR
EMP   EMP   DEPT  JOB     LAST      FIRST    HIRE    BIRTH   SAL
NUM   NUM   NUM   CODE    NAME      NAME     DATE    DATE    AMT
PK    FK    FK    FK
1006  1019  301   312101  Stein     John     761015  531015  2945000
1008  1019  301   312102  Kanieski  Carol    770201  580517  2925000
1005  0801  403   431100  Ryan      Loretta  761015  550910  3120000
1004  1003  401   412101  Johnson   Darlene  761015  460423  3630000
1007  1005  403   432101  Villegas  Arnando  770102  370131  4970000
1003  0801  401   411100  Trader    James    760731  470619  3785000

Remove employees in department 301 from the employee table.

DELETE FROM employee
WHERE department_number = 301;

All of these are equivalent and empty a table.

DELETE FROM emp_data ALL;
DELETE FROM emp_data;
DELETE emp_data;
DEL emp_data;



Deleting with Joins
As seen earlier with UPDATE, joins may be used to delete rows from a table. The keyword
“FROM” is noise, and is not required, and the keyword DELETE may be abbreviated to “DEL”.



Deleting with Joins

Remove all of the employees who are assigned to a temporary department
(i.e., for which the department name is 'Temp').

Using a subquery:

DELETE FROM employee
WHERE department_number IN
 (SELECT department_number
  FROM department
  WHERE department_name = 'Temp');

Using a join:

DELETE FROM employee
WHERE employee.department_number = department.department_number
AND department.department_name = 'Temp';

Using a correlated subquery:

DELETE FROM employee e
WHERE department_number =
 (SELECT department_number
  FROM department d
  WHERE e.department_number = d.department_number
  AND d.department_name = 'Temp');
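
Because a delete of this kind can remove many rows, one cautious approach is to run it inside an explicit transaction and inspect the result before committing. The sketch below assumes Teradata session mode with BT/ET as used earlier in this module; ending with ROLLBACK rather than ET is a choice made for the example.

```sql
BT;

DELETE FROM employee
WHERE department_number IN
 (SELECT department_number
  FROM department
  WHERE department_name = 'Temp');

-- Inspect what is left before deciding whether to keep the change.
SELECT COUNT(*) FROM employee;

-- Use ET; to commit the delete, or ROLLBACK; to undo it.
ROLLBACK;
```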



Module 24: Summary
This page summarizes the key points of the module.



Module 24: Summary

• There are two forms for writing derived tables: WITH and FROM.

• Views, Derived, Global and Volatile tables are all examples of temporary
instance objects.

• Views, Derived tables and Volatile tables all use spool.

• ON COMMIT DELETE ROWS is the default for volatile tables.

• Volatile tables last until the user’s session logs off, when the
database drops them automatically.

• Volatile tables are best suited for ad-hoc usage.



Module 24: Review Questions
Check your understanding of the concepts discussed in this module by completing the review
questions as directed by your instructor.



Module 24: Review Questions

True or False:

1. You can define a volatile table with a unique primary index.
True
2. The option ON COMMIT DELETE ROWS will result in the inserted rows
being immediately deleted after the insert.
False – this is true only for implicit transactions
3. You can not perform a SHOW TABLE on a derived table.
True
4. You cannot qualify a volatile table create with a database name other than
your user name.
True
5. Volatile tables and derived tables use the same kind of space.
True
6. You can create hundreds of derived tables for a single query.
False – True in TD13
7. You can drop a volatile table at any time during your session.
True



Module 24: Lab Exercise
Check your understanding of the concepts discussed in this module by completing the lab
exercise as directed by your instructor.



Module 24: Lab Exercise

1. Create a volatile table based on the definition of the Department table
and then populate it with data from the department table. Use the
“preserve” option. Select all rows from the table you created.

2. Create another volatile table that averages the salary amounts for each
department. Now issue a HELP command to verify the existence of the
two volatile tables that you created. Do SHOW TABLE on one of them
and then drop the first one and repeat the earlier HELP command.





Appendix A

Appendix A: Answers to Review Questions

This Appendix contains answers to the review
questions for the course modules.

Teradata Proprietary and Confidential

Appendix A: Review Question Solutions Page A-1


Module 1: Review Questions

True or False:

1. Logical models do not contain data.
True
2. You can access rows from an entity.
False
3. The physical model is always a direct reflection of a logical model.
False
4. Entity attributes may become table columns.
True
5. Data types are typically based upon domains.
True
6. The Teradata database can always determine which database owns which
object without any help from the query.
False



Module 2: Review Questions

True or False:

1. A user can have only one default database set at a time.
True
2. The following is a valid SQL request: HELP DATABASE;
False – a database name is required with this syntax.
3. The HELP TABLE command returns table index information.
False
4. You can click and drag information from the Explorer Tree to the Query
Window.
True
5. The following is a valid column name: _abc_
True
6. For the following request, the database will assume that the object “Employee”
is a database name: HELP COLUMN Employee.Last_Name;
False – it assumes it to be a table name
7. The following is a valid SQL request: SHOW COLUMN Employee.Last_Name;
False – you can not perform a SHOW on a column.



Module 3: Review Questions

True or False:

1. “SELECT * FROM Employee ORDER BY 1;” is a valid SQL construct.
True
2. The SQL DELETE is considered a DDL request.
False – it is a DML request
3. DISTINCT automatically performs a sort.
True
4. A WHERE clause can be used to eliminate columns from a result.
False – WHERE affects only row counts
5. A character literal not enclosed in single quotes is interpreted as an
object name.
True
6. Double quotes can also be used to display literal values.
False – single quotes are used to do this
7. The built-in functions DATE and TIME are ANSI standard.
False



Module 4: Review Questions

True or False:

1. The IN operator is a short-cut for replacing a list of OR'ed conditions.
True
2. When using BETWEEN, only numeric data may be compared.
False – character data values may be compared as well
3. “Unsatisfiable” conditions are those which can never be true.
True
4. The following are equivalent operator conditions: C1 > 500 vs. C1 >= 501
True – but only if C1 is an integer
5. The items inside an IN list must be in order from lowest to highest.
False



Module 5: Review Questions

True or False:

1. NULLs are always displayed as “?”.
False
2. NULL is a data type.
False
3. A NULL is treated like a zero (0) or a space(' ').
False
4. NULLs involved in computations always return NULL.
True
5. NULLs used in comparisons always return unknown.
True
6. You can include NULL inside an IN or NOT IN list.
True – but it may not return the desired result.



Module 6: Review Questions

True or False:

1. The FLOAT data type has more precision than does a decimal data type.
False – float only has 15 digits of precision.
2. Character data types can not be converted to numeric data types.
False
3. FORMAT 'd2' is a valid formatting option.
False – for 2-digit day formatting “dd” must be used.
4. The expression 'a' (followed by 3 spaces) = 'A' (followed by 10 spaces) evaluates true.
True
5. You can use the CAST function to change a data type or to format results.
True
6. The comma “,” is a valid formatting character.
True
7. The formatting character “9” may be used to display leading or trailing
zeroes.
False – it can only display leading zeroes.



Module 7: Review Questions

True or False:

1. The CHARACTER_LENGTH function can accept numeric values as input.
False – Numbers must be CAST as character first.
2. POSITION returns a value of SMALLINT.
False – it is data type integer.
3. SUBSTRING can accept character and numeric values as input.
True
4. The following syntax searches for a “%” in a column: C1 LIKE '%g%%'
ESCAPE 'G'
True
5. The ADD_MONTHS function may be used to add years to a date value.
True – e.g. ADD_MONTHS(DATE, 12*n)
6. EXTRACT can be used to return the day-of-week for a date.
False
7. Functions may not be nested. (e.g. function(function(function(arg))) is not
valid)
False



Module 8: Review Questions

True or False:

1. The INTERSECT operator returns the same result as does the MINUS
operator, however MINUS is a Teradata extension.
False – MINUS and EXCEPT are equivalent (though MINUS is an extension)
2. The following is valid for a set operator: ORDER BY Last_Name
False – you must order by a positional number
3. The ALL option may potentially return more rows than if not using it.
True
4. Set operators may cause truncation among corresponding columns of
result sets.
True
5. An INTERSECT is just another way of returning an inner result.
False
6. If all three different set operators are referenced in a query, the UNION is
performed first.
False – the INTERSECT is first
7. “SELECT *” is a valid projection in a set operation.
True – as long as the numbers of columns projected among projections
remains constant



Module 9: Review Questions

True or False:

1. ORDER BY is not allowed in the outer query.
False
2. WHERE is not allowed on the outer query.
False
3. DISTINCT is not allowed on the inner query.
False – it is automatically performed, but may be used if desired
4. The inner query must include a semi-colon.
False



Module 10: Review Questions

True or False:

1. For inner joins, each FROM clause requires an ON clause for join
conditions.
False – Only the explicit form requires an ON clause
2. Referencing a WHERE clause is invalid for the explicit form of inner join.
False – A WHERE clause may be needed for adding residual conditions
3. Many-to-many relationships are allowed with inner joins.
True
4. When performing a self join, table aliasing is required.
True – You may not reference the same table name without creating ambiguities
5. Inner join syntax requires at least one qualifying join column.
False – A WHERE clause may be needed for adding residual conditions
6. The explicit form of inner join can reject some uses of incorrect
qualifications.
True – But only in the ON clause and not in the project list
7. The implicit form of inner join is not ANSI standard.
False – Both forms are ANSI standard



Module 11: Review Questions

True or False:

1. All outer joins require use of either LEFT, RIGHT or FULL keywords.
True
2. Outer joins can return more rows than can inner joins.
True
3. Nulls returned from the inner table mean the result row is an outer
result.
False – Only for the join column, or if the column is defined as NOT NULL
4. The use of a WHERE clause is not allowed in an outer join.
False – WHERE can be used for writing residual conditions
5. The use of an ON clause is required when writing an outer join.
True
6. The keyword OUTER is required when writing outer joins.
False
7. The FULL outer join returns LEFT and RIGHT outer results.
True – It also returns inner results



Module 12: Review Questions

True or False:

1. EXISTS and NOT EXISTS can be used in traditional subqueries.
True
2. When using EXISTS in a correlated subquery, the projected list of the
subquery is irrelevant.
True
3. Correlated subqueries are an ANSI standard.
True
4. Correlated subqueries process sets of data.
True – but they process row-at-a-time
5. Correlated subqueries can not project columns from the inner table.
True
6. You can not nest correlated subqueries.
False



Module 13: Review Questions

True or False:

1. The value zero (0) is ignored in the aggregate process.
False – Nulls are ignored, and a zero is not the same thing as a null
2. The following is a valid request: SELECT * FROM Employee GROUP BY 1;
False
3. DISTINCT can be replicated by GROUP BY.
True
4. The HAVING clause gets applied after the aggregation is performed.
True
5. Qualifying non-aggregates with the HAVING clause can impact performance.
True
6. Qualifying aggregate values with WHERE is allowed, but impacts
performance.
False
7. You can aggregate joined columns.
True



Module 14: Review Questions

True or False:

1. CASE can be used to replace a value with a null.
True
2. NULLIF changes a null to a value if the first argument equals the second
argument.
False – It changes a value to a null
3. COALESCE can reference many arguments.
True
4. CASE, NULLIF, and COALESCE can be referenced in the predicate or the
projection.
True
5. COALESCE can be used with aggregation for including a null into an average.
True – Typically a zero
6. NULLIF can be used with aggregation for removing a non-zero number from an
average.
True – e.g. NULLIF(c1, 1) will return a null for the value 1
7. In a SELECT, COALESCE(NULLIF(C1, 0), 1) will replace a “0” with a “1” for
column C1.
True – It will also replace a null with a 1



Module 15: Review Questions

True or False:

1. In a CREATE TABLE, every column must be given a data type.
True
2. A USI can not be defined as NULL.
False
3. The HELP INDEX command references only a specific table.
True
4. MULTISET tables can be created with a unique index.
True
5. Derived tables need not specify column names.
False – Either after the table name, or inside the query using aliases
6. The DELETE TABLE syntax removes the table definition from the
dictionary.
False – It only removes rows
7. A secondary index must be provided a name.
False



Module 16: Review Questions

True or False:

1. By default, sampling is considered random.
True
2. The RANDOM function is ANSI standard.
False – it is a Teradata extension to the ANSI standard.
3. You can perform a sample on an aggregated result.
True – by using a derived table
4. A subquery can perform a sample.
False
5. Derived tables can perform a sample.
True
6. By default, rows can appear more than once within a sample.
False – the default is “no replacement”
7. By default, random values, using the RANDOM function, are
“replaced”.
True – any value can be returned more than once.



Module 17: Review Questions

True or False:

1. Values may be replaced when using TOP N.
False – don’t confuse this option with SAMPLE
2. TOP N may be referenced in the same projection as SAMPLE.
False
3. TOP N may be referenced within a subquery.
False
4. The WITH TIES option will return only a specified number of rows.
False – it can return a percentage as well
5. ORDER BY returns the same thing as a ranked result.
False
6. The WITH TIES option is invalid if no ORDER BY clause is included in the
query.
True - but it is ignored
7. ORDER BY can be referenced along with the PERCENT option.
True



Module 18: Review Questions

True or False:

1. Of the four Windows, this module only discussed the GROUP Window.
True
2. With the Group Window, ORDER BY, in the OVER, will not change result
values.
True
3. In the OVER clause, ORDER BY must be after PARTITION, if both are used.
True
4. PARTITION and GROUP BY may both be present within the same
projection.
True – they may be alone or together or not present at all
5. QUALIFY need not reference a projected value.
True
6. HAVING must reference a projected value.
False
7. PARTITION may return a null value.
True



Module 19: Review Questions

True or False:

1. RANK is considered a window aggregate function.
True
2. MDIFF can be partitioned even though it’s not ANSI standard.
True – by using a GROUP BY
3. A remaining window must use ORDER BY.
False – although an ORDER BY is typically found in one
4. A moving window can not contain an UNBOUNDED clause.
True
5. The RESET WHEN feature must be accompanied by an ORDER BY.
True
6. A moving window can not contain a FOLLOWING clause.
False
7. A cumulative COUNT(*) is similar to ranking values.
True



Module 20: Review Questions

True or False:

1. When performing a windowed rank, an ORDER BY is not required.
False
2. The ROW_NUMBER function can work on character or numeric data
values.
True
3. RANK can work on only numeric data values.
False – this is true for the ANSI and non-ANSI forms
4. RANK can not reference PARTITION.
False
5. The default sorting order for a windowed rank is ascending.
True
6. Sorting on multiple columns may result in a rank value of 1 for each row.
False – don’t confuse ORDER BY with PARTITION BY
7. The following is valid within a SELECT
RANK(sales) OVER (ORDER BY Col1)
False – there can be no reference within the parentheses



Module 21: Review Questions

True or False:

1. The QUANTILE function is considered to be a Window Aggregate.
False – They are, however, both OLAP features
2. QUANTILE and GROUP BY are mutually exclusive.
True
3. QUANTILE and QUALIFY are mutually exclusive.
False
4. QUALIFY and GROUP BY are mutually exclusive.
False



Module 22: Review Questions

True or False:

1. You can retain detailed information using Extended Grouping functions.
False
2. Parentheses can be nested within a CUBE.
True
3. For Extended Grouping functions, all non-aggregates must be enclosed
within parentheses.
False – but they must be included in the grouping
4. The following are equivalent for GROUPING SETS: ((m#, d#)) vs. (m#, d#)
False – 1 returns m# and d# totals as a group. 2 returns each individually
5. GROUPING SETS is the more explicit form of the Extended Grouping
functions.
True
6. GROUPING SETS always returns a grand total.
False
7. Grand totals are always retrieved when using CUBE and ROLLUP.
True



Module 23: Review Questions

True or False:

1. Views don’t require permanent space from their database.
True
2. You can not create an index on a view.
True
3. You can create a view of a view.
True
4. You can not drop a table that has a view referencing it.
False
5. HAVING and GROUP BY may both be specified within a view
definition.
True
6. You can aggregate a view column that is already being aggregated.
True



Module 24: Review Questions

True or False:

1. You can define a volatile table with a unique primary index.
True
2. The option ON COMMIT DELETE ROWS will result in the inserted rows
being immediately deleted after the insert.
False – this is true only for implicit transactions
3. You can not perform a SHOW TABLE on a derived table.
True
4. You cannot qualify a volatile table create with a database name other than
your user name.
True
5. Volatile tables and derived tables use the same kind of space.
True
6. You can create hundreds of derived tables for a single query.
False – True in TD13
7. You can drop a volatile table at any time during your session.
True

Appendix B

Solutions to Lab Exercises

This Appendix contains possible solutions
to the lab exercises.

Teradata Proprietary and Confidential

Solutions to Lab Exercises Page B-1


Module 2: Lab Solution
Solutions for exercises 1 and 2 are observational.

3. Try each of the following in the order shown, and note if it fails by looking at the
bottom-left portion of the utility screen. For those that fail, “double-click” the
“Notes” field for the failed request in the “History Window”.

HELP DATABASE;
3707: Syntax error, expected something like a name or a Unicode delimited identifier between
the 'DATABASE' keyword and ';'.

HELP DATABASE yourusername;
No rows are returned.

DATABASE yourusername;
The default database should be set to your logon and be reflected at the top of the screen in the
center banner.

SHOW TABLE Employee;
3807: Object 'Employee' does not exist.

DATABASE Employee_sales;
The default database should be set to Employee_sales and be reflected at the top of the screen
in the center banner.

SHOW TABLE Employee;
The table definition should be displayed.

SHOW TABLE CS_VIEWS.Employee;
3853: 'Employee' is not a table.

SHOW VIEW CS_VIEWS.Employee;
The view definition should be displayed.

Exercise 4 is hands-on.
Try the various drag-and-drop methods outlined in the module.



Module 3: Lab Solution
1. Select all columns for all departments from the department table.

SELECT * FROM Department;

department_number department_name           budget_amount manager_employee_number
----------------- ------------------------- ------------- -----------------------
              403 education                     932000.00                    1005
              600 ?                                     ?                    1099
              402 ?                             308000.00                    1011
              201 technical operations          293800.00                    1025
              100 president                     400000.00                     801
              302 product planning              226000.00                    1016
              301 research and development      465600.00                    1019
              501 marketing sales               308000.00                    1017
              401 customer support              982300.00                    1003

2. Request a report of employee last and first names and salary for all of
manager 1019's employees. Order the report in last name ascending
sequence.

SELECT Last_Name, First_Name, Salary_Amount
FROM Employee_sales.Employee
WHERE Manager_Employee_Number = 1019
ORDER BY 1;

last_name            first_name                     salary_amount
-------------------- ------------------------------ -------------
Kanieski             Carol                               29250.00
Stein                John                                29450.00

3. Project a distinct list of job codes which have been assigned to people and are greater
than 510000 and sort the result descending.

SELECT DISTINCT Job_Code
FROM Employee
WHERE Job_Code > 510000
ORDER BY 1 DESC;

   job_code
-----------
     512101
     511100



4. What are the first names of people with a last name of “Brown”?

SELECT Last_Name, First_Name
FROM Employee
WHERE Last_Name = 'brown';

last_name            first_name
-------------------- ------------------------------
Brown                Allen
Brown                Alan

5. How many people have been assigned job codes greater than or equal to 510001?
(Since aggregation has not been taught yet, you will have to count them manually.
Or can SQL Assistant tell you?)

SELECT *
FROM Employee
WHERE Job_Code >= 510001;

The history window shows that 4 rows have been returned for this query.



Module 4: Lab Solution
1. List last names and department numbers for employees in departments 301, 401, and
501.

SELECT Last_Name, Department_Number
FROM Employee
WHERE Department_Number IN (301, 401, 501);

last_name department_number
-------------------- -----------------
Kanieski 301
Kubic 301
Ratzlaff 501
Hoover 401
Rogers 401
Wilson 501
Phillips 401
Machado 401
Rabbit 501
Johnson 401
Stein 301
Trader 401
Brown 401
Runyon 501

2. Project the last names of employees whose salary is greater than or equal to
$28,078.

SELECT Last_Name
FROM Employee
WHERE Salary_Amount >= 28078
ORDER BY Last_Name;

last_name
--------------------
Brown
Brown
Daly
Hopkins
Johnson
Kanieski
Lombardo
Morrissey
Ratzlaff
Rogers
Rogers
Runyon
Ryan
Short
Stein
Trader
Trainer
Villegas
Wilson



3. Modify #1 to include those employees who have a job code of either 512102 or
432101.

SELECT Last_Name, Department_Number
FROM Employee
WHERE Department_Number IN (301, 401, 501)
OR Job_Code IN (512101, 432101)
ORDER BY 1, 2;

last_name department_number
-------------------- -----------------
Brown 401
Hoover 401
Johnson 401
Kanieski 301
Kubic 301
Machado 401
Phillips 401
Rabbit 501
Ratzlaff 501
Rogers 401
Runyon 501
Stein 301
Trader 401
Wilson 501

4. Modify #3 to show only those whose salary amounts are between $50,000 and
$60,000.

SELECT Last_Name, Department_Number
FROM Employee
WHERE
(Department_Number IN (301, 401, 501)
OR Job_Code IN (512101, 432101))
AND
(Salary_Amount BETWEEN 50000 AND 60000)
ORDER BY 1, 2;

last_name department_number
-------------------- -----------------
Ratzlaff 501
Wilson 501



Module 5: Lab Solution

1. Request separate reports of employees who have not been assigned to a
Department, then those who have not been given a job code.

SELECT last_name, first_name, department_number, job_code
FROM Employee
WHERE department_number IS NULL;

last_name            first_name   department_number    job_code
-------------------- ------------ ----------------- -----------
Rogers               Nora                         ?      321100
Short                Michael                      ?      211100
Morrissey            Jim                          ?      222101

SELECT last_name, first_name, job_code
FROM Employee
WHERE job_code IS NULL
ORDER BY 1;

last_name            first_name                        job_code
-------------------- ------------------------------ -----------
Brown                Allen                                    ?
Charles              John                                     ?
Hopkins              Paulene                                  ?
Lombardo             Domingus                                 ?
Villegas             Arnando                                  ?

2. Using an IN list, display employees with any of the following job codes: 412101,
412109, NULL.

SELECT last_name, first_name, job_code
FROM Employee
WHERE job_code IN (412101, 412109)
OR job_code IS NULL
ORDER BY 1;

last_name            first_name                        job_code
-------------------- ------------------------------ -----------
Rogers               Frank                               412101
Brown                Allen                                    ?
Hopkins              Paulene                                  ?
Lombardo             Domingus                                 ?
Villegas             Arnando                                  ?
Charles              John                                     ?
Hoover               William                             412101
Johnson              Darlene                             412101



3. Rewrite #2 using all OR’ed conditions.

SELECT last_name, first_name, job_code
FROM Employee
WHERE job_code = 412101
OR job_code = 412109
OR job_code IS NULL
ORDER BY 1;

last_name            first_name                        job_code
-------------------- ------------------------------ -----------
Rogers               Frank                               412101
Brown                Allen                                    ?
Hopkins              Paulene                                  ?
Lombardo             Domingus                                 ?
Villegas             Arnando                                  ?
Charles              John                                     ?
Hoover               William                             412101
Johnson              Darlene                             412101

4. List employees with unassigned job codes who have salaries between $30,000 and $40,000.

SELECT last_name, first_name, job_code, salary_amount
FROM Employee
WHERE job_code IS NULL
AND salary_amount BETWEEN 30000 AND 40000
ORDER BY 1;

last_name first_name job_code salary_amount
-------------------- ---------- ----------- -------------
Hopkins Paulene ? 37900.00
Lombardo Domingus ? 31000.00

Module 6: Lab Solution

1. List the first and last names of employees whose last name begins with “R”, “S”, or “T”.
(Do this without regard to case sensitivity.)

SELECT first_name, last_name
FROM Employee
WHERE last_name LIKE 'r%'
OR last_name LIKE 's%'
OR last_name LIKE 't%'
ORDER BY 1, 2;

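Another solution (note that BETWEEN treats '%' as a literal character, not a wildcard, so
this range ends at 't%' and omits last names such as Trader and Trainer; the result below
matches this query):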

SELECT first_name, last_name
FROM Employee
WHERE last_name BETWEEN 'r%' AND 't%'
ORDER BY 2, 1;

first_name last_name
------------------------------ --------------------
Peter Rabbit
Larry Ratzlaff
Frank Rogers
Nora Rogers
Irene Runyon
Loretta Ryan
Michael Short
John Stein
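
The LIKE and BETWEEN forms above are not interchangeable, which is why the listing contains no last names beginning with “T”. The following is a minimal sketch comparing them, using Python's built-in sqlite3 module as a stand-in for Teradata. The one-column table and the COLLATE NOCASE declaration (used to imitate a case-insensitive comparison) are assumptions; the half-open range in the last query is one possible alternative, not the course's solution.

```python
import sqlite3

# Hypothetical stand-in table; COLLATE NOCASE imitates case-insensitive comparison.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (last_name TEXT COLLATE NOCASE)")
conn.executemany(
    "INSERT INTO employee VALUES (?)",
    [("Kubic",), ("Rabbit",), ("Stein",), ("Trader",)],
)


def names(where_clause):
    """Return the last names matching the given WHERE clause, sorted."""
    sql = "SELECT last_name FROM employee WHERE " + where_clause + " ORDER BY 1"
    return [row[0] for row in conn.execute(sql)]


# Pattern match: '%' is a wildcard here.
like_names = names(
    "last_name LIKE 'r%' OR last_name LIKE 's%' OR last_name LIKE 't%'"
)

# Range test: '%' is an ordinary character, and 'Trader' sorts after 't%'.
between_names = names("last_name BETWEEN 'r%' AND 't%'")

# A half-open range that covers every name starting with r, s, or t.
range_names = names("last_name >= 'r' AND last_name < 'u'")

print(like_names)     # ['Rabbit', 'Stein', 'Trader']
print(between_names)  # ['Rabbit', 'Stein']
print(range_names)    # ['Rabbit', 'Stein', 'Trader']
```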

2. Write a request that shows what the salary amount would be for the people identified in #1 after
a 10% increase, listing only those whose increased salary exceeds $50,000.

first_name last_name newsal
------------------------------ -------------------- --------------
Larry Ratzlaff 59400.000
Frank Rogers 50600.000
Nora Rogers 62150.000
Irene Runyon 72600.000

3. Project new employee job codes (from the Employee table) for all those job codes
ending in 101, increasing them by 100. Include last names, job codes, and department
numbers to help verify results.

SELECT last_name (CHAR(10)), first_name (CHAR(10)), job_code, job_code + 100 AS
newjob, department_number AS dept#
FROM employee;

last_name first_name job_code newjob dept#
---------- ---------- ----------- ----------- -----------
Runyon Irene 511100 511200 501
Crane Robert 422101 422201 402
Rogers Nora 321100 321200 ?
Brown Alan 413201 413301 401
Stein John 312101 312201 301
Phillips Charles 412102 412202 401
Hopkins Paulene ? ? 403
Ratzlaff Larry 512101 512201 501
Kanieski Carol 312102 312202 301
Charles John ? ? 403
Trader James 411100 411200 401
Wilson Edward 512101 512201 501
Rogers Frank 412101 412201 401
Daly James 421100 421200 402
Brown Allen ? ? 403
Short Michael 211100 211200 ?
Kubic Ron 311100 311200 301
Rabbit Peter 512101 512201 501
Trainer I.B. 111100 111200 999
Morrissey Jim 222101 222201 ?
Hoover William 412101 412201 401
Johnson Darlene 412101 412201 401
Lombardo Domingus ? ? 403
Villegas Arnando ? ? 403
Machado Albert 412102 412202 401
Ryan Loretta 431100 431200 403

Module 7: Lab Solution
1. From the Employee table, display the last name and first name for employees 1013, 1018,
and 1024. Concatenate the columns so that you see them as “last, first”.

SELECT TRIM(last_name)||', '||TRIM(first_name) AS FullName
FROM employee
WHERE employee_number IN (1013, 1018, 1024);

FullName
----------------------------------------------------
Brown, Allen
Phillips, Charles
Ratzlaff, Larry

2. Repeat #1, replacing your WHERE clause with one that uses LIKE to list only employees who have
an "LL" combination in their last name.

FullName
----------------------------------------------------
Phillips, Charles
Villegas, Arnando

3. Using POSITION, change #2 to also include last names having an “FF” combination in
their last name.

FullName
------------------------
Phillips, P.
Ratzlaff, R.
Villegas, V.

Module 8: Lab Solution

1. For those employees who work in departments 301 or 401, list those whose
salary is less than $35,000.00 and order the result by last name and then first
name.

SELECT last_name, first_name, salary_amount
FROM employee
WHERE salary_amount < 35000
AND department_number IN (301, 401)
ORDER BY 1, 2;

last_name first_name salary_amount
-------------------- ------------------------------ -------------
Hoover William 25525.00
Kanieski Carol 29250.00
Phillips Charles 24500.00
Stein John 29450.00

2. Use UNION to combine the results of #1 with those who earn more than $10,000.00.
Alias last name to LNM and first name to FNM.

SELECT last_name LNM, first_name FNM, salary_amount
FROM employee
WHERE salary_amount < 35000
AND department_number IN (301, 401)
UNION
SELECT last_name, first_name, salary_amount
FROM employee
WHERE salary_amount > 10000
ORDER BY 1, 2;

LNM FNM salary_amount
-------------------- ------------------------------ -------------
Brown Alan 43100.00
Brown Allen 43700.00
Daly James 52500.00
Hoover William 25525.00
Hopkins Paulene 37900.00
Johnson Darlene 36300.00
Kanieski Carol 29250.00
Lombardo Domingus 31000.00
Morrissey Jim 38750.00
Phillips Charles 24500.00
Rabbit Peter 26500.00
Ratzlaff Larry 54000.00
Rogers Frank 46000.00
Rogers Nora 56500.00
Runyon Irene 66000.00
Ryan Loretta 31200.00
Short Michael 34700.00
Stein John 29450.00
Trader James 37850.00
Trainer I.B. 100000.00
Villegas Arnando 49700.00
Wilson Edward 53625.00

3. Add the ALL option to #2 and note the different result.

SELECT last_name LNM, first_name FNM, salary_amount
FROM employee
WHERE salary_amount < 35000
AND department_number IN (301, 401)
UNION ALL
SELECT last_name, first_name, salary_amount
FROM employee
WHERE salary_amount > 10000
ORDER BY 1, 2;

LNM FNM salary_amount
-------------------- ------------------------------ -------------
Brown Alan 43100.00
Brown Allen 43700.00
Daly James 52500.00
Hoover William 25525.00
Hoover William 25525.00
Hopkins Paulene 37900.00
Johnson Darlene 36300.00
Kanieski Carol 29250.00
Kanieski Carol 29250.00
Lombardo Domingus 31000.00
Morrissey Jim 38750.00
Phillips Charles 24500.00
Phillips Charles 24500.00
Rabbit Peter 26500.00
Ratzlaff Larry 54000.00
Rogers Frank 46000.00
Rogers Nora 56500.00
Runyon Irene 66000.00
Ryan Loretta 31200.00
Short Michael 34700.00
Stein John 29450.00
Stein John 29450.00
Trader James 37850.00
Trainer I.B. 100000.00
Villegas Arnando 49700.00
Wilson Edward 53625.00
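
The difference between the two listings comes from duplicate handling: UNION removes duplicate rows, UNION ALL keeps them, and INTERSECT keeps only rows returned by both SELECTs. The following is a minimal sketch of all three, using Python's built-in sqlite3 module as a stand-in for Teradata. The three rows are taken from the listings above; the cut-down table layout and the helper name `run` are assumptions made for illustration.

```python
import sqlite3

# Cut-down stand-in for the Employee table (assumed layout, three rows only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (last_name TEXT, department_number INTEGER, "
    "salary_amount REAL)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [("Hoover", 401, 25525.00), ("Stein", 301, 29450.00), ("Daly", 402, 52500.00)],
)

low_paid = (
    "SELECT last_name, salary_amount FROM employee "
    "WHERE salary_amount < 35000 AND department_number IN (301, 401)"
)
over_10k = (
    "SELECT last_name, salary_amount FROM employee WHERE salary_amount > 10000"
)


def run(operator):
    """Combine the two SELECTs with the given set operator."""
    sql = low_paid + " " + operator + " " + over_10k + " ORDER BY 1"
    return [row[0] for row in conn.execute(sql)]


union_names = run("UNION")          # duplicates removed
union_all_names = run("UNION ALL")  # duplicates kept
intersect_names = run("INTERSECT")  # rows present in both results

print(union_names)      # ['Daly', 'Hoover', 'Stein']
print(union_all_names)  # ['Daly', 'Hoover', 'Hoover', 'Stein', 'Stein']
print(intersect_names)  # ['Hoover', 'Stein']
```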

4. Using a SET operator, change #1 to find those who satisfy both the department and
salary conditions.

SELECT last_name LNM, first_name FNM, salary_amount
FROM employee
WHERE salary_amount < 35000
AND department_number IN (301, 401)
INTERSECT
SELECT last_name, first_name, salary_amount
FROM employee
WHERE salary_amount > 10000
ORDER BY 1, 2;

LNM FNM salary_amount
-------------------- ------------------------------ -------------
Hoover William 25525.00
Kanieski Carol 29250.00
Phillips Charles 24500.00
Stein John 29450.00

Module 9: Lab Solution

1. Write a subquery that finds employees who are not employee managers (i.e., not
listed as managers in the employee table).

SELECT last_name, first_name, employee_number
FROM Employee
WHERE employee_number NOT IN
(SELECT manager_employee_number FROM employee);

19 rows

last_name first_name employee_number
-------------------- ------------------------------ ---------------
Wilson Edward 1015
Crane Robert 1014
Rogers Nora 1016
Brown Alan 1002
Stein John 1006
Phillips Charles 1013
Hopkins Paulene 1012
Ratzlaff Larry 1018
Kanieski Carol 1008
Charles John 1020
Machado Albert 1022
Rogers Frank 1010
Brown Allen 1024
Johnson Darlene 1004
Lombardo Domingus 1009
Rabbit Peter 1023
Morrissey Jim 1021
Hoover William 1001
Villegas Arnando 1007
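
The NOT IN solution returns rows here, which suggests the course data has no null manager numbers. If the subquery did return a NULL, NOT IN would return no rows at all. The following is a minimal sketch of that pitfall and two ways around it, using Python's built-in sqlite3 module as a stand-in for Teradata; the three-row hierarchy, including employee 801 with a null manager, is hypothetical.

```python
import sqlite3

# Hypothetical hierarchy: 1001 reports to 1003, 1003 to 801, 801 to nobody.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (employee_number INTEGER, "
    "manager_employee_number INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [(1001, 1003), (1003, 801), (801, None)],
)

# The NULL returned by the subquery makes every NOT IN test unknown.
not_in = conn.execute(
    "SELECT employee_number FROM employee "
    "WHERE employee_number NOT IN "
    "  (SELECT manager_employee_number FROM employee)"
).fetchall()

# Filtering the nulls out of the subquery restores the expected answer.
not_in_filtered = conn.execute(
    "SELECT employee_number FROM employee "
    "WHERE employee_number NOT IN "
    "  (SELECT manager_employee_number FROM employee "
    "   WHERE manager_employee_number IS NOT NULL)"
).fetchall()

# A correlated NOT EXISTS is not affected by nulls.
not_exists = conn.execute(
    "SELECT employee_number FROM employee e "
    "WHERE NOT EXISTS "
    "  (SELECT 1 FROM employee m "
    "   WHERE m.manager_employee_number = e.employee_number)"
).fetchall()

print(not_in)           # []
print(not_in_filtered)  # [(1001,)]
print(not_exists)       # [(1001,)]
```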

2. Edit #1 to find employees who are neither employee managers nor department
managers.

SELECT last_name, first_name, employee_number
FROM Employee
WHERE employee_number NOT IN
(SELECT manager_employee_number FROM employee)
AND employee_number NOT IN
(SELECT manager_employee_number FROM department);

18 rows

last_name first_name employee_number
-------------------- ------------------------------ ---------------
Wilson Edward 1015
Crane Robert 1014
Rogers Frank 1010
Brown Alan 1002
Stein John 1006
Phillips Charles 1013
Hopkins Paulene 1012
Ratzlaff Larry 1018
Kanieski Carol 1008
Charles John 1020
Machado Albert 1022
Morrissey Jim 1021
Brown Allen 1024
Johnson Darlene 1004
Lombardo Domingus 1009
Rabbit Peter 1023
Hoover William 1001
Villegas Arnando 1007

Extra tough

4. Write a nested subquery that finds employees whose managers are department
managers that are not managers in the employee table.

SELECT last_name, first_name, employee_number
FROM Employee
WHERE manager_employee_number IN
(SELECT manager_employee_number FROM department
WHERE manager_employee_number NOT IN
(SELECT manager_employee_number FROM employee));

No rows found

To verify with two, un-nested, subqueries:

SELECT last_name, first_name, employee_number
FROM Employee
WHERE manager_employee_number IN
(SELECT manager_employee_number FROM department)
AND manager_employee_number NOT IN
(SELECT manager_employee_number FROM employee);

No rows found

Module 10: Lab Solution

1. List all employees by name, the name of their department, their original salary, and
salary again with a ten percent increase, for those working in departments with
budgets > $40,000.00. Make the last and first name 10 characters each and use the
implicit form of inner join.

SELECT CAST(e.last_name AS CHAR(10)) AS Lnm,
CAST(e.first_name AS CHAR(10)) AS fnm,
d.department_name,
e.salary_amount AS OldSal,
e.salary_amount * 1.10 AS NewSal
FROM employee e, department d
WHERE e.department_number = d.department_number
AND d.budget_amount > 40000
ORDER BY 1, 2;

Lnm fnm department_name OldSal NewSal
---------- ---------- ------------------------------ ------------ ---------------
Brown Alan customer support 43100.00 47410.0000
Brown Allen education 43700.00 48070.0000
Charles John education ? ?
Crane Robert ? ? ?
Daly James ? 52500.00 57750.0000
Hoover William customer support 25525.00 28077.5000
Hopkins Paulene education 37900.00 41690.0000
Johnson Darlene customer support 36300.00 39930.0000
Kanieski Carol research and development 29250.00 32175.0000
Kubic Ron research and development ? ?
Lombardo Domingus education 31000.00 34100.0000
Machado Albert customer support ? ?
Phillips Charles customer support 24500.00 26950.0000
Rabbit Peter marketing sales 26500.00 29150.0000
Ratzlaff Larry marketing sales 54000.00 59400.0000
Rogers Frank customer support 46000.00 50600.0000
Runyon Irene marketing sales 66000.00 72600.0000
Ryan Loretta education 31200.00 34320.0000
Stein John research and development 29450.00 32395.0000
Trader James customer support 37850.00 41635.0000
Villegas Arnando education 49700.00 54670.0000
Wilson Edward marketing sales 53625.00 58987.5000

2. Find the department names and employee names for employees that have both an
“i” and an “e” in their last name. Make the last and first name 10 characters each
and use the explicit form of inner join.

SELECT CAST(e.last_name AS CHAR(10)) AS Lnm,
CAST(e.first_name AS CHAR(10)) AS fnm,
d.department_name
FROM employee e inner join
department d
ON e.department_number = d.department_number
WHERE e.last_name LIKE ALL ('%i%', '%e%')
ORDER BY 1, 2;

Lnm fnm department_name
---------- ---------- ------------------------------
Kanieski Carol research and development
Stein John research and development
Villegas Arnando education

3. Use POSITION to list department names that have people working in them whose
job description has the word “sales” in it. List the employee names as well.

SELECT CAST(e.last_name AS CHAR(10)) AS Lnm,
CAST(e.first_name AS CHAR(10)) AS fnm,
d.department_name
FROM employee e, department d
WHERE e.department_number = d.department_number
AND POSITION('sales' IN d.department_name) > 0
ORDER BY 1, 2;

Lnm fnm department_name
---------- ---------- ------------------------------
Rabbit Peter marketing sales
Ratzlaff Larry marketing sales
Runyon Irene marketing sales
Wilson Edward marketing sales

Optional

4. Write a cross join that lists all possible combinations of first names and last
names from employee.

SELECT e1.first_name, e2.last_name
FROM employee e1, employee e2
ORDER BY 1, 2;

676 rows returned

Module 11: Lab Solution

1. From the employee and department tables, list employee last names, first names, the
department names and the employees' department numbers only, for all employees.
Compare this to the number of rows returned by the inner join.

Inner join: 22 rows

SELECT CAST(e.last_name AS CHAR(10)),
CAST(e.first_name AS CHAR(10)),
d.department_name
FROM employee e, department d
WHERE e.department_number = d.department_number
ORDER BY 1, 2;

last_name first_name department_name
---------- ---------- ------------------------------
Brown Alan customer support
Brown Allen education
Charles John education
Crane Robert ?
Daly James ?
Hoover William customer support
Hopkins Paulene education
Johnson Darlene customer support
Kanieski Carol research and development
Kubic Ron research and development
Lombardo Domingus education
Machado Albert customer support
Phillips Charles customer support
Rabbit Peter marketing sales
Ratzlaff Larry marketing sales
Rogers Frank customer support
Runyon Irene marketing sales
Ryan Loretta education
Stein John research and development
Trader James customer support
Villegas Arnando education
Wilson Edward marketing sales

Outer join: 26 rows

SELECT CAST(e.last_name AS CHAR(10)),
CAST(e.first_name AS CHAR(10)),
d.department_name
FROM employee e LEFT JOIN department d
ON e.department_number = d.department_number
ORDER BY 1, 2;

last_name first_name department_name
---------- ---------- ------------------------------
Brown Alan customer support
Brown Allen education
Charles John education
Crane Robert ?
Daly James ?
Hoover William customer support
Hopkins Paulene education
Johnson Darlene customer support
Kanieski Carol research and development
Kubic Ron research and development
Lombardo Domingus education
Machado Albert customer support
Morrissey Jim ?
Phillips Charles customer support
Rabbit Peter marketing sales
Ratzlaff Larry marketing sales
Rogers Frank customer support
Rogers Nora ?
Runyon Irene marketing sales
Ryan Loretta education
Short Michael ?
Stein John research and development
Trader James customer support
Trainer I.B. ?
Villegas Arnando education
Wilson Edward marketing sales

2. For the outer join in #1, add the department number from the department table to
the projection to see which rows are actually outer results.

SELECT d.department_number,
CAST(e.last_name AS CHAR(10)),
CAST(e.first_name AS CHAR(10)),
d.department_name
FROM employee e LEFT JOIN department d
ON e.department_number = d.department_number
ORDER BY 1, 2;

department_number last_name first_name department_name
----------------- ---------- ---------- ------------------------
? Morrissey Jim ?
? Rogers Nora ?
? Short Michael ?
? Trainer I.B. ?
301 Kanieski Carol research and development
301 Kubic Ron research and development
301 Stein John research and development
401 Brown Alan customer support
401 Hoover William customer support
401 Johnson Darlene customer support
401 Machado Albert customer support
401 Phillips Charles customer support
401 Rogers Frank customer support
401 Trader James customer support
402 Crane Robert ?
402 Daly James ?
403 Brown Allen education
403 Charles John education
403 Hopkins Paulene education
403 Lombardo Domingus education
403 Ryan Loretta education
403 Villegas Arnando education
501 Rabbit Peter marketing sales
501 Ratzlaff Larry marketing sales
501 Runyon Irene marketing sales
501 Wilson Edward marketing sales
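
The rows with a null department number from the department table are the ones the inner join drops. The following is a minimal sketch of that comparison, using Python's built-in sqlite3 module as a stand-in for Teradata. The rows are a small subset taken from the listings above; the cut-down table layouts are assumptions made for illustration.

```python
import sqlite3

# Cut-down stand-ins for the Employee and Department tables (assumed layouts).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (last_name TEXT, department_number INTEGER)")
conn.execute(
    "CREATE TABLE department (department_number INTEGER, department_name TEXT)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [("Stein", 301), ("Short", None), ("Trainer", 999)],
)
conn.execute("INSERT INTO department VALUES (301, 'research and development')")

# Inner join: only employees whose department number finds a match.
inner_rows = conn.execute(
    "SELECT e.last_name, d.department_number "
    "FROM employee e INNER JOIN department d "
    "ON e.department_number = d.department_number ORDER BY 1"
).fetchall()

# Left outer join: every employee, with nulls where no department matches.
left_rows = conn.execute(
    "SELECT e.last_name, d.department_number "
    "FROM employee e LEFT JOIN department d "
    "ON e.department_number = d.department_number ORDER BY 1"
).fetchall()

# Rows whose department-side column is null are the outer results.
outer_only = [name for name, dept in left_rows if dept is None]

print(inner_rows)  # [('Stein', 301)]
print(left_rows)   # [('Short', None), ('Stein', 301), ('Trainer', None)]
print(outer_only)  # ['Short', 'Trainer']
```

Short has no department number at all, while Trainer has one (999) that does not exist in the department table; both surface only in the outer join.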

3. From the employee, department, and job tables, list employee last names, first names,
and the join columns from all three tables for all employees. Do any of these
employees have both an invalid department and invalid job?

Answer: yes, Michael Short.

SELECT CAST(e.last_name AS CHAR(10)) AS Lname,
CAST(e.first_name AS CHAR(10)) AS Fname,
d.department_number,
j.job_code,
e.salary_amount
FROM employee e LEFT JOIN job j
ON e.job_code = j.job_code
LEFT JOIN department d
ON e.department_number = d.department_number
WHERE e.salary_amount BETWEEN 34000 AND 58000
ORDER BY 1, 2;

Lname Fname department_number job_code salary_amount
---------- ---------- ----------------- ----------- -------------
Brown Alan 401 413201 43100.00
Brown Allen 403 ? 43700.00
Daly James 402 421100 52500.00
Hopkins Paulene 403 ? 37900.00
Johnson Darlene 401 412101 36300.00
Morrissey Jim ? 222101 38750.00
Ratzlaff Larry 501 512101 54000.00
Rogers Frank 401 412101 46000.00
Rogers Nora ? 321100 56500.00
Short Michael ? ? 34700.00
Trader James 401 411100 37850.00
Villegas Arnando 403 ? 49700.00
Wilson Edward 501 512101 53625.00

Module 12: Lab Solution
1. Write a correlated subquery to find employees with invalid job_codes.

SELECT last_name, first_name
FROM employee e
WHERE NOT EXISTS
(SELECT * FROM job j
WHERE e.job_code = j.job_code)
ORDER BY 1, 2;

last_name first_name
-------------------- ------------------------------
Brown Allen
Charles John
Hopkins Paulene
Lombardo Domingus
Short Michael
Villegas Arnando

2. For query #1, include their department name.

SELECT e.last_name, e.first_name, d.department_name
FROM employee e, department d
WHERE NOT EXISTS
(SELECT * FROM job j
WHERE e.job_code = j.job_code)
AND e.department_number = d.department_number
ORDER BY 1, 2;

last_name first_name department_name
-------------------- --------------- ------------------------------
Brown Allen education
Charles John education
Hopkins Paulene education
Lombardo Domingus education
Villegas Arnando education

Module 13: Lab Solution

1. Display the salary sums, by job code within department, for all employees who
work for managers 1003, 1004, or 1017.

SELECT department_number, job_code, SUM(salary_amount) AS sumsal
FROM employee
WHERE manager_employee_number IN (1003, 1004, 1017)
GROUP BY 1, 2
ORDER BY 1, 2;

department_number job_code sumsal
----------------- ----------- ------------
401 412101 107825.00
401 412102 24500.00
401 413201 43100.00
501 512101 134125.00

2. Add manager numbers to the result of #1.

SELECT Manager_employee_number AS Mgr#,
department_number,
job_code,
SUM(salary_amount) AS sumsal
FROM employee
WHERE manager_employee_number IN (1003, 1004, 1017)
GROUP BY 1, 2, 3
ORDER BY 1, 2, 3;

Mgr# department_number job_code sumsal
----------- ----------------- ----------- ------------
1003 401 412101 107825.00
1003 401 412102 24500.00
1003 401 413201 43100.00
1017 501 512101 134125.00

3. Find the average budget amount for each department, and another average that
includes a 50% budget increase.

SELECT department_number,
AVG(budget_amount) AS avgbudget,
CAST(Avgbudget * 1.5 AS DEC(15,2))
FROM department
GROUP BY 1
ORDER BY 1;

Note that department 600 has no budget amount (null), so both of its averages are null.

department_number avgbudget (avgbudget*1.5)
----------------- ------------ -----------------
100 400000.00 600000.00
201 293800.00 440700.00
301 465600.00 698400.00
302 226000.00 339000.00
401 982300.00 1473450.00
402 308000.00 462000.00
403 932000.00 1398000.00
501 308000.00 462000.00
600 ? ?

4. Count the number of distinct manager numbers and distinct departments from the
employee table.

SELECT COUNT (DISTINCT manager_employee_number) AS CDM#,
COUNT (DISTINCT department_number) AS CDD#
FROM employee;

CDM# CDD#
----------- -----------
7 6

5. Find the minimum and maximum salaries within each department by department
name with a count of the number of employees for each department. Return only
where there are more than 5 employees in the department.

SELECT e.department_number AS d#,
d.department_name AS DName,
MIN(e.salary_amount) AS MinSal,
MAX(e.salary_amount) AS MaxSal,
COUNT(*) AS cntstar
FROM employee e JOIN department d
ON e.department_number = d.department_number
GROUP BY 1, 2
HAVING cntstar > 5
ORDER BY 1;

d# DName MinSal MaxSal cntstar
------- ----------------- ----------- ------------ ------------
401 customer support 24500.00 46000.00 7
403 education 31000.00 49700.00 6

Module 14: Lab Solution

1. Display employee information you deem necessary to compare salary changes for
the people in their respective departments as shown in the chart below.

Where their salary amount is null, make it equal to their job code BEFORE DOING THE
CHANGE.

Department    Change in Salary
----------    -----------------------
301           10%
NULL          use job code for salary
501           20%

SELECT CAST(last_name AS CHAR(10)) AS Lnm,
CAST(First_name AS CHAR(10)) AS Fnm,
Department_number AS d#,
Job_code AS Job,
Salary_Amount AS Salary,
CASE WHEN d# = 301 THEN salary * 1.1
WHEN d# = 501 THEN salary * 1.2
WHEN d# IS NULL THEN job
END AS new_salary
FROM employee
WHERE d# IS NULL
OR d# IN (301, 501)
ORDER BY 1, 2;

Lnm Fnm d# Job Salary new_salary
---------- ---------- ------ ----------- ------------ ---------------
Kanieski Carol 301 312102 29250.00 32175.000
Kubic Ron 301 311100 ? ?
Morrissey Jim ? 222101 38750.00 222101.000
Rabbit Peter 501 512101 26500.00 31800.000
Ratzlaff Larry 501 512101 54000.00 64800.000
Rogers Nora ? 321100 56500.00 321100.000
Runyon Irene 501 511100 66000.00 79200.000
Short Michael ? 211100 34700.00 211100.000
Stein John 301 312101 29450.00 32395.000
Wilson Edward 501 512101 53625.00 64350.000

2. Display whatever employee information you deem necessary to return the sums and
two averages for department salaries. One average should be that where a null salary
is made into a zero, and another where a zero salary is made into a null.

Verify that, where different averages occur, the zero-to-null salary averages should
be larger.

SELECT department_number AS d#,
AVG(CASE WHEN salary_amount IS NULL THEN 0 ELSE salary_amount
END) AS null_to_zero,
AVG(CASE WHEN salary_amount = 0 THEN NULL ELSE salary_amount
END) AS zero_to_null,
AVG(salary_Amount) AS Avgal
FROM employee
GROUP BY 1
ORDER BY 1;

Note that “zero to null” averages should be equal to or larger than “null to zero” averages,
since AVG ignores nulls and therefore divides by a smaller count.

Since there are no salary amounts of zero (0), the last average should match the second
average.

Since there are null salary amounts, the first average should be equal to or less than the
other two averages, since converting nulls to zeros increases the number of values to divide by.

d# null_to_zero zero_to_null Avgal
----------- ------------ ------------ ------------
? 43316.67 43316.67 43316.67
301 19566.67 29350.00 29350.00
401 30467.86 35545.83 35545.83
402 26250.00 52500.00 52500.00
403 32250.00 38700.00 38700.00
501 50031.25 50031.25 50031.25
999 100000.00 100000.00 100000.00
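
The department 402 row above can be checked by hand: it has one salary of 52500.00 and one null salary, so counting the null as zero halves the average. The following is a minimal sketch of that arithmetic, using Python's built-in sqlite3 module as a stand-in for Teradata. The two rows are taken from earlier listings; the cut-down table layout is an assumption made for illustration.

```python
import sqlite3

# Department 402 as it appears in earlier listings: Daly 52500.00, Crane null.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (last_name TEXT, department_number INTEGER, "
    "salary_amount REAL)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [("Daly", 402, 52500.00), ("Crane", 402, None)],
)

null_to_zero, zero_to_null, plain_avg = conn.execute(
    "SELECT "
    # Null becomes 0, so the divisor is 2: (52500 + 0) / 2.
    "AVG(CASE WHEN salary_amount IS NULL THEN 0 ELSE salary_amount END), "
    # No zero salaries exist, so nothing changes; the null is still ignored.
    "AVG(CASE WHEN salary_amount = 0 THEN NULL ELSE salary_amount END), "
    # Plain AVG ignores the null, so the divisor is 1.
    "AVG(salary_amount) "
    "FROM employee WHERE department_number = 402"
).fetchone()

print(null_to_zero)  # 26250.0
print(zero_to_null)  # 52500.0
print(plain_avg)     # 52500.0
```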

Module 15: Lab Solution

1. Use a derived table to list those departments whose budgets are greater than the
average budget for all departments.

Note that this is a cross join. A variation is shown afterwards.

SELECT department_name, budget_amount, avgbudget
FROM (SELECT AVG(budget_amount) AS avgbudget
FROM department) AS ab,
Department d
WHERE budget_amount > avgbudget;

department_name budget_amount avgbudget
------------------------------ ------------- ------------
education 932000.00 489462.50
customer support 982300.00 489462.50

Same query, different syntax.

SELECT department_name, budget_amount, avgbudget
FROM (SELECT AVG(budget_amount) AS avgbudget
FROM department) AS ab
CROSS JOIN
Department d
WHERE budget_amount > avgbudget;
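
The derived table returns a single row, so the cross join simply attaches the overall average to every department row before the WHERE clause filters them. The following is a minimal sketch of the same pattern, using Python's built-in sqlite3 module as a stand-in for Teradata. It loads only four of the departments listed in these solutions, so the average it computes differs from the 489462.50 shown above; the cut-down table layout is an assumption.

```python
import sqlite3

# Four departments from earlier listings (a subset, so the average differs).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE department (department_name TEXT, budget_amount REAL)")
conn.executemany(
    "INSERT INTO department VALUES (?, ?)",
    [
        ("research and development", 465600.00),
        ("customer support", 982300.00),
        ("education", 932000.00),
        ("marketing sales", 308000.00),
    ],
)

# The derived table ab has exactly one row: the overall average budget.
rows = conn.execute(
    "SELECT d.department_name, d.budget_amount, ab.avgbudget "
    "FROM (SELECT AVG(budget_amount) AS avgbudget FROM department) AS ab "
    "CROSS JOIN department d "
    "WHERE d.budget_amount > ab.avgbudget "
    "ORDER BY 1"
).fetchall()

for row in rows:
    print(row)
# ('customer support', 982300.0, 671975.0)
# ('education', 932000.0, 671975.0)
```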

2. Add to exercise #1 those employees who work in those departments.

SELECT e.last_name, d.department_name, d.budget_amount, ab.avgbudget
FROM (SELECT AVG(budget_amount) AS avgbudget
FROM department) AS ab,
Department d,
Employee e
WHERE e.department_number = d.department_number
AND d.budget_amount > ab.avgbudget
ORDER BY 2, 1;

last_name department_name budget_amount avgbudget
---------- -------------------- ------------- ------------
Brown customer support 982300.00 489462.50
Hoover customer support 982300.00 489462.50
Johnson customer support 982300.00 489462.50
Machado customer support 982300.00 489462.50
Phillips customer support 982300.00 489462.50
Rogers customer support 982300.00 489462.50
Trader customer support 982300.00 489462.50
Brown education 932000.00 489462.50
Charles education 932000.00 489462.50
Hopkins education 932000.00 489462.50
Lombardo education 932000.00 489462.50
Ryan education 932000.00 489462.50
Villegas education 932000.00 489462.50

3. Modify #1 to add the differences between the department’s budget and the
average.

SELECT department_name, budget_amount, avgbudget,
budget_amount - avgbudget AS BudgeDiff
FROM (SELECT AVG(budget_amount) AS avgbudget
FROM department) AS ab,
Department d
WHERE budget_amount > avgbudget;

department_name budget_amount avgbudget BudgeDiff
------------------ ------------- ----------- -------------
education 932000.00 489462.50 442537.50
customer support 982300.00 489462.50 492837.50

Module 16: Lab Solution

1. Return two 15-row samples from the employee table. Anything less than 15 rows
per sample is unacceptable. Project last and first names plus hire dates and birth
dates.

SELECT employee.*, SAMPLEID
FROM employee
SAMPLE WITH REPLACEMENT 15, 15
ORDER BY SAMPLEID;

This will return 2 samples but, due to the nature of sampling, the result cannot be duplicated.

2. Return two 50% row samples of employees for each of departments 401 and 501.

SELECT employee.*, SAMPLEID
FROM employee
SAMPLE .5, .5
ORDER BY SAMPLEID;

Due to the nature of sampling, this result would not be duplicated.

3. Using the same sampling as #2, SUM the salaries for each sample.

SELECT SID, SUM(salary_amount)
FROM
(SELECT salary_amount, SAMPLEID AS SID
FROM employee
SAMPLE .5, .5) AS t1
GROUP BY 1
ORDER BY 1;

Due to the nature of sampling, this result would not be duplicated.

Module 17: Lab Solution

1. List the top 5 salary amount VALUES in the employee table along with the last
name and first name of each employee.

SELECT TOP 5 last_name, first_name, salary_amount
FROM employee
ORDER BY 3 DESC;

last_name first_name salary_amount
-------------------- ------------------------------ -------------
Trainer I.B. 100000.00
Runyon Irene 66000.00
Rogers Nora 56500.00
Ratzlaff Larry 54000.00
Wilson Edward 53625.00

2. To see if it can be used on character data values, list the top 3 department names
from the department table by VALUE (in descending order).

SELECT TOP 3 department_name FROM department ORDER BY 1 DESC;

department_name
------------------------------
technical operations
research and development
product planning

3. Retrieve half of the job descriptions that have “manager” in the description using the
TOP N feature. Verify the result by doing a COUNT(*) from job.

Due to the nature of the feature you may not be able to duplicate this result.

SELECT TOP 50 PERCENT description
FROM JOB;

description
----------------------------------------
?
Hardware Engineer
Dispatcher
Manager - Marketing Sales
Manager - Research and Development
Sales Rep
Manager - Education
Corporate President
Manager - Customer Support
Mechanical Assembler

There are 20 rows in the job table.

4. Add employee names to those descriptions for #3.

SELECT e.last_name, e.first_name, description
FROM employee e,
(SELECT TOP 50 PERCENT job_code, description
FROM JOB) j
WHERE e.job_code = j.job_code
ORDER BY 1, 2;

Due to the nature of the feature you may not be able to duplicate this result.

last_name first_name description
---------- ---------- ----------------------------------------
Brown Alan Dispatcher
Kanieski Carol Hardware Engineer
Kubic Ron Manager - Research and Development
Rabbit Peter Sales Rep
Ratzlaff Larry Sales Rep
Runyon Irene Manager - Marketing Sales
Ryan Loretta Manager - Education
Trader James Manager - Customer Support
Trainer I.B. Corporate President
Wilson Edward Sales Rep

5. Retrieve the top 3 department salary sums by VALUE - descending.

SELECT TOP 3 *
FROM
(SELECT department_number, SUM(salary_amount) AS sumsal
FROM employee
GROUP BY 1) e
ORDER BY 2 DESC;

department_number sumsal
----------------- ------------
401 213275.00
501 200125.00
403 193500.00

Module 18: Lab Solution

1. From the “SalesTbl”, list the store id, product id, and sales for each row. Include with
each projected row the sum of the sales for each product across all stores, and
order by store id within product id.

SELECT storeid, prodid, sales,
SUM(sales) OVER (ORDER BY prodid, storeid)
FROM salestbl;

storeid prodid sales Group Sum(sales)
----------- ------ ----------- ----------------
1001 A 100000.00 610000.00
1002 A 40000.00 610000.00
1003 A 30000.00 610000.00
1003 B 65000.00 610000.00
1001 C 60000.00 610000.00
1002 C 35000.00 610000.00
1003 C 20000.00 610000.00
1001 D 35000.00 610000.00
1002 D 25000.00 610000.00
1003 D 50000.00 610000.00
1001 F 150000.00 610000.00
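
Every row in the listing carries the grand total of 610000.00, which is consistent with the window covering the whole result when no ROWS clause is given. Other databases may default to a running total once ORDER BY appears in the window, so it can help to spell the frame out. The following is a minimal sketch contrasting three window specifications, using Python's built-in sqlite3 module as a stand-in for Teradata. It loads only the product A and B rows from the listing, and the column aliases are assumptions made for illustration.

```python
import sqlite3

# Products A and B from the listing above (a subset of SalesTbl).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE salestbl (storeid INTEGER, prodid TEXT, sales INTEGER)")
conn.executemany(
    "INSERT INTO salestbl VALUES (?, ?, ?)",
    [(1001, "A", 100000), (1002, "A", 40000), (1003, "A", 30000), (1003, "B", 65000)],
)

rows = conn.execute(
    "SELECT storeid, prodid, sales, "
    # Empty window: one total over all rows.
    "SUM(sales) OVER () AS grand_total, "
    # Partitioned window: one total per product across all stores.
    "SUM(sales) OVER (PARTITION BY prodid) AS product_total, "
    # Explicit cumulative frame: running total in prodid, storeid order.
    "SUM(sales) OVER (ORDER BY prodid, storeid "
    "  ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_total "
    "FROM salestbl ORDER BY prodid, storeid"
).fetchall()

for row in rows:
    print(row)
# (1001, 'A', 100000, 235000, 170000, 100000)
# (1002, 'A', 40000, 235000, 170000, 140000)
# (1003, 'A', 30000, 235000, 170000, 170000)
# (1003, 'B', 65000, 235000, 65000, 235000)
```

The partitioned form is the one that yields a separate sum for each product.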

2. Add the minimum and maximum sales to #1 and order by storeid.

Note that ordering only by the non-unique store ID leaves the order of the product IDs (and
the other columns) within each store unpredictable.

SELECT storeid, prodid, sales,
SUM(sales) OVER () AS SumSales,
MIN(sales) OVER () AS MinSales,
MAX(sales) OVER (ORDER BY storeid) AS MaxSales
FROM salestbl;

storeid prodid sales SumSales MinSales MaxSales
----------- ------ ----------- ----------- ----------- -----------
1001 D 35000.00 610000.00 20000.00 150000.00
1001 A 100000.00 610000.00 20000.00 150000.00
1001 F 150000.00 610000.00 20000.00 150000.00
1001 C 60000.00 610000.00 20000.00 150000.00
1002 A 40000.00 610000.00 20000.00 150000.00
1002 D 25000.00 610000.00 20000.00 150000.00
1002 C 35000.00 610000.00 20000.00 150000.00
1003 B 65000.00 610000.00 20000.00 150000.00
1003 D 50000.00 610000.00 20000.00 150000.00
1003 A 30000.00 610000.00 20000.00 150000.00
1003 C 20000.00 610000.00 20000.00 150000.00

3. Use a GROUP function to find employees having a salary greater than their
department average and display the average.

SELECT last_name, first_name, salary_amount,
AVG(salary_amount)
OVER (PARTITION BY department_number ORDER BY last_name, first_name)
AS DeptAvg
FROM employee
QUALIFY salary_amount > deptavg;

last_name first_name salary_amount DeptAvg
---------- ---------- ------------- ------------
Rogers Nora 56500.00 43316.67
Stein John 29450.00 29350.00
Brown Alan 43100.00 35545.83
Johnson Darlene 36300.00 35545.83
Rogers Frank 46000.00 35545.83
Trader James 37850.00 35545.83
Brown Allen 43700.00 38700.00
Villegas Arnando 49700.00 38700.00
Ratzlaff Larry 54000.00 50031.25
Runyon Irene 66000.00 50031.25
Wilson Edward 53625.00 50031.25

Module 19: Lab Solution

1. Write a window aggregate that provides a rank of salary amounts for all employees
using a cumulative window.

SELECT last_name, first_name, salary_amount,
COUNT(*) OVER
(ORDER BY salary_amount DESC ROWS UNBOUNDED PRECEDING)
FROM employee;

last_name first_name salary_amount Row_Number()
---------- ---------- ------------- ------------
Trainer I.B. 100000.00 1
Runyon Irene 66000.00 2
Rogers Nora 56500.00 3
Ratzlaff Larry 54000.00 4
Wilson Edward 53625.00 5
Daly James 52500.00 6
Villegas Arnando 49700.00 7
Rogers Frank 46000.00 8
Brown Allen 43700.00 9
Brown Alan 43100.00 10
Morrissey Jim 38750.00 11
Hopkins Paulene 37900.00 12
Trader James 37850.00 13
Johnson Darlene 36300.00 14
Short Michael 34700.00 15
Ryan Loretta 31200.00 16
Lombardo Domingus 31000.00 17
Stein John 29450.00 18
Kanieski Carol 29250.00 19
Rabbit Peter 26500.00 20
Hoover William 25525.00 21
Phillips Charles 24500.00 22
Charles John ? 23
Crane Robert ? 24
Kubic Ron ? 25
Machado Albert ? 26
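
A cumulative COUNT(*) over ROWS UNBOUNDED PRECEDING simply numbers the rows in sort order, so the same sequence can also be sketched with ROW_NUMBER. This assumes ROW_NUMBER is available on your release, and the alias RowNum is illustrative. Unlike RANK, both forms give tied salaries (including the null salaries) distinct numbers.

```sql
-- Sketch: row numbering equivalent to the cumulative count above.
SELECT last_name, first_name, salary_amount,
       ROW_NUMBER() OVER (ORDER BY salary_amount DESC) AS RowNum
FROM employee;
```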


2. Change #1 to perform the rank within department.

SELECT last_name, first_name, salary_amount, department_number,
       COUNT(*) OVER
       (PARTITION BY department_number
        ORDER BY salary_amount DESC
        ROWS UNBOUNDED PRECEDING)
FROM employee;

last_name first_name salary_amount dept# Row_Number()
---------- ---------- ------------- ------ ------------
Rogers Nora 56500.00 ? 1
Morrissey Jim 38750.00 ? 2
Short Michael 34700.00 ? 3
Stein John 29450.00 301 1
Kanieski Carol 29250.00 301 2
Kubic Ron ? 301 3
Rogers Frank 46000.00 401 1
Brown Alan 43100.00 401 2
Trader James 37850.00 401 3
Johnson Darlene 36300.00 401 4
Hoover William 25525.00 401 5
Phillips Charles 24500.00 401 6
Machado Albert ? 401 7
Daly James 52500.00 402 1
Crane Robert ? 402 2
Villegas Arnando 49700.00 403 1
Brown Allen 43700.00 403 2
Hopkins Paulene 37900.00 403 3
Ryan Loretta 31200.00 403 4
Lombardo Domingus 31000.00 403 5
Charles John ? 403 6
Runyon Irene 66000.00 501 1
Ratzlaff Larry 54000.00 501 2
Wilson Edward 53625.00 501 3
Rabbit Peter 26500.00 501 4
Trainer I.B. 100000.00 999 1


3. Write a moving window aggregate from the daily_sales table that compares each
sales amount to the average of the two preceding sales dates.

SELECT itemid, salesdate, sales,
       AVG(sales) OVER (ORDER BY salesdate
                        ROWS BETWEEN 2 PRECEDING AND 1 PRECEDING) AS avg_prev2,
       sales - avg_prev2 AS diff
FROM daily_sales;

Note that the sales dates are not continuous.

*** Query completed. 58 rows found.

itemid salesdate sales avg_prev2 diff
----------- ---------- ----------- ----------- ----------
10 1997-01-01 350.00 ? ?
10 1997-01-02 100.00 350.00 -250.00
10 1997-01-03 250.00 225.00 25.00
10 1997-01-05 350.00 175.00 175.00
10 1997-01-10 450.00 300.00 150.00
10 1997-01-21 250.00 400.00 -150.00
10 1997-01-25 300.00 350.00 -50.00
10 1997-01-31 100.00 275.00 -175.00
10 1997-02-01 550.00 200.00 350.00
10 1997-02-03 350.00 325.00 25.00
10 1997-02-06 150.00 450.00 -300.00
10 1997-02-17 250.00 250.00 .00
10 1997-02-20 500.00 200.00 300.00
10 1997-02-27 150.00 375.00 -225.00
10 1997-08-01 150.00 325.00 -175.00
10 1997-08-02 200.00 150.00 50.00
10 1997-08-03 250.00 175.00 75.00
10 1997-08-05 350.00 225.00 125.00
10 1997-08-10 550.00 300.00 250.00
10 1997-08-21 150.00 450.00 -300.00
10 1997-08-25 200.00 350.00 -150.00
10 1997-08-31 100.00 175.00 -75.00
10 1997-09-01 150.00 150.00 .00
10 1997-09-03 250.00 125.00 125.00
10 1997-09-06 350.00 200.00 150.00
10 1997-09-17 550.00 300.00 250.00
10 1997-09-20 450.00 450.00 .00
10 1997-09-27 350.00 500.00 -150.00
10 1998-01-01 150.00 400.00 -250.00
10 1998-01-02 200.00 250.00 -50.00
10 1998-01-03 250.00 175.00 75.00
10 1998-01-05 350.00 225.00 125.00
10 1998-01-10 550.00 300.00 250.00
10 1998-01-21 150.00 450.00 -300.00
10 1998-01-25 200.00 350.00 -150.00
10 1998-01-31 100.00 175.00 -75.00
10 1998-02-01 150.00 150.00 .00
10 1998-02-03 250.00 125.00 125.00
10 1998-02-06 350.00 200.00 150.00
10 1998-02-17 550.00 300.00 250.00
10 1998-02-20 450.00 450.00 .00
10 1998-02-27 350.00 500.00 -150.00
10 1998-08-01 150.00 400.00 -250.00
10 1998-08-02 200.00 250.00 -50.00
10 1998-08-03 250.00 175.00 75.00
10 1998-08-04 250.00 225.00 25.00
10 1998-08-05 350.00 250.00 100.00
10 1998-08-10 550.00 300.00 250.00
10 1998-08-21 150.00 450.00 -300.00
10 1998-08-25 200.00 350.00 -150.00
10 1998-08-31 100.00 175.00 -75.00
10 1998-09-01 150.00 150.00 .00
10 1998-09-03 250.00 125.00 125.00
10 1998-09-06 350.00 200.00 150.00
10 1998-09-09 450.00 300.00 150.00
10 1998-09-17 550.00 400.00 150.00
10 1998-09-20 450.00 500.00 -50.00
10 1998-09-27 350.00 500.00 -150.00
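
The solution refers to the alias avg_prev2 within the same SELECT list, which Teradata permits as an extension. A more portable form, sketched here as an assumption rather than as the course's intended answer, computes the moving average in a derived table and subtracts it in the outer query. The alias t and the final ORDER BY are illustrative additions.

```sql
-- Sketch: moving average of the two preceding rows, computed in a derived table.
SELECT itemid, salesdate, sales, avg_prev2,
       sales - avg_prev2 AS diff
FROM (SELECT itemid, salesdate, sales,
             AVG(sales) OVER (ORDER BY salesdate
                              ROWS BETWEEN 2 PRECEDING AND 1 PRECEDING) AS avg_prev2
      FROM daily_sales) AS t
ORDER BY salesdate;
```

The first row has no preceding rows, so its average and difference are null, as in the output above.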


Module 20: Lab Solution

1. Retrieve a ranking of salary amounts from the employee table. Next change the
rank to partition by department number.

SELECT employee_number, salary_amount,
       RANK() OVER (ORDER BY salary_amount DESC)
FROM employee;

employee_number salary_amount Rank(salary_amount)
--------------- ------------- -------------------
801 100000.00 1
1017 66000.00 2
1016 56500.00 3
1018 54000.00 4
1015 53625.00 5
1011 52500.00 6
1007 49700.00 7
1010 46000.00 8
1024 43700.00 9
1002 43100.00 10
1021 38750.00 11
1012 37900.00 12
1003 37850.00 13
1004 36300.00 14
1025 34700.00 15
1005 31200.00 16
1009 31000.00 17
1006 29450.00 18
1008 29250.00 19
1023 26500.00 20
1001 25525.00 21
1013 24500.00 22
1020 ? 23
1019 ? 23
1014 ? 23
1022 ? 23


SELECT employee_number, salary_amount, department_number,
       RANK() OVER (PARTITION BY department_number
                    ORDER BY salary_amount DESC) AS PartRnk
FROM employee;

employee_number salary_amount d# PartRnk
--------------- ------------- ------ -----------
1016 56500.00 ? 1
1021 38750.00 ? 2
1025 34700.00 ? 3
1006 29450.00 301 1
1008 29250.00 301 2
1019 ? 301 3
1010 46000.00 401 1
1002 43100.00 401 2
1003 37850.00 401 3
1004 36300.00 401 4
1001 25525.00 401 5
1013 24500.00 401 6
1022 ? 401 7
1011 52500.00 402 1
1014 ? 402 2
1007 49700.00 403 1
1024 43700.00 403 2
1012 37900.00 403 3
1005 31200.00 403 4
1009 31000.00 403 5
1020 ? 403 6
1017 66000.00 501 1
1018 54000.00 501 2
1015 53625.00 501 3
1023 26500.00 501 4
801 100000.00 999 1

2. Rank all sales from the “salestbl” showing only the bottom three sales amounts
along with their bottom ranking values.

SELECT sales, RANK() OVER (ORDER BY sales DESC)
FROM salestbl
QUALIFY RANK() OVER (ORDER BY sales ASC) < 4;

sales Rank(sales)
----------- -----------
20000.00 11
25000.00 10
30000.00 9


Module 21: Lab Solution

1. Display tertiles (values from 0 to 2) of employee salary amounts. Then return deciles
(0 to 9) followed by percentiles (0 to 99). Run each query separately and note how the
number of rows per quantile value thins out as the number of quantiles increases.

SELECT salary_amount, QUANTILE(3, salary_amount)
FROM EMPLOYEE;

salary_amount Quantile(3,salary_amount)
------------- -------------------------
? 0
? 0
? 0
? 0
24500.00 0
25525.00 0
26500.00 0
29250.00 0
29450.00 0
31000.00 1
31200.00 1
34700.00 1
36300.00 1
37850.00 1
37900.00 1
38750.00 1
43100.00 1
43700.00 1
46000.00 2
49700.00 2
52500.00 2
53625.00 2
54000.00 2
56500.00 2
66000.00 2
100000.00 2


SELECT salary_amount, QUANTILE(10, salary_amount)
FROM EMPLOYEE;

salary_amount Quantile(10,salary_amount)
------------- --------------------------
? 0
? 0
? 0
? 0
24500.00 1
25525.00 1
26500.00 2
29250.00 2
29450.00 3
31000.00 3
31200.00 3
34700.00 4
36300.00 4
37850.00 5
37900.00 5
38750.00 5
43100.00 6
43700.00 6
46000.00 6
49700.00 7
52500.00 7
53625.00 8
54000.00 8
56500.00 8
66000.00 9
100000.00 9


SELECT salary_amount, QUANTILE(100, salary_amount)
FROM EMPLOYEE;

salary_amount Quantile(100,salary_amount)
------------- ---------------------------
? 0
? 0
? 0
? 0
24500.00 15
25525.00 19
26500.00 23
29250.00 26
29450.00 30
31000.00 34
31200.00 38
34700.00 42
36300.00 46
37850.00 50
37900.00 53
38750.00 57
43100.00 61
43700.00 65
46000.00 69
49700.00 73
52500.00 76
53625.00 80
54000.00 84
56500.00 88
66000.00 92
100000.00 96


2. Repeat #1 by performing all in a single projection.

SELECT salary_amount,
QUANTILE(3, salary_amount) AS Tert,
QUANTILE(10, salary_amount) AS Decl,
QUANTILE(100, salary_amount) AS Pctl
FROM EMPLOYEE;

salary_amount Tert Decl Pctl
------------- ----------- ----------- -----------
? 0 0 0
? 0 0 0
? 0 0 0
? 0 0 0
24500.00 0 1 15
25525.00 0 1 19
26500.00 0 2 23
29250.00 0 2 26
29450.00 0 3 30
31000.00 1 3 34
31200.00 1 3 38
34700.00 1 4 42
36300.00 1 4 46
37850.00 1 5 50
37900.00 1 5 53
38750.00 1 5 57
43100.00 1 6 61
43700.00 1 6 65
46000.00 2 6 69
49700.00 2 7 73
52500.00 2 7 76
53625.00 2 8 80
54000.00 2 8 84
56500.00 2 8 88
66000.00 2 9 92
100000.00 2 9 96
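
QUANTILE is a Teradata-specific function. The values above are consistent with a rank-based expression: the ascending rank minus one, multiplied by the number of quantiles and divided by the row count, using integer arithmetic. For example, 24500.00 is the fifth of 26 rows, so its percentile is (5 - 1) * 100 / 26 = 15 after truncation. The query below is a sketch of that arithmetic, not the original solution, and it assumes that nulls sort first in ascending order, as they do in the output above.

```sql
-- Sketch: rank-based equivalents of the three QUANTILE calls.
-- Integer division truncates, which yields whole quantile numbers.
SELECT salary_amount,
       (RANK() OVER (ORDER BY salary_amount ASC) - 1) * 3
         / COUNT(*) OVER () AS Tert,
       (RANK() OVER (ORDER BY salary_amount ASC) - 1) * 10
         / COUNT(*) OVER () AS Decl,
       (RANK() OVER (ORDER BY salary_amount ASC) - 1) * 100
         / COUNT(*) OVER () AS Pctl
FROM employee
ORDER BY salary_amount;
```

The four null salaries tie at rank 1, which is why they all fall into quantile 0 in every column.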


Module 22: Lab Solution

1. Display manager numbers, department numbers, and salary sums for employees by
manager and by department – only – (do not project a grand total).

SELECT manager_employee_number (SMALLINT) AS Mgr#,
       department_number (SMALLINT) AS Dpt#,
       SUM(salary_amount)
FROM employee
GROUP BY GROUPING SETS (1, 2)
ORDER BY 1, 2;

Recall that there is a null department!

Mgr# Dpt# Sum(salary_amount)
------ ------ ------------------
? ? 129950.00
? 301 58700.00
? 401 213275.00
? 402 52500.00
? 403 193500.00
? 501 200125.00
? 999 100000.00
801 ? 378750.00
1003 ? 175425.00
1005 ? 162300.00
1011 ? ?
1017 ? 134125.00
1019 ? 58700.00
1025 ? 38750.00


2. Add to #1 the grand totals plus salary sums by department and manager.

SELECT CASE GROUPING(manager_employee_number)
         WHEN 1 THEN 'Tot of Mgrs'
         ELSE COALESCE(manager_employee_number,'Null Mgr')
       END AS Mgr#,
       CASE GROUPING(department_number)
         WHEN 1 THEN 'Tot of dpts'
         ELSE COALESCE(department_number,'Null Dept')
       END AS Dpt#,
       SUM(salary_amount)
FROM employee
GROUP BY GROUPING SETS
   ((manager_employee_number,department_number), manager_employee_number,
    department_number, ())
ORDER BY 1, 2;

Mgr# Dpt# Sum(salary_amount)
----------- ----------- ------------------
801 301 ?
801 401 37850.00
801 402 52500.00
801 403 31200.00
801 501 66000.00
801 999 100000.00
801 Null Dept 91200.00
801 Tot of dpts 378750.00
1003 401 175425.00
1003 Tot of dpts 175425.00
1005 403 162300.00
1005 Tot of dpts 162300.00
1011 402 ?
1011 Tot of dpts ?
1017 501 134125.00
1017 Tot of dpts 134125.00
1019 301 58700.00
1019 Tot of dpts 58700.00
1025 Null Dept 38750.00
1025 Tot of dpts 38750.00
Tot of Mgrs 301 58700.00
Tot of Mgrs 401 213275.00
Tot of Mgrs 402 52500.00
Tot of Mgrs 403 193500.00
Tot of Mgrs 501 200125.00
Tot of Mgrs 999 100000.00
Tot of Mgrs Null Dept 129950.00
Tot of Mgrs Tot of dpts 948050.00
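
The four grouping sets listed in the solution (both columns, each column alone, and the empty set) are exactly the combinations that CUBE generates for two columns. As a sketch, the same grouping could therefore be requested as shown below; the SELECT list is reduced to the bare columns for brevity and is not the original solution.

```sql
-- Sketch: CUBE over two columns produces the same four grouping sets.
SELECT manager_employee_number, department_number, SUM(salary_amount)
FROM employee
GROUP BY CUBE (manager_employee_number, department_number)
ORDER BY 1, 2;
```

In this bare form a null produced by a subtotal row cannot be told apart from a genuinely null department, which is why the solution wraps each column in a CASE on GROUPING().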


3. Add to #2 the employee counts at all levels of that display.

SELECT CASE GROUPING(manager_employee_number)
         WHEN 1 THEN 'Tot of Mgrs'
         ELSE COALESCE(manager_employee_number,'Null Mgr')
       END AS Mgr#,
       CASE GROUPING(department_number)
         WHEN 1 THEN 'Tot of dpts'
         ELSE COALESCE(department_number,'Null Dept')
       END AS Dpt#,
       SUM(salary_amount),
       COUNT(*)
FROM employee
GROUP BY GROUPING SETS
   ((manager_employee_number,department_number), manager_employee_number,
    department_number, ())
ORDER BY 1, 2;

Mgr# Dpt# Sum(salary_amount) Count(*)
----------- ----------- ------------------ -----------
801 301 ? 1
801 401 37850.00 1
801 402 52500.00 1
801 403 31200.00 1
801 501 66000.00 1
801 999 100000.00 1
801 Null Dept 91200.00 2
801 Tot of dpts 378750.00 8
1003 401 175425.00 6
1003 Tot of dpts 175425.00 6
1005 403 162300.00 5
1005 Tot of dpts 162300.00 5
1011 402 ? 1
1011 Tot of dpts ? 1
1017 501 134125.00 3
1017 Tot of dpts 134125.00 3
1019 301 58700.00 2
1019 Tot of dpts 58700.00 2
1025 Null Dept 38750.00 1
1025 Tot of dpts 38750.00 1
Tot of Mgrs 301 58700.00 3
Tot of Mgrs 401 213275.00 7
Tot of Mgrs 402 52500.00 2
Tot of Mgrs 403 193500.00 6
Tot of Mgrs 501 200125.00 4
Tot of Mgrs 999 100000.00 1
Tot of Mgrs Null Dept 129950.00 3
Tot of Mgrs Tot of dpts 948050.00 26

Module 23: Lab Solution
1. Create a view called “SumView” that performs a sum, average, max, and min on
salary amounts for each department number from employee. Use the view to join
to employee to find those employees whose salaries are greater than their
department average.

Note the importance of database qualifications!

REPLACE VIEW username.SumView
AS
SELECT department_number AS Dept#,
       SUM(salary_amount) AS sumsal,
       AVG(salary_amount) AS avgsal,
       MAX(salary_amount) AS maxsal,
       MIN(salary_amount) AS minsal
FROM employee_sales.employee
GROUP BY 1;

SELECT e.employee_number, v.dept#, e.salary_amount, v.avgsal
FROM username.sumview v, employee_sales.employee e
WHERE e.department_number = v.dept#
AND e.salary_amount > v.avgsal;

employee_number Dept# salary_amount avgsal
--------------- ----------- ------------- ------------
1017 501 66000.00 50031.25
1010 401 46000.00 35545.83
1002 401 43100.00 35545.83
1006 301 29450.00 29350.00
1024 403 43700.00 38700.00
1004 401 36300.00 35545.83
1018 501 54000.00 50031.25
1007 403 49700.00 38700.00
1003 401 37850.00 35545.83
1015 501 53625.00 50031.25


2. Add a HAVING clause to SumView to include only departments having averages
less than $30,000.00 and repeat the query from exercise #1.

REPLACE VIEW username.SumView
AS
SELECT department_number AS Dept#,
       SUM(salary_amount) AS sumsal,
       AVG(salary_amount) AS avgsal,
       MAX(salary_amount) AS maxsal,
       MIN(salary_amount) AS minsal
FROM employee_sales.employee
GROUP BY 1
HAVING avgsal < 30000;

SELECT e.employee_number, v.dept#, e.salary_amount, v.avgsal
FROM username.sumview v, employee_sales.employee e
WHERE e.department_number = v.dept#
AND e.salary_amount > v.avgsal;

employee_number Dept# salary_amount avgsal
--------------- ----------- ------------- ------------
1006 301 29450.00 29350.00


Module 24: Lab Solution

1. Create a volatile table based on the definition of the Department table and then
populate it with data from the department table. Use the “preserve” option. Select
all rows from the table you created.

Note that you will have to change the table name!

Also note that a default database will be ignored if one has been defined.

CREATE VOLATILE TABLE department_new
(department_number SMALLINT,
 department_name CHAR(30),
 budget_amount DECIMAL(10,2),
 manager_employee_number INTEGER)
UNIQUE PRIMARY INDEX (department_number)
ON COMMIT PRESERVE ROWS;

INSERT INTO username.department_new SELECT * FROM employee_sales.department;

The volatile table should be populated and the SELECT should return rows.


2. Create another volatile table that averages the salary amounts for each
department. Now issue a HELP command to verify the existence of the two volatile
tables that you created. Do SHOW TABLE on one of them and then drop the first
one and repeat the earlier HELP command.

CREATE VOLATILE TABLE department_avgs
(department_number SMALLINT,
 Avgsals DECIMAL(10,2))
UNIQUE PRIMARY INDEX (department_number)
ON COMMIT PRESERVE ROWS;
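
The CREATE statement above only defines the table. A sketch of how it might then be loaded with the department averages follows; it assumes the employee table is in the employee_sales database, as in the Module 23 solution, and it is an illustrative addition rather than part of the original answer.

```sql
-- Sketch: populate the volatile table with one average per department.
INSERT INTO department_avgs
SELECT department_number, AVG(salary_amount)
FROM employee_sales.employee
GROUP BY department_number;

-- Confirm the rows were preserved after the insert.
SELECT * FROM department_avgs ORDER BY department_number;
```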

HELP VOLATILE TABLE;

Table Name Table Id
------------------------------ ------------
department_new 1CC057B40000
department_avgs 1CC058B40000

SHOW TABLE department_avgs;

CREATE SET VOLATILE TABLE DLM.department_avgs ,NO FALLBACK ,
     CHECKSUM = DEFAULT,
     DEFAULT MERGEBLOCKRATIO,
     LOG
     (
      department_number SMALLINT,
      Avgsals DECIMAL(10,2))
UNIQUE PRIMARY INDEX (department_number)
ON COMMIT PRESERVE ROWS;

DROP TABLE department_new;

HELP VOLATILE TABLE;

Table Name Table Id
------------------------------ ------------
department_avgs 1CC058B40000
