
Tera-Tom on Teradata

Basics for V2R5


Understanding is the key!

First Edition

Published by
Coffing Publishing

First Edition June, 2004


Web Page: www.Tera-Tom.com and www.CoffingDW.com
E-Mail address:
Tom.Coffing@CoffingDW.Com
Written by W. Coffing
Teradata, NCR, BYNET, V2R5 are registered trademarks of NCR Corporation,
Dayton, Ohio, U.S.A., IBM and DB2 are registered trademarks of IBM Corporation,
ANSI is a registered trademark of the American National Standards Institute. In
addition to these product names, all brands and product names in this document are
registered names or trademarks of their respective holders.
Coffing Data Warehousing shall have neither liability nor responsibility to any person or
entity with respect to any loss or damages arising from the information contained in this
book or from the use of programs or program segments that are included. The manual is
not a publication of NCR Corporation, nor was it produced in conjunction with NCR
Corporation.
Copyright July 2004 by Coffing Publishing
All rights reserved. No part of this book shall be reproduced, stored in a retrieval system,
or transmitted by any means, electronic, mechanical, photocopying, recording, or
otherwise, without written permission from the publisher. No patent liability is assumed
with respect to the use of information contained herein. Although every precaution has
been taken in the preparation of this book, the publisher and author assume no
responsibility for errors or omissions, nor is any liability assumed for damages
resulting from the use of information contained herein. For information, address:
Coffing Publishing
7810 Kiester Rd.
Middletown, OH 45042
International Standard Book Number: ISBN 0-9704980-8-X

Printed in the United States of America


All terms mentioned in this book that are known to be trademarks or service marks have
been stated. Coffing Publishing cannot attest to the accuracy of this information. Use of a
term in this book should not be regarded as affecting the validity of any trademark or
service mark.

About Coffing Data Warehousing's CEO, Tom Coffing


Tom is President, CEO, and Founder of Coffing Data Warehousing. He is an
internationally known consultant, facilitator, speaker, trainer, and executive coach with
an extensive background in data warehousing. Tom has helped implement data
warehousing in over 40 major data warehouse accounts, spoken in over 20 countries, and
has provided consulting and Teradata training to over 8,000 individuals involved in data
warehousing globally.
Tom has co-authored over 20 books on Teradata and Data Warehousing. To name a few:

Secrets of the Best Data Warehouses in the World


Teradata SQL - Unleash the Power
Tera-Tom on Teradata Basics
Tera-Tom on Teradata E-business
Teradata SQL Quick Reference Guide - Simplicity by Design
Teradata Database Design - Giving Detailed Data Flight
Teradata Users Guide - The Ultimate Companion
Teradata Utilities - Breaking the Barriers

Mr. Coffing has also published over 20 data warehousing articles and has been a
contributing columnist to DM Review on the subject of data warehousing. He wrote a
monthly column for DM Review entitled, "Teradata Territory". He is a nationally known
speaker and gives frequent seminars on Data Warehousing. He is also known as "The
Speech Doctor" because of his presentation skills and sales seminars.
Tom Coffing has taken his expert speaking and data warehouse knowledge and
revolutionized the way technical training and consultant services are delivered. He
founded CoffingDW with the same philosophy more than a decade ago. Centered around
10 Teradata Certified Masters, this dynamic and growing company teaches every Teradata
class, provides world-class Teradata consultants, offers a suite of software products to
enhance Teradata data warehouses, and has eight books published on Teradata.
Tom has a bachelor's degree in Speech Communications and over 25 years of business
and technical computer experience. Tom is considered by many to be the best technical
and business speaker in the United States. He has trained and consulted at so many
Teradata sites that students affectionately call him Tera-Tom.
Teradata Certified Master
- Teradata Certified Professional
- Teradata Certified Administrator
- Teradata Certified Developer
- Teradata Certified Designer
- Teradata Certified SQL Specialist
- Teradata Certified Implementation Specialist

Table of Contents
Chapter 1 The Rules of Data Warehousing ................................................................... 1
Teradata Facts ..................................................................................................................... 2
Teradata: Brilliant by Design.............................................................................................. 3
The Teradata Parallel Architecture ..................................................................................... 4
A Logical View of the Teradata Architecture..................................................................... 6
The Parsing Engine (PE)..................................................................................................... 7
The Access Module Processors (AMPs)............................................................................. 8
The BYNET ........................................................................................................................ 9
A Visual for Data Layout.................................................................................................. 10
Teradata is a shared nothing Architecture ........................................................................ 11
Teradata has Linear Scalability......................................................................................... 12
How Teradata handles data access.................................................................................... 13
Teradata Cabinets, Nodes, VPROCs, and Disks............................................................... 14
LAN Connection for Network Attached Clients .............................................................. 15
Mainframe Connection to Teradata .................................................................................. 16
Chapter 2 Data Distribution Explained........................................................................ 17
Rows and Columns ........................................................................................................... 18
The Primary Index ............................................................................................................ 19
The Two Types of Primary Indexes.................................................................................. 20
Unique Primary Index (UPI)............................................................................................. 21
Non-Unique Primary Index............................................................................................... 22
How Teradata Turns the Primary Index Value into the Row Hash .................................. 23
The Row Hash determines the Row's Destination............................................................. 24
The Row is Delivered to the Proper AMP ........................................................................ 25
The AMP will add a Uniqueness Value............................................................................ 26
An Example of an UPI Table............................................................................................ 27
An Example of an NUPI Table......................................................................................... 28
How Teradata Retrieves Rows with the Primary Index.................................................... 29
Row Distribution............................................................................................................... 30
A Visual for Data Layout.................................................................................................. 31
Teradata accesses data in three ways ................................................................................ 32
Data Layout Summary ...................................................................................................... 33
Chapter 3 Teradata Space ............................................................................................ 35
How Permanent Space is calculated ................................................................................. 35
How Permanent Space is Given........................................................................................ 36
The Teradata Hierarchy .................................................................................................... 37
How Spool Space is calculated ......................................................................................... 38
A Spool Space Example.................................................................................................... 39
PERM, SPOOL and TEMP Space .................................................................................... 40
Spool Space controls system time..................................................................................... 41
A quiz on Perm and Spool Space...................................................................................... 42
Another quiz on Perm and Spool Space ........................................................................... 45


Chapter 4 V2R5 Partition Primary Indexes ................................................................. 47


V2R4 Example.................................................................................................................. 48
V2R5 Partitioning ............................................................................................................. 49
Partitioning doesn't have to be part of the Primary Index ................................................ 50
Partition Elimination can avoid Full Table Scans............................................................. 51
The Bad NEWS about Partitioning on a column that is not part of the Primary Index.... 52
Two ways to handle Partitioning on a column that is not part of the Primary Index ....... 53
Partitioning with CASE_N ............................................................................................... 54
Partitioning with RANGE_N............................................................................................ 55
NO CASE, NO RANGE, or UNKNOWN........................................................................ 56
Chapter 5 Data Protection............................................................................................ 57
Transaction Concept & Transient Journal ........................................................................ 58
How the Transient Journal Works .................................................................................... 59
FALLBACK Protection .................................................................................................... 60
How Fallback Works ........................................................................................................ 61
Fallback Clusters............................................................................................................... 62
Down AMP Recovery Journal (DARJ) ............................................................................ 63
Redundant Array of Independent Disks (RAID) .............................................................. 64
Cliques .............................................................................................................................. 65
Cliques - A two node example ......................................................................................... 66
Cliques - A four node example ........................................................................................ 67
Permanent Journal............................................................................................................. 68
Table create with Fallback and Permanent Journaling ..................................................... 69
Locks................................................................................................................................. 70
Teradata has 4 locks for 3 levels of Locking .................................................................... 71
Locks and their compatibility ........................................................................................... 72
Chapter 6 Loading the Data ......................................................................................... 73
FastLoad............................................................................................................................ 75
FastLoad Picture ............................................................................................................... 76
Multiload........................................................................................................................... 77
Multiload Picture .............................................................................................................. 78
TPump............................................................................................................................... 79
TPump Picture .................................................................................................................. 80
FastExport ......................................................................................................................... 81
FastExport Picture............................................................................................................. 82
Chapter 7 Secondary Indexes....................................................................................... 83
Unique Secondary Index (USI)......................................................................................... 85
USI Subtable Example...................................................................................................... 86
How Teradata retrieves an USI query............................................................................... 87

NUSI Subtable Example ................................................................................................... 88
How Teradata retrieves a NUSI query.............................................................................. 89
Value Ordered NUSI......................................................................................................... 90
How Teradata retrieves a Value Ordered NUSI query ..................................................... 91
Secondary Index Summary ............................................................................................... 92
Chart for Primary and Secondary Access ......................................................................... 93
Chapter 8 The Active Data Warehouse ....................................................................... 95
OLTP Environments ......................................................................................................... 96
The DSS environment....................................................................................................... 97
Mixing OLTP and DSS environments.............................................................................. 98
Detail Data ........................................................................................................................ 99
Easy System Administration........................................................................................... 100
Data Marts....................................................................................................................... 101
Teradata Tools - SQL Assistant...................................................................................... 102
TDQM............................................................................................................................. 103
Index Wizard................................................................................................................... 104
Archive Recovery ........................................................................................................... 105
Teradata Analyst Suite.................................................................................................... 106



Chapter 1 The Rules of Data Warehousing

Let me once again explain the rules.


Teradata rules!
Tera-Tom Coffing
The Teradata RDBMS was designed to eliminate the technical pitfalls of data
warehousing and it is parallel processing that allows Teradata to rule this industry. The
problem with Data Warehousing is that it is so big and so complicated that there literally
are no rules. Anything goes! Data Warehousing is not for the weak or faint of heart
because the terrain can be difficult and that is why 75% of all data warehouses fail.
Teradata data warehouses provide the users with the ability to build a data warehouse for
the business without having to compromise because their database is unable to meet the
challenges and requirements of constant change. That is why 90% of all Teradata data
warehouses succeed.
Teradata allows businesses to quickly respond to changing conditions. Relational
databases are more flexible than other database types and flexibility is Teradata's middle
name. Here is how Teradata Rules:

8 of the Top 13 Global Airlines use Teradata

10 of the Top 13 Global Communications Companies use Teradata

9 of the Top 16 Global Retailers use Teradata

8 of the Top 20 Global Banks use Teradata

40% of Fortune's "US Most Admired" companies use Teradata

Teradata customers account for more than 70% of the revenue generated by the
top 20 global telecommunication companies

Teradata customers account for more than 55% of the revenue generated by the
top 30 Global retailers

Teradata customers account for more than 55% of the revenue generated by the
top 20 global airlines

More than 25% of the top 15 global insurance carriers use Teradata

Copyright Open Systems Services 2004

Page 1


Teradata Facts

In the Sea of Information your Data


Warehouse Users can be powered by the
Winds of Chance or the Steam of
Understanding!
Tom Coffing
Teradata allows for maximum flexibility in selecting and using data and it therefore can
be designed to represent a business and its practices.
A data warehouse is one of the most exciting technologies of today. It can contain
Terabytes of detail data; have thousands of users, with each user simultaneously asking a
different question on any data at any time. Gathering the information is somewhat easy,
but querying the data warehouse is an art. Before you are ready to query you must first
understand the basics. This book will make it happen.
A data warehouse environment should be built with Christopher Columbus in mind.
When he set sail from Spain he did not know where he was going. When he got there he
didn't know where he was. And when he returned he didn't know where he'd been. A
world-class data warehouse must anticipate that users will ask different questions each
and every day. A good understanding will allow users to set sail today and navigate a
different route tomorrow.
Most database vendors designed their databases around Online Transaction Processing
(OLTP) where they already knew the questions. Teradata is designed for Decision
Support where different questions arise every day. Teradata is always ready to
perform even as your environment and users change and grow.


Teradata: Brilliant by Design

The man who has no imagination has no


wings.
Muhammad Ali
The Teradata database was originally designed in 1976, and it has been floating like a
butterfly and stinging like a bee ever since. It is the Muhammad Ali of data warehousing
because parallel processing is pretty and definitely The Greatest invention developed in
our computer industry. Nearly 25 years later, Teradata is still considered ahead of its
time. While most databases have problems getting their data warehouses off the ground,
Teradata provides wings to give detail data flight. Because Teradata handles the
technical issues, users can reach as far as their imagination will take them and it is the
queries that have a tendency to fly. Teradata was founded on mathematical set theory.
Teradata is easy to understand and allows customers to model the business.
In 1976, IBM mainframes dominated the computer business. However, the original
founders of Teradata noticed that it took about 4 years for IBM to produce a new
mainframe. At the same time, they also noticed a little company called Intel. Intel
created a new processing chip every 2 years. With mainframes moving forward every
4 years as compared to Intel's ability to produce a new microprocessor every 2 years,
Teradata envisioned a breakthrough design that would shake the pillars of the industry.
This vision was to network several microprocessor chips together enabling them to be
able to work in parallel. This vision provided another benefit, which was that the cost of
networking microprocessor chips would be hundreds of times cheaper than a mainframe.
IBM laughed out loud! They said, "Let's get this straight ... you are going to network a
bunch of PC chips together and overpower our mainframes? That's like plowing a field
with 1,000 chickens!" In fact, IBM salespeople are still trying to dismiss Teradata as
just a bunch of PCs in a cabinet.
Even with this being stated, Teradata still believed they could produce a product that
could handle large amounts of data and achieve the impossible: replace mainframe
technology. The founders of Teradata believed in the Napoleon Bonaparte philosophy
that stated, "The word impossible is not in my dictionary." The Teradata founders set
two primary goals when they designed Teradata, which were:
Perform parallel processing
Accommodate Terabytes of data
In 1984, the DBC/1012 was introduced. Since then, Teradata has been the dominant
force in data warehousing.


The Teradata Parallel Architecture

Fall seven times, stand up eight.


--Japanese Proverb
Teradata never falls, but it can stand up to incredible amounts of work because of parallel
processing. Most databases crumble under the extreme pressures of data warehousing.
Who could blame them with thousands of users, each asking a different question on
Terabytes of data? Most databases were born for OLTP processing, while Teradata was
born to be parallel. While most databases fall and don't get up, Teradata remains
outstanding and ready for more. Teradata has been parallel processing from the
beginning, which incredibly dates back to 1979, and is still the only database that loads
data in parallel, backs up data in parallel and processes data in parallel. The idea of
parallel processing gives Teradata the ability to have unlimited users, unlimited power,
and unlimited scalability. So, what is parallel processing? Here is a great analogy.
It was 12 a.m. on a Saturday night and two friends were out on the town. One of the
friends looked at his watch and said, "I have to get going." The other friend responded,
"What's the hurry?" His friend went on to tell him that he had to leave to do his laundry
at the Laundromat. The other friend could not believe his ears. He responded, "What!
You're leaving to do your laundry on a Saturday night? Why don't you do it
tomorrow?" His buddy went on to explain that there were only 10 washing machines at
the laundry. "If I wait until tomorrow, it will be crowded and I will be lucky to get one
washing machine. I have 10 loads of laundry, so I will be there all day. If I go now,
nobody will be there, and I can do all 10 loads at the same time. I'll be done in less than
an hour and a half."
This story describes what we call Parallel Processing. Teradata was born to be
parallel, and instead of allowing just 10 loads of wash to be done simultaneously,
Teradata allows for hundreds, even thousands, of loads to be done simultaneously.
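The arithmetic behind the laundry story can be sketched in a few lines of Python. This is a hypothetical model of the analogy only, not Teradata code:

```python
import math

def wash_time(loads, machines, minutes_per_load=90):
    """Total minutes to finish `loads` loads with `machines` machines running in parallel."""
    rounds = math.ceil(loads / machines)  # back-to-back rounds of washing needed
    return rounds * minutes_per_load

# One machine: the 10 loads run one after another.
print(wash_time(10, 1))   # 900 minutes -- all day at the Laundromat
# Ten machines in parallel: all 10 loads run at once.
print(wash_time(10, 10))  # 90 minutes -- done in an hour and a half
```

The same reasoning carries over to AMPs: the fixed pool of work finishes faster when more workers run side by side.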


Tera-Tom Parallel Processing Laundromat
"Only one customer allowed at a time"

"After enlightenment, the laundry."
- Zen Proverb

"After parallel processing the laundry, enlightenment!"
- Teradata Zen Proverb
Teradata was born to be parallel. The optimizer is Parallel Aware, so there is always
unconditional parallelism, and Teradata automatically distributes the data so each
table is automatically processed in parallel.


A Logical View of the Teradata Architecture

Kites rise highest against the wind, not


with it.
Sir Winston Churchill
Many of the largest data warehouses in the world are on Teradata. Teradata provides
customers a centrally located architecture. This provides a single version of the truth
and it minimizes synchronization. Having Teradata on your side is a sure win-ston. If
Churchill had been a data warehouse expert, he would agree that most data warehouses
eventually receive the blitz and stop working while Teradata has the strength from
parallel processing to never give up.
Many data warehouse environments have an architecture that is not designed for decision
support, yet companies often wonder why their data warehouse failed. The winds of
business change can be difficult and starting with the right database is the biggest key to
rising higher.

[Diagram: Parsing Engines (PEs) connect across the BYNET network to eight AMPs, and each AMP is attached to its own disk.]

The user submits SQL to the Parsing Engine (PE). The PE checks the syntax and then
the security and comes up with a plan for the AMPs. The PE communicates with the
AMPs across the BYNET. The AMPs act on the data rows as needed and required.


The Parsing Engine (PE)

The greatest weakness of most humans is


their hesitancy to tell others how much they
love them while they're alive.
O.A. Battista
If you haven't told someone lately how much you love them, you need to find a way.
Leadership through love is your greatest gift. Teradata has someone who greets you with
love at every logon. That person is the Parsing Engine (PE), which is often referred to
as the optimizer. When you logon to Teradata, the Parsing Engine is waiting with tears in
its eyes and love in its heart, ready to make sure your session is taken care of completely.
The Parsing Engine does three things every time you run an SQL statement.
Checks the syntax of your SQL
Checks the security to make sure you have access to the table
Builds a plan for the AMPs to follow
The PE creates a PLAN that tells the AMPs exactly what to do in order to get the data.
The PE knows how many AMPs are in the system, how many rows are in the table, and
the best way to get to the data. The Parsing Engine is the best optimizer in the data
warehouse world because it has been continually improved for over 25 years at the top
data warehouse sites in the world.
The Parsing Engine verifies SQL requests for proper syntax, checks security, maintains
up to 120 individual user sessions, and breaks down the SQL requests into steps.
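As a rough sketch, the PE's three steps can be modeled in Python. This is purely illustrative; the function `parsing_engine` and the `table_access` dictionary are invented names, not real Teradata internals:

```python
# Illustrative model of the three things the PE does for every SQL request:
# check syntax, check security, and build a plan for the AMPs.
def parsing_engine(sql, user, table_access):
    words = sql.replace(";", "").split()
    upper = [w.upper() for w in words]
    # 1. Check the syntax of the SQL (here, just a toy check).
    if not words or upper[0] != "SELECT" or "FROM" not in upper:
        raise SyntaxError("invalid SQL request")
    # 2. Check security: may this user read the table?
    table = words[upper.index("FROM") + 1]
    if table not in table_access.get(user, set()):
        raise PermissionError(f"{user} has no access to {table}")
    # 3. Build a PLAN -- a list of steps for the AMPs to follow.
    return [f"retrieve all rows of {table} on every AMP",
            "pass the results back to the PE over the BYNET"]

plan = parsing_engine("SELECT * FROM Order_Table;", "tera_tom",
                      {"tera_tom": {"Order_Table"}})
```

A request that fails either check never reaches the AMPs; only a request that passes both produces a plan.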

[Cartoon: at logon, the PE greets the user: "Welcome! I will be taking care of you this entire session. My wish is your Commands!"]


The Access Module Processors (AMPs)

A true friend is one who walks in when the


rest of the world walks out.
Anonymous
The AMPs are truly man's best friend because they will work like a dog to read and write
the data. (Their bark is worse than their byte.) An AMP never walks out on a friend.
The AMPs are the worker bees of the Teradata system because the AMPs read and write
the data to their assigned disks. The Parsing Engine is the boss and the AMPs are the
workers. The AMPs merely follow the PE's plan and read or write the data.
The AMPs are always connected to a single virtual disk or Vdisk. The philosophy of
parallel processing revolves around the AMPs. Teradata takes each table and spreads the
rows evenly among all the AMPs. When data is requested from a particular table, each
AMP retrieves the rows of the table that it holds on its disk. If the data is spread
evenly, then each AMP should retrieve its rows simultaneously with the other AMPs.
That is what we mean when we say Teradata was born to be parallel.
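This even spreading can be sketched with a toy hash model in Python. The sketch is hypothetical: `crc32` here is only a stand-in, since Teradata's actual row-hash algorithm is not shown in this book:

```python
from collections import defaultdict
from zlib import crc32

def distribute(rows, n_amps):
    """Assign each (primary_index, row) pair to an AMP by hashing the index value."""
    amps = defaultdict(list)
    for pk, row in rows:
        amp_no = crc32(str(pk).encode()) % n_amps  # stand-in for Teradata's row hash
        amps[amp_no].append(row)
    return amps

rows = [(i, f"row-{i}") for i in range(1000)]
amps = distribute(rows, 4)
# A good hash spreads the 1,000 rows roughly evenly over the 4 AMPs,
# so each AMP can scan its own share at the same time as the others.
```

Because the hash of a primary index value is deterministic, the same value always lands on the same AMP, which is also how a row is found again later.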
The AMPs will also perform output conversion while the PE performs input
conversion. The AMPs do the physical work associated with retrieving an answer set.
The PE is the boss and the AMPs are the workers. Could you have a Teradata system
without AMPs? No, who would retrieve the data? Could you have a Teradata
system without PEs? Of course not; could you get along without your boss?!
USER SQL:

SELECT *
FROM Order_Table
ORDER BY Order_No;

PE's PLAN:

(1) Retrieve all Orders from the Order_Table.
(2) Sort them by Order_No in ascending order.
(3) Pass the results back to me over the BYNET.

[Diagram: the PE passes this plan over the BYNET to four AMPs, each holding its portion of the Order_Table and Order_Item_Table.]


The BYNET

Not all who wander are lost.


J. R. R. Tolkien
The BYNET is the communication network between AMPs and PEs. Data and
communication never wanders and is never lost. How well does the BYNET know
communication? It is the lord of the things! How often does the PE pass the plan to the
AMPs over the BYNET? Every time it makes it a hobbit!
The PE passes the PLAN to the AMPs over the BYNET. The AMPs then retrieve the
data from their disks and pass it to the PE over the BYNET.
The BYNET provides the communications between AMPs and PEs so no matter how
large the data warehouse physically gets, the BYNET makes each AMP and PE think that
they are right next to one another. The BYNET gets its name from the Banyan tree.
The Banyan tree has the ability to continually plant new roots to grow forever.
Likewise, the BYNET scales as the Teradata system grows in size. The BYNET is
scalable.
There are always two BYNETs for redundancy and extra bandwidth. AMPs and PEs can
use both BYNETs to send and retrieve data simultaneously. What a network!

The PE checks the user's SQL syntax;
The PE checks the user's security rights;
The PE comes up with a plan for the AMPs to follow;
The PE passes the plan along to the AMPs over the BYNET;
The AMPs follow the plan and retrieve the data requested;
The AMPs pass the data to the PE over the BYNET; and
The PE then passes the final data to the user.

[Diagram: three PEs and eight AMPs, all connected to both BYNET 0 and BYNET 1.]



A Visual for Data Layout

I saw the angel in the marble and carved


until I set him free.
--Michelangelo
Teradata saw the users in the warehouse and parallel processed until it set them free.
Free to ask any question at any time on any data. The Sistine Chapel wasn't painted in a
day and a true data warehouse takes time to carve. Sculpt your warehouse with love and
caring and you will build something that will allow your company to have limits that go
well beyond the ceiling. Below is a logical view of data on AMPs. Each AMP holds a
portion of every table. Each AMP keeps its tables in separate drawers.

[Diagram: AMPs 1 through 4 each hold a portion of the Employee, Order, Customer, and Student tables.]

Each AMP holds a portion of every table.
Each AMP keeps its tables in separate drawers.

Teradata's Parallel Architecture and a very mature optimizer (PE) make it
completely unique.


Teradata is a Shared Nothing Architecture

To have everything is to possess nothing.
--Buddha
Each AMP has its own processor, memory, and disk. Each AMP shares nothing with
the other AMPs. Each is connected over a network called the BYNET. This
architecture allows unlimited scalability and is called a shared nothing architecture.

To Parallel Process everything is to Share nothing.
--Tera-Tom Coffing

(Diagram: four AMPs, each with its own memory and disk; each disk holds that AMP's portion of the Customer_Table, Order_Table, Employee_Table, and Dept_Table.)


Teradata has Linear Scalability

The most important thing a father can do for his children is to love their mother.
--Anonymous
The most important thing a father can do for his children is to love their mother. As the
family grows so should the love. The most important thing a database can do for its data
warehouse children is grow. When a data warehouse stops growing it has reached its
potential. Teradata has the ability to start small and grow forever without losing any
power. This is called Linear Scalability.
A data warehouse should start small and focused with the end goal of evolving into an
Enterprise Data Warehouse. When your data is centralized in one area, you can take full
advantage of completely understanding and analyzing all your data. That is why it is
important to purchase a database that is ready for growth.
Anytime you want to double the speed, simply double the number of PE and AMP
processors (VPROCs). This is known as Linear Scalability. This ability to scale
permits unlimited growth potential and improved response times.
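The scaling arithmetic can be sketched in a few lines. This is a toy illustration, not a benchmark: the rows-per-second rate is invented, and it assumes perfectly even distribution across AMPs working in parallel:

```python
# Toy illustration of linear scalability: with rows spread evenly and
# AMPs working in parallel, scan time is driven by the work each AMP
# does, so doubling the AMPs halves the elapsed time.

def scan_time(total_rows, amps, rows_per_second=1000):
    rows_per_amp = total_rows / amps       # even distribution assumed
    return rows_per_amp / rows_per_second  # AMPs scan in parallel

before = scan_time(1_000_000, 4)   # 4 AMPs
after = scan_time(1_000_000, 8)    # double the AMPs
print(before, after)               # 250.0 125.0 -> twice as fast
```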

Double your AMPs and double your speed!



Teradata's Linear Scalability is excellent protection for Application Development.
Because the data warehouse environment can change so rapidly, it is imperative that
Teradata has the power and capability to scale up for increased workloads without
decreased throughput.


How Teradata handles data access

The PE handles session control functions

The AMPs retrieve and perform database functions on their requested rows

The BYNET sends communications between the nodes

(Diagram: the PE handles its users' sessions, checks syntax, checks security, and builds a plan for the AMPs; the BYNET is the communication highway that AMPs and PEs use; the AMPs retrieve and perform database functions on their requested rows, and each AMP has its own virtual disk where the table rows it owns reside.)

These statements are true about the PE, which handles session control and dispatches
work across the BYNET:

The Parser checks statements for proper syntax.

The Optimizer (PE) develops a separate plan for each request to determine the best
way to respond.

The Dispatcher takes the steps from the parser and transmits them over the
BYNET.


Teradata Cabinets, Nodes, VPROCs, and Disks

The best way to predict the future is to create it.
--Sophia Bedford-Pierce
Teradata predicted data warehousing 20 years before its time by creating it. Who could
have imagined 100 Terabyte systems back in the 1970s? Teradata did and created an
architecture that can scale indefinitely.
In the picture below we see a Teradata cabinet with four nodes. Each node has two
Intel processors of lightning speed. Inside each node's memory are the AMPs and PEs,
which are referred to as Virtual Processors or VPROCs. Each node is attached to both
BYNETs and each node is attached directly to a set of disks. Each AMP then has one
virtual disk where it stores its tables and rows. If you want to expand your system, then
merely buy another Node Cabinet and another Disk Cabinet.

(Diagram: a Teradata cabinet containing a System Management Chassis, connections to BYNET 0 and BYNET 1, four nodes (each with PEs, AMPs, and memory), a Disk Array Cabinet with eight disk array controllers (DACs), and dual power supplies.)


LAN Connection for Network Attached Clients


To connect to Teradata from a LAN the physical connections are PC to network to
Ethernet card to Gateway Software to PE. The software needed for the client (user's
PC) is CLI, MOSI, and MTDP. CLI talks directly to Teradata. MTDP tells Teradata
information about the client so Teradata can bring back the answer set to the correct PC
in the proper format. MOSI is used as the networking software. The Gateway is like a
gatekeeper for all LAN-connected users. The Gateway is where logons are enabled or
disabled for LAN users. To enable logons, just use the ENABLE LOGONS command.
The three software components on a Teradata node are the AMP, PE, and PDE
(Parallel Database Extensions) software. The AMPs and PEs are referred to as VPROCs.

(Diagram: client PCs running CLI, MTDP, and MOSI connect through Ethernet cards and the LAN to the Gateway software on the Teradata nodes.)

To connect to a Teradata network host you need:
PE
Gateway software
Ethernet Card
You can also attach your LAN connections directly to the node. Often customers connect
two different LAN connections for redundancy.

Mainframe Connection to Teradata


To connect to Teradata from a Mainframe the physical connections are Mainframe to
ESCON Connection or BUS/TAG Cables and then to a Host Channel Adapter and then to
a dedicated Parsing Engine (PE). The software needed for the client (mainframe) is CLI
and the Teradata Director Program (TDP). CLI talks directly to Teradata. TDP tells
Teradata information about the client so Teradata can bring back the answer set to the
correct terminal in the proper format.
You attach a mainframe host connection directly to a Teradata node and users can
access Teradata via the mainframe. All you need to make that happen is an ESCON
channel, Host Channel Adapter and a Parsing Engine (PE).

Mainframe Connection to Teradata
(Diagram: mainframes running CLI and TDP connect through ESCON channels or Bus/Tag cables to Host Channel Adapters on the Teradata nodes.)

To connect to Teradata via a MAINFRAME you need:
ESCON/BUS TAG CABLES
Host Adapter
PE


Chapter 2 Data Distribution Explained

There are three keys to selling real estate.
They are location, location, and location.
Teradata knows a little about real estate because the largest and best data warehouses
have been sold to the top companies in countries around the world. This is because
Teradata was meant for data warehousing. When Teradata is explained to the business
and they ask if they are interested in a purchase the word most often used is SOLD!

There are three keys to how Teradata spreads the data among the AMPs. They are
Primary Index, Primary Index, and Primary Index.
Every time I begin to teach a data warehousing class, an experienced Teradata manager
or DBA will come up to me and say, "Please explain to the students the importance of the
Primary Index." The Primary Index of each table lays out the data on the AMPs!


Rows and Columns

I never lost a game; time just ran out on me.
--Michael Jordan
Michael Jordan never lost a game; time just ran out on him; however, many data
warehouses lose their game because managing the data can become so intense that life
turns into sudden-death double overtime. Teradata allows the data to be placed by the
system and not the DBA. Talk about a slam dunk!

EMP   DEPT   LNAME    FNAME    SAL
(UPI)
 1     40    BROWN    CHRIS    95000.00
 2     20    JONES    JEFF     70000.00
 3     20    NGUYEN   XING     55000.00
 4      ?    BROWN    SHERRY   34000.00

Teradata stores its information inside tables. A table consists of rows and columns. A
row is one instance of all columns. According to relational concepts, column positions
are arbitrary and a column always contains like data. Teradata does not care in what
order you define the columns, and Teradata does not care about the order of rows in a
table. Row order is arbitrary also, but once a row format is established, Teradata will
use that format because a Teradata table can have only one row format.
There are many benefits of not requiring rows to be stored in order. Unordered data
does not have to be maintained to preserve the order. Unordered data is independent
of the query.

ROW:  40  Brown  Chris  95000

Every AMP will hold a portion of every table. Rows are sent to their destination AMP
based on the value of the column designated as the Primary Index.

Copyright Open Systems Services 2004

Page 18

Chapter 2

The Primary Index

Alone we can do so little; together we can do so much.
--Helen Keller
Helen Keller may have been blind, but she saw so much more than the rest of us. Can
you imagine living in a world of such darkness, yet becoming such a shining light?
Helen Keller was the ultimate leader and she helped millions realize that they should
continue to always learn, and that the journey of life is the ultimate destination.
Teradata uses the Primary Index of each table to provide a row its destination to the
proper AMP. This is why each table in Teradata is required to have a Primary Index.
The biggest key to a great Teradata Database Design begins with choosing the correct
Primary Index. The Primary Index will determine on which AMP a row will reside.
Because this concept is extremely important, let me state again that the Primary Index
value for a row is the only thing that will determine on which AMP a row will reside.
Many people new to Teradata assume that the most important concept concerning the
Primary Index is data distribution. INCORRECT! The Primary Index does determine
data distribution, but even more importantly, the Primary Index provides the fastest
physical path to retrieving data. The Primary Index also plays an incredibly important
role in how joins are performed. Remember these three important concepts of the
Primary Index and you are well on your way to a great Physical Database Design.

The Primary Index plays 3 roles:
Data Distribution
Fastest Way to Retrieve Data
Incredibly important for Joins
What needs to be known prior to selecting the Primary Index to ensure excellent
distribution? The columns that define the index. If they are unique or nearly unique,
then Teradata will spread the data evenly.

Copyright Open Systems Services 2004

Page 19

Chapter 2

The Two Types of Primary Indexes

A man who chases two rabbits catches none.
--Roman Proverb
Every table must have at least one column as the Primary Index. The Primary Index is
defined when the table is created. There are only two types of Primary Indexes, which
are a Unique Primary Index (UPI) or a Non-Unique Primary Index (NUPI).

A man who chases two rabbits misses both by a HARE! A person who chases two
Primary Indexes misses both by an ERR!
--Tera-Tom Proverb
Every table must have one and only one Primary Index. Because Teradata distributes the
data based on the Primary Index column's value, it is quite obvious that you must have a
Primary Index and that there can be only one Primary Index per table.
The Primary Index is the physical mechanism used to retrieve and distribute data. A
Primary Index is made up of one or more columns: it can be as little as one column or a
multi-column Primary Index of up to 16 columns.
Most databases use the Primary Key as the physical mechanism. Teradata uses the
Primary Index. There are two reasons you might pick a different Primary Index than
your Primary Key. They are (1) performance reasons and (2) known access
paths.


Unique Primary Index (UPI)

Always remember that you are unique just like everyone else.
--Anonymous
A Unique Primary Index (UPI) is unique and can't have any duplicates. It is as unique
as you are. Nobody is like you and you are extremely beautiful and amazing. Not one
other person in the history of mankind has ever been exactly like you. You are the
creation of your beautiful parents and must realize how important you are to the world.
A Unique Primary Index is not as amazing as you are, but it is also special.
A Unique Primary Index means that the values for the selected column must be unique.
If you try to insert a row with a Primary Index value that is already in the table, the row
will be rejected. A Unique Primary Index will always spread the table rows evenly
amongst the AMPs. Please don't assume this is always the best thing to do. Below is a
table that has a Unique Primary Index. We have selected EMP to be our Primary Index.
Because we have designated EMP to be a Unique Primary Index, there can be no
duplicate employee numbers in the table.

Employee Table

EMP   DEPT   LNAME    FNAME    SAL
(UPI)
 1     40    BROWN    CHRIS    95000.00
 2     20    JONES    JEFF     70000.00
 3     20    NGUYEN   XING     55000.00
 4      ?    BROWN    SHERRY   34000.00

A Unique Primary Index (UPI) will always spread the rows of the table evenly amongst
the AMPs. UPI access is always a one-AMP operation. It also requires no duplicate
row checking.


Non-Unique Primary Index

You miss 100 percent of the shots you never take.
--Wayne Gretzky
Take a shot at using a Non-Unique Primary Index in your Teradata tables. A
Non-Unique Primary Index (NUPI) means that the values for the selected column can be
non-unique. You can have many rows with the same value in the Primary Index. A
Non-Unique Primary Index will almost never spread the table rows evenly. Please
don't assume this is always a bad thing. Below is a table that has a Non-Unique Primary
Index. We have selected LNAME to be our Primary Index. Because we have designated
LNAME to be a Non-Unique Primary Index, we are anticipating that there will be
individuals in the table with the same last name.

EMP   DEPT   LNAME    FNAME    SAL
             (NUPI)
 1     40    BROWN    CHRIS    95000.00
 2     20    JONES    JEFF     70000.00
 3     20    NGUYEN   XING     55000.00
 4      ?    BROWN    SHERRY   34000.00

A Non-Unique Primary Index (NUPI) will almost NEVER spread the rows of the table
evenly amongst the AMPs.
A Non-Unique Primary Index (NUPI) will contain like data. There can be more than
one row with the same Primary Index value because it is non-unique.
An all-AMP operation will take longer if the data is unevenly distributed. You might
pick a NUPI over a UPI because the NUPI column may be more effective for query
access and joins.


How Teradata Turns the Primary Index Value into the Row Hash

The Primary Index is the only thing that determines where a row will reside. It is
important that you understand this process. Here are the fundamentals in the simplest
form. When a new row arrives into Teradata, the following steps occur:
Teradata's PE examines the Primary Index value of the row.
Teradata takes that Primary Index value and runs it through a Hashing Algorithm.
The output of the Hashing Algorithm (i.e., a formula) is a 32-bit Row Hash.
The 32-bit Row Hash will perform two functions:
1. The 32-bit Row Hash will point to a certain spot on the Hash Map, which will
indicate which AMP will hold the row.
2. The 32-bit Row Hash will always remain with the row as part of a Row
Identifier (Row ID).
Hashing is a mathematical process where an index value (UPI, NUPI) is converted into a
32-bit Row Hash value. The input to this hashing algorithm is the Primary Index value,
and the 32-bit output is called the Row Hash.

(Example: a new row [EMP 99, DEPT 10, LNAME Hosh, FNAME Roland, SAL 50000] arrives; the PE hashes the PI value: 99 / HASH FORMULA = 00001111000011110000111100001111.)

A new row is going to be inserted into Teradata. The Primary Index is the column called
EMP. The value in EMP for this row is 99. Teradata runs the value of 99 through the
Hash Formula and the output is a 32-bit Row Hash. In this example, our 32-bit Row Hash
output is: 00001111000011110000111100001111.
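The essential property of the hash step can be sketched in Python. Teradata's actual hash formula is proprietary; zlib.crc32 is used below only because it also produces a deterministic 32-bit value from an input:

```python
# Toy stand-in for the hashing step (NOT Teradata's real hash formula).
# What matters is the property: the same Primary Index value always
# produces the same 32-bit Row Hash.
import zlib

def row_hash(pi_value):
    return zlib.crc32(str(pi_value).encode())  # deterministic 32-bit value

h = row_hash(99)
print(format(h, "032b"))             # the Row Hash shown as 32 bits
assert row_hash(99) == row_hash(99)  # same input, same hash, every time
```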


The Row Hash determines the Row's Destination

The first 16 bits of the Row Hash (a.k.a., the Destination Selection Word) are used to
locate an entry in the Hash Map. This entry is called a Hash Map Bucket. The only thing
that resides inside a Hash Map Bucket is the AMP number where the row will reside.

(Diagram: a four-AMP Hash Map; the Row Hash 00001111000011110000111100001111 points to a bucket, and each bucket holds an AMP number from 1 to 4.)

The first 16 bits of the Row Hash of 00001111000011110000111100001111 are used to
locate a bucket in the Hash Map. A bucket will contain an AMP number. We now know
that employee 99, whose row hash is 00001111000011110000111100001111, will reside
on AMP 4. Note: The AMP uses the entire 32 bits in storing and accessing the row.
If we took employee 99 and ran it through the hashing algorithm again and again, we
would always get a row hash of 00001111000011110000111100001111.
If we take the row hash of 00001111000011110000111100001111 again and again, it
would always point to the same bucket in the hash map.
The above statement is true about the Teradata Hashing Algorithm. Every time employee
99 is run through the hashing algorithm, it returns the same Row Hash. This Row Hash
will point to the same Hash Bucket every time. That is how Teradata knows which AMP
will hold row 99. It does the math and it always gets what it always got!
Hash values are calculated using a hashing formula.
The Hash Map will change if you add additional AMPs.
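The bucket lookup can be sketched as code. The bucket count and the round-robin bucket-to-AMP assignment below are invented for illustration; only the mechanism (first 16 bits pick a bucket, the bucket stores an AMP number) mirrors the text:

```python
# Toy sketch of the Hash Map lookup. The first 16 bits of the Row Hash
# (the Destination Selection Word) select a bucket; the bucket holds
# the AMP number. The map layout here is made up for illustration.

NUM_AMPS = 4
# A tiny "hash map": bucket -> AMP, cycling over the four AMPs
HASH_MAP = [b % NUM_AMPS + 1 for b in range(65536)]

def destination_amp(row_hash_32):
    bucket = row_hash_32 >> 16   # first 16 bits pick the bucket
    return HASH_MAP[bucket]      # the bucket stores the AMP number

rh = 0b00001111000011110000111100001111
print(destination_amp(rh))       # prints 4 with this toy map
# Same hash -> same bucket -> same AMP, every single time:
assert destination_amp(rh) == destination_amp(rh)
```

With this toy map the example Row Hash happens to land on AMP 4, matching the example in the text; with a real Hash Map the AMP number depends on the map's contents.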


The Row is Delivered to the Proper AMP


Now that we know that Employee 99 is to be delivered to AMP 4, Teradata packs up the
row, places the Row Hash on the front of the row, and delivers it to AMP 4.

(Diagram: AMP 4 receives the row [EMP 99, DEPT 10, LNAME Hosh, FNAME Roland, SAL 50000] with the Row Hash 00001111000011110000111100001111 on the front.)

The entire row for employee 99 is delivered to the proper AMP accompanied by the Row
Hash, which will always remain with the row as part of the Row ID.
Review:

A row is to be inserted into a Teradata table
The Primary Index Value for the Row is put into the Hash Algorithm
The output is a 32-bit Row Hash
The Row Hash points to a bucket in the Hash Map
The bucket points to a specific AMP
The row, along with the Row Hash, is delivered to that AMP


The AMP will add a Uniqueness Value

When the AMP receives a row it will place the row into the proper table, and the AMP
checks if it has any other rows in the table with the same row hash. If this is the first row
with this particular row hash, the AMP will assign a 32-bit uniqueness value of 1. If this
is the second row with that particular row hash, the AMP will assign a uniqueness
value of 2. The 32-bit row hash and the 32-bit uniqueness value make up the 64-bit Row
ID. The Row ID is how tables are sorted on an AMP.
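The counting behavior is easy to sketch. This is a toy model, not Teradata internals; the class and method names are invented:

```python
# Toy sketch of how an AMP assigns Uniqueness Values: the first row with
# a given Row Hash gets 1, the next gets 2, and so on. Row Hash (32 bits)
# plus Uniqueness Value (32 bits) form the 64-bit Row ID.
from collections import defaultdict

class Amp:
    def __init__(self):
        self.seen = defaultdict(int)  # row hash -> rows received so far
        self.rows = []

    def receive(self, row_hash, row):
        self.seen[row_hash] += 1
        uniqueness = self.seen[row_hash]
        row_id = (row_hash << 32) | uniqueness  # 64-bit Row ID
        self.rows.append((row_id, row))
        return uniqueness

amp = Amp()
h = 0b111111                      # three "Davis" rows share one Row Hash
print(amp.receive(h, "Davis, Roland"))  # 1
print(amp.receive(h, "Davis, Sara"))    # 2
print(amp.receive(h, "Davis, Mary"))    # 3
```

Because the Row ID embeds the hash in its high bits, sorting rows by Row ID naturally groups rows with the same Row Hash together, which is what the NUPI example on the next pages shows.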

AMP 4
Row Hash                           Uniqueness Value   EMP  DEPT  LNAME  FNAME   SAL
00001111000011110000111100001111   1                  99   10    Hosh   Roland  50000

The Row Hash and the Uniqueness Value = Row ID

The Row Hash always accompanies a row when an AMP receives it.
The AMP will then assign a Uniqueness Value to the Row Hash. It assigns a 1 if the
Row Hash is unique, or a 2 if it is the second, or a 3 if the third, etc.


An Example of an UPI Table


Below is an example of a portion of a table on one AMP. The table has a Unique
Primary Index of EMP.

AMP 4
Row Hash                           Uniqueness   EMP  DEPT  LNAME    FNAME   SAL
00001111000011110000111100001111   1            99   10    Hosh     Roland  50000
01010101010101010000000000000000   1            21   10    Wilson   Barry   75000
01010111111111111111111111111111   1            --   20    Holland  Mary    86000
11111111111111111100000000000000   1            44   30    Davis    Sandy   54000

The above Employee Table has a Unique Primary Index on the column EMP. Notice that
Row ID sorts the portion of the table on AMP 4. Notice that the Uniqueness Value for
each row is 1.


An Example of an NUPI Table


Below is an example of a portion of a table on one AMP. The table has a Non-Unique
Primary Index (NUPI) on the Last Name called LNAME.

AMP 4
Row Hash                           Uniqueness   EMP  DEPT  LNAME   FNAME   SAL
                                                           (NUPI)
00000000000000000000000000111111   1            65   20    Davis   Roland  150000
00000000000000000000000000111111   2            77   10    Davis   Sara    75000
00000000000000000000000000111111   3            --   20    Davis   Mary    86000
11111111110000000000000000000000   1            --   10    Allen   Sandy   54000

The above Employee Table has a Non-Unique Primary Index on the column LNAME.
Notice that each row with the LNAME of Davis has the exact same Row Hash. Notice
that the Uniqueness Value for each Davis is incremented by 1.
Each time the LNAME is Davis, the Hashing Algorithm generates the Row Hash:
00000000000000000000000000111111
That Row Hash points to the exact same bucket in the Hash Map. This particular bucket
in the Hash Map references (or points to) AMP 4.
The Row Hash accompanied each row to AMP 4. The AMP assigned Uniqueness Values
of 1, 2 and 3 to the three rows with the LNAME of Davis.
Notice that Row ID sorts the portion of the table on AMP 4.


How Teradata Retrieves Rows with the Primary Index

In the example below, a user runs a query looking for information on Employee 99. The
PE sees that the Primary Index column EMP is used in the SQL WHERE clause. Because
this is a Primary Index access operation, the PE knows this is a one-AMP operation. The
PE hashes 99 and the Row Hash is 00001111000011110000111100001111. This points
to a bucket in the Hash Map that represents AMP 4. AMP 4 is sent a message to get the
Row Hash 00001111000011110000111100001111 and make sure it's EMP 99.

SQL:
SELECT *
FROM Employee
WHERE EMP = 99;

(Diagram: the PE runs 99 through the hash formula, producing the Row Hash 00001111000011110000111100001111; that hash points to a bucket in the four-AMP Hash Map that references AMP 4; AMP 4 uses the Row Hash to locate the row [EMP 99, DEPT 10, LNAME Hosh, FNAME Roland, SAL 50000].)


Row Distribution
In the examples below we see three different Teradata systems. The first system has
used Last_Name as a Non-Unique Primary Index (NUPI). The second example has used
Sex_Code as a Non-Unique Primary Index (NUPI). The last example uses
Employee_Number as a Unique Primary Index (UPI).

Example # 1: Non-Unique Primary Index using Last Names
AMP 1: Davis, Davis, Woods | AMP 2: Jones, Rex | AMP 3: Smith, Johnson, Smith | AMP 4: Kelly, Kelly, Hanson, Hanson, Tess
(Like last names hash to the same AMP, so the distribution is lumpy.)

Example # 2: Non-Unique Primary Index using Employee Sex Code
AMP 1: Male, Male, Male | AMP 2: (empty) | AMP 3: (empty) | AMP 4: Female, Female, Female
(With only two values, only two AMPs hold rows: a badly skewed distribution.)

Example # 3: Unique Primary Index using Employee Number
AMP 1: 1, 5, 77 | AMP 2: 22, 9, 15 | AMP 3: 13, 99, 2 | AMP 4: 34, 16, 4
(Unique values spread the rows evenly.)


A Visual for Data Layout


Below is a logical view of data on AMPs. Each AMP holds a portion of a table. Each
AMP keeps the tables in their own separate drawers. The Row ID is used to sort each
table on an AMP.

(Diagram: AMP 1 through AMP 4, each holding its own portion of the Employee, Order, Customer, and Student tables.)

Each AMP holds a portion of every table.
Each AMP keeps its tables in separate drawers.
Each table is sorted by Row ID.


Teradata accesses data in three ways

Primary Index (fastest)
Secondary Index (second fastest)
Full Table Scan (slowest)

Primary Index (fastest): Whenever a Primary Index is utilized in the SQL WHERE
clause, the PE will be able to use the Primary Index to get the data with a one-AMP
operation.
Secondary Index (next fastest): If the Primary Index is not utilized, sometimes Teradata
can utilize a Secondary Index. It is not as fast as the Primary Index, but it is much faster
than a full table scan.
Full Table Scan (FTS) (slowest): Teradata handles full table scans brilliantly because,
thanks to parallel processing, each data row is accessed only once. Full Table Scans are
a way to access Teradata without using an index. Each data block per table is read only
once.

AMP 1: (99, 10, Vu Du, 55000)   (88, 20, Sue Lou, 59000)   (75, 30, Bill Lim, 76000)
AMP 2: (45, 10, Ty Law, 58000)  (56, 20, Kim Hon, 57000)   (83, 30, Jela Rose, 79000)
AMP 3: (22, 10, Al Jon, 85000)  (38, 40, Bee Lee, 59000)   (25, 30, Kit Mat, 96000)
AMP 4: (44, 40, Sly Win, 85000) (57, 40, Wil Mar, 59000)   (93, 10, Ken Dew, 96000)
(Each row lists Emp, Dept, Name, Sal.)

When Teradata does a Full Table Scan of the above, how many
rows are read? 12. How many per AMP? 3.
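The full-table-scan arithmetic can be sketched in Python. This is a toy model of the idea, not Teradata code; the row data mirrors the example above:

```python
# Toy sketch of a full table scan: every AMP scans its own portion of
# the table in parallel, and each row is touched exactly once.

amps = [                                  # 4 AMPs, 3 rows each
    [(99, "Vu Du"), (88, "Sue Lou"), (75, "Bill Lim")],
    [(45, "Ty Law"), (56, "Kim Hon"), (83, "Jela Rose")],
    [(22, "Al Jon"), (38, "Bee Lee"), (25, "Kit Mat")],
    [(44, "Sly Win"), (57, "Wil Mar"), (93, "Ken Dew")],
]

def full_table_scan(amps):
    # each AMP reads its rows once; results merge into the answer set
    return [row for amp_rows in amps for row in amp_rows]

result = full_table_scan(amps)
print(len(result))                 # 12 rows read in total
print(max(len(a) for a in amps))   # 3 rows read per AMP
```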


Data Layout Summary


Teradata lays out data totally based on the Primary Index value. If the Primary Index is
Unique, the data layout will be spread equally among the AMPs. If the Primary Index is
Non-Unique, the data distribution across the AMPs may be skewed. A Non-Unique
Primary Index is acceptable if the data values provide reasonably even distribution.
Every table must have a Primary Index and it is created at CREATE TABLE time. When
users utilize a Unique Primary Index in the WHERE clause of their query, the query will
be a one-AMP operation. Why?
A Unique Primary Index value only uses one AMP to return at most one row.
A Non-Unique Primary Index value also uses one AMP to return zero to many rows. The
same values run through the Hashing Algorithm will return the exact same Row Hash.
Therefore, like values will go to the same AMP. The only difference will be the
Uniqueness Value.
Every row in a table will have a Row ID. The Row ID consists of the Row Hash of the
Primary Index value and the Uniqueness Value.

Primary    Number     Rows
Index      of AMPs    Returned
UPI        1          0-1
NUPI       1          0-Many


Chapter 3 Teradata Space


How Permanent Space is calculated

No one is so generous as he who has nothing to give.
--French Proverb
If you don't have any perm you are at the Merci of the DBA. There are three types of
space in Teradata and they are Perm, Spool, and Temp. It all starts with Perm Space.
Users will most likely not have any Perm space because Perm is for permanent tables,
secondary indexes, and the Permanent Journals. If a user is given Perm space it is not
allocated immediately, but is an upper limit of space for their tables.
Teradata permanent space is calculated by adding up all the available space on the AMPs'
attached disks; that total is the size of your Teradata warehouse. When a system is
delivered the user DBC owns all the Permanent Space. It is important to remember that
Teradata was born to be parallel, so Teradata always calculates all space on a per-AMP
basis. In the pictures below we see that the original system was 100 Gigabytes. DBC
owned all 100 Gigabytes. Since there are four AMPs, we actually calculate the space
as 25 Gigabytes per AMP.

(Diagram: on a new system, DBC owns 100% of the Permanent Space. If DBC owns 100 Gigabytes of Perm Space on a 4-AMP system, it actually owns 25 GB per AMP, because all space is calculated on a per-AMP basis.)
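The per-AMP accounting is simple division, sketched here as a tiny helper (the function name is invented for illustration):

```python
# Toy calculation of per-AMP Permanent Space: Teradata accounts for all
# space on a per-AMP basis, so a 100 GB, 4-AMP system gives each AMP
# 25 GB to manage.

def perm_per_amp(total_gb, num_amps):
    return total_gb / num_amps

print(perm_per_amp(100, 4))   # 25.0 GB per AMP
```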


How Permanent Space is Given

The Constitution only gives people the right to pursue happiness. You have to catch it yourself.
--Ben Franklin
The Teradata Constitution states, "We the Users, in order to form a more perfect UNION,
or INTERSECT, establish SQL as the holder of Truths to be self joined, that all users are
created equal, EXCEPT power users, with certain unalienable Access Rights, that among
these SELECTed are Life, Liberty, and the Pursuit of Management's Happiness." The
real truth is that when a system is delivered the user DBC owns all the Permanent Space,
so forget about "We the People." DBC sits on top of the hierarchy. It is up to DBC to
give up some of its Perm Space to others, so it is "Ye the DBC."
Until this system changes size there will always be 100 Gigabytes of PERM Space. It
will merely be owned by multiple users or databases. Permanent space defines the
upper limit of space and it is not allocated at Table Create time.

(Diagram: DBC owns 100% of the Permanent Space of a new 100-Gigabyte system. If DBC gives 40 Gigabytes of Perm Space to MRKT, then DBC owns 60 GB and MRKT owns 40 GB of the same 100 Gigabytes.)


The Teradata Hierarchy

In the end we'll remember not the words of our enemies, but the silence of our friends.
--Martin Luther King, Jr.
One of the greatest human beings of all-time in our opinion was Dr. Martin Luther King,
Jr. who had a dream. Teradata has a dream that all users can be judged not by the color
of their skin, but by the characters in their SQL.
Teradata is hierarchical in nature. Anyone above you in the hierarchy is your parent or
owner. Anyone below you is your child. The key point is that anytime you give away
some of your Permanent Space you lose it until the child is dropped or gives it back. It is
like money. If you give it away you have less in your account. Give it all away and you
are broke! Also notice that the total Permanent Space in the system below is still 100
Gigabytes.
Permanent Space is where objects (i.e., databases, users, tables) are created and stored.
Permanent Space is released when data is deleted or when objects are dropped.
Permanent space defines the upper limit of space for a database or user.

(Diagram: What would the hierarchy look like if DBC created SALES with 10 Gigabytes of Perm, and MRKT created Advertising and gave it 20 GB of Perm? The hierarchy would be: DBC with 50 GB, its children SALES with 10 GB and MRKT with 20 GB, and MRKT's child Advertising with 20 GB. The total is still 100 Gigabytes.)


How Spool Space is calculated

It's not the size of the dog in the fight, but the size of the fight in the dog.
--Archie Griffin
Spool space is a wonderful thing unless the query goes to Hies-man! Each user who runs
queries is allocated a certain amount of Spool Space for the query answer set. If your
answer set runs past its spool limit the query is aborted. Some users logon twice thinking
they can trick the system, but spool is calculated on a user basis and once you are over
you are stopped at the goal line. Running out of spool space makes users mad dog mean.
Spool space is literally unused permanent space. Spool space is system-wide, so
anywhere there is empty PERM space it can be used for spool. It is important to
remember that Teradata was born to be parallel, so Teradata always calculates all space
on a per-AMP basis. Calculating Spool Space is radically different from calculating
PERM Space. PERM Space always totals the total space available in the
system. Each database or user might own a portion of the space, but what counts is
whether or not the space has been filled with tables, secondary index subtables, or
Permanent Journals.
Spool and Temp space are nothing more than unused PERM. If tables are not filling the
disks then users can utilize this empty space for their Spool and Temp space. A user will
run out of spool space if they exceed their limit on a per-AMP basis.

[Diagram: four AMPs, each holding Tables.]

The total amount of PERM allocated is always 100%. However, the actual loading of the tables took up 50% of the disks. There is 50% of the disks available system wide for SPOOL.


A Spool Space Example

Speak in a moment of anger and you'll deliver the greatest speech you'll ever regret.
- Anonymous
When you are angry, hold your tongue and keep your cool. When your query aborts, hold your tongue and raise your spool. It is important to remember that Teradata was born to be parallel, so Teradata always calculates all space on a per-AMP basis. Calculating PERM Space is radically different from calculating Spool Space. PERM Space always totals to the total space available in the system. The total amount of spool space is whatever is left over from PERM once the tables have been loaded. Remember that if you totaled everyone's spool, it could be a thousand times more than the total perm. This is because it is assumed not everyone will be logged on at the same time.
In our example below we have 3 users in MRKT. Each could be assigned the maximum amount of Spool Space that MRKT is assigned, and that is 20GB. Each could run their queries simultaneously and all could be just under 20GB and the system would not care. Spool space is an upper limit for your query answer sets. You don't add or subtract when you are giving someone else spool. The total amount of spool in the system will always exceed the actual Perm space. There are two times you run out of Spool Space: when the system is completely out of free space, or when your query exceeds its spool limit.

[Diagram: SALES is assigned 10 Gigabytes of Spool. MRKT is assigned 20 Gigabytes of Spool, and USER 1, USER 2, and USER 3 reside in MRKT.]

How much spool space can be assigned to the users in MRKT? Could they each run a
query simultaneously that reached 19.5 Gigabytes of spool? Yes!
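Because spool is only an upper limit, an administrator can raise a user's ceiling without consuming any disk. Here is a minimal sketch, assuming a DBA logon with the right privileges (the user name and the new limit are invented for illustration):

```sql
-- Raise the spool ceiling for one user (hypothetical name and limit).
-- No disk is consumed by this statement; it only changes the upper limit.
MODIFY USER User1 AS SPOOL = 25E9;  -- 25 Gigabytes

-- Verify the new limit in the Data Dictionary.
SELECT UserName, SpoolSpace
FROM   DBC.Users
WHERE  UserName = 'User1';
```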

PERM, SPOOL and TEMP Space

Every sunrise is a second chance.
- Unknown
There is a tribe in Africa that awakes during darkness and prays for the sun to come up. They have been doing so for thousands of years, and every day their prayers are answered. We all owe them a debt of gratitude because every sunrise is a second chance. There are times when you might feel down, but don't forget to give yourself a second chance. Teradata will always give you another chance if you run out of Perm, Spool or Temp.
A user or database is assigned at least two types of space. They are Permanent Space and Spool Space. PERM space is used to store tables, and most users won't get any Perm. SPOOL space is for users to run their queries, and every user gets Spool. If your query exceeds your allocated Spool space, you will need a second chance because your query is immediately aborted. For users who want to utilize Global Temporary Tables, another space is used, and it is called TEMP space.
A user who is assigned no Permanent Space can't create tables in their user space. They could however create a view, macro, or trigger, because these objects don't use Perm space.

- Permanent Space is used for Tables, Secondary Indexes, and Permanent Journals.
- Spool Space is intermediate query results.
- Temp Space is intermediate query space for Global Temporary Tables.
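All three limits are set when a user is created. A minimal sketch follows; the user name, owner, password, and sizes are invented for illustration, and PERM and TEMPORARY default to zero if omitted:

```sql
-- Create a user with all three space limits (hypothetical values).
CREATE USER Mandy FROM MRKT AS
    PASSWORD  = secret123,
    PERM      = 1E6,   -- 1,000,000 bytes for tables the user owns
    SPOOL     = 10E6,  -- upper limit for intermediate answer sets
    TEMPORARY = 5E6;   -- upper limit for Global Temporary Table space
```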

[Diagram: four AMPs, each holding its share of PERM SPACE, SPOOL SPACE, and TEMP SPACE.]

Spool Space controls system time

Danger Will Robinson Danger!
- Robot on 1960s TV Show "Lost in Space"
Permanent Space and Spool Space were designed to control how data warehouse space is allocated and how long a user's query can run. Spool space limits are designed to handle runaway queries and control how much time a user's query can run before it is deemed to be "hogging" the system. When a query is run, the result set is produced on the AMP's disk in a Spool file until it is ready to be transmitted over the BYNET to the PE, which then passes the answer to the user. If a user exceeds their spool limit by one byte, the query is immediately aborted. Not every user has the same amount of spool space. Power users are often given more spool than someone who is a new user.

Danger Your Query has exceeded its limit and will be aborted before it does some ROBBING SON!
- Abort Button of 1960s TV Show "Lost in Spool Space"
Spool Space comes from PERM space that has not been allocated. Spool space is unused PERM. The primary reason to have SPOOL space available is to store intermediate and final results of queries that are being processed in Teradata. Spool Space is released when the query is over or when the query no longer needs it. Spool space is Permanent Space that is not currently being used.
Temporary Space is also Permanent Space that is not currently being used. Some users may be assigned Temporary Space, which defines the upper limit of space that the user can utilize in Global Temporary Tables.


A quiz on Perm and Spool Space

MRKT starts with 10,000,000 bytes of Perm and 10,000,000 bytes of Spool.
SALES starts with 5,000,000 bytes of Perm and 5,000,000 bytes of Spool.
Steve and Mandy are then created with 1,000,000 bytes of Perm and 10,000,000 bytes of Spool each.

The new hierarchy looks like this (STEVE and Mandy are created under MRKT):

    MRKT                           SALES
    10,000,000 Bytes of Perm       5,000,000 Bytes of Perm
    10,000,000 Bytes of Spool      5,000,000 Bytes of Spool
      |
      |-- STEVE: 1,000,000 Bytes of Perm, 10,000,000 Bytes of Spool
      |-- Mandy: 1,000,000 Bytes of Perm, 10,000,000 Bytes of Spool


Once Steve and Mandy are created:
(1) How much Perm in MRKT? ___________
(2) How much Spool in MRKT? ___________

If Steve is given to SALES, then:
(3) How much Perm and Spool in MRKT now? ___________ ___________
(4) How much Perm and Spool in Steve? ___________ ___________
(5) How much Perm and Spool in SALES? ___________ ___________

If Steve is then dropped from the system:
(6) How much Perm and Spool in MRKT now? ___________ ___________
(7) How much Perm and Spool in SALES? ___________ ___________


Answers:
(1) How much Perm in MRKT? 8M
(2) How much Spool in MRKT? 10M

If Steve is given to SALES, then:
(3) How much Perm and Spool in MRKT now? 8M / 10M
(4) How much Perm and Spool in Steve? 1M / 10M
(5) How much Perm and Spool in SALES? 5M / 5M

If Steve is then dropped from the system:
(6) How much Perm and Spool in MRKT now? 8M / 10M
(7) How much Perm and Spool in SALES? 6M / 5M


Another quiz on Perm and Spool Space

Question 1: A system has 200 Gigabytes of space, and User A is assigned 60 Gigabytes of permanent space. User A gives User B 40 Gigabytes of permanent space. How much space will User A have left?
Answer: 20 Gigabytes

Question 2: A system has 200 Gigabytes of permanent space in the system. 100 Gigabytes is reserved for spool. The system currently has 60 Gigabytes of user data. How much could be left for spool?
Answer: 140 Gigabytes
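You can verify allocations like these yourself against the Data Dictionary. A hedged sketch: DBC.DiskSpace reports per-AMP figures, so we aggregate across AMPs (column names as found in V2R5-era dictionaries):

```sql
-- Sum each database's allocated and used Perm across all AMPs.
SELECT   DatabaseName,
         SUM(MaxPerm)     AS TotalPerm,
         SUM(CurrentPerm) AS UsedPerm
FROM     DBC.DiskSpace
GROUP BY DatabaseName
ORDER BY DatabaseName;
```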


Chapter 4 V2R5 Partition Primary Indexes

Life is a succession of lessons, which must be lived to be understood.
- Ralph Waldo Emerson
As Ralph Waldo Emerson once said, "Life is a succession of lessons, which must be lived to be understood." Teradata has lived in and understood the data warehouse environment for decades over their competitors. One of the key fundamentals of the V2R5 release is the ability to allow the AMPs to access data quicker with Partition Primary Indexes.
In the past, Teradata has hashed the Primary Index, which produced a Row Hash. From the Row Hash, Teradata was able to send the row to a specific AMP. The AMP would place a uniqueness value on the row, and the Row Hash plus the uniqueness value made up the Row ID. The data on each AMP was grouped by table and sorted by Row ID.
Through years of experience working with data warehouse user queries, Teradata has decided to take the hashing to an additional level.
In the past you could choose a Unique Primary Index (UPI) or a Non-Unique Primary Index (NUPI). Now Teradata will also let you choose either a Partition Primary Index (PPI) or a Non-Partition Primary Index (NPPI).
This allows for fantastic flexibility because user queries often involve ranges or are specific to a particular department, location, region, or code of some sort. Now the AMPs can find the data quicker because the data is grouped together by partition, and you can avoid Full Table Scans more often.
An example is definitely called for here. I will show you a table that is hashed and another that has a Partition Primary Index.


V2R4 Example
If you are on a V2R4 machine then each table is distributed to the AMPs based on
Primary Index Row Hash and then sorted on that AMP by Row ID. The example below
is also a Non-Partitioned Primary Index in V2R5.

An Example of Teradata V2R4

AMP 1 - Order Table
Row Hash   Order Date   Order Number
01         2-1-2003     99
05         1-1-2003     88
08         3-1-2003     95
09         1-2-2003     6
80         1-5-2003     77
87         2-4-2003     14
98         3-2-2003     17

AMP 2 - Order Table
Row Hash   Order Date   Order Number
02         2-2-2003     44
04         1-10-2003    53
12         3-5-2003     16
42         1-6-2003     100
52         3-6-2003     35
55         2-5-2003     15
88         1-22-2003    74

Primary Index is Order_Date

Notice that the Primary Index is ORDER_DATE. The Order_Date was hashed, and rows were distributed to the proper AMP based on Row Hash and then sorted by Row ID. Unfortunately, the query below results in a full table scan to satisfy the query.

SELECT * FROM Order_Table
WHERE Order_Date BETWEEN '1-1-2003' AND '1-31-2003';


V2R5 Partitioning
Notice that the Primary Index is now a Partition Primary Index on ORDER_DATE. The Order_Date was hashed, and rows were distributed to the same exact AMPs as before. The only difference is that the data is placed in partitions of Order_Date months and then sorted by Row Hash within each partition. The query below does not take a Full Table Scan because the January orders are all together in their partition. Partitioned Primary Indexes (PPI) are best for queries that specify range constraints.

An Example of PPI on Teradata V2R5

AMP 1 - Order Table
Row Hash   Order Date   Order Number
05         1-1-2003     88
09         1-2-2003     6
80         1-5-2003     77
01         2-1-2003     99
87         2-4-2003     14
08         3-1-2003     95
98         3-2-2003     17

AMP 2 - Order Table
Row Hash   Order Date   Order Number
04         1-10-2003    53
42         1-6-2003     100
88         1-22-2003    74
02         2-2-2003     44
55         2-5-2003     15
12         3-5-2003     16
52         3-6-2003     35

Partition Primary Index is Order_Date

SELECT * FROM Order_Table
WHERE Order_Date BETWEEN '1-1-2003' AND '1-31-2003';


Partitioning doesn't have to be part of the Primary Index

A journey of a thousand miles begins with a single step.
- Lao Tzu

Understanding Teradata begins with a single step, and that is reading and understanding this book. You will soon be a Teradata master, and that is quite an accomplishment. Understanding partitioning is easy once you understand the basic steps. You do not have to partition by a column that is in the primary index. Here is an example:

CREATE SET TABLE EMPLOYEE_TABLE
(
  EMPLOYEE    INTEGER NOT NULL,
  DEPT        INTEGER,
  FIRST_NAME  VARCHAR(20),
  LAST_NAME   CHAR(20),
  SALARY      DECIMAL(10,2)
)
PRIMARY INDEX (EMPLOYEE)   -- a Non-Unique Primary Index
PARTITION BY DEPT;

You can NOT have a UNIQUE PRIMARY INDEX on a table that is partitioned by something not included in the Primary Index.
Remember, data is never distributed based on the partition. Data is only distributed based on the Primary Index of a table (even if it is a PPI table).


Partition Elimination can avoid Full Table Scans

AMP 1 - Employee_Table
  Part 1:
  Employee   Dept   First_Name
  99         10     Tom
  75         10     Mike
  56         10     Sandy
  Part 2:
  30         20     Leona
  54         20     Robert
  40         20     Morgan

AMP 2 - Employee_Table
  Part 1:
  Employee   Dept   First_Name
  13         10     Ray
  12         10     Jeff
  21         10     Randy
  Part 2:
  16         20     Janie
  55         20     Chris
  70         20     Gareth

Partition Primary Index is Dept


How many partitions on each AMP will need to be read for the following query?

SELECT *
FROM Employee_Table
WHERE Dept = 20;
Answer: 1
Partition Primary Indexes reduce the number of rows that are processed by using
partition elimination.
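You can watch partition elimination at work. This sketch assumes a PPI table like the Employee_Table above; PARTITION here is the system-derived column Teradata added in V2R5, and the EXPLAIN text reveals whether only one partition needs to be scanned:

```sql
-- Count how many rows landed in each partition of the PPI table.
SELECT   PARTITION, COUNT(*)
FROM     Employee_Table
GROUP BY 1
ORDER BY 1;

-- The EXPLAIN for a partition-eliminating query should mention
-- scanning a single partition rather than all partitions.
EXPLAIN
SELECT *
FROM   Employee_Table
WHERE  Dept = 20;
```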


The Bad NEWS about Partitioning on a column that is not part of the Primary Index

Before you get too excited about partitioning by a column that is not part of the primary index, you should remember The Alamo. This is because when you run queries that don't mention the PARTITION column in your SQL, Teradata has to check every partition, and that can be some serious battle. A table can have from 1 to 65,535 partitions. The example below will have to check every partition, so be careful.

AMP 1 - Employee_Table
  Part 1:
  Employee   Dept   First_Name
  99         10     Tom
  75         10     Mike
  56         10     Sandy
  Part 2:
  30         20     Leona
  54         20     Robert
  40         20     Morgan

AMP 2 - Employee_Table
  Part 1:
  Employee   Dept   First_Name
  13         10     Ray
  12         10     Jeff
  21         10     Randy
  Part 2:
  16         20     Janie
  55         20     Chris
  70         20     Gareth

Partition Primary Index is Dept


SELECT *
FROM Employee_Table
WHERE Employee = 99;

GREAT things about Partition Primary Indexes: PPI avoids full table scans without the overhead of a secondary index, and it allows for instantaneous dropping of old data and rapid addition of newer data.
Remember these rules: A Primary Key can't be changed. A Primary Index always distributes the data. A Partition Primary Index (PPI) partitions data to avoid full table scans.


Two ways to handle Partitioning on a column that is not part of the Primary Index

You have two ways to handle queries when you partition by a column that is not part of the Primary Index:
a. You can assign a Unique Secondary Index (when appropriate).
b. You can include the partition column in your SQL.

Example 1:

CREATE UNIQUE INDEX (Employee)
ON Employee_Table;

SELECT *
FROM Employee_Table
WHERE Employee = 99;

Example 2 (the partition column is Dept):

SELECT *
FROM Employee_Table
WHERE Employee = 99
AND Dept = 10;

In both examples above, only one partition would need to be read.

Partitioning with CASE_N

You cracked the case, honey.
- Vinny, My Cousin Vinny

Teradata now allows you to crack the CASE statement as a partitioning option. Here are the fundamentals. Use of CASE_N results in:
- Just like the CASE statement, it evaluates a list of conditions, picking only the first condition met.
- The data row will be placed into a partition associated with that condition.

CREATE TABLE Order_Table
(
  Order_Number     INTEGER NOT NULL,
  Customer_Number  INTEGER NOT NULL,
  Order_Date       DATE,
  Order_Total      DECIMAL(10,2)
)
PRIMARY INDEX (Customer_Number)
PARTITION BY CASE_N
  (Order_Total <  1000
  ,Order_Total <  5000
  ,Order_Total < 10000
  ,Order_Total < 50000
  ,NO CASE, UNKNOWN);

Note: We can't have a Unique Primary Index (UPI) here because we are partitioning by ORDER_TOTAL, and ORDER_TOTAL is not part of the Primary Index.
Data Distribution of a Partitioned Primary Index table is based only on the Primary
Index.


Partitioning with RANGE_N

Teradata also has a western theme, because they allow your partitions to go "Home on the Range" by using the RANGE_N function. Here are the fundamentals. Use of RANGE_N results in:
- The expression is evaluated and associated to one of a list of ranges.
- Ranges are always listed in increasing order and can't overlap.
- The data row is placed into the partition that falls within the associated range.
- The test value in the RANGE_N function must be an INTEGER or DATE. (This includes BYTEINT and SMALLINT.)

In the example below, notice that you can use a UNIQUE PRIMARY INDEX on a partitioned table when the partitioning column is part of the PRIMARY INDEX.

CREATE TABLE Order_Table
(
  Order_Number     INTEGER NOT NULL,
  Customer_Number  INTEGER NOT NULL,
  Order_Date       DATE,
  Order_Total      DECIMAL(10,2)
)
UNIQUE PRIMARY INDEX (Customer_Number, Order_Date)
PARTITION BY RANGE_N
  (Order_Date BETWEEN DATE '2003-01-01'
              AND     DATE '2003-06-30'
              EACH INTERVAL '1' DAY);

NO CASE, NO RANGE, or UNKNOWN

We only have one person to blame, and that's each other.
- Barry Beck, NY Ranger, on who started a fight during a hockey game

In no case does NO RANGE have anything to do with a New York Ranger, but if you specify NO CASE, NO RANGE, or UNKNOWN with partitioning, there will be no fighting amongst the partitions. These keywords tell Teradata in which penalty box, or partition, to place bad data.
You can specify a NO CASE or NO RANGE partition, as well as a partition for UNKNOWN.
A NO CASE or NO RANGE partition is for any value which isn't true for any previous CASE_N or RANGE_N expression.
If UNKNOWN is included as part of the NO CASE or NO RANGE option with an OR condition, then any values that are not true for any previous CASE_N or RANGE_N expression and any unknowns (e.g., NULL) will be put in the same partition. This example has a total of 5 partitions:

PARTITION BY CASE_N
  (Salary <   30000,
   Salary <   50000,
   Salary <  100000,
   Salary < 1000000,
   NO CASE OR UNKNOWN);

If you don't see the OR operand associated with UNKNOWN, then NULLs will be placed in the UNKNOWN partition, and all other rows that don't meet the CASE criteria will be placed in the NO CASE partition. This example has a total of 6 partitions:

PARTITION BY CASE_N
  (Salary <   30000,
   Salary <   50000,
   Salary <  100000,
   Salary < 1000000,
   NO CASE, UNKNOWN);


Chapter 5 Data Protection


"Age does not protect you from love. But love, to some extent, protects you from age."
- Jeanne Moreau, French Actress
As a man was driving down the interstate highway, his cell phone rang. When he answered, he heard his wife warn him urgently, "George, I just heard on the news that there's a car going the wrong way on I-26!" George replied, "I'm on I-26 right now and it's not just one car. It's hundreds of them!"
How do you protect your data when things go the wrong way? Murphy's Law states, "The more mission critical a data warehouse, the more likely the system will crash at the most critical moment of the mission." Ironically, most DBAs think Murphy was an optimist.
A database not prepared to defend itself is like an unsigned contract. It is not worth the paper it is written on. However, Teradata is always prepared, and it will protect your data better than a wild pit bull. As a matter of fact, the difference between Teradata and a pit bull is that eventually the pit bull will get bored and let go.
System and user errors are inevitable in any large system. For example, an associate may accidentally give everyone a 100% raise instead of a 10% raise. Or, what if a million-dollar transaction fails right at the wrong time? Or an AMP or disk goes down? In any of these cases, Teradata has many ways to protect your data. Some processes for protection are automatic and some of them are optional.
The protection features we will discuss are:
- Transaction Concept
- Transient Journal
- FALLBACK
- RAID
- Clustering
- Cliques
- Permanent Journaling


Transaction Concept & Transient Journal

The afternoon knows what the morning never suspected.
- Swedish Proverb
At any time something could go wrong with a transaction. An old proverb suggests, "The afternoon knows what the morning never suspected"; likewise, the Transient Journal knows what the transaction never suspected.
What good would it do if you could gather, store and analyze terabytes of data, but doubted the integrity of the data? Teradata makes every effort to ensure a database doesn't get corrupted. Fundamental to this assurance is the Transaction Concept, which means that an SQL statement is viewed as a transaction. Simply stated, either it works or it fails.

The Transient Journal knows what the Transaction never suspected.
- Swedish Proverb after a rollback
The Transient Journal's job is to ensure that if an insert, update, or delete fails, the rows affected can be reverted back to their original state. This is called a Rollback.
In Teradata, all SQL statements are considered transactions. This applies whether you have one statement or multiple statements executing (as in a MACRO). If all SQL statements cannot be performed successfully, the following happens:
- The user receives immediate feedback in the form of a failure message;
- The entire transaction is rolled back, and any changes made to the database are reversed;
- Locks are released;
- Spool files are discarded.
The Transient Journal is automatic, and it takes a before picture of any update or delete for rollback purposes.


How the Transient Journal Works

Beware of the young doctor and the old barber.
- Benjamin Franklin
Wouldn't it be great if every time you got a haircut, the barber or stylist took a picture of your hairdo before they cut a single strand? Then after he or she cut your hair, asked if you liked it? If you didn't like it, then you could ask to have it restored? Well, that is what the Transient Journal does. If a row is going to change because of an INSERT, UPDATE, or DELETE, it takes a BEFORE picture. If the transaction fails, then the journal restores it to the way it was.
The TRANSIENT JOURNAL is an automatic system function. It is not optional. The BEFORE image is actually stored in the AMP's Transient Journal. Every AMP has a transient journal that is maintained in DBC's PERM space. If the transaction is aborted for any reason, the AMP restores the data to match the before-image stored in the Transient Journal. The data will then revert to its original state. When a transaction is successful, the PE and the AMPs shake hands on it, and the Transient Journal is wiped clean. The handshake is called the COMMIT. After a COMMIT, all the AMPs have a party to celebrate, and the user is invited to join in the festivities! In other words, Transient Journal cleanliness is next to godliness. If it is clean, then things went well!
The Transient Journal provides two system events that occur automatically to ensure data integrity. An automatic rollback of changed rows occurs in the event of a transaction failure. This is done because before images are retained on each AMP as changes occur. Data is always returned to its original state after a transaction failure.

[Diagram: four AMPs, each with its own Transient Journal. The Transient Journal "camera" takes a before picture of the row being changed:]

Employee Number   Department Number   First Name   Last Name   Salary
99                10                  Bill         Davis       78,000

FALLBACK Protection

United we stand, divided we fall.
- Circular letter, Boston, during the American Revolution
FALLBACK is a table protection feature used in case an AMP fails. Fallback is similar
to mirroring in that a duplicate copy of a row is created and maintained on another
AMP for redundancy purposes. Essentially, anytime you define a table with Fallback
you are using twice the space. You can use FALLBACK on all tables, some tables or
no tables. You can also create a table with or without FALLBACK and then add or
drop the feature at any time.
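Adding or dropping Fallback is a one-line DDL change. A hedged sketch follows; the table name is invented and the column list is abbreviated for illustration:

```sql
-- Create a table with Fallback from the start (hypothetical table).
CREATE TABLE Employee_Copy, FALLBACK
(
  Employee  INTEGER NOT NULL,
  Dept      INTEGER
)
PRIMARY INDEX (Employee);

-- Drop the feature later...
ALTER TABLE Employee_Copy, NO FALLBACK;

-- ...or add it back at any time.
ALTER TABLE Employee_Copy, FALLBACK;
```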

Divided we stand, united we Fallback.
- AMP during the computer revolution
Fallback is similar to mirroring in that it creates and maintains a duplicate copy of each row, but it is designed in a revolutionary manner for performance purposes. With mirroring, if one disk goes down, another duplicate disk takes over. Fallback, however, will take all the rows that one AMP is responsible for in a fallback protected table and store them on multiple AMPs. If the AMP fails, then multiple AMPs will be responsible for delivering the failed AMP's rows.

We have the right to bear arms.
- 2nd Amendment of the Constitution

Teradata believes its constitution is to protect the data, and so a duplicate copy is always maintained on another AMP.

We have no access rights to bare AMPs.
- 2nd Amendment of the Teradata constitution


How Fallback Works

It's déjà vu all over again!
- Yogi Berra
Fallback is like déjà vu all over again, because when a table is fallback protected, the rows are duplicated on other AMPs. Fallback is similar to mirroring, but different. The similarity is that both provide a duplicate copy, but the difference is that Fallback places copies of its rows on multiple AMPs, so if a failure occurs, Teradata can use its parallelism to help the failed AMP.
Below is a diagram of four AMPs holding a base table. For example's sake, let's assume that the base table is the Employee Table. There are 12 employees with employee numbers ranging from 1 to 12. The data is spread evenly in the table, with each AMP responsible for 3 employees.
The Employee Table has been created with Fallback, so each row of the base table is duplicated on another AMP in the Fallback Table. Notice three very important features:
(1) No base table row is on the same AMP with its Fallback protected duplicate copy.
(2) Each AMP spreads its Fallback rows evenly to multiple AMPs.
(3) The perm space used for the table is doubled because of the Fallback.
The system can lose any single AMP or disk in this system. If multiple AMPs or disks fail in the picture below, then Teradata won't be able to run queries that ask for all the data.

                AMP 1     AMP 2     AMP 3     AMP 4
Base Table      1 5 9     2 6 10    3 7 11    4 8 12
Rows
Fallback        10 7 4    1 11 8    5 2 12    9 6 3
Rows


Fallback Clusters
Fallback is always associated with CLUSTERS. Fallback can be specified at the table level. Fallback is worth the price because when an AMP fails, users still have access to the data even while the AMP is offline. Any data that has changed is automatically restored during the AMP offline period.
If we can lose any one AMP/disk, what happens if we lose two? The chance of losing two AMPs in a four-AMP system is rare; however, some systems have nearly 2,000 AMPs. Therefore, the chance of losing two AMPs in a 2,000-AMP system is much greater than in a four-AMP system. That's why Teradata designed Clustering. With Clustering, Teradata can lose one AMP/disk per cluster. Let's look at this next example with 8 AMPs in two clusters.
Notice that the data in the base table lays out evenly with 24 records on 8 AMPs. What is key to notice is that the fallback copy remains within the cluster. In other words, the base table rows in cluster one are fallback protected within cluster one. The base table rows in cluster two are fallback protected within cluster two. We can lose one AMP/disk in both cluster one and cluster two, and the system is fine.

Cluster # 1
                AMP 1     AMP 2     AMP 3     AMP 4
Base Table      1 9 17    2 10 18   3 11 19   4 12 20
Rows
Fallback        18 11 4   1 19 12   9 2 20    17 10 3
Rows

Cluster # 2
                AMP 5     AMP 6     AMP 7     AMP 8
Base Table      5 13 21   6 14 22   7 15 23   8 16 24
Rows
Fallback        22 15 8   5 23 16   13 6 24   21 14 7
Rows


Down AMP Recovery Journal (DARJ)

Once the game is over, the king and the pawn go back in the same box.
- Italian Proverb
The Down AMP Recovery Journal (DARJ) is started on all AMPs in the cluster when an AMP is down. This allows the three other AMPs to check on their mate. Since there are four AMPs in most clusters, and all Fallback for a particular AMP remains within the cluster, there are three AMPs that will hold Fallback rows for a down AMP.
The Down AMP Recovery Journal (DARJ) is a special journal used only for FALLBACK rows when an AMP is not working. Like the TRANSIENT JOURNAL, the DARJ, also known as the RECOVERY JOURNAL, gets its space from DBC's PERM space. When an AMP fails, the rest of the AMPs in its cluster initiate a DARJ. The DARJ keeps track of any changes that would have been written to the failed AMP. When the AMP comes back online, the DARJ will catch up the AMP by completing the missed transactions. Once everything is caught up, the DARJ is dropped.

Cluster # 1 (AMP 1 is down; AMPs 2, 3 and 4 each start a Down AMP Recovery Journal)

                AMP 1 (down)  AMP 2     AMP 3     AMP 4
Base Table      1 9 17        2 10 18   3 11 19   4 12 20
Rows
Fallback        18 11 4       1 19 12   9 2 20    17 10 3
Rows

DARJ Example for Online Catch-Up:
  Updated Base 1       Salary = 92000
  Updated Base 9       Last_Name = 'Smith'
  Updated Fallback 4   Dept_no = 10

Cluster # 2
                AMP 5     AMP 6     AMP 7     AMP 8
Base Table      5 13 21   6 14 22   7 15 23   8 16 24
Rows
Fallback        22 15 8   5 23 16   13 6 24   21 14 7
Rows


Redundant Array of Independent Disks (RAID)

I know that you believe that you understand what you think I said, but I am not sure you realize that what you heard is not what I meant.
- Sign on Pentagon office wall
RAID never gets confused. It always knows exactly what the disk said, and it mirrors it exactly! Redundant Array of Independent Disks (RAID) protects against a disk failure. There are many levels of RAID in the data storage industry. The most common level, and one that is used by Teradata, is RAID-1, also called transparent MIRRORING. With RAID-1, each primary disk has a mirror image, or an exact copy of all its data, on another disk. The contents of both disks are identical. Each AMP has one virtual disk, meaning only that AMP can access its disks, but there are actually four physical disks. When data is written on the primary disk, it is also written on the mirror disk. The downside of RAID-1, like FALLBACK, is that it requires a 100% overhead of disk space.
With RAID-1, data is mirrored across paired disks. With RAID-5, data and parity are striped across a rank of disks, and data is reconstructed on a disk failure. Fallback and RAID-1 together provide the highest level of protection.

[Diagram: one AMP connects through a Disk Array Controller to four physical disks that the AMP sees as one virtual disk. Each row is written to a data disk and its mirror:]

Data          Mirror        Data          Mirror
2 Ben Hon     2 Ben Hon     10 Don Roy    10 Don Roy

Four Physical Disks - One Virtual Disk


Cliques
Teradata CLIQUES (pronounced "cleeks") are a method of system protection against the failure of an entire node. Each node contains AMP VPROCs in memory. Each AMP is attached to one virtual disk (Vdisk), and that AMP is the only VPROC allowed access to its Vdisk. A clique provides access to a set of disks from another node. If a node fails, the AMP VPROCs can migrate to the node that has the backup access to their virtual disks. A migrating AMP can continue to read and write to its Vdisk while its home node is down. When the home node is fixed and available again, the VPROCs return home.
If a Teradata system uses two-node cliques, then when one node fails, all of its AMP VPROCs migrate to the other node. The system is now about 50% slower. To solve this problem, Teradata allows bigger cliques, such as eight nodes. If one node fails, its VPROCs split up and migrate amongst the seven other nodes in the clique without much performance degradation.

[Diagram: Node 1 and Node 2 (Intel SMP nodes running AMPs) connected by the BYNET, with clique cables running from each node to both Disk Array Controllers (DACs). If a node fails, VPROCs can migrate to the other node and still have access to their Virtual Disks (Vdisks).]


Cliques: a two-node example


During a node failure, all AMPs migrate from the failing node to another node within
the clique. Vdisks can still be accessed by their AMPs during a failover.
Cliques help protect against node failures, but have nothing to do with how the data is
spread across the AMPs. In a two-, three-, or four-node CLIQUE Teradata system, data
is spread across all AMPs in the system.
In the same way, when a node goes down, the software AMPs and PEs migrate over the
BYNET to a temporary home on another node.

Diagram: Node 1 fails, so its Vprocs migrate over the BYNET to Node 2. Because the
clique cables give Node 2 a path to Node 1's Disk Array Controllers (DACs), the
migrated AMPs still have access to their Virtual Disks (Vdisks).


Cliques: a four-node example


Below is an example of a four-node clique. If a node goes down, the VPROCs in the
failed node will distribute evenly among the remaining nodes in the clique, so
degradation is minimal.
Access to all data is maintained during a failover, and performance degradation is
inversely proportional to clique size: the bigger the clique, the less the performance
degradation.

Diagram: a 4-node clique. Node 1 through Node 4, each with PEs, Memory, and AMPs,
connect to BYNET 0 and BYNET 1, and a 4-node clique cable ties every node to a shared
Disk Array Cabinet holding eight Disk Array Controllers (DACs). A System Mgmt
Chassis and dual power complete the cabinet.

Permanent Journal

The absent are always in the wrong.


English Proverb
If a system had five million rows and used FALLBACK protection, then it would have
five million FALLBACK rows. However, this would be quite costly because
FALLBACK actually stores a duplicate copy of all the rows on other AMPs within the
same cluster. FALLBACK is used either because the system is mission critical or the
system is not backed up regularly. For customers who back up data regularly, another
option for data restoration is the Permanent Journal. When a company is not severely
impacted by the couple of hours needed for a restoration to complete, this is a very good
option. The Permanent Journal works in conjunction with backup procedures, plus it's a
lot more cost-effective than FALLBACK.

The absent are always in the write.


Permanent Journal Proverb
The Permanent Journal stores only images of rows that have been changed due to
an INSERT, UPDATE, or DELETE command. That is why, when data is lost or
absent, the permanent journal can write it back to the disks. The permanent journal keeps
track of all new, deleted, or modified data since the last Permanent Journal backup. This
option is usually less expensive than storing the additional five million FALLBACK
rows.
Like FALLBACK, the Permanent Journal is optional. It may be used on specific tables
of your choosing or on no tables at all. It provides the flexibility to customize a Journal
to meet specific needs. The Permanent Journal must be manually purged from time to
time.
There are four image options for the Permanent Journal:
Before Journal
After Journal
Dual Before Journal
Dual After Journal
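To illustrate, journal options can also be set as database defaults when a database is created, so tables inherit them unless they say otherwise. The sketch below is a hedged example; the database name, space size, and journal table name are made-up assumptions:

```sql
CREATE DATABASE Teratom FROM DBC AS
    PERM = 20000000000                          -- permanent space in bytes (hypothetical)
  , DEFAULT JOURNAL TABLE = Teratom.journals    -- table that holds the journal images
  , DUAL AFTER JOURNAL ;                        -- keep two copies of each after-change image
```

A table created in this database without its own journal clause would then pick up the DUAL AFTER JOURNAL default.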


Table create with Fallback and Permanent Journaling
The example creates the table called Employee in the Teratom database, and it is
FALLBACK protected. A BEFORE Journal and a DUAL AFTER Journal are specified.
Remember that both FALLBACK and JOURNALING default to NO, meaning that if
you don't specify this protection at either the table or database level, the default is NO
FALLBACK and NO JOURNALING.

CREATE TABLE Teratom.employee,
FALLBACK,
BEFORE JOURNAL,
DUAL AFTER JOURNAL
(
 emp        INTEGER
,dept       INTEGER
,lname      CHAR(20)
,fname      VARCHAR(20)
,salary     DECIMAL(10,2)
,hire_date  DATE
)
UNIQUE PRIMARY INDEX(emp);


Locks

Some birds aren't meant to be caged, their feathers are just too bright. And when they
fly away, the part of you that knows it was a sin to lock them up does rejoice.
Shawshank Redemption
You don't lock up a bird, but you always lock a query. Teradata uses a lock manager to
automatically lock at the database, table, or row hash level. Teradata will lock objects
using four types of locks:
Exclusive - Exclusive locks are placed only on a database or table when the object is
going through a structural change. An Exclusive lock restricts access to the object by any
other user. This lock can also be explicitly placed using the LOCKING modifier.
Write - A Write lock happens on an INSERT, DELETE, or UPDATE request. A Write
lock restricts access by other users. The only exception is for users who are reading data,
are not concerned with data consistency, and override the applied lock by specifying an
Access lock. This lock can also be explicitly placed using the LOCKING modifier.
Read - This is placed in response to a SELECT request. A Read lock restricts access by
users who require Exclusive or Write locks. This lock can also be explicitly placed using
the LOCKING modifier. Read locks put the word "integrity" in data integrity. If you
have a multi-user environment with updates occurring and you need to keep data
consistent, you want a Read lock.
Access - Placed in response to a user-defined LOCKING FOR ACCESS phrase. An
Access lock permits the user to READ an object that may already be locked for
READ or WRITE. An Access lock does not restrict access by another user except when
an Exclusive lock is required. A user requesting access cannot be concerned with data
consistency.
When Teradata locks a resource for a user, the lifespan of the transaction lock is forever,
or until the user releases the lock. This is different than a deadlock situation: if two
transactions are deadlocked, the youngest query is always aborted.

Teradata has 4 locks for 3 levels of Locking

When you go into court you are putting your fate into the hands of twelve people
who weren't smart enough to get out of jury duty.
- Norm Crosby
Teradata uses a lock manager to be judge, jury, and executioner of SQL. There are four
locks placed on objects at the database, table, or row hash level.

Diagram: the four locks (Exclusive Lock, Write Lock, Read Lock, Access Lock) can
each be applied at three levels (Database, Table, Row Hash).

Copyright Open Systems Services 2004

Page 71

Chapter 5

Locks and their compatibility

Frankly, my dear, I don't give a damn.
- Rhett Butler, Gone with the Wind (1939)
Not everyone is compatible, and Teradata locks are no exception. Locks that are
compatible can lock the same object simultaneously. Clark Gable would have been a
great Teradata user because he always used a "Rhett Lock" and, according to Scarlett,
was almost never "Write"!
Locks that are compatible can share access to objects simultaneously, so READ locks
are great because one or a thousand users can read the same object at the same time.
Teradata will not allow a user to change a table while others are reading it. This prevents
database corruption.

Teradata Lock     Compatible Locks
Exclusive Lock    No compatibility
Write Lock        Access Lock
Read Lock         Read Lock, Access Lock
Access Lock       Read Lock, Write Lock, Access Lock

An ACCESS Lock is an excellent way to avoid waiting for a write lock currently on a
particular table. Two statements allow this:
Locking Row for Access
Locking Tablename for Access
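For example, a "dirty read" that steps around a Write lock on the Employee table might look like the following. This is an illustrative sketch; the table and column names are assumptions:

```sql
-- Read Employee without waiting for a current Write lock to finish.
-- The rows returned may be mid-update: no data-consistency guarantee.
LOCKING TABLE Teratom.employee FOR ACCESS
SELECT emp, lname, salary
FROM   Teratom.employee;
```

Substituting LOCKING ROW FOR ACCESS requests the same behavior at the row hash level instead of the whole table.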

Chapter 6 - Loading the Data

I don't know who my grandfather was. I am more interested in who his grandson will
become.
Abraham Lincoln, 16th president of the United States
My son once told me he did not feel like studying. I said to him, "When Abraham
Lincoln was your age, he studied by candlelight." My son retorted, "When Abraham
Lincoln was your age, he was president."
Data within a warehouse environment is often historic in nature, so the sheer volume of
data can overwhelm many systems. But, not Teradata!

Abraham Lincoln will go down as one of


the greatest presidents in history, but
Teradata is even better because it will not
go down when it loads history.
Tom Coffing, 1st president of Coffing Data Warehousing
Teradata is so advanced in the data-loading department that other database vendors can't
hold a candle to it. A Teradata data warehouse brings enormous amounts of data into the
system. This is an area that most companies overlook when purchasing a data warehouse.
Most company officials think loading of data is simply that: just loading data. Some
people actually ask, "Are data loads that critical?" Come on, ASCII stupid question and
get a stupid ANSI.
Data warehouses fail because customers cannot load the data fast enough once it reaches a
certain volume. As one Teradata developer said, "It is not the load that brings them
down, but the way they carry it." Even an experienced body builder must use good
technique to lift the weight over his head. While most database vendors are new to the
data warehouse game, Teradata has had 15 years of experience loading the largest data
warehouses in the world. The combination of FastLoad, MultiLoad, and TPump can load
millions, even billions of records in record time.
FastLoad is designed to load flat file data from a mainframe or LAN directly into an
empty Teradata table. This is how a Teradata table is populated the first time. I have
personally seen Teradata load over one billion large rows in less than 6 hours. Plus, I
have seen Teradata load millions of rows in minutes. How are Teradata's speed and
performance accomplished? Once again, it's through the power of parallel processing.
Where FastLoad is meant to populate empty tables with INSERTs, MultiLoad is meant to
process INSERTs, UPDATEs, and DELETEs on tables that have existing data.
MultiLoad is extremely fast. One major Teradata data warehouse company processes
120 million inserts, updates, and deletes nightly during its batch window.
The TPump utility is designed to allow OLTP transactions to immediately load into a
data warehouse. When I started working with Teradata, more than 10 years ago, most
companies loaded data on a monthly basis. Suddenly, companies began to load data
weekly.
Today, most companies load data nightly, and industry leaders are loading data hourly.
TPump is the beginning step of an Active Data Warehouse (ADW). ADW combines
OLTP transactions with the power of a Decision Support System (DSS).
The TPump utility theoretically acts like a water faucet. TPump can be set to full throttle
to load millions of transactions during off peak hours or turned down to trickle small
amounts of data during the data warehouse daily rush hour. It can also be automatically
preset to load levels at certain times during the day, and can be modified at any time.
Also, TPump locks at a row level so users have access to the rest of the rows while the
table is being loaded. Another advantage of this load utility is that it allows for multiple
updates to be conducted on a table simultaneously.
When the utilities start, the Parsing Engine comes up with a plan for the AMPs. The
Parsing Engine then steps back and lets the AMPs do their work. The data is loaded in
large 64K blocks. Each AMP is given a 64K block of rows for loading. Like a line of
workers trying to pass sand bags to prevent a flood, Teradata passes these blocks from
AMP to AMP until all the data is on Teradata. Next, all AMPs take the blocks they
received and hash the Primary Index value sending the rows over the BYNET to their
destination AMP. Once this is done, each AMP sorts its data by Row ID and the table is
ready for business.


FastLoad

If you are all wrapped up in yourself, you are overdressed.
Kate Halverson
The Teradata FastLoad utility is wrapped up in your data, and even though it appears
underdressed without fancy dressings, it is one of the best utilities ever built. It may not
be dressed to kill, but it is designed to thrill!
FastLoad is actually designed to load flat file data from a mainframe or LAN directly
into an empty Teradata table. This is how a Teradata table is populated the first time. I
have personally seen Teradata load over one billion large rows in less than 6 hours. Plus,
I have seen Teradata load millions of rows in minutes. Teradata has the quickest time to
solution and the most powerful performance in the data warehousing industry.
How are Teradata's speed and performance accomplished? It's done through parallel
processing.
FastLoad understands one SQL command - INSERT. It inserts rows into an empty table.
The process is as follows: a flat file is prepared for loading on a mainframe or LAN.
The FastLoad utility needs three pieces of information to process: where the flat file is
located, what its file definition is, and what table in Teradata the data should be loaded
into.
When the FastLoad utility starts, the Parsing Engine comes up with a plan for the AMPs.
The Parsing Engine then steps back and lets the AMPs do their work. The data is loaded
in large 64K blocks. Each AMP is given a 64K block of rows for loading. Like a line of
workers trying to pass sand bags to prevent a flood, Teradata passes these blocks from
AMP to AMP until all the data is on Teradata. Next, all AMPs take the blocks they
received, hash the rows in those blocks (in parallel) and send the rows to the proper AMP
over the BYNET. Once this is done, each AMP sorts its data by Row ID and the table is
ready for business.
FastLoad Basics:

Loads data to Teradata from a Mainframe or LAN flat file;


Only one table may be loaded at a time;
The table to be loaded must be empty;
There can be no secondary indexes, referential integrity, or triggers;
It locks at the table level.

FastLoad populates empty tables at the block level. Teradata LOADs using FastLoad.
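As a rough sketch, a FastLoad job script ties those three pieces of information together. This is a hedged example, not production code; the logon string, file name, field names, and record format are all assumptions:

```sql
LOGON tdpid/username,password;          /* where Teradata is and who you are      */
SET RECORD VARTEXT ",";                 /* flat file format: comma-delimited text */
DEFINE emp   (VARCHAR(11))              /* file definition, field by field        */
     , dept  (VARCHAR(11))
     , lname (VARCHAR(20))
FILE = employee.txt;                    /* where the flat file is located         */
BEGIN LOADING Teratom.employee          /* the empty target table                 */
      ERRORFILES Teratom.emp_err1, Teratom.emp_err2;
INSERT INTO Teratom.employee VALUES (:emp, :dept, :lname);
END LOADING;
LOGOFF;
```

The two error tables catch rows that fail conversion or violate the unique primary index, which is why FastLoad can keep streaming blocks without stopping on a bad row.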


FastLoad Picture

Diagram: an input file on a mainframe or LAN flows into Teradata as 64K blocks
through the PE to the AMPs, each of which loads its block into an empty table.
FastLoad inserts into empty tables at the block level. No Secondary Indexes,
Referential Integrity, or Triggers are allowed.

Multiload

No wonder nobody comes here. It's too crowded.
Yogi Berra
Tera-Tom has actually had dinner with Yogi, and he was a real pleasure. As an
All-American athlete who placed third in the NCAAs for the University of Arizona in 1979,
Tera-Tom got to spend some time with Yogi. Yogi is a lot like Multiload: he is fast on
his feet, is extremely versatile, and he knows a little bit about clean-up. Multiload can
handle the high heat or the curve when inserting, updating, or deleting data.
Where FastLoad is meant to populate empty tables with INSERTS, Multiload is meant to
process INSERTS, UPDATES, and DELETES on tables that have existing data.
Multiload is extremely fast. One major Teradata data warehouse company processes 120
million inserts, updates, and deletes during its nightly batch.
Multiload works similar to FastLoad. Data originates as a flat file on either a mainframe
or LAN. When the Multiload utility is executed, the Parsing Engine creates a plan for the
AMPs to follow. The data is then passed to the AMPs, in parallel, in 64K blocks, and the
AMPs hash the rows to the proper AMP. Last, the INSERTS, UPDATES, and
DELETES are applied.
In the Multiload picture, the mainframe/LAN is talking to the Parsing Engine. The PE
passes the data across the BYNET for the AMPs to retrieve. Keep in mind, many
systems have hundreds to thousands of AMPs. The load takes place, continually, in
parallel as the 64K blocks are delivered to the AMPs. Multiload has been designed
for users who have a "need for speed." Multiload locks at the table level. Therefore,
while Multiload is running, the table is unavailable unless users utilize an Access Lock.
Multiload Basics:

Loads data to Teradata from a Mainframe or LAN flat file;


Up to 20 INSERTS, UPDATES, or DELETES may be executed on up to 5 tables;
Receiving tables are usually populated;
There can be no Unique secondary indexes, referential integrity, or triggers;
It locks at the table level.

Multiload loads to populated tables at the block level. Teradata UPDATEs using
MULTILOAD.
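A minimal Multiload script follows the same pattern as a FastLoad script but adds a restart log table and labeled DML. The sketch below is illustrative only; the log table, layout, field, and file names are assumptions:

```sql
.LOGTABLE Teratom.emp_mload_log;        /* restart log used for checkpointing      */
.LOGON tdpid/username,password;
.BEGIN MLOAD TABLES Teratom.employee;   /* up to 5 target tables may be listed     */
.LAYOUT emp_layout;                     /* describes the incoming flat-file record */
.FIELD emp    * VARCHAR(11);
.FIELD salary * VARCHAR(12);
.DML LABEL upd_sal;                     /* one of up to 20 labeled DML statements  */
UPDATE Teratom.employee SET salary = :salary WHERE emp = :emp;
.IMPORT INFILE employee.txt
        LAYOUT emp_layout
        APPLY upd_sal;                  /* apply the labeled DML to each record    */
.END MLOAD;
.LOGOFF;
```

The log table is what lets an interrupted Multiload job restart from its last checkpoint instead of starting over.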


Multiload Picture

Diagram: an input file on a mainframe or LAN flows into Teradata as 64K blocks
through the PE to the AMPs and their populated tables. Multiload inserts, updates,
upserts, and deletes rows in populated tables at the block level. It does not allow
Triggers, Unique Secondary Indexes (USIs), or Referential Integrity.

TPump

You don't drown by falling into the water; you drown by staying in the water.
- Edwin Louis Cole
The TPump utility is designed to allow OLTP transactions to immediately load into a
data warehouse. When I started working with Teradata, more than 10 years ago, most
companies loaded data on a monthly basis. Suddenly, companies began to load data
weekly. Today, most companies load data nightly, and industry leaders are loading data
hourly. TPump is the beginning step of an Active Data Warehouse (ADW). An ADW
combines OLTP transactions with a Decision Support System (DSS).
If the data is not flowing, a company can drown in it! The utility is called TPump because
it theoretically acts like a water faucet. TPump can be set to full throttle to load millions
of transactions during off peak hours or turned down to trickle small amounts of data
during the data warehouse rush hour. It can also be automatically preset to load different
levels at certain times during the day, and can be modified at any time.
Also, TPump locks at a row level so users have access to the rest of the rows while the
table is being loaded.
Basics:

Loads data to Teradata from a Mainframe or LAN flat file;


Processes INSERTS, UPDATES, or DELETES;
Tables are usually populated;
It can have secondary indexes, triggers, and referential integrity;
It locks at the row level.

TPump is used for continuous updates to rows in a table. Teradata STREAMs using
TPump.
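A TPump script looks much like a Multiload script, with extra knobs on the BEGIN LOAD statement that set the "faucet": PACK (statements bundled per request) and RATE (statements per minute). Again, this is a hedged sketch with assumed names:

```sql
.LOGTABLE Teratom.emp_tpump_log;
.LOGON tdpid/username,password;
.BEGIN LOAD SESSIONS 4
       ERRORTABLE Teratom.emp_tpump_err
       PACK 20                          /* rows packed into one multi-statement request */
       RATE 600;                        /* max statements per minute - the faucet       */
.LAYOUT emp_layout;
.FIELD emp  * VARCHAR(11);
.FIELD dept * VARCHAR(11);
.DML LABEL ins_emp;
INSERT INTO Teratom.employee (emp, dept) VALUES (:emp, :dept);
.IMPORT INFILE employee.txt
        LAYOUT emp_layout
        APPLY ins_emp;
.END LOAD;
.LOGOFF;
```

Turning RATE down is what produces the "trickle" feed described above; turning it up opens the faucet for off-peak hours.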


TPump Picture

Diagram: packets from a mainframe or LAN input file flow through the PE to the
AMPs and their populated tables, with row-level locks on each AMP. TPump inserts,
updates, upserts, and deletes rows in populated tables at the row level. It supports
Triggers, all Secondary Indexes, and Referential Integrity.

FastExport

The most exciting phrase to hear in science, the one that heralds the most
discoveries, is not "Eureka!" but "That's funny..."
Isaac Asimov
The most exciting words when loading or unloading data are "That fast!" Put a seat belt
on before running FastExport, because this utility will blow your socks off.
FastExport is designed to export Teradata data to a flat file on a mainframe or LAN.
FastExport merely takes an SQL SELECT command and places the output on a host.
FastExport can export data from multiple tables and exports the data to a host file.
Teradata LOADs using FASTLOAD
Teradata UPDATEs using MULTILOAD
Teradata STREAMs using TPump
Teradata Exports using FASTEXPORT
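A FastExport job is essentially a SELECT wrapped in export commands. The sketch below is illustrative; the log table, session count, and file names are assumptions:

```sql
.LOGTABLE Teratom.emp_fexp_log;
.LOGON tdpid/username,password;
.BEGIN EXPORT SESSIONS 4;               /* parallel sessions pulling the answer set */
.EXPORT OUTFILE employee_out.txt;       /* host file on the mainframe or LAN        */
SELECT e.emp, e.lname, e.salary
FROM   Teratom.employee e;              /* any SELECT, including multi-table joins  */
.END EXPORT;
.LOGOFF;
```

Because the AMPs build the answer set in parallel and stream it out in blocks, FastExport is the unload-side mirror of FastLoad.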


FastExport Picture

Diagram: FastExport uses a SELECT statement to retrieve rows from one or more
populated tables; the AMPs return the result set through the PE, and the output is
written to a host file on a mainframe or LAN.

Chapter 7 - Secondary Indexes

I don't skate to where the puck is, I skate to where I want the puck to be.
Wayne Gretzky
What Wayne Gretzky is saying is that he finds the best path to the goal and expects the
puck to be there when he arrives for the shot. Secondary indexes are similar because they
define a path that will deliver the data quickly to meet the user's expected goals. A
secondary index is an alternate path to the data. Secondary indexes can be defined as a
Unique Secondary Index (USI) or a Non-Unique Secondary Index (NUSI). Without any
secondary indexes, your data warehouse could be skating on thin ice!
When it comes to working with large amounts of centrally located data, performance in
accessing that data is key. So what can a user do to influence the way data is accessed?
The first rule of thumb, which is essential when working with centralized databases
today, is to know your data. Second, understand how Teradata manages data distribution
and what a user can do to enhance performance. A query that utilizes a Primary Index in
the WHERE clause is the fastest path to the data. A query that utilizes a Secondary Index
provides an alternate path to the data and is the second fastest access method. This
chapter is dedicated to secondary indexes.
Secondary Indexes
Secondary Indexes provide another path to access data. Let's say that you were planning
a road trip to your hometown. To determine the best way to get there, you utilize a map.
This map will give you many alternatives for planning your trip. In this case, you need
to get there ASAP, so you choose the route that gets you there in the shortest period of
time. Secondary indexes work very similarly to the example above because they provide
another path to the data. Teradata allows up to 32 secondary indexes per table. Keep in
mind that the base table data rows aren't redistributed when secondary indexes are
defined. Secondary indexes reside in a subtable and are stored on all AMPs, which is
very different from how the primary index rows (part of the base table) are stored. Keep
in mind that Secondary Indexes (when defined) do take up additional space.
Secondary Indexes are frequently used in a WHERE clause. The Secondary Index can be
changed or dropped at any time. However, because of the overhead for index
maintenance, it is recommended that index values should not be frequently changed.
There are two different types of Secondary Indexes, Unique Secondary Index (USI), and
Non-Unique Secondary Index (NUSI). Unique Secondary Indexes are extremely
efficient. A USI is considered a two-AMP operation. One AMP is utilized to access the
USI subtable row (in the Secondary Index subtable) that references the actual data row,
which resides on the second AMP.
A Non-Unique Secondary Index is an All-AMP operation and will usually require a spool
file. Although a NUSI is an All-AMP operation, it is faster than a full table scan.
Secondary indexes can be useful for:

Satisfying complex conditions

Processing aggregates

Value comparisons

Matching character combinations

Joining tables

Below is a general illustration of a secondary index subtable row:

Secondary Index Subtable Columns
Secondary Index Value    (actual length of the value)
Secondary Index Row-ID   (8 bytes)
Primary Index Row-ID     (8 bytes)

Unique Secondary Index (USI)

Measure a thousand times and cut once.


-Turkish Proverb
Secondary Indexes provide an alternate path to the data and should be used for queries
that run thousands of times. Teradata runs extremely well without secondary indexes, but
since secondary indexes use up space and overhead, they should only be used for
KNOWN QUERIES, or queries that are run over and over again. Once you know the
data warehouse environment, you can create secondary indexes to enhance its
performance.

Measure a thousand query times and


create a secondary index.
-Turkish Teradata Certified Professional
Whenever a secondary index is created, Teradata creates a secondary index subtable on
each AMP. All secondary index subtables contain:
Secondary Index Value
Secondary Index Row ID
Primary Index Row ID
A UNIQUE Secondary Index (USI) will improve data retrieval and can also be used to
enforce uniqueness on a primary key. Typically, only two AMPs are used on a
Unique Secondary Index (USI) access.
A Non-Unique Secondary Index (NUSI) is AMP local and is an All AMP operation,
but not a full table scan.
Four major index types in Teradata are the Join Index, Hash Index, Sparse Index, and
Value Ordered Index.
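Creating either secondary index type discussed in this chapter is a one-statement operation. The statements below are illustrative, using the Employee table from the examples that follow:

```sql
-- USI: enforces uniqueness and gives a two-AMP retrieval path
CREATE UNIQUE INDEX (soc_security) ON Teratom.employee;

-- NUSI: non-unique alternate path; retrieval is all-AMP but not a full table scan
CREATE INDEX (fname) ON Teratom.employee;
```

Either index can be dropped at any time with DROP INDEX, which removes its subtables from every AMP.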


USI Subtable Example


When a USI is designated on a table, each AMP will build a subtable to point back to
the base table. If you create 32 USI indexes on a table, then each AMP will build 32
separate subtables. Therefore, choose your Secondary Indexes wisely, because space is
used when these indexes are created. When a user inputs SQL that utilizes a USI in the
WHERE clause, Teradata knows that either one row or no rows can be returned. The
reason: the column in the WHERE clause is unique. The following example illustrates
how a USI subtable is created and how it works to speed up queries.

Employee Table with Unique Secondary Index (USI) on Soc_Security

AMP 1 - Employee Base Table
ROW ID  Emp  Dept  Fname  Lname  Soc_Security
04,1    88   20    John   Marx   276-68-2130
18,1    75   10    Mary   Mavis  235-83-8712
25,1    15   30    John   Davis  423-87-8653

AMP 1 - Secondary Index Subtable
Secondary Index Value  Secondary Index Row-ID  Base Table Row-ID
123-99-8888            102,1                   45,1
146-69-2650            118,1                   14,1
235-83-8712            134,1                   18,1

AMP 2 - Employee Base Table
ROW ID  Emp  Dept  Fname  Lname  Soc_Security
14,1    45   10    Max    Wiles  146-69-2650
38,1    32   10    Will   Berry  212-53-4532
45,1    65   40    Oki    Ngu    123-99-8888

AMP 2 - Secondary Index Subtable
Secondary Index Value  Secondary Index Row-ID  Base Table Row-ID
276-68-2130            121,1                   04,1
423-87-8653            138,1                   25,1
212-53-4532            144,1                   38,1

When a USI is created, Teradata will immediately build a secondary index subtable on
each AMP.
Each AMP will then hash the secondary index value for each of the rows in its portion
of the base table. In our example, each AMP hashes the Soc_Security column for all
employee rows it holds.
The output of the Soc_Security hash will utilize the hash map to point to a specific
AMP, and that AMP will hold the secondary index subtable row for the secondary
index value.


How Teradata retrieves a USI query

When a USI is used in the WHERE clause of an SQL statement, the PE Optimizer
recognizes the Unique Secondary Index. It will perform a two-AMP operation to find
the base row. Teradata knows it is looking for only one row, and it can find it easily. It
will hash the secondary index value, and the hash map will point to the AMP where the
row resides in the subtable. The subtable row holds the base table Row-ID, so Teradata
can then find the base row immediately.

SELECT * FROM Employee
WHERE Soc_Security = '123-99-8888';
Diagram: the two-step USI retrieval for Soc_Security '123-99-8888'.
Step 1: Hash the value '123-99-8888'. The row hash output points to a bucket in the
Hash Map, which locates the AMP holding the subtable row for that value.
Step 2: Go to the AMP where the Hash Map points. Locate the '123-99-8888' row in
the Secondary Index Subtable and read the Base Table Row-ID, then use the Base
Table Row-ID to retrieve the base row from the AMP that owns it.


NUSI Subtable Example


When a Non-Unique Secondary Index (NUSI) is designated on a table, each AMP will
build a subtable. The NUSI subtable is said to be AMP local because each AMP creates
its secondary index subtable to point to its own base rows. In other words, every row in
an AMP's NUSI subtable will reflect and point to the base rows it owns. When a user
inputs SQL that utilizes a NUSI in the WHERE clause, Teradata will have each AMP
check its subtable to see if it has any qualifying rows. Only the AMPs that contain the
needed values will be involved in the actual retrieve.

Employee Table with Non-Unique Secondary Index (NUSI) on Fname


AMP 1 - Employee Base Table
ROW ID  Emp  Dept  Fname  Lname  Soc_Security
04,1    88   20    John   Marx   276-68-2130
18,1    75   10    Mary   Mavis  235-83-8712
25,1    15   30    John   Davis  423-87-8653

AMP 1 - Secondary Index Subtable
Secondary Index Value  Secondary Index Row-ID  Base Table Row-ID
John                   145,1                   04,1  25,1
Mary                   156,1                   18,1

AMP 2 - Employee Base Table
ROW ID  Emp  Dept  Fname  Lname  Soc_Security
14,1    45   10    Max    Wiles  146-69-2650
38,1    32   10    Will   Berry  212-53-4532
45,1    65   40    Oki    Ngu    123-99-8888

AMP 2 - Secondary Index Subtable
Secondary Index Value  Secondary Index Row-ID  Base Table Row-ID
Max                    134,1                   14,1
Will                   157,1                   38,1
Oki                    159,1                   45,1

When a NUSI is created, Teradata will immediately build a secondary index subtable
on each AMP.
Each AMP will hold the secondary index values for its own base table rows only. In
our example, each AMP holds the Fname column for all employee rows in the base
table on that AMP (AMP local).
Each AMP-local Fname will have the Base Table Row-ID (pointer) so the AMP can
retrieve it quickly if needed. If an AMP contains duplicate first names, only one
subtable row for that name is built, with multiple Base Row-IDs.

How Teradata retrieves a NUSI query


When a NUSI is used in the WHERE clause of an SQL statement, the PE Optimizer
recognizes the Non-Unique Secondary Index. It will perform an all-AMP operation in
which each AMP looks into its subtable for the requested value. If an AMP contains the
value, it will continue participation; if it does not, it will no longer participate. A
NUSI query is an all-AMP operation, but not a Full Table Scan (FTS).

SELECT * FROM Employee
WHERE Fname = 'John';
Diagram: the NUSI retrieval for Fname 'John'.
Step 1: Hash the value 'John' for speed. Each AMP takes the row hash for John and
checks its subtable to see if it has a John.
Step 2: Any AMP that does not contain the name John no longer participates in the
query. Every AMP that does contain a John uses the Base Table Row-IDs in its
subtable row to retrieve its John rows.


Value Ordered NUSI


When a Value Ordered Non-Unique Secondary Index (Value Ordered NUSI) is
designated on a table, each AMP will build a subtable. The NUSI subtable is said to be
AMP local because each AMP will create its secondary index subtable to point to its own
base rows. In other words, every row in an AMP's NUSI subtable will reflect and point to
the base rows it owns. It is called a Value Ordered NUSI because instead of the subtable
being sorted by the hash of the Secondary Index Value, it is sorted numerically by the
value itself.
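Such an index might be created with a statement along these lines. This is a sketch assuming V2R5's ORDER BY VALUES clause; Employee and Dept come from the example below.

```sql
CREATE INDEX (Dept) ORDER BY VALUES (Dept) ON Employee;
```

The ORDER BY VALUES clause is what tells Teradata to sort the subtable by the Dept value instead of by row hash.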

Employee Table with Value Ordered Non-Unique Secondary Index on Dept
AMP 1 - Employee Base Table
Row-ID  Emp  Dept  Fname  Lname  Soc_Security
04,1    88   20    John   Marx   276-68-2130
18,1    75   10    Mary   Mavis  235-83-8712
25,1    15   30    John   Davis  423-87-8653

AMP 1 - Secondary Index Subtable (sorted by Dept value)
Secondary    Secondary     Base Table
Index Value  Index Row-ID  Row-ID
10           145,1         18,1
20           156,1         04,1
30           158,1         25,1

AMP 2 - Employee Base Table
Row-ID  Emp  Dept  Fname  Lname  Soc_Security
14,1    45   10    Max    Wiles  146-69-2650
38,1    32   10    Will   Berry  212-53-4532
45,1    65   40    Oki    Ngu    123-99-8888

AMP 2 - Secondary Index Subtable (sorted by Dept value)
Secondary    Secondary     Base Table
Index Value  Index Row-ID  Row-ID
10           145,1         14,1  38,1
40           159,1         45,1

When a Value Ordered NUSI is created, Teradata will immediately build a
secondary index subtable on each AMP and sort it by value.

Each AMP will hold the secondary index values for its rows in the base table
only. In our example, each AMP holds the Dept column for all employee rows in
the base table on that AMP (AMP local).

Each AMP-local Dept value will have the Base Table Row-ID (pointer) so the AMP
can retrieve the row quickly if needed. This is excellent for range queries
because the subtable is sorted numerically by Dept.


How Teradata retrieves a Value Ordered NUSI query


When a Value Ordered NUSI is used in the WHERE clause of an SQL statement, the PE
Optimizer recognizes the Value Ordered Non-Unique Secondary Index. It will perform
an all-AMP operation to look into the AMP-local subtable for the requested value. It is
excellent at checking ranges because all subtable rows are in order. If an AMP contains
the value or values requested it will continue participation. If it does not contain the
requested value or values it will no longer participate. A Value Ordered NUSI query is
an all-AMP operation, but very seldom a Full Table Scan (FTS). A Value Ordered NUSI
must be non-unique and it must be a numeric data type. A DATE column is
considered numeric and therefore may be a Value Ordered NUSI.

SELECT * FROM Employee
WHERE Dept BETWEEN 10 AND 20;

Step 1: Check the subtable for Dept values ranging from 10 to 20.

Step 2: If the AMP has qualifying rows then retrieve the rows in the range. If no
rows are found then the AMP should no longer participate in the query.

Employee Table with Value Ordered Non-Unique Secondary Index on Dept
AMP 1 - Employee Base Table
Row-ID  Emp  Dept  Fname  Lname  Soc_Security
04,1    88   20    John   Marx   276-68-2130
18,1    75   10    Mary   Mavis  235-83-8712
25,1    15   30    John   Davis  423-87-8653

AMP 1 - Secondary Index Subtable (sorted by Dept value)
Secondary    Secondary     Base Table
Index Value  Index Row-ID  Row-ID
10           145,1         18,1
20           156,1         04,1
30           158,1         25,1

AMP 2 - Employee Base Table
Row-ID  Emp  Dept  Fname  Lname  Soc_Security
14,1    45   10    Max    Wiles  146-69-2650
38,1    32   10    Will   Berry  212-53-4532
45,1    65   40    Oki    Ngu    123-99-8888

AMP 2 - Secondary Index Subtable (sorted by Dept value)
Secondary    Secondary     Base Table
Index Value  Index Row-ID  Row-ID
10           145,1         14,1  38,1
40           159,1         45,1


Secondary Index Summary


You can have up to 32 secondary indexes for a table.

Secondary Indexes provide an alternate path to the data.

The two types of secondary indexes are USI and NUSI.

Every secondary index defined causes each AMP to create a subtable.

USI subtables are hash distributed.

NUSI subtables are AMP local.

USI queries are Two-AMP operations.

NUSI queries are All-AMP operations, but not Full Table Scans.

Value-Ordered NUSIs can be any non-unique index of integer type.

Always Collect Statistics on all NUSI indexes.

The PE will decide if a NUSI is strongly selective and worth using over a
Full Table Scan.

Use the Explain function to see if a NUSI is being utilized or if bitmapping
is taking place.
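Two of the points above can be sketched in SQL; Employee and Fname come from this chapter's examples.

```sql
/* Give the PE demographics so it can judge NUSI selectivity */
COLLECT STATISTICS ON Employee INDEX (Fname);

/* Ask the PE for its plan without running the query */
EXPLAIN
SELECT * FROM Employee
WHERE Fname = 'John';
```

If the EXPLAIN output mentions the secondary index, the PE judged the NUSI worth using; otherwise it chose a Full Table Scan.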


Chart for Primary and Secondary Access


The chart below shows that Primary Index access is a one-AMP operation. For Unique
Secondary Index (USI) access, it is a two-AMP operation. For Non-Unique Secondary
Index (NUSI) access, it is an all-AMP operation, but not a Full Table Scan (FTS). Keep
this chart near and dear to your heart.

Index   Number of AMPs   Rows Returned
UPI     One              0-1
NUPI    One              0-Many
USI     Two              0-1
NUSI    All              0-Many


Chapter 8 The Active Data Warehouse

Only he who attempts the ridiculous may


achieve the impossible.
Don Quixote
For years it was believed that some computer systems were designed for Online
Transaction Processing (OLTP) and others for Decision Support. IBM's DB2 and
Oracle were originally designed for quick transactions in an OLTP world.
Teradata was originally designed for Decision Support (DSS) in the data warehousing
world. Teradata has attempted what many once felt was the ridiculous by combining
OLTP quick transactions with the power of DSS to achieve the impossible. This
incredible concept is called the Active Data Warehouse.
Here is how the Active Data Warehouse evolved. Back in the early 1990s
companies loaded new data into the data warehouse on a monthly basis. This was pretty
much the standard practice. As competition grew, companies decided they needed an
edge and began to load data on a weekly basis. It was only a matter of time before most
companies were doing nightly loads. Now, companies want to load data in near
real-time. What advantage does this bring?
The Active Data Warehouse allows companies to take their OLTP transactions and load
them into the data warehouse in near real-time so users can analyze data and make
decisions before their competitors.
Some of the characteristics of an active data warehouse environment are mission critical
applications, tactical queries and a need for 24/7 reliability.
Active data warehouses provide scalability in order to support large amounts of detail
data. Users are allowed to update the operational data store directly, and an integrated
environment supporting a wide mix of queries is created.


OLTP Environments

Always be a first-rate version of yourself,


instead of a second-rate version of
somebody else.
-Judy Garland
OLTP environments are quite different from DSS environments. OLTP environments
involve many quick transactions while DSS environments have long transactions. A
transaction is considered a logical unit of work.
In an OLTP environment transactions typically occur in seconds and not minutes. The
number of rows per transaction is also smaller. There is a great deal of writing, but
because the rows are small it is not considered write intensive.
OLTP applications will utilize very little I/O processing to complete transactions and
most often only access a few of many possible tables. For example, updating a checking
or savings account to reflect a deposit or withdrawal would only affect one or two
tables, and only one or two rows would be updated.
An example of an OLTP transaction is going to a retail store and buying a pencil or
making an ATM money withdrawal from your local bank.
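In Teradata's BT/ET (Begin Transaction/End Transaction) style, the ATM withdrawal above could be sketched like this; the table and column names are hypothetical illustrations, not from this book's examples.

```sql
BT;  /* Begin Transaction: the statements succeed or roll back together */

UPDATE Checking_Acct
SET Balance = Balance - 100.00
WHERE Account_Nbr = 12345678;

ET;  /* End Transaction: commit the withdrawal */
```

One or two rows in one table: quick, small, and typical of OLTP work.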
You know you have arrived at an active data warehousing environment when you have
Analytical Modeling, continuous updates and event-based triggering.


The DSS environment

We're going to have the best-educated


American people in the world.
Dan Quayle
If Dan Quayle had had a Teradata system he would probably be president.
Instead he is often considered the potatoe head of vice presidents.
Teradata is designed around Decision Support (DSS). If you were designing a data
warehouse for a customer to use for strategic long-range planning and answering
"what if" questions, you would definitely want a DSS system.
The DSS environment has many users asking a wide variety of questions. Most of the
questions involve reading records so READ locks are primarily used.
With DSS environments most queries take minutes to hours. The transaction usually
involves multiple tables and millions of rows. DSS environments can be brought to their
knees if they have to continually wait due to locks on the system.
A true data warehousing environment will need to support three types of environments
for Pre-defined Reports, Ad Hoc Queries, and Data Mining and Analytical Modeling.
Data Warehouse Environments

Pre-defined Reports

Ad Hoc Queries

Data Mining

Analytical Modeling


Mixing OLTP and DSS environments

Am I not destroying my enemies when I


make friends of them?
-Abraham Lincoln
Teradata makes friends of a data warehouse's worst enemy: OLTP transactions.
In an OLTP world the query times are predictable. In the DSS world the query times are
unpredictable. OLTP depends on throughput and DSS depends on power. Mixing the
environments is difficult. This is especially true because OLTP environments do a lot of
writes where DSS does a lot of reads.
OLTP queries run quickly and are often called Tactical Queries. An example of a tactical
query might be altering a campaign based on current results or determining the best
offer for a specific customer.
Diagram: Tactical queries place WRITE locks on the table while the DSS query
needs a READ lock.

An Active Data Warehouse consists of short tactical OLTP-type queries mixed with large
Decision Support queries. The OLTP queries like to WRITE lock the data, which is bad
when other queries only need to READ the data.
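One Teradata technique for letting a DSS read coexist with tactical writes is the ACCESS lock, which reads through WRITE locks at the cost of possibly seeing uncommitted data. A sketch, reusing the Employee table from earlier chapters:

```sql
/* ACCESS lock: do not wait behind tactical WRITE locks */
LOCKING TABLE Employee FOR ACCESS
SELECT Dept, COUNT(*)
FROM Employee
GROUP BY Dept;
```

The trade-off is a "dirty read": acceptable for broad analytic counts, not for balancing the books.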
The evolution of a true data warehouse takes time, and the data warehouse activities will
naturally evolve towards an active data warehouse. In the beginning the warehouse is
used for analyzing, which over time evolves into predicting and finally into
operationalizing.


Detail Data

Can't died when Could was born.


--- Author Unknown
Detail data is the foundation for a great data warehouse. For years most companies said
they Couldn't keep that much data when the real fact was that the database Couldn't
handle it. Teradata said Could while the others said Can't! Some companies are now
processing over 50 Terabytes of raw data.
The ability to use detail data and Ad Hoc Queries, as well as the decreased need for
summary data, are a few aspects of DSS environments that have gained importance.
Detail data is the cornerstone of a good warehouse. Without detail data users can't dig
into the details. If they ask a question and get a summarized answer they can check the
detail for the explanation.
In the past, detail data was not used as often because most systems did not have the
power to read millions of records, sort millions of records, execute full table scans, and
perform aggregations on millions of rows.
Teradata has always been great with the detail and continues the tradition today.


Easy System Administration

I have had dreams and I have had


nightmares. I overcame my nightmares
because of my dreams
--- Author Unknown
A data warehouse brings dreams of turning data into information and saving the
corporation millions of dollars. A data warehouse brings nightmares to someone who has
to administer and manage this dynamic and daily growing giant. Most data warehouses
require from 4 to 10 system administrators working frantically around the clock. I always
recommend two system administrators for Teradata data warehouses. Why two? In case
one gets hit by a bus! Now, this is a dream come true.
With most databases, the system administrator is responsible for setting up the database,
placing and partitioning data, running database reorganizations, and tuning queries. This
is a tremendous responsibility, especially when dealing with large amounts of data in a
complex data warehouse environment. Plus, data warehouses on average are doubling in
size each year. Teradata was designed to let the system manage these functions and the
larger the database, the more Teradata outshines the competition.
I have travelled around the globe from one corporation to another training the world on
data warehousing and thousands of people on Teradata. The topics include system
administration, load utilities, architecture, SQL, and operations. Students who have
experience with other databases literally think I am out of my mind when I explain the
Teradata database. They say things like, "If loading data and system administration is
that easy, why isn't everyone doing it?" The answer is simple. Teradata was originally
designed around parallel processing with hands-off operations to work in conjunction
with large amounts of mainframe data. This capability must be built into the original
design, and most databases overlooked it.
The DBA never has to do Database Reorganizations, never has to pre-prepare data
to load, and never has to pre-allocate table space.


Data Marts

I have found the best way to give advice to


your children is to find out what they want
and then advise them to do it.
--Harry S. Truman
Data Marts are always designed for a particular use and will contain either summary
data or detailed data for that use. Because they have a particular use they are
designed for speed.

Diagram: A Logical Data Mart carved out of the Data Warehouse's tables of
detail data and tables of summary data.

There are two types of data marts: logical and physical. A logical data mart is an
existing part of the data warehouse, but a physical data mart resides on another
platform.


Teradata Tools - SQL Assistant

He who asks a question may be a fool for


five minutes, but he who never asks a
question remains a fool forever.
Unknown
SQL Assistant is a tool that allows users to become cool. Nothing makes a user cooler
than positively affecting the company bottom line. SQL Assistant allows access to
Teradata and other databases as well. SQL Assistant is how users submit their SQL,
and soon questions are being answered and a data warehouse genius is born.


TDQM

Be not afraid of going slowly,


Be afraid of standing still.
- Chinese Proverb
The wrong mix of Teradata queries can make users afraid because the system can not
only slow down, but appear to be standing still. TDQM uses rules to make sure your
system doesn't stand still or go slowly. Teradata Database Query Manager (TDQM)
provides users with the ability to schedule SQL requests for a later time using the
Teradata DQM Scheduled Request Viewer. TDQM automatically manages system
workflow by stopping queries from executing if they violate predefined rules. TDQM
can limit certain types of joins and can even control access to certain database objects.
Queries can be delayed or cancelled based on predefined rules and set thresholds.
The TDQM server can be started or stopped through the Control Panel Services
application or from the TDQM Scheduled Requests Operations Utility.
The TDQM Scheduled Requests Operations utility has the following menus:
File, Configuration, Server, Information, Error Log, and Help.

TDQM allows a period of time to be established when TDQM can execute scheduled
requests that are waiting to run. This is usually done during off-peak hours. TDQM
schedules jobs, each of which is an individual execution of an instance of a
scheduled request. A request is a definition of the parameters and text
associated with a scheduled request. Finally, a scheduled request is a stored script of
SQL requests to be executed at a scheduled time later in the day.


Index Wizard

If the facts don't fit the theory, change the facts.


-Albert Einstein
The Index Wizard will allow Teradata to find the theory of relativity for Secondary
Indexes. Index Wizard is designed to help with Secondary Index recommendations
(not primary) and comes with a beautiful Graphical User Interface (GUI). The wizard
works by analyzing SQL statements in a defined workload and then recommends the
best secondary indexes to utilize based on "What If" analysis.
Index Wizard analyzes a workload of SQL and then creates a series of reports and index
recommendations describing the costs and statistics associated with the
recommendations. The reports help you back up your decision to apply an index.
Both Index Wizard and Statistics Wizard allow a user to import workloads from other
Teradata tools such as Database Query Log (DBQL) or Query Capture Database (QCD).
Here are the steps to using the Index Wizard in exact order:
1. Define a workload
2. The workload is analyzed
3. The Wizard recommends Secondary Indexes
4. Reports are generated
5. Indexes are validated
6. Indexes can be applied

You can define a workload:

Using DBQL Statements

Using QCD

Entering SQL Text

Importing a Workload

Creating a new workload from an existing one
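Capturing a workload through DBQL, for example, might be started and stopped with statements like the sketch below; this assumes the V2R5 DBQL syntax for logging all users with full SQL text.

```sql
/* Start logging queries, including their SQL text, for all users */
BEGIN QUERY LOGGING WITH SQL ON ALL;

/* ... run the candidate queries that make up the workload ... */

/* Stop logging once the workload has been captured */
END QUERY LOGGING ON ALL;
```

The logged statements can then be fed to the Index Wizard as a defined workload.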


Archive Recovery

A Diamond is a lump of coal that could


handle the pressure
-William Coffing
The Archive Recovery tool (ARC) is a diamond in the restore. The Archive Recovery
utility allows you to copy a table and restore it to another Teradata Database. The
Archive and Recovery (ARC) utility backs up and restores database tables, objects, and
the DBC database's Data Dictionary. The ARC utility performs three major tasks:

Archive: Dumps data onto portable storage (usually tape)

Restore: Reverses the archive process and moves data back from the stored media

Recovery: Uses information stored in the Permanent Journals for
Rollback/Rollforward operations

ARC provides data protection when there is a loss of data on a failed AMP containing
Non-Fallback tables or when multiple AMPs go down within the same cluster rendering
the Fallback useless for the cluster. It can also be used when objects are dropped or rows
are deleted from a table or even Batch Processing miscues. When you think of ARC
think first of Disaster Recovery and second think of accidental stupid mistakes. Either
way ARC has got your back!
ARC does NOT work with Join Indexes or Hash Indexes. If you need to recover a Join
Index or Hash Index, just make sure the tables the Join Index or Hash Index was
created on are intact, then drop and recreate the Join Index or Hash Index manually.
Many DBAs actually save the DDL for Join Index and Hash Index creation for this purpose.
There are several ways to invoke ARC including NetVault, NetBackup, ASF2, Command
Line of ARCMain, or directly from the host or Mainframe.
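An ARCMain script for a simple table archive might look like the following sketch; the logon string, table name, and file name are all hypothetical placeholders.

```text
LOGON prod/dbadmin,secret;
ARCHIVE DATA TABLES (Payroll.Employee),
  RELEASE LOCK,
  FILE = ARCHIVE1;
LOGOFF;
```

RELEASE LOCK frees the utility locks once the archive completes, and FILE names the archive destination defined to the host.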


Teradata Analyst Suite

What lies behind us and what lies before us


are tiny matters compared to what lies
within us
-Ralph Waldo Emerson
The Teradata Analyst Suite has three tools that don't lie because they use facts to provide
information so users can find the brilliance that lies within them. This allows analysis of
the Teradata system, which can make Teradata perform better, and it doesn't get any
sweeter than that.
The three tools and utilities that are part of the Teradata Analyst Suite are:

Teradata Index Wizard

Query Capture Database

Teradata System Emulation Tool

We hope you have enjoyed this book and its simple explanations of Teradata. The basics
are the foundation on which to build the rest of your Teradata knowledge. Now, go pass
the Teradata Certification test and become a Teradata Certified Professional. What lies
before you will be huge after you pass that test. This is the only book you need to study
and always remember, "Where you find bold you will find gold."
