Document Type:
Date Created: December 1, 2008
Current Version: Version 1.1
Authors:
Approved By: Manish Korgaonkar
Approved Date:
Prepared By:
Preface
Purpose:
This document will take you through the enhancements in V2R12.0.0 as compared to V2R6.1.X.
Audience:
The primary audience includes database and system administrators and application developers.
Prerequisites:
You should be familiar with TD V2R6.1.X features.
Table of Contents
1 Introduction......................................................................................6
2 Performance Enhancements...............................................................7
2.1 Collect Statistics Improvements...................................................................................................7
2.1.1 Better Cardinality Estimate....................................................................................................8
2.1.2 Improved AMP-level statistics..............................................................................................16
2.2 Parameterized Statement Caching............................................................................................18
2.2.1 Where this feature applies ..................................................................................................19
2.2.2 Where this feature does not apply.......................................................................................19
2.2.3 How this feature works........................................................................................................20
3 Database Enhancements..................................................................22
3.1 Restartable Scandisk.................................................................................................................22
3.1.1 Usage..................................................................................................................................22
3.2 Check table Enhancements ......................................................................................................30
3.2.1 Usage of Check table .........................................................................................................30
3.2.2 Differences among Checking Levels...................................................................................30
3.2.3 New Features in Teradata 12.0 ..........................................................................................31
3.2.4 Checktable checks compressed values...............................................................................31
3.3 Software Event Log....................................................................................................................33
4 Security Enhancements....................................................................34
4.1 Password Enhancements..........................................................................................................34
4.1.1 Password Enhancements....................................................................................................34
4.1.2 How does it Work?..............................................................................................................36
4.1.3 Data Dictionary Modifications..............................................................................................36
5 Enhancements to Utilities.................................................................37
5.1 Normalized Resusage/DBQL data and Ampusage views for coexistence systems...................37
5.1.1 Resusage Data....................................................................................................................38
5.1.2 DBQL Data..........................................................................................................................38
5.1.3 Ampusage View...................................................................................................................39
5.2 Tdpkgrm New option to remove all non-current TD packages................................................40
5.3 MultiTool New DIP option........................................................................................................42
8 TASM ..............................................................................................65
8.1 Query Banding...........................................................................................................................65
8.1.1 How to Set a Query Band....................................................................................................65
8.1.2 Query Band and Workload Management............................................................................69
8.1.3 Using Query banding to Improve Resource Accounting......................................................78
8.1.4 Using Both the Session and Transaction Query Bands.......................................................79
8.2 State Matrix................................................................................................................................81
8.2.1 Managing the System through a State Matrix......................................................................81
8.2.2 System Conditions...............................................................................................................82
8.2.3 Operating Environments......................................................................................................86
8.2.4 State....................................................................................................................................88
8.2.5 Events.................................................................................................................................89
8.2.6 Periods:...............................................................................................................................94
8.3 Global Exception / Multiple Exception......................................................................................104
8.3.1 Global Exception Directive.................................................................................................104
8.3.2 Multiple Global Exception Directive...................................................................................109
8.4 Utility Management..................................................................................................................112
9 Usability Features.........................................................................114
9.1 Complex Error Handling..........................................................................................................114
9.2 Multilevel Partitioned Primary Index .......................................................................................117
9.2.1 Features of MLPPI.............................................................................................................117
9.2.2 MLPPI Table joins and Optimizer join plans......................................................................125
9.3 Schmon Enhancements .........................................................................................................130
9.3.1 Comparison of Options available in TD6.1 and TD12.0.....................................................130
9.3.2 Delay Modifier (-d) ............................................................................................................131
9.3.3 Display PG Usage.............................................................................................................133
9.4 Enhanced Explain Plan Details ..............................................................................................134
9.5 DBC Indexes Contains Join Index ID.......................................................................................140
9.6 List all Global Temporary Tables..............................................................................................142
9.7 ANSI Merge.............................................................................................................................143
1 Introduction
Teradata's mission is to provide an integrated, optimized, and extensible enterprise data warehouse
solution to power better, faster decisions. Teradata 12.0 is a highly integrated solution that continues to
advance Teradata further along in this mission.
Teradata 12.0:
Extends its lead in enterprise intelligence by supporting both strategic and operational intelligence.
Continues to be the only true choice for concurrently using detailed data in operational applications
while using business intelligence and deep analytics to direct the business.
Strengthens business logic processing capability, high availability, and performance of the EDW and
ADW foundations.
Enhances query performance.
Advances its enterprise fit characteristics, including partner friendliness and ease of enterprise
integration.
Improves availability, supportability, and security.
2 Performance Enhancements
2.1 Collect Statistics Improvements
Internal enhancements to the way statistics are collected capture more data demographic
information with greater accuracy, so that the Optimizer can create better query execution
plans.
Description:
This feature provides the following benefits:
Improved decision support (DSS) query performance as a result of improved query execution
plans.
The statistics collection improvements allow the Optimizer to better estimate the cardinality (number of
elements) in the data in the following ways:
Statistics are stored as a histogram (collection of occurrences of values), and the more granular the
statistics, the better the query execution plans can be. In Teradata 12, the maximum number of intervals
has been increased from 100 to 200, providing the Optimizer with a more detailed picture of the actual
column data distribution for estimating purposes.
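The effect of finer intervals can be illustrated with a toy equal-height histogram. This is a simplified model of interval statistics, not Teradata's actual implementation; the data and the predicate are made up for the example.

```python
import random

def build_histogram(values, num_intervals):
    """Equal-height histogram: each interval holds roughly the same
    number of rows; we record each interval's max value and row count."""
    values = sorted(values)
    per_interval = max(1, len(values) // num_intervals)
    intervals = []
    for i in range(0, len(values), per_interval):
        chunk = values[i:i + per_interval]
        intervals.append({"max_value": chunk[-1], "rows": len(chunk)})
    return intervals

def estimate_rows_below(intervals, upper_bound):
    """Estimate rows with value <= upper_bound by summing whole intervals
    that fit; the histogram is the Optimizer's only view of the data."""
    return sum(iv["rows"] for iv in intervals if iv["max_value"] <= upper_bound)

random.seed(0)
data = [random.randint(1, 10_000) for _ in range(100_000)]
true_count = sum(1 for v in data if v <= 1234)

err100 = abs(estimate_rows_below(build_histogram(data, 100), 1234) - true_count)
err200 = abs(estimate_rows_below(build_histogram(data, 200), 1234) - true_count)
print(err100, err200)  # the 200-interval estimate is at least as close
```

Because every 100-interval boundary is also a 200-interval boundary in this sketch, doubling the interval count can only shrink the estimation error at the predicate boundary, which is the intuition behind the 100-to-200 change.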
In TD6.1:
Screen shot: TD 6.1 collect stats in 100 intervals.
In TD 6.1, statistics are collected in only 100 intervals, as displayed in the above screen shot.
TD12.0:
Screen shot: TD 12.0 collect stats in 200 intervals.
In TD 12.0, statistics are collected in 200 intervals, as displayed in the above screen shot.
Advantages:
As noted above, increasing the maximum number of intervals from 100 to 200 gives the Optimizer a
more detailed picture of the actual column data distribution for estimating purposes. Certain types of
queries will therefore experience improved performance.
2.1.1.2 Improved Statistics Collection for Multi-Column NULL Values
Prior to Teradata 12, the system counted rows with at least one NULL value as a NULL row, even if some
columns in a row did have values. This improvement more accurately identifies and counts the following:
All-NULL rows in multi-column statistics and multi-column index statistics (used by the Optimizer
to detect skew)
Partially NULL rows in multi-column statistics and multi-column index statistics (used by the
Optimizer to estimate single-table selectivity)
Unique rows in a multi-column scheme (used by the Optimizer to estimate the number of distinct
hash values during a redistribution operation)
Example:
To see the improved statistics counts, suppose there are statistics collected on the table below:
Teradata 6.1: Statistics would indicate four NULL rows and two rows with unique values.
Teradata 12.0 and above: The statistics more accurately indicate two all-NULL rows and four
rows with unique values.
For a comparison between TD6.1 and TD 12.0, please use the scripts below:
CollectStats_All_NULLS.txt
CollectStats_UNIQUE_Values.txt
Screen shot: in TD 6.1, Collect Stats shows 2 unique values.
Screen shot: in TD 12.0, Collect Stats shows 4 unique values.
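The counting change can be sketched with a six-row, two-column example like the one above (two all-NULL rows, two partially NULL rows, two fully valued rows). The functions below illustrate the rule, not Teradata's statistics code:

```python
def td61_counts(rows):
    """Pre-TD12 behavior: any row containing at least one NULL
    counted as a NULL row."""
    nulls = sum(1 for r in rows if any(c is None for c in r))
    return nulls, len(rows) - nulls

def td12_counts(rows):
    """TD12 behavior: only rows where every column is NULL count
    as NULL rows; partially NULL rows count as value rows."""
    all_null = sum(1 for r in rows if all(c is None for c in r))
    return all_null, len(rows) - all_null

rows = [(None, None), (None, None), (1, None), (None, 2), (3, 4), (5, 6)]
print(td61_counts(rows))  # (4, 2): four "NULL" rows, two value rows
print(td12_counts(rows))  # (2, 4): two all-NULL rows, four value rows
```

The same six rows yield (4, 2) under the old rule and (2, 4) under the new one, matching the comparison described above.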
Advantages:
With the more refined count of all-NULL rows, the Optimizer's join plans are improved, especially
for large tables where a significant number of rows have NULLs. In addition, any data redistribution
effort is more accurately estimated.
This improvement does not change procedures for collecting or dropping statistics or any
associated timing for collecting statistics.
Prior to Teradata 12, average rows-per-value (RPV) statistics were obtained using a probability model,
which often underestimated the actual rows.
In Teradata 12, this measure is calculated exactly from the collected statistics rather than estimated.
Advantages:
The new RPV calculation formula makes the cost estimates for joins much more accurate.
This improvement does not change procedures for collecting or dropping statistics or any
associated timing for collecting statistics.
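As a sketch of what an exact calculation involves, assuming the usual definition of average rows-per-value as row count over distinct values (the document's formula itself is not shown here, and this is not the Teradata internals):

```python
def exact_rows_per_value(column_values):
    """Exact average rows-per-value: total non-NULL rows divided by the
    number of distinct non-NULL values (assumed definition for this sketch)."""
    non_null = [v for v in column_values if v is not None]
    return len(non_null) / len(set(non_null))

col = ["a", "a", "a", "b", "b", "c", None]
print(exact_rows_per_value(col))  # 6 non-NULL rows over 3 distinct values -> 2.0
```

An exact RPV like this feeds directly into join cost estimates, which is why the change makes them more accurate.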
2.2 Parameterized Statement Caching
This internal feature improves the logic that caches optimized plans for parameterized queries (SQL
statements that include variables).
In previous releases, the Optimizer did not evaluate the value of the parameter (the actual USING,
CURRENT_DATE, or DATE value) when creating the query plan for a parameterized request. If the same
request was resubmitted with different parameters, the old cached plan was used, often generating suboptimal plans. Query plans that remain the same regardless of the parameter values are called generic
plans.
With Teradata 12, the Optimizer first determines whether the request would benefit if the parameter
values are evaluated. If so, then the Optimizer will include the parameter values when optimizing the
request. Plans in which the Optimizer peeks at parameter values and generates a plan optimized for
those values are called specific plans.
For example, the Optimizer considers the user-supplied product_code value when generating a plan for
the following request:
USING (x INT) SELECT * FROM SalesHistory
WHERE product_code = :x OR store_number = 56;
With this feature, performance improvements have been observed in the following situations:
Partition Elimination
NUSI access
Join plans
Where this feature does not apply: some queries are already highly optimized, and any evaluation of the
parameter value is redundant and/or would not change the query plan.
The first time a parameterized request is submitted, the system determines whether its parameter
values should be evaluated in the first place. If so, the system peeks at the parameter values,
and the Optimizer generates a specific query plan. If the parsing cost of this specific plan is small
enough, then all subsequent submissions of this request will always result in specific plans.
If the request hasn't already been marked as a specific-always request, the Optimizer will
generate a generic plan the next time that request is submitted.
The system compares the run times of the specific and generic plans, and then decides
whether, from that point on:
1) To execute the generic cached plan each time that request is submitted, or
2) To generate a specific plan each time based on the new user-supplied values.
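The decision flow above can be sketched as a small cache simulation. The class, the state it keeps, and the timing comparison are illustrative assumptions, not the actual Teradata caching logic:

```python
class PlanCache:
    """Simplified sketch of the specific-vs-generic decision: run a
    specific plan first, try a generic plan next, then commit to
    whichever ran faster."""
    def __init__(self):
        self.state = {}  # request text -> decision record

    def submit(self, request, run_specific, run_generic):
        rec = self.state.setdefault(
            request, {"specific_time": None, "generic_time": None, "mode": None})
        if rec["mode"]:
            return rec["mode"]                     # decision already made
        if rec["specific_time"] is None:
            rec["specific_time"] = run_specific()  # peek at parameter values
            return "specific"
        rec["generic_time"] = run_generic()        # trial run of generic plan
        # Decide: reuse the cached generic plan, or re-optimize every time
        rec["mode"] = ("generic"
                       if rec["generic_time"] <= rec["specific_time"]
                       else "specific")
        return "generic"

cache = PlanCache()
q = "SELECT * FROM SalesHistory WHERE product_code = :x"
print(cache.submit(q, lambda: 1.0, lambda: 5.0))  # specific (first run)
print(cache.submit(q, lambda: 1.0, lambda: 5.0))  # generic (trial run)
print(cache.submit(q, lambda: 1.0, lambda: 5.0))  # specific (generic was slower)
```

After the two trial runs, every later submission reuses the winning mode, mirroring the "from that point on" decision described above.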
Parameterized Query Caching.ppt
The CURRENT_DATE variable will be resolved for all queries, parameterized or otherwise, and
replaced with the actual date prior to optimization.
This will help in generating a more optimal plan in cases of partition elimination, sparse join
indexes, or NUSIs that are based on CURRENT_DATE.
For a parameterized request that uses CURRENT_DATE, a generic plan with CURRENT_DATE
resolved is referred to as a DateSpecific Generic Plan. Similarly, a specific plan with
CURRENT_DATE resolved is referred to as a DateSpecific Specific Plan.
The explain text, for queries for which CURRENT_DATE has been resolved, will show the resolved date
in TD 12.0.
In TD 6.1:
Explain select * from retail.item where l_receiptdate= current_date;
Explanation
1) First, we lock a distinct retail."pseudo table" for read on a RowHash to prevent global deadlock
for retail.item.
2) Next, we lock retail.item for read.
3) We do an all-AMPs RETRIEVE step from retail.item by way of an all-rows scan with a
condition of ("retail.item.L_RECEIPTDATE = DATE") into Spool 1 (group_amps), which is built
locally on the AMPs. The input table will not be cached in memory, but it is eligible for
synchronized scanning. The size of Spool 1 is estimated with no confidence to be 6,018 rows. The
estimated time for this step is 0.58 seconds.
4) Finally, we send out an END TRANSACTION step to all AMPs involved in processing the
request.
-> The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated
time is 0.58 seconds.
In TD 12.0:
Explain select * from retail.item where l_receiptdate= current_date;
Explanation
1) First, we lock a distinct retail."pseudo table" for read on a RowHash to prevent global deadlock
for retail.item.
2) Next, we lock retail.item for read.
3) We do an all-AMPs RETRIEVE step from retail.item by way of an all-rows scan with a
condition of ("retail.item.L_RECEIPTDATE = DATE '2008-12-01'") into Spool 1 (group_amps), which
is built locally on the AMPs. The input table will not be cached in memory, but it is eligible for
synchronized scanning. The size of Spool 1 is estimated with no confidence to be 6,018 rows
(806,412 bytes). The estimated time for this step is 0.59 seconds.
4) Finally, we send out an END TRANSACTION step to all AMPs involved in processing the
request.
-> The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated
time is 0.59 seconds.
3 Database Enhancements
3.1 Restartable Scandisk
3.1.1 Usage
Scandisk is a diagnostic tool designed to check for inconsistencies between the key file system data
structures such as the Master Index, Cylinder Index and the Data blocks.
As an administrator, you can perform this procedure as preventive maintenance to validate the file
system, as part of other maintenance procedures, or when users report file system problems.
The SCANDISK command checks that all the sectors are allocated to one and only one of the bad
sector list, the free sector list, or a data block.
In TD12.0, with the Restartable Scandisk feature, the Scandisk utility can be restarted either from a
defined point or from the last table scanned. Restartability allows you to halt the Scandisk process
during times of extremely heavy system use and then restart it at some later time (e.g. off-peak hours).
Syntax:
SCANDISK TAB[L[E]] starting_tableid [ starting_rowid ] [ TO ending_tableid [
ending_rowid]]
We can compare the options available in the two versions by issuing the following command (screen
shots shown below).
Help scandisk
In TD 6.1:
Syntax:
SCANDISK [ /dispopt ] [ { DB | CI | MI | FREECIS } ] [ FIX ] [ /dispopt ]
In TD12:
Syntax:
SCANDISK [ /dispopt ] [ { DB | CI | MI | FREECIS | WMI | WCI | WDB } ] [ /dispopt ][INQUIRE
<interval>] [ { NOCR | CR } ]
Where
CI --- Scans the MI and CIs. If the scope is AMP or all tables, rather than selected tables, the free CIs are
also scanned.
DB --- Scans the MI, CIs, and DBs. This is the default for the normal file system, which can be overridden
by the CI, MI, or FREECIS options. If the scope is AMP or all tables, rather than selected tables, the free
CIs are also scanned.
FREECIS --- Scans the free CIs only. This option also detects missing WAL and Depot cylinders.
MI --- Scans the MI only.
WCI --- Scans the WMI and WCIs.
WDB --- Scans the WMI, WCIs, and WDBs. This is the default for the WAL log, which can be overridden
by the WCI or WMI options.
WMI --- Scans the WMI only.
INQUIRE --- Reports SCANDISK progress as a percentage of total time to completion and displays the
number of errors encountered so far.
interval --- An integer that defines the time interval, in seconds, at which to automatically display
SCANDISK progress.
CR --- Specifies to use cylinder reads.
NOCR --- Specifies to use regular data block preloads, instead of cylinder reads.
CR option:
When we specify the SCANDISK CR option, the utility uses cylinder reads.
This option is not supported in TD6.1 (screen shot shown below).
In TD6.1:
In TD12.0:
In TD6.1:
In TD12.0:
Scandisk wci:
Scandisk wdb:
Scandisk wmi:
3.2 Check Table Enhancements
Check Table is a console-startable utility and a diagnostic tool for Teradata DBS software. Check Table
checks for inconsistencies among internal data structures, such as table headers, row identifiers and
secondary indexes. Although Check Table identifies and isolates data inconsistencies and corruption, it
cannot repair inconsistencies or data corruption.
Compares primary and fallback copies of data and hashed secondary-index sub tables.
Compares table headers across AMPs and compares table headers to information in the Data
Dictionary.
Validates that the set of tables found on all AMPs matches the set found in the Data Dictionary.
This check is done only if a check of database DBC or ALL TABLES is specified.
"LEVEL ONE" compares only counts of data and index sub tables.
"LEVEL TWO" compares lists of index ids and data row ids.
"LEVEL THREE" compares entire rows.
"LEVEL PENDINGOP" finds all the tables on which there are pending operations (e.g., mload,
fast load, table rebuild).
In TD6.2, the syntax generates an error (screen shot below).
In TD12.0:
3.2.4 Checktable Checks Compressed Values
Usage:
The feature can be invoked from the Teradata Manager CheckTable utility menu item.
Compresscheck syntax:
Check <table name to be checked> at <level> compresscheck;
Screen shot: CheckTable syntax with the TD12 compresscheck option.
3.3 Software Event Log
During log system processing in Teradata Database 12.0, all Teradata messages are captured in the
software event log so that all messages are available in one place.
This provides a common repository of event data across all platforms. DBC.SW_EVENT_LOG is a
table, so it can be archived and stored across different machines/platforms.
4 Security Enhancements
4.1 Password Enhancements
TD6.1 supported two forms of password encryption from previous releases, namely DES and SHA-256
truncated to 27. (This support will continue)
Teradata Database 12.0 includes the following password enhancements:
Full implementation of 32-byte Secure Hashing Algorithm (SHA) 256 encrypted passwords.
All passwords created on old releases will continue to work and will be changed to full SHA-256
encryption when next modified.
The restricted passwords feature includes the new column, PasswordRestrictWords, in the table
DBC.SysSecDefaults, having the following possible single-character values:
N, n = Do not restrict any words from being contained within a password string. This is the
default.
The parameter PasswordMaxChar in DBC.SysSecDefaults sets the maximum number of characters
allowed in a valid password. The default value is 30.
The screen shots below show the table DBC.SysSecDefaults; note that the PasswordRestrictWords
column does not exist in TD6.1.
In TD6.1:
In TD12.0:
Restricted words can be enabled:
System-wide, by using the new column, PasswordRestrictWords, in the table
DBC.SysSecDefaults.
By user profile, using the new CREATE/MODIFY PROFILE syntax clause RESTRICTWORDS =
'Y' | 'N'. The default is N.
Before checking a password against the restricted words list, the system removes all ASCII numbers
and ASCII special characters from the beginning and end of the password string.
DBC.RestrictedWords: A new view which is created via DIP scripts for access to the system
table DBC.PasswordRestrictions. This view will not have PUBLIC access, even for SELECT.
DBC.SysSecDefaults: Contains one new field, PasswordRestrictWords. The default value for
this field is N.
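The restricted-words check described in this section can be sketched as follows. The helper names and the exact matching rule (strip digits and special characters from the ends, then look for restricted words in what remains) follow the description above, not the actual Teradata implementation:

```python
import string

def strip_ends(password):
    """Per the description above: remove ASCII digits and special
    characters from the beginning and end of the password string."""
    return password.strip(string.digits + string.punctuation)

def is_restricted(password, restricted_words):
    """Reject the password if any restricted word is contained within
    the stripped core (case-insensitive; an assumption of this sketch)."""
    core = strip_ends(password).lower()
    return any(w.lower() in core for w in restricted_words)

restricted = ["password", "teradata"]
print(is_restricted("##Password123", restricted))  # True: core is "Password"
print(is_restricted("Pa55word", restricted))       # False: no restricted word
```

Stripping the ends first prevents trivially dressed-up variants like "##Password123" from slipping past the restricted-words list.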
5 Enhancements to Utilities
5.1 Normalized Resusage/DBQL Data and Ampusage Views for Coexistence Systems
Teradata Database 12.0 provides normalized CPU time (which was not present in TD6.1) in Resusage
data, DBQL data, and the Ampusage view, providing more accurate performance statistics for
mixed-node systems, particularly in the areas of CPU skew and capacity planning.
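The idea behind normalization can be sketched as scaling raw CPU seconds by a per-node-type factor so that figures from slow and fast nodes are comparable. The scaling-factor mechanics here are an assumption for illustration, not the documented Teradata formula:

```python
def normalize_cpu(raw_cpu_seconds, node_scaling_factor):
    """Hypothetical sketch: on a coexistence system, raw CPU seconds are
    multiplied by a node-type scaling factor so that CPU consumption is
    comparable across node generations (assumed mechanics, not the
    documented Teradata formula)."""
    return raw_cpu_seconds * node_scaling_factor

# Two node generations doing the same work: the faster node burns fewer
# raw CPU seconds, but normalization makes the numbers comparable.
old_node = normalize_cpu(100.0, 1.0)
new_node = normalize_cpu(50.0, 2.0)
print(old_node, new_node)  # 100.0 100.0
```

Without such scaling, the faster node would look half as busy even though it did the same work, which distorts skew analysis and capacity planning on mixed-node systems.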
This feature adds the following fields to the ResUsageSpma table:
To compare the columns from the ResUsageSpma table in TD6.1 and TD12.0, please find the documents
attached below:
dbcResusagespmaTD61.rtf
dbcResusagespmaTD12.rtf
MaxCPUAmpNumberNorm: Number of the AMP with the maximum normalized CPU time.
To compare the columns from DBQLogTbl and QRYLOG tables from TD6.1 and TD12.0, please find the
documents attached below:
dbqlogtbl_TD61.rtf
dbqlogtblTD12.rtf
qrylogTD61.rtf
qrylogTD12.rtf
MaxCPUAmpNumberNorm: Number of the AMP with the maximum normalized CPU time.
To compare the columns from DBQLStepTbl and QRYLOGSteps tables from TD6.1 and TD12.0, please
find the documents attached below:
DBQLStepTbl_TD61.rtf
dbqlsteptblTD12.rtf
QRYLogSteps_TD61.rtf
qrylogstepsTD12.rtf
To compare the columns from DBQLSummaryTbl and QRYLOGSummary tables from TD6.1 and
TD12.0 , please find the documents attached below:
DBQLSummaryTbl_TD61.rtf
dbqlsummarytblTD12.rtf
QRYLogSummary_TD61.rtf
qrylogsummaryTD12.rtf
ampusage_TD61.rtf
ampusageTD12.rtf
5.2 Tdpkgrm New option to remove all non-current TD packages
Tdpkgrm (Teradata Package)
This is used to remove non-current packages.
Prior to Teradata 12.0, you had to manually remove each non-current package which was time
consuming.
New option to remove all non-current Teradata packages:
Teradata 12.0 adds an option to tdpkgrm that allows you to remove all non-current Teradata packages of
all the components at once with the command-line option -a, as shown below:
$ tdpkgrm -a
Screen shots below show that in TD6.1 it was not possible to remove all non-current Teradata packages
at once, as the command-line option -a is not supported in TD6.1.
In TD6.1:
In TD12.0:
If there are no non-current packages, then running tdpkgrm -a gives the message "No
Teradata Software non current version available for removal".
5.3 MultiTool New DIP option
A new DIP option, DIPPWD (Password Restrictions), has been added to Teradata Database 12.0 as part
of the password enhancement feature; it was not available in TD6.1 (screen shots shown below).
This feature allows the DBA to create a list of restricted words that are not allowed in new or modified
passwords.
In TD6.1:
DIPSQLJ:
The SQLJ database and its views are used by the system to manage JAR files that implement Java
external stored procedures.
The DIP script used to create the SQLJ database and its objects is called DIPSQLJ. This script is run as
part of DIPALL.
DIPDEM:
Loads tables, stored procedures, and a UDF that enable propagation, backup, and recovery of database
extensibility mechanisms (DEMs). DEMs include stored procedures, UDFs, and UDTs that are distributed
as packages, which can be used to extend the capabilities of Teradata Database.
Note: DIPSQLJ and DIPDEM were supported from TD6.2
Prepared by GCC India (Mumbai) ADMIN-COE MS Team
Page 43 of 147
Online Archive
Online archive allows the archival of a running database; that is, a database (or tables within a database)
can be archived in conjunction with concurrently executing update transactions for the tables in the
database. Transactional consistency is maintained by tracking any changes to a table in a log such that
changes applied to the table during the archive can be rolled back to the transactional consistency point
after the restore.
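The logging-and-rollback idea can be sketched as follows. The class, the before-image change log, and the restore step are illustrative assumptions about the mechanism described above, not ARC internals:

```python
import copy

class OnlineArchive:
    """Toy model: archive a table while updates continue, log before-images
    of changed rows, and roll the restored table back to the consistency
    point (purely illustrative, not the ARC implementation)."""
    def __init__(self, table):
        self.table = table          # dict: row_id -> row value
        self.log = None

    def start_logging(self):
        self.log = {}               # consistency point established here

    def update(self, row_id, value):
        if self.log is not None and row_id not in self.log:
            self.log[row_id] = self.table.get(row_id)  # save before-image
        self.table[row_id] = value

    def archive(self):
        # The archive may contain rows changed mid-archive, plus the log
        return copy.deepcopy(self.table), copy.deepcopy(self.log)

    @staticmethod
    def restore(archived_table, change_log):
        restored = dict(archived_table)
        for row_id, before in change_log.items():
            if before is None:
                restored.pop(row_id, None)   # row did not exist at the point
            else:
                restored[row_id] = before    # undo the mid-archive change
        return restored

t = OnlineArchive({1: "a", 2: "b"})
t.start_logging()
t.update(2, "b-changed")      # concurrent update during the archive
dump, log = t.archive()
print(OnlineArchive.restore(dump, log))  # {1: 'a', 2: 'b'}
```

The archive itself may hold inconsistent mid-update rows; it is the change log that lets the restore roll everything back to the consistency point.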
arcTD12data.txt
update_qry.txt
restore.txt
STEP 2: Note the data, before starting, that you are going to update while taking the backup.
DataBeforeBackup.JPG
STEP 3-A: Start the online backup using the script.
STEP 3-B: As soon as the archiving of the table starts, run the update script.
arcmain_update.JPG
STEP 4: Wait for the completion of the backup as well as of the update. Make sure that the Online Log
Info contains the data highlighted in the screen shot below.
OnlineLogInfo.JPG
The three lines displayed indicate that the table was archived online, the consistency point for that table
(e.g. when online logging was started), and how many log bytes and rows were archived (indicating the
amount of change in the table during the online archive). These lines are also displayed during a restore,
copy, or analyze of the archive, as an indication that the archive was done online.
STEP 5: Check the column data after taking the backup in the database; the screen shot follows.
DataAfterBackup.JPG
STEP 6: Delete the data after the backup; the screen shot follows.
DeleteDataAfterBackup.JPG
STEP 7: Restore the table using the restore script; the screen shot follows.
Restore_Snapshot.JPG
STEP 8: Check the data after the restore; the screen shot follows.
DataAfterRestore.JPG
6.1.2 Usage
The Archive privilege is required on the database or table that is being logged. The privilege may be
granted to a user or an active role for the user.
6.1.2.4
Archive Statement
An ARCHIVE statement can start an online archive with the ONLINE option without being initiated first by
the LOGGING ONLINE ARCHIVE ON statement. This allows you to start the online archive
immediately; in this case, online archive logging is started implicitly. ARCMAIN starts online
archive logging on the specified objects before it starts archiving. The ARC statement of ARCHIVE/DUMP
has been extended to support the online archive. The SQL DUMP statement doesn't support the new
ONLINE option. The online archive feature adds the options ONLINE and KEEP LOGGING to the
syntax:
ARCHIVE DATA TABLES . . .
[, ONLINE] [, KEEP LOGGING] [, RELEASE LOCK ] [,
INDEXES ] [, ABORT ]
[, USE [ GROUP ] READ LOCK ] [, NONEMPTY DATABASE[S] ]
, FILE = name [,FILE = name];
6.1.3.1.1
all-db_TD12.txt
6.1.3.1.2
single-db_TD12.txt
6.1.3.1.3
single-db_TD12_log.txt
single-exclude_TD12.txt
6.1.3.1.4
all-db_TD12_log.txt
single-exclude_TD12_log.txt
multiple-exlude_TD12.txt
multiple-exlude_TD12_og.txt
single_table_TD12.txt
6.1.3.2.2
single_table_TD12_log.txt
multi-table_TD12.txt
multi-table_TD12_log.txt
Following are the backup scripts and the log files attached; please have a look at the log file, which tells
us that online database-level dictionary backup is not supported.
db_dict_TD12.txt
6.1.3.3.2
db_dict_TD12_log.txt
Following are the backup scripts and the log files attached. Please have a look at the log file, which tells
us that online table-level dictionary backup is not supported.
tbl-dict_TD12.txt
tbl-dict_TD12_log.txt
PartitionBackup.doc
6.1.3.5
6.1.3.5.1
There is no change in the restore script, but you can see in the log file that the table has been restored
with ONLINE enabled. Please find the restore and log scripts below.
restore_t2.txt
6.1.3.5.2
restore_t2_log.txt
rest_drop_tbl.txt
rest_drop_tbl_log.txt
In TPT 12.0:
UTF16SUPPORT12.txt
Teradata PT 12.0 added a new ArraySupport attribute to the Stream Operator that allows the use of the
Array Support database feature for DML statements. Array support improves Stream driver performance.
By default, this feature is automatically turned on if the database supports it. User action is not needed to
take advantage of this feature.
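The performance benefit of array support comes from sending many rows per database request instead of one request per row. The generic sketch below illustrates that batching idea only; it is not the TPT Stream driver implementation, and the helper names are invented for the example:

```python
def run_per_row(execute, statement, rows):
    """Without array support: one request (round trip) per row."""
    for row in rows:
        execute(statement, [row])

def run_array(execute, statement, rows, batch_size=100):
    """With array support: many rows per request, far fewer round trips
    (the kind of saving the ArraySupport attribute provides)."""
    for i in range(0, len(rows), batch_size):
        execute(statement, rows[i:i + batch_size])

calls = []
execute = lambda stmt, batch: calls.append(len(batch))

rows = list(range(250))
run_array(execute, "INSERT INTO t VALUES (?)", rows)
print(len(calls), calls)  # 3 requests instead of 250: [100, 100, 50]
```

The same 250 rows take 250 requests row-by-row but only 3 in array mode, which is where the Stream driver performance improvement comes from.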
With the ARRAYSUPPORT attribute enabled
In TPT 8.1:
Attached is the log file, in which we see that the Array support feature does not show up in the attribute
definitions:
ArraySupportTD61_ON_log.txt
In TPT 12.0:
Attached is the log file, in which we see that the Array support feature is shown (it was not shown in
TPT 8.1):
ArraySupportTD12_ON_Log.txt
In TPT 8.1:
Attached is the log file, in which we see that the Array support feature does not show up in the attribute
definitions:
ArraySupportTD61_OFF_log.txt
In TPT 12.0:
Attached is the log file, in which we see that the Array support feature is shown (it was not shown in
TPT 8.1):
ArraySupportTD12_OFF_Log.txt
TIME
TIMESTAMP
INTERVAL YEAR
INTERVAL MONTH
INTERVAL DAY
INTERVAL HOUR
INTERVAL MINUTE
INTERVAL SECOND
7.1.4 Delimiters
Delimiters are usually used in conjunction with the Data Connector operator. The following attributes will
imply the use of delimiters in the data file:
VARCHAR FORMAT = Delimited
VARCHAR TextDelimiter = | (this is the default delimiter)
VARCHAR EscapeTextDelimiter = \ (this is the default escape delimiter)
If delimiters are expected to be embedded within delimited data they must be preceded by the backslash
('\') escape character or an alternative designated escape character.
The TextDelimiter attribute is used to specify the delimiter.
The EscapeTextDelimiter attribute is used to change the default escape delimiter to something other
than the backslash character (\).
Example:
hi|how|\|r|you
would be broken up into the fields: hi, how, |r, you
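A Data Connector producer using these attributes might be declared as follows. This is a minimal sketch; the operator name, schema name, and file name are assumptions:

```sql
DEFINE OPERATOR FILE_READER
TYPE DATACONNECTOR PRODUCER
SCHEMA INPUT_SCHEMA                      /* hypothetical schema name */
ATTRIBUTES
(
    VARCHAR FileName = 'input.txt',      /* hypothetical file name */
    VARCHAR OpenMode = 'Read',
    VARCHAR Format = 'Delimited',
    VARCHAR TextDelimiter = '|',
    VARCHAR EscapeTextDelimiter = '\'
);
```

With Format = 'Delimited', each record in input.txt is split on the pipe character unless the pipe is escaped as shown in the example above.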
In TD12.0:
The attached script has been run with and without the n option.
Load2Streams.txt
From the screenshots below we clearly see how the n option continues the job even if an error is encountered.
RATE and PERIODICITY values will be used for the Stream operator for this job.
Note: This RATE and PERIODICITY value will be for the complete job.
With MVS:
F Myjob APPL=Stream RATE=2000, PERIODICITY=20
Controlling Stream operator at Step level:
The OperatorCommandId feature allows you to create identifiers associated with a specific Stream
operator within a specific job step. This permits you to assign new Rate or Periodicity values to all
instances of that operator within the step after the job has begun by using the twbcmd command.
Normally, TPT generates a default value for OperatorCommandId composed of <operator object name>
+ <process Id> for each copy of the operator in the APPLY specification. Process Id numbers for
operators appear on the command screen when the job step that contains the operator begins to execute.
Ex: operator_name1234
Use the following command to identify an operator by default ID and change the rate value
twbcmd Jobname operator_name1234 RATE=2000
We can also assign our own identifier for subsequent referencing. To accomplish this, we can either:
Declare the OperatorCommandId attribute in the DEFINE OPERATOR statement for the Stream operator, or
Assign a value to the OperatorCommandId attribute in the APPLY statement (which overrides the attribute specified in the DEFINE OPERATOR statement).
The RATE or PERIODICITY change will apply to all the instances of the Stream operator running within
that job step.
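The sequence can be sketched as follows; the job name, operator identifier, and the RATE/PERIODICITY values are assumptions for illustration:

```
/* In the Stream operator definition (or overridden in the APPLY statement): */
VARCHAR OperatorCommandId = 'MyStream'   /* hypothetical identifier */

/* From the command line, after the job step has started: */
twbcmd MyJobName MyStream RATE=2000 PERIODICITY=20
```

Because the identifier is fixed in the script, operators can be targeted with twbcmd without first reading the generated process id off the command screen.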
lab7_1.txt
Step 2-b: Run the twbcmd command as soon as the job id is generated in Step 2-a.
Step 3: After successfully executing the job check the log using tlogview
CaseSensitiveTD61.txt
In TPT 8.1:
In TD12.0:
Script attached
CaseSensitiveTD12.txt
7.2
The Application Program Interface (API) is a feature of TPT that permits developers to create programs with a direct interface to the load and unload protocols used by the utilities. The following new features have been added to the API as a result of Teradata Release 12.
8 TASM
8.1 Query Banding
A query band is a set of name/value pairs that can be set on a session or transaction to identify the query's originating source, enabling improved workload management and classification.
BT;
set query_band = 'EXTUserid=CV1;EXTGroup=Finance;UnitofWork=Fin123;' for TRANSACTION;
INS
INS
ET;
The transaction query band is discarded when the transaction ends (commit, rollback, or abort) or when the query band is set to NONE.
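Taken together, the session and transaction levels can be exercised as in the following sketch (cust_table is a hypothetical table used only for illustration):

```sql
/* Session query band: persists until changed, set to NONE, or logoff */
SET QUERY_BAND = 'EXTUserid=CV1;EXTGroup=Finance;' FOR SESSION;

/* Transaction query band: discarded when the transaction ends */
BT;
SET QUERY_BAND = 'UnitofWork=Fin123;' FOR TRANSACTION;
SELECT * FROM cust_table;
ET;

/* Explicitly clear the transaction query band */
SET QUERY_BAND = NONE FOR TRANSACTION;
```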
8.1.1.1 GetQueryBand
A system user defined function (UDF) is provided to return the current query band for the session by
using the following syntax:
SEL GetQueryBand();
The screenshot below shows that the query band has been successfully set and is active at the session level (we can activate it at the transaction level by using the command above).
8.1.1.2 GetQueryBandValue
A system UDF is also provided to retrieve the value of a specified name in the query band. This can be
used to retrieve the name of the end user.
SEL GetQueryBandValue(0,'ExtUserId');
The output below displays the QueryBand value (CV1) associated with the QueryBand name (ExtUserId)
as we had given the value for the ExtUserId as CV1 while setting the queryband.
8.1.1.3 GetQueryBandPairs
A system table function will return the name/value pairs in the query band in name and value columns.
Sel QBName (FORMAT 'X(20)'), QBValue (FORMAT 'X(20)')
FROM TABLE (GetQueryBandPairs(0)) AS t1 ORDER BY 1;
The below output shows the QueryBand value associated with each QueryBand Name set in the
queryband.
8.1.1.4 MonitorQueryBand
An administrative UDF is provided to retrieve the query band for a specified session. The DBA can use
this UDF in order to track down the originator of a blocker request or one using excessive resources.
Below output shows the query band for the session bearing session number 1207.
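As a sketch, the call for the session shown above might look like the following; the exact argument list (host id, session number, and a vproc argument) is an assumption here, so check the UDF's declared signature on your system:

```sql
/* Hypothetical invocation: query band of session 1207 on host 1 */
SELECT MonitorQueryBand(1, 1207, 1);
```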
Filters
Teradata DWM Filter Rules allow the administrator to control access to and from specific Teradata
Database objects and users. There are two types of filter rules:
Object access filters limit access to all objects associated with the filter during a specified time period.
Queries referencing objects associated with a filter during the time the filter applies are rejected.
Query Resource filters limit database resource usage for objects associated with the filter. Queries
exceeding the resource usage limit (estimated number of rows, estimated processing time, types of joins,
table scans) during the time the filter applies are rejected.
Query band name/value pairs can be associated with Teradata DWM Filter rules.
Below screenshot shows the additional options (Include QueryBand and Exclude QueryBand)
The screenshots below show, step by step, how to include QueryBand in the WHO criteria.
Select Include QueryBand in the WHO criteria
Clicking on Choose will pop up the Include QueryBand window, which allows you to load Names and the QueryBand values associated with the Names. (The Load Names option will be highlighted as seen in the screenshot.)
Clicking on the Load Names option will show all QueryBand Names.
Selecting a particular QueryBand Name will highlight the Load Values option, which will display the Load Values associated with the selected QueryBand Name as shown below:
Selecting the QueryBand Value and clicking on the Add option will add the queryband in the QueryBand
classification criteria.
If a name/value pair in the query band set for the session or transaction matches the query band pair associated with a filter rule, the filter rule will be applied to the request.
If we select two name/value sets in the QueryBand classifications, they will be ANDed as shown below:
In the same way we can use the Exclude QueryBand option in the WHO criteria.
Workload Definitions
Workload Definitions are another component of Teradata DWM. A workload definition (WD) is a type of rule that groups queries for management based on the query's operational properties. The attributes that can be defined for a WD are who (user, account, profile, etc.), what (CPU limits, estimated processing time, row counts, etc.), and where (databases, tables, macros, etc.). The attributes of the WDs are compared to the attributes of each incoming request, and the request is classified into a WD. The WD determines the priority of the request.
Query band name/value pairs can be defined as additional who attributes. This enables us to solve the
following problems mentioned in the first section:
To set the priority of a request based on the end user when submitted through a connection pool
To assign requests from the same application different priorities
Query Band
Name         Value
EXTUserId
EXTGroup     Marketing
Importance   Online

WD Name: Marketing-Batch
Priority: Normal
Classification Criteria:
Query Band
Name         Value
EXTUserId
EXTGroup     Marketing
Importance   Batch
Note: Query Band classification criteria with different names (EXTUserID, EXTGroup, Importance) are ANDed, so all must be present in the query band to match the WD classification criteria. Query Band classification criteria with different values for the same name are ORed.
Query banding UDFs can be used to extract accounting reports from the DBQL log table.
Sel t1.AMPCPUTime, t1.ParserCPUTime from dbc.dbqlogtbl t1 where
GetQueryBandValue(t1.queryband, 0, 'EXTUserId') = 'CV185018' AND
GetQueryBandValue(t1.queryband, 0, 'unitofwork') = 'Fin123' AND t1.QueryBand is NOT NULL;
One reason to use both in the same session is to have the transaction query band add additional pairs.
For example, using the WD classification criteria in the previous section, if the session query band is set
as follows:
SET QUERY_BAND='EXTUserId=MG123;EXTGroup=Marketing;' for session;
Then you would need to set the transaction query band to add the Importance name/value pair to determine which WD to use for the request.
SET QUERY_BAND='Importance=Online;' for transaction;
SEL * FROM cust_table;
A name/value pair in the transaction query band will override the pair in the session query band if the name is the same in both query bands. Using the same example as above, say we want to set the session query band so that the default is Importance=Batch.
SET QUERY_BAND='EXTUserId=MG123;EXTGroup=Marketing;Importance=Batch' for session;
Then for a request that needs a higher priority, the transaction query band can be used to set the
Importance name/value pair to the Online priority.
SET QUERY_BAND='Importance=Online;' for transaction; SEL * FROM cust_table;
When Teradata DWM searches the query band associated with a request for comparison with rules and classification criteria, it always searches the transaction query band first and stops when the name in the query band pair matches the name in the rule or WD.
8.2 State Matrix
The state matrix is a two-dimensional diagram that can help you visualize how you want the system to
behave in different situations.
It extends active workload management to automatically detect, notify, and act on planned and unplanned
system and enterprise events. TASM then automatically implements a new set of workload management
rules specific to the detected events and resulting system condition.
System Condition (SysCon): the condition or health of the system. For example, SysCons include system performance and availability considerations such as the number of AMPs in flow control or nodes down at system startup.
Operating Environment (OpEnv): the kind of work the system is expected to perform. It is usually indicative of time periods or workload windows when particular critical applications, such as crucial data loads or month-end jobs, are running.
The combination of a SysCon and an OpEnv reference a specific state of the system.
Associated with each state are a set of workload management behaviors, such as throttle thresholds,
Priority Scheduler weights, and so on. When specific events occur, they can direct a change of SysCon or
OpEnv, resulting in a state change and therefore an adjustment of workload management behavior.
In the same way we can add the SysCon Degraded. The screenshot below shows the State Matrix with the System Conditions (Normal, which is the default, Base, and Degraded).
(The Operating Environment Base is the default.)
When a unique system condition is defined, there is an option to associate it with a minimum duration. To see why this matters, consider a system condition of RED associated with degraded health. When an event results in this system condition being activated, the state will be transitioned
appropriately. That state may have working values for tighter throttles, more restrictive priority
scheduler weights, more filters, etc. If by invoking the state, the system immediately returns to
good health (i.e., the events that result in the system condition are no longer valid), the system
could conceivably realize another state transition that removes the more restrictive working
values. By removing the more restrictive working values, the system could conceivably put itself
right back into RED state and yet another state transition, etc.
A minimum duration can be set for the System Condition. That way regardless of the associated
event status, the system will remain in the same state for at least the minimum duration, giving
the system a better chance of more fully working itself out of the situations that are putting it into
degraded state.
It is recommended that System Conditions that are activated entirely by internal event detections, and not by external user-defined event detections, be set to have a minimum duration > 0, perhaps 10 minutes or so.
Click OK and then click on Accept so that the new OpEnv will be added as shown below:
The below screenshot shows the State Matrix with the SysCon and OpEnv defined.
(Base is the default state which later can be changed)
8.2.4 State
The combination of a SysCon and an OpEnv reference a specific state of the system.
Associated with each state are a set of workload management behaviors, such as throttle thresholds,
Priority Scheduler weights, and so on. When specific events occur, they can direct a change of SysCon or
OpEnv, resulting in a state change and therefore an adjustment of workload management behavior.
State can be defined as shown below:
Select States and click New State which will pop up with the State window in which the Name of the
State can be mentioned.
8.2.5 Events
A System Condition, Operating Environment or the State can change as directed by event directives
defined by the DBA.
Establishing Event Combinations and Associating Actions to the State Matrix for System Conditions:
Once you have defined your state matrix, you will need to define the event combinations that will put you
into the particular system conditions or operating environments defined in the state matrix.
The screenshot below shows the Event Combination NodeDownAction, defined above, which will change the System Condition to Busy.
Likewise we create the other events (AWTLimitEvent) and associate the Action to be taken against the Event as shown below:
The screenshot below shows the Events which have been created (AWTLimitEvent and NodeDown).
The screenshot below shows the Event Combination after creating the Events, and the associated action to be taken against the event.
8.2.6 Periods:
These are intervals of time during the day, week, or month. TDWM monitors the system time and automatically triggers an event when the period starts; the event lasts until the period ends.
The screenshots below show how we can create Periods.
Select Periods and then click on New Period which will open New Period window in which we can give
the Period Name
Uncheck Everyday and 24 hours option so that we can select the days of week and the time as shown
below
After specifying the days and the time, click Accept so that the period can be saved.
Likewise we can create other periods too. As shown in the below screenshot we see two periods defined (
EndOfMonthReporting and LoadWindowPeriod)
Establishing Event Combinations and Associating Actions to the State Matrix for Operating Environments:
Screenshot below shows the Operating Environments defined
Below we associate a particular action with an Event. In this case, when the Event LoadWindowPeriod is active, it will change the OpEnv to Load Window (9am to 4pm).
Screenshot below shows the Event Combination and what action will be taken against this Event
Combination
Likewise we create the other Event Combinations and associate the Actions to be taken.
After defining all the necessary parameters for State Matrix, it will look as shown below:
Guidelines to establish Event Combinations and Associate Actions to the State Matrix:
Once you have defined your state matrix, you will need to define the event combinations that will put you
into the particular system conditions or operating environments defined in the state matrix. Event
combinations are logical combinations of events that you have defined.
The current release of Teradata DWM offers the following Event Types:
AMP Activity Level Event Types. To avoid unnecessary detections, these must persist for a qualified amount of time that you specify (default is 180 seconds) on at least the number of AMPs that you also specify.
Period Events. These are enabled or disabled depending on the current date/time relative to your
defined periods of time. E.g. If daytime is defined as daily 8am-6pm, the daytime period is
enabled every day at 8am, and disabled every day at 6pm.
User-Defined Events. These are enabled via openAPI or PMAPI calls to the database, and are
disabled via an expiration timer given by the enable call, or through an explicit disable call.
Here we discuss guidelines for the usage and associated settings of event types meriting additional
discussion as well as general event detection considerations.
The Node_Down Event Type threshold you define is the maximum percent of nodes down in a clique, and
is representative of the performance degradation the system will incur. Consider a system configuration
with mixed clique types, some with more nodes per clique than others, and some with Hot Standby Nodes
(HSN).
If a single node were to go down, what is the associated performance degradation? It is roughly
synonymous with the maximum percent down in a clique, and depends on which clique bears the down
node:
Clique 4: 1/4 = 25%
Clique 5: 1/2 = 50%
Clique 6: 0/3 (because the HSN took the burden of the down node) = 0%
In the example above, you probably don't want to take much, if any, action if cliques 1, 2 or 6 were to have a node down; however, if clique 4 or especially 5 were to experience a node down, that would be a very serious problem requiring immediate attention to resolve, and drastic workload management controls to assure critical work can still be addressed during the degraded period.
Recommendation: If your system is designed to run with some amount of degradation (for example, many
very large systems with hundreds of nodes may be sized expecting that there is always a single node
down somewhere in the system) it is suggested to set the threshold such that the Node Down event will
activate only when that degradation exceeds what was sized for. For example, if the example system
above were sized to meet workload expectations as long as clique 5 did not experience a down node, you
might set your Node_Down event to activate at a threshold > 25%, in other words, to only activate if
clique 5 experienced a node down. At that time you could change your system condition appropriately.
Prior to that threshold, you could possibly define a second Node_Down event with a lower threshold with
an action to Notify only. If you are only interested in sending a notification, you could simply rely on the Teradata Manager's alert policy manager to send an alert. However, the Alert Policy Manager cannot notify to a queue. Also, when detecting a node down, the Alert Policy Manager cannot distinguish between the severity of a node going down in a clique with an HSN vs. a node going down in a 2-node clique.
In general, assuming your system is NOT designed to expect nodes down (as is the case with many small
to moderate sized systems), a good threshold to set Down_Nodes Event Type Threshold to is roughly
24%.
Guidelines for AMP Activity Level Event Types (AWT Limit and Flow Control):
Consider adding or lowering workload throttles, object and/or utility throttles on lower
priority requests across all appropriate states.
Consider reserving AWTs for tactical WDs by selecting the expedited option for your tactical AGs.
This allows a special reserve pool of AWTs to be set aside. Up to 5 AWTs can be defined into the
pool. Expedited message types will not be subject to flow control caused by standard new work
and will receive priority over standard new work in the AWT queue.
Consider follow-up correlation analysis to determine what changes, if any, to the affected state's working values should be considered, such as reserving AWTs for tactical WDs by selecting the expedited option for your tactical AGs.
Further, consider a state change with the unique working values described above as a legitimate option
when flow control is persistently detected. This is due to the potential seriousness associated with the
loss of priority control, and it is appropriate to act automatically to resolve the situation ASAP.
As an example, consider that a single Teradata system may be part of an enterprise of systems that may
include multiple Teradata systems cooperating in a dual-active role, various application servers and
source systems. When one of these other systems in the enterprise is degraded or down, it may in turn
affect anticipated demand on the Teradata system. An external application can convey this information by
means of a well-known user-defined event via open APIs to Teradata. Teradata can then act
automatically, for example, by changing the system condition and therefore the state, and employ
different workload management directives appropriate to the situation.
The situations tend to boil down to either an increase or decrease in user demand. Via the state matrix directives, you may choose to disable filters and raise throttles of lower priority work in times of anticipated lower user demand, and do the opposite in times of anticipated higher user demand.
2. To convey business-oriented events
Many businesses have events that impact the way a Teradata system should manage its workloads. For
example, there are business calendars, where daily, weekly, monthly, quarterly or annual information
processing increases or changes the demand put on the Teradata System. While period event types
provide alignment of a fixed period of time to some of these business events, user-defined events provide
the opportunity to de-couple the events from fixed windows of time that often do not align accurately to
the actual business event timing.
For example, through the use of a period event defined as 6PM till 6AM daily, you could define an event
combination that changes the Operating Environment to LoadWindow when the clock ticked 6PM.
However the actual source data required to begin the load might be delayed, and therefore the actual
load may not begin for several hours. Also, it is typical to define the period event to encompass far more
hours than the actual business situation will require just to compensate for these frequently experienced
delays. Even then, sometimes the delays are so severe that the period transpires while the load is still
executing, leading to workload management issues.
However, instead of using a period event, you could define a user-defined event called Loading. The load application could activate the event via an OpenAPI call prior to the load commencing, and deactivate it upon completion. The end result is that workload management is accurately adjusted for the complete duration of the actual load processing, and not shorter or longer than that duration.
Note that period events are not capable of operating on a business calendar that, for example, includes holidays, end-of-quarter dates, etc. However, these can be conveyed to the Teradata System through user-defined events.
The current version of TDWM provides many opportunities to automate based on system-wide events,
and we anticipate that subsequent releases of TDWM will continue to enhance those capabilities through
the addition of new event types. However, until those new event types are available, an external
application, through the use of PM/API and OpenAPI commands or other means, can monitor the
Teradata System for key situations that are useful to act on. Some example key situations that an
external application might monitor for include a persistent miss of a critical WD's SLG (such as a tactical workload or a heartbeat monitoring workload), persistent high or low CPU usage, arrival rate surges and
throttle queue depths associated with a workload. Or the external application could provide even more
complex correlated analysis on the key situations observed to derive more specific knowledge.
Once detected through the use of the external application, the event can be conveyed to Teradata in the
form of a user-defined event that can be included in an event combination with actions, for example, to
change the System Condition and therefore the State of the system. (Generally utilizing an action type of
notification has limited value-add here because the external application could have provided that
notification directly without involving TDWM. The real value is in automatically invoking a more
appropriate state associated with the detected event.)
8.3
Click on the New button to add a new Global Exception named BadQuery with an optional Description, and then click OK. Screenshot below:
Define one of the available Exception Criteria and the Exception Actions to take effect. Screenshot below:
After defining Exception Criteria and Exception Actions, to apply operating environments to your new
exception directive, select Apply. The Exception Apply dialog box displays with the operating
environments you defined, Screenshot below:
Select the WDs to which you want each operating environment to apply. You can select one or several
WDs, or you can select ALL WDs. Then select OK, Screenshot below:
Select Overview to view the operating environments, workloads, and exceptions you applied to the
exception directive, Screenshot below:
Create another Exception named BadIO with its Exception Criteria and Exception Actions. Screenshot below:
Teradata DWM follows these guidelines to resolve conflicting exception actions when necessary:
Teradata DWM orders local and global exception actions according to their defined precedence for resolving situations similar to the following cases:
If you did not specify Abort and Log or Abort on Select and Log, and you specified multiple
global Change Workload exception actions, the global Change Workload exception action with
highest precedence occurs. Teradata DWM logs all other Change Workload exception actions
as overridden.
If you did not specify Abort and Log or Abort on Select and Log, and you specified multiple
local Change Workload exception actions, the local Change Workload exception action with
highest precedence occurs. Teradata DWM logs all other Change Workload exception actions
as overridden.
If you did not specify Abort and Log or Abort on Select and Log, and you specified multiple
global and local Change Workload exception actions, the local Change Workload exception
action with highest precedence occurs, since local exception actions take precedence over global
exception actions. Teradata DWM logs all other Change Workload exception actions as
overridden.
Aborts take precedence over any Change Workload exception actions. If you specified Abort
and Log or Abort on Select and Log, and you specified multiple global and local Change
Workload exception actions, Teradata DWM aborts the query and logs all Change Workload
exception actions as overridden.
8.4
Utility Management
The Utility Management feature provides the ability to ensure that utilities do not impact higher priority system work; utilities can be controlled when the system state changes, or prioritized when deemed necessary (for instance, during a batch window).
Utility Management helps in Capacity Planning and system utilization reporting by enabling better
management of mixed workloads to allow critical work to complete.
In TD6.1:
In previous versions the Load Utility Rule directly rejected jobs, but in TD12.0 there is an option to delay the load when the load concurrency limit is reached.
In TD12.0:
TD12.0 extends its Utility management from load and export control to include backup and recovery jobs
as well. Utility type Archive/Restore option has been added to Utility Throttles.
Delay Option to Utility Throttle is provided for queuing of jobs exceeding the threshold instead of directly
rejecting.
9 Usability Features
9.1
Teradata Database 12.0 provides complex error handling capabilities during bulk SQL Insert operations
(MERGE-INTO or INSERT-SELECT) through the use of new SQL-based error tables.
Errors such as duplicate row, CHECK constraints, and LOB data truncations arising from a bulk insert
operation are logged in an error table while the bulk insert operation continues to run instead of aborting.
This feature increases the flexibility and opportunity in developing load strategies by allowing SQL to be
used for batch updates that contain errors. It also provides error reporting similar to current load utilities
while overcoming current load utility restrictions on having Unique Secondary Indexes (USIs), Join or
Hash indexes, Referential Indexes (RIs) and triggers resident on target tables.
In TD 6.1:
Teradata Database 6.1 doesn't provide any error table support for SQL Insert operations (MERGE-INTO or INSERT-SELECT).
In TD 12.0:
The following scripts create the table structure:
Scripts.txt
If the query ID is not saved or captured, it may be extracted from DBC.DBQLogTbl if DBQL is enabled.
Select querytext, starttime, queryid (format '-z (17)9') from dbc.dbqlogtbl where username='myusername'
order by 1;
If the query ID is not available because the query output is not saved and DBQL is disabled, then the
ETC_TimeStamp value in the error table may be used to associate error rows with the approximate times
of different loads.
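The flow above can be sketched end to end. This is a minimal sketch; the staging table name is an assumption, and the error table is created with its default name:

```sql
/* Create the default error table (ET_t1) for the target table */
CREATE ERROR TABLE FOR test.t1;

/* Bulk insert that logs error rows instead of aborting */
INSERT INTO test.t1
SELECT * FROM test.t1_stage        /* hypothetical staging table */
LOGGING ERRORS WITH NO LIMIT;

/* Inspect the logged errors, including the timestamp discussed above */
SELECT ETC_ErrorCode, ETC_TimeStamp FROM test.ET_t1;
```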
Example:
DROP ERROR TABLE FOR test.t1;
SHOW and HELP
Error table structure and column information may be displayed with the following requests respectively:
SHOW ERROR TABLE FOR <data table>;
SHOW TABLE <error table>;
HELP ERROR TABLE FOR <data table>;
HELP TABLE <error table>;
Example:
Help database test
HELP TABLE et1;
9.2
PARTITION BY
RANGE_N(claim_date BETWEEN DATE '2000-01-01'
AND DATE '2000-12-31'
EACH INTERVAL '1' MONTH);
Successful Message:
In TD6.1:
Create Table script (MPPI):
CREATE TABLE claims
(claim_id INTEGER NOT NULL,
claim_date DATE NOT NULL,
state_id integer NOT NULL,
claim_info VARCHAR(200) NOT NULL)
PRIMARY INDEX (claim_id)
PARTITION BY (
RANGE_N(claim_date BETWEEN DATE '2000-01-01'
AND DATE '2000-12-31'
Screenshot/Output:
Error Message:
In TD12.0:
Success Message:
Internally, these range partitions are combined into a single partitioning expression that defines how the data is partitioned on the AMP.
If only one partitioning expression is specified, that PPI is called a single-level partitioned primary
index (or single level PPI).
If more than one partitioning expression is specified, that PPI is called a multi-level partitioned
primary index (or multi-level PPI).
For PPI tables, the rows continue to be distributed across the AMPs in the same fashion, but on
each AMP the rows are ordered first by partition number and then within each partition by hash.
In a ML-PPI table, any single partition or any number or combination of the partitions may be
referenced and used for partition elimination.
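A two-level version of the claims table above can be sketched as follows. The state_id range (1 to 10, giving 12 x 10 = 120 combined partitions, matching the partition counts worked through below) is an assumption for illustration:

```sql
CREATE TABLE claims
 (claim_id   INTEGER NOT NULL,
  claim_date DATE NOT NULL,
  state_id   INTEGER NOT NULL,
  claim_info VARCHAR(200) NOT NULL)
PRIMARY INDEX (claim_id)
PARTITION BY (
  RANGE_N(claim_date BETWEEN DATE '2000-01-01' AND DATE '2000-12-31'
          EACH INTERVAL '1' MONTH),          /* level 1: 12 partitions */
  RANGE_N(state_id BETWEEN 1 AND 10 EACH 1)  /* level 2: 10 partitions */
);
```

Queries that constrain claim_date, state_id, or both can then eliminate partitions at either level, as the EXPLAIN outputs below illustrate.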
Insert.txt
Explain plan:
Explain
SELECT * FROM claims
WHERE claim_date BETWEEN '2000/01/01' AND '2000/12/30';
12 months * 10 StateIds = 120 Partitions
Explanation
1)
2)
3)
4)
Finally, we send out an END TRANSACTION step to all AMPs involved in processing the
request.
->
The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated
time is 0.03 seconds.
Explain
SELECT * FROM claims
WHERE state_id BETWEEN 1 AND 2;
36 months * 2 StateIds = 72 Partitions
Explanation
1)
First, we lock a distinct TD12."pseudo table" for read on a RowHash to prevent global deadlock for
TD12.claims.
2)
3)
4)
Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.
->
The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated
time is 0.03 seconds.
Explain
SELECT * FROM claims
WHERE state_id = 1 AND claim_date BETWEEN '2000/01/01' AND '2000/12/30'
order by 2;
12 months * 1 StateId = 12 Partitions
Explanation
1)
First, we lock a distinct TD12."pseudo table" for read on a RowHash to prevent global deadlock for
TD12.claims.
2)
3)
4)
Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.
->
The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated
time is 0.03 seconds.
Explain
SELECT * FROM claims
WHERE state_id = 1 AND claim_date BETWEEN '2000/01/01'
AND '2000/01/20';
1 month * 1 StateId = 1 Partition
Explanation
1)
First, we lock a distinct USER01."pseudo table" for read on a RowHash to prevent global deadlock
for USER01.claims.
2)
3)
4)
Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.
->
The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated
time is 0.03 seconds.
Advantage:
Using ML-PPI on a table gives the Teradata Optimizer more opportunities for partition elimination
at a more granular level, which in turn improves query performance.
C:\Documents and Settings\vb185032\Desktop\MLPPI Create Script.txt
C:\Documents and Settings\vb185032\Desktop\NPPI Create Script.txt
C:\Documents and Settings\vb185032\Desktop\MLPPI Insert.txt
C:\Documents and Settings\vb185032\Desktop\NPPI Insert Script.txt
Explain
SELECT * FROM Claim_MLPPI A, Claim_NPPI B
WHERE A.Claim_ID = B.Claim_ID
AND A.State_ID = B.State_ID;
Optimizer is using SLIDING-WINDOW Merge Join Technique
Explanation
1)
First, we lock a distinct AU."pseudo table" for read on a RowHash to prevent global deadlock
for AU.b.
2)
Next, we lock a distinct AU."pseudo table" for read on a RowHash to prevent global deadlock
for AU.a.
3)
4)
We do an all-AMPs JOIN step from AU.b by way of a RowHash match scan with no residual
conditions, which is joined to AU.a by way of a RowHash match scan with no residual conditions.
AU.b and AU.a are joined using a sliding-window merge join (contexts = 1, 7), with a join
condition of ("(AU.a.claim_id = AU.b.claim_id) AND (AU.a.state_id = AU.b.state_id)"). The result
goes into Spool 1 (group_amps), which is built locally on the AMPs. The size of Spool 1 is estimated
with low confidence to be 2 rows (362 bytes). The estimated time for this step is 0.05 seconds.
5)
Finally, we send out an END TRANSACTION step to all AMPs involved in processing the
request.
-> The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated
time is 0.05 seconds.
Case Study 2:
Join on MLPPI Low Level Partition Column with NPPI Column:
The Optimizer used a merge join when joining an MLPPI low-level partition column with an NPPI table column.
Explain
SELECT * FROM Claim_MLPPI A, Claim_NPPI B
WHERE A.State_ID = B.State_ID;
Explanation
1)
First, we lock a distinct AU."pseudo table" for read on a RowHash to prevent global deadlock
for AU.B.
2)
Next, we lock a distinct AU."pseudo table" for read on a RowHash to prevent global deadlock
for AU.A.
3)
4)
Optimizer is using
Normal Merge Join
1)
We do an all-AMPs RETRIEVE step from AU.B by way of an all-rows scan with no residual
conditions into Spool 2 (all_amps), which is redistributed by the hash code of
(AU.B.state_id) to all AMPs. Then we do a SORT to order Spool 2 by row hash. The size of
Spool 2 is estimated with low confidence to be 2 rows (186 bytes). The estimated time for
this step is 0.01 seconds.
2)
We do an all-AMPs RETRIEVE step from AU.A by way of an all-rows scan with no
residual conditions into Spool 3 (all_amps), which is redistributed by the hash code of
(AU.A.state_id) to all AMPs. Then we do a SORT to order Spool 3 by row hash. The
size of Spool 3 is estimated with low confidence to be 2 rows (186 bytes). The estimated
time for this step is 0.01 seconds.
5)
We do an all-AMPs JOIN step from Spool 2 (Last Use) by way of a RowHash match scan,
which is joined to Spool 3 (Last Use) by way of a RowHash match scan. Spool 2 and Spool 3 are
joined using a merge join, with a join condition of ("state_id = state_id"). The result goes into Spool
1 (group_amps), which is built locally on the AMPs. The size of Spool 1 is estimated with no
confidence to be 3 rows (543 bytes). The estimated time for this step is 0.06 seconds.
6)
Finally, we send out an END TRANSACTION step to all AMPs involved in processing the
request.
-> The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated
time is 0.07 seconds.
C:\Documents and Settings\vb185032\Desktop\MLPPI_2 Create Script.txt
C:\Documents and Settings\vb185032\Desktop\MLPPI_2 Insert Scrpt.txt
Explain
SELECT * FROM Claim_MLPPI A, Claim_MLPPI_2 B
WHERE A.Claim_ID = B.Claim_ID AND A.State_ID = B.State_ID;
Optimizer is using SLIDING-WINDOW Merge Join Technique
Explanation
1)
First, we lock a distinct AU."pseudo table" for read on a RowHash to prevent global deadlock for
AU.b.
2)
Next, we lock a distinct AU."pseudo table" for read on a RowHash to prevent global deadlock for
AU.a.
3)
4)
We do an all-AMPs RETRIEVE step from AU.b by way of an all-rows scan with no residual
conditions into Spool 2 (all_amps), which is built locally on the AMPs. Then we do a SORT to order
Spool 2 by the hash code of (AU.b.claim_id). The size of Spool 2 is estimated with low confidence
to be 806 rows (74,958 bytes). The estimated time for this step is 0.01 seconds.
5)
We do an all-AMPs JOIN step from AU.a by way of a RowHash match scan with no residual
conditions, which is joined to Spool 2 (Last Use) by way of a RowHash match scan. AU.a and Spool
2 are joined using a sliding-window merge join (contexts = 7, 1), with a join Condition of
("(AU.a.claim_id = claim_id) AND (AU.a.state_id = state_id)"). The result goes into Spool 1
(group_amps), which is built locally on the AMPs. The size of Spool 1 is estimated with low
confidence to be 3 rows (543 bytes). The estimated time for this step is 0.06 seconds.
6)
Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.
The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated
time is 0.07 seconds.
Case Study 2:
Join on MLPPI Low Level Partition Columns:
The Optimizer used a merge join when joining an MLPPI low-level partition column with another MLPPI
low-level partition column.
Note: Used only Low Level Partition columns in the Join Condition.
Explain
SELECT * FROM Claim_MLPPI A, Claim_MLPPI_2 B
WHERE A.State_ID = B.State_ID;
Explanation
1)
First, we lock a distinct AU."pseudo table" for read on a RowHash to prevent global deadlock for
AU.B.
2)
Next, we lock a distinct AU."pseudo table" for read on a RowHash to prevent global deadlock for
AU.A.
3)
4)
Optimizer is using
Normal Merge Join
1)
We do an all-AMPs RETRIEVE step from AU.B by way of an all-rows scan with no residual
conditions into Spool 2 (all_amps), which is built locally on the AMPs. Then we do a SORT
to order Spool 2 by the hash code of (AU.B.state_id). The size of Spool 2 is estimated with
low confidence to be 806 rows (74,958 bytes). The estimated time for this step is 0.01
seconds.
2)
We do an all-AMPs RETRIEVE step from AU.A by way of an all-rows scan with no residual
conditions into Spool 3 (all_amps), which is duplicated on all AMPs. Then we do a SORT
to order Spool 3 by the hash code of (AU.A.state_id). The size of Spool 3 is estimated with
low confidence to be 4 rows (372 bytes). The estimated time for this step is 0.01 seconds.
5)
We do an all-AMPs JOIN step from Spool 2 (Last Use) by way of a RowHash match scan, which is
joined to Spool 3 (Last Use) by way of a RowHash match scan. Spool 2 and Spool 3 are joined
using a merge join, with a join condition of ("state_id = state_id"). The result goes into Spool 1
(group_amps), which is built locally on the AMPs. The size of Spool 1 is estimated with no
confidence to be 57 rows (10,317 bytes). The estimated time for this step is
0.06 seconds.
6)
Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.
->
The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated
time is 0.07 seconds.
9.3
Schmon Enhancements
In TD6.1:
In TD12.0:
In TD6.1:
The delay modifier option is not supported in TD6.1 (shown in the screenshot below).
In TD12.0:
The command works as shown in the screenshot below.
Note: In the above screenshot, the command should execute 5 times with a delay of 1 second, but we
see that it takes the delay as 5 seconds, because 5 seconds is the minimum delay that can be specified.
The following command does not repeat 2 times with a delay of 5 seconds because it is not preceded by
one or more of the id, all, -S, -T, or -P options. Instead, it will repeat schmon -s 5 indefinitely with a delay
of 5 (because the minimum delay allowed is 5 seconds). Here the 5 is not viewed as the delay option, but
rather as the <id> option.
Therefore, the following command will output data related to session id 5, repeat forever, and delay 5
seconds between repetitions. A warning is displayed to tell the user that an invalid interval was entered.
schmon -s 5 2:
In TD6.1:
We can see in the above screenshot that it gives the message "Invalid Set Division type".
In TD12.0:
9.4
EXPLAIN Enhancements
Teradata Database 12.0 adds information to the EXPLAIN output, including cost estimates, spool size
estimates, view names, and actual column names for hashing, sorting, or grouping columns.
These enhancements improve readability and understanding, aid in debugging complex queries, and
help identify intermediate result spool skewing.
1)
Adding Spool Size Estimates: In TD6.1, most steps that generate a spool include an estimate
of the number of rows the spool contains, but not its size in bytes. In TD12.0, the spool size in bytes
is printed alongside the row estimate.
In TD 6.1:
Explain select * from retail.contract;
Explanation
1)
First, we lock a distinct retail."pseudo table" for read on a RowHash to prevent global
deadlock for retail.contract.
2)
3)
We do an all-AMPs RETRIEVE step from retail.contract by way of an all-rows scan with no
residual conditions into Spool 1 (group_amps), which is built locally on the AMPs. The size of Spool
1 is estimated with high confidence to be 15,000 rows. The estimated time for this step is 0.23
seconds.
4)
Finally, we send out an END TRANSACTION step to all AMPs involved in processing the
request.
-> The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated
time is 0.23 seconds.
In TD 12.0:
Explain select * from retail.contract;
Explanation
1)
First, we lock a distinct retail."pseudo table" for read on a RowHash to prevent global
deadlock for retail.contract.
2)
3)
We do an all-AMPs RETRIEVE step from retail.contract by way of an all-rows scan with no
residual conditions into Spool 1 (group_amps), which is built locally on the AMPs. The size of Spool
1 is estimated with high confidence to be 15,000 rows (1,320,000 bytes). The estimated time for this
step is 0.24 seconds.
4)
Finally, we send out an END TRANSACTION step to all AMPs involved in processing the
request.
-> The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated
time is 0.24 seconds.
2)
View Names:
Prints the view name with the table name once in every step. If a view is in a different database than
the table, its database name is also printed.
Create view emp_v as
Select * from employee;
In TD 6.1:
Explain select * from emp_v;
Explanation
1)
First, we lock a distinct CUSTOMER_SERVICE."pseudo table" for read on a RowHash to
prevent global deadlock for CUSTOMER_SERVICE.employee.
2)
3)
We do an all-AMPs RETRIEVE step from CUSTOMER_SERVICE.employee by way of an
all-rows scan with no residual conditions into Spool 1 (group_amps), which is built locally on the AMPs.
The size of Spool 1 is estimated with high confidence to be 26 rows. The estimated time for this step
is 0.03 seconds.
4)
Finally, we send out an END TRANSACTION step to all AMPs involved in processing the
request.
-> The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated
time is 0.03 seconds.
In TD 12.0:
Explain select * from emp_v;
Explanation
1)
First, we lock a distinct CUSTOMER_SERVICE."pseudo table" for read on a RowHash to
prevent global deadlock for CUSTOMER_SERVICE.employee.
2)
3)
We do an all-AMPs RETRIEVE step from CUSTOMER_SERVICE.employee in view emp_v
by way of an all-rows scan with no residual conditions into Spool 2 (group_amps), which is built
locally on the AMPs.
The size of Spool 2 is estimated with low confidence to be 24 rows (2,040 bytes). The estimated time
for this step is 0.03 seconds.
4) Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.
-> The contents of Spool 2 are sent back to the user as the result of statement 1. The total estimated
time is 0.03 seconds.
3) Hashing/Sorting/Grouping Columns
For grouping columns, hashing columns, and sorting columns, TD12.0 traces back their sources and
prints the original base-table fields.
In TD 6.1:
Explain
Select TRANS_number, sum(TRANS_AMOUNT) from temp_trans
Where TRANS_number < 1
Group by TRANS_number
Order by 1;
Explanation
1)
First, we lock a distinct USER01."pseudo table" for read on a RowHash to prevent global deadlock
for USER01.temp_trans.
2)
3)
4)
We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by way of an all-rows scan into Spool 1
(group_amps), which is built locally on the AMPs. Then we do a SORT to order Spool 1 by the sort
key in spool field1. The size of Spool 1 is estimated with no confidence to be 11,090 rows. The
estimated time for this step is 0.09 seconds.
5)
Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.
->
The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated
time is 0.78 seconds.
In TD 12.0:
Explain
Select TRANS_number, sum(TRANS_AMOUNT) from temp_trans
Where TRANS_number < 1
Group by TRANS_number
Order by 1;
Explanation
1)
2)
3)
We do an all-AMPs SUM step to aggregate from temporary table
CUSTOMER_SERVICE.temp_trans by way of an all-rows scan with a condition of
("CUSTOMER_SERVICE.temp_trans.TRANS_NUMBER < 1") grouping by field1
(CUSTOMER_SERVICE.temp_trans.TRANS_NUMBER). Aggregate Intermediate Results are
computed locally, then placed in Spool 3. The size of Spool 3 is estimated with no confidence to
be 1 row (29 bytes). The estimated time for this step is 0.03 seconds.
4)
We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by way of an all-rows scan into Spool
1 (group_amps), which is built locally on the AMPs. Then we do a SORT to order Spool 1 by the
sort key in spool field1 (CUSTOMER_SERVICE.temp_trans.TRANS_NUMBER). The size of
Spool 1 is estimated with no confidence to be 1 row (33 bytes). The estimated time for this step is
0.04 seconds.
5)
Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.
->
The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated
time is 0.07 seconds.
In TD 6.1:
Explain
Insert into emp1
Select * from employee;
Explanation
1)
First, we lock a distinct USER01."pseudo table" for write on a RowHash to prevent global deadlock
for USER01.emp1.
2)
Next, we lock a distinct USER01."pseudo table" for read on a RowHash to prevent global deadlock
for USER01.employee.
3)
4)
5)
6)
Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.
->
In TD 12.0:
Explain
Insert into emp1
Select * from employee;
Explanation
1)
First, we lock a distinct TEST."pseudo table" for write on a RowHash to prevent global deadlock for
TEST.emp1.
2)
Next, we lock a distinct user01."pseudo table" for read on a RowHash to prevent global deadlock
for user01.employee.
3)
4)
We do an all-AMPs MERGE into TEST.emp1 from user01.employee. The size is estimated with
no confidence to be 28 rows. The estimated time for this step is 1.92 seconds.
5)
6)
Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.
9.5
Identifying Join Index Base Tables
Prior to Teradata Database 12.0, it was not possible to use SQL queries to determine the base tables a
join index covers. A new column, JoinIndexTableID, has been added to DBC.Indexes in TD12.0 (it is
not present in TD6.1; compare the screenshots shown below).
Use this feature when you want to determine the tables a join or hash index covers.
In TD6.1:
In TD12.0:
The screenshot below shows the JoinIndexTableID for the join index created.
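With the new column in place, a dictionary query can replace the screenshot-based check. The following is only a sketch: the standard DBC.Indexes view columns are assumed, as is the convention that JoinIndexTableID is NULL for rows that do not belong to a join or hash index.

```sql
-- Sketch: list index rows associated with a join or hash index, assuming
-- JoinIndexTableID is populated only for such indexes (NULL otherwise).
SELECT DataBaseName, TableName, IndexNumber, JoinIndexTableID
FROM   DBC.Indexes
WHERE  JoinIndexTableID IS NOT NULL;
```

The returned JoinIndexTableID values can then be matched against the table identifiers of the covered base tables.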
9.6
Identifying Global Temporary Tables
Prior to Teradata Database 12.0, to get a list of all global temporary tables, you would have to get a list of
all the databases and then execute a HELP DATABASE for each database.
In order to provide an efficient way to obtain a list of all global temporary tables, the CommitOpt and
TransLog columns from DBC.TVM are now included in DBC.Tables, DBC.TablesX, DBC.TablesV, and
DBC.TablesVX.
Please find the attached documents, which show the addition of the two columns in DBC.Tables,
DBC.TablesX, DBC.TablesV, and DBC.TablesVX in TD12.0 (the columns are not present in TD6.1).
GTT_TD61.doc
GTT_TD12.doc
Note: The highlighted columns are the ones added in TD12.0 to the above-mentioned views.
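With these columns exposed, a single query can now list all global temporary tables. The sketch below assumes the common encoding in which CommitOpt is 'D' (ON COMMIT DELETE ROWS) or 'P' (ON COMMIT PRESERVE ROWS) for global temporary tables; verify the codes against the Data Dictionary reference for your release.

```sql
-- Sketch: list all global temporary tables in one query, replacing the
-- per-database HELP DATABASE approach needed before TD12.0.
SELECT DataBaseName, TableName, CommitOpt, TransLog
FROM   DBC.TablesV
WHERE  CommitOpt IN ('D', 'P')   -- assumed GTT commit-option codes
ORDER BY DataBaseName, TableName;
```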
9.7
ANSI Merge
Apart from SELECT requests, typical SQL DML operations involve INSERTs, UPDATEs, and DELETEs.
ANSI devised a new SQL statement, MERGE INTO, as part of the SQL-2003 standard. MERGE INTO
can perform UPDATEs and INSERTs together in a single statement. A rudimentary form of MERGE
INTO was implemented in the Teradata Database V2R5.0 release. As part of the 12.0 release, the
MERGE INTO statement has been enhanced to remove some of the restrictions imposed by the
Teradata Database V2R5.0 MERGE INTO statement. This section describes the enhancements made
to the MERGE INTO statement. It also discusses the restrictions, performance implications, and
considerations that apply when using the enhanced MERGE INTO statement with complex error
handling.
The Teradata 12.0 MERGE INTO statement offers the following enhancements to the Teradata Database
V2R5.0 MERGE INTO statement:
1) Allows multiple source rows to be merged into the target table, unlike the Teradata Database
V2R5.0 MERGE statement, which enforced the restriction that the source table could not have more
than one row. Consequently, if the source happens to be a single table, it is not necessary for it to
have a UPI or USI defined on it to be used in the context of a MERGE INTO statement.
2) Provides complex error handling support for the MERGE INTO statement.
1)
First, we lock a distinct USER01."pseudo table" for read on a RowHash to prevent global deadlock
for USER01.t2.
2)
Next, we lock a distinct USER01."pseudo table" for write on a RowHash to prevent global deadlock
for USER01.t1.
3)
4)
We do an all-AMPs RETRIEVE step from USER01.t2 by way of an all-rows scan with no residual
conditions into Spool 1 (all_amps), which is built locally on the AMPs. The size of Spool 1 is
estimated with low confidence to be 22 rows (550 bytes). The estimated time for this step is 0.01
seconds.
5)
We do an all-AMPs MERGE into USER01.t1 from Spool 1 (Last Use). The size is estimated with
low confidence to be 22 rows. The estimated time for this step is 0.23 seconds.
6)
7)
Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.
1)
First, we lock a distinct USER01."pseudo table" for read on a RowHash to prevent global deadlock
for USER01.t2.
2)
Next, we lock a distinct USER01."pseudo table" for write on a RowHash to prevent global deadlock
for USER01.t1.
3)
4)
We do an all-AMPs merge with unmatched inserts into USER01.t1 from USER01.t2 with a condition
of ("(1=0)"). The number of rows merged is estimated with low confidence to be 22 rows.
5)
6)
Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.
->
For example:
>explain insert into t1 sel a2, b2, c2 from t2, t1 where not (a1=a2);
Explanation
---------------------------------------------------------------------------
1)
First, we lock a distinct USER01."pseudo table" for read on a RowHash to prevent global deadlock
for USER01.t2.
2)
Next, we lock a distinct USER01."pseudo table" for write on a RowHash to prevent global deadlock
for USER01.t1.
3)
4)
We do an all-AMPs RETRIEVE step from USER01.t1 by way of an all-rows scan with no residual
conditions into Spool 2 (all_amps), which is duplicated on all AMPs. The size of Spool 2 is
estimated with low confidence to be 44 rows (748 bytes). The estimated time for this step is 0.03
seconds.
5)
We do an all-AMPs JOIN step from USER01.t2 by way of an all-rows scan with no residual
conditions, which is joined to Spool 2 (Last Use) by way of an all-rows scan. USER01.t2 and Spool
2 are joined using a product join, with a join condition of ("a1 <> USER01.t2.a2"). The result goes
into Spool 1 (all_amps), which is built locally on the AMPs. Then we do a SORT to order Spool 1
by the hash code of (USER01.t2.a2). The size of Spool 1 is estimated with no confidence to be 104
rows (2,600 bytes). The estimated time for this step is 0.03 seconds.
6)
We do an all-AMPs MERGE into USER01.t1 from Spool 1 (Last Use). The size is estimated with no
confidence to be 104 rows. The estimated time for this step is 0.23 seconds.
7)
8)
Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.
->
You can rewrite the same INSERT-SELECT request using ANSI MERGE, as shown below:
>explain merge into t1
Using t2
On a1=a2
When not matched then
INS (a2, b2, c2);
Explanation
---------------------------------------------------------------------------
1)
First, we lock a distinct USER01."pseudo table" for read on a Row Hash to prevent global deadlock
for USER01.t2.
2)
Next, we lock a distinct USER01."pseudo table" for write on a Row Hash to prevent global deadlock
for USER01.t1.
3)
4)
We do an all-AMPs merge with unmatched inserts into USER01.t1 from USER01.t2 with a condition
of ("USER01.t1.a1 = USER01.t2.a2"). The number of rows merged is estimated with low confidence
to be 22 rows.
5)
6)
Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.
->
As you can see, there is no spooling of the source table, and there is no separate join step.
Both operations are achieved by the single merge-with-unmatched-inserts step, and the request
runs much faster than the equivalent conditional INSERT-SELECT request.
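The complex error-handling support mentioned above is built on error tables and a LOGGING ERRORS clause. The sketch below combines a matched update with an unmatched insert; the target columns b1 and c1 are assumptions, and the exact clause syntax should be checked against the SQL reference for your release.

```sql
-- Hypothetical upsert with error logging on target t1 from source t2.
CREATE ERROR TABLE FOR t1;           -- creates a default error table for t1

MERGE INTO t1
USING t2
  ON a1 = a2
WHEN MATCHED THEN
  UPDATE SET b1 = b2, c1 = c2        -- b1/c1 are assumed target columns
WHEN NOT MATCHED THEN
  INSERT (a2, b2, c2)
LOGGING ALL ERRORS WITH NO LIMIT;    -- failing rows are logged, not aborted
```

Rows that violate constraints during the merge are captured in the error table instead of rolling back the whole request, which is what distinguishes the TD12.0 statement from its V2R5.0 predecessor.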