
Chapter 5: CLI/ODBC Programming

Overview
Thirteen percent (13%) of the DB2 UDB V8.1 Family Application Development exam (Exam 703) is designed to test your knowledge of ODBC and DB2 UDB's Call Level Interface and to test your ability to create a simple CLI/ODBC application. The questions that make up this portion of the exam are intended to evaluate the following:

- Your ability to identify the types of handles used by CLI/ODBC applications.
- Your ability to establish a connection to a data source within a CLI/ODBC application.
- Your ability to identify the correct sequence to use when calling CLI/ODBC functions.
- Your ability to configure the DB2 ODBC driver.
- Your ability to evaluate CLI/ODBC return codes and to obtain diagnostic information.
- Your ability to identify how cursors are created and opened by CLI/ODBC applications.

This chapter is designed to introduce you to CLI/ODBC programming and to walk you through the basic steps used to construct a CLI/ODBC application. This chapter is also designed to introduce you to many of the CLI/ODBC functions commonly used in application development.

Terms you will learn: Open Database Connectivity (ODBC), DB2 Call Level Interface (DB2 CLI), Application Programming Interface (API) functions, SQLAllocHandle(), SQLFreeHandle(), SQLConnect(), SQLDriverConnect(), SQLBrowseConnect(), SQLDisconnect(), SQLSetEnvAttr(), SQLSetConnectAttr(), SQLSetStmtAttr(), SQLPrepare(), SQLExecute(), SQLExecDirect(), SQLBindParameter(), SQLBindCol(), SQLFetch(), SQLEndTran(), Handle, Environment handle, SQL_HANDLE_ENV, Connection handle, SQL_HANDLE_DBC, Statement handle, SQL_HANDLE_STMT, Descriptor handle, SQL_HANDLE_DESC, Application Parameter Descriptors (APD), Implementation Parameter Descriptors (IPD), Application Row Descriptors (ARD), Implementation Row Descriptors (IRD), Attributes, SQLSTATE, Connection string, Parameter markers, Transaction, SQL_NTS, Environment attributes, Connection attributes, Statement attributes, Return code, SQL_SUCCESS, SQL_SUCCESS_WITH_INFO, SQL_NO_DATA (or SQL_NO_DATA_FOUND), SQL_INVALID_HANDLE, SQL_NEED_DATA, SQL_STILL_EXECUTING, SQL_ERROR

Techniques you will master:

- Understanding the difference between environment handles, connection handles, statement handles, and descriptor handles.
- Recognizing the order in which CLI/ODBC functions must be called.
- Knowing how to establish a data source connection from a CLI/ODBC application.
- Knowing how to configure the DB2 ODBC driver through environment, connection, and SQL statement attributes.
- Knowing how to evaluate function return codes and obtain diagnostic information.
- Recognizing how cursors are managed by CLI/ODBC applications.

An Introduction to ODBC and CLI


One of the biggest drawbacks to developing applications with embedded SQL is the lack of interoperability such applications afford: embedded SQL applications developed specifically for DB2 UDB must be modified (and in some cases completely rewritten) before they can interact with other relational database management systems. Because this limitation exists in every embedded SQL application, regardless of which relational database management system is used, the X/Open Company and the SQL Access Group (SAG), now a part of X/Open, jointly developed a standard specification for a callable SQL interface in the early 1990s. This interface was known as the X/Open Call-Level Interface, or X/Open CLI, and much of the X/Open CLI specification was later accepted as part of the ISO CLI international standard. The primary purpose of X/Open CLI was to increase the portability of database applications by allowing them to become independent of any one database management system's programming interface.

In 1992, Microsoft Corporation developed a callable SQL interface known as Open Database Connectivity (ODBC) for the Microsoft Windows operating system. ODBC is based on the X/Open CLI standards specification but provides extended functionality and capability that is not part of X/Open CLI. ODBC relies on an operating environment in which data source-specific ODBC drivers are dynamically loaded at application run time (based on information provided when a connection is requested) by a component known as the ODBC Driver Manager. Each data source-specific driver is responsible for implementing any or all of the ODBC functions defined in the ODBC specification and for providing interaction with the data source for which the driver was written (i.e., an ODBC driver provides ODBC functionality for a specific data source). The ODBC Driver Manager provides a central point of control; as an ODBC application executes, each ODBC function call it makes is sent to the ODBC Driver Manager, where it is forwarded to the appropriate data source driver for processing. By using drivers, an application can be linked directly to a single ODBC driver library rather than to each product-specific database library.

DB2's Call Level Interface (DB2 CLI) is based on the ISO CLI international standard and provides most of the functionality found in the ODBC specification. Applications that use DB2 CLI instead of ODBC are linked directly to the DB2 CLI load library, and any ODBC Driver Manager can load this library as an ODBC driver. DB2 UDB applications can also use the DB2 CLI load library independently; however, when the library is used in this manner, the application will not be able to communicate with other data sources. Figure 5-1 illustrates how applications use the DB2 CLI load library in an ODBC Driver Manager environment; Figure 5-2 illustrates how applications use the DB2 CLI load library in a DB2 CLI environment.

Figure 5-1: How applications use the DB2 CLI load library in an ODBC environment.

Figure 5-2: How applications use the DB2 CLI load library in a DB2 CLI environment.

Differences between Embedded SQL and CLI Applications


Where embedded SQL applications are constructed by embedding SQL statements directly into one or more source code files that are written using a high-level programming language, CLI/ODBC applications rely on a standardized set of Application Programming Interface (API) functions to execute SQL statements and perform related database operations. The API functions that can be used in a DB2 CLI/ODBC application are identified in Table 5-1.

Table 5-1: DB2 UDB CLI Functions

CLI/ODBC Resource Management

SQLAllocHandle() - Allocates resources for an environment, connection, SQL statement, or descriptor handle. (Detailed information about each of these handles will be provided a little later.)
SQLFreeHandle() - Releases all resources previously allocated for an environment, connection, SQL statement, or descriptor handle.

Data Source Connection Management

SQLConnect() - Establishes a connection to a specific data source using just an authentication (user) ID and a corresponding password.
SQLDriverConnect() - Establishes a connection to a specific data source or driver using a connection string (a string comprised of the parameters needed to establish a connection, such as data source name, user ID, and password), or causes a dialog to be displayed that allows the end user to provide connection information such as data source name, authentication ID, and password. (A connection is then established using the information provided.)
SQLBrowseConnect() - Returns successive connection attributes and corresponding valid values to the calling application. When values have been selected for each connection attribute returned, a connection to the appropriate data source is established.
SQLSetConnection()[*] - Sets the current active connection in applications that have established multiple concurrent data source connections and that contain embedded SQL statements along with CLI function calls.
SQLDisconnect() - Terminates a data source connection.

CLI/ODBC Driver Management

SQLDataSources() - Generates a list of data sources to which an application can establish a connection.
SQLGetInfo() - Obtains information about a specific ODBC driver and its associated data source.
SQLGetFunctions() - Obtains information about whether a specific CLI/ODBC function is supported by a driver (and its associated data source).
SQLGetTypeInfo() - Obtains information about the native data types that are supported by a driver (and its associated data source).
SQLGetEnvAttr() - Obtains the current value assigned to a specific environment attribute.
SQLSetEnvAttr() - Assigns a value to a specific environment attribute.
SQLGetConnectAttr() - Obtains the current value assigned to a specific connection attribute.
SQLSetConnectAttr() - Assigns a value to a specific connection attribute.
SQLGetStmtAttr() - Obtains the current value assigned to a specific SQL statement attribute.
SQLSetStmtAttr() - Assigns a value to a specific SQL statement attribute.

SQL Statement Processing

SQLPrepare() - Prepares an SQL statement for execution.
SQLExtendedPrepare()[*] - Prepares an SQL statement for execution and assigns values to one or more SQL statement attributes at the same time.
SQLBindParameter() - Associates a local application variable with a parameter marker in an SQL statement.
SQLExtendedBind()[*] - Associates an array of local application variables with parameter markers in an SQL statement.
SQLNumParams() - Obtains the number of parameter markers used in an SQL statement.
SQLDescribeParam() - Obtains information about a specific parameter marker in an SQL statement.
SQLExecute() - Executes a prepared SQL statement.
SQLExecDirect() - Prepares and executes an SQL statement.
SQLNativeSql() - Retrieves the text of an SQL statement containing vendor-specific SQL extensions (referred to as vendor escape clauses) after a data source driver has translated it.
SQLParamData() - Is used in conjunction with the SQLPutData() function to send long data values for data-at-execution parameters (parameters defined as SQL_DATA_AT_EXEC) to a data source for processing.
SQLPutData() - Is used in conjunction with the SQLParamData() function to send part or all of a long data value to a data source for processing.
SQLCancel() - Cancels SQL statement processing.
SQLEndTran() - Commits or rolls back the current transaction.
SQLFreeStmt() - Ends SQL statement processing, closes any associated cursor, discards pending results, and optionally frees all resources associated with a statement handle.

Query Results Retrieval

SQLDescribeCol() - Returns a set of commonly used descriptor information (column name, type, precision, scale, nullability) for a specific column in a result data set.
SQLNumResultCols() - Obtains information about the number of columns found in a result data set.
SQLColAttribute() - Obtains information about a specific column in a result data set.
SQLBindCol() - Associates a local application variable with a column in a result data set.
SQLGetCursorName() - Obtains the name assigned to the cursor (if any) associated with a statement handle.
SQLSetCursorName() - Assigns a name to the cursor (if any) associated with a statement handle.
SQLFetch() - Advances the cursor pointer to the next row in a result data set, retrieves a single row of data from a result data set, and copies any data retrieved to "bound" application variables.
SQLFetchScroll() - Advances the cursor pointer, retrieves multiple rows of data (a rowset) from a result data set, and copies any data retrieved to "bound" application variables.
SQLCloseCursor() - Closes the cursor (if any) associated with a statement handle.
SQLGetData() - Retrieves data from a single column (of the current row) of a result data set.
SQLMoreResults() - Determines whether more result data sets or row count values are available, and if there are, initializes processing for them.
SQLNextResult()[*] - Provides non-sequential access to multiple result data sets returned by a stored procedure.

Data Modification

SQLSetPos() - Positions a cursor within a fetched block of data and allows an application to refresh data in a rowset or update and delete data in a result data set.
SQLBulkOperations() - Performs bulk insertions and bookmark operations, including update, delete, and fetch-by-bookmark operations.
SQLRowCount() - Obtains information about the number of rows affected by the execution of insert, update, and delete operations.

Descriptor Management

SQLGetDescRec() - Obtains the current values of multiple fields of a descriptor record. (This information describes the name, data type, and storage sizes used by the data associated with a particular parameter marker or result data set column.)
SQLSetDescRec() - Assigns values to multiple fields of a descriptor record.
SQLGetDescField() - Obtains the current value assigned to a field in a descriptor header or a parameter/column record.
SQLSetDescField() - Assigns a value to a field in a descriptor header or a parameter/column record.
SQLCopyDesc() - Copies the contents of one descriptor to another.

Large Object Support

SQLBindFileToParam()[*] - Associates a large object file with a parameter marker in an SQL statement.
SQLBindFileToCol()[*] - Associates a large object file with a column in a result data set.
SQLGetLength()[*] - Obtains the length of a large object data value that is referenced by an LOB locator.
SQLGetPosition()[*] - Determines the starting position of a substring within a source string that is referenced by an LOB locator.
SQLGetSubString()[*] - Creates a new LOB locator that references a substring within a source string that is referenced by an LOB locator.

DataLink Support

SQLBuildDataLink()[*] - Builds a DataLink value from values provided.
SQLGetDataLinkAttr()[*] - Obtains the current value assigned to a specific attribute of a DataLink value.

System Table Processing

SQLTables() - Obtains a list of catalog names, schema names, table names, and/or table types that have been defined for a particular data source.
SQLTablePrivileges() - Obtains a list of table names, along with the authorization information associated with those tables, that have been defined for a particular data source.
SQLColumns() - Obtains a list of column names that have been defined for one or more tables.
SQLColumnPrivileges() - Obtains a list of column names, along with the authorization information associated with those columns, that have been defined for a particular table.
SQLSpecialColumns() - Obtains a list of the optimal set of columns that uniquely identify a row of data in a particular table.
SQLPrimaryKeys() - Obtains a list of column names that make up the primary key of a particular table.
SQLForeignKeys() - Obtains a list of column names that make up the foreign keys (if any) of a particular table or a list of foreign keys that refer to a particular table.
SQLProcedures() - Obtains a list of stored procedures that have been stored in and are available for use with a data source.
SQLProcedureColumns() - Obtains a list of input and output parameters associated with a stored procedure, return values for a particular stored procedure, and columns in the result data set produced by a stored procedure (if any).
SQLStatistics() - Obtains statistical information about a specific table and a list of associated indexes for that table.

Error Handling

SQLGetDiagRec() - Obtains error, warning, and/or status information from a diagnostic record that was generated by the last CLI/ODBC function executed.
SQLGetDiagField() - Obtains the current value assigned to a field in a diagnostic record header or a diagnostic record that was generated by the last CLI/ODBC function executed.
SQLGetSQLCA()[*] - Obtains SQLCA data structure information that is associated with the last SQL statement executed. (Although this function is available, it has been deprecated; the SQLGetDiagRec() and SQLGetDiagField() functions should be used to obtain this information instead.)

Adapted from Table 1 on pages 1-7 of the DB2 Call Level Interface Guide and Reference, Volume 2.

[*] Functions available only with DB2 CLI; these functions are not supported by ODBC.

Embedded SQL applications and CLI/ODBC applications also differ in the following ways:

- CLI/ODBC applications do not require the explicit declaration and use of host variables; any variable can be used to send data to or retrieve data from a data source.
- Cursors do not have to be explicitly declared by CLI/ODBC applications. Instead, cursors are automatically generated, if needed, whenever the SQLPrepare() or SQLExecDirect() function is executed.
- Cursors do not have to be explicitly opened in CLI/ODBC applications; cursors are implicitly opened as soon as they are generated.
- CLI/ODBC functions manage environment, connection, and SQL statement-related information using handles, which allows this information to be treated as an abstract object. The use of handles eliminates the need for CLI/ODBC applications to use database product-specific data structures such as DB2 UDB's SQLCA, SQLCHAR, and SQLDA data structures.
- Unlike embedded SQL applications, CLI/ODBC applications inherently have the ability to establish multiple connections to multiple data sources, or to the same data source, at the same time. (Embedded SQL applications can only connect to multiple data sources at the same time if Type 2 connections are used.)

Despite these differences, one important common concept exists between embedded SQL applications and CLI/ODBC applications: CLI/ODBC applications can execute (using a CLI/ODBC function) any SQL statement that can be dynamically prepared in an embedded SQL application. This is guaranteed because CLI/ODBC applications pass all of their SQL statements directly to the data source for dynamic execution. (CLI/ODBC applications can also execute some SQL statements that cannot be dynamically prepared, such as compound SQL statements, but for the most part, static SQL is not supported.)

By allowing the data source to process all SQL statements submitted by a CLI/ODBC application, the portability of CLI/ODBC applications is guaranteed. This is not always the case with embedded SQL applications, because the way SQL statements are dynamically prepared can vary with each relational database product used. Also, because COMMIT and ROLLBACK SQL statements can be dynamically prepared by some database products (including DB2 UDB) but not by others, they are typically not used in CLI/ODBC applications. Instead, CLI/ODBC applications rely on the SQLEndTran() function to terminate active transactions (when manual-commit mode is used). This ensures that CLI/ODBC applications can successfully end transactions, regardless of which database product is being used.

Parts of a CLI/ODBC Application


Now that we have identified some of the major differences between embedded SQL applications and CLI/ODBC applications, let's take a look at how CLI/ODBC applications are developed. All CLI/ODBC applications are constructed such that they perform three distinct tasks:

- Initialization
- Transaction Processing
- Termination

The actual work associated with these three tasks is conducted by invoking one or more CLI/ODBC functions. Furthermore, many of the CLI/ODBC functions used to carry out these tasks must be called in a specific order or an error will occur. Figure 5-3 illustrates some of the basic CLI/ODBC functions used to perform initialization and termination.

Figure 5-3: CLI/ODBC application initialization and termination.

Aside from performing these three basic tasks, CLI/ODBC applications can also perform additional tasks like error handling and message processing. (We will look at how errors are handled in a CLI/ODBC application a little later.)

Initialization
During initialization, resources that will be needed by the transaction processing task are allocated (and initialized), and connections to any data source(s) that the transaction processing task will interact with are established. Three basic steps are performed as part of the initialization process. They are:

- Resource allocation
- Application ODBC version declaration
- Data source connection management

Resource Allocation
The resources used by CLI/ODBC applications consist of special data storage areas that are identified by unique handles. A handle is simply a pointer variable that refers to a data object controlled by DB2 CLI or the ODBC Driver Manager and referenced by CLI/ODBC function calls. By using data storage areas and handles, CLI/ODBC applications are freed from the responsibility of having to allocate and manage global variables and/or data structures like the SQLCA and SQLDA data structures used with embedded SQL applications. Four types of handles are available:

- Environment handles
- Connection handles
- Statement handles
- Descriptor handles

Environment handle. An environment handle is a pointer to a data storage area that contains CLI/ODBC-specific information that is global in nature. This information includes:

- The current state of the environment.
- The current value of each environment attribute available.
- A handle for each connection data storage area currently associated with the environment handle.
- The number of connections currently available in the environment, along with their current states (i.e., "Connected" or "Disconnected").
- Diagnostic information about the current environment.

Every CLI application must begin by allocating an environment handle. Usually, only one environment handle is allocated per application, and that handle must exist before any other handles can be allocated; all other handles are managed within the context of the environment handle used. Environment handles are allocated by calling the SQLAllocHandle() function with the SQL_HANDLE_ENV option specified. Thus, the source code used to allocate an environment handle looks something like this:

SQLHANDLE EnvHandle = 0;

SQLAllocHandle(SQL_HANDLE_ENV, SQL_NULL_HANDLE, &EnvHandle);

Connection handle. A connection handle is a pointer to a data storage area that contains information about a data source (database) connection being managed by CLI/ODBC. This information includes:

- The current state of the connection being managed.
- The current value of each connection attribute available.
- A handle for every SQL statement data storage area associated with the connection handle.
- A handle for every descriptor data storage area associated with the connection handle.
- Diagnostic information about the connection being managed.

All connections to data sources are made via connection handles; therefore, a connection handle must exist before a connection to any data source can be established. Connection handles are allocated by calling the SQLAllocHandle() function with the SQL_HANDLE_DBC option and a valid environment handle specified. The source code used to allocate a connection handle looks something like this:

SQLHANDLE ConHandle = 0;

if (EnvHandle != 0)
    SQLAllocHandle(SQL_HANDLE_DBC, EnvHandle, &ConHandle);

Statement handle. A statement handle is a pointer to a data storage area that contains specific information about a single SQL statement. This information includes:

- The current state of the associated SQL statement.
- The current value of each SQL statement attribute available.
- The addresses of all application variables that have been bound to (associated with) any parameter markers used by the SQL statement.
- The addresses of all application variables that have been bound to (associated with) the columns of any result data set produced by the SQL statement.
- Diagnostic information about the SQL statement.

The statement handle is the real workhorse of CLI/ODBC. Statement handles are used to process each SQL statement contained in an application (both user-defined SQL statements and SQL statements that are performed behind the scenes when certain CLI/ODBC functions are called). Notably, statement handles are used to bind application variables to parameter markers used in a statement, prepare and submit a statement to the appropriate data source for execution, obtain metadata about any result data set(s) produced in response to a statement, bind application variables to columns found in any result data set(s) produced, retrieve (fetch) data from any result data set(s) produced, and obtain diagnostic information when a statement fails to execute.

Each SQL statement used in an application must have its own statement handle, and each statement handle used can be associated with only one connection handle. However, any number of statement handles can be associated with a single connection handle. Statement handles are allocated by calling the SQLAllocHandle() function with the SQL_HANDLE_STMT option and a valid connection handle specified. Thus, the source code used to allocate a statement handle looks something like this:

SQLHANDLE StmtHandle = 0;

if (ConHandle != 0)
    SQLAllocHandle(SQL_HANDLE_STMT, ConHandle, &StmtHandle);

Descriptor handle. A descriptor handle is a pointer to a data storage area that contains a collection of metadata that describes either the application variables that have been bound to parameter markers in an SQL statement or the application variables that have been bound to the columns of a result data set produced in response to a query. Four types of descriptors are recognized by CLI/ODBC:

- Application Parameter Descriptors (APD). Contain information about the application variables (buffers) that have been bound to parameter markers used in an SQL statement, such as their addresses, lengths, and high-level programming language data types.
- Implementation Parameter Descriptors (IPD). Contain information about the parameters used in an SQL statement, such as their SQL data types, lengths, and nullability characteristics.
- Application Row Descriptors (ARD). Contain information about the application variables (buffers) bound to the columns of a result data set, such as their addresses, lengths, and high-level programming language data types.
- Implementation Row Descriptors (IRD). Contain information about the columns in a result data set, such as their SQL data types, lengths, and nullability characteristics.

Four descriptor handles (one of each descriptor type) are automatically allocated and associated with a statement handle as part of the statement handle allocation process. Once allocated, these descriptor handles remain associated with the corresponding statement handle until it is destroyed (at which time the descriptor handles will also be destroyed). Descriptor handles can also be explicitly allocated and associated with a statement handle to fulfill the role of an APD or ARD descriptor that was implicitly allocated. Most CLI/ODBC operations can be performed using the descriptor handles that are implicitly defined. However, when used, explicitly defined descriptor handles can provide a convenient shortcut for performing some operations, as the sketch below illustrates.
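For example, here is a minimal sketch of explicit descriptor allocation. It assumes the ConHandle and StmtHandle variables allocated in the earlier snippets, and uses the SQL_ATTR_APP_ROW_DESC statement attribute to substitute the explicit descriptor for the implicitly allocated ARD:

SQLHANDLE DescHandle = 0;
SQLRETURN RetCode = SQL_SUCCESS;

// Explicitly Allocate A Descriptor Handle (A Valid Connection
// Handle Is Required)
RetCode = SQLAllocHandle(SQL_HANDLE_DESC, ConHandle, &DescHandle);

// Substitute The Explicit Descriptor For The Statement's
// Implicitly Allocated Application Row Descriptor (ARD)
if (RetCode == SQL_SUCCESS)
    RetCode = SQLSetStmtAttr(StmtHandle, SQL_ATTR_APP_ROW_DESC,
                             (SQLPOINTER) DescHandle, 0);

Because the same explicit descriptor can be associated with more than one statement handle, this technique lets several statements share one set of bindings.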

Handle management. As we have already seen, handles are initialized and their corresponding data storage areas are allocated by calling the SQLAllocHandle() function with the appropriate handle type and parent handle specified. (Handle types include SQL_HANDLE_ENV [environment], SQL_HANDLE_DBC [connection], SQL_HANDLE_STMT [SQL statement], and SQL_HANDLE_DESC [descriptor].) Once created, handles remain available for use by other CLI/ODBC functions until they are explicitly freed, either during transaction processing or as part of the termination task.

Declaring the Application CLI/ODBC Version


Both CLI and ODBC use product-specific drivers to communicate with data sources (databases), and most of these drivers contain a set of dynamic parameters that can be changed to alter the driver's behavior to meet an application's needs. These parameters are referred to as attributes, and every environment, connection, and statement handle allocated has its own set of attributes. (As you can see in Table 5-1, a set of CLI/ODBC functions is available that provides an application with the ability to retrieve and change the attributes associated with an environment, connection, or statement handle.)

One such environment attribute, SQL_ATTR_ODBC_VERSION, must be assigned the value SQL_OV_ODBC3 or SQL_OV_ODBC2 after an environment handle has been allocated, but before any corresponding connection handles are allocated. This tells DB2 CLI and/or the ODBC Driver Manager that the application intends to adhere to the CLI/ODBC 3.x (or later) or the CLI/ODBC 2.0 (or earlier) specification, respectively. It is important that DB2 CLI and/or the ODBC Driver Manager knows which specification an application has been coded for because many of the diagnostic values (SQLSTATEs) returned by CLI/ODBC functions vary from one version to the next. Additionally, later versions of DB2 CLI and ODBC allow wild cards to be used in some function parameters, but earlier versions do not.
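A minimal sketch of this declaration, assuming the EnvHandle allocated earlier, looks like this:

SQLRETURN RetCode = SQL_SUCCESS;

// Declare ODBC 3.x Behavior; This Must Be Done Before Any
// Connection Handles Are Allocated Against The Environment
RetCode = SQLSetEnvAttr(EnvHandle, SQL_ATTR_ODBC_VERSION,
                        (SQLPOINTER) SQL_OV_ODBC3,
                        SQL_IS_UINTEGER);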

Connecting to a Data Source


Once an environment handle and a corresponding connection handle have been allocated, the connection handle can be used to physically establish a connection to a data source. Three CLI/ODBC functions can be used to establish a data source connection:

- SQLConnect()
- SQLDriverConnect()
- SQLBrowseConnect()

Applications can use any combination of these functions to connect to any number of data sources.

Note: Some data sources limit the number of active connections they support. An application can find out how many active connections a particular data source supports by calling the SQLGetInfo() function with the SQL_MAX_DRIVER_CONNECTIONS information type specified.

The SQLConnect() function. The simplest CLI/ODBC connection function available, SQLConnect(), assumes that the only information needed to establish a connection is a data source name and, optionally, a user ID (authorization ID) and password. (Any other information needed is stored in either the [COMMON] section of the db2cli.ini file, the [ODBC] section of the ODBC.INI file, or the ODBC subkey in the system registry.) This function works well for applications that need to connect to data sources that require only a user ID and password and for applications that want to provide their own connection interface or require no user interface at all. (The SQLConnect() function is also the only connection function supported by both the X/Open 95 and the ISO/IEC 92 CLI standards.)

The SQLDriverConnect() function. The SQLDriverConnect() function allows an application to send connection information to a data source driver using a connection string (as opposed to storing this information in the db2cli.ini file, the ODBC.INI file, or the system registry and allowing the driver to retrieve it). A connection string is simply a series of keyword/value pairs, separated by semicolons, that contains the information to be used to establish a connection to a data source. Some of the more common keyword/value pairs used are identified in Table 5-2.

Table 5-2: Common Keyword/Value Pairs Used to Construct Connection Strings

DSN=DataSourceName - Specifies the name of a data source (as returned by the SQLDataSources() function) that a connection is to be established with.
UID=UserID - Specifies the user ID (authorization ID) of the user attempting to establish the connection.
PWD=Password - Specifies the password corresponding to the user ID (authorization ID) specified. If a password is not required for the specified user ID, an empty password string should be used (PWD=;).
NEWPWD=NewPassword - Specifies the new password that is to be assigned to the user ID (authorization ID) specified. If the NEWPWD keyword is used but no new password is provided (NEWPWD=;), the DB2 CLI driver will prompt the user to provide a new password.

Thus, an application that always connects to a database named "PAYROLL" using the authorization ID "db2admin" and the corresponding password "ibmdb2" could use a connection string with the SQLDriverConnect() function that looks something like this:

DSN=PAYROLL;UID=db2admin;PWD=ibmdb2;

When invoked, the SQLDriverConnect() function parses the connection string provided and, using the data source name specified, attempts to retrieve any additional information it needs to establish a connection (for example, the server's IP address) from the system. Using this information, the function then logs on to the appropriate server and attempts to connect to the designated data source.

Applications using the SQLDriverConnect() function can also let the driver prompt the user for any connection information needed. For example, when the SQLDriverConnect() function is called with an empty connection string, DB2 CLI will display the dialog shown in Figure 5-4.

Figure 5-4: The DB2 UDB SQLDriverConnect() user input dialog.

This dialog prompts the user to select a data source from a list of data sources recognized by DB2 CLI and to provide a user ID along with a corresponding password. Once this information has been provided, an appropriate connection string is constructed, and that string is used to establish a connection to the appropriate data source. Whether this dialog is displayed is determined by one of the parameter values passed to the SQLDriverConnect() function: if the function is called with the SQL_DRIVER_PROMPT, SQL_DRIVER_COMPLETE, or SQL_DRIVER_COMPLETE_REQUIRED option specified, the dialog will be displayed if the connection string provided does not contain enough information to establish a data source connection. On the other hand, if the function is called with the SQL_DRIVER_NOPROMPT option specified and more information is needed, an error will be generated.

The SQLBrowseConnect() function. Like the SQLDriverConnect() function, the SQLBrowseConnect() function uses a connection string to send connection information to a driver. However, unlike the SQLDriverConnect() function, the SQLBrowseConnect() function can be used to construct a complete connection string iteratively at application run time. This difference allows an application to:

- Use its own dialog to prompt users for connection information, thereby retaining control over its "look and feel."
- Search (browse) the system for data sources that can be used by a particular driver, possibly in several steps. For example, an application might first browse the network for servers and, after choosing a server, browse that server for databases that can be accessed by a specific driver.

Regardless of which function is used, when a connection is made to a data source by way of a connection handle, the driver used to communicate with the corresponding data source is loaded into memory, an array of pointers to the CLI/ODBC functions supported by that driver is stored in the data storage area that the connection handle points to, and the data source itself is made available to the application that established the connection. At that point, the connection handle is said to be in an "Active" or "Connected" state (prior to that point, the connection handle was in an "Allocated" state). Thereafter, each time a CLI/ODBC function is called with the connection handle specified, the corresponding function entry point is located in the array of function pointers associated with the connection handle, and the call is routed to the appropriate function within the driver, where it is executed.
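To tie these pieces together, here is a minimal sketch (assuming the ConHandle allocated earlier) that establishes a connection using the example PAYROLL connection string shown above; SQL_DRIVER_NOPROMPT suppresses the dialog:

SQLCHAR ConnStr[] = "DSN=PAYROLL;UID=db2admin;PWD=ibmdb2;";
SQLCHAR OutConnStr[512];
SQLSMALLINT OutConnStrLen = 0;
SQLRETURN RetCode = SQL_SUCCESS;

// Establish The Connection; With SQL_DRIVER_NOPROMPT, Missing
// Connection Information Produces An Error Instead Of A Dialog
RetCode = SQLDriverConnect(ConHandle, NULL,
                           ConnStr, SQL_NTS,
                           OutConnStr, sizeof(OutConnStr),
                           &OutConnStrLen, SQL_DRIVER_NOPROMPT);

On return, OutConnStr contains the completed connection string that the driver actually used.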

Transaction Processing
The transaction processing task follows initialization and makes up the bulk of a CLI/ODBC application. This is where the SQL statements that query and/or manipulate data are passed to the appropriate data source (which in our case is typically a DB2 UDB database) by various CLI/ODBC function calls for processing. In the transaction processing task, an application performs the following five steps (in the order shown):

1. Allocates one or more statement handles
2. Prepares and executes one or more SQL statements
3. Retrieves and processes any results produced
4. Terminates the current transaction by committing it or rolling it back
5. Frees all statement handles allocated

Figure 5-5 illustrates the basic steps that are performed during the transaction processing task and identifies the CLI/ODBC function calls that are typically used to execute each step.

Figure 5-5: Overview of a typical CLI/ODBC application's transaction processing task.

Allocating Statement Handles


As mentioned earlier, a statement handle refers to a data object that contains information about a single SQL statement. This information includes:

- The actual text of the SQL statement itself.
- Details about the cursor (if any) associated with the statement.
- Bindings for all SQL statement parameter marker variables.
- Bindings for all result data set column variables.
- The statement execution return code.
- Status information.

As we saw earlier, statement handles are allocated by calling the SQLAllocHandle() function with the SQL_HANDLE_STMT option and a valid connection handle specified. At a minimum, one statement handle must be allocated before any SQL statements can be executed by a CLI/ODBC application.

Preparing and Executing SQL Statements


Once a statement handle has been allocated, there are two ways in which SQL statements can be processed:

- Prepare then Execute. This approach separates the preparation of the SQL statement from its actual execution and is typically used when an SQL statement is to be executed repeatedly. This method is also used when an application needs advance information about the columns that will exist in the result data set produced when the SQL statement is executed. The CLI/ODBC functions SQLPrepare() and SQLExecute() are used to process SQL statements in this manner.
- Execute Direct. This approach combines the preparation and execution of an SQL statement into a single step and is typically used when an SQL statement is to be executed only once. This method is also used when the application does not need additional information about the result data set (if any) that will be produced when the SQL statement is executed. The CLI/ODBC function SQLExecDirect() is used to process SQL statements in this manner.

Both of these methods allow the use of parameter markers in place of constants and/or expressions in an SQL statement. Parameter markers are represented by the question mark (?) character, and they indicate the position in the SQL statement where the current value of one or more application variables is to be substituted when the statement is actually executed. When an application variable is associated with a specific parameter marker in an SQL statement, that variable is said to be "bound" to the parameter marker. Such binding is carried out by calling the SQLBindParameter() function, and once an application variable is bound to a parameter marker, the association with that variable remains in effect until it is overridden or until the corresponding statement handle is freed. Although binding can take place any time after an SQL statement has been prepared, data is not actually retrieved from a bound variable until the SQL statement that the application variable has been bound to is executed. By using bound variables, an application can execute a single SQL statement multiple times and obtain different results simply by modifying the contents of any bound variable between each execution.

The following pseudo-source code example, written in the C programming language, illustrates how an application variable would be bound to a parameter marker that has been coded in a simple SELECT SQL statement. It also illustrates the way in which a value would be provided for the bound parameter before the statement is executed:

...
// Define A SELECT SQL Statement That Uses A Parameter Marker
strcpy((char *) SQLStmt, "SELECT EMPNO, LASTNAME FROM ");
strcat((char *) SQLStmt, "EMPLOYEE WHERE JOB = ?");

// Prepare The SQL Statement
RetCode = SQLPrepare(StmtHandle, SQLStmt, SQL_NTS);

// Bind The Parameter Marker Used In The SQL Statement To
// An Application Variable
RetCode = SQLBindParameter(StmtHandle, 1, SQL_PARAM_INPUT,
                           SQL_C_CHAR, SQL_CHAR,
                           sizeof(JobType), 0, JobType,
                           sizeof(JobType), NULL);

// Populate The "Bound" Application Variable
strcpy((char *) JobType, "DESIGNER");

// Execute The SQL Statement
RetCode = SQLExecute(StmtHandle);
...
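For comparison, a minimal sketch of the same operation using the execute-direct approach (reusing the SQLStmt, JobType, StmtHandle, and RetCode variables from the example above); parameter binding works identically, only the prepare and execute steps are collapsed into one call:

...
// Bind The Parameter Marker And Populate The Variable As Before
RetCode = SQLBindParameter(StmtHandle, 1, SQL_PARAM_INPUT,
                           SQL_C_CHAR, SQL_CHAR,
                           sizeof(JobType), 0, JobType,
                           sizeof(JobType), NULL);
strcpy((char *) JobType, "DESIGNER");

// Prepare And Execute The SQL Statement In A Single Step
RetCode = SQLExecDirect(StmtHandle, SQLStmt, SQL_NTS);
...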

Retrieving and Processing Results


Once an SQL statement has been prepared and executed, any results produced will need to be retrieved and processed. (Result information is stored in the data storage areas referenced by the connection and statement handles associated with the SQL statement that was executed.) If the SQL statement executed was anything other than a SELECT statement, the only additional processing required after execution is a check of the CLI/ODBC function return code to ensure that the function executed as expected. However, if a SELECT statement was executed and a result data set was produced, the following steps are needed to retrieve each row of data from the result data set:

1. Determine the structure (i.e., the number of columns, column data types, and data lengths) of the result data set produced. This is done by executing the SQLNumResultCols(), SQLDescribeCol(), and/or SQLColAttribute() functions.
2. Bind application variables to the columns in the result data set using the SQLBindCol() function (optional).
3. Repeatedly fetch the next row of data from the result data set produced and copy it to the bound application variables. This is typically done by repeatedly calling the SQLFetch() function within a loop. (Values for columns that were not bound to application variables in Step 2 can be retrieved by calling the SQLGetData() function each time a new row is fetched.)

In the first step, the prepared or executed SQL statement is analyzed to determine the structure of the result data set produced. If the SQL statement was hard-coded into the application, this step is unnecessary because the structure of the result data set produced is already known. However, if the SQL statement was generated at application run time, the result data set produced must be queried to obtain this information. Result data set structure information can be obtained by calling the SQLNumResultCols(), SQLDescribeCol(), and SQLColAttribute() functions immediately after the SQL statement has been prepared or executed.

Once the structure of a result data set is known, one or more application variables can be bound to specific columns in the result data set, much like application variables are bound to SQL statement parameter markers. In this case, application variables are used as output arguments rather than as input, and data is retrieved and written directly to them whenever the SQLFetch() function is called. However, because the SQLGetData() function can also be used to retrieve data from a result data set, application variable/column binding is optional.

In the third step, data stored in the result data set is retrieved by repeatedly calling the SQLFetch() function (usually in a loop) until data is no longer available. If application variables have been bound to columns in the result data set, their values are automatically updated each time SQLFetch() is called. On the other hand, if column binding was not performed, the SQLGetData() function can be used to copy data from a specific column to an appropriate application variable. The SQLGetData() function can also be used to retrieve large variable-length column data values in several small pieces (which cannot be done when bound application variables are used). All data stored in a result data set can be retrieved by using any combination of these two methods. If data conversion is necessary, it will take place automatically when the SQLFetch() function is called, provided bound variables are used. (The appropriate conversion to use can be specified as input to the SQLGetData() function.)

The following pseudo-source code example, written in the C programming language, illustrates how application variables can be bound to the columns in a result data set, and how data in a result data set is normally retrieved by repeatedly calling the SQLFetch() function:

...
// Bind The Columns In The Result Data Set Returned
// To Application Variables
SQLBindCol(StmtHandle, 1, SQL_C_CHAR, (SQLPOINTER) EmpNo,
           sizeof(EmpNo), NULL);

SQLBindCol(StmtHandle, 2, SQL_C_CHAR, (SQLPOINTER) LastName,
           sizeof(LastName), NULL);

// While There Are Records In The Result Data Set
// Produced, Retrieve And Display Them
while (RetCode != SQL_NO_DATA)
{
    RetCode = SQLFetch(StmtHandle);
    if (RetCode != SQL_NO_DATA)
        printf("%-8s %s\n", EmpNo, LastName);
}
...

It is important to note that, unlike embedded SQL, in which cursor names are used to retrieve, update, or delete rows stored in a result data set, cursor names in CLI/ODBC are needed only for positioned update and positioned delete operations (because those operations reference cursors by name). Thus, cursor names are automatically generated, if appropriate, when a result data set is produced; however, they are seldom used explicitly.
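As a counterpart to the bound-column example above, here is a minimal sketch (same handles and variables assumed) that retrieves the LASTNAME column with SQLGetData() instead of binding it:

...
SQLINTEGER LenInd = 0;

// Fetch Each Row, Then Copy The Unbound Column Value Into
// The LastName Variable
while ((RetCode = SQLFetch(StmtHandle)) != SQL_NO_DATA)
{
    RetCode = SQLGetData(StmtHandle, 2, SQL_C_CHAR,
                         (SQLPOINTER) LastName,
                         sizeof(LastName), &LenInd);
    if (RetCode == SQL_SUCCESS)
        printf("%s\n", LastName);
}
...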

Terminating the Current Transaction


You may recall that a transaction (also known as a unit of work) is a sequence of one or more SQL operations grouped together as a single unit, usually within an application process. Transactions are important because the initiation and termination of a single transaction define points of data consistency within a database; the effects of all operations performed within a transaction are either applied to the database and made permanent (committed), or backed out (rolled back), in which case the database is returned to the state it was in before the transaction was initiated.

CLI/ODBC applications can connect to multiple data sources simultaneously, and each data source connection can constitute a separate transaction boundary. Figure 5-6 shows how multiple transaction boundaries can coexist when a CLI/ODBC application interacts with two separate data sources at the same time.

Figure 5-6: Transaction boundaries in an application that interacts with multiple data sources simultaneously.

From a transaction processing viewpoint, a CLI/ODBC application can be configured to run in one of two modes: auto-commit or manual-commit. When auto-commit mode is used, each individual SQL statement is treated as a complete transaction, and each transaction is automatically committed after the SQL statement successfully executes. For anything other than SELECT SQL statements, the commit operation takes place immediately after the statement is executed. For SELECT statements, the commit operation takes place immediately after the cursor being used to process any result data set produced is closed. (Remember that CLI/ODBC automatically declares and opens a cursor if one is needed.) Auto-commit mode is the default commit mode used and is usually sufficient for simple CLI/ODBC applications. Larger applications, however (particularly applications that perform update operations), should switch to manual-commit mode (by calling the SQLSetConnectAttr() function with the SQL_ATTR_AUTOCOMMIT option specified) as soon as a data source connection is established; see the sketch at the end of this section.

When manual-commit mode is used, transactions are started implicitly the first time an application accesses a data source, and transactions are explicitly ended when the SQLEndTran() function is called. This CLI/ODBC function is used to either roll back or commit all changes made by the current transaction. Thus, all operations performed against a data source between the time it was first accessed and the time the SQLEndTran() function is called are treated as a single transaction.

Regardless of which type of commit mode is used, all transactions associated with a particular data source should be completed before the connection to that data source is terminated. However, you should not wait until you are about to terminate a data source connection before you decide to end a transaction; doing so can cause concurrency and locking problems for other applications. Likewise, it is not always a good idea to use auto-commit mode or to call the SQLEndTran() function after each SQL statement is executed; this behavior increases overhead and can lead to poor application performance. When deciding on when to end a transaction, consider the following:

- Only the current transaction can be committed or rolled back; therefore, all dependent SQL statements should be kept within the same transaction.
- Various table-level and row-level locks may be held by the current transaction. When the transaction is ended, these locks are released, and other applications are given access to all data that was locked.
- Once a transaction has been successfully committed or rolled back, it is fully recoverable from the transaction log files. Any transaction that has not been committed or rolled back at the time a system failure or application program failure occurs will be "lost," and its effects will be discarded. Therefore, transactions should be ended as soon as reasonably possible.

Note: DB2 UDB guarantees that successfully committed or rolled-back transactions are fully recoverable from the system log files; however, this might not be true for other database products. If a CLI/ODBC application is designed to interact with multiple database products, you should refer to the appropriate product documentation to determine when transactions are considered recoverable.

When defining transaction boundaries, keep in mind that all resources associated with the transaction (with the exception of those coupled with a held cursor) are released when the transaction ends. Prepared SQL statements, cursor names, parameter bindings, and column bindings, however, are maintained from one transaction to the next. This means that once an SQL statement has been prepared, that statement does not need to be prepared again, even after a commit or rollback operation is performed, provided it remains associated with the same statement handle. Also, by default, cursors are preserved after a transaction is committed and are closed when a transaction is rolled back.
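The following minimal sketch (assuming the ConHandle used earlier) shows the manual-commit pattern just described: auto-commit is turned off immediately after the connection is established, and SQLEndTran() explicitly ends each transaction:

// Switch The Connection From Auto-Commit (The Default) To
// Manual-Commit Mode
RetCode = SQLSetConnectAttr(ConHandle, SQL_ATTR_AUTOCOMMIT,
                            (SQLPOINTER) SQL_AUTOCOMMIT_OFF,
                            SQL_IS_UINTEGER);

// ... Execute The SQL Statements That Make Up The Transaction ...

// Commit The Current Transaction (Specify SQL_ROLLBACK Instead
// To Back Out Its Effects)
RetCode = SQLEndTran(SQL_HANDLE_DBC, ConHandle, SQL_COMMIT);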

Freeing Allocated Statement Handles


When the results of an SQL statement have been processed and the SQL statement data storage area that was allocated when transaction processing began is no longer needed, the memory reserved for that data storage area needs to be freed. The data storage area associated with a particular statement handle is freed by calling the SQLFreeHandle() function with the SQL_HANDLE_STMT option and the appropriate statement handle specified (for example, SQLFreeHandle(SQL_HANDLE_STMT, StmtHandle)). When invoked, this CLI/ODBC function performs one or more of the following tasks:

- Unbinds all previously bound parameter application variables.
- Unbinds all previously bound column application variables.
- Closes any cursors that are open and discards their results.
- Destroys the statement handle and releases all associated resources.

Note: If a statement handle is not freed, it can be used to process other SQL statements. However, when such a statement handle is reused, any access plans that had been cached for the previous SQL statement associated with that handle will be discarded.

Termination
As you might imagine, the termination task takes place at the end of a CLI/ODBC application, just before control is returned to the operating system. This is where all data source connections that have been established are terminated and where all resources that were allocated during initialization are freed. (Usually, these resources consist of an environment data storage area and one or more connection data storage areas.) Existing database connections are terminated by calling the SQLDisconnect() function with the appropriate connection handle specified; corresponding connection data storage areas are freed by calling the SQLFreeHandle() function with the SQL_HANDLE_DBC option and the appropriate connection handle specified. Once all previously allocated connection data storage areas have been freed, the environment data storage area that the connection data storage areas were associated with is also freed by calling the SQLFreeHandle() function, this time with the SQL_HANDLE_ENV option and the appropriate environment handle specified.

Putting It All Together


Now that we have examined the three distinct tasks (initialization, transaction processing, and termination) that all CLI/ODBC applications are comprised of, let's see how the CLI/ODBC functions used to perform each of these tasks are typically coded in an application. A simple CLI/ODBC application, written in the C programming language, that obtains and prints employee identification numbers and last names for all employees who have the job title 'DESIGNER' might look something like this:

#include <stdio.h>
#include <string.h>
#include <sqlcli1.h>

int main()
{
    // Declare The Local Memory Variables
    SQLHANDLE EnvHandle = 0;
    SQLHANDLE ConHandle = 0;
    SQLHANDLE StmtHandle = 0;
    SQLRETURN RetCode = SQL_SUCCESS;

    SQLCHAR SQLStmt[255];
    SQLCHAR JobType[10];
    SQLCHAR EmpNo[10];
    SQLCHAR LastName[25];

    /*------------------------------------------------------*/
    /* INITIALIZATION                                        */
    /*------------------------------------------------------*/

    // Allocate An Environment Handle
    SQLAllocHandle(SQL_HANDLE_ENV, SQL_NULL_HANDLE, &EnvHandle);

    // Set The ODBC Application Version To 3.x
    if (EnvHandle != 0)
        SQLSetEnvAttr(EnvHandle, SQL_ATTR_ODBC_VERSION,
                      (SQLPOINTER) SQL_OV_ODBC3, SQL_IS_UINTEGER);

    // Allocate A Connection Handle
    if (EnvHandle != 0)
        SQLAllocHandle(SQL_HANDLE_DBC, EnvHandle, &ConHandle);

    // Connect To The Appropriate Data Source
    if (ConHandle != 0)
        RetCode = SQLConnect(ConHandle,
                             (SQLCHAR *) "SAMPLE", SQL_NTS,
                             (SQLCHAR *) "db2admin", SQL_NTS,
                             (SQLCHAR *) "ibmdb2", SQL_NTS);

    /*------------------------------------------------------*/
    /* TRANSACTION PROCESSING                                */
    /*------------------------------------------------------*/

    // Allocate An SQL Statement Handle
    if (ConHandle != 0 && RetCode == SQL_SUCCESS)
        SQLAllocHandle(SQL_HANDLE_STMT, ConHandle, &StmtHandle);

    // Define A SELECT SQL Statement That Uses A Parameter
    // Marker
    strcpy((char *) SQLStmt, "SELECT EMPNO, LASTNAME FROM ");
    strcat((char *) SQLStmt, "EMPLOYEE WHERE JOB = ?");

    // Prepare The SQL Statement
    RetCode = SQLPrepare(StmtHandle, SQLStmt, SQL_NTS);

    // Bind The Parameter Marker Used In The SQL Statement To
    // An Application Variable
    RetCode = SQLBindParameter(StmtHandle, 1, SQL_PARAM_INPUT,
                               SQL_C_CHAR, SQL_CHAR,
                               sizeof(JobType), 0, JobType,
                               sizeof(JobType), NULL);

    // Populate The "Bound" Application Variable
    strcpy((char *) JobType, "DESIGNER");

    // Execute The SQL Statement
    RetCode = SQLExecute(StmtHandle);

    // If The SQL Statement Executed Successfully, Retrieve
    // The Results
    if (RetCode == SQL_SUCCESS)
    {
        // Bind The Columns In The Result Data Set Returned
        // To Application Variables
        SQLBindCol(StmtHandle, 1, SQL_C_CHAR, (SQLPOINTER) EmpNo,
                   sizeof(EmpNo), NULL);

        SQLBindCol(StmtHandle, 2, SQL_C_CHAR,
                   (SQLPOINTER) LastName, sizeof(LastName), NULL);

        // While There Are Records In The Result Data Set
        // Produced, Retrieve And Display Them
        while (RetCode != SQL_NO_DATA)
        {
            RetCode = SQLFetch(StmtHandle);
            if (RetCode != SQL_NO_DATA)
                printf("%-8s %s\n", EmpNo, LastName);
        }
    }

    // Commit The Transaction
    RetCode = SQLEndTran(SQL_HANDLE_DBC, ConHandle, SQL_COMMIT);

    // Free The SQL Statement Handle
    if (StmtHandle != 0)
        SQLFreeHandle(SQL_HANDLE_STMT, StmtHandle);

    /*------------------------------------------------------*/
    /* TERMINATION                                           */
    /*------------------------------------------------------*/

    // Terminate The Data Source Connection
    if (ConHandle != 0)
        RetCode = SQLDisconnect(ConHandle);

    // Free The Connection Handle
    if (ConHandle != 0)
        SQLFreeHandle(SQL_HANDLE_DBC, ConHandle);

    // Free The Environment Handle
    if (EnvHandle != 0)
        SQLFreeHandle(SQL_HANDLE_ENV, EnvHandle);

    // Return Control To The Operating System
    return(0);
}

You may have noticed that a special code named SQL_NTS was passed as a parameter value for some of the CLI/ODBC functions used in this application. CLI/ODBC functions that accept character string values as arguments usually require the length of the character string to be provided as well. The value SQL_NTS can be used in place of an actual length value to indicate that the corresponding string is null-terminated.

Obtaining Information about a Driver and Controlling Driver Attributes


Because a CLI/ODBC application can connect to a variety of data sources, there may be times when it is necessary to obtain information about a particular data source that an application is connected to. By design, all CLI/ODBC drivers must support three specific functions that, when used, provide information about the capabilities of the driver and the driver's underlying data source. Using this set of functions, an application can determine the capabilities and limitations of a particular data source and adjust its behavior accordingly. The first of these functions, SQLGetInfo(), can be used to obtain information about the various characteristics of a data source. The second function, SQLGetFunctions(), tells an application whether a particular CLI/ODBC function is supported by a data source/driver. And the last function, SQLGetTypeInfo(), provides an application with information about the native data types used by a data source. Of the three, SQLGetInfo() is probably the most powerful; more than 165 different pieces of information can be obtained by this function alone. (A short sketch of the first two functions appears after the list below.)

The information returned by the SQLGetInfo(), SQLGetFunctions(), and SQLGetTypeInfo() functions is static in nature; that is, the characteristics of the data source or driver that this information describes cannot be altered by the calling application. However, most data source drivers contain additional information that can be changed to alter the way in which a driver behaves for a particular application. This updatable information is referred to as driver attributes. Three types of driver attributes are available:

- Environment attributes
- Connection attributes
- SQL statement attributes
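Before turning to the attribute types, here is a minimal sketch (assuming a connected ConHandle) of the first two capability functions; the SQL_DBMS_NAME information type and the SQL_API_SQLFETCHSCROLL function ID shown are just two of the many values available:

SQLCHAR DbmsName[64];
SQLSMALLINT NameLen = 0;
SQLUSMALLINT Supported = SQL_FALSE;

// Ask The Driver For The Name Of The DBMS Behind The Connection
SQLGetInfo(ConHandle, SQL_DBMS_NAME, (SQLPOINTER) DbmsName,
           sizeof(DbmsName), &NameLen);

// Ask The Driver Whether SQLFetchScroll() Is Supported
SQLGetFunctions(ConHandle, SQL_API_SQLFETCHSCROLL, &Supported);
if (Supported == SQL_TRUE)
    printf("%s supports SQLFetchScroll()\n", DbmsName);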

Environment Attributes
Environment attributes affect the way in which CLI/ODBC functions that operate under a specified environment behave. An application can retrieve the value of an environment attribute at any time by calling the SQLGetEnvAttr() function and can change the value of an environment attribute by calling the SQLSetEnvAttr() function. Some of the more common environment attributes include:

SQL_ATTR_ODBC_VERSION. Determines whether certain functionality exhibits ODBC 2.0 behavior or ODBC 3.x behavior.

SQL_ATTR_OUTPUT_NTS. Determines whether the driver is to append a null-terminator to string data values before they are returned to an application.

It is important to note that environment attributes can be changed as long as no connection handles have been allocated against the environment. Once a connection handle has been allocated, attribute values for that environment can be retrieved but cannot be altered.
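For example (a minimal sketch, assuming EnvHandle has just been allocated and no connection handles exist yet):

SQLINTEGER Version = 0;

// Request ODBC 3.x behavior for this environment ...
SQLSetEnvAttr(EnvHandle, SQL_ATTR_ODBC_VERSION,
    (SQLPOINTER) SQL_OV_ODBC3, SQL_IS_UINTEGER);

// ... then read the attribute back to confirm the setting
SQLGetEnvAttr(EnvHandle, SQL_ATTR_ODBC_VERSION,
    (SQLPOINTER) &Version, 0, NULL);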

Connection Attributes
Connection attributes affect the way in which connections to data sources and drivers behave. An application can retrieve the value of a connection attribute at any time by calling the SQLGetConnectAttr() function and can change the value of a connection attribute by calling the SQLSetConnectAttr() function. Some of the more common connection attributes include:

SQL_ATTR_AUTOCOMMIT. Determines whether the data source/driver will operate in autocommit mode or manual-commit mode.

SQL_ATTR_MAXCONN. Determines the maximum number of concurrent data source connections that an application can have open at one time. (The default value for this attribute is 0, which means that an application can have any number of connections open at one time.)

SQL_ATTR_TXN_ISOLATION. Specifies the isolation level to use for the current connection. (Valid values for this attribute include SQL_TXN_SERIALIZABLE [Repeatable Read], SQL_TXN_REPEATABLE_READ [Read Stability], SQL_TXN_READ_COMMITTED [Cursor Stability], and SQL_TXN_READ_UNCOMMITTED [Uncommitted Read].)

As with environment attributes, timing becomes a very important element when setting connection attributes. Some connection attributes can be set at any time; some can be set only after a corresponding connection handle has been allocated but before a connection to a data source is established; some can be set only after a connection to a data source is established; and finally, some can be set only after a connection to a data source is established and while no outstanding transactions or open cursors are associated with the connection.
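For example (a minimal sketch, assuming ConHandle has been allocated but a connection has not yet been established):

// Switch from the default auto-commit mode to manual-commit mode ...
SQLSetConnectAttr(ConHandle, SQL_ATTR_AUTOCOMMIT,
    (SQLPOINTER) SQL_AUTOCOMMIT_OFF, SQL_IS_UINTEGER);

// ... and select the Read Stability isolation level
SQLSetConnectAttr(ConHandle, SQL_ATTR_TXN_ISOLATION,
    (SQLPOINTER) SQL_TXN_REPEATABLE_READ, SQL_IS_UINTEGER);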

SQL Statement Attributes


Statement attributes affect the way in which many SQL statement-level CLI/ODBC functions behave. An application can retrieve the value of a statement attribute at any time by calling the SQLGetStmtAttr() function and can change the value of a statement attribute by calling the SQLSetStmtAttr() function. Some of the more common statement attributes include:

SQL_ATTR_CONCURRENCY. Specifies the cursor concurrency level to use (read-only, low-level locking, or value-comparison locking).

SQL_ATTR_CURSOR_SENSITIVITY. Specifies whether cursors on a statement handle are to make changes made to a result data set by another cursor visible (insensitive or sensitive).

SQL_ATTR_CURSOR_TYPE. Specifies the type of cursor that is to be used when processing result data sets (forward-only, static, keyset-driven, or dynamic).

SQL_ATTR_RETRIEVE_DATA. Specifies whether the SQLFetch() and SQLFetchScroll() functions are to automatically retrieve data after they position the cursor.

Again, timing is important when setting statement attributes. Some statement attributes must be set before the SQL statement associated with the statement handle is executed, some can be set at any time but are not applied until the SQL statement associated with the statement handle is used again, and some can be set at any time.

Note: A complete listing of the environment, connection, and SQL statement attributes available can be found in the IBM DB2 Universal Database, Version 8 CLI Guide and Reference Volume 2 product documentation.
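For example (a minimal sketch, assuming StmtHandle has been allocated and no statement has been executed on it yet):

// Request a static cursor for the result data sets this
// statement handle will produce ...
SQLSetStmtAttr(StmtHandle, SQL_ATTR_CURSOR_TYPE,
    (SQLPOINTER) SQL_CURSOR_STATIC, SQL_IS_UINTEGER);

// ... and make that cursor read-only
SQLSetStmtAttr(StmtHandle, SQL_ATTR_CONCURRENCY,
    (SQLPOINTER) SQL_CONCUR_READ_ONLY, SQL_IS_UINTEGER);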

Diagnostics and Error Handling


Each time a CLI/ODBC function is invoked, a special value known as a return code is returned to the calling application to indicate whether the function executed as expected. (If the function did not execute as expected, the return code value generated will indicate what caused the function to fail.) A list of possible return codes that can be returned by any CLI/ODBC function can be seen in Table 5-3.

Table 5-3: CLI/ODBC Function Return Codes

SQL_SUCCESS: The CLI/ODBC function completed successfully. (The CLI/ODBC function SQLGetDiagField() can be used to obtain additional information from the diagnostic header record.)

SQL_SUCCESS_WITH_INFO: The CLI/ODBC function completed successfully; however, a warning or non-fatal error condition was encountered. (The CLI/ODBC functions SQLGetDiagRec() and SQLGetDiagField() can be used to obtain additional information.)

SQL_NO_DATA or SQL_NO_DATA_FOUND: The CLI/ODBC function completed successfully, but no relevant data was found. (The CLI/ODBC functions SQLGetDiagRec() and SQLGetDiagField() can be used to obtain additional information.)

SQL_INVALID_HANDLE: The CLI/ODBC function failed because an invalid environment, connection, statement, or descriptor handle was specified. This return code is only produced when the handle specified either has not been allocated or is the wrong type of handle (for example, if a connection handle is provided when an environment handle is expected). Because this type of error is a programming error, no additional information is provided.

SQL_NEED_DATA: The CLI/ODBC function failed because data that the function expected to be available at execution time (such as parameter marker data or connection information) was missing. This return code is typically produced when parameters or columns have been bound as data-at-execution (SQL_DATA_AT_EXEC) parameters or columns. (The CLI/ODBC functions SQLGetDiagRec() and SQLGetDiagField() can be used to obtain additional information.)

SQL_STILL_EXECUTING: A CLI/ODBC function that was started asynchronously is still executing. (The CLI/ODBC functions SQLGetDiagRec() and SQLGetDiagField() can be used to obtain additional information.)

SQL_ERROR: The CLI/ODBC function failed. (The CLI/ODBC functions SQLGetDiagRec() and SQLGetDiagField() can be used to obtain additional information.)

The return code SQL_INVALID_HANDLE always indicates a programming error and should never be encountered during application run time. All other return codes provide run-time information about the success or failure of the CLI/ODBC function called (although the SQL_ERROR return code can sometimes indicate a programming error as well). Error handling is an important part of any application, and CLI/ODBC applications are no exception. At a minimum, a CLI/ODBC application should always check to see if a called CLI/ODBC function executed successfully by examining the return code produced; when such a function fails to execute as expected, users should be notified that an error or warning condition has occurred and, whenever possible, they should be provided with sufficient diagnostic information to locate and correct the problem.
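Such a check is often factored into a small helper that is called after every CLI/ODBC function invocation (a minimal sketch; the CheckRC() name and message format are illustrative, not part of the CLI/ODBC API):

// Returns 0 if processing can continue, 1 if error handling is needed
int CheckRC(SQLRETURN RetCode, char *FuncName)
{
    if (RetCode == SQL_SUCCESS || RetCode == SQL_SUCCESS_WITH_INFO)
        return 0;

    printf("%s failed with return code %d\n", FuncName, (int) RetCode);
    return 1;
}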

SQLSTATE Values
Although a return code will notify an application program if an error or warning condition occurred, it does not provide the application (or the developer or user) with specific information about what caused the error or warning condition to be generated. Because additional information about an error or warning condition is usually needed to resolve a problem, DB2 (as well as other relational database products) uses a set of error message codes known as SQLSTATEs to provide supplementary diagnostic information for warnings and errors.

SQLSTATEs are alphanumeric strings that are five characters (bytes) in length and have the format ccsss, where cc indicates the error message class and sss indicates the error message subclass. An SQLSTATE that has a class of 01 corresponds to a warning; an SQLSTATE that has a class of HY corresponds to an error that was generated by DB2 CLI; and an SQLSTATE that has a class of IM corresponds to an error that was generated by the ODBC Driver Manager. (Because different database servers often have different diagnostic message codes, SQLSTATEs follow standards that are outlined in the X/Open CLI standard specification. This standardization of SQLSTATE values enables application developers to process errors and warnings consistently across different relational database products.)

Unlike return codes, SQLSTATEs are often treated as guidelines, and drivers are not required to return them. Thus, while drivers should always return the proper SQLSTATE for any error or warning they are capable of detecting, applications should not count on this always happening. Because SQLSTATEs are not returned reliably, most applications simply display them, along with any corresponding diagnostic message and native error code, to the user. Loss of functionality rarely occurs with this approach because applications normally cannot base programming logic on SQLSTATEs anyway.

For example, suppose an application calls the SQLExecDirect() function and the SQLSTATE 42000 (syntax error or access violation) is returned. If the SQL statement that caused this error to occur is hard-coded into the application or constructed at application run time, the error can be attributed to a programming error and the source code will have to be modified. However, if the SQL statement that caused this error to occur was provided by the user at run time, the error can be attributed to a user mistake, in which case the application has already done all that it can do by informing the user of the problem.
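When an application does examine an SQLSTATE, it is usually just the two-character class code, and only for display purposes. A minimal sketch (assuming SQLState and ErrMsg were filled in by SQLGetDiagRec(), as shown in the example later in this section):

// Classify the condition by its two-character class code,
// for display purposes only
if (strncmp((char *) SQLState, "01", 2) == 0)
    printf("Warning (SQLSTATE %s): %s\n", SQLState, ErrMsg);
else if (strncmp((char *) SQLState, "HY", 2) == 0)
    printf("DB2 CLI error (SQLSTATE %s): %s\n", SQLState, ErrMsg);
else
    printf("Error (SQLSTATE %s): %s\n", SQLState, ErrMsg);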

Obtaining Diagnostic Information


How are SQLSTATE values, diagnostic messages, and native error codes obtained when a CLI/ODBC function fails to execute properly? This information can be acquired by calling the SQLGetDiagRec() function, the SQLGetDiagField() function, or both. These functions accept an environment, connection, statement, or descriptor handle as input and return diagnostic information about the last CLI/ODBC function executed, using the handle specified. If multiple diagnostic records were generated, an application must call one or both of these functions several times. (The total number of diagnostic records available can be determined by calling the SQLGetDiagField() function with record number 0 (zero), the header record number, and the SQL_DIAG_NUMBER option specified.)

Diagnostic information is stored in memory as diagnostic records, and applications can retrieve SQLSTATE values, diagnostic messages, and native error codes from a diagnostic record in a single step by calling the SQLGetDiagRec() function. However, this function cannot be used to retrieve information from the diagnostic header record; applications must use the SQLGetDiagField() function to retrieve information from the diagnostic header record. The SQLGetDiagField() function can also be used to obtain the values of individual diagnostic record fields (thus, the SQLGetDiagField() function could be used to obtain just SQLSTATE values when an error occurs).

Obtaining SQLCA Information


In the previous chapter, we saw that embedded SQL applications rely solely on the SQL Communications Area (SQLCA) data structure for diagnostic information. Although much of the information the SQLCA data structure provides to embedded SQL applications is available to CLI/ODBC applications via the SQLGetDiagRec() and SQLGetDiagField() functions, occasionally the need to acquire just SQLCA information can arise. In such situations, the SQLGetSQLCA() function can be used to obtain SQLCA information for the last SQL statement processed. However, you should keep in mind that the SQLGetSQLCA() function has been deprecated and that the SQLCA data structure will contain meaningful information only if the last SQL statement executed had some type of interaction with the corresponding data source.

A Diagnostics/Error Handling Example


Now that we have seen how return codes and diagnostic records can be used to detect when an error or warning condition occurs and to provide feedback on how to correct the problem that caused the error/warning to be generated, let's see how error handling and diagnostics information retrieval is typically performed in a CLI/ODBC application. A simple CLI/ODBC application, written in the C programming language, that attempts to connect to a data source using an invalid user ID and that displays the diagnostic information generated when the connection attempt fails might look something like this:

#include <stdio.h>
#include <string.h>
#include <sqlcli1.h>

int main()
{
    // Declare The Local Memory Variables
    SQLHANDLE   EnvHandle = 0;
    SQLHANDLE   ConHandle = 0;
    SQLRETURN   RetCode = SQL_SUCCESS;
    SQLSMALLINT Counter = 0;
    SQLINTEGER  NumRecords = 0;
    SQLINTEGER  NativeErr = 0;
    SQLCHAR     SQLState[6];
    SQLCHAR     ErrMsg[255];
    SQLSMALLINT ErrMsgLen = 0;

    // Allocate An Environment Handle
    SQLAllocHandle(SQL_HANDLE_ENV, SQL_NULL_HANDLE, &EnvHandle);

    // Set The ODBC Application Version To 3.x
    if (EnvHandle != 0)
        SQLSetEnvAttr(EnvHandle, SQL_ATTR_ODBC_VERSION,
            (SQLPOINTER) SQL_OV_ODBC3, SQL_IS_UINTEGER);

    // Allocate A Connection Handle
    if (EnvHandle != 0)
        SQLAllocHandle(SQL_HANDLE_DBC, EnvHandle, &ConHandle);

    // Attempt To Connect To A Data Source Using An Invalid
    // User ID (This Will Cause An Error To Be Generated)
    if (ConHandle != 0)
        RetCode = SQLConnect(ConHandle, (SQLCHAR *) "SAMPLE",
            SQL_NTS, (SQLCHAR *) "db2_admin", SQL_NTS,
            (SQLCHAR *) "ibmdb2", SQL_NTS);

    // If Unable To Establish A Data Source Connection,
    // Obtain Any Diagnostic Information Available
    if (RetCode != SQL_SUCCESS)
    {
        // Find Out How Many Diagnostic Records Are
        // Available
        SQLGetDiagField(SQL_HANDLE_DBC, ConHandle, 0,
            SQL_DIAG_NUMBER, &NumRecords, SQL_IS_INTEGER,
            NULL);

        // Retrieve And Display The Diagnostic Information
        // Produced
        for (Counter = 1; Counter <= NumRecords; Counter++)
        {
            // Retrieve The Information Stored In Each
            // Diagnostic Record Generated
            SQLGetDiagRec(SQL_HANDLE_DBC, ConHandle, Counter,
                SQLState, &NativeErr, ErrMsg, sizeof(ErrMsg),
                &ErrMsgLen);

            // Display The Information Retrieved
            printf("SQLSTATE : %s\n", SQLState);
            printf("%s\n", ErrMsg);
        }
    }

    // Free The Connection Handle
    if (ConHandle != 0)
        SQLFreeHandle(SQL_HANDLE_DBC, ConHandle);

    // Free The Environment Handle
    if (EnvHandle != 0)
        SQLFreeHandle(SQL_HANDLE_ENV, EnvHandle);

    // Return Control To The Operating System
    return(0);
}

The important thing to remember is that more than one diagnostic record may be generated when an error or warning condition occurs. One way to ensure that all diagnostic records generated are retrieved is to examine the diagnostic header record to determine how many records are available. Another way is to repeatedly call the SQLGetDiagRec() function, incrementing the record ID number each time, until the return code SQL_NO_DATA is returned. At that point, no more diagnostic records will be available.

Creating Executable Applications


So far, we have looked at the steps used to code CLI/ODBC applications, but we have not looked at how the source code for a CLI/ODBC application is converted into an actual working program. Once a CLI/ODBC source-code file has been written, it must be compiled by an appropriate high-level programming language compiler (such as Visual C++, Visual Basic, and so on). The compiler is responsible for converting the source code file into an object module that the linker can use to create an executable program; the linker combines object files and high-level programming language libraries to produce an executable application. For most operating systems, this executable application will be an executable module that runs stand-alone. However, it can also be a shared library or dynamic link library that is used by another executable module. Figure 5-7 illustrates this source code file-to-executable application conversion process.

Figure 5-7: Converting a source code file containing CLI/ODBC functions into an executable application.

It is important to note that DB2 UDB is packaged with a set of special bind files that are used to support DB2 CLI. When a database is created, these files are bound to the database as part of the database creation process to produce a package that facilitates CLI interaction with that database.

Practice Questions
Question 1
Which two of the following must be allocated before a statement handle can be allocated?
A. An application handle
B. An environment handle
C. A data source handle
D. A driver handle
E. A connection handle

Question 2
Which of the following CLI/ODBC functions can NOT be used to obtain diagnostic information when an SQL-related error occurs?
A. SQLGetError()
B. SQLGetSQLCA()
C. SQLGetDiagRec()
D. SQLGetDiagField()

Question 3
The CLI/ODBC function SQLDriverConnect() is to be used to establish a connection to a DB2 UDB database. Which of the following is NOT a valid keyword that can be included in the connection string used by the SQLDriverConnect() function?
A. DSN
B. PWD
C. UID
D. ALIAS

Question 4
Which of the following return codes will be returned by SQLFetch() when there are no more records to be retrieved in a result data set?
A. SQL_ERROR
B. SQL_NO_DATA
C. SQL_SUCCESS_WITH_INFO
D. SQL_NO_MORE_DATA

Question 5
Which of the following attributes is used to indicate that queries are to be executed using Read Stability locking semantics?
A. SQL_ATTR_REPEATABLE_READ
B. SQL_ATTR_CONCURRENCY
C. SQL_ATTR_TXN_ISOLATION
D. SQL_ATTR_CURSOR_ISOLATION

Question 6
Which two of the following CLI/ODBC functions can be used to establish a new connection to a remote DB2 UDB database?
A. SQLDatasourceConnect()
B. SQLDriverConnect()
C. SQLSetConnection()
D. SQLDatabaseConnect()
E. SQLBrowseConnect()

Question 7
Which of the following is NOT a valid CLI/ODBC handle definition?
A. SQL_HANDLE_ENV
B. SQL_HANDLE_DBC
C. SQL_HANDLE_DESC
D. SQL_HANDLE_CON

Question 8
Given the following source code:

SQLHANDLE EnvHandle, ConHandle, StmtHandle;
SQLRETURN RetCode;

SQLAllocHandle(SQL_HANDLE_ENV, SQL_NULL_HANDLE, &EnvHandle);
SQLAllocHandle(SQL_HANDLE_DBC, EnvHandle, &ConHandle);
SQLConnect(ConHandle, "SAMPLE", SQL_NTS, "db2admin", SQL_NTS,
    "ibmdb2", SQL_NTS);

Assuming everything has executed successfully up to this point, what value will be assigned to the variable RetCode if the following line is executed next?

RetCode = SQLExecDirect(ConHandle, "SELECT * FROM EMPLOYEES", SQL_NTS);

A. SQL_SUCCESS
B. SQL_ERROR
C. SQL_INVALID_HANDLE
D. SQL_NEED_DATA

Question 9
Which of the following will return values that are only in upper case?
A. SELECT name FROM employee WHERE UCASE(name) = 'smith'
B. SELECT name FROM employee WHERE UCASE(name) = 'SMITH'
C. SELECT UCASE(name) FROM employee WHERE name = 'smith'
D. SELECT name FROM employee WHERE name IN (SELECT name FROM employee WHERE UCASE(name) = UCASE('smith'))

Question 10
Which of the following statement attributes is used to indicate that a cursor is to be a keyset-driven cursor?
A. SQL_ATTR_CURSOR_TYPE
B. SQL_ATTR_CURSOR_SENSITIVITY
C. SQL_ATTR_CONCURRENCY
D. SQL_ATTR_CURSOR_MODE

Question 11
Given the following table:

TAB1
EMPID   NAME
-----   -----
1       USER1
2       USER2
3       USER3

If a CLI/ODBC application containing the following pseudo-code executes successfully:

SQLAllocHandle(SQL_HANDLE_ENV, SQL_NULL_HANDLE, &EnvHandle);
SQLAllocHandle(SQL_HANDLE_DBC, EnvHandle, &ConHandle);
SQLConnect(ConHandle, (SQLCHAR *) "SAMPLE", SQL_NTS,
    (SQLCHAR *) "db2admin", SQL_NTS,
    (SQLCHAR *) "ibmdb2", SQL_NTS);
SQLSetConnectAttr(ConHandle, SQL_ATTR_AUTOCOMMIT, SQL_AUTOCOMMIT_ON, SQL_IS_UINTEGER);
SQLAllocHandle(SQL_HANDLE_STMT, ConHandle, &StmtHandle);
SQLExecDirect(StmtHandle, "INSERT INTO tab1 VALUES (4, 'USER4')", SQL_NTS);
SQLExecDirect(StmtHandle, "INSERT INTO tab1 VALUES (5, 'USER5')", SQL_NTS);
SQLEndTran(SQL_HANDLE_DBC, ConHandle, SQL_COMMIT);
SQLExecDirect(StmtHandle, "INSERT INTO tab1 VALUES (6, 'USER6')", SQL_NTS);
SQLExecDirect(StmtHandle, "DELETE FROM tab1 WHERE empid = 1", SQL_NTS);
SQLExecDirect(StmtHandle, "DELETE FROM tab1 WHERE empid = 2", SQL_NTS);
SQLEndTran(SQL_HANDLE_DBC, ConHandle, SQL_ROLLBACK);

How many records will be stored in TAB1?
A. 3
B. 4
C. 5
D. 6

Question 12
Which of the following CLI/ODBC functions can cause a cursor to be opened?
A. SQLFetch()
B. SQLPrep()
C. SQLExecDirect()
D. SQLOpen()

Question 13
Which of the following CLI/ODBC function calls can be used to determine whether or not the results of a query are read-only?
A. SQLFetch(SQL_FETCH_MODE)
B. SQLGetStmtAttr(SQL_ATTR_CONCURRENCY)
C. SQLGetInfo(SQL_ATTR_CURSOR_TYPE)
D. SQLPrepare(SQL_READ_ONLY)

Question 14
When the following CLI/ODBC function is executed, the return code SQL_ERROR is produced:

RetCode = SQLPrepare(Handle, "SELECT * FROM EMPLOYEES", SQL_NTS);

Which of the following CLI/ODBC function calls can be used to determine why the SQLPrepare() function failed?
A. SQLGetDiagRec(SQL_HANDLE_ENV, Handle, ...)
B. SQLGetDiagRec(SQL_HANDLE_DBC, Handle, ...)
C. SQLGetDiagRec(SQL_HANDLE_STMT, Handle, ...)
D. SQLGetDiagRec(SQL_HANDLE_DESC, Handle, ...)

Question 15
Which of the following CLI/ODBC functions would be used to update multiple rows in a DB2 UDB table?
A. SQLUpdate()
B. SQLFetch()
C. SQLExecute()
D. SQLBulkUpdate()

Question 16
Which of the following handle types contains information about the application variables that have been bound to parameter markers in an SQL statement?
A. Environment
B. Connection
C. SQL statement
D. Descriptor

Question 17
Which two of the following options will cause a dialog to be displayed when more information is needed by the SQLDriverConnect() function than was provided in the connection string used?
A. SQL_DRIVER_NOPROMPT
B. SQL_DRIVER_COMPLETE
C. SQL_DRIVER_QUERY_USER
D. SQL_DRIVER_COMPLETE_REQUIRED
E. SQL_DRIVER_PROMPT_USER

Question 18
Given the following CLI/ODBC application pseudo-code that executes successfully:

SQLAllocHandle(SQL_HANDLE_ENV, SQL_NULL_HANDLE, &EnvHandle);
SQLAllocHandle(SQL_HANDLE_DBC, EnvHandle, &ConHandle);
SQLConnect(ConHandle, (SQLCHAR *) "SAMPLE", SQL_NTS,
    (SQLCHAR *) "db2admin", SQL_NTS,
    (SQLCHAR *) "ibmdb2", SQL_NTS);
SQLSetConnectAttr(ConHandle, SQL_ATTR_AUTOCOMMIT, SQL_AUTOCOMMIT_OFF, SQL_IS_UINTEGER);
SQLAllocHandle(SQL_HANDLE_STMT, ConHandle, &StmtHandle);
SQLExecDirect(StmtHandle, "DELETE FROM tab1 WHERE empid = 1", SQL_NTS);

Which of the following CLI/ODBC function calls will generate the return code SQL_ERROR?
A. SQLEndTran(SQL_HANDLE_DBC, ConHandle, SQL_COMMIT)
B. SQLEndTran(SQL_HANDLE_DBC, ConHandle, SQL_ROLLBACK)
C. SQLExecute(StmtHandle)
D. SQLFreeHandle(SQL_HANDLE_STMT, StmtHandle)

Answers

Question 1
The correct answers are B and E. Every CLI application must begin by allocating an environment handle, and each data source connection is managed by a connection handle. Therefore, in order to allocate and use an SQL statement handle, an environment handle and at least one connection handle must first be allocated.

Question 2
The correct answer is A. Earlier versions of CLI/ODBC contained a function named SQLError() that could be used to obtain diagnostic information; however, there is no function named SQLGetError(). On the other hand, SQLGetDiagRec(), SQLGetDiagField(), and SQLGetSQLCA() can all be used to obtain diagnostic information when an SQL-related error occurs. (SQLGetSQLCA() can only be used to obtain information about the last SQL statement executed. Furthermore, this function has been deprecated; SQLGetDiagRec() and SQLGetDiagField() should be used instead.)

Question 3
The correct answer is D. A connection string is simply a series of keyword/value pairs, separated by semicolons, that contains information that is to be used by the SQLDriverConnect() function to establish a connection to a data source. Some of the more common keyword/value pairs used include: DSN=DataSourceName, UID=UserID, PWD=Password, and NEWPWD=NewPassword.

Question 4
The correct answer is B. The SQL_NO_DATA return code is generated when a CLI/ODBC function completes successfully but no relevant data is found. Therefore, if SQLFetch() is executed when no more records are available, the return code SQL_NO_DATA will be returned.

Question 5
The correct answer is C. Read Stability is one of four transaction isolation levels available with DB2 UDB. (The four isolation levels available are: Repeatable Read, Read Stability, Cursor Stability, and Uncommitted Read.) The connection attribute used to specify the isolation level to use for a given data source connection is the SQL_ATTR_TXN_ISOLATION attribute.

Question 6
The correct answers are B and E. Three CLI/ODBC functions can be used to establish a connection to a data source. They are: SQLConnect(), SQLDriverConnect(), and SQLBrowseConnect().

Question 7
The correct answer is D. Handles are initialized and their corresponding data storage areas are allocated by calling the SQLAllocHandle() function with the appropriate handle type (SQL_HANDLE_ENV [environment], SQL_HANDLE_DBC [connection], SQL_HANDLE_STMT [statement], or SQL_HANDLE_DESC [descriptor]) and parent handle specified.

Question 8
The correct answer is C. An attempt to submit a query to a data source via a connection handle (as is the case in this example) will fail and cause the return code SQL_INVALID_HANDLE to be generated. That's because SQL statements must be sent to a data source for processing via a statement handle, not a connection handle.

Question 9
The correct answer is C. The key wording here is "return values in upper case." To return values in upper case, the UCASE() function must be used to convert all values retrieved. If the UCASE() function is used in the WHERE clause of a SELECT statement, the conversion is applied for the purpose of locating matching values, not to return values that have been converted to upper case.

Question 10
The correct answer is A. The SQL_ATTR_CURSOR_TYPE statement attribute is used to specify the type of cursor to be used when processing result data sets. (A cursor can be defined as forward-only, static, keyset-driven, or dynamic.) The SQL_ATTR_CONCURRENCY statement attribute is used to specify the cursor concurrency level to use (read-only, low-level locking, or value-comparison locking), and the SQL_ATTR_CURSOR_SENSITIVITY statement attribute is used to specify whether cursors on a statement handle are to make changes made to a result data set by another cursor visible (insensitive or sensitive). There is no SQL_ATTR_CURSOR_MODE statement attribute.

Question 11
The correct answer is B. The line "SQLSetConnectAttr(ConHandle, SQL_ATTR_AUTOCOMMIT, SQL_AUTOCOMMIT_ON, SQL_IS_UINTEGER);" tells the CLI/ODBC application to run in auto-commit mode. (When auto-commit mode is used, each SQL statement is treated as a complete transaction, and each transaction is automatically committed after the SQL statement successfully executes.) Therefore, every SQL statement processed is automatically committed after it is successfully executed. As a result, three records were added to the three records that already existed in table TAB1, then two records were deleted, leaving a total of four records in table TAB1.

Question 12
The correct answer is C. Cursors are automatically generated and opened when the SQLPrepare()-SQLExecute() function combination or the SQLExecDirect() function is executed and a result data set containing zero or more rows is produced in response to the SQL statement that was prepared/executed.

Question 13
The correct answer is B. The SQL_ATTR_CONCURRENCY statement attribute is used to specify the cursor concurrency level to use (read-only, low-level locking, or value-comparison locking), and the SQLGetStmtAttr() function is used to obtain the value assigned to a statement attribute.

Question 14
The correct answer is C. When CLI/ODBC functions fail to execute correctly, diagnostic information that can help identify why the error occurred can be obtained by calling the SQLGetDiagRec() function and/or the SQLGetDiagField() function. These functions accept an environment, connection, statement, or descriptor handle as input and return diagnostic information about the last CLI/ODBC function executed, using the handle specified. In this case, the error is associated with a statement handle (remember that SQL statements are processed using statement handles), so SQLGetDiagRec() must be called with the statement handle that diagnostic information is to be obtained for.

Question 15
The correct answer is C. In order to update data from a CLI/ODBC application, an UPDATE SQL statement must be constructed, prepared, and executed. Once constructed, any SQL statement can be prepared and executed using the SQLPrepare() and SQLExecute() functions together or by using the SQLExecDirect() function, which combines the functionality of the SQLPrepare() and SQLExecute() functions into a single operation.

Question 16
The correct answer is D. A descriptor handle is a pointer to a data storage area that contains a collection of metadata that describes either the application variables that have been bound to the parameter markers in an SQL statement or the application variables that have been bound to the columns of a result data set produced in response to a query.

Question 17
The correct answers are B and D. When the SQLDriverConnect() function is called with a connection string that does not contain all the information needed to establish a data source connection, DB2 CLI will display a dialog that prompts the user for the missing information, provided the SQLDriverConnect() function is called with the SQL_DRIVER_PROMPT, SQL_DRIVER_COMPLETE, or SQL_DRIVER_COMPLETE_REQUIRED option specified.

Question 18
The correct answer is C. Because an SQL statement must be prepared by the SQLPrepare() function before it can be executed by the SQLExecute() function, the statement SQLExecute(StmtHandle) will fail when it is called; in this example, an SQL statement was executed directly with the SQLExecDirect() function, but no SQL statement was prepared for later execution with the SQLPrepare() function.

Chapter 6: Java Programming


Overview
Thirteen percent (13%) of the DB2 UDB V8.1 Family Application Development exam (Exam 703) is designed to test your knowledge of JDBC and SQLJ, as well as to test your ability to create simple JDBC and SQLJ applications and applets. The questions that make up this portion of the exam are intended to evaluate the following:

Your ability to identify the types of classes and interfaces available with JDBC.
Your ability to identify some of the more common object methods used in JDBC application and applet development.
Your ability to identify the similarities and differences between JDBC and SQLJ.
Your ability to establish a connection to a database within both a JDBC and an SQLJ application/applet.
Your ability to manage transactions across multiple databases within a Java application and applet.
Your ability to identify the steps needed to build JDBC and SQLJ applications and applets.
Your ability to capture and process errors when they occur.
Your ability to identify how contexts and iterators are used in an SQLJ application/applet.

This chapter is designed to introduce you to Java application and applet development and to walk you through the basic steps used to construct both JDBC and SQLJ applications and applets. This chapter is also designed to introduce you to some of the more common objects and methods that are used to develop Java-based database applications and applets.

Terms you will learn:

JDBC
Java bytecode
Java Virtual Machine
Applet
JDBC driver
Type 1: JDBC-ODBC bridge plus ODBC driver
Type 2: Partly Java-Partly Native API driver
Type 3: JDBC-Pure Java driver
Type 4: Native-Pure Java driver
DB2 Universal JDBC Driver
DriverManager
getConnection()
Connection
getMetaData()
setTransactionIsolation()
setAutoCommit()
setReadOnly()
createStatement()
prepareStatement()
prepareCall()
commit()
rollback()
getWarnings()
close()
DatabaseMetaData
getTables()
getColumns()
getTypeInfo()

supportsOpenCursorsAcrossCommit()
supportsOpenCursorsAcrossRollback()
Statement
PreparedStatement
CallableStatement
execute()
executeQuery()
executeUpdate()
getResultSet()
setCursorName()
setMaxRows()
ResultSet
getMetaData()
next()
ResultSetMetaData
getColumnName()
getColumnType()
isNullable()
SQLException
getMessage()
getErrorCode()
getSQLState()
DataSource
getConnection()
PooledConnection
DB2SimpleDataSource
DB2DataSource
DB2ConnectionPoolDataSource
DB2XADataSource
Class.forName()
Static Initializer
Uniform Resource Locator (URL)
Parameter markers
Exception
Throwing an exception
Exception Handler
try
catch
finally
SQLJ
Declaration statements
Executable statements
SQLJ Clause
Connection Context
Default Context
Execution Context
Result set iterator
Positioned iterator
Named iterator
Host Expression
SQLJ Translator
SQLJ serialized profile file
DB2 SQLJ Profile Customizer
db2profc
db2sqljcustomize

Techniques you will master:

Understanding the difference between JDBC and SQLJ.
Knowing how some of the more common JDBC interfaces and classes are used to develop JDBC applications and applets.
Knowing how some of the more common SQLJ interfaces and classes are used to develop SQLJ applications and applets.
Knowing how to establish a database connection from a JDBC or an SQLJ application or applet.
Knowing how to construct JDBC and SQLJ applications and applets from source code files.
Knowing how to capture and process errors, as well as how to obtain diagnostic information when an error occurs.
Recognizing how iterators are used in SQLJ applications and applets to obtain values from a result data set.

An Introduction to Java and JDBC


The Java platform is built around the idea that, once written, software should run on many kinds of computers, consumer gadgets, and other devices without requiring additional modification. Java was developed by Sun Microsystems, and since its initial commercial release in 1995, Java technology has grown in popularity and usage because of its true portability: its "Write Once, Run Anywhere" capability allows developers to create a single version of an application that runs "as is" on a wide variety of platforms.

So how does Java work? Application source code files are constructed using the Java programming language and compiled to produce what is known as Java bytecode. Java bytecode is then interpreted by a tool known as the Java Virtual Machine (JVM) to produce machine instructions that are native to the operating system upon which the JVM has been installed.

JDBC is a dynamic SQL interface that works transparently with the Java programming language. (As a point of interest, JDBC is a trademarked name, not an acronym; nevertheless, JDBC is often thought of as standing for "Java Database Connectivity.") JDBC was developed by JavaSoft, a one-time subsidiary of Sun Microsystems, and is now the de facto industry standard for database-independent connectivity between the Java programming language and a wide range of relational database management systems.

JDBC consists of a set of classes and interfaces (written in the Java programming language) that together provide a standard API set that makes it possible to write industrial-strength database applications that can take advantage of the capabilities Java provides. The value of the JDBC API is that an application can access virtually any data source and run on any platform that contains the Java Virtual Machine. In other words, with the JDBC API, it isn't necessary to write one program to access a DB2 UDB database, another program to access an Oracle database, another program to access a Sybase database, and so on. Instead, a single program written using the JDBC API is capable of sending SQL statements to any relational database system residing on any operating system.

JDBC Drivers
To use the JDBC API with a particular relational database management system, you must have a JDBC technology-based driver that can mediate calls between JDBC and the database system being used. JDBC technology-based drivers generally fit into one of four categories:

Type 1: JDBC-ODBC bridge plus ODBC driver. The JDBC-ODBC bridge driver is a product that provides JDBC access via ODBC drivers; this type of driver translates JDBC calls made from a Java application into corresponding ODBC function calls and forwards them to an ODBC Driver Manager for processing. This driver was originally designed to provide JDBC applications with a way to access data when no other drivers were available. Now that other drivers exist, use of the Type 1 driver should be restricted to prototyping. (Not only is it the slowest driver available, but it is also the most limited in the functionality it provides.)

Type 2: Partly Java-Partly Native API driver. This type of driver converts JDBC calls made from a Java application into DBMS-specific native-API calls and executes them. Like the JDBC-ODBC bridge driver, this driver requires some operating system-specific binary code to be stored on each client workstation used. (DB2 UDB Version 8.1 and earlier provides a Type 2 driver named COM.ibm.db2.jdbc.app.DB2Driver, which is located in a file named db2java.zip. Because this driver is typically used with Java applications rather than with Java applets, it is often referred to as the "app" driver.)

Type 3: JDBC-Pure Java driver and middleware server. This type of driver translates JDBC calls made from a Java program (typically an applet) into a DBMS-independent net protocol, which is then translated to a DBMS-specific protocol by a server. (DB2 UDB Version 8.1 and earlier provides a Type 3 driver named COM.ibm.db2.jdbc.net.DB2Driver, which is located in a file named db2java.zip. This driver communicates with a DB2 Java daemon process that translates JDBC calls into CLI. Because this driver is typically used over a network, it is often referred to as the "net" driver; because it is often used with JDBC applets rather than with JDBC applications, it is also referred to as the "DB2 JDBC applet" driver or simply the "applet" driver.)

Type 4: Native-Pure Java driver. This type of driver is written in pure Java and uses a DBMS vendor-specific protocol to establish a network connection directly to a specific DBMS. This driver requires the least amount of overhead and provides the most efficient solution to database access. It can also be used for any type of Java application or applet. (DB2 UDB Version 8.1, FixPak 2 and later provides a Type 4 driver named com.ibm.db2.jcc.DB2Driver, which is located in a file named db2jcc.jar. This driver is referred to as the DB2 Universal JDBC Driver.)

The DB2 JDBC application, or "app," driver communicates directly with a DB2 UDB server, using CLI/ODBC function calls that are made through the DB2 UDB client libraries. As a result, this driver provides much better performance than the DB2 JDBC "applet," or "net," driver because communication does not have to go through the DB2 Java daemon. Figure 6-1 illustrates a typical DB2 JDBC environment in which the "app" driver is used.

Figure 6-1: DB2 JDBC Application ("app") driver environment.

The DB2 JDBC "net" driver, however, communicates indirectly with a DB2 UDB server, using the DB2 Java daemon (also known as the DB2 JDBC Applet Server). Here's how it works: When a Java applet is encountered on a Web client, the client Web browser downloads the file db2java.zip just before the applet is executed. Then, the DB2 "net" driver establishes a TCP/IP connection with the DB2 Java daemon that is running on the Web server host where the Web page containing the applet was served. The DB2 Java daemon then fulfills database requests on behalf of the applet. Figure 6-2 illustrates a typical DB2 JDBC environment in which the "net" driver is used.

Figure 6-2: DB2 JDBC Applet ("net") driver environment.

Because the "app" driver provides much better performance than the "net" driver, it should be used whenever possible. However, keep in mind that the "app" driver cannot be used with Java applets, because an applet runs in a client browser and most clients who access Web pages will not have the DB2 libraries installed that are needed to process a DB2 JDBC applet.

Common JDBC Classes and Methods


Because it is based on object-oriented technology, the JDBC API is composed of a set of interfaces and classes, many of which are subclasses that have been derived from other Java base classes. The JDBC-specific interfaces and classes available are stored in two separate packages named java.sql and javax.sql. Depending on the nature of the Java application or applet being developed, one or both of these packages must be imported into a Java source code file before the JDBC API can be used. The interfaces and classes contained in the java.sql package are identified in Table 6-1; the interfaces and classes contained in the javax.sql package are identified in Table 6-2.

Table 6-1: JDBC Interfaces and Classes Found in the Java v1.4.2 java.sql Package

Data Source Connection Management

DriverManager class: Provides the basic service needed to manage a set of JDBC drivers.

Driver interface: A special interface that must be implemented by every JDBC driver.

DriverPropertyInfo class: Obtains or sets driver properties. (Typically, only advanced programmers who need to interact with a driver to obtain and supply properties for establishing connections use this class.)

Connection interface: Establishes and manages a data source connection (session).

SQL Statement Processing

Statement interface: Used to execute an SQL statement and obtain any results produced by the statement's execution.

PreparedStatement interface: Used to represent a precompiled or prepared SQL statement. (This interface extends the Statement interface.)

CallableStatement interface: Used to invoke an SQL stored procedure. (This interface extends the Statement interface.)

ResultSet interface: Used to retrieve and/or update a result data set, which is usually generated by executing an SQL statement that queries the connected data source.

Savepoint interface: Used to provide the representation of a savepoint, which is a point within the current transaction that can be referenced from the Connection.rollback() method. When a transaction is rolled back to a savepoint, all changes made after that savepoint was created are undone.

SQL Data Type-to-Java Data Type Mapping

Array interface: Represents (maps) an SQL ARRAY value in a Java application or applet.

Blob interface: Represents (maps) an SQL BLOB value in a Java application or applet.

Clob interface: Represents (maps) an SQL CLOB value in a Java application or applet.

Date class: Represents (maps) an SQL DATE value in a Java application or applet. (This class consists of a thin wrapper around a millisecond value that allows JDBC to identify it as an SQL DATE value.)

Ref interface: Represents (maps) an SQL REF value, which is a reference to an SQL structured data type value in the connected data source, in a Java application or applet.

Struct interface: Represents (maps) an SQL structured data type value in a Java application or applet.

Time class: Represents (maps) an SQL TIME value in a Java application or applet. (This class consists of a thin wrapper around a java.util.Date value that allows JDBC to identify it as an SQL TIME value.)

Timestamp class: Represents (maps) an SQL TIMESTAMP value in a Java application or applet. (This class consists of a thin wrapper around a java.util.Date value that allows JDBC to identify it as an SQL TIMESTAMP value.)

Types class: Defines the constants used to identify generic SQL data types (called JDBC types) in a Java application or applet.

SQL User-Defined Data Type-to-Java Data Type Mapping

SQLData interface: Represents (maps) an SQL user-defined data type value in a Java application or applet.

SQLInput interface: An input stream that contains a stream of values representing an instance of an SQL structured data type or distinct data type.

SQLOutput interface: An output stream that can be used to write the attributes of an SQL user-defined data type back to a connected data source.

Obtaining Metadata

DatabaseMetaData interface: Used to obtain comprehensive information about the connected data source as a whole.

ParameterMetaData interface: Used to obtain information about the data types and properties of the parameters used in a PreparedStatement object.

ResultSetMetaData interface: Used to get information about the attributes (data types and properties) of the columns in a ResultSet object.

Trapping Errors and Warnings

SQLException class: An exception that provides information when there is a problem accessing data.

SQLWarning class: An exception that provides information when a data source access warning is generated.

DataTruncation class: An exception that reports a "Data Truncation" warning (on reads) or throws a "Data Truncation" exception (on writes) when JDBC unexpectedly truncates a data value.

BatchUpdateException class: An exception that provides information when one or more commands in a batch update operation do not execute successfully.

Providing Security

SQLPermission class: Used to specify the permission that the SecurityManager will check when code that is running an applet calls the DriverManager.setLogWriter method.

Table 6-2: JDBC Interfaces and Classes Found in the Java v1.4.2 javax.sql Package

Interfaces

ConnectionEventListener: An interface that registers to be notified of events generated by a PooledConnection object. Such events are generated by a JDBC driver when an application is finished using a data source connection or when a connection error occurs because the connection can no longer be used (for example, if a server crashes).

ConnectionPoolDataSource: A factory for PooledConnection objects.

DataSource: A factory for connections to the physical data source that the DataSource object represents. (A JDBC driver that is accessed via the DataSource interface does not automatically register itself with the DriverManager class.)

PooledConnection: An interface that provides hooks for connection pool management.

RowSet: An interface that adds support to the JDBC API for the JavaBeans component model. (The RowSet interface is unique in that its implementation is a layer of software that executes "on top" of a JDBC driver.)

RowSetInternal: An interface that a RowSet object implements to present itself to a RowSetReader or RowSetWriter object.

RowSetListener: An interface that must be implemented by any component that wants to be notified when a significant event happens in the life of a RowSet object.

RowSetMetaData: An interface used to get information about the columns in a RowSet object.

RowSetReader: A special interface that a disconnected RowSet object calls on to populate itself with rows of data.

RowSetWriter: A special interface that provides methods used to update rows in a RowSet object. (For example, the RowSetWriter.writeRow() method is called internally by RowSet.acceptChanges() for each row that is inserted, updated, or deleted in a RowSet object.)

XAConnection: An interface that provides support for distributed transactions.

XADataSource: A factory for XAConnection objects (that is used internally).

Classes

ConnectionEvent: An event object that is used to provide information about the source of a connection-related event. ConnectionEvent objects are generated when an application closes a pooled connection and when an error occurs.

RowSetEvent: An event object that is generated when an event occurs to a RowSet object. RowSetEvent objects are generated when a single row in a rowset is changed, the whole rowset is changed, or the rowset cursor is moved.

Of course, the real heart of the JDBC API is not so much the interfaces and classes themselves, but rather the various methods that are encapsulated within each interface and class provided. Therefore, to develop any type of Java application or applet that utilizes the JDBC API, you must be familiar with some of the more common interface and class methods available.

The DriverManager Class

Earlier, we saw that in order to use the JDBC API to interact with a relational database management system, a JDBC technology-based driver must be available to mediate calls between the Java application or applet and the database system being used. Four types of drivers are available, and before any driver can be used, it must be registered with the JDBC DriverManager class. (The DataSource interface, which is a new interface introduced in the JDBC 2.0 API, is an alternative to the DriverManager class.) Registration is performed at the beginning of a JDBC application or applet by calling the static forName() method of the Class object and specifying the fully-qualified name of the JDBC driver to be registered; loading the driver class in this way causes the driver to register itself with the DriverManager class.

By far, the most important DriverManager class method available is the getConnection() method. An instance of the Connection interface is created by calling this method and passing it a Uniform Resource Locator (URL) that identifies the data source a connection is to be established to, along with any other appropriate information, such as the authorization ID and password that are to be used to establish the connection. As part of the Connection interface initialization, the DriverManager class attempts to locate a JDBC driver that can be used to connect to the desired data source, and if a suitable driver is identified, a data source connection is established. From that point on, the connection to the data source is represented by its associated Connection object.
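For example, here is a minimal sketch of this registration-and-connect sequence, assuming the legacy DB2 "app" driver is installed and a database named SAMPLE has been cataloged (the user ID and password shown are placeholders):

import java.sql.Connection;
import java.sql.DriverManager;

public class ConnectExample
{
    public static void main(String[] args)
    {
        try
        {
            // Load the driver class; this registers the driver
            // with the DriverManager
            Class.forName("COM.ibm.db2.jdbc.app.DB2Driver");

            // Ask the DriverManager for a connection to the
            // SAMPLE data source
            Connection con = DriverManager.getConnection(
                "jdbc:db2:SAMPLE", "db2admin", "ibmdb2");

            // ... interact with the data source here ...

            // Release the connection's JDBC resources
            con.close();
        }
        catch (Exception e)
        {
            e.printStackTrace();
        }
    }
}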

The Connection Interface


Like a CLI/ODBC connection handle, a JDBC Connection interface represents a connection to a specific data source. And like connection handles, the Connection interface controls certain properties that are related to data source interaction, such as the isolation level used by transactions and whether transactions are to be run in manual-commit or automatic-commit mode. Some of the more common Connection interface methods used include the following:

getMetaData(). Returns a DatabaseMetaData object that contains information about the connected data source (for example, a list of table names, a list of column names, etc.). The DatabaseMetaData object returned has its own set of methods that can be used to extract specific values from the information obtained.

setTransactionIsolation(int). Specifies the isolation level that is to be used by all transactions interacting with the Connection object.

setAutoCommit(boolean). Specifies whether transactions are to be committed manually or automatically whenever an SQL statement is submitted to the connected data source for execution. As in CLI/ODBC, the auto-commit feature is turned on by default and must be explicitly turned off if you wish to manually commit transactions.

setReadOnly(boolean). Specifies whether the Connection object is to be used in read-only mode.

createStatement(). Creates a Statement object that can be used to send one or more SQL statements to the connected data source for processing. (Several Statement objects can be created for a single Connection object.)

prepareStatement(String). Creates a PreparedStatement object that can be used to execute a single SQL statement repeatedly. A prepared SQL statement may contain parameter markers (which are represented by question marks (?)), in which case specific values must be provided for each parameter marker used before the prepared SQL statement is executed. (Several PreparedStatement objects can be created for a single Connection object.)

prepareCall(String). Creates a CallableStatement object that can be used to call a stored procedure that resides in the connected data source. (Several CallableStatement objects can be created for a single Connection object.)

commit(). Commits the current transaction associated with the Connection object.

rollback(). Rolls back the current transaction associated with the Connection object.

getWarnings(). Returns the first warning reported for the Connection object.

close(). Terminates a data source connection and releases a Connection object's JDBC resources immediately instead of waiting for them to be released automatically.

Note: If a JDBC object is not explicitly closed when it is no longer needed, the Java Virtual Machine's garbage collector thread will eventually release its associated resources during a lull in system activity or if memory resources on the database server are constrained. However, it is generally good practice to release JDBC resources as soon as you are finished with them by calling a JDBC object's close() method when the object is no longer needed.

The DatabaseMetaData Interface

A DatabaseMetaData object is used to obtain information about the data source that a JDBC application has established a connection to. Some of the more common DatabaseMetaData interface methods used include the following:

getTables(). Obtains a list of all tables that have been defined in the data source.

getColumns(). Obtains a list of all columns that have been defined in the data source.

getIndexInfo(). Obtains a list of a table's indices and corresponding statistics.

getTypeInfo(). Obtains a list of all data types recognized by the data source.

supportsOpenCursorsAcrossCommit(). Indicates whether open cursors will persist across transaction commit boundaries.

supportsOpenCursorsAcrossRollback(). Indicates whether open cursors will persist across transaction rollback boundaries.

The Statement Interface

Like a CLI/ODBC statement handle, a JDBC Statement object represents a single SQL statement that is to be processed. And once created, a Statement object can be used both to execute an SQL statement and to obtain any results produced by the statement's execution. Some of the more common Statement interface methods used include the following:

execute(String). Executes an SQL statement that may or may not produce a result data set. (This method returns a boolean value of TRUE if a result data set is produced as a result of the SQL statement specified being executed; depending on the value returned, you may decide to invoke other methods such as getResultSet() and getUpdateCount().)

executeQuery(String). Executes an SQL query (i.e., a SELECT or a VALUES SQL statement) and returns a ResultSet object containing the results of the query.

executeUpdate(String). Executes an INSERT, UPDATE, or DELETE SQL statement. This method returns a count of the rows that were modified (if any) when the SQL statement was executed.

getResultSet(). Creates a ResultSet object that has its own set of methods that can be used to extract specific values from the result data set produced in response to a query.

setCursorName(String). Assigns a name to the cursor associated with the result set produced by either the execute() or the executeQuery() method. Once defined, a cursor name can be used to perform positioned update and positioned delete operations.

setMaxRows(int). Specifies the maximum number of rows that any ResultSet object is allowed to contain.

getWarnings(). Returns the first warning reported for the Statement object.

close(). Releases a Statement object's JDBC resources immediately instead of waiting for them to be released automatically.

The Statement interface has two important sub-interfaces: PreparedStatement, which represents an SQL statement that has been prepared for execution, and CallableStatement, which represents an SQL statement that is to be used to invoke a stored procedure. Because both of these interfaces are derived from the Statement interface, they inherit all of the methods the Statement interface provides, and they contain additional methods of their own, many of which are used to assign values to parameter markers used in the statement. (Many of these methods can be seen in Table 6-3.)

Table 6-3: DB2 SQL-to-JDBC Data Type Mappings and Their Corresponding Data Movement Methods

SQL Data Type | Compatible Java Data Type or Class | PreparedStatement/CallableStatement Input Method | ResultSet Output Method
SMALLINT | short | setShort() | getShort()
INTEGER or INT | int | setInt() | getInt()
DECIMAL, DEC, NUMERIC, or NUM | BigDecimal | setBigDecimal() | getBigDecimal()
REAL or FLOAT | float | setFloat() | getFloat()
DOUBLE, DOUBLE PRECISION, or FLOAT | double | setDouble() | getDouble()
CHARACTER or CHAR | String | setString() | getString()
CHARACTER VARYING, CHAR VARYING, or VARCHAR | String or InputStream | setString(), setASCIIStream(), or setUnicodeStream() | getString(), getASCIIStream(), or getUnicodeStream()
LONG VARCHAR | InputStream | setASCIIStream() or setUnicodeStream() | getASCIIStream() or getUnicodeStream()
CHAR FOR BIT DATA | byte or byte[] | setByte() or setBytes() | getByte() or getBytes()
VARCHAR FOR BIT DATA | byte[] | setBytes() | getBytes()
LONG VARCHAR FOR BIT DATA | byte[] or InputStream | setBytes() or setBinaryStream() | getBytes() or getBinaryStream()
GRAPHIC | byte or byte[] | setByte() or setBytes() | getByte() or getBytes()
VARGRAPHIC | byte[] | setBytes() | getBytes()
LONG VARGRAPHIC | byte[] or InputStream | setBytes() or setBinaryStream() | getBytes() or getBinaryStream()
DATE | Date | setDate() | getDate()
TIME | Time | setTime() | getTime()
TIMESTAMP | Timestamp | setTimestamp() | getTimestamp()
BLOB | Blob | setBlob() | getBlob()
CLOB | Clob | setClob() | getClob()
DBCLOB | InputStream | setBinaryStream() | getBinaryStream()

The ResultSet Interface


A ResultSet object is used to process a result data set that has been generated in response to a query. A ResultSet object maintains a cursor that always points to its current row of data; initially, the cursor is positioned before the first row. (You may recall that when a query is executed from within an embedded SQL application, DB2 UDB uses a mechanism known as a cursor to retrieve data values from the result data set produced.) As you might imagine, a method is used to move the cursor to the next row in the result data set, and because this method returns a boolean value of FALSE when no more rows are available, it can be used in conjunction with a while loop to iterate through every record stored in a result data set. By default, a ResultSet object is not updatable, and its cursor can move only in the forward direction. However, new methods in the JDBC 2.0 API make it possible to create ResultSet objects that are both scrollable and updatable. Some of the more common ResultSet interface methods used include the following:
getMetaData(). Creates a ResultSetMetaData object, which can be used to obtain information about the properties of the columns (such as names, data types, nullability, etc.) in the result data set associated with a ResultSet object.
next(). Advances the cursor to the next row in the result data set.
getWarnings(). Returns the first warning reported for the ResultSet object.
close(). Releases a ResultSet object's JDBC resources immediately instead of waiting for them to be released automatically.
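For example, a minimal sketch of the basic fetch loop might look like the following (the EMPLOYEE table comes from the chapter's SAMPLE database; the helper method and its name are hypothetical):

import java.sql.*;

public class FetchLoop
{
    // Print One Column From Every Row Of A Result Data Set
    static void printNames(Connection ConObject) throws SQLException
    {
        Statement StmtObject = ConObject.createStatement();
        ResultSet Results =
            StmtObject.executeQuery("SELECT LASTNAME FROM EMPLOYEE");

        // next() Returns FALSE When No More Rows Are Available,
        // So It Can Drive The while Loop
        while (Results.next())
            System.out.println(Results.getString(1));

        // Release JDBC Resources As Soon As They Are No Longer Needed
        Results.close();
        StmtObject.close();
    }
}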

The ResultSetMetaData Interface


A ResultSetMetaData object is used to obtain information about the properties of a result data set that has been produced in response to a query. Some of the more common ResultSetMetaData interface methods used include the following:
getColumnCount(). Returns the number of columns found in the corresponding ResultSet object.
getColumnName(int). Obtains the name that has been assigned to the designated column.
getColumnType(int). Obtains the SQL data type that has been assigned to the designated column.
isNullable(int). Specifies whether the designated column contains and/or accepts null values.

The SQLException Class

The SQLException class is used to obtain diagnostic information about what caused a data source-related error or warning condition to occur. This class is a subclass of the Exception class, which in turn is a subclass of the Throwable class, which is the superclass of all error and exception handlers in the Java programming language. Objects that are instances of the Throwable class or one of its subclasses are thrown by the Java Virtual Machine when conditions that a reasonable application might want to capture and process are encountered. Some of the more common SQLException class methods used include the following:
getMessage(). Retrieves the error message text available (if any) associated with the SQLException object.
getErrorCode(). Retrieves the vendor-specific error code (in our case, the DB2-specific error code) generated for the SQLException object.
getSQLState(). Retrieves the SQLSTATE value generated for the SQLException object.
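A minimal sketch of how the ResultSetMetaData methods might be combined to describe a result data set is shown below (the query and the helper method are hypothetical; getColumnType() returns a JDBC type code as defined in the java.sql.Types class):

import java.sql.*;

public class DescribeResults
{
    // Display The Name And JDBC Type Code Of Every Column In A
    // Result Data Set
    static void describe(Connection ConObject) throws SQLException
    {
        Statement StmtObject = ConObject.createStatement();
        ResultSet Results = StmtObject.executeQuery("SELECT * FROM EMPLOYEE");
        ResultSetMetaData MetaData = Results.getMetaData();

        // Column Numbers Are 1-Based In JDBC
        for (int i = 1; i <= MetaData.getColumnCount(); i++)
            System.out.println(MetaData.getColumnName(i) +
                " (type " + MetaData.getColumnType(i) + ")");

        Results.close();
        StmtObject.close();
    }
}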

A Word about the JDBC DataSource Interface


Earlier, we saw that, before the JDBC API can be used to interact with a relational database management system, a JDBC technology-based driver must be registered with a Java application or applet. Typically, JDBC drivers are registered by calling the static forName() method of the Java Class object and providing the fully-qualified name of the JDBC-based driver to be registered with the DriverManager interface object, which in turn is used to establish a connection to a data source. Unfortunately, this approach reduces application portability because the application must know the driver class name and URL in advance in order to use a particular JDBC driver. Furthermore, the driver class name and URL that must be provided as input are specific to each JDBC vendor's driver implementation. To get around this issue, the DataSource interface was introduced in the JDBC 2.0 API. The most significant difference between the DriverManager interface and the DataSource interface in terms of establishing a connection is the way in which connection information is provided. When the DriverManager interface is used, connection information such as server name, port number, and data source name are provided with the name of a JDBC driver as a single URL value. However, when the DataSource interface is used, this information is provided as object property values that are set by the application and are not tied to any particular JDBC driver. Some of the properties that can be set, along with their respective data types and the methods used to set and retrieve their values, are shown in Table 6-4.

Table 6-4: JDBC DataSource Interface Properties

Property | Description | Data Type | Method Used to Retrieve or Set Property Value
serverName | The name assigned to the server where the data source a connection is to be established with resides. | String | getServerName(), setServerName()
portNumber | The database server listener port number. (With DB2 UDB servers, this is the port number that the DB2 Java daemon monitors for database connection requests.) | int | getPortNumber(), setPortNumber()
databaseName | The alias assigned to the data source a connection is to be established with. | String | getDatabaseName(), setDatabaseName()
user | The authorization ID that is to be used to establish a data source connection. | String | getUser(), setUser()
password | The password that is to be used to establish a data source connection. | String | getPassword(), setPassword()
description | Descriptive information about the data source. | String | getDescription(), setDescription()
Once created, a DataSource object can be thought of as a factory that facilitates connections to the data source that the DataSource object represents. In a basic DataSource object implementation, the getConnection() method of a DataSource object is used to create a Connection object that provides a physical connection to a particular data source. In a more complex implementation, a DataSource object may also implement connection pooling for the data source it represents. In this case, the getConnection() method returns a handle to a PooledConnection object rather than to a Connection object. An application uses the resulting PooledConnection object like any other connection; connection pooling has no effect on application code except that a pooled connection, like all connections, should be explicitly closed when it is no longer needed. (When an application closes a connection that is pooled, the connection is returned to a pool of reusable connections, and the next time the getConnection() method is called, a handle to one of these pooled connections will be returned if one is available. Because connection pooling reduces the need to create a new physical connection every time one is requested, it can help applications run significantly faster.)

When the DB2 Universal JDBC driver is used instead of the "app" or "net" driver, several implementations of the DataSource interface are provided:
DB2SimpleDataSource. Provides the same functionality as the DataSource interface but does not support connection pooling.
DB2DataSource. Provides the same functionality as the DataSource interface and provides connection pooling support. (With this implementation, connection pooling is handled internally and is transparent to the application that uses it.)
DB2ConnectionPoolDataSource. Provides the same functionality as the DB2DataSource interface; however, with this implementation, you must manage the connection pooling yourself, either by writing your own code or by using a tool such as WebSphere Application Server.
DB2XADataSource. Provides the same functionality as the DataSource interface, along with connection pooling, and supports distributed transactions (also known as two-phase commits). With this implementation, you must manage the distributed transactions and connection pooling yourself, either by writing your own code or by using a tool such as WebSphere Application Server. (If you are using an XA-compliant transaction manager, such as IBM WebSphere, BEA Tuxedo, or Microsoft Transaction Server, the number of transactions that can be running at one time is limited only by the resources available.)
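A minimal sketch of this property-based approach, using the DB2SimpleDataSource implementation and the setter methods listed in Table 6-4, might look like the following (the package name com.ibm.db2.jcc, the server name, port number, and credentials shown are all assumptions):

import java.sql.Connection;
import com.ibm.db2.jcc.DB2SimpleDataSource;   // Package Name Assumed

public class DataSourceConnect
{
    public static void main(String[] args) throws Exception
    {
        // Describe The Data Source Through Property Values Rather
        // Than Through A Driver-Specific URL
        DB2SimpleDataSource DataSrc = new DB2SimpleDataSource();
        DataSrc.setServerName("db2server");   // Placeholder Host Name
        DataSrc.setPortNumber(50000);         // Placeholder Port Number
        DataSrc.setDatabaseName("SAMPLE");
        DataSrc.setUser("db2admin");          // Placeholder Credentials
        DataSrc.setPassword("ibmdb2");

        // The DataSource Object Acts As A Connection Factory
        Connection ConObject = DataSrc.getConnection();
        System.out.println("Connected.");
        ConObject.close();
    }
}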

DB2 SQL and JDBC Data Types


As you know by now, variables that are to be used to move data between an application program and a database must be defined such that they are assigned a data type that is compatible with the SQL data type they will be interacting with. With JDBC applications and applets, such data type mapping is fairly straightforward because every SQL data type has a corresponding JDBC data type, along with a matching Java data type. (The JDBC data types are defined in the java.sql.Types class and are mapped to standard Java primitive data types or class objects.) In addition, most SQL data types have corresponding setXXX() and getXXX() methods in the PreparedStatement, CallableStatement, and ResultSet interfaces, which you use when assigning values to parameter markers or retrieving values from result data sets. The JDBC data types that are used with a particular DB2 SQL data type, along with their respective setXXX() and getXXX() methods, can be seen in Table 6-3.

JDBC Application and Applet Basics


Now that we have identified some of the major components that make up the JDBC interface, let's take a look at how these components are used to construct JDBC applications and applets. At a minimum, every JDBC application or applet must perform the following tasks, in the order shown:
1. Register a JDBC driver (if the DataSource interface is not used).
2. Establish a data source connection.
3. Process one or more transactions.
4. Clean up and release all resources used.

As you might imagine, the actual work associated with each of these tasks is conducted by defining one or more JDBC objects and executing one or more of the defined object's methods.

JDBC Driver Registration


Earlier, we saw that to use the JDBC API with a particular relational database management system, you must have a JDBC technology-based driver that can mediate calls between JDBC and the database system being used. However, before a JDBC driver can be used by a Java application or applet, it must first be registered with the JDBC DriverManager class. This is done at the beginning of a Java application or applet by calling the forName() method of the Class object and specifying the fully-qualified name of the JDBC-based driver to be registered. (This action will result in the creation of a corresponding DriverManager object.) In the case of DB2 UDB, the "app" JDBC driver is registered with the JDBC DriverManager class by making the following Class.forName() method call:

Class.forName("COM.ibm.db2.jdbc.app.DB2Driver")

On the other hand, the DB2 UDB "net" JDBC driver is registered by making a Class.forName() method call that looks like this:

Class.forName("COM.ibm.db2.jdbc.net.DB2Driver")

Finally, the DB2 Universal JDBC driver is registered by making a Class.forName() method call that looks like this:

Class.forName("com.ibm.db2.jcc.DB2Driver")

Because the Class object is part of Java and not JDBC, it can be used to load any class that can be found in a workstation's class path (which is defined using an environment variable). When a Class object is created, the Java Virtual Machine is responsible for initializing the class unless the class provides its own initialization code, referred to as the "static initializer". In a JDBC driver, the static initializer is responsible for registering the newly-loaded driver with JDBC at run time.

Establishing a Data Source Connection


Once a JDBC technology-based driver has been registered, the getConnection() method of the resulting DriverManager object can be used to physically establish a connection to a data source. Any data source connection information needed (such as database name, server ID, and port number) is passed to this method in the form of a Uniform Resource Locator (URL). A valid URL for the DB2 UDB "app" JDBC driver is:

jdbc:db2:[DataSource]

while a valid URL for the DB2 UDB "net" JDBC driver is:

jdbc:db2://[ServerID]:[Port]/[DataSource]

where:
DataSource   Identifies the name assigned to the data source with which a connection is to be established.
ServerID     Identifies the name assigned to the server where the data source specified in the DataSource parameter resides.
Port         Identifies the listener port number being used by the DB2 Java daemon on the server upon which the data source specified resides. (If no port number is specified, the default port number 6789 is used.)

Additional information such as the authorization ID and corresponding password that is to be used to establish the connection can also be specified as input parameter values for the getConnection() method. When the getConnection() method is invoked, the DriverManager class attempts to connect to the data source specified using a JDBC driver that has already been registered. Once a suitable driver is identified, a data source connection is established and a Connection interface object is created (if no driver is found, an exception will be thrown). From that point on, the connected data source is represented by its corresponding Connection object.
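Putting the registration and connection steps together, a minimal connection sketch using the "app" driver might look like this (the SAMPLE database and the credentials shown are placeholders):

import java.sql.*;

public class Connect
{
    public static void main(String[] args) throws Exception
    {
        // Register The "app" Driver, Then Connect To The SAMPLE Database
        Class.forName("COM.ibm.db2.jdbc.app.DB2Driver");
        Connection ConObject = DriverManager.getConnection(
            "jdbc:db2:SAMPLE", "db2admin", "ibmdb2");

        System.out.println("Connected.");
        ConObject.close();
    }
}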

Transaction Processing
Like CLI/ODBC applications, the bulk of a JDBC application or applet is made up of high-level programming language statements (in this case, Java) that invoke functions (or methods) to perform SQL operations against a connected data source. These SQL operations can be performed individually or they can be grouped together as a single unit of work (otherwise known as a transaction). In either case, the following steps must be carried out in order to perform most SQL operations:
1. Create a JDBC Statement, PreparedStatement, or CallableStatement object and associate it with an SQL statement.
2. Provide values for any parameter markers used.
3. Execute the SQL statement.
4. Retrieve and process any results produced.
5. Terminate the current transaction by committing it or rolling it back.
6. Release all resources used.

The following sections describe each of these steps in more detail.

Creating a JDBC Statement, PreparedStatement, or CallableStatement Object

Before any SQL statement can be executed by a JDBC application or applet, it must be assigned to a JDBC Statement, PreparedStatement, or CallableStatement object. Like CLI/ODBC statement handles, Statement, PreparedStatement, and CallableStatement objects contain detailed information about a particular SQL statement: information that a data source needs in order to process the statement. Just as SQL statements can be processed in embedded SQL and CLI/ODBC applications in one of two ways, SQL statements can be processed in a JDBC application or applet in one of two ways:
Prepare then Execute. This approach separates the preparation of the SQL statement from its execution and is typically used when an SQL statement is to be executed repeatedly. This method is also used when an application needs advance information about the columns that will exist in the result data set produced when the SQL statement is executed.
Execute. This approach combines the preparation and execution of an SQL statement into a single step and is typically used when an SQL statement is to be executed only once. This method is also used when the application does not need additional information about the result data set that will be produced, if any, when the SQL statement is executed.

The method used determines which type of object is needed: If the preparation of an SQL statement is to be separated from its execution, a PreparedStatement object should be used. On the other hand, if the preparation and execution of an SQL statement is to be combined into a single step, a Statement object should be used. (If the SQL statement will invoke a stored procedure, a CallableStatement object should be used.) Statement objects are created by calling the createStatement() method of the appropriate Connection object; the SQL statement that is to be associated with a particular Statement object must be specified as an input parameter value to many of the Statement object methods available. In contrast, PreparedStatement objects are created by calling the prepareStatement() method of the appropriate Connection object; in this case, the corresponding SQL statement must be provided (as an input parameter value) when the prepareStatement() method is invoked. (CallableStatement objects are created by calling the prepareCall() method of the appropriate Connection object with the corresponding SQL statement specified as an input parameter value.) All three creation methods are illustrated in the sketch that follows.
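A minimal sketch of the three creation methods (the UPDATE statement and the stored procedure name MYPROC are hypothetical):

import java.sql.*;

public class StatementKinds
{
    static void create(Connection ConObject) throws SQLException
    {
        // Execute Approach: The SQL Statement Is Supplied To The
        // Execution Method, Not To createStatement()
        Statement StmtObject = ConObject.createStatement();

        // Prepare Then Execute Approach: The SQL Statement Is
        // Supplied When The PreparedStatement Object Is Created
        PreparedStatement PrepStmt = ConObject.prepareStatement(
            "UPDATE EMPLOYEE SET SALARY = ? WHERE EMPNO = ?");

        // Stored Procedure Invocation (Procedure Name Is A Placeholder)
        CallableStatement CallStmt = ConObject.prepareCall("CALL MYPROC(?)");

        CallStmt.close();
        PrepStmt.close();
        StmtObject.close();
    }
}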

Populating Parameter Markers


To increase reusability, SQL statements that are associated with a PreparedStatement or CallableStatement object can include parameter markers in place of constants, expressions, or both. Parameter markers are represented by the question mark (?) character and indicate the position in an SQL statement where a value is to be substituted when the statement is actually executed; this substitution is carried out by calling the appropriate setXXX() method of the PreparedStatement or CallableStatement object that the SQL statement is associated with. (Refer to Table 6-3 for more information about the setXXX() methods available.) Every setXXX() method available requires two parameter values as input. The first parameter value identifies the parameter marker in the corresponding SQL statement that a value is being provided for, and the second is the value itself. Thus, the Java code used to substitute the integer value 20 for the second parameter marker in an SQL statement would look something like this:

Stmt.setInt(2, 20);

(assuming that Stmt refers to a PreparedStatement or CallableStatement object). By using parameter markers and appropriate setXXX() methods, an application can execute a single SQL statement multiple times and obtain different results each time, simply by modifying the value assigned to any parameter marker between executions. Because the access plan used to process such an SQL statement is cached in memory at the time the statement is prepared, subsequent executions require little or no additional overhead, resulting in an improvement in application performance. This reuse pattern is sketched below.
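A minimal sketch of the reuse pattern, using columns from the EMPLOYEE sample table (the helper method and the values supplied are hypothetical):

import java.sql.*;

public class ParameterReuse
{
    static void raiseSalaries(Connection ConObject) throws SQLException
    {
        // The Statement Is Prepared Once ...
        PreparedStatement PrepStmt = ConObject.prepareStatement(
            "UPDATE EMPLOYEE SET SALARY = SALARY + ? WHERE WORKDEPT = ?");

        // ... And Executed Twice With Different Parameter Values
        PrepStmt.setDouble(1, 500.00);
        PrepStmt.setString(2, "D11");
        PrepStmt.executeUpdate();

        PrepStmt.setDouble(1, 750.00);
        PrepStmt.setString(2, "E21");
        PrepStmt.executeUpdate();

        PrepStmt.close();
    }
}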

Executing SQL Statements


Once an SQL statement has been associated with a JDBC Statement, PreparedStatement, or CallableStatement object (and values have been provided for any parameter markers used), the statement is ready for execution. Regardless of which type of statement object is used, three methods are available to facilitate the execution of SQL statements: the execute() method, the executeQuery() method, and the executeUpdate() method; the appropriate method to use is determined largely by the type of SQL statement that is to be executed. As you would expect, the executeQuery() method is normally used to execute queries (SELECT and VALUES SQL statements), and the executeUpdate() method is typically used to execute INSERT, UPDATE, and DELETE statements, along with most Data Definition Language statements available (for example, CREATE, ALTER, and DROP statements). The execute() method, however, can be used to execute any type of SQL statement but is typically used for statements that cannot be handled by either of the other two methods.
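A minimal sketch of how execute() might be used when the kind of SQL statement is not known in advance (the helper method is hypothetical):

import java.sql.*;

public class ExecuteMethods
{
    static void run(Connection ConObject, String SQLStmt) throws SQLException
    {
        Statement StmtObject = ConObject.createStatement();

        // execute() Handles Any Kind Of SQL Statement; Its Return
        // Value Indicates Whether A Result Data Set Was Produced
        if (StmtObject.execute(SQLStmt))
            System.out.println("Query; use getResultSet() to process rows.");
        else
            System.out.println(StmtObject.getUpdateCount() +
                " row(s) modified.");

        StmtObject.close();
    }
}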

Retrieving and Processing Results


Once an SQL statement has been prepared and executed, any results produced will need to be retrieved and processed. (Result information is stored in a data storage area that is referenced by a ResultSet object, which is created automatically when the executeQuery() method is invoked.) If the SQL statement executed was anything other than a SELECT or VALUES statement, the only additional processing required after execution is a quick check to ensure that the statement performed as expected. However, if a SELECT statement or VALUES statement was executed and a ResultSet object was created, the following steps must be performed to retrieve each row of data from the result data set produced:
1. Determine the structure (i.e., the number of columns, column data types, and data lengths) of the result data set produced. This is done by invoking the getMetaData() method of the ResultSet object produced in response to the query.
2. Copy values stored in the columns of the result data set produced into application variables using the appropriate getXXX() method. (Refer to Table 6-3 for more information about the getXXX() methods available.)
3. Position the cursor on each row of data in the result data set by repeatedly calling the next() method of the ResultSet object being used until every row has been processed.

In the first step, the result data set produced when the SELECT or VALUES SQL statement was executed is analyzed to obtain information about its structure (i.e., column names, column data types, column nullability, etc.). If the SQL statement was hard-coded into the application, this step is unnecessary because the structure of the result data set produced should already be known. However, if the SQL statement was generated at application run time, the result data set produced must be queried to obtain this information. Once the structure of a result data set is known, the values stored in one or more columns can be copied to application variables using getXXX() methods that are compatible with a column's data type. One getXXX() method must be invoked for every column value that is to be retrieved.

In the third step, when all column values for a particular row have been acquired, the cursor is positioned on the next row in the result data set by calling the next() method and the process of retrieving column values is repeated. Usually, this entire process is repeated as long as there are unprocessed rows in the result data set.
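A minimal sketch that combines all three steps for a query whose structure is not known until run time might look like this (the helper method is hypothetical, and getString() is used for every column purely for simplicity; normally a data-type-specific getXXX() method would be chosen for each column):

import java.sql.*;

public class GenericFetch
{
    static void dump(Connection ConObject, String Query) throws SQLException
    {
        Statement StmtObject = ConObject.createStatement();
        ResultSet Results = StmtObject.executeQuery(Query);

        // Step 1: Determine The Structure Of The Result Data Set
        int ColCount = Results.getMetaData().getColumnCount();

        // Step 3: Position The Cursor On Each Row In Turn
        while (Results.next())
        {
            // Step 2: Copy Each Column Value Out Of The Current Row
            for (int i = 1; i <= ColCount; i++)
                System.out.print(Results.getString(i) + "\t");
            System.out.println();
        }

        Results.close();
        StmtObject.close();
    }
}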

Terminating the Current Transaction


It was mentioned earlier that a transaction (also known as a unit of work) is a sequence of one or more SQL operations that are grouped together as a single unit, usually within an application process. Transactions are important because the initiation and termination of a single transaction defines points of data consistency within a database; the effects of all operations performed within a transaction are either applied to the database and made permanent (committed) or backed out (rolled back), in which case the database is returned to the state it was in before the transaction was initiated.

As with CLI/ODBC applications, JDBC applications and applets can connect to multiple data sources simultaneously, and each data source connection can constitute a separate transaction boundary. And like CLI/ODBC applications, JDBC applications and applets can be configured to run in one of two modes: auto-commit or manual-commit. When auto-commit mode is used, each SQL statement is treated as a complete transaction, and each transaction is automatically committed after the SQL statement successfully executes. For anything other than SELECT SQL statements, the commit operation takes place immediately after the statement is executed. For SELECT statements, the commit operation takes place immediately after the ResultSet object being used to process the results is closed. Auto-commit mode is the default commit mode used and is usually sufficient for simple JDBC applications. Larger applications, however, particularly applications that perform update operations, should switch to manual-commit mode as soon as a data source connection is established.

When manual-commit mode is used, transactions are started implicitly the first time an application accesses a data source and are explicitly ended when the commit() or rollback() method of the Connection object associated with the transaction is executed. Thus, all operations performed against a data source between the time it was first accessed and the time the commit() or rollback() method is invoked are treated as a single transaction. Regardless of which type of commit mode is used, all transactions associated with a particular data source should be completed before the connection to that data source is terminated.
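A minimal sketch of manual-commit processing might look like the following (the UPDATE statement is a placeholder drawn from the EMPLOYEE sample table; the helper method is hypothetical):

import java.sql.*;

public class ManualCommit
{
    static void applyRaise(Connection ConObject) throws SQLException
    {
        // Switch To Manual-Commit Mode As Soon As The Connection Exists
        ConObject.setAutoCommit(false);

        Statement StmtObject = ConObject.createStatement();
        try
        {
            StmtObject.executeUpdate("UPDATE EMPLOYEE " +
                "SET SALARY = SALARY * 1.05 WHERE WORKDEPT = 'D11'");

            // Make The Change Permanent
            ConObject.commit();
        }
        catch (SQLException E)
        {
            // Back Out Every Change Made Since The Transaction Began
            ConObject.rollback();
            throw E;
        }
        finally
        {
            StmtObject.close();
        }
    }
}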

Cleaning up Resources
One step any well-behaved application will perform is to clean up any resources that were allocated or acquired on its behalf when those resources are no longer needed. In some situations, it is possible to ignore this step because the Java Virtual Machine, the operating system, or both automatically free any resources acquired when an application terminates. However, things can be very different if an application, or particularly an applet, is running in a "long-life" environment such as a Web application server. The Java Virtual Machine contains a mechanism known as the "garbage collector" that periodically locates and cleans up any unused objects it happens to find lying around. Some may argue that because the garbage collector exists, Java applications and applets do not have to be as diligent about resource clean-up as other applications. However, there is no way to tell when the garbage collector will spring into action, and an application may run out of resources long before that point. Additionally, because JDBC applications interface with one or more external systems (data sources), they are often required to use and maintain external resources that may be somewhat limited. For example, the number of open PreparedStatement or ResultSet objects allowed may be (and in fact, probably is) limited by the JDBC technology-based driver being used. Therefore, when the results of an SQL statement have been processed and the corresponding JDBC Statement, PreparedStatement, or CallableStatement object used is no longer needed, that object should be closed so that any resources reserved on its behalf are freed. The same is true for any ResultSet objects that are produced in response to a query.

To clean up and release the resources used by a Statement, PreparedStatement, CallableStatement, or ResultSet object, simply call that object's close() method. However, because ResultSet objects have a dependency on Statement or PreparedStatement objects, and because these objects, in turn, have a dependency on Connection objects, a ResultSet object will be closed automatically if its associated Statement or PreparedStatement object is closed. Likewise, a Statement, PreparedStatement, or CallableStatement object will automatically be closed when its associated Connection object is closed.
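A minimal sketch of this clean-up pattern, using a finally block so the close() calls run even if an error occurs, might look like this (the query and helper method are hypothetical):

import java.sql.*;

public class CleanUp
{
    static void query(Connection ConObject) throws SQLException
    {
        Statement StmtObject = null;
        ResultSet Results = null;
        try
        {
            StmtObject = ConObject.createStatement();
            Results = StmtObject.executeQuery("SELECT EMPNO FROM EMPLOYEE");
            while (Results.next())
                System.out.println(Results.getString(1));
        }
        finally
        {
            // Close In Dependency Order; Closing The Statement Object
            // Would Also Close Its ResultSet Object Automatically
            if (Results != null) Results.close();
            if (StmtObject != null) StmtObject.close();
        }
    }
}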

Putting It All Together


Now that we have examined each distinct task that every JDBC application or applet must perform, let's see how each of these tasks is typically coded in both a JDBC application and a JDBC applet. A simple JDBC application that obtains and prints the names and salaries of every employee that works in department D11 might look something like this:

// Import The JDBC Driver Manager Packages/Classes
import java.io.*;
import java.util.*;
import java.sql.*;

// Define The TestApplication Class
public class TestApplication
{
    static
    {
        // Register The DB2 UDB JDBC Application Driver
        try
        {
            Class.forName("COM.ibm.db2.jdbc.app.DB2Driver");
        }
        catch (ClassNotFoundException E)
        {
            System.out.println("Error registering driver.");
            E.printStackTrace();
        }
    }

    /*-----------------------------------------------------*/
    /* The Main Method                                      */
    /*-----------------------------------------------------*/
    public static void main(String argv[])
    {
        try
        {
            // Declare The Local Memory Variables
            Connection ConObject;
            Statement StmtObject;
            ResultSet Results;
            String ConURL;
            String UserID;
            String Password;
            String SQLStmt;
            String FName;
            String LName;
            double Salary;

            // Create The Connection URL And Establish A
            // Connection To The SAMPLE Database
            ConURL = "jdbc:db2:SAMPLE";
            UserID = "db2admin";
            Password = "ibmdb2";
            ConObject = DriverManager.getConnection(ConURL,
                UserID, Password);

            // Use Manual-Commit Mode
            ConObject.setAutoCommit(false);

            // Create And Execute A SELECT SQL Statement
            SQLStmt = "SELECT * FROM EMPLOYEE ";
            SQLStmt += "WHERE WORKDEPT='D11'";
            StmtObject = ConObject.createStatement();
            Results = StmtObject.executeQuery(SQLStmt);

            // Print A Header
            System.out.println("\nEmployee Information:\n");
            System.out.println("Name\t\t\tSalary");
            System.out.print("--------------------\t");
            System.out.println("----------");

            // While There Are Records, Process Them
            while (Results.next())
            {
                // Retrieve A Record
                FName = Results.getString(2);
                LName = Results.getString(4);
                Salary = Results.getDouble(12);

                // Print The Record Retrieved
                System.out.print(FName + " " + LName + "\t");
                if (FName.length() + LName.length() < 13)
                    System.out.print("\t");
                System.out.println("$ " + Salary + "0");
            }

            // Issue A Commit To Free All Locks
            ConObject.commit();

            // Close The Result Data Set And SQL Statement Objects
            Results.close();
            StmtObject.close();

            // Close The Database Connection
            ConObject.close();
        }
        catch (SQLException E)
        {
            System.out.println("SQL Error encountered.");
            System.out.println("Error msg: " + E + ". SQLSTATE = " +
                E.getSQLState() + " Error code = " + E.getErrorCode());
            E.printStackTrace();
        }
        catch (Exception E)
        {
            System.out.println("Error encountered.");
            E.printStackTrace();
        }
    }
}

A simple JDBC applet that performs the same operation (obtains and prints the names and salaries of every employee that works in department D11) might look something like this:

// Import The JDBC Driver Manager Packages/Classes
import java.sql.*;
import java.awt.*;
import java.applet.Applet;

// Define The TestApplet Class
public class TestApplet extends Applet
{
    static
    {
        // Register The DB2 UDB JDBC Net Driver
        try
        {
            Class.forName("COM.ibm.db2.jdbc.net.DB2Driver");
        }
        catch (ClassNotFoundException E)
        {
            System.out.println("Error registering driver.");
            E.printStackTrace();
        }
    }

    // Declare The Global Memory Variables
    Connection ConObject;

    /*-----------------------------------------------------*/
    /* Define The Initialization Method -- This Is The     */
    /* First Method That Gets Called When The Applet Is    */
    /* Downloaded                                          */
    /*-----------------------------------------------------*/
    public void init()
    {
        try
        {
            // Declare The Local Memory Variables
            String ConURL;
            String Server;
            String Port;
            String UserID;
            String Password;

            // Get All Parameter Values Passed From The
            // Calling HTML Page
            Server = getParameter("server");
            Port = getParameter("port");
            UserID = "db2admin";
            Password = "ibmdb2";

            // Create The Connection URL
            ConURL = "jdbc:db2://" + Server + ":" + Port;
            ConURL += "/SAMPLE";

            // Establish A Connection To The SAMPLE Database
            ConObject = DriverManager.getConnection(ConURL,
                UserID, Password);

            // Use Manual-Commit Mode
            ConObject.setAutoCommit(false);
        }
        catch (Exception E)
        {
            System.out.println("Unable to connect.");
            E.printStackTrace();
        }
    }

    /*-----------------------------------------------------*/
    /* Define The Screen Painting Method -- This Method    */
    /* Retrieves Data From The Database And Displays It    */
    /*-----------------------------------------------------*/
    public void paint(Graphics Output)
    {
        try
        {
            // Declare The Local Memory Variables
            Statement StmtObject;
            ResultSet Results;
            String SQLStmt;
            String FName;
            String LName;
            double Salary;
            int YCoordinate;
            String Outtext;

            // Create And Execute A SELECT SQL Statement
            SQLStmt = "SELECT * FROM EMPLOYEE ";
            SQLStmt += "WHERE WORKDEPT='D11'";
            StmtObject = ConObject.createStatement();
            Results = StmtObject.executeQuery(SQLStmt);

            // Print A Header
            Output.drawString("Employee Information:", 20, 40);
            Output.drawString("Name", 20, 65);
            Output.drawString("Salary", 180, 65);
            Output.drawString("--------------------", 20, 75);
            Output.drawString("---------------", 100, 75);
            Output.drawString("----------------", 180, 75);

            // While There Are Records, Process Them
            YCoordinate = 90;
            while (Results.next())
            {
                // Retrieve A Record
                FName = Results.getString(2);
                LName = Results.getString(4);
                Salary = Results.getDouble(12);

                // Display The Record Retrieved
                Outtext = FName + " " + LName;
                Output.drawString(Outtext, 20, YCoordinate);
                Outtext = "$ " + Salary + "0";
                Output.drawString(Outtext, 180, YCoordinate);
                YCoordinate += 20;
            }

            // Issue A Commit To Free All Locks
            ConObject.commit();

            // Close The Result Data Set And SQL Statement Objects
            Results.close();
            StmtObject.close();
        }
        catch (SQLException E)
        {
            System.out.println("SQL Error encountered.");
            System.out.println("Error msg: " + E + ". SQLSTATE = " +
                E.getSQLState() + " Error code = " + E.getErrorCode());
            E.printStackTrace();
        }
        catch (Exception E)
        {
            System.out.println("Error encountered.");
            E.printStackTrace();
        }
    }
}

Keep in mind that although a JDBC application can be executed from a system command prompt, a JDBC applet can only be called from an HTML page that is accessed by a Java-enabled Web browser. The following HTML code illustrates how the JDBC applet just shown could be invoked:

<html>
<!-- Before running this code, modify the values of the   -->
<!-- parameters used as follows:                          -->
<!-- Assign the name of the database server where the DB2 -->
<!-- SAMPLE database resides to "server" and assign the   -->
<!-- port number that the DB2 JDBC server daemon is       -->
<!-- listening on (6789 is the default) to "port"         -->
<head>
<title>DB2 JDBC Sample Applet</title>
</head>
<body>
<h1><center>DB2 JDBC Sample Applet</center></h1>
<center>
<applet code="TestApplet.class" width=325 height=375
    archive="db2java.zip">
<param name=server value='DB_SERVER'>
<param name=port value='6789'>
</applet>
</center>
</body>
</html>

Diagnostics and Error Handling


The Java programming language provides a mechanism known as exceptions to help programs capture and process errors. (An exception is an event that occurs during program execution that disrupts the normal flow of instructions.) Many kinds of errors can cause an exception to occur, ranging from serious hardware problems, such as a hard disk crash, to simple programming errors, such as trying to access an out-of-bounds array element. When such errors occur within a Java method, the method creates an exception object and hands it off to the Java Virtual Machine. The exception object contains information about the exception, including its type and the state the program was in when the exception object was generated. The Java Virtual Machine then searches backward through the call stack, beginning with the method in which the error occurred, until it finds code that is designed to capture and process the exception that was generated. In Java terminology, creating an exception object and handing it to the Java Virtual Machine is known as throwing an exception, and the block of code that is responsible for capturing and processing a thrown exception is known as an exception handler.

Multiple exception handlers can exist within a single JDBC application or applet. When an exception is thrown, the Java Virtual Machine attempts to locate an exception handler that has been designed specifically for the type of exception that was thrown. (The exception handler chosen is said to catch the exception.) If the Java Virtual Machine exhaustively searches all of the methods on the call stack without finding an appropriate exception handler, the Java application or applet is terminated. Depending on how it has been coded, an exception handler can attempt to recover from the situation that caused an exception to be thrown or, if it determines that the situation is unrecoverable, provide a clean way for the Java application or applet to terminate.

Three Java keywords play an important part in exception handling:
try. The try keyword identifies a block of statements that an exception might be thrown from. A try keyword must have at least one corresponding catch or finally keyword.
catch. The catch keyword identifies a block of statements that are designed to handle a specific type of exception. The catch keyword must be associated with a try keyword; the statements within a catch keyword block are executed only if an exception of a specified type occurs within the corresponding try block.
finally. The finally keyword identifies a block of statements that are to be executed regardless of whether an error occurs within a try block. Like the catch keyword, the finally keyword must be associated with a try keyword.

These keywords are used in the following way:

try
{
    [ProgramStatements]
}
catch ([ExceptionType] [Name])
{
    [ProgramStatements]
}
finally
{
    [ProgramStatements]
}

where:
ProgramStatements   Identifies one or more program statements that make up the try block, catch block, or finally block.
ExceptionType       Identifies a specific type of exception that the exception handler (identified by the catch block) is designed to trap and process.
Name                Identifies the name that is to be assigned to the exception handler.

The following is a simple example of a try block and its associated catch block (exception handler). In this example, an attempt to load the DB2 UDB JDBC application driver is made; if the driver is not found, the catch block displays an error message and prints a stack trace. (If the driver is loaded successfully, the catch block is never executed.)

// Register The DB2 UDB JDBC Application Driver
try
{
    Class.forName("COM.ibm.db2.jdbc.app.DB2Driver");
}
catch (ClassNotFoundException E)
{
    System.out.println("Error registering driver.");
    E.printStackTrace();
}

Similar code usually appears at the beginning of any JDBC application or applet that is designed to interact with a DB2 UDB database.

Creating Executable JDBC Applications and Applets


So far, we have looked at the steps that are used to code JDBC applications and applets, but we have not looked at how the source code for a JDBC application or applet is converted into a working program. In most cases, source code files written in a high-level programming language must be either compiled and linked to create an executable application (for example, C and C++) or "interpreted" by some kind of interpreter (for example, Smalltalk) at application run time. The Java programming language is unusual in that source code files written with it must be both compiled and interpreted (by the Java Virtual Machine). The Java compiler is responsible for converting Java source code files into an intermediate language known as Java bytecode (which is platform-independent code that is read by an interpreter); the interpreter, otherwise known as the Java Virtual Machine, parses and converts Java bytecode to machine instructions that are native to the operating system upon which it has been installed. Compilation happens just once, but interpretation occurs each time the program is executed. Figure 6-3 illustrates this compilation/interpretation process.

Figure 6-3: Converting a JDBC source code file into an executable application or applet. It is important to note that when source code files are written in other high-level programming languages and compiled to produce an executable application, the compiled code is customized to run on a specific hardware platform. As a result, the application produced will be optimized for that particular environment but will have to be recompiled, and in some cases modified, before it can be used on a different platform. As a platform-independent environment, the Java platform can cause applications to be a bit slower than similar applications that are platform specific. However, smart compilers, well-tuned interpreters, and just-in-time bytecode compilers can bring a Java application's performance close to that of a similar platform-specific application's performance without threatening portability.

An Introduction to SQLJ
Earlier, we saw that dynamic SQL provides database applications with flexibility that cannot be obtained when static SQL is used. However, this flexibility does not come without cost: because the work of analyzing SQL statements to select the optimum data access plan to use for statement execution is done at application run time, dynamic SQL statements can take longer to execute than their equivalent static SQL counterparts. Unfortunately, although the option of using static SQL or dynamic SQL has been available to other high-level programming language developers for quite some time, Java application developers were initially forced to work exclusively with dynamic SQL. This restriction meant that in order to develop a Java application that interacted with a database, a developer had to master the JDBC API because embedding SQL statements directly in a Java source code file was not an option. To get around this limitation, The SQLJ Group, a consortium composed of IBM, Informix, Oracle, Sybase, Tandem, and Sun Microsystems, developed a specification known as SQLJ (SQL-Java). SQLJ provides a way for Java developers to embed static SQL statements in a Java application. Eventually, The SQLJ Group submitted their SQLJ specification to the INCITS Technical Committee H2 on Database (which is the committee that is responsible for developing standards for the syntax and semantics of database languages), and in 1999, SQLJ was formally adopted into the SQL:1999 standard.

The SQLJ specification consists of three parts:
Part 0: Embedded SQL in Java. This part specifies the SQLJ language syntax and semantics used to embed SQL statements in a Java application. Part 0 supports the use of static SQL statements in Java; however, it does not support the use of dynamic SQL statements; dynamic SQL is still handled by JDBC. Part 0 does support the mixing of embedded static SQL statements with JDBC APIs, and it supports the same mapping between Java data types and SQL data types that is defined by JDBC.
Part 1: SQL routines and Java. This part specifies extensions that allow SQL code to invoke methods or stored procedures that are written in Java. To support Part 1, a DBMS must have a Java Virtual Machine associated with it. And, in order to provide interaction between SQL functions and Java methods, each SQL parameter used, along with any return values generated, must be able to be mapped to respective Java method parameters and return values. Part 1 deals only with static SQL Java methods.
Part 2: SQL data types and Java. This part defines SQL extensions for using Java classes as data types in SQL. Part 2 allows the mapping of SQL:1999 User-Defined Data Types (UDTs) to Java classes. It also allows the importing of a Java package into a database by defining tables that contain columns whose data type is specified to be a Java class. Part 2 adds non-static method support to the static method support provided in Part 1.

It is important to note that SQLJ is not a replacement for JDBC. In fact, SQLJ run-time classes execute their queries using JDBC APIs. Later, when we look at the steps needed to compile and run an SQLJ application, we'll see how it is possible for SQLJ to provide static SQL support even though it functions as a layer on top of JDBC.

SQLJ Language Elements


Aside from providing the capability to embed static SQL statements in a Java application, what differentiates SQLJ from JDBC? First, where JDBC consists primarily of interface or class objects and methods that provide an API that is used to process SQL, SQLJ relies on special statements rather than objects for SQL statement processing. Two types of SQLJ statements are available: declaration statements, which are used to create objects that will be used by the application, and executable statements, which are used to embed static SQL statements in an application. Declaration statements can appear throughout an application and have no real distinction that sets them apart from other Java statements. Executable statements, on the other hand, appear in a Java source code file in what are known as SQLJ clauses. SQLJ clauses are the mechanisms by which SQL statements embedded in a Java application are forwarded to a data source for processing, and they are distinguished by their format: every SQLJ clause begins with the token "#sql" and ends with a semicolon (;). The simplest form of an SQLJ clause consists of the "#sql" token followed by an executable SQL statement enclosed in curly braces ({ }). For example, the following SQLJ clause would be used to delete all rows from a table named EMPLOYEE:

#sql { DELETE FROM EMPLOYEE };

Another major difference between JDBC and SQLJ is how data source connections are managed. SQLJ uses special objects known as contexts to manage both connections and SQL operations that are performed against a connected data source. Three types of context objects exist: connection, default, and execution.
Connection. A connection context object is used to manage all SQL operations performed against a specific data source. A connection context maintains a JDBC Connection instance that dynamic SQL operations can be performed against. It also contains a default execution context object by which SQL operation execution semantics may be queried and modified.
Default. The default context object provides a default implementation of a connection context. This context is used to process executable SQL statements when no connection context instance variable is provided.
Execution. The execution context object is used to provide the context in which executable SQL operations are performed. An execution context object contains a wide variety of methods that are used for execution control, execution status, and execution cancellation. Execution control methods are used to modify the semantics of subsequent SQL operations that will be executed on the context. Execution status methods are used to describe the results of the last SQL operation executed on the context. Execution cancellation methods are used to terminate the current SQL operation being executed on the context.

Looking back at our example of a simple SQLJ clause, you will notice that no context has been specified. Such a clause would rely on the default context for connection information. An SQLJ clause that is associated with a specific connection context consists of the "#sql" token, followed by the ID of the desired connection context enclosed in brackets ([ ]), followed by an executable SQL statement enclosed in curly braces ({ }). Thus, an SQLJ clause designed to delete all rows from a table named EMPLOYEE, using a connection context named ConnectionCtx, would look like this:

#sql [ConnectionCtx] { DELETE FROM EMPLOYEE };

Although JDBC applications rely on ResultSet objects to obtain and process the results of a query, SQLJ applications rely on objects known as result set iterators (or iterators) for this functionality. Specifically, a result set iterator is a special interface that is used to iterate (move back and forth) over the contents of a result data set produced in response to a query. Two types of iterators exist: positioned and named.
Positioned iterator. A positioned iterator is a special result set iterator that employs a by-position binding strategy. Positioned iterators depend on the position of the columns of the data to which they are bound, as opposed to the names of the columns to which they are bound.
Named iterator. A named iterator is a special result set iterator that employs a by-name binding strategy. Named iterators depend on the names of the columns of the data to which they are bound, as opposed to the position of the columns to which they are bound.
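For illustration, a minimal sketch of how a connection context might be declared and used follows; the context name, URL, and credentials are placeholders, and the sketch assumes the generated connection context class provides a constructor that accepts a URL, user ID, password, and auto-commit flag:

// Declare A Connection Context Class (An SQLJ Declaration Statement)
#sql context ConnectionCtx;

...

// Instantiate The Context; The Constructor Signature Shown
// (URL, User ID, Password, Auto-Commit Flag) Is Assumed
ConnectionCtx MyCtx = new ConnectionCtx("jdbc:db2:SAMPLE",
    "db2admin", "ibmdb2", false);

// Execute An SQL Statement Against That Specific Context
#sql [MyCtx] { DELETE FROM EMPLOYEE WHERE WORKDEPT = 'D11' };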

A Word about Host Expressions


In Chapter 4, "Embedded SQL Programming," we saw that because the DB2 Database Manager cannot work directly with high-level programming language variables, special host variables must be used to move data between an embedded SQL application and a database. In SQLJ applications, host expressions are used instead; a host expression can be a simple Java identifier or a complex expression. Host expressions are used in an SQLJ application the same way host variables are used in an embedded SQL application, with one exception: host expressions do not have to be declared in a special declare section. Like host variables used in embedded SQL applications, host expressions used in SQLJ applications are prefaced with a colon (:) when they are referenced in an SQL operation. The following example illustrates the proper use of a host expression in an SQLJ application:

long LongValue;
...
#sql { INSERT INTO TABLE1 VALUES (:LongValue) };

SQLJ Application and Applet Basics


Now that we have identified some of the major differences between JDBC and SQLJ, let's take a look at how SQLJ applications and applets are constructed. Most SQLJ applications and applets are constructed such that they perform the following set of tasks, in the order shown:
1. Register a JDBC driver (if the DataSource interface is not used).
2. Establish a data source connection.
3. Execute one or more SQL statements.
4. Retrieve and process any results produced.
5. Terminate the current transaction.
6. Clean up and release all resources used.

JDBC Driver Registration


Earlier, we saw that in order to use the JDBC API with a particular relational database management system, you must have a JDBC technology-based driver that can mediate calls between JDBC and the database system being used. Because SQLJ functions as a layer that runs on top of JDBC, the steps used to register a JDBC driver for an SQLJ application or applet are the same as those used to register a driver for a JDBC application or applet. Thus, a DB2 UDB JDBC driver can be registered for an SQLJ application by making any of the following Class.forName() method calls:

Class.forName("COM.ibm.db2.jdbc.app.DB2Driver")
Class.forName("COM.ibm.db2.jdbc.net.DB2Driver")
Class.forName("com.ibm.db2.jcc.DB2Driver")

Connecting to a Data Source


Like embedded SQL, CLI/ODBC, and JDBC applications, SQLJ applications must establish a connection to a valid data source (database server) before they can perform any SQL operation. However, where embedded SQL, CLI/ODBC, and JDBC application developers have only one technique at their disposal for establishing such a connection, SQLJ application developers have several options available. Most SQLJ applications use one of the following techniques:
Explicitly create a connection context class and provide connection information (in the form of a URL) to the class constructor when an object of the class is instantiated. When this approach is used, additional information (such as the authorization ID and password that are to be used to establish the connection and whether auto-commit mode is to be used) must be provided, along with a connection URL, as input to the connection context class constructor.
Invoke the getConnection() method of the JDBC DriverManager interface and explicitly create an instance of a connection context using the resulting Connection object. When this approach is used, authorization ID and password information can be provided as parameter values for the getConnection() method, and the setAutoCommit() method must be used to control whether auto-commit or manual-commit mode is used.

SQLJ applications also have the option of bypassing both of these techniques and using the default connection context object provided. (To create a default connection context, SQLJ does a Java Naming and Directory Interface (JNDI) lookup for jdbc/defaultDataSource. If nothing is registered, a null context exception is issued when an attempt to access the context is made.)
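A minimal sketch of the second technique might look like the following (the driver, URL, and credentials are placeholders, and it is assumed that the generated connection context class accepts an existing Connection object as a constructor argument):

// A Connection Context Class Is Assumed To Have Been Declared With:
// #sql context ConnectionCtx;

// Register A Driver And Create A Connection Object The Usual Way
Class.forName("COM.ibm.db2.jdbc.app.DB2Driver");
java.sql.Connection ConObject = java.sql.DriverManager.getConnection(
    "jdbc:db2:SAMPLE", "db2admin", "ibmdb2");

// Control The Commit Mode Through The Connection Object
ConObject.setAutoCommit(false);

// Wrap The Connection Object In A Connection Context Instance
ConnectionCtx MyCtx = new ConnectionCtx(ConObject);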

Executing SQL Statements


As with embedded SQL applications, the bulk of any SQLJ application or applet is comprised of one or more executable SQL statements that are to perform specific operations against a connected data source. Like embedded SQL applications, these SQL statements can be executed individually or they can be grouped together to produce one or more transactions. Any SQL statement that is supported by the JDBC driver may be embedded in an SQLJ application and as we saw earlier, all executable SQL statements used must be coded within an SQLJ clause. (Remember, an SQLJ clause consists of the "#sql" token, followed by the ID of a valid connection context enclosed in brackets ([]), followed by an executable SQL statement enclosed in curly braces ({ }).)

Retrieving and Processing Results


Like embedded SQL applications, SQLJ applications that wish to retrieve a single row of data from a DB2 UDB database can do so by executing a SELECT INTO SQL statement that contains a WHERE clause that defines a result data set containing only that row. In an SQLJ application, such a statement might look something like this:

#sql [ConnectionCtx] { SELECT SALARY INTO :Salary
    FROM EMPLOYEE WHERE EMPNO = '000100' };

However, most queries, when executed, produce result data sets that contain more than one row. In embedded SQL and CLI/ODBC applications, you use a mechanism called a cursor to retrieve individual records from the result data set produced; in SQLJ applications, you use an object known as a result set iterator. Like a cursor, a result set iterator can be non-scrollable, which means that when you use it to fetch rows, it moves serially through the records, from the beginning of the result data set to the end. Alternatively, cursors and result set iterators can be scrollable, which means that when you use either one to fetch rows, you can move forward, backward, or to any individual row in the result data set. Unlike a cursor, a result set iterator can be passed as an input parameter to an SQLJ object method.

The basic steps for using a result set iterator are:
1. Declare the iterator, which will result in the creation of an iterator class.
2. Define an instance of the iterator class.
3. Assign the result data set produced by a query to the iterator instance created.
4. Retrieve each row in the result data set, one by one, until no more rows are available. (This is done by repeatedly executing the next() method of the iterator object used.)
5. Close the iterator and delete the result data set produced. (This is done by executing the close() method of the iterator object used.)

Earlier, it was mentioned that two types of result set iterators are available: positioned and named. Positioned iterators identify the columns of a result data set by their position; the columns of the iterator correspond to the columns of the result data set produced, in left-to-right order (the first data type declaration in the iterator corresponds to the first column in the result data set, the second data type declaration corresponds to the second column in the result data set, and so on). The basic syntax used to define a positioned iterator is:

#sql iterator [IteratorName]([DataType], ...);

where:
IteratorName   Identifies the name to be assigned to the iterator class being defined.
DataType       Identifies one or more Java data types that correspond to the SQL data types used by the columns in a result data set.

Thus, a positioned iterator for a result data set that contains one column containing CHAR data values and another column containing DATE data values would be defined by executing an SQLJ definition statement that looks something like this: #sql iterator PositionItr(String, Date); Named iterators, however, identify the columns of a result data set by their column names; when you declare a named iterator (and consequently create a named iterator class), you must specify names for each iterator column, and every name provided must match the name of a column in the result data set that an instance of the named iterator class will be associated with. (An iterator column name and a result table column name that differ only in case are considered to be matching names.)

The basic syntax used to define a named iterator is:

#sql iterator [IteratorName]( [[DataType] [ColumnName]], ... );

where:

IteratorName    Identifies the name to be assigned to the iterator class being defined.
DataType        Identifies one or more Java data types that correspond to the SQL data types used by the columns in a result data set.
ColumnName      Identifies one or more column names that correspond to the names assigned to the columns in a result data set.

Thus, a named iterator for a result data set that contains one column named LASTNAME that contains CHAR data values and another column named HIREDATE that contains DATE data values would be defined by executing an SQLJ definition statement that looks something like this:

#sql iterator NamedItr(String Lastname, Date Hiredate);

When a named iterator class is created as the result of a named iterator class declaration, individual accessor methods are created as well. One accessor method is created for each column of the iterator, and each accessor method created is assigned the same name as the corresponding iterator column. Once created, these accessor methods are used to retrieve data from columns of the result data set that the iterator is ultimately associated with. Going back to our previous example, when the following SQLJ definition statement:

#sql iterator NamedItr(String Lastname, Date Hiredate);

is executed, not only is a named iterator called NamedItr created, but two accessor methods, named Lastname() and Hiredate(), are created as well. The Lastname() accessor method is used to retrieve values from the first column of the corresponding result data set, while the Hiredate() accessor method is used to retrieve values from the second column of the corresponding result data set.

Note: A result set iterator can be declared in a variety of ways. However, because a Java class underlies each iterator created, you need to ensure that when an iterator is declared, the resulting class obeys Java rules. For example, iterators that contain a WITH clause must be declared as "public." If an iterator needs to be public, it must be declared where a public class declaration is allowed.

Terminating the Current Transaction


As with JDBC applications, transactions in SQLJ applications are normally initiated the first time an executable SQL statement is executed after a connection to a data source has been established or immediately after a pre-existing transaction has been terminated. Once initiated, transactions can be implicitly terminated, using automatic commit, or explicitly terminated by executing either the COMMIT or the ROLLBACK SQL statement. Regardless of which statement is used, it must be coded within an SQLJ clause. Thus, an SQLJ clause intended to terminate the current transaction by committing all work done (using a connection context named ConnectCtx) might look like this:

#sql [ConnectCtx] { COMMIT };

On the other hand, an SQLJ clause intended to terminate the current transaction by backing out all changes made (again, using the connection context named ConnectCtx) would look like this:

#sql [ConnectCtx] { ROLLBACK };

Cleaning up Resources
Earlier, we saw that a well-behaved application will clean up any resources that were allocated or acquired on its behalf when those resources are no longer needed. Usually, no resource allocation is involved with processing SQLJ clauses; however, that is not the case when connection contexts and result set iterators are used. Context and iterator objects require resources, and these resources should be released when they are no longer needed. Like JDBC applications, SQLJ applications can count on the Java Virtual Machine garbage collector to close any unused objects that it happens to find lying around. However, as we saw earlier, when that approach is used, there is no way to tell when the garbage collector will spring into action, and an application can run out of resources long before garbage collection takes place. Therefore, whenever a context or iterator object is no longer needed, it should be closed so that any resources reserved on its behalf are freed immediately. To clean up and release the resources used by any context or iterator object, simply call that object's close() method.

Putting It All Together


Now that we have covered the SQLJ application and applet development basics, let's see how an SQLJ application that incorporates each concept covered might be coded. A simple SQLJ application that obtains and prints the names and salaries of every employee who works in department D11 might look something like this:

// Import The JDBC Driver Manager And SQLJ Packages/Classes
import sqlj.runtime.*;
import java.sql.*;

// Define A Connection Context Class
#sql context Ctx;

// Define A Named Iterator Class
#sql iterator NamedIterator(String Lastname, String Firstnme, double Salary);

// Define The TestApplication Class
public class TestApplication
{
    static
    {
        // Register The DB2 UDB JDBC Applet Driver
        try
        {
            Class.forName("COM.ibm.db2.jdbc.net.DB2Driver");
        }
        catch (ClassNotFoundException E)
        {
            System.out.println("Error registering driver.");
            E.printStackTrace();
        }
    }

    /*-----------------------------------------------------*/
    /* The Main Function                                   */
    /*-----------------------------------------------------*/
    public static void main(String argv[])
    {
        try
        {
            // Declare The Local Memory Variables
            String ConURL;
            String UserID;
            String Password;
            String Server;
            String Port;
            String EmpName;

            // Initialize The Local Memory Variables
            UserID = "db2admin";
            Password = "ibmdb2";
            Server = "DB_SERVER";
            Port = "6789";

            // Create The Connection URL
            ConURL = "jdbc:db2://" + Server + ":" + Port;
            ConURL += "/SAMPLE";

            // Create A Connection Context Object And
            // Connect To The SAMPLE Database Using It
            // (Use Manual-Commit Mode)
            Ctx ConnectCtx = new Ctx(ConURL, UserID, Password, false);

            // Create A Named Iterator Object
            NamedIterator NamedItr;

            // Execute A Query And Assign The Result Data Set
            // Produced To The Named Iterator Object Created
            #sql [ConnectCtx] NamedItr = { SELECT LASTNAME, FIRSTNME, SALARY
                                           FROM EMPLOYEE
                                           WHERE WORKDEPT = 'D11' };

            // Print A Header
            System.out.println("\nEmployee Information:\n");
            System.out.println("Name\t\t\tSalary");
            System.out.print("-------------------\t");
            System.out.println("----------");

            // As Long As There Are Records Available,
            // Move The Named Iterator Through The Result
            // Data Set Produced And Display Them
            while (NamedItr.next())
            {
                EmpName = NamedItr.Firstnme() + " " + NamedItr.Lastname() + "\t";
                System.out.print(EmpName);
                if (EmpName.length() < 17)
                    System.out.print("\t");
                System.out.println("$ " + NamedItr.Salary() + "0");
            }

            // Close The Named Iterator Object
            NamedItr.close();

            // Issue A Commit To Free All Locks
            #sql [ConnectCtx] { COMMIT };

            // Close The Connection Context Object
            // And Terminate The Database Connection
            ConnectCtx.close();
        }
        catch (SQLException E)
        {
            System.out.println("SQL Error encountered.");
            System.out.println("Error msg: " + E + ". SQLSTATE = " +
                               E.getSQLState() + " Error code = " +
                               E.getErrorCode());
            E.printStackTrace();
        }
        catch (Exception E)
        {
            System.out.println("Error encountered.");
            E.printStackTrace();
        }
    }
}

Creating Executable SQLJ Applications and Applets


Earlier, we saw that source code files written in Java must be both compiled and interpreted before they can function as an executable application or applet. The Java compiler is responsible for converting Java source code files into an intermediate language known as Java bytecode (platform-independent code that is read by an interpreter); the interpreter, otherwise known as the Java Virtual Machine, parses and converts Java bytecode to machine instructions that are native to the operating system on which it has been installed. Compilation happens just once, but interpretation occurs each time the program is executed.

Like other high-level programming language compilers, the Java compiler cannot interpret SQL statements directly. Instead, Java applications containing SQLJ statements must be preprocessed by an SQL translator before they can be compiled. (To aid in application development, SQLJ is composed of three individual components: a translator, a customizer, and a run-time environment.) When invoked, the SQL translator reads and analyzes an SQLJ source code file, converts all SQL statements encountered into calls to SQLJ run-time libraries, and produces two new files as output. The first of these files is a pure Java source code file that, by default, is automatically compiled by the Java compiler to produce Java bytecode that can be interpreted by the Java Virtual Machine. The second file is an SQLJ serialized profile (.ser file) that contains a binary representation of each SQL statement converted by the SQLJ translator. Static SQL statement packages are created in the database when this file is processed by the SQLJ customizer.

Once an SQLJ file has been translated and compiled to produce Java bytecode, it can be used to perform operations against a database. If no intermediate steps are performed, all SQL statements coded in the application will be processed as dynamic SQL statements. If, instead, the corresponding SQLJ serialized profile file is "bound" to the database the application was designed to interact with, each SQL statement used will be executed statically. Such binding is performed using a tool known as the DB2 SQLJ Profile Customizer, which is distributed with the SQLJ translator and is invoked by executing either the db2profc or the db2sqljcustomize command. When invoked, the DB2 SQLJ Profile Customizer creates a DB2 customization for the serialized profile file specified, optionally online-checks all SQL statements encountered that can be dynamically prepared, and, by default, creates DB2 packages for the application and stores them in the appropriate database's system catalog. Four packages are created: one for each isolation level available. Figure 6-4 illustrates the SQLJ source code file-to-executable application or applet conversion process when static SQL is to be used.

Figure 6-4: Converting an SQLJ source code file into an executable application or applet.

Note: Because the DB2 SQLJ Profile Customizer augments an SQLJ profile with DB2-specific information that is to be used at application run time, it should be invoked sometime after an SQLJ application has been translated but before the application is run.

Practice Questions
Question 1
For which of the following is the CallableStatement object used?

A. To execute a query
B. To execute a stored procedure
C. To perform an update operation
D. To communicate with the system catalog

Question 2
Which of the following capabilities is provided by SQLJ but not JDBC?

A. Ability to use the DataSource interface to establish a connection
B. Ability to execute stored procedures
C. Ability to issue dynamic queries against a DB2 database
D. Ability to issue static queries against a DB2 database

Question 3
Given the following table definition:

CREATE TABLE employees (empid INTEGER, name VARCHAR(20))

Assuming the following statements execute successfully in a JDBC application:

String SQLString = "INSERT INTO employees VALUES (?, ?)";
PreparedStatement PStmt = Con.prepareStatement(SQLString);

Which of the following is the correct way to provide values for the parameter markers used?

A. PStmt.execute(20, "JAGGER");
B. SQLString.execute(20, "JAGGER");
C. PStmt.setInt(1, 20); PStmt.setString(2, "JAGGER");
D. SQLString.setInt(1, 20); SQLString.setString(2, "JAGGER");

Question 4
Which of the following objects is used by an SQLJ application to scroll through the results of a query?

A. An iterator
B. A context
C. A ResultSet object
D. A Statement object

Question 5
Assuming the following code executes without throwing any exceptions:

Class.forName("COM.ibm.db2.jdbc.app.DB2Driver");
ConURL = "jdbc:db2:SAMPLE";
UserID = "db2admin";
Password = "ibmdb2";
Connection Con = DriverManager.getConnection(ConURL, UserID, Password);

Which of the following can be used to determine if any warnings were generated?

A. SQLWarning Warn = Con.getWarnings(); if (Warn != null) System.out.println("Warning!");
B. SQLWarning Warn = Con.getMessage(); if (Warn != null) System.out.println("Warning!");
C. SQLWarning Warn = DriverManager.getWarnings(); if (Warn != null) System.out.println("Warning!");
D. SQLException Warn = Con.getExceptions(); if (Warn != null) System.out.println("Warning!");

Question 6
Which two of the following commands are used to customize an SQLJ profile?

A. sqlj
B. db2profc
C. db2prepare
D. sqljcustomize
E. db2sqljcustomize

Question 7
Which of the following is NOT a DataSource object property that must be set in order to establish a DB2 JDBC connection to a remote database?

A. serverName
B. portNumber
C. networkProtocol
D. databaseName

Question 8
Which of the following JDBC objects can be used to get information about the data types that have been assigned to each column returned when the following query is executed?

SELECT * FROM DEPARTMENTS

A. Statement
B. ResultSet
C. DatabaseMetaData
D. ResultSetMetaData

Question 9
A JDBC application requires connection pooling and the ability to support distributed transactions. Which of the following DB2 implementations of the DataSource interface should be used by this application?

A. DB2DataSource
B. DB2XADataSource
C. DB2ConnectionPoolDataSource
D. DB2SimpleDataSource

Question 10
Which of the following JDBC object methods can be used to obtain DB2-specific error values if the execution of an update operation fails?

A. Statement.getErrorCode()
B. PreparedStatement.getSQLCA()
C. ResultSet.getError()
D. SQLException.getErrorCode()

Question 11
Which of the following lines of code could only be found in an SQLJ applet?

A. System.out.println("Unable to connect.");
B. Class.forName("COM.ibm.db2.jdbc.net.DB2Driver");
C. #sql [ConCtx] NamedItr = { SELECT * FROM EMPLOYEE };
D. ConObject.setAutoCommit(false);

Question 12
Which of the following JDBC interfaces contains methods that can be called to determine whether or not cursors will remain open across transaction boundaries?

A. Statement
B. ResultSet
C. ResultSetMetaData
D. DatabaseMetaData

Question 13
Which of the following JDBC objects can be used to execute the following SQL statement?

SELECT DEPTNO, DEPTNAME FROM DEPARTMENTS

A. DriverManager
B. Statement
C. PreparedStmt
D. ResultSet

Question 14
When building an SQLJ application, which of the following commands must be executed in order to generate an SQL package?

A. db2sqljcustomize
B. db2profp
C. sqlj
D. sqljbind

Question 15
Which two of the following are valid sequences for defining the URL string that is to be used to connect to a data source using a DB2 JDBC driver?

A. jdbc:db2:[DataSource]
B. db2:jdbc:[DataSource]
C. jdbc:db2://[DataSource]/[UserID]:[Password]
D. jdbc:db2://[ServerID]:[Port]/[DataSource]
E. db2.jdbc://[ServerID]:[Port]/[DataSource]

Question 16
Which of the following JDBC interfaces contains the methods commit() and rollback()?

A. Statement
B. ResultSet
C. Connection
D. DriverManager

Question 17
What is the maximum number of global transactions that can be associated with a DB2XADataSource object at any one time?

A. 1
B. 255
C. Determined by the resources available
D. Determined by the MAX_TRANSACTIONS object attribute

Question 18
Which of the following JDBC objects can be used to execute SQL statements that contain parameter markers?

A. Statement
B. CallableStatement
C. PreparedStmt
D. CallableStmt

Answers

Question 1
The correct answer is B. The CallableStatement object is used to execute SQL stored procedures. (This object is derived from the Statement object.)

Question 2
The correct answer is D. JDBC applications and applets are forced to work exclusively with dynamic SQL. SQLJ, however, provides a way for Java developers to embed static SQL statements in Java applications and applets.

Question 3
The correct answer is C. Parameter markers are represented by the question mark (?) character and indicate the position in an SQL statement where a value is to be substituted when the statement is actually executed. This substitution is carried out by calling the appropriate setXXX() method of the PreparedStatement or CallableStatement object the SQL statement is associated with. In this scenario, the setInt() and setString() methods of the PStmt object must be used to provide values for the parameter markers used in the SQL statement "INSERT INTO employees VALUES (?, ?)".

Question 4
The correct answer is A. SQLJ applications rely on objects known as result set iterators (or iterators) to obtain and process the results of a query. A result set iterator is a special interface object that is used to iterate (move back and forth) over the contents of a result data set. Two types of iterators exist: positioned iterators and named iterators.

Question 5
The correct answer is A. The getWarnings() method returns the first warning reported for a Connection object. The code segment shown in Answer A illustrates the proper way to check for warnings after an attempt to establish a data source connection has been made.

Question 6
The correct answers are B and E. The DB2 SQLJ Profile Customizer, which is invoked by executing either the db2profc or the db2sqljcustomize command, is used to create a DB2 customization for the serialized profile file specified (which is generated by the SQLJ translator), optionally online-check all SQL statements encountered that can be dynamically prepared, and, by default, create DB2 packages for the application and store them in the appropriate database's system catalog.

Question 7
The correct answer is C. When the DB2 JDBC DataSource interface is used, the following information is provided as object property values: serverName, portNumber, databaseName, user, password, and description.

Question 8
The correct answer is D. A ResultSetMetaData object is used to obtain information about the properties of a result data set that has been produced in response to a query. Some of the more common ResultSetMetaData interface methods used include the following:

getColumnCount(). Returns the number of columns found in the corresponding ResultSet object.
getColumnName(int). Obtains the name that has been assigned to the designated column.
getColumnType(int). Obtains the SQL data type that has been assigned to the designated column.
isNullable(int). Specifies whether the designated column contains, accepts, or contains and accepts null values.

Question 9
The correct answer is B. The DB2 Universal JDBC driver provides four implementations of the DataSource interface:

DB2SimpleDataSource. This implementation provides the same functionality as the DataSource interface but does not support connection pooling.
DB2DataSource. This implementation provides the same functionality as the DataSource interface and provides connection pooling support. (With this implementation, connection pooling is handled internally and is transparent to the application that uses it.)
DB2ConnectionPoolDataSource. This implementation provides the same functionality as the DB2DataSource interface; however, with this implementation, you must manage the connection pooling yourself, either by writing your own code or by using a tool such as WebSphere Application Server.
DB2XADataSource. This implementation provides the same functionality as the DataSource interface along with connection pooling, and it supports distributed transactions (also known as two-phase commits). With this implementation, you must manage the distributed transactions and connection pooling yourself, either by writing your own code or by using a tool such as WebSphere Application Server.

In this case, because both connection pooling and distributed transaction support are needed, the DB2XADataSource implementation must be used.

Question 10
The correct answer is D. An SQLException object is used to obtain diagnostic information about what caused a data source-related error or warning condition to occur; the getErrorCode() method of the SQLException class is used to retrieve vendor-specific error code (in our case, DB2-specific error code) information generated for an SQLException object.

Question 11
The correct answer is C. A typical SQLJ application or applet is comprised of special statements that are used like any other normal Java statement. Two types of SQLJ statements are available: declaration statements, which are used to create objects that will be used by the application, and executable statements, which are used to embed static SQL statements in an application. Executable statements appear in a Java source code file in SQLJ clauses, which are distinguished by their format: every SQLJ clause begins with the token "#sql" and ends with a semicolon (;).

Question 12
The correct answer is D. The DatabaseMetaData object is used to obtain information about the data source to which a JDBC application has established a connection. Two methods for this object, supportsOpenCursorsAcrossCommit() and supportsOpenCursorsAcrossRollback(), are used to determine whether open cursors will persist across transaction boundaries (i.e., after commit and rollback operations are performed).

Question 13
The correct answer is B. A Statement object can be used both to execute an SQL statement and to obtain any results produced by the statement's execution. (The Statement interface has two important subinterfaces: PreparedStatement, which represents an SQL statement that has been prepared for execution and that may or may not contain parameter markers, and CallableStatement, which represents an SQL statement that is to be used to invoke a stored procedure.)

Question 14
The correct answer is A. The DB2 SQLJ Profile Customizer, which is invoked by executing either the db2profc or the db2sqljcustomize command, is used to create DB2 packages for SQLJ applications or applets and store them in the appropriate database's system catalog.

Question 15
The correct answers are A and D. A valid URL for the DB2 UDB "app" JDBC driver is jdbc:db2:[DataSource], and a valid URL for the DB2 UDB "applet" JDBC driver is jdbc:db2://[ServerID]:[Port]/[DataSource]. (Typically, user ID and password information is supplied with a valid URL when the getConnection() method of the DriverManager class is invoked.)

Question 16
The correct answer is C. The commit() and rollback() methods are supplied by the Connection interface to provide support for transaction management. (The commit() method is used to commit the current transaction associated with a Connection object, while the rollback() method is used to roll back the current transaction associated with a Connection object.)

Question 17
The correct answer is C. If you are using an XA-compliant transaction manager such as IBM WebSphere, BEA Tuxedo, or Microsoft Transaction Server, the number of transactions that can be running at one time is limited only by the resources available.

Question 18
The correct answer is B. To increase reusability, SQL statements associated with a PreparedStatement or CallableStatement object can include parameter markers in place of constants, expressions, or both. (Parameter markers are represented by the question mark (?) character and indicate the position in an SQL statement where a value is to be substituted when the statement is actually executed.)

Chapter 7: Advanced Programming


Overview
Sixteen percent (16%) of the DB2 UDB V8.1 Family Application Development exam (Exam 703) is designed to test your knowledge of advanced programming concepts such as parameter marker utilization, compound SQL, transaction processing, isolation levels, and locking. The questions that make up this portion of the exam are intended to evaluate the following:

Your ability to identify the difference between static SQL and dynamic SQL, as well as to identify when each should be used.
Your ability to identify the difference between atomic compound SQL and not-atomic compound SQL, as well as to identify when each should be used.
Your ability to use parameter markers in embedded SQL, CLI/ODBC, and JDBC applications.
Your ability to manage transactions across multiple databases.
Your ability to identify how two-phase commit processing works.
Your ability to identify the appropriate isolation level to use for a given situation and to identify how isolation levels affect concurrency.
Your ability to identify how objects can be locked.

This chapter is designed to introduce you to advanced programming concepts such as parameter marker utilization, compound SQL, transaction processing, two-phase commit processing, isolation levels, and locking. This chapter is also designed to review some basic embedded SQL programming concepts and introduce you to the types of cursors that can be used with CLI/ODBC applications.

Terms you will learn:

Structured Query Language (SQL)
Embedded SQL
EXEC SQL
END-EXEC
Static SQL
Dynamic SQL
Host variables
Declare section
BEGIN DECLARE SECTION
END DECLARE SECTION
Indicator variables
Compound SQL
Atomic
Not Atomic
Parameter markers
Typed parameter markers
Untyped parameter markers
SQLBindParameter()
SQL_NTS
SQL_NULL_DATA
SQL_DATA_AT_EXEC
SQL_ATTR_PARAM_BIND_OFFSET_PTR
Descriptors
Application Parameter Descriptor (APD)
Application Row Descriptor (ARD)
Implementation Parameter Descriptor (IPD)
Implementation Row Descriptor (IRD)
SQLGetDescRec()
SQLGetDescField()
SQLSetDescRec()
SQLSetDescField()
SQLMoreResults()
Block cursor
SQL_ATTR_ROWSET_SIZE
Column-wise binding
Row-wise binding
Scrollable cursors
Static cursor
Dynamic cursor
Keyset-driven cursor
Mixed cursor
"Type 1" connections
Remote unit of work
"Type 2" connections
Application-directed distributed unit of work
Connection states
RELEASE
Two-phase commit
Distributed unit of work
Transaction coordinator
Resource Manager
Transaction Manager database
Indoubt transaction
Database consistency
Inconsistent
Interleaved Transactions
Serializable Transactions
Concurrency
Lost Updates
Dirty Reads
Non-repeatable Reads
Phantoms
Isolation Levels
Repeatable Read
Read Stability
Cursor Stability
Uncommitted Read
SQL_ATTR_TXN_ISOLATION
setTransactionIsolation()
Locks
LOCK TABLE

Techniques you will master:

Understanding the difference between static SQL and dynamic SQL, as well as knowing how to identify when each is used in a source code file.
Knowing how to create atomic and not-atomic compound SQL statements, as well as knowing the difference between the two.
Knowing how parameter markers are used in embedded SQL, CLI/ODBC, and JDBC applications.
Understanding the difference between Type 1 and Type 2 connections, as well as knowing how to manage transactions across multiple databases when Type 2 connections are used.
Knowing how two-phase commit processing works.
Understanding how activities performed by transactions are isolated from one another in a multi-user environment.
Understanding how DB2 UDB provides concurrency control with isolation levels and locks.
Recognizing the types of isolation levels available and understanding when each is to be used.
Understanding how locks are acquired.

Embedded SQL Revisited


Earlier, we saw that Structured Query Language (SQL) is a standardized language used to work with database objects and the data they contain. SQL is comprised of several different statements that are used to define, alter, and destroy database objects, as well as to add, update, delete, and retrieve data values. However, because SQL is nonprocedural by design, it is not a general-purpose programming language. (SQL statements are executed by DB2, not by the operating system.) Thus, database applications are normally developed by combining the decision and sequence control of a high-level programming language with the data storage, manipulation, and retrieval capabilities of SQL.

Several methods are available for sending SQL statements from an application to DB2 for processing, but the simplest technique is to use a method known as embedded SQL. As the name implies, embedded SQL applications are constructed by embedding SQL statements directly into one or more source code files that will be used to create a database application. Embedded SQL statements can be static or dynamic.

One of the drawbacks to developing applications using embedded SQL is that high-level programming language compilers do not recognize, and therefore cannot interpret, any SQL statements encountered. Because of this, source code files containing embedded SQL statements must be preprocessed (using a process known as precompiling) before they can be compiled (and linked) to produce a database application. To facilitate this preprocessing, every SQL statement coded in a high-level programming language source code file must be prefixed with the keywords "EXEC SQL" and terminated with either a semicolon (in C/C++ or FORTRAN) or the keyword "END-EXEC" (in COBOL). When the preprocessor (a special tool known as the SQL precompiler) encounters these keywords in a source code file, it replaces all text that follows (until a semicolon or the keyword "END-EXEC" is found) with a DB2 UDB-specific function call that forwards the SQL statement specified to the DB2 Database Manager for processing.

Likewise, the DB2 Database Manager cannot work directly with high-level programming language variables. Instead, it must use special variables known as host variables to move data between an application and a database. Host variables look like any other high-level programming language variable, so to set them apart, they must be defined within a special section known as a declare section. And in order for the SQL precompiler to distinguish host variables from other text used in an SQL statement, all references to host variables must be preceded by a colon (:).

Static SQL
A static SQL statement can be hard-coded in an application program at development time because information about its structure and the objects it references (i.e., tables, columns, and data types) is known in advance. Because the details of a static SQL statement are known at development time, the work of analyzing the statement and selecting the optimum data access plan to use to execute the statement is performed as part of the development process (during precompiling or binding, if deferred binding is used). As a result, static SQL statements execute quickly because their operational form is stored in the database (as a package) and does not have to be generated at application run time. The downside to this is that all static SQL statements must be prepared (their access plans must be generated and stored in the database) before they can be executed, and the SQL statements themselves cannot be altered at application run time. In addition, because static SQL applications require prior knowledge of database objects, changes made to these objects after the application is developed can produce undesirable results. The following are examples of embedded static SQL statements:

EXEC SQL INSERT INTO TABLE1 (COL1) VALUES (:Value1);
EXEC SQL SELECT COL1 INTO :Value1 FROM TABLE1;
EXEC SQL DECLARE CUR1 CURSOR FOR SELECT * FROM TABLE1;

Generally, static SQL statements are well suited for high-performance applications that execute predefined operations against a known set of database objects.

Dynamic SQL
Although static SQL statements are relatively easy to incorporate into an application, their use is somewhat limited because their format must be known in advance. Dynamic SQL statements, on the other hand, are much more flexible because they can be constructed at application run time; information about a dynamic SQL statement's structure and the objects it intends to interact with does not have to be known at development time. Furthermore, because dynamic SQL statements do not have a precoded, fixed format, the data object(s) they reference can change each time the statement is executed.

Even though dynamic SQL statements are generally more flexible than static SQL statements, they are usually more complicated to incorporate into an application. And because the work of analyzing the statement to select the best data access plan to use is performed at application run time, dynamic SQL statements can take longer to execute than their equivalent static SQL counterparts. (Because dynamic SQL statements can take advantage of the database statistics available at application run time, there can be some cases in which a dynamic SQL statement will execute faster than an equivalent static SQL statement, but those are the exception and not the norm.)

When static SQL statements are embedded in an application, they are executed as they are encountered. However, when dynamic SQL statements are used, there are two ways in which they can be processed:

Prepare and Execute. This approach separates the preparation of the SQL statement from its execution and is typically used when an SQL statement is to be executed repeatedly. This method is also used when an application needs advance information about the columns that will exist in the result data set produced when a SELECT SQL statement is executed. The SQL statements PREPARE and EXECUTE are used to process dynamic SQL statements in this manner.

Execute Immediately. This approach combines the preparation and execution of an SQL statement into a single step and is typically used when an SQL statement is to be executed only once. This method is also used when the application does not need additional information about the result data set that will be produced, if any, when the SQL statement is executed. The SQL statement EXECUTE IMMEDIATE is used to process dynamic SQL statements in this manner.

Dynamic SQL statements that are prepared and executed (using either method) at run time are not allowed to contain references to host variables. They can, however, contain parameter markers in place of constants and/or expressions. Parameter markers are represented by the question mark (?) character and indicate the position in the SQL statement where the current value of one or more host variables or elements of an SQLDA data structure variable are to be substituted when the statement is actually executed. (Parameter markers are typically used where a host variable would be referenced if the SQL statement being executed were static.) The following are examples of embedded dynamic SQL statements:

EXEC SQL DESCRIBE Stmt INTO :MySQLDA;
EXEC SQL PREPARE Stmt FROM :SQLStmt;
EXEC SQL EXECUTE Stmt USING :Value1, :Value2, :Value3;

Generally, dynamic SQL statements are well suited for applications that interact with a rapidly changing database or that allow users to define and execute ad-hoc queries. Many commercial off-the-shelf application providers (such as ERP/CRM/SCM vendors) use dynamic SQL because of its flexibility.
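To make the "execute immediately" approach concrete, it might be coded in a C source file like this (a minimal sketch; the statement text and host variable name are hypothetical):

EXEC SQL BEGIN DECLARE SECTION;
    char SQLStmt[100];
EXEC SQL END DECLARE SECTION;
...
// Build The Statement Text At Run Time, Then Prepare And
// Execute It In A Single Step
strcpy(SQLStmt, "DELETE FROM TABLE1 WHERE COL1 = 100");
EXEC SQL EXECUTE IMMEDIATE :SQLStmt;

Because no separate PREPARE is performed, the access plan is generated and then discarded in the same step, which is why this technique only pays off for statements that are executed once.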

Host Variables
Earlier, we saw that the DB2 Database Manager relies on host variables to move data between an application and a database. We also saw that, to distinguish host variables from other high-level programming language variables, host variables must be defined in a special section known as a declare section. The beginning of a declare section, you may recall, is defined by the BEGIN DECLARE SECTION SQL statement, while the end is defined by the END DECLARE SECTION statement. Thus, a typical declare section in a C/C++ source code file might look something like this:

EXEC SQL BEGIN DECLARE SECTION;
    char EmployeeID[7];
    double Salary;
EXEC SQL END DECLARE SECTION;

A declare section can be coded anywhere that high-level programming language variable declarations can be coded in a source code file. And although a source code file typically contains only one declare section, multiple declare sections are allowed.

Host variables that are used to transfer data to a database are known as input host variables; host variables that receive data from a database are known as output host variables. Regardless of whether a host variable is used for input or output, its attributes must be appropriate for the context in which it is used. Thus, you must define host variables in such a way that their data types and lengths are compatible with the data types and lengths of the columns they are intended to work with. When deciding on the appropriate data type to assign to a host variable, you should obtain information about the column or special register the variable will be associated with and refer to the conversion charts found in the IBM DB2 Universal Database Application Development Guide: Programming Client Applications documentation.

Note: A special tool known as the Declaration Generator can be used to generate host variable declarations for the columns of a given table in a database; this tool creates embedded SQL declaration source code files, which can easily be inserted into C/C++, Java, COBOL, and FORTRAN applications. For more information about this utility, refer to the db2dclgen command in the DB2 UDB Command Reference product documentation. Because the Declaration Generator automatically generates declaration sections based on actual database statistics contained in the system catalog, its use eliminates typographical errors and data type mismatch errors, so it should be used whenever possible.

It is important to keep in mind that each host variable used in a source code file must be assigned a unique name; duplicate names are not allowed in the same source code file, even when the host variables are defined in different declare sections.

So once a host variable has been created, how is it used to move data between an application and a database? The easiest way to answer this question is to examine a simple embedded SQL source code fragment where host variables are used. The following pseudo-source code, written in the C programming language, shows one example of how host variables can be defined and used:

...
// Define The SQL Host Variables Needed
EXEC SQL BEGIN DECLARE SECTION;
    char EmployeeNo[7];
    char LastName[16];
EXEC SQL END DECLARE SECTION;
...
// Declare A Static Cursor
EXEC SQL DECLARE C1 CURSOR FOR
    SELECT EMPNO, LASTNAME
    FROM EMPLOYEE;

// Open The Cursor
EXEC SQL OPEN C1;

// If The Cursor Was Opened Successfully, Retrieve The
// Results Until The End Of The Result Data Set Is Reached
while (sqlca.sqlcode == SQL_RC_OK)
{
    // Retrieve The Current Record From The Cursor
    EXEC SQL FETCH C1 INTO :EmployeeNo, :LastName;

    // Do Something With The Results
    ...
}

// Close The Open Cursor
EXEC SQL CLOSE C1;
...

Indicator Variables
By default, columns in a DB2 UDB database table can contain null values, and because null values are not stored the same way conventional data is stored, special provisions must be made if an application intends to work with null data. Null values cannot be retrieved and copied to host variables in the same manner that other data values can. Instead, a special flag must be examined to determine whether a specific value is meant to be null. And to obtain the value of this flag, a special variable known as an indicator variable (or null indicator variable) must be associated with the host variable that has been assigned to a "nullable" column. Because indicator variables must be accessible by both the DB2 Database Manager and the application program, they must be defined inside a declare section, and they must be assigned a data type that is compatible with the DB2 UDB SMALLINT data type. Thus, the code used to define a null indicator variable in a C/C++ source code file would typically look something like this:

EXEC SQL BEGIN DECLARE SECTION;
    short SalaryNullIndicator;
EXEC SQL END DECLARE SECTION;

Once an indicator variable has been associated with a host variable (an indicator variable is associated with a specific host variable when it immediately follows that host variable in an SQL statement), it can be examined as soon as its corresponding host variable has been populated. If an indicator variable contains a negative value, it means that a null value was found, and the value of the corresponding host variable should be ignored. Otherwise, the value of the corresponding host variable is valid.

Indicator variables can also be used to send null values to a database when an insert or update operation is performed; when processing INSERT and UPDATE SQL statements, the DB2 Database Manager examines the value of any indicator variable provided first. If the indicator variable contains a negative value, the DB2 Database Manager assigns a null value to the appropriate column, provided null values are allowed. (If the indicator variable is set to zero or contains a positive number, or if no indicator variable is used, the DB2 Database Manager assigns the value stored in the corresponding host variable to the appropriate column instead.) Thus, the code used in a C/C++ source code file to assign a null value to a column in a table would look something like this:

ValueInd = -1;
EXEC SQL INSERT INTO TAB1 VALUES (:Value :ValueInd);

Again, to understand how indicator variables are used, it helps to look at an example embedded SQL source code fragment. The following pseudo-source code, written in the C programming language, shows one example of how indicator variables are both defined and used:

...
// Define The SQL Host Variables Needed
EXEC SQL BEGIN DECLARE SECTION;
    char EmployeeNo[7];
    double Salary;    // Salary - Used If SalaryNI Is Not Null
    short SalaryNI;   // Salary NULL Indicator - Used To Determine
                      // If Salary Value Should Be NULL
EXEC SQL END DECLARE SECTION;
...
// Declare A Static Cursor
EXEC SQL DECLARE C1 CURSOR FOR
    SELECT EMPNO, DOUBLE(SALARY)
    FROM EMPLOYEE;

// Open The Cursor
EXEC SQL OPEN C1;

// If The Cursor Was Opened Successfully, Retrieve And
// Display All Records Available
while (sqlca.sqlcode == SQL_RC_OK)
{
    // Retrieve The Current Record From The Cursor
    EXEC SQL FETCH C1 INTO :EmployeeNo, :Salary :SalaryNI;

    // If The Salary Value For The Record Is NULL, ...
    if (SalaryNI < 0)
    {
        printf("No salary information is available for ");
        printf("employee %s\n", EmployeeNo);
    }
}

// Close The Open Cursor
EXEC SQL CLOSE C1;
...

Compound SQL
Earlier, it was mentioned that the DB2 Database Manager, rather than the operating system, is responsible for executing SQL statements. To reduce DB2 Database Manager overhead, several individual SQL statements can be grouped together into a single executable block of code known as a compound SQL statement (or simply compound SQL). In addition to reducing DB2 Database Manager overhead, compound SQL reduces the number of requests that have to be transmitted across the network for remote clients. Compound SQL statements can be processed in one of two ways:

Atomic. When a compound SQL statement is processed in this manner, the application executing the statement receives a response from the DB2 Database Manager when all substatements within the compound statement have completed successfully or when one of the substatements ends in an error. (If one substatement ends in an error, the entire group of statements is considered to have ended in error, and any changes made to the database by other substatements within the compound statement are backed out with a rollback operation.)

Not Atomic. When a compound SQL statement is processed in this manner, the application executing the statement receives a response from the DB2 Database Manager when all substatements within the block have completed; all substatements within the compound statement are executed, even if one or more substatements end in error. (If one or more substatements end in error, changes made to the database by those substatements are rolled back; changes made by substatements within the compound statement that executed successfully can only be backed out by explicitly rolling back the transaction, via a rollback operation, that the compound SQL statement was executed from.)

The beginning of a compound SQL statement block is defined by the BEGIN COMPOUND SQL statement, and the end is defined by the keywords "END COMPOUND." Thus, a simple atomic compound SQL statement in a C/C++ source code file might look something like this:

EXEC SQL BEGIN COMPOUND ATOMIC STATIC
    UPDATE ACCOUNTS SET ABALANCE = ABALANCE + :delta
        WHERE AID = :aid;
    UPDATE TELLERS SET TBALANCE = TBALANCE + :delta
        WHERE TID = :tid;
    INSERT INTO TELLERS (TID, BID, TBALANCE)
        VALUES (:i, :branch_id, 0);
    COMMIT;
END COMPOUND;

As you can see from this example, each substatement coded in a compound SQL statement block does not have to begin with the keywords "EXEC SQL"; however, each substatement does have to be terminated with a semicolon. That's because embedded compound SQL statement blocks are treated as a single SQL statement by the SQL precompiler. Compound statements are typically used for short operations that require little control flow logic but significant data flow. For larger constructs with nested complex control flow, stored procedures are a better choice.
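For comparison, the not-atomic form of a similar block differs only in the keywords used. A minimal sketch (reusing the host variables from the atomic example above) might look like this:

EXEC SQL BEGIN COMPOUND NOT ATOMIC STATIC
    UPDATE ACCOUNTS SET ABALANCE = ABALANCE + :delta
        WHERE AID = :aid;
    UPDATE TELLERS SET TBALANCE = TBALANCE + :delta
        WHERE TID = :tid;
    COMMIT;
END COMPOUND;

Here, if the first UPDATE fails, the second UPDATE and the COMMIT are still attempted; only the changes made by the failing substatement itself are rolled back.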

Dynamic SQL and Parameter Markers


Earlier, we saw that dynamic SQL statements are not allowed to contain references to host variables but can instead contain parameter markers in place of constants and/or expressions. Parameter markers are represented by the question mark (?) character and indicate the position in the SQL statement where the current value of one or more host variables or elements of an SQLDA data structure variable are to be substituted when the statement is executed. (Parameter markers are typically used where a host variable would be referenced if the SQL statement being executed were static.) Two types of parameter markers are available: typed and untyped.

Typed parameter markers. A typed parameter marker is a parameter marker that is specified with its target data type. Typed parameter markers have the general form:

CAST(? AS DataType)

This notation does not imply that a function is called, but rather it "promises" that the data type of the value replacing the parameter marker at application run time will be either the data type specified or a data type that can be converted to the data type specified. For example, in the SQL statement:

UPDATE EMPLOYEE SET LASTNAME = CAST(? AS VARCHAR(12)) WHERE EMPNO = '000050'

the value for the LASTNAME column is provided at application run time, and the data type of that value will be either VARCHAR(12) or a data type that can be converted to VARCHAR(12).

Untyped parameter markers. An untyped parameter marker is a parameter marker that is specified without a target data type and has the form of a single question mark (?). The data type of an untyped parameter marker is determined by the context in which it is used. For example, in the SQL statement:

UPDATE EMPLOYEE SET LASTNAME = ? WHERE EMPNO = '000050'

the value for the LASTNAME column is provided at application run time, and the data type of that value will be compatible with the data type that has been assigned to the LASTNAME column of the EMPLOYEE table.

Keep in mind that parameter markers are only allowed in certain places in SQL statements. For example, parameter markers cannot be used in the list of columns to be returned by a SELECT SQL statement, nor can an untyped parameter marker be used as both operands of a relational operator such as the equal sign (=), because no data type could be inferred for either side. In general, the use of parameter markers in most Data Manipulation Language (DML) statements is legal, but the use of parameter markers in Data Definition Language (DDL) statements is not.
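In an embedded C application, the typed form shown above might be prepared and executed like this (a minimal sketch; the host variable names and the value supplied are hypothetical):

EXEC SQL BEGIN DECLARE SECTION;
    char SQLStmt[120];
    char LastName[13];
EXEC SQL END DECLARE SECTION;
...
// Define A Dynamic UPDATE SQL Statement That Uses A Typed
// Parameter Marker
strcpy(SQLStmt, "UPDATE EMPLOYEE SET LASTNAME = ");
strcat(SQLStmt, "CAST(? AS VARCHAR(12)) WHERE EMPNO = '000050'");

// Provide A Value For The Parameter Marker, Then Prepare
// And Execute The Statement
strcpy(LastName, "JEFFERSON");
EXEC SQL PREPARE Stmt FROM :SQLStmt;
EXEC SQL EXECUTE Stmt USING :LastName;
...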

Replacing Parameter Markers with Values


So just how are values substituted for parameter markers that have been coded in an SQL statement? The answer depends on which interface is being used (embedded SQL, CLI/ODBC, or JDBC). In the following sections, we will answer this question by examining the steps used to forward SQL statements containing parameter markers to the DB2 Database Manager for processing with each of the available interfaces.

Populating Parameter Markers Used in Embedded SQL Applications


When parameter markers are used in embedded SQL applications, values that are to be substituted for parameter markers placed in an SQL statement must be provided as additional parameters to the EXECUTE or EXECUTE IMMEDIATE SQL statement when either one is used to execute the statement specified. The following pseudo-source code example, written in the C programming language, illustrates how values would be provided for parameter markers that have been coded in a simple UPDATE SQL statement:

...
// Define The SQL Host Variables Needed
EXEC SQL BEGIN DECLARE SECTION;
    char SQLStmt[80];
    char JobType[10];
EXEC SQL END DECLARE SECTION;
...
// Define A Dynamic UPDATE SQL Statement That Uses A
// Parameter Marker
strcpy(SQLStmt, "UPDATE EMPLOYEE SET JOB = ? ");
strcat(SQLStmt, "WHERE JOB = 'DESIGNER'");

// Populate The Host Variable That Will Be Used In
// Place Of The Parameter Marker
strcpy(JobType, "MANAGER");

// Prepare The SQL Statement
EXEC SQL PREPARE SQL_STMT FROM :SQLStmt;

// Execute The SQL Statement
EXEC SQL EXECUTE SQL_STMT USING :JobType;
...

This approach is the same regardless of the type of SQL statement used. Thus, an embedded SQL application designed to invoke a stored procedure using parameter markers might look something like this:

...
// Define The SQL Host Variables Needed
EXEC SQL BEGIN DECLARE SECTION;
    char SQLStmt[80];
    char EmpNo[7];
    short EmpNoNI;
    int Rating;
    short RatingNI;
EXEC SQL END DECLARE SECTION;
...
// Define A Dynamic CALL SQL Statement That Uses
// Parameter Markers
strcpy(SQLStmt, "CALL StoredProc(?,?)");

// Populate The Host Variables That Will Be Used In
// Place Of The Parameter Markers
strcpy(EmpNo, "000100");
EmpNoNI = 0;
Rating = 1;
RatingNI = 0;

// Prepare The CALL SQL Statement
EXEC SQL PREPARE SQL_STMT FROM :SQLStmt;

// Execute The CALL SQL Statement
EXEC SQL EXECUTE SQL_STMT USING :EmpNo:EmpNoNI, :Rating:RatingNI;
...

Populating Parameter Markers Used in CLI/ODBC Applications


When parameter markers are used in CLI/ODBC applications, each parameter marker placed in an SQL statement must be associated with an application variable, and that variable must be populated before the statement can be executed. An application variable that has been associated with a specific parameter marker in an SQL statement is said to be "bound" to the parameter marker, and such binding is carried out by calling the SQLBindParameter() function. Each time this function is called, the following information must be provided:

The parameter marker number. Parameters are numbered in increasing order as they appear from left to right in the SQL statement, beginning with the number 1. Although it is legal to specify a parameter marker number higher than the number of parameter markers used in the SQL statement, additional parameter marker numbers are ignored when the SQL statement is executed.

The parameter type (input, output, or input/output). Except for parameter markers used in stored procedure calls, all parameters are treated as input parameters.

The C data type, memory address, and size (length) in bytes of the application variable being bound to the parameter marker. The CLI/ODBC driver used must be able to convert the data from the C data type specified to the SQL data type used by the data source, or an error will occur.

The SQL data type, precision, and scale of the parameter itself.

Optionally, the memory address of a length/indicator variable. This variable is used to provide the byte length of binary or character data, specify that the data is a NULL value, or specify that the data is long and will be sent in pieces using the SQLPutData() function.

Parameter markers can be bound to application variables in any order, and once bound, the association with that variable remains in effect until it is overridden or until the corresponding SQL statement handle is freed. To bind a parameter marker to a different variable, an application simply rebinds the parameter marker with the new variable; the previous binding is automatically released. However, if a parameter marker is rebound after an SQL statement has been executed, the new binding does not take effect until the SQL statement is re-executed.

To set the value of an application variable that has been (or will be) bound to a parameter marker, an application simply assigns a value to the variable. It is not important when values are assigned to such application variables, as long as the assignment is made sometime before the SQL statement to which they have been bound is executed. Thus, an application can assign a value to a variable before or after it is bound to a parameter marker, and the value assigned can be changed any number of times. Each time an SQL statement containing parameter markers is executed, the appropriate CLI/ODBC driver simply retrieves the current value of each variable bound to the statement and sends it, along with the SQL statement used, to the data source for processing.

Length/indicator variables can also be bound to parameter markers when application variables are bound; if a length/indicator variable is bound to a specific parameter, it must be set to one of the following values before the SQL statement is executed:

The actual length, in bytes, of the data value stored in the bound application variable. (The driver checks this length only if the application variable contains character or binary data.)
SQL_NTS. (A null-terminated string is stored in the bound application variable.)
SQL_NULL_DATA. (A null value is stored in the bound application variable. In this case, the driver ignores the value of the bound variable.)
SQL_DATA_AT_EXEC or the result of the SQL_LEN_DATA_AT_EXEC(Length) macro. (The value stored in the bound application variable is to be sent to the data source by the SQLPutData() function.)

Again, to understand how values are provided for parameter markers in a CLI/ODBC application, it helps to look at an example. The following pseudo-source code, written in the C programming language, shows how values are provided for parameter markers that have been coded in a SELECT SQL statement:

...
// Allocate An SQL Statement Handle
SQLAllocHandle(SQL_HANDLE_STMT, ConHandle, &StmtHandle);

// Define A SELECT SQL Statement That Uses A Parameter
// Marker
strcpy((char *) SQLStmt, "SELECT EMPNO, LASTNAME FROM ");
strcat((char *) SQLStmt, "EMPLOYEE WHERE JOB = ?");

// Prepare The SQL Statement
RetCode = SQLPrepare(StmtHandle, SQLStmt, SQL_NTS);

// Bind The Parameter Marker Used In The SQL Statement To
// An Application Variable
RetCode = SQLBindParameter(StmtHandle, 1, SQL_PARAM_INPUT,
              SQL_C_CHAR, SQL_CHAR, sizeof(JobType), 0,
              JobType, sizeof(JobType), NULL);

// Populate The "Bound" Application Variable
strcpy((char *) JobType, "DESIGNER");

// Execute The SQL Statement
RetCode = SQLExecute(StmtHandle);
...

Changing parameter bindings and values with offsets. CLI/ODBC applications have the advantage of being able to specify an offset value that is to be added to a bound parameter's address, and to any corresponding length/indicator buffer address, when the SQLExecute() or SQLExecDirect() function is used to execute an SQL statement containing parameter markers. (Offsets are specified by calling the SQLSetStmtAttr() function and passing it the SQL_ATTR_PARAM_BIND_OFFSET_PTR value and a pointer to a location in memory that contains an offset value.) This feature allows an application to change parameter bindings and values without having to rebind previously bound application variables. When offsets are used, the original bindings represent a template of how the application buffers are laid out; the application can move this template to different areas of memory by simply changing the offset. New offsets can be specified at any time; each time an offset is specified, it is added to the originally bound buffer address. Thus, offset additions are not cumulative; instead, each offset specified cancels out the last offset used. It goes without saying that the sum of the original bound address and any offset provided must always represent a valid address (the offset and/or the address to which the offset is added can be invalid, provided the sum of the two constitutes a valid address).
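A minimal sketch of this offset technique, reusing the statement handle and prepared statement from the example above (the two-element JobTypes buffer is hypothetical), might look like this:

SQLUINTEGER BindOffset = 0;
char JobTypes[2][10];

// Tell The Driver Where The Current Offset Value Is Stored
SQLSetStmtAttr(StmtHandle, SQL_ATTR_PARAM_BIND_OFFSET_PTR,
    (SQLPOINTER) &BindOffset, 0);

// Bind The Parameter Marker Once, Using The First Buffer
// As The Template
SQLBindParameter(StmtHandle, 1, SQL_PARAM_INPUT, SQL_C_CHAR,
    SQL_CHAR, sizeof(JobTypes[0]), 0, JobTypes[0],
    sizeof(JobTypes[0]), NULL);

// Execute The Statement Using The First Buffer (Offset 0)
strcpy(JobTypes[0], "DESIGNER");
SQLExecute(StmtHandle);

// Change The Offset And Re-Execute; The Driver Now Reads The
// Parameter Value From JobTypes[1] Without Any Rebinding
BindOffset = sizeof(JobTypes[0]);
strcpy(JobTypes[1], "MANAGER");
SQLExecute(StmtHandle);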

Populating Parameter Markers Used in JDBC Applications


As with embedded SQL and CLI/ODBC applications, when SQL statements containing parameter markers are used in a JDBC application, values must be provided for each parameter marker placed in an SQL statement before that statement can be executed. This replacement is performed by calling the appropriate setXXX() method of the PreparedStatement or CallableStatement object that the SQL statement is associated with. (Refer to Table 6-3 in Chapter 6, "Java Programming," for more information about the setXXX() methods available.) Every setXXX() method provided requires two arguments as input: the first argument identifies the parameter marker in the corresponding SQL statement that a value is being provided for, while the second argument is the value itself.

Once again, to better understand how values are provided for parameter markers used in a JDBC application, it helps to look at an example. The following Java pseudo-source code illustrates how values are provided for a parameter marker that has been coded in a SELECT SQL statement:

...
// Declare The Local Memory Variables
String            SQLStmt;
ResultSet         Results;
PreparedStatement PStmtObject;
...

// Define A SELECT SQL Statement That Uses A Parameter
// Marker
SQLStmt = "SELECT * FROM EMPLOYEE WHERE WORKDEPT = ?";

// Prepare The SQL Statement
PStmtObject = ConHandle.prepareStatement(SQLStmt);

// Assign A Value To The Parameter Marker Used
PStmtObject.setString(1, "D11");

// Execute The SQL Statement
Results = PStmtObject.executeQuery();
...

Using CLI/ODBC Descriptors


In Chapter 4, "Embedded SQL Programming," we saw that applications developed using embedded SQL have access to a special data structure known as the SQL Descriptor Area (SQLDA) structure. This structure is typically used with PREPARE, DESCRIBE, EXECUTE, OPEN, and FETCH SQL statements to pass detailed information (usually result data set column attributes) between an application and a database. With CLI/ODBC applications, descriptors are comparable to the SQLDA structure used with embedded SQL. When an SQL statement handle is allocated, CLI/ODBC implicitly allocates storage for and creates the following four descriptors, and assigns them to the statement handle being allocated:

1 Application Parameter Descriptor (APD)
1 Application Row Descriptor (ARD)
1 Implementation Parameter Descriptor (IPD)
1 Implementation Row Descriptor (IRD)

Each descriptor is used to describe one of the following:

Zero or more dynamic parameters (represented by parameter markers) used in an SQL statement: The APD contains the input parameter values as set by the application (dynamic input parameters) or the parameter values returned by a stored procedure (dynamic output parameters). The IPD contains the same information as the APD after any specified data conversion is performed (dynamic input parameters), or the parameter values that are returned by a stored procedure before any specified data conversion is performed (dynamic output parameters).

A single row of data in a result data set: The IRD contains the current row from the data source. (These buffers conceptually contain data as written to, or read from, the data source. However, the stored form of data source data is not specified. Therefore, the data in an IRD could have been converted from its original form.) The ARD contains the current row of data as presented to the application, after any specified data conversion has been applied.

Although these four descriptors are typically used in the manner described, each can serve in a different capacity. For example, a row descriptor in one statement can serve as a parameter descriptor in another. For both parameter and row descriptors, if the application specifies different data types in corresponding records of the implementation and application descriptor used, the CLI/ODBC driver automatically performs any data conversion necessary when it interacts with the descriptor.

Obtaining and Setting Descriptor Information


Each descriptor contains one header record and zero or more parameter or column records, depending upon whether the descriptor is a parameter descriptor or a row descriptor. The descriptor header record contains general information about the descriptor itself, and each parameter or column record that follows contains information that describes a single parameter or column variable. (Each time a new parameter marker or column is bound to an application variable, a new parameter or column record is added to the appropriate descriptor. Likewise, each time a parameter marker or column is unbound, the corresponding parameter or column record is removed from the descriptor.) Changing a field value in the descriptor header record affects all parameters or columns associated with the descriptor; changing a field value in a parameter or column record affects only the parameter or column associated with that record.

CLI/ODBC functions that work with parameter and column data (for example, SQLBindParameter(), SQLBindCol(), and SQLFetch()) implicitly set and retrieve descriptor field information during their execution. For instance, when the SQLBindCol() function is used to bind column data to an application variable, it sets one or more descriptor fields to describe the complete binding assignment. Because CLI/ODBC functions implicitly use descriptors as needed, applications typically do not concern themselves with how descriptors are managed. In fact, no database operations require an application to gain direct access to a descriptor. However, for some applications, gaining access to one or more descriptors can help streamline many operations.

Applications can retrieve information from a descriptor record by invoking the SQLGetDescRec() or SQLGetDescField() function; the SQLGetDescRec() function is used to retrieve the contents of several parameter or column record fields (which identify the data type and storage of a specific parameter or column) with a single function call. However, this function cannot be used to obtain information from a descriptor header record. Applications wishing to retrieve information from a descriptor header record must invoke the SQLGetDescField() function instead. (Because many statement handle attributes correspond to descriptor header fields, the SQLGetStmtAttr() function can also be used to examine descriptor header information.) By calling the SQLSetDescRec() function, applications can modify the descriptor record fields that affect the data type (C and SQL) and storage of parameters and/or columns associated with explicitly allocated descriptors. Specific fields of any explicitly allocated descriptor record (including the header record) can be changed or set by calling the SQLSetDescField() function. Again, because many statement attributes correspond to descriptor header fields, the SQLSetStmtAttr() function can often be called in place of the SQLSetDescField() function to change descriptor header record information.

The SQLSetDescField() function can also be used to define the initial (default) values that are used to populate record fields when an application row descriptor is first allocated. To provide a standard method for presenting database data to an application, the initial value of an explicitly allocated descriptor's SQL_DESC_TYPE field is always SQL_DEFAULT. An application may change this at any time by setting one or more fields of the descriptor record. The concept of a default value is not valid for IRD fields. In fact, the only time an application can gain access to IRD fields is when a prepared or executed SQL statement is associated with it.
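To illustrate, the following pseudo-source code sketches how an application might examine the implementation row descriptor (IRD) of a prepared query to discover the name and SQL data type of its first column. The attribute and field identifiers shown (SQL_ATTR_IMP_ROW_DESC, SQL_DESC_NAME, SQL_DESC_TYPE) are defined by CLI/ODBC; the variable names are illustrative:

...
SQLHDESC    hIRD;
SQLCHAR     ColName[129];
SQLSMALLINT ColType;
SQLINTEGER  StrLen;

// Prepare The SQL Statement (The IRD Is Only Accessible
// Once A Statement Has Been Prepared Or Executed)
SQLPrepare(StmtHandle, SQLStmt, SQL_NTS);

// Obtain The Handle Of The Implementation Row Descriptor
// That Is Associated With The Statement Handle
SQLGetStmtAttr(StmtHandle, SQL_ATTR_IMP_ROW_DESC,
    (SQLPOINTER) &hIRD, SQL_IS_POINTER, NULL);

// Retrieve The Name And SQL Data Type Of The First Column
// From The Corresponding IRD Record
SQLGetDescField(hIRD, 1, SQL_DESC_NAME, ColName,
    sizeof(ColName), &StrLen);
SQLGetDescField(hIRD, 1, SQL_DESC_TYPE, &ColType,
    SQL_IS_SMALLINT, NULL);
...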

Working with Batches of SQL Statements in CLI/ODBC Applications


Although SQL statements that produce result data sets when executed are typically processed one at a time, two or more SQL statements can be handled as a single unit, otherwise referred to as a batch, when CLI/ODBC is used. Processing queries in batches is often more efficient than processing them independently because network traffic can often be reduced, and sometimes the execution of a batch of SQL statements can be optimized in ways that the execution of individual statements cannot. With CLI/ODBC, three types of batches are supported:

Explicit. An explicit batch contains two or more SQL statements separated by semicolons (;). (No semicolon follows the last statement in the list.)

Arrays. Arrays of parameter values can be used with a parameterized SQL statement as an effective way to perform batch operations. For example, an array of parameter values can be used with an INSERT SQL statement to insert multiple rows of data into a table by simply executing a single SQL statement.

Stored Procedures. If a stored procedure contains more than one SQL statement, it is considered to contain a batch of SQL statements.

In most batch implementations, the entire set of SQL statements is executed before any results are returned to the calling application. When two or more SELECT SQL statements are executed in a batch, multiple result data sets are produced. To process multiple result data sets, an application must call the SQLMoreResults() function. This function discards the contents of the current result data set and makes the next result data set available. For example, in a CLI/ODBC application, the following code could be used to execute two SELECT statements as a single batch:

SQLExecDirect(StmtHandle,
    "SELECT * FROM Parts WHERE Price <= 100.00; "
    "SELECT * FROM Parts WHERE Price > 100.00",
    SQL_NTS);

After multiple SELECT SQL statements are executed in this manner, the application can begin fetching rows from the result data set created by the first query statement because it automatically has access to it. When the application is finished fetching rows, it must call SQLMoreResults() to make the result data set generated by the second query statement available. (A sketch of this processing loop follows.) If necessary, SQLMoreResults() automatically discards any unfetched rows in the first result data set and closes the cursor. The application can then begin fetching rows from the second result data set. When an error occurs while a batch of SQL statements is being executed, one of the following four things can happen, depending upon the data source used and the statements included in the batch:

No remaining statements in the batch will be executed.

No remaining statements in the batch will be executed, and the current transaction will be rolled back.

All statements in the batch that were processed before the error was generated will not be affected; no other statements will be executed.

All statements in the batch except the statement that caused the error will be executed.

In the first two cases, the function used to execute the batch (SQLExecute() or SQLExecDirect()) will return the value SQL_ERROR. In the latter two cases, the function used to execute the batch may return the value SQL_SUCCESS_WITH_INFO or SQL_SUCCESS, depending upon the driver's implementation of batch processing. In all cases, diagnostic information can be retrieved by calling the SQLGetDiagRec() or SQLGetDiagField() function; however, it is unlikely that the information returned by either of these two functions will identify the statement that caused the error to occur.
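The following pseudo-source code sketches how the two result data sets produced by the batch above might be processed. Column bindings and error checking are omitted for brevity, and the structure shown assumes the driver supports explicit batches:

...
// Execute The Batch Of SELECT Statements (As Shown Above),
// Then Process Each Result Data Set Produced, In Turn
do
{
    // Fetch Every Row In The Current Result Data Set;
    // SQLFetch() Returns SQL_NO_DATA When The End Of The
    // Result Data Set Has Been Reached
    while (SQLFetch(StmtHandle) != SQL_NO_DATA)
    {
        // Process The Current Row (Using Bound Columns
        // Or The SQLGetData() Function)
    }
}
while (SQLMoreResults(StmtHandle) != SQL_NO_DATA);
...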

CLI/ODBC's Extended Cursors


Most relational database management systems provide a simple model for retrieving data from result data sets created in response to a query. With this model, rows of data are returned to an application, one at a time, in the order specified by the query until the end of the result data set is reached. And as you should know by now, the mechanism used to implement this simple model is known as a cursor (also called a forward-only cursor). Early in the development of ODBC (in fact, before the term ODBC was used), Rick Vicik of Microsoft Corporation took a collection of ideas and proposals for cursor management and pioneered the design of a more advanced cursor model for client/server architectures. This model, which became the foundation on which cursors in CLI, ODBC, and several other products are based, contains several extended cursors that are designed to overcome many of the limitations imposed by the simple forward-only cursor found in most database management systems. These extended cursors are defined in terms of two broad types of attributes (nonscrollable or block, and scrollable) and can contain components of either or both attributes.

Block Cursors
In the client/server environment, many applications spend a significant amount of time retrieving data from the database. Part of this time is spent bringing the data across the network, and part of it is spent on network overhead (for example, a call made by a driver to request a row of data). Often, the time spent on network overhead can be reduced by using block cursors (also referred to as "fat" cursors), which can return more than one row at a time. The rows returned when data is fetched using a block cursor are known as a rowset. (The actual number of rows returned in a rowset is controlled by the SQL_ATTR_ROWSET_SIZE statement attribute. Screen-based applications often set the rowset size to match the number of rows that will be displayed on the screen, while other applications tend to set the rowset size to match the largest number of rows that the application can reasonably handle.) Just as a traditional forward-only cursor points to the current row, a block cursor points to the current rowset; when a block cursor first returns a rowset, the current row is the first row of that rowset. If an application wants to perform operations that work with a single row, it must indicate which row in the rowset is to be treated as the current row. It is important not to confuse a rowset with a result data set. The result data set is maintained at the data source, but the rowset is maintained in application buffers. Also, although a result data set is fixed, the rowset is not; it changes position and contents each time a new set of rows is fetched. Figure 7-1 shows the relationship of a block cursor, a result data set, a rowset, and a current row in a rowset.

Figure 7-1: Components of a CLI/ODBC block cursor.

Because block cursors return multiple rows of data, applications that use them must bind an array of variables or LOB locators to each column in the result data set that the cursor is associated with. (Collectively, these arrays are sometimes referred to as rowset buffers.) An application binds columns to arrays in the same manner that it binds columns to other application variables/LOB locators: by calling the SQLBindCol() function. The only difference is that the addresses provided as SQLBindCol() function parameters reference arrays instead of individual variables/LOB locators. However, before binding columns to arrays, an application must decide on the binding style that it will use. The following binding styles are available:

Column-wise binding. One or more arrays are bound to each column in the result data set for which data is to be returned. This is called column-wise binding because each array (or set of arrays) is associated with a single column in the result data set.

Row-wise binding. A data structure that holds a single data value for each column in a row is defined, and each element of the first structure in an array of these structures is bound to each column in the result data set for which data is to be returned. This is called row-wise binding because each data structure contains the data for a single row in the result data set.

The decision of whether to use column-wise binding or row-wise binding is largely a matter of preference. Column-wise binding is the default binding style used; however, row-wise binding might correspond more closely to an application's data layout, in which case it could provide better performance. Applications can change from column-wise binding to row-wise binding by setting the SQL_ATTR_ROW_BIND_TYPE statement attribute. (A column-wise binding sketch is shown below.)
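The following pseudo-source code sketches column-wise binding for a block cursor that returns rowsets of 10 rows. It uses the ODBC 3.x style attributes SQL_ATTR_ROW_ARRAY_SIZE and SQL_ATTR_ROWS_FETCHED_PTR (the ODBC 2.x SQL_ATTR_ROWSET_SIZE attribute mentioned above works in a similar fashion with SQLExtendedFetch()); the array and variable names are illustrative:

...
#define ROWSET_SIZE 10

SQLINTEGER  EmpNo[ROWSET_SIZE];
SQLINTEGER  EmpNoInd[ROWSET_SIZE];
SQLUINTEGER NumRowsFetched;
SQLUINTEGER i;

// Tell The Driver How Many Rows To Return With Each Fetch,
// And Where To Store The Number Of Rows Actually Returned
SQLSetStmtAttr(StmtHandle, SQL_ATTR_ROW_ARRAY_SIZE,
    (SQLPOINTER) ROWSET_SIZE, 0);
SQLSetStmtAttr(StmtHandle, SQL_ATTR_ROWS_FETCHED_PTR,
    (SQLPOINTER) &NumRowsFetched, 0);

// Bind An Array (Rather Than A Single Variable) To The
// First Column In The Result Data Set (Column-Wise Binding)
SQLBindCol(StmtHandle, 1, SQL_C_LONG, EmpNo,
    sizeof(SQLINTEGER), EmpNoInd);

// Execute The Query, Then Retrieve One Rowset At A Time
SQLExecDirect(StmtHandle, SQLStmt, SQL_NTS);
while (SQLFetchScroll(StmtHandle, SQL_FETCH_NEXT, 0)
           != SQL_NO_DATA)
{
    for (i = 0; i < NumRowsFetched; i++)
    {
        // Process Row i Of The Current Rowset (EmpNo[i])
    }
}
...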

Scrollable Cursors
Interactive applications, especially those written for personal computers, often need to provide a way for a user to scroll through data in a result data set using the arrow keys, "Page Up" and "Page Down" keys, or scroll bar and mouse. For such applications, returning to a previously fetched row can be a problem. One solution to this problem is to close and reopen the cursor and then fetch rows until the cursor reaches the required row. Another possibility is to read the result data set once and store it locally to implement scrolling in the application. Both methods work well only with small result data sets, and the latter method can be quite difficult to implement. A better solution is to use a cursor that can move forward and backward in the result data set. A cursor that provides the ability to move forward and backward within a result data set is called a scrollable cursor. The ability to move backward in a result data set raises an important question: Should the cursor detect changes made to rows previously fetched? In other words, should it detect updated, deleted, and newly inserted rows? This question arises because the definition of a result data set (that is, the set of rows matching certain criteria) does not state when rows are checked to see whether they match that criteria, nor does it state whether rows must contain the same data each time they are fetched. The former omission makes it possible for scrollable cursors to detect whether rows have been inserted or deleted, and the latter makes it possible for them to detect updated data. To cover the different needs of different applications, CLI/ODBC defines the following types of scrollable cursors:

Static
Dynamic
Keyset-driven
Mixed

Each of these cursors varies, both in expense and in ability to detect changes made to the underlying result data set. Static cursors detect few or no changes but are relatively cheap to implement; dynamic cursors detect all changes made but can be quite expensive to implement. Keyset-driven and mixed cursors lie somewhere in between, detecting most changes made, at less expense than dynamic cursors.

Static Cursors
With a static cursor, the result data set appears to be static; static cursors typically do not detect changes made to the result data set by other applications after the cursor has been opened. However, static cursors may detect the effects of their own insert, update, and delete operations, although they are not required to do so. This type of cursor is most useful for read-only applications that do not need the most up-to-date data available or for applications for which multiple users never need to modify data concurrently.

Static cursors are commonly implemented by locking the rows in the result data set or by making a copy, or snapshot, of the result data set. Although locking rows is relatively easy, it significantly reduces transaction concurrency. Making a copy or snapshot of a result data set allows greater concurrency and provides the cursor with a way to keep track of its own inserts, updates, and deletes by modifying the copy; however, a copy is more expensive to make and can differ from the underlying data as other applications attempt to alter that data.

Dynamic Cursors
Unlike static cursors, dynamic cursors can detect changes made to the result data set by other applications after the cursor has been opened; dynamic cursors detect the effects of their own insert, update, and delete operations, as well as the effects of similar operations performed by other applications. (This is subject to the isolation level being used by the transaction that has opened the cursor; the isolation level used is controlled by the value of the SQL_ATTR_TXN_ISOLATION connection attribute.)

Keyset-driven Cursors
A keyset-driven cursor lies somewhere between a static cursor and a dynamic cursor in its ability to detect changes. Like static cursors, keyset-driven cursors do not always detect changes made to the set of rows in the result data set or to the order in which rows are returned. Like dynamic cursors, keyset-driven cursors are able to detect changes made to the rows that already reside in the result data set (depending on the transaction isolation level being used). The advantage of using this type of cursor is that it provides access to the most up-to-date data values while allowing an application to retrieve (fetch) rows based on absolute position within a result data set.

When a keyset-driven cursor is opened, it saves keys (unique row identifiers) for the entire result data set. A key can be a row ID (if available), a unique index value, a unique key, or the entire row. As the cursor scrolls through the result data set, it uses the keys in the keyset to retrieve the current data values for each row. Because of this, keyset-driven cursors always detect their own updates and deletes as well as updates and deletes made by other applications. For example, suppose a keyset-driven cursor retrieves a row of data from a result data set. Now suppose another application then updates or deletes that row. When a keyset-driven cursor attempts to refetch a row that has been deleted, this row appears as a "hole" in the result data set; the key for the row exists in the keyset, but the row no longer exists in the result data set. If the key for a row is updated, the update is treated as if the original row had been deleted and a new row inserted. As a result, these rows also appear as holes in the result data set. If the keyset-driven cursor retrieves that row again, it will see the changes made by the other application because it retrieved the row using its key. Figure 7-2 identifies the basic components of a keyset-driven cursor.

Figure 7-2: Components of a CLI/ODBC keyset-driven cursor.

A keyset-driven cursor can always detect rows deleted by others and can optionally remove the keys for rows that it deletes itself from the keyset, thereby hiding its own deletes. Rows inserted by other applications are never visible to keyset-driven cursors; that is because no keys for these rows exist in the keyset. However, a keyset-driven cursor can optionally add keys to the keyset for rows that it inserts itself. (Keyset-driven cursors that do this can detect their own inserts.) Keyset-driven cursors are commonly implemented by creating a temporary table that contains keys and row versioning information for each row in the result data set. To scroll through the original result data set, the keyset-driven cursor opens a static cursor over the temporary table; to retrieve a row in the original result data set, the keyset-driven cursor first retrieves the appropriate key from the temporary table and then retrieves the current values for the row. If block cursors are used, the cursor must retrieve multiple keys and rows.

Mixed Cursors
A mixed cursor is a combination of a keyset-driven cursor and a dynamic cursor. Mixed cursors are used when the result data set is too large to reasonably save keys for; they are implemented by creating a keyset that is smaller than the entire result data set but larger than the rowset used. As long as the application scrolls within the keyset, the behavior is the same as a keyset-driven cursor. When the application scrolls beyond the keyset, the behavior becomes dynamic: the cursor fetches the requested rows and creates a new keyset for them. After the new keyset is created, the behavior reverts to keyset-driven within the new keyset.

For example, suppose a result data set has 1,000 rows and uses a mixed cursor with a keyset size of 100 and a rowset size of 10. When the first rowset is fetched, the cursor creates a keyset consisting of keys for the first 100 rows. It then returns the first 10 rows, as requested. Now suppose another application deletes rows 11 and 101. If the cursor attempts to retrieve row 11, it will encounter a hole because it has a key for this row but no row exists; this is keyset-driven behavior. If the cursor attempts to retrieve row 101, the cursor will not detect that the row is missing because it does not have a key for the row. Instead, it will retrieve what was previously row 102. This is dynamic cursor behavior. A mixed cursor is equivalent to a keyset-driven cursor when the keyset size is equal to the result data set size. A mixed cursor is equivalent to a dynamic cursor when the keyset size is equal to 1.

Controlling Cursor Characteristics


A CLI/ODBC application can specify the cursor type to use by setting the SQL_ATTR_CURSOR_TYPE statement attribute before executing an SQL statement that will create a result data set; if no cursor type is explicitly specified, a forward-only cursor is used by default. For keyset-driven and mixed cursors, applications can also specify the keyset size to use by setting the SQL_ATTR_KEYSET_SIZE statement attribute. To get a mixed cursor, an application simply specifies a keyset-driven cursor and defines a keyset size that is smaller than the size of the result data set that will be produced; if the keyset size is set to 0 (which is the default), the keyset size is altered to match the size of the result data set created, and a keyset-driven cursor is used. The keyset size can be changed any time after the cursor has been opened. An application also has the option of specifying the characteristics of a cursor instead of specifying the cursor type (forward-only, static, keyset-driven, or dynamic). To specify the characteristics of a cursor, the application defines the cursor's scrollability (by setting the SQL_ATTR_CURSOR_SCROLLABLE statement attribute) and sensitivity (by setting the SQL_ATTR_CURSOR_SENSITIVITY statement attribute) before the cursor is opened. The CLI/ODBC driver used then chooses the cursor type that most efficiently provides the characteristics that the application requested. A brief sketch of both approaches follows.
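The following pseudo-source code illustrates both approaches; the attribute values shown (SQL_CURSOR_KEYSET_DRIVEN, SQL_SCROLLABLE, SQL_SENSITIVE) are standard CLI/ODBC constants, and error checking is omitted for brevity:

...
// Approach 1: Request A Specific Cursor Type -- Here, A
// Mixed Cursor (A Keyset-Driven Cursor Whose Keyset Size Is
// Smaller Than The Result Data Set Size)
SQLSetStmtAttr(StmtHandle, SQL_ATTR_CURSOR_TYPE,
    (SQLPOINTER) SQL_CURSOR_KEYSET_DRIVEN, 0);
SQLSetStmtAttr(StmtHandle, SQL_ATTR_KEYSET_SIZE,
    (SQLPOINTER) 100, 0);

// Approach 2: Request Cursor Characteristics Instead, And
// Let The Driver Choose The Most Efficient Cursor Type
SQLSetStmtAttr(StmtHandle, SQL_ATTR_CURSOR_SCROLLABLE,
    (SQLPOINTER) SQL_SCROLLABLE, 0);
SQLSetStmtAttr(StmtHandle, SQL_ATTR_CURSOR_SENSITIVITY,
    (SQLPOINTER) SQL_SENSITIVE, 0);

// The Cursor Itself Is Created And Opened When The Query
// Is Executed
SQLExecDirect(StmtHandle, SQLStmt, SQL_NTS);
...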

Managing Database Connections


Although the methods used to construct embedded SQL database applications differ from those used to develop CLI/ODBC applications, which in turn are different from the methods used to construct JDBC and SQLJ applications, all three types of applications have one thing in common: before an application using any of these interfaces can perform operations against a database, it must first establish a connection to that database. With embedded SQL applications, database connections are made (and in some cases, terminated) by executing the CONNECT SQL statement. With CLI/ODBC applications, database connections are established by calling the SQLConnect(), SQLDriverConnect(), and/or SQLBrowseConnect() function. And with JDBC and SQLJ applications, database connections are made by invoking the getConnection() method of the DriverManager class. During the connection process, the information needed to establish a connection (such as the authorization ID and corresponding password of an authorized user) is passed to the specified database for validation.

Embedded SQL applications have the option of using two different types of connection semantics. These two types, known simply as "Type 1" and "Type 2," support two types of transaction behavior: Type 1 connections support only one database connection per transaction (referred to as a remote unit of work), while Type 2 connections support any number of database connections per transaction (referred to as an application-directed distributed unit of work). Essentially, when Type 1 connections are used, an application can be connected to only one database at a time; once a connection to a database is established and a transaction is started, that transaction must be either committed or rolled back before another database connection can be established. On the other hand, when Type 2 connections are used, an application can be connected to several different databases at the same time, and each database connection will have its own transactions. The type of connection semantics an application will use is determined by the value assigned to an SQL precompiler option at the time the application is precompiled. (CLI/ODBC, JDBC, and SQLJ applications can connect to any number of data sources at any time; no special connection semantics are required.)

When Type 2 connections are used, each time the CONNECT statement is executed, any database connection that existed before the CONNECT statement was executed is placed in the "Dormant" state; the new database server name is added to the list of available servers; and the new connection is placed into both the "Current" state and the "Held" state. (Initially, all database connections are placed in the "Held" state, which means that the connection will not be terminated the next time a commit operation is performed.) When the RELEASE SQL statement is executed, a connection is removed from the "Held" state and placed in the "Release-Pending" state, which means that the connection will be terminated by the next successful commit operation (rollback operations have no effect on connections). Regardless of whether a connection is in the "Held" or "Release-Pending" state, it can also be in the "Current" or "Dormant" state. When a connection is in the "Current" state, SQL statements executed by the application can reference data objects that are managed by the corresponding database server. (You can find out which connection is in the "Current" state simply by examining the value of the CURRENT SERVER special register.) When a connection is in the "Dormant" state, however, it is no longer current, and no SQL statement is allowed to reference its data objects.
Either the SET CONNECTION SQL statement or the CONNECT statement can be used to change the state of a specific connection from the "Dormant" state to the "Current" state, which automatically places all other existing connections in the "Dormant" state. (Only one connection can be in the "Current" state at any given point in time.)
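As a point of contrast, the following pseudo-source code sketches how a CLI/ODBC application might hold two data source connections at once, simply by allocating one connection handle per data source; the data source names, authorization IDs, and passwords shown are illustrative:

...
SQLHDBC ConHandle1, ConHandle2;

// Allocate Two Connection Handles From The Same Environment
SQLAllocHandle(SQL_HANDLE_DBC, EnvHandle, &ConHandle1);
SQLAllocHandle(SQL_HANDLE_DBC, EnvHandle, &ConHandle2);

// Connect To Two Data Sources At The Same Time -- No
// Special Connection Semantics Are Required
SQLConnect(ConHandle1, (SQLCHAR *) "PAYROLL", SQL_NTS,
    (SQLCHAR *) "db2admin", SQL_NTS,
    (SQLCHAR *) "password", SQL_NTS);
SQLConnect(ConHandle2, (SQLCHAR *) "INVENTORY", SQL_NTS,
    (SQLCHAR *) "db2admin", SQL_NTS,
    (SQLCHAR *) "password", SQL_NTS);

// Each Connection Has Its Own Transactions; Statements Are
// Directed To A Data Source Through Statement Handles
// Allocated Against That Data Source's Connection Handle
...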

Coordinating Transactions across Multiple Database Connections


In Chapter 3, "Data Manipulation," we saw that a transaction (also known as a unit of work) is a sequence of one or more SQL operations grouped together as a single unit, usually within an application process. The initiation and termination of a single transaction defines points of data consistency within a database; the effects of all operations performed within a transaction are either applied to the database and made permanent (committed), or backed out (rolled back), in which case the database is returned to the state it was in before the transaction was initiated. Transactions are initiated the first time an executable SQL statement is executed after a connection to a database has been made or immediately after a pre-existing transaction has been terminated. Once initiated, transactions can be implicitly terminated using a feature known as "automatic commit" (in which case each executable SQL statement is treated as a single transaction, and any changes made by that statement are applied to the database if the statement executes successfully or discarded if the statement fails) or explicitly terminated by executing the COMMIT or the ROLLBACK SQL statement.

So what happens when an application establishes connections with several databases at the same time? In this case, each data source connection can constitute a separate transaction boundary. Figure 7-3 illustrates how multiple transaction boundaries can coexist when an application interacts with two separate data sources at the same time.

Figure 7-3: Transaction boundaries in an application that interacts simultaneously with multiple data sources. It is important to remember that commit and rollback operations only have an effect on changes that have been made within the transactions that they terminate. So in order to evaluate the effects of a series of transactions, you must be able to identify where each transaction begins, as well as when and how each transaction is terminated.
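In a CLI/ODBC application, transaction boundaries like those shown in Figure 7-3 are typically managed by turning automatic commit off and ending each transaction explicitly. A minimal sketch, using the standard SQL_ATTR_AUTOCOMMIT connection attribute and the SQLEndTran() function (the UPDATE statement shown is illustrative), might look like this:

...
// Turn Automatic Commit Off So That Transactions Must Be
// Terminated Explicitly
SQLSetConnectAttr(ConHandle, SQL_ATTR_AUTOCOMMIT,
    (SQLPOINTER) SQL_AUTOCOMMIT_OFF, SQL_IS_UINTEGER);

// Execute One Or More SQL Statements As A Single
// Transaction
RetCode = SQLExecDirect(StmtHandle,
    (SQLCHAR *) "UPDATE EMPLOYEE SET JOB = 'MANAGER' "
                "WHERE EMPNO = '000100'", SQL_NTS);

// Commit The Transaction If Everything Succeeded;
// Otherwise, Roll It Back
if (RetCode == SQL_SUCCESS)
    SQLEndTran(SQL_HANDLE_DBC, ConHandle, SQL_COMMIT);
else
    SQLEndTran(SQL_HANDLE_DBC, ConHandle, SQL_ROLLBACK);
...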

Two-phase Commit Processing


As Figure 7-3 illustrates, transactions typically do not cross connection boundaries. However, there may be times when it is desirable for an application to work with data that is distributed across two or more databases within a single transaction. In such situations, applications can utilize a process known as two-phase commit to allow a single transaction to span multiple database connections. (Such transactions are referred to as distributed units of work.) The DB2 Database Manager contains a component known as the transaction coordinator (also known as the Resource Manager (RM)) that is designed to coordinate read/write operations made to multiple databases within a single transaction. The transaction coordinator, in turn, uses a special database known as the Transaction Manager (TM) database to register each transaction and track the completion status of that transaction across all databases that the transaction is involved with. (The database to be used as the Transaction Manager database is determined by the value stored in the tm_database parameter of the DB2 Database Manager configuration file.) The Transaction Manager database can be any database that an application can connect to; however, for operational and administrative reasons, the database must reside on a robust machine that is up and running most of the time. Additionally, all connections to the Transaction Manager database should be made by the transaction coordinator; an application program should never attempt to connect directly to the Transaction Manager database.
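For example, the tm_database configuration parameter mentioned above can be set with the UPDATE DATABASE MANAGER CONFIGURATION command; the database name PAYROLL used here is illustrative:

-- Designate the database named PAYROLL as the Transaction
-- Manager (TM) database for this instance
db2 UPDATE DATABASE MANAGER CONFIGURATION USING TM_DATABASE payroll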

How do the transaction coordinator and the Transaction Manager database coordinate transaction processing across multiple database connections? To answer that question, we need to examine the steps that take place during a two-phase commit process. The following list identifies these steps:

1. When the application program starts a transaction, it automatically connects to the Transaction Manager database.

2. Just before the first SQL statement in the transaction is executed, the transaction coordinator sends a Transaction Register (XREG) request to the Transaction Manager database to register the new transaction.

3. The Transaction Manager database responds to the application program by providing a unique global transaction ID for the new transaction (because the XREG request was sent without a predefined ID).

4. After receiving the transaction ID, the application program registers the new transaction (using the transaction ID) with the database containing the required user data. A response is sent back to the application program when the transaction has been successfully registered.

5. SQL statements issued against the database containing the user data are handled in the normal manner, with the return code for each SQL statement processed being returned in an SQLCA data structure variable.

6. Steps 4 and 5 are repeated for each database accessed by the transaction. All other databases accessed in the transaction receive the global transaction ID just before the first SQL statement is executed against them. As we saw earlier, the SET CONNECTION SQL statement is used to switch between database connections.

7. When the application program requests that the current transaction be committed, the transaction coordinator sends a "PREPARE" message to all databases that have been accessed by the transaction. Each database that receives this message writes a "PREPARED" record to its log files and sends a response back to the transaction coordinator.

8. When the transaction coordinator receives a positive response from all databases that the "PREPARE" message was sent to, it sends a message to the Transaction Manager database to inform it that the transaction has been prepared and is now ready to be committed. This completes the first phase of the two-phase commit process.

9. The Transaction Manager database writes a "PREPARED" record to its log file and sends a message back to the transaction coordinator, informing it that the second phase of the commit process can be started. The transaction coordinator then forwards this message to the application program.

10. When the transaction coordinator receives the message to begin the second phase of the commit process, it sends a "COMMIT" message to all databases that the "PREPARE" message was sent to (telling them to commit all changes made by the transaction). Each database that receives this message writes a "COMMITTED" record to its log file and releases all locks that were held by the transaction. When each database has completed committing its changes, it sends a reply back to the transaction coordinator.

11. When the transaction coordinator receives a positive response from all databases that the "COMMIT" message was sent to, it sends a message to the Transaction Manager database to inform it that the transaction has been completed. The Transaction Manager database then writes a "COMMITTED" record to its log file to indicate that the transaction is complete and sends a message to the transaction coordinator to indicate that it has finished processing. The transaction coordinator then forwards this message to the application program to indicate that the transaction has been terminated.

Figure 7-4 illustrates the sequence of events that are executed to support two-phase commit processing. (Each numbered item corresponds to one of the steps just outlined.)

Figure 7-4: How two-phase commit processing works.

Recovering from Errors Encountered During Two-phase Commit Processing


As you might imagine, when databases are distributed over several remote servers, the potential for error situations resulting from network or communication failures is greatly increased. Therefore, to ensure that data integrity is never compromised when two-phase commit processing is used, the DB2 Database Manager handles two-phase commit-related errors as follows:

First phase errors. If a database responds that it failed to PREPARE a transaction, the transaction will be rolled back during the second phase of the commit process. In this case, a "PREPARE" message will not be sent to the Transaction Manager database. During the second phase of the commit, the application program sends a rollback message to all participating databases that successfully prepared the transaction during the first phase of the commit. Each database that receives this message writes an "ABORT" record to its log file and releases all locks that were held by the transaction.

Second phase errors. Error handling at the second stage of a commit is dependent upon whether the second phase is to commit or roll back the transaction. (The second phase will roll back the transaction only if the first phase encountered an error.) If one of the participating databases fails to commit the transaction (possibly due to a communications failure), the transaction coordinator will continue (until successful) to try to commit the transaction to the database that failed. (The value stored in the resync_interval parameter of the DB2 Database Manager configuration file is used to determine how long the transaction coordinator will wait between attempts to commit the transaction.)

Transaction Manager database errors. If for some reason the Transaction Manager database fails, the transaction coordinator will resynchronize the transaction when the Transaction Manager database is restarted. This resynchronization process will attempt to complete all indoubt transactions; that is, all transactions that have completed the first phase, but not the second phase, of the two-phase commit process. The DB2 Database Manager instance where the Transaction Manager database resides will perform the resynchronization by:

1. Connecting to the databases that replied that they were "PREPARED" to commit during the first phase of the commit process.

2. Attempting to commit the indoubt transactions at that database. (If no indoubt transactions can be found, the DB2 Database Manager assumes that the database successfully committed the transaction during the second phase of the commit process.)

3. Committing the indoubt transactions in the Transaction Manager database after all indoubt transactions have been committed in the participating databases.

Other database errors. If one of the databases accessed in the transaction fails and is restarted, the DB2 Database Manager for that database will check the log files of the Transaction Manager database to determine whether the transaction should be rolled back or committed. If the transaction is not found in the Transaction Manager database log files, the DB2 Database Manager assumes the transaction was rolled back, and the indoubt transactions for this database will be rolled back as well. Otherwise, the database will wait for a commit request from the transaction coordinator.

If, for some reason, you cannot wait for the transaction coordinator to automatically resolve indoubt transactions, you can manually resolve them by "making a heuristic decision" and applying it to all applicable records. In order to apply a heuristic decision to an indoubt transaction, you must first acquire its global transaction ID. The LIST INDOUBT TRANSACTIONS command or the List Indoubt Transactions API can be used to obtain the global transaction ID for all indoubt transactions that exist for a specified database; once an indoubt transaction's transaction ID has been obtained, the Commit An Indoubt Transaction API can be used to heuristically commit it, and the Rollback An Indoubt Transaction API can be used to heuristically roll it back. After a transaction has been heuristically committed or rolled back, the Forget Transaction Status API can be used to tell the database manager to "forget" it by removing all log records that refer to it and releasing its log space. (If the LIST INDOUBT TRANSACTIONS command is executed with the WITH PROMPTING option specified, it can also be used to heuristically roll back, commit, or forget indoubt transactions, as shown below.)

Caution: The indoubt transaction processing command and APIs should be used with extreme caution and only as a last resort. The best way to resolve indoubt transactions is to wait for the transaction coordinator to drive the resynchronization process.
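For example, an administrator who has decided to resolve indoubt transactions manually might issue the following from the DB2 Command Line Processor; the database name PAYROLL is illustrative, and the interactive prompts produced will vary with the transactions found:

-- Display all indoubt transactions for the current database
-- and prompt for a heuristic action (commit, roll back, or
-- forget) to apply to each one
db2 CONNECT TO payroll
db2 LIST INDOUBT TRANSACTIONS WITH PROMPTING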

Understanding Data Consistency


To understand how DB2 Universal Database attempts to maintain data consistency in both single- and multi-user environments, you must first understand what data consistency is, as well as be able to identify what can cause a database to be placed in an inconsistent state. One of the best ways to learn both is by studying an example. Suppose your company owns a chain of hardware stores and that a database is used to keep track of the inventory stored at each store. By design, this database contains an inventory table for each hardware store in the chain. When supplies are received or sold at a particular store, the inventory table that corresponds to that store is updated accordingly. Now, suppose a case of hammers is physically moved from one hardware store to another. For the database to reflect this inventory move, the hammer count value stored in the donating store's inventory table needs to be lowered, and the hammer count value stored in the receiving store's inventory table needs to be raised. If a user lowers the hammer count value in the donating store's inventory table but fails to raise the hammer count value in the receiving store's inventory table, the data will become inconsistent. Now, the hammer inventory count for the entire chain of hardware stores is no longer accurate.

A database can become inconsistent if a user forgets (or an improperly written application fails) to make all necessary changes (as in the previous example), if the system crashes while a user or application is in the middle of making changes (the hammer count is lowered in the donating store's table, then a system crash occurs before the hammer count is raised in the receiving store's table), or if, for some reason, a database application stops execution prematurely. Inconsistency can also occur when several users attempt to access the same data at the same time. For example, using the hardware store scenario we just looked at, one user might query the database and discover that no more hammers are available (when there really are) because the query read another user's changes before all tables affected by those changes had been properly updated. Reacting to this information, the user (or an Automated Purchase Ordering System) might then place an order for more hammers when none are needed. To ensure that users and applications accessing the same data at the same time do not inadvertently place that data in an inconsistent state, DB2 UDB relies on two mechanisms known as isolation levels and locks.

Isolation Levels
Earlier, we saw that a transaction (otherwise known as a unit of work) is a recoverable sequence of one or more SQL operations grouped together as a single unit, usually within an application process. A given transaction can perform any number of SQL operations (from a single operation to many hundreds or even thousands), depending on what is considered a "single step" within your business logic. The initiation and termination of a single transaction defines points of data consistency within a database; either the effects of all operations performed within a transaction are applied to the database and made permanent (committed), or the effects of all operations performed are backed out (rolled back) and the database is returned to the state it was in before the transaction was initiated. In single-user, single-application environments, each transaction runs serially and does not have to contend with interference from other transactions. However, in multi-user environments, transactions can execute simultaneously, and each transaction has the potential to interfere with any other transaction that has been initiated but not yet terminated. Transactions that have the potential of interfering with one another are said to be interleaved, or parallel, and transactions that run isolated from each other are said to be serializable, which means that the results of running them simultaneously will be no different from the results of running them one right after another (serially). Ideally, every transaction should be serializable.

Why is it important that transactions be serializable? Consider the following: Suppose a travel agent is entering hotel reservation information into a database system at the same time a hotel manager is checking room availability for a conference planning committee. Now, suppose the travel agent blocks off two hundred rooms for a large tour group (to check availability and get a price quote) but does not commit the entry. While the travel agent is relaying the price quote information to the tour group coordinator, the hotel manager queries the database to see how many rooms are available, discovers that all but twenty rooms have been reserved, and tells the conference planning committee that he cannot accommodate their needs. Now, suppose the tour coordinator decides not to reserve the rooms because the quoted price is higher than anticipated. The travel agent rolls back the transaction because no reservations were made, and the two hundred rooms that had been marked as reserved are now shown as being available. Unfortunately, the damage has already been done. The hotel missed the opportunity to host a conference, and they have two hundred vacant rooms they need to fill. If the travel agent's transaction and the hotel manager's transaction had been isolated from each other (serialized), this problem would not have occurred. Either the travel agent's transaction would have finished before the hotel manager's transaction started, or the hotel manager's transaction would have finished before the travel agent's transaction started; in either case, the hotel would not have missed out on the opportunity to host the conference.

When transactions are not isolated from each other in multi-user environments, the following types of events (or phenomena) can occur:

Lost Updates. This event occurs when two transactions read the same data, both attempt to update that data, and one of the updates is lost.
For example: Transaction A and Transaction B read the same row of data and calculate new values for that row based on the original values read. If Transaction A updates the row with its new value and Transaction B then updates the same row, the update operation performed by Transaction A is lost.

Dirty Reads. This event occurs when a transaction reads data that has not yet been committed. For example: Transaction A changes a row of data, and Transaction B reads the changed row before Transaction A commits the change. If Transaction A rolls back the change, Transaction B will have read data that theoretically never existed.

Nonrepeatable Reads. This event occurs when a transaction reads the same row of data twice but gets different results each time. For example: Transaction A reads a row of data, and then Transaction B modifies or deletes that row and commits the change. When Transaction A attempts to reread the row, it will retrieve different data values (if the row was updated) or discover that the row no longer exists (if the row was deleted).

Phantoms (or Phantom Reads). This event occurs when a row of data matches some search criteria but initially is not seen. For example: Transaction A retrieves a set of rows that satisfy some search criteria, and then Transaction B inserts a new row that contains matching search criteria for Transaction A's query. If Transaction A re-executes the query that produced the original set of rows, a different set of rows will be retrieved; the new row added by Transaction B will now be included in the set of rows returned.

Because several different users can access and modify data stored in a DB2 UDB database at the same time, the DB2 Database Manager must be able to allow users to make necessary changes while ensuring that data integrity is never compromised. The sharing of resources by multiple interactive users or application programs at the same time is known as concurrency. One of the ways DB2 UDB enforces concurrency is through the use of isolation levels, which determine how data used in one transaction is "isolated from" other transactions. DB2 Universal Database recognizes and supports the following isolation levels:

Repeatable Read
Read Stability
Cursor Stability
Uncommitted Read

Table 7-1 shows the various phenomena that can occur when each of these isolation levels is used.

Table 7-1: DB2 Universal Database's Isolation Levels and the Phenomena That Can Occur When Each Is Used

Isolation Level      Lost Updates   Dirty Reads   Nonrepeatable Reads   Phantoms
Repeatable Read      No             No            No                    No
Read Stability       No             No            No                    Yes
Cursor Stability     No             No            Yes                   Yes
Uncommitted Read     No             Yes           Yes                   Yes

Adapted from Table 1 on Page 56 of the IBM DB2 Administration Guide: Performance manual.

The Repeatable Read Isolation Level


The Repeatable Read isolation level completely isolates one transaction from the effects of other concurrent transactions. When this isolation level is used, every row that is referenced in any manner by the isolated transaction is "locked" for the duration of that transaction. As a result, if the same query is issued two or more times within the same transaction, the result data set produced will always be the same. (Lost updates, dirty reads, nonrepeatable reads, and phantoms cannot occur.) In addition, transactions using the Repeatable Read isolation level will not see changes made to other rows by other transactions until those changes have been committed.

Transactions using the Repeatable Read isolation level can retrieve the same set of rows multiple times and perform any number of operations on them until terminated by performing either a commit or a rollback operation. However, no other transaction is allowed to perform any insert, update, or delete operation that would affect the set of rows being accessed by the isolating transaction, as long as that transaction remains active. To ensure that the data being accessed by a transaction running under the Repeatable Read isolation level is not adversely affected by other transactions, each row referenced by the isolating transaction (not just the rows that are actually retrieved and/or modified) is locked. Thus, if a transaction scans 1,000 rows to retrieve 10, locks are acquired and held on all 1,000 rows scanned, not just on the 10 rows retrieved.

Note: If an entire table or view is scanned in response to a query when the Repeatable Read isolation level is used, the entire table or all table rows referenced by the view are locked. This greatly reduces concurrency, especially when large tables are used.

So how does this isolation level work in a real-world situation? Suppose you own a large hotel and have a Web site that allows individuals to reserve rooms on a first-come, first-served basis. If your hotel reservation application runs under the Repeatable Read isolation level, when a customer retrieves a list of all rooms available for a given range of dates, you will not be able to change the room rate for those rooms during the date range specified, nor will other customers be able to make or cancel reservations that would cause the list to change if it were generated again, as long as the transaction that produced the list is active. (However, you can change room rates for any room that was not scanned in response to the first customer's query. Likewise, other customers can make or cancel room reservations for any room that was not scanned in response to the first customer's query.)

The Read Stability Isolation Level


Unlike the Repeatable Read isolation level, the Read Stability isolation level does not completely isolate one transaction from the effects of other concurrent transactions. That is because when the Read Stability isolation level is used, only rows that are actually retrieved by a single transaction are locked for the duration of that transaction. Thus, when this isolation level is used, if the same query is issued two or more times within the same transaction, the result data set produced may not always be the same. (Lost updates, dirty reads, and nonrepeatable reads cannot occur; phantoms, however, can and may be seen.) In addition, transactions using the Read Stability isolation level will not see changes made to other rows by other transactions until those changes have been committed.

Transactions using the Read Stability isolation level can retrieve a set of rows and perform any number of operations on them until terminated by performing either a commit or a rollback operation. However, no other transaction is allowed to perform any update or delete operation that would affect the set of rows retrieved by the isolating transaction, as long as that transaction exists. (Other transactions can perform insert operations, and if the transaction running under the Read Stability isolation level executes the same query multiple times, rows inserted between each query by other concurrent transactions may appear in subsequent result data sets produced. As mentioned earlier, such rows are called "phantoms.") Unlike the Repeatable Read isolation level, in which every row that is referenced in any way by the isolating transaction is locked, when the Read Stability isolation level is used, only the rows that are actually retrieved and/or modified by the isolating transaction are locked. Thus, if a transaction scans 1,000 rows to retrieve 10, locks are only acquired and held on the 10 rows retrieved, not on all 1,000 rows scanned. (Because fewer locks are acquired, more transactions can run concurrently.)

How does this isolation level change the way our hotel reservation application works? Now, when a customer retrieves a list of rooms available for a given range of dates, you will be able to change the room rate for any room in the hotel that does not appear on the list, and other customers will be able to cancel room reservations for rooms that had been reserved for the date range specified by the first customer's query. Therefore, if the customer generates the list of available rooms again (before the transaction that submitted the query terminates), the list produced may contain new room rates and/or rooms that were not available the first time the list was generated.

The Cursor Stability Isolation Level


The Cursor Stability isolation level is even more relaxed than the Read Stability isolation level in the way it isolates one transaction from the effects of other concurrent transactions. When the Cursor Stability isolation level is used, only the row that is currently being referenced by a cursor is locked. (Say a potential customer is considering making a reservation for room 4212. The moment the customer selects this room for evaluation, a pointer, called a cursor, will be positioned on the row for room 4212, and that row will be locked.) The lock acquired remains in effect until the cursor is repositioned (more often than not by executing the FETCH SQL statement) or until the isolating transaction terminates. (If the cursor is repositioned, the lock being held on the last row read is released, and a new lock is acquired for the row that the cursor is now positioned on.)

When a transaction using the Cursor Stability isolation level retrieves a row from a table via an updatable cursor, no other transaction can update or delete that row while the cursor is positioned on it. However, other transactions can add new rows to the table, as well as perform update and/or delete operations on rows positioned on either side of the locked row, provided the locked row was not accessed using an index. Furthermore, if the isolating transaction modifies any row it retrieves, no other transaction can update or delete that row until the isolating transaction is terminated, even when the cursor is no longer positioned on the modified row. As you might imagine, when the Cursor Stability isolation level is used, if the same query is issued two or more times within the same transaction, the results returned may not always be the same. (Lost updates and dirty reads cannot occur; nonrepeatable reads and phantoms, however, can and may be seen.) In addition, transactions using the Cursor Stability isolation level will not see changes made to other rows by other transactions until those changes have been committed.

Once again, let us see how this isolation level affects our hotel reservation application. Now, when a customer retrieves a list of rooms available for a given range of dates and then views information about each room on the list produced (one room at a time), you will be able to change the room rate over any date range for any room in the hotel except the room that the customer is currently looking at. Likewise, other customers will be able to make or cancel reservations over any date range for any room in the hotel; however, they will not be able to do anything with the room that the first customer is currently looking at. When the first customer views information about another room in the list, the same holds true for the new room the customer looks at: you will now be able to change the room rate for the room that the first customer was just looking at, and other customers will be able to reserve that particular room, provided the first customer did not reserve the room for himself/herself.

The Uncommitted Read Isolation Level


While the Repeatable Read isolation level is the most restrictive of the isolation levels available, the Uncommitted Read isolation level is the least intrusive isolation level provided. In fact, when the Uncommitted Read isolation level is used, rows that are retrieved by a single transaction are locked only if another transaction attempts to drop or alter the table from which the rows were retrieved. Because rows often remain unlocked when this isolation level is used, dirty reads, nonrepeatable reads, and phantoms can occur. Therefore, the Uncommitted Read isolation level is commonly used for transactions that access read-only tables/views or transactions that execute queries on which uncommitted data from other transactions will have no adverse effect.

In most cases, transactions using the Uncommitted Read isolation level can read changes made to rows by other transactions before those changes have been committed. However, such transactions can neither see nor access tables, views, or indexes created by other concurrent transactions until those transactions have been terminated. The same applies to existing tables, views, or indexes that have been dropped: transactions using the Uncommitted Read isolation level will learn that these objects no longer exist only when the transaction that dropped them is terminated. There is one exception to this behavior: When a transaction running under the Uncommitted Read isolation level uses an updatable cursor, the transaction will behave as if it is running under the Cursor Stability isolation level, and the constraints of the Cursor Stability isolation level will apply.

So how does the Uncommitted Read isolation level affect our hotel reservation application? Now, when a customer retrieves a list of rooms available for a given range of dates, you will be able to change the room rate for any room in the hotel, and other customers will be able to make or cancel reservations over any date range for any room. Furthermore, the list produced for the first customer may contain rooms that other customers have chosen to cancel reservations for, but whose cancellations have not yet been committed to the database.

Choosing the Proper Isolation Level


In addition to controlling how well the DB2 Database Manager provides concurrency, the isolation level used also determines how well applications running concurrently will perform. As a result, using the wrong isolation level for a given situation can have a significant negative impact on both concurrency and performance. How do you determine which isolation level to use for a given situation? You start by identifying the concurrency problems that can arise and determining which concurrency problems are acceptable in your environment based upon the business rules in effect. Then you select an isolation level that will prevent any of these business rules from being violated. Typically, you should use the:

Repeatable Read isolation level if you are executing queries and do not want other concurrent transactions to have the ability to make changes that could cause the same query to return different results if run more than once.

Read Stability isolation level when you want some level of concurrency between applications, yet you also want qualified rows to remain stable for the duration of an individual transaction.

Cursor Stability isolation level when you want maximum concurrency between applications and you do not want queries to return uncommitted data values.

Uncommitted Read isolation level if you are executing queries on read-only databases or you do not care if a query returns uncommitted data values.

Specifying the Isolation Level to Use


Although isolation levels control concurrency at the transaction level, they are actually set at the application level. Therefore, in most cases, the isolation level specified for a particular application is applicable to every transaction initiated by that application. (It is important to note that an application can be constructed in several different parts, and each part can be assigned a different isolation level, in which case the isolation level specified for a particular part is applicable to every transaction that is created within that part.)

For embedded SQL applications, the isolation level to be used is specified at precompile time or when the application is bound to a database (if deferred binding is used). The isolation level for embedded SQL applications written in a supported compiled language (such as C and C++) is set through the ISOLATION option of the PRECOMPILE and BIND commands. The isolation level for Open Database Connectivity (ODBC) and Call Level Interface (CLI) applications is set at application run time by calling the SQLSetConnectAttr() function with the SQL_ATTR_TXN_ISOLATION connection attribute specified. Alternatively, the isolation level for ODBC/CLI applications can be set by assigning a value to the TXNISOLATION keyword in the db2cli.ini configuration file; however, this approach does not provide the flexibility of changing isolation levels for different transactions within the application that the first approach provides. The isolation level for JDBC and SQLJ applications is set at application run time by calling the setTransactionIsolation() method of the Connection interface.

When the isolation level for an application is not explicitly set using one of these methods, the Cursor Stability isolation level is used as the default. This holds true for commands, SQL statements, and scripts executed from the Command Line Processor, as well as for embedded SQL, ODBC/CLI, JDBC, and SQLJ applications. Therefore, it is also possible to specify the isolation level to be used for any transaction that is to be executed by the Command Line Processor. In this case, the isolation level is set by executing the CHANGE ISOLATION command before a connection to a database is established.
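For CLI/ODBC applications, the attribute is set on the connection handle. The following fragment is a minimal sketch, assuming a connection handle has already been allocated (the variable names are placeholders; SQLSetConnectAttr() and the SQL_ATTR_TXN_ISOLATION and SQL_TXN_* symbols are the standard CLI/ODBC names, where the ODBC value SQL_TXN_REPEATABLE_READ corresponds to DB2's Read Stability and SQL_TXN_SERIALIZABLE to DB2's Repeatable Read):

    SQLHDBC   ConHandle;    /* connection handle, already allocated */
    SQLRETURN RetCode;

    /* Ask for the Read Stability isolation level for all
       transactions that run on this connection */
    RetCode = SQLSetConnectAttr(ConHandle,
                                SQL_ATTR_TXN_ISOLATION,
                                (SQLPOINTER) SQL_TXN_REPEATABLE_READ,
                                SQL_IS_INTEGER);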

Locking
The one thing that all four isolation levels have in common is that they control how data is accessed by concurrent transactions through the use of locks. So just what is a lock? A lock is a mechanism that is used to associate a data resource with a single transaction for the sole purpose of controlling how other transactions interact with that resource while it is associated with the transaction that has it locked. (The transaction that has a data resource associated with it is said to "hold" or "own" the lock.) Essentially, locks in a database environment serve the same purpose as they do in a house or a car: they determine who can and cannot gain access to a particular resource, which, in the case of a data resource, is one or more tablespaces, tables, and/or rows.

The DB2 Database Manager imposes locks to prohibit "owning" transactions from accessing uncommitted data that has been written by other transactions and to prevent other transactions from making data modifications that might adversely affect the owning transaction. When an owning transaction is terminated (by being committed or rolled back), any changes made to the resource that was locked are made permanent (or removed), and all locks on the resource that had been acquired by the owning transaction are released. Once unlocked, a resource can be locked again and manipulated by another active transaction. Figure 7-5 illustrates the principles of transaction and resource locking.

Figure 7-5: How DB2 Universal Database prevents uncontrolled concurrent access to a resource by using locks

How Locks Are Acquired


Except for occasions in which the Uncommitted Read isolation level is used, it is never necessary for a transaction to explicitly request a lock. That's because the DB2 Database Manager implicitly acquires locks as they are needed (and once acquired, these locks remain under the DB2 Database Manager's control until they are no longer needed). By default, the DB2 Database Manager always attempts to acquire row-level locks. However, it is possible to control whether the DB2 Database Manager will always acquire row-level locks or table-level locks on a specific table resource by executing a special form of the ALTER TABLE SQL statement. The syntax for this form of the ALTER TABLE statement is:

ALTER TABLE [TableName] LOCKSIZE [ROW | TABLE]

where:

TableName    Identifies the name of an existing table for which the level of locking that all transactions are to use when accessing it is to be specified.

For example, if executed, the SQL statement ALTER TABLE EMPLOYEE LOCKSIZE TABLE would force the DB2 Database Manager to acquire table-level locks for every transaction that accesses the table named EMPLOYEE. On the other hand, if the SQL statement ALTER TABLE EMPLOYEE LOCKSIZE ROW was executed, the DB2 Database Manager would attempt to acquire row-level locks (which is the default behavior) for every transaction that accesses the table named EMPLOYEE.

But what if you don't want every transaction that works with a particular table to acquire table-level locks? What if, instead, you want only one specific transaction to acquire table-level locks, and you want all other transactions to acquire row-level locks when working with that table? In this case, you leave the default locking behavior alone (so row-level locking is used) and use the LOCK TABLE SQL statement to acquire a table-level lock for the appropriate individual transaction. The syntax for the LOCK TABLE statement is:

LOCK TABLE [TableName] IN [SHARE | EXCLUSIVE] MODE

where:

TableName    Identifies the name of an existing table or declared temporary table that is to be locked.

As you can see, the LOCK TABLE statement allows a transaction to acquire a table-level lock on a particular table, in one of two modes: SHARE mode and EXCLUSIVE mode. If a table is locked using SHARE mode, a table-level Share (S) lock is acquired on behalf of the requesting transaction, and other concurrent transactions are allowed to read, but not change, the data stored in the locked table. If a table is locked using EXCLUSIVE mode, however, a table-level Exclusive (X) lock is acquired, and other concurrent transactions can neither access nor modify data stored in the locked table.

For example, if executed, the SQL statement LOCK TABLE EMPLOYEE IN SHARE MODE would acquire a table-level Share (S) lock on the EMPLOYEE table on behalf of the current transaction (provided no other transaction holds a lock on this table), and other concurrent transactions would be allowed to read, but not change, the data stored in the table. On the other hand, if the SQL statement LOCK TABLE EMPLOYEE IN EXCLUSIVE MODE were executed, a table-level Exclusive (X) lock would be acquired, and no other concurrent transaction would be allowed to read or modify data stored in the EMPLOYEE table.

Note: The LOCK TABLE statement cannot be used to lock system catalog tables or tables that are referenced by nicknames.
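Because any lock acquired this way is held until the transaction terminates, a LOCK TABLE statement is typically paired with the work it protects and an explicit commit. The fragment below is an illustrative sketch only; the table name and the update it performs are invented:

    LOCK TABLE employee IN EXCLUSIVE MODE;
    UPDATE employee SET salary = salary * 1.04;
    COMMIT;    -- the table-level Exclusive (X) lock is released here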

Practice Questions
Question 1

Which of the following code fragments is considered to be static SQL?
A. EXEC SQL EXECUTE IMMEDIATE :hv1
B. EXEC SQL SELECT c1 FROM t1 INTO :hv1
C. EXEC SQL INSERT INTO t1 VALUES (?)
D. EXEC SQL DELETE FROM t1 WHERE EMPNO = ?

Question 2

Which of the following applies to atomic compound SQL?
A. Can contain operating system commands
B. SQL substatements can succeed or fail, regardless of the success or failure of preceding SQL substatements in the same compound SQL statement block.
C. If any SQL substatement in the compound SQL statement block fails, no other substatements are executed and the effects of the compound SQL statement are rolled back.
D. It is the DB2 UDB term for SQL statements that include subselects, CASE statements, declared temporary tables, common table expressions, and recursive SQL.

Question 3

Given the following table:

TAB1
EMPID  NAME
1      USER1
2      USER2
3      USER3

Users A and B are using the DB2 Command Line Processor with AUTOCOMMIT turned off. They execute the following commands in the order shown:

User A:
SET CURRENT ISOLATION RS
DECLARE c1 CURSOR FOR SELECT empid FROM tab1
OPEN c1

User B:
SET CURRENT ISOLATION RS
DELETE FROM tab1 WHERE empid = 3

User A:
db2 "FETCH c1"

What will each user see in their respective windows?
A. User A: 1; User B: session hangs
B. User A: 1; User B: session completes successfully
C. User A: session hangs; User B: session hangs
D. User A: session hangs; User B: session completes successfully

Question 4

Using the CS isolation level, Application A drops table TAB1 and does not perform a commit operation. Using the UR isolation level, Application B attempts to retrieve data from TAB1. Which of the following will occur?
A. Application A will be prompted to commit the current transaction
B. Application B will wait until Application A commits or rolls back the transaction
C. Application B will be able to access data in TAB1 until Application A commits the transaction
D. Application B will receive an error immediately after Application A's delete operation is performed

Question 5

Which of the following affects concurrency by controlling the isolation level that will be used by embedded SQL applications?
A. TXNISOLATION
B. SQL_ATTR_TXN_ISOLATION
C. setTransactionIsolation()
D. ISOLATION

Question 6

The following sequence of SQL statements was successfully executed in manual-commit mode:

EXEC SQL CONNECT TO dbase2;
EXEC SQL CONNECT TO dbase1;
INSERT INTO tab1 VALUES (1, 'Red');
SET CONNECTION dbase2;
INSERT INTO tab1 VALUES (2, 'White');
SET CONNECTION dbase1;
RELEASE dbase2;
COMMIT;
INSERT INTO tab1 VALUES (3, 'Blue');
ROLLBACK;

Which of the following describes the current state?
A. One record was added to DBASE1 and the connection to it is open; one record was added to DBASE2 and the connection to it is closed
B. Two records were added to DBASE1 and the connection to it is open; one record was added to DBASE2 and the connection to it is closed
C. One record was added to DBASE1; one record was added to DBASE2; and the connections to both DBASE1 and DBASE2 are open
D. One record was added to DBASE1; one record was added to DBASE2; and the connections to both DBASE1 and DBASE2 are closed

Question 7

Which of the following does NOT correctly illustrate parameter marker usage?
A. SELECT c1 FROM t1 WHERE c2 = ?
B. INSERT INTO t1 VALUES (?, ?)
C. SELECT ? FROM t1 WHERE c2 = c3
D. UPDATE t1 SET c1 = ? WHERE c2 = c3

Question 8

Which of the following JDBC interfaces has methods that permit the use of parameter markers?
A. Connection
B. PreparedStatement
C. Statement
D. ResultSet

Question 9

A CLI/ODBC application executes the following statement:

SQLExecDirect(StmtHandle, "SELECT c1 FROM t1; SELECT c1 FROM t2;", SQL_NTS);

When all of the records retrieved from table T1 have been processed, which of the following functions must be executed in order to process the records retrieved from table T2?
A. SQLFetch()
B. SQLNextResultSet()
C. SQLCloseCursor()
D. SQLMoreResults()

Question 10

Given the following table definition:

CREATE TABLE tab1 (c1 SMALLINT, c2 VARCHAR(30))

and the following embedded SQL pseudo-source code:

EXEC SQL BEGIN DECLARE SECTION;
    int   hv1;
    short hv2;
    char  hv3[30];
    short hv4;
    char  text_stmt[70];
EXEC SQL END DECLARE SECTION;
...
hv1 = 20;
hv2 = -1;
strcpy(hv3, "USER1");
hv4 = 0;
strcpy(text_stmt, "INSERT INTO t1 VALUES (?, ?)");
EXEC SQL PREPARE stmt FROM :text_stmt;

Which of the following statements will successfully insert a row into table T1 where C1 contains a null value and C2 contains the value USER1?
A. EXEC SQL EXECUTE stmt USING :hv2, :hv3
B. EXEC SQL EXECUTE stmt USING :hv4 :hv2, :hv3
C. EXEC SQL EXECUTE stmt USING :hv1 hv2, :hv3
D. EXEC SQL EXECUTE stmt USING :hv1 :hv4, :hv3

Question 11

Given the following code:

EXEC SQL BEGIN COMPOUND NOT ATOMIC
    INSERT INTO territory VALUES (:Country1, :Region);
    INSERT INTO territory VALUES (:Country2, :Region);
    INSERT INTO territory VALUES (:Country3, :Region);
    UPDATE territory SET region = :Region2 WHERE Country = :Country4;
    INSERT INTO territory VALUES (:Country4, :Region2);
    COMMIT;
END COMPOUND;

Assuming that all substatements except the following executed successfully:

INSERT INTO territory VALUES (:Country2, :Region);

How many rows were added to the TERRITORY table?
A. 0
B. 1
C. 2
D. 3

Question 12

Which of the following can NOT be locked, either implicitly or explicitly?
A. Declared temporary tables
B. User tables
C. System database directory
D. Rows

Question 13

A table contains a list of all seats available at a football stadium. A seat consists of a section number, a seat number, and whether or not the seat has been assigned. A ticket agent working at the box office generates a list of all unassigned seats. When the agent refreshes the list, it should only change if another agent assigns one or more unassigned seats. Which of the following is the best isolation level to use for this application?
A. Repeatable Read
B. Read Stability
C. Cursor Stability
D. Uncommitted Read

Question 14

Which two of the following are TRUE for a remote unit of work?
A. Type 1 connections must be used
B. Operations can only be performed against a single data source
C. Type 2 connections must be used
D. Operations can be performed against any number of data sources
E. A transaction coordinator and a Transaction Manager database are used to coordinate COMMITs and ROLLBACKs

Question 15

Which of the following CLI/ODBC functions can be used to obtain information about a specific parameter in an SQL statement after the SQLPrepare() function is called?
A. SQLGetInfo()
B. SQLGetParamInfo()
C. SQLBindParameter()
D. SQLGetDescRec()

Question 16

Which of the following CLI/ODBC statement handle attributes is used to control the number of rows that are retrieved from a result data set each time the SQLFetch() function is called?
A. SQL_ATTR_MAX_ROWS
B. SQL_ATTR_ROW_ARRAY_SIZE
C. SQL_ATTR_ROWSET_SIZE
D. SQL_ATTR_FETCH_SIZE

Question 17

A developer wants to use static SQL to retrieve a row of data from a table, and he is not sure how many rows will be returned by his query. Which of the following code fragments correctly demonstrates the SQL statements needed to meet his objective?

A. EXEC SQL SELECT col1 FROM tab1 USING CURSOR c1;
   EXEC SQL FETCH cur INTO :hv;
   EXEC SQL CLOSE cur;

B. EXEC SQL DECLARE cur CURSOR FOR SELECT col1 FROM tab1;
   EXEC SQL OPEN cur;
   EXEC SQL FETCH cur INTO :hv;
   EXEC SQL CLOSE cur;

C. EXEC SQL PREPARE stmt FOR SELECT col1 FROM tab1;
   EXEC SQL DECLARE cur CURSOR FOR stmt;
   EXEC SQL OPEN cur;
   EXEC SQL FETCH cur INTO :hv;
   EXEC SQL CLOSE cur;

D. EXEC SQL SELECT col1 FROM tab1 INTO :hv;

Question 18

Which of the following correctly identifies what indoubt transactions are?
A. Transactions that have not yet completed the first phase of the two-phase commit process
B. Transactions that were "lost" during the second phase of the two-phase commit process
C. Transactions that have completed the first phase, but not the second phase, of the two-phase commit process
D. Poorly constructed transactions that have failed and left doubt about a database's consistency

Answers

Question 1

The correct answer is B. Parameter markers are not used with static SQL statements, so answers C and D are automatically eliminated. The EXECUTE IMMEDIATE statement is used to execute dynamic SQL statements, and the SELECT INTO SQL statement can only be executed statically, so the correct answer is B.

Question 2

The correct answer is C. When a compound SQL statement is processed atomically, the application executing the statement receives a response from the DB2 Database Manager when all substatements within the compound statement have completed successfully or when one of the substatements ends in an error. If one substatement ends in an error, the entire group of statements is considered to have ended in error, and any changes made to the database by other substatements within the compound statement are backed out with a rollback operation.

Question 3

The correct answer is A. Transactions using the Read Stability isolation level can retrieve a set of rows and perform any number of operations on them until terminated by performing either a commit or rollback operation. However, no other transaction is allowed to perform any update or delete operation that would affect the set of rows retrieved by the isolating transaction, as long as that transaction exists. Therefore, the first row retrieved from the table named TAB1 will be shown to User A, and the session for User B will hang until User A closes cursor C1 and commits (or rolls back) the transaction.

Question 4

The correct answer is B. In most cases, transactions using the Uncommitted Read (UR) isolation level can read changes made to rows by other transactions before those changes have been committed. However, such transactions can neither see nor access tables, views, or indexes created by other concurrent transactions until those transactions have been terminated. The same applies to existing tables, views, or indexes that have been dropped: transactions using the Uncommitted Read isolation level will learn that these objects no longer exist only when the transaction that dropped them is terminated.

Question 5

The correct answer is D. The isolation level for embedded SQL applications written in a supported compiled language is set through the ISOLATION option of the PRECOMPILE and BIND commands. (The isolation level for CLI/ODBC applications is set at application run time by calling the SQLSetConnectAttr() function with the SQL_ATTR_TXN_ISOLATION connection attribute specified. Alternatively, the isolation level for ODBC/CLI applications can be set by assigning a value to the TXNISOLATION keyword in the db2cli.ini configuration file; however, this approach does not provide the flexibility of changing isolation levels for different transactions within the application that the first approach does. The isolation level for JDBC and SQLJ applications is set at application run time by calling the setTransactionIsolation() method of the JDBC Connection interface.)

Question 6

The correct answer is A. When the RELEASE SQL statement is executed, a connection is removed from the "Held" state and placed in the "Release-Pending" state, which means that the connection will be terminated by the next successful commit operation. In this example, one row is added to TAB1 of DBASE1, one row is added to TAB1 of DBASE2, DBASE2 is placed in "Release-Pending" state, the row added to TAB1 of DBASE1 is committed and the connection to DBASE2 is terminated, another row is added to TAB1 of DBASE1, and then that row is removed with a rollback operation.

Question 7

The correct answer is C. Parameter markers are typically used where a host variable would be referenced if the SQL statement being executed were static; neither host variables nor parameter markers can be used to identify column names in a SELECT statement.

Question 8

The correct answer is B. When parameter markers are used in JDBC applications, values are supplied for each parameter marker placed in an SQL statement by calling the appropriate setXXX() method of the PreparedStatement or CallableStatement object that the SQL statement is associated with.

Question 9

The correct answer is D. When two or more SELECT SQL statements are executed together as a batch, multiple result data sets are produced. To process multiple result data sets, an application must call the SQLMoreResults() function. This function discards the contents of the current result data set and makes the next result data set available.

Question 10

The correct answer is B. When processing INSERT and UPDATE SQL statements, the DB2 Database Manager examines the value of any indicator variable provided first, and if it contains a negative value, the DB2 Database Manager assigns a null value to the appropriate column, provided null values are allowed. If the indicator variable is set to zero or contains a positive number, or if no indicator value is used, the DB2 Database Manager assigns the value stored in the corresponding host variable to the appropriate column instead.

Question 11

The correct answer is D. When a NOT ATOMIC compound SQL statement is processed, the application executing the statement receives a response from the DB2 Database Manager when all substatements within the block have completed; all substatements within the compound statement are executed, even if one or more substatements end in error. If one or more substatements end in error, changes made to the database by those substatements are rolled back. Changes made by substatements within the compound statement that executed successfully can only be backed out by rolling back the transaction that the compound SQL statement was executed from.

Question 12

The correct answer is C. The LOCK TABLE statement cannot be used to lock system catalog tables or tables that are referenced by nicknames. The system database directory is a file that is used to keep track of where databases have been created and cataloged; therefore, it cannot be locked.

Question 13

The correct answer is C. If the Repeatable Read isolation level is used, other agents will be unable to assign seats as long as the transaction that generated the list remains active; therefore, the list will not change when it is refreshed. If the Read Stability isolation level is used, other agents will be able to unassign currently assigned seats (and these unassigned seats will show up when the list is refreshed) but will not be able to assign any seat that appears in the list as long as the transaction that generated the list remains active. If the Uncommitted Read isolation level is used, other agents will be able to unassign currently assigned seats, as well as assign unassigned seats; however, uncommitted seat unassignments or assignments will show up when the list is refreshed, and the agent may make an inappropriate change based on this data. Therefore, the best isolation level to use for this particular application is the Cursor Stability isolation level.

Question 14

The correct answers are A and B. Type 1 connections support only one database connection per transaction (referred to as a remote unit of work), while Type 2 connections support any number of database connections per transaction (referred to as an application-directed distributed unit of work). Essentially, when Type 1 connections are used, an application can be connected to only one database at a time; once a connection to a database is established and a transaction is started, that transaction must be either committed or rolled back before another database connection can be established.

Question 15

The correct answer is D. The SQLGetDescRec() function can be used to retrieve the contents of several parameter or column record fields, which identify the data type and storage of a specific parameter or column in a result data set.

Question 16

The correct answer is C. The rows returned when data is fetched using a block cursor are known as a rowset, and the actual number of rows that are returned in a rowset is controlled by the SQL_ATTR_ROWSET_SIZE statement attribute.

Question 17

The correct answer is B. When a query that has the potential to generate a result data set containing several rows is executed, a cursor must be used to retrieve data values from the result data set produced. The following steps must be followed (in the order shown) if a cursor is to be incorporated into an application program:
1. Declare (define) a cursor, along with its type (read-only or updatable), and associate it with the desired query (SELECT or VALUES SQL statement).
2. Open the cursor. This action will cause the corresponding query to be executed and a result data set to be produced.
3. Retrieve (fetch) each row in the result data set, one by one, until an "End of data" condition occurs; each time a row is retrieved from the result data set, the cursor is automatically moved to the next row.
4. If appropriate, modify or delete the current row (but only if the cursor is an updatable cursor).
5. Close the cursor. This action will cause the result data set that was produced when the corresponding query was executed to be deleted.
With a DB2 UDB embedded SQL application, the following statements are used to carry out these steps: EXEC SQL DECLARE CURSOR, EXEC SQL OPEN, EXEC SQL FETCH, and EXEC SQL CLOSE.

Question 18

The correct answer is C. Indoubt transactions are transactions that have completed the first phase, but not the second phase, of the two-phase commit process.

Chapter 8: User-defined Routines


Overview
Eight percent (8%) of the DB2 UDB V8.1 Family Application Development exam (Exam 703) is designed to test your knowledge of user-defined data types, user-defined functions, and stored procedures, along with your ability to identify when each should be used. The questions that make up this portion of the exam are intended to evaluate the following: Your ability to identify each type of user-defined function available. Your ability to identify when user-defined functions should be used. Your ability to identify when stored procedures should be used. Your knowledge of how the Development Center can be used to construct user-defined functions and stored procedures. This chapter is designed to introduce you to user-defined data types, user-defined functions, and stored procedures; walk you through the process used to create and register user-defined data types and stored procedures; and provide you with an overview of the Development Center. Terms you will learn: User-defined data type (UDT) Distinct data type CREATE DISTINCT TYPE Structured data type CREATE TYPE User-defined function (UDF) Sourced function SQL scalar function SQL table function

SQL row function External scalar function External table function OLE DB external table function Function template CREATE FUNCTION Parameter passing styles Stored procedures SQL stored procedures External stored procedures CREATE PROCEDURE CALL Development Center

Techniques you will master: Understanding how and why user-defined data types are created. Understanding how and why user-defined functions are created. Recognizing the differences among the user-defined data types available. Understanding how and why stored procedures are created. Understanding how the Development Center aids in the development of user-defined functions and stored procedures.

DB2 Universal Database's Data Types


If you stop and think about it, you will discover that most of the "data" you encounter on a day-to-day basis falls into distinct categories. The money you buy coffee with and the change you get back are numerical in nature; the email messages you read and the replies you send back are composed of character strings; and many of the things you do, such as attending meetings, eating dinner, and going to bed, revolve around time. Most of the data that gets stored in a DB2 UDB database can be categorized in a similar manner. To ensure that all data is stored as efficiently as possible, DB2 UDB comes equipped with a rich assortment of built-in data types. The built-in data types available are shown in Table 8-1.

Table 8-1: DB2 UDB's Built-in Data Types

Small Integer: SMALLINT

Integer: INTEGER, INT

Big Integer: BIGINT

Decimal: DECIMAL(Precision, Scale), DEC(Precision, Scale), NUMERIC(Precision, Scale), NUM(Precision, Scale), where Precision is any number between 1 and 31 and Scale is any number between 0 and Precision

Single-Precision Floating-Point: REAL; FLOAT(Precision), where Precision is any number between 1 and 24

Double-Precision Floating-Point: DOUBLE, DOUBLE-PRECISION, FLOAT; FLOAT(Precision), where Precision is any number between 25 and 53

Fixed-Length Character String: CHARACTER(Length) <FOR BIT DATA>*, CHAR(Length) <FOR BIT DATA>*, where Length is any number between 1 and 254

Varying-Length Character String: CHARACTER VARYING(MaxLength) <FOR BIT DATA>*, CHAR VARYING(MaxLength) <FOR BIT DATA>*, VARCHAR(MaxLength) <FOR BIT DATA>*, where MaxLength is any number between 1 and 32,672

Long Varying-Length Character String: LONG VARCHAR*

Fixed-Length Double-Byte Character String: GRAPHIC(Length), where Length is any number between 1 and 127

Varying-Length Double-Byte Character String: VARGRAPHIC(MaxLength), where MaxLength is any number between 1 and 16,336

Long Varying-Length Double-Byte Character String: LONG VARGRAPHIC

Date: DATE

Time: TIME

Timestamp: TIMESTAMP

Binary Large Object: BINARY LARGE OBJECT(Size <K | M | G>), BLOB(Size <K | M | G>), where Size is any number between 1 and 2,147,483,647; if K (for kilobyte) is specified, Size is any number between 1 and 2,097,152; if M (for megabyte) is specified, Size is any number between 1 and 2,048; if G (for gigabyte) is specified, Size is any number between 1 and 2

Character Large Object: CHARACTER LARGE OBJECT(Size <K | M | G>), CHAR LARGE OBJECT(Size <K | M | G>), CLOB(Size <K | M | G>), where Size is any number between 1 and 2,147,483,647; if K (for kilobyte) is specified, Size is any number between 1 and 2,097,152; if M (for megabyte) is specified, Size is any number between 1 and 2,048; if G (for gigabyte) is specified, Size is any number between 1 and 2

Double-Byte Character Large Object: DBCLOB(Size <K | M | G>), where Size is any number between 1 and 1,073,741,823; if K (for kilobyte) is specified, Size is any number between 1 and 1,048,576; if M (for megabyte) is specified, Size is any number between 1 and 1,024; if G (for gigabyte) is specified, Size must be 1

*If the FOR BIT DATA option is used with any character string data type definition, the contents of the column that the data type is assigned to are treated as binary data. As a result, code page conversions are not performed if data is exchanged between other systems, and all comparisons made are done in binary, regardless of the collating sequence used by the database.

In addition to these "traditional" types of data, DB2 UDB provides facilities that can be used to create an infinite number of user-defined data types, which can in turn be used to store complex, nontraditional data that might be found in a complex computing environment.
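To put a few of these definitions in context, here is a small, hypothetical table definition that draws on several of the built-in types listed in Table 8-1 (the table and column names are invented for illustration):

    CREATE TABLE guest_folio
        (folio_id    INTEGER NOT NULL,
         guest_name  VARCHAR(50),
         balance     DECIMAL(11,2),
         checked_in  DATE,
         photo       BLOB(1 M))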

User-defined Data Types


As the name implies, user-defined data types (UDTs) are data types that are created (and named) by a database user. A user-defined data type can be a distinct data type that shares a common representation with one of the built-in data types provided with DB2 UDB, or it can be a structured type that consists of a sequence of named attributes, each of which has its own data type. Structured data types can also be created as subtypes of other structured types, thereby defining a type hierarchy. (Structured data types are not supported by DB2 UDB for iSeries.)

Distinct Data Types


A distinct data type is a user-defined data type that is derived from one of the built-in data types available. Although a distinct data type shares a common internal representation with a built-in data type, it is considered a separate data type that is distinct from any other data type (hence, the "distinct" in the name). Because the DB2 Database Manager guarantees that strong data typing exists, the value of one user-defined data type is compatible only with values of that same type (or of other user-defined data types within the same data type hierarchy), even though distinct data types share the same representation as other built-in data types. As a result, user-defined data types cannot be used as arguments for most of the built-in functions available. Instead, user-defined functions (or methods) that provide similar functionality must be developed when that kind of capability is needed. (Similarly, a built-in data type cannot be used in arguments or operands designed to use a distinct data type.)

When a distinct data type is created, six comparison functions (named =, <>, <, <=, >, and >=) are also created. These functions allow two instances of the distinct data type to be compared in the same manner that any two values of the same built-in data type can be compared. Because these functions enable the DB2 Database Manager to compare instances of a distinct data type, the ORDER BY, GROUP BY, and DISTINCT clauses of a SELECT SQL statement can be used with columns that have been defined with a distinct data type. (However, because Large Object (LOB) data types cannot be compared, LOB columns cannot be used in indexes, ORDER BY clauses, GROUP BY clauses, and DISTINCT clauses, and the same is true for distinct data types that are based on LOB data types.)

Along with the six comparison functions, two casting functions are also generated when a distinct data type is created. These functions are used to convert data between the new distinct data type and the built-in data type on which the distinct data type is based. The name of the casting function that will convert a built-in data type value to a distinct data type value is the same as that of the distinct data type itself. Thus, if a distinct data type named EMPID that is based on an INTEGER built-in data type is created, the DB2 Database Manager automatically generates the casting functions EMPID(INTEGER) and INTEGER(EMPID). Both casting functions are extremely efficient because the distinct data type and the built-in type that the functions are based on share the same representation, and no real work is needed to convert values from one data type to the other. Distinct user-defined data types are created by executing the CREATE DISTINCT TYPE SQL statement. The basic syntax for this statement is:

CREATE DISTINCT TYPE [DataTypeName] AS [SourceDataType] <WITH COMPARISONS>

where:

DataTypeName    Identifies the name that is to be assigned to the distinct user-defined data type that is to be created.

SourceDataType    Identifies the built-in data type that the distinct user-defined data type is to be derived from. (Table 8-1 contains a list of valid built-in data type definitions.)

Note: Although basic syntax is presented for most of the SQL statements covered in this chapter, the actual syntax supported may be much more complex. To view the complete syntax for a specific SQL statement or to obtain more information about a particular statement, refer to the IBM DB2 Universal Database, Version 8 SQL Reference Volume 2 product documentation.

So, if you wanted to create a distinct user-defined data type named MONEY that is based on the DECIMAL built-in data type, you could do so by executing a CREATE DISTINCT TYPE SQL statement that looks something like this:

CREATE DISTINCT TYPE MONEY AS DECIMAL(11,2)
    WITH COMPARISONS
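To see what strong typing and the generated casting functions mean in practice, consider the following hypothetical fragment (the EMPLOYEE table and SALARY column are invented for illustration):

    CREATE TABLE employee
        (empno  INTEGER,
         salary MONEY)

    -- This comparison fails: a MONEY value cannot be compared
    -- directly with a DECIMAL literal
    -- SELECT empno FROM employee WHERE salary > 50000.00

    -- This comparison works: the generated MONEY(DECIMAL) casting
    -- function converts the literal to the distinct data type first
    SELECT empno FROM employee WHERE salary > MONEY(50000.00)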

Structured Data Types


A structured data type is a user-defined data type that contains one or more attributes, each of which has a name and data type of its own. A structured data type often serves as the data type of a typed table or view, in which case each column of the table or view derives its name and data type from an attribute of the structured data type. A structured data type can also be created as a subtype of another structured type (referred to as its supertype). In this case, the subtype inherits all the attributes of the supertype and can optionally add additional attributes of its own. Just as six comparison functions and two casting functions are automatically created when a distinct data type is created, six comparison functions (also called =, <>, <, <=, >, and >=) and two casting functions can be created when a structured data type is created. (By default, the automatic creation of these functions is suppressed.) When created, these comparison functions are used to compare references to the structured data type, but not to the data type values themselves. Likewise, the two casting functions are used to cast between the generated reference type of the structured data type and the underlying data type that the structured data type uses as a reference type. Structured data types are created by executing the CREATE TYPE SQL statement.
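As a sketch of what a structured type definition looks like, the following hypothetical statements create a root type with two attributes and a subtype that inherits them and adds one of its own (the type and attribute names are invented; the MODE DB2SQL clause shown is the form DB2 UDB's documentation uses for structured types):

    CREATE TYPE address_t AS
        (street VARCHAR(30),
         city   VARCHAR(20))
        MODE DB2SQL

    CREATE TYPE us_address_t UNDER address_t AS
        (zip CHAR(5))
        MODE DB2SQL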

Why Use User-defined Data Types?


With so many built-in data types available, you may be wondering why you would want to create your own. Some of the benefits associated with using user-defined data types include:

Extensibility. By defining new data types, you can increase the set of data types that are available to support your data storage needs.

Flexibility. You can specify semantics and behaviors for your new data types by using user-defined functions to augment the diversity of the data types available.

Consistent behavior. Strong typing ensures that different types of data are not combined in ways that do not make sense (for example, heights would not be compared with weights, although both might be based on floating-point numbers). User-defined data types are guaranteed to behave appropriately because only functions that have been defined for a specific user-defined data type can be applied to instances of that user-defined type. (On the other hand, DB2 UDB is already strongly typed, and using UDTs makes it even more restrictive: all functionality will have to be re-invented by hand if you need the power the built-in data types provide.)

Encapsulation. The set of functions and operators applied to a user-defined data type defines the behavior of that data type. This provides flexibility in implementation because applications do not have to be dependent on the internal representation used by a particular data type.

Performance. Because distinct data types are internally represented in the same way as built-in data types, they share the same efficient code that is used to implement comparison operators and casting functions for the built-in data types available.

User-defined Functions
Earlier, we saw that when a distinct data type is created, six comparison functions and two casting functions are also created. These functions allow simple comparison operations to be performed and provide a way to seamlessly move data between a distinct data type and the builtin type on which it is based. However, other operations that might apply to a built-in data type are not automatically inherited. That's where user-defined functions (UDFs) come into play; userdefined functions are special objects that are used to extend and enhance the support provided by the built-in functions available with DB2 UDB. Like user-defined data types, user-defined functions (or methods) are created and named by a database user. However, unlike DB2 UDB's built-in functions, user-defined functions can take advantage of system calls and DB2 UDB's administrative APIs, thereby providing more synergy between applications and databases. Five types of user-defined functions can be created: Sourced (or Template). This type of function is implemented by invoking another function that is already registered in the database. (It is possible to create a partial function, called a function template, that defines what types of values are to be returned but contains no executable code. The function template is then mapped to a data source function within a federated system so that the data source function can be invoked from a federated database. A function template can be registered only with an application server that is designated as a federated server.) SQL Scalar, Table, or Row. This type of function, written entirely in SQL, can be scalar in nature (scalar functions return a single value and can be specified in an SQL statement wherever a regular expression can be used) or can return a single row or an entire table. External Scalar. This type of function, written using a high-level programming language, is scalar in nature. The function itself resides in an external library and is registered in the database, along with any related attributes. External Table. This type of function, written using a high-level programming language, returns a table to the SQL statement that references it and can only be specified in the FROM clause of a SELECT statement. Again, the function itself resides in an external library and is registered in the database, along with any related attributes. OLE DB External Table. This type of function, written using a high-level programming language, returns a table from an OLE DB provider to the SQL statement that references it. Like an external table function, an OLE DB external table function can only be specified in the FROM clause of a SELECT statement. The function resides in an external library and is registered in the database, along with any related attributes. User-defined functions are created (or registered) by executing the CREATE FUNCTION SQL statement. Several flavors of this statement are available, and the appropriate form to use is determined by the type of user-defined function to be created. (We'll take a look at each form as we examine each function type available.)

Sourced Functions
A sourced function is constructed from a function that has already been registered with a database (referred to as the source function). Sourced functions can be columnar, scalar, or table in nature, or they can be designed to overload a specific operator such as +, -, *, and /. When a sourced function is invoked, all arguments passed to it are converted to the data types that are expected by the underlying source function, and the source function itself is invoked. Upon completion, the source function performs any conversions necessary on the results produced and returns control to the calling SQL statement.

The most common use of sourced functions is to enable a user-defined distinct data type to selectively inherit some of the semantics of the built-in data type on which it is based. (Often, a user-defined function that is sourced on an existing function for the purpose of providing that function's support to a user-defined data type is given the same name as the sourced function used. This allows users to invoke the same function with a user-defined distinct type without realizing that an additional definition was required. Generally, the same name can be used for more than one function if there is some difference in each function's signature.) The basic syntax for the form of the CREATE FUNCTION statement used to create a sourced function looks something like this:

CREATE FUNCTION [FunctionName] ( <[ParameterName]> [InputDataType] ,...> )
    RETURNS [OutputDataType]
    <SPECIFIC [SpecificName]>
    SOURCE [SourceFunction] <([DataType] ,...)>
    <AS TEMPLATE>

where:

FunctionName    Identifies the name to be assigned to the sourced function that is to be created.

ParameterName    Identifies the name to be assigned to one or more function parameters.

InputDataType    Identifies the type of data the function expects to receive for the parameter identified by ParameterName.

OutputDataType    Identifies the type of data the function is to return.

SpecificName    Identifies the specific name to be assigned to the user-defined function. This name can be used later to refer to or delete (drop) the function; however, it cannot be used to invoke the function.

SourceFunction    Identifies the name that has been assigned to the existing function that is to be used to create the sourced function.

DataType    Identifies the type of data that the existing function expects to receive for each parameter required.

Note: When a specific name is assigned to a user-defined function, the function can be dropped by referencing the specific name in a special form of the DROP SQL statement (DROP SPECIFIC FUNCTION [SpecificName]). However, if no specific name is assigned to a user-defined function, both the function name and the function's signature (a list of the data types used by each of the function's parameters, enclosed in parentheses) must be provided as input to the DROP FUNCTION statement.

If the AS TEMPLATE clause is specified when this form of the CREATE FUNCTION statement is used, a function template will be produced. The function template must then be mapped to a data source function within a federated system before the data source function can be invoked from a federated database.

So, if you created a distinct data type named YEAR that is based on the INTEGER built-in data type and wanted to create a sourced function named AVG() that accepts and returns a YEAR value and is based on the AVG() built-in function (which accepts and returns an INTEGER value), you could do so by executing a CREATE FUNCTION SQL statement that looks something like this:

CREATE FUNCTION AVG(YEAR) RETURNS YEAR
    SOURCE SYSIBM.AVG(INTEGER)
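Once registered, the sourced function is invoked exactly like the built-in function it shadows. A hypothetical usage, assuming a STAFF table whose HIRE_YEAR column was defined with the distinct data type YEAR (both names are invented for illustration):

    SELECT AVG(hire_year) FROM staff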

SQL Functions
Although a sourced function is constructed from a function that already exists, an SQL function is constructed from the ground up, using only SQL statements. An SQL function can return a single scalar value, a table, or a row, depending on how it has been defined. The basic syntax for the form of the CREATE FUNCTION statement used to create an SQL function looks something like this:

CREATE FUNCTION [FunctionName] ( <[ParameterName]> [InputDataType] ,...> )
    RETURNS [[OutputDataType] |
        TABLE ( [ColumnName] [ColumnDataType],... ) |
        ROW ( [ColumnName] [ColumnDataType],... )]
    <SPECIFIC [SpecificName]>
    <LANGUAGE SQL>
    <DETERMINISTIC | NOT DETERMINISTIC>
    <READS SQL DATA | CONTAINS SQL>
    <CALLED ON NULL INPUT>
    [SQLStatements] | RETURN [ReturnStatement]

where:

FunctionName    Identifies the name to be assigned to the SQL function that is to be created.

ParameterName    Identifies the name to be assigned to one or more function parameters.

InputDataType    Identifies the type of data the function expects to receive for the parameter identified by ParameterName.

OutputDataType    Identifies the type of data the function is to return.

ColumnName    Identifies the name to be assigned to one or more columns that the function is to return (if the function is designed to return a table or a row).

ColumnDataType    Identifies the type of data the function expects to return for the column identified by ColumnName.

SpecificName    Identifies the specific name to be assigned to the user-defined function. This name can be used later to refer to or delete (drop) the function; however, it cannot be used to invoke the function.

SQLStatements    Identifies one or more SQL statements that are to be executed when the function is called. Together, these statements act as a single dynamic compound SQL statement.

ReturnStatement    Identifies the RETURN SQL statement that is to be used to return to the application that called the function. (If the SQL function's body is comprised of a dynamic compound statement, it must contain at least one RETURN statement; otherwise, a RETURN statement must be executed when the function is called. If the function is a table or row function, it can contain only one RETURN statement, and that statement must be the last statement used.)

If the DETERMINISTIC clause is specified when this form of the CREATE FUNCTION statement is used, it is implied that the function will always return the same scalar value, table, or row when it is called with the same parameter values specified. However, if the NOT DETERMINISTIC clause is specified (or if neither clause is specified) it is implied that the function may return different results each time it is called with the same parameter values specified.

Thus, if you wanted to create a scalar SQL function named TANGENT() that accepts and returns a DOUBLE value using the built-in functions SIN() (sine) and COS() (cosine), you could do so by executing a CREATE FUNCTION SQL statement that looks something like this:

CREATE FUNCTION TANGENT (X DOUBLE)
    RETURNS DOUBLE
    LANGUAGE SQL
    CONTAINS SQL
    DETERMINISTIC
    RETURN SIN(X)/COS(X)
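Once created, the function can be referenced anywhere a built-in scalar function could appear. For example, the following query (which uses the SYSIBM.SYSDUMMY1 one-row catalog view as a convenient dummy table) would return the tangent of 0.5:

    SELECT TANGENT(0.5) FROM SYSIBM.SYSDUMMY1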

External Scalar Functions


An external scalar function is a function that is written using a high-level programming language such as C, C++, or Java that returns a single value. Although the process of creating and implementing a sourced function or an SQL function is fairly simple, the process used to create and employ an external scalar function (or any external function, for that matter) is much more complex. To create any external function, the following steps must be performed:
1. Construct the body of the user-defined function, using a supported high-level programming language.
2. Compile the user-defined function.
3. Link the user-defined function to create a library (or dynamic-link library).
4. Debug the user-defined function and repeat steps 2 through 4 until all problems have been resolved.
5. Physically store the library containing the user-defined function on the server workstation, and modify the system permissions for the library file containing the user-defined function so that all users can execute it. (In a UNIX environment, the chmod command is used to make a file executable; in a Windows environment, the attrib command is used for the same purpose.)
6. Register the user-defined function with a DB2 database using the appropriate form of the CREATE FUNCTION SQL statement.

Once these steps have been completed, the resulting user-defined function can be used in the same manner in which any other scalar or table function can be used. Because the body of an external scalar function is executed by a high-level programming language rather than by the DB2 Database Manager, each value provided as a function argument must be converted from an SQL data type to an appropriate high-level programming language data type before it can be used. (If a function is defined such that one of its parameters expects a user-defined distinct data type, values for that parameter are converted to the appropriate built-in data type before they are passed to the external function.) Likewise, any value that is to be returned by the function must be converted from its high-level programming language data type to the appropriate SQL data type before it can be returned.

The basic syntax for the form of the CREATE FUNCTION statement used to register an external scalar function (that has been created as outlined above) looks something like this:

CREATE FUNCTION [FunctionName] ( <[ParameterName]> [InputDataType] ,...> )
    RETURNS [OutputDataType]
    <SPECIFIC [SpecificName]>
    EXTERNAL <NAME [ExternalName] | [Identifier]>
    LANGUAGE [C | JAVA | OLE]
    PARAMETER STYLE [DB2GENERAL | JAVA | SQL]

    <DETERMINISTIC | NOT DETERMINISTIC>
    <FENCED | NOT FENCED>
    <RETURNS NULL ON NULL INPUT | CALLED ON NULL INPUT>
    <NO SQL | CONTAINS SQL | READS SQL DATA>

where:

FunctionName    Identifies the name to be assigned to the external scalar function that is to be created.

ParameterName    Identifies the name to be assigned to one or more function parameters.

InputDataType    Identifies the type of data the function expects to receive for the parameter identified by ParameterName.

OutputDataType    Identifies the type of data the function is to return.

SpecificName    Identifies the specific name to be assigned to the user-defined function. This name can be used later to refer to or delete (drop) the function; however, it cannot be used to invoke the function.

ExternalName    Identifies the name of the library and the name of the function in the library that contains the executable code of the external function being registered. (We'll take a closer look at how this name is constructed shortly.)

Identifier    Identifies the name of the library that contains the executable code of the external function being registered, but only if the procedure was written using C or C++. The DB2 Database Manager will look for a function that has the same name as the library name specified.

As you can see, this form of the CREATE FUNCTION statement contains several additional clauses that are not found in the other forms of the CREATE FUNCTION statement that we have already looked at. In many cases, it is not immediately apparent what information these clauses are trying to convey, so before we look at an example, let's examine some of these clauses in detail. EXTERNAL <NAME [ExternalName] | [Identifier]>. This clause is used to identify two things: the name of the library and, optionally, the name of the function within the library that contains the executable code for the userdefined function being registered. The high-level programming language used to construct the body of any external user-defined function determines how these names are provided. For example, if an external user-defined function was developed using the C or C++ programming language, the names of the library and function within the library that contains the body of the function can be specified in four different ways: LibraryName LibraryName ! FunctionName AbsolutePath AbsolutePath ! FunctionName If a library name is provided instead of an absolute path, the DB2 Database Manager will look in the /sqllib/function and /sqllib/function/unfenced subdirectories (\sqllib\function and \sqllib\function\unfenced subdirectories on Windows) for the library name specified. On the other hand, if an absolute path is provided, the name of the library must be appended to the path, and the DB2 Database Manager will look in the location specified for the appropriate library. (If neither a library name nor an absolute path is provided, the DB2 Database Manager will look in the default subdirectories shown earlier for a library and function that has the same name as the name that is to be assigned to the user-defined function being registered.) If a function name is provided, the DB2 Database Manager will look for a function that has the name specified within the library specified; if no function name is provided, the DB2 Database Manager will look for a function that has the same name as the library name specified. PARAMETER STYLE [DB2GENERAL | JAVA | SQL]. This clause is used to identify the parameter passing style that the user-defined function expects the calling application to use when passing values to it. As you can see, three parameter passing styles are available:

DB2GENERAL. Values are passed and returned using the calling conventions that are used to call a method in a Java class. (Can only be used when the external user-defined function is written using Java.)

JAVA. Values are passed and returned using calling conventions that conform to the Java language and SQLJ specifications. (Can only be used when the external user-defined function is written using Java.)

SQL. Values are passed and returned using calling conventions that are defined in the SQL/Persistent Stored Modules ISO working draft; along with the arguments identified, the following are passed to the function when it is called: a null indicator for each parameter passed, a placeholder for the SQLSTATE to be returned in, the qualified name of the function, the specific name of the function, and a placeholder for the SQL diagnostic string to be returned in. (Can only be used when the external user-defined function is written using C/C++ or OLE.)

<FENCED | NOT FENCED>. This clause is used to identify whether the external user-defined function is considered "safe" enough to be run in the DB2 Database Manager operating environment's process/address space (NOT FENCED) or not (FENCED). If the FENCED clause is specified (or if neither clause is specified), the DB2 Database Manager will not allow the function to access its internal resources.

<NO SQL | CONTAINS SQL | READS SQL DATA>. This clause is used to identify which types of SQL statements have been coded in the body of the external user-defined function. Three different values are available:

NO SQL. The body of the external user-defined function either does not contain any SQL or it contains non-executable SQL statements. (Examples of non-executable SQL statements are the INCLUDE and WHENEVER statements.)

CONTAINS SQL. The body of the external user-defined function contains executable SQL statements that neither read nor modify data.

READS SQL DATA. The body of the external user-defined function contains executable SQL statements that read but do not modify data.

Thus, if you wanted to register an external scalar function named CENTER() that accepts two values (an INTEGER value and a DOUBLE value), returns a DOUBLE value, and is stored in a library named "double" that resides in the directory "/home/db2inst1/myfuncs", you could do so by executing a CREATE FUNCTION SQL statement that looks something like this:

CREATE FUNCTION CENTER (INT, DOUBLE)
    RETURNS DOUBLE
    EXTERNAL NAME '/home/db2inst1/myfuncs/double'
    LANGUAGE C
    PARAMETER STYLE SQL
    DETERMINISTIC
    NO SQL
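Once registered, CENTER() can be invoked anywhere a built-in scalar function could appear. The query below is a minimal sketch of such an invocation; the table T1 and its columns C1 and C2 are hypothetical names used only for illustration:

-- T1, C1, and C2 are hypothetical; C1 must hold INTEGER values and
-- C2 DOUBLE values to match the function's signature
SELECT CENTER(C1, C2) FROM T1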

External Table Functions


Like external scalar functions, external table functions are written using a high-level programming language. But whereas an external scalar function returns a single scalar value, an external table function returns a result data set each time it is invoked. External table functions are powerful because they allow you to make almost any source of data appear to be a DB2 UDB base table. Furthermore, the result data set returned by an external table function can be used in join operations, grouping operations, set operations (for example, UNIONs), and any other operation that can be applied to a read-only view. The basic syntax for the form of the CREATE FUNCTION statement that is used to register an external table function looks something like this:

CREATE FUNCTION [FunctionName]
    ( <[ParameterName]> [InputDataType] , ... )
    RETURNS TABLE ( [ColumnName] [ColumnDataType] , ... )
    <SPECIFIC [SpecificName]>
    EXTERNAL <NAME [ExternalName] | [Identifier]>
    LANGUAGE [C | JAVA | OLE]
    PARAMETER STYLE [DB2GENERAL | SQL]
    <DETERMINISTIC | NOT DETERMINISTIC>
    <FENCED | NOT FENCED>
    <RETURNS NULL ON NULL INPUT | CALLED ON NULL INPUT>
    <READS SQL DATA | NO SQL | CONTAINS SQL>

where:

FunctionName: Identifies the name to be assigned to the external table function that is to be created.
ParameterName: Identifies the name to be assigned to one or more function parameters.
InputDataType: Identifies the type of data the function expects to receive for the parameter identified by ParameterName.
ColumnName: Identifies the name to be assigned to one or more columns that the function is to return (if the function is designed to return a table or a row).
ColumnDataType: Identifies the type of data the function expects to return for the column identified by ColumnName.
SpecificName: Identifies the specific name to be assigned to the user-defined function. This name can be used later to refer to or delete (drop) the function; however, it cannot be used to invoke the function.
ExternalName: Identifies the name of the library and the name of the function in the library that contains the executable code of the external function being registered.
Identifier: Identifies the name of the library that contains the executable code of the external function being registered, but only if the function was written using C or C++. The DB2 Database Manager will look for a function that has the same name as the library name specified.

So, if you wanted to register an external table function named EMPDATA() that accepts two variable-length character string values, returns a result data set that contains employee information (retrieved from an ASCII file), and is stored in a library named "EMPDATA" that resides in the directory "/home/db2inst1/myfuncs", you could do so by executing a CREATE FUNCTION SQL statement that looks something like this:

CREATE FUNCTION EMPDATA (VARCHAR(30), VARCHAR(255))
    RETURNS TABLE (EMPID INT, LNAME CHAR(20), FNAME CHAR(20))
    EXTERNAL NAME '/home/db2inst1/myfuncs/EMPDATA'
    LANGUAGE C
    PARAMETER STYLE SQL
    NO SQL
    DETERMINISTIC
    NO EXTERNAL ACTION
    NOT FENCED
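Once registered, a table function is referenced in the FROM clause of a query by means of the TABLE() keyword, and a correlation name must be supplied. The query below is a minimal sketch; the two argument values are hypothetical placeholders, since their meaning is defined entirely by the function's implementation:

-- The argument values are hypothetical; the correlation name (E) is required
SELECT E.EMPID, E.LNAME, E.FNAME
    FROM TABLE(EMPDATA('EMPFILE', '/home/db2inst1/data/emp.txt')) AS E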

OLE DB External Table Functions


Microsoft OLE DB is a set of Application Programming Interfaces (APIs) that are designed to provide access to a wide variety of data sources. To OLE DB, a data source consists of the data itself, its associated Database Management System (DBMS), the platform on which the DBMS exists, and the network used to access that platform. OLE DB is designed to provide access to all types of data in an OLE Component Object Model (COM) environment. Along with the functionality provided with CLI/ODBC, OLE DB defines interfaces that are suitable for gaining access to data that cannot be accessed via SQL. OLE DB facilitates application integration by defining a set of standard interfaces, which are simply groupings of semantically related functions through which one application accesses the services of another. Interfaces are the binary standard for component-object interaction, and each interface contains a set of functions that defines a contract between the object implementing the interface (the provider) and the client using the interface (the consumer). Two classes of OLE DB providers exist: OLE DB data providers, which own data and expose their data in tabular format as a rowset, and OLE DB service providers, which do not own their own data but encapsulate some services by producing and consuming data through OLE DB interfaces.

Like external table functions, external OLE DB table functions are written using a high-level programming language. However, with OLE DB table functions, a generic built-in OLE DB consumer can be used to interface with any OLE DB provider to access data; you need only register an OLE DB table function and refer to the appropriate OLE DB provider as the data source. No additional programming is needed. And like external table functions, the table returned by an external OLE DB table function can be used in joins, grouping operations, set operations (for example, UNIONs), and any other operation that can be applied to a read-only view. For example, you can define an OLE DB table function to return a table from a Microsoft Access database or a Microsoft Exchange address book and then create a report that seamlessly combines data from this OLE DB table with data in a DB2 UDB database.

Note: To use OLE DB table functions with DB2 Universal Database, you must install OLE DB 2.0 or later, which is available from Microsoft. Refer to your data source documentation for more information about the system requirements and OLE DB providers available for your data sources.

The basic syntax for the form of the CREATE FUNCTION statement used to register an OLE DB external table function looks something like this:

CREATE FUNCTION [FunctionName]
    ( <[ParameterName]> [InputDataType] , ... )
    RETURNS TABLE ( [ColumnName] [ColumnDataType] , ... )
    <SPECIFIC [SpecificName]>
    EXTERNAL <NAME [ExternalName]>
    LANGUAGE OLEDB
    <DETERMINISTIC | NOT DETERMINISTIC>
    <RETURNS NULL ON NULL INPUT | CALLED ON NULL INPUT>

where:

FunctionName: Identifies the name to be assigned to the OLE DB external table function that is to be created.
ParameterName: Identifies the name to be assigned to one or more function parameters.
InputDataType: Identifies the type of data the function expects to receive for the parameter identified by ParameterName.
ColumnName: Identifies the name to be assigned to one or more columns the function is to return (if the function is designed to return a table or a row).
ColumnDataType: Identifies the type of data the function expects to return for the column identified by ColumnName.
SpecificName: Identifies the specific name to be assigned to the user-defined function. This name can be used later to refer to or delete (drop) the function; however, it cannot be used to invoke the function.
ExternalName: Identifies the name of the external table being referenced and the name of the OLE DB provider.

The syntax used to specify the external table name and the OLE DB provider is:

'[Server]!<Rowset>'

or

'!<Rowset>![ConnectString] <!COLLATING_SEQUENCE = [N | Y]>'

where:

Server: Identifies the local name of a data source as defined by the CREATE SERVER SQL statement.
Rowset: Identifies the rowset (table) that is exposed by the OLE DB provider.
ConnectString: Specifies a connection string that identifies the initialization properties that are needed to connect to a data source via an OLE DB provider. This string consists of a series of keyword=value pairs that are similar to those used to construct a connection string for the CLI/ODBC function SQLDriverConnect().

The <!COLLATING_SEQUENCE = [N | Y]> option is used to specify whether the data source being accessed uses the same collating sequence as DB2 UDB. Thus, if you wanted to register an OLE DB external table function named ORDERS() that retrieves order information from a Microsoft Access database, you could do so by executing a CREATE FUNCTION SQL statement that looks something like this:

CREATE FUNCTION ORDERS ()
    RETURNS TABLE (ORDERID INTEGER, CUSTOMERID CHAR(5),
        EMPLOYEEID INTEGER, ORDERDATE TIMESTAMP,
        REQUIREDDATE TIMESTAMP, SHIPPEDDATE TIMESTAMP,
        SHIPCHARGES DEC(19,4))
    LANGUAGE OLEDB
    EXTERNAL NAME '!orders!Provider=Microsoft.Jet.OLEDB.3.51;
        Data Source=c:\sqllib\samples\oledb\nwind.mdb'
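Once registered, the ORDERS() function is queried through the TABLE() keyword just like any other table function. The query below is a minimal sketch; the columns selected are simply a subset of those declared in the RETURNS TABLE clause:

SELECT O.ORDERID, O.CUSTOMERID, O.ORDERDATE
    FROM TABLE(ORDERS()) AS O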

Why Use User-defined Functions?


Just as there are times when it is advantageous to create user-defined data types, there are times when it makes sense to create user-defined functions. Some of the benefits associated with creating user-defined functions include:

Better reuse of code. If a particular operation or action is needed by several users and/or applications, a user-defined function will enable each user or application to perform the desired operation without having to duplicate the code used to perform it. Additionally, because the DB2 Database Manager will take care of many data type conversions, once a function is created, the potential for it to be used in more ways than was originally conceived is high.

Increased availability. If the code used to perform a particular operation is embedded in an application, only the users of that application can perform the operation; interactive users, such as those who use the Command Line Processor (CLP) to access a database, cannot take advantage of the functionality that such an application provides. This is not the case when the code needed to perform the operation is incorporated into a user-defined function; both interactive users and application users can perform the operation or action once it has been embodied in a user-defined function.

Increased performance. An operation that has been embodied in a user-defined function will often execute faster than a similar operation that has been incorporated into an application, particularly when the operation is used to qualify data for further processing. For example, suppose you have a need to extract some information from a large object (LOB) data value. If you have a function that performs this operation, the extraction can take place directly at the database server, and only the extracted value will need to be passed to a user or a calling application. This is much more efficient than sending the entire LOB value to an application or user and requiring the extraction to take place on a client workstation.

Encapsulation. When you create a distinct data type, you are automatically provided with functions for casting between the distinct type and its source type, and in most cases, you are provided with comparison functions as well (depending on the source data type used). However, you must provide any additional behavior yourself. Because it is clearly desirable to keep the behavior of a distinct data type in the database, where all users of the distinct type can easily access its behaviors, user-defined functions can be used to encapsulate distinct data types with operations that are designed to be used only with them. (A simple example of such a function is sketched below.)
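For instance, a sourced UDF can give a distinct data type arithmetic behavior by borrowing an existing built-in function. The statement below is a minimal sketch that overloads the addition operator for a hypothetical SPEED_LIMIT distinct type, assumed to have been created from the DECIMAL(5,0) source type:

-- SPEED_LIMIT is a hypothetical distinct type based on DECIMAL(5,0)
CREATE FUNCTION "+" (SPEED_LIMIT, SPEED_LIMIT)
    RETURNS SPEED_LIMIT
    SOURCE SYSIBM."+" (DECIMAL(5,0), DECIMAL(5,0))

Because the new function is sourced from the built-in addition operator, no new logic has to be written; DB2 simply applies the existing DECIMAL arithmetic to values of the distinct type.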

Stored Procedures
Earlier, we saw that when you set up a remote DB2 UDB database server and access it from one or more DB2 UDB client workstations, you have, in essence, established a basic DB2 UDB client/server environment. In such an environment, each time an SQL statement is executed against the database on the remote server, the statement itself is sent through a network from the client workstation to the database server. The database server then processes the statement, and the results are sent back, again through the network, to the client workstation. (This means that two messages must go through the network for every SQL statement that is executed.)

To take advantage of this architecture, client/server application development focuses on breaking an application into two separate parts, storing those parts on two different platforms (the client and the server), and having them communicate with each other as the application executes. This allows the code that interacts directly with a database to reside on a database server or midrange computer, where computing power and centralized control can be used to provide quick, coordinated data access. At the same time, the application logic can reside on one or more smaller (client) workstations so it can make effective use of all the resources the client workstation has to offer without causing a bottleneck at the server. Figure 8-1 illustrates how such an application works in a DB2 UDB client/server environment.

If you have an application that contains one or more transactions that perform a relatively large amount of database activity with little or no user interaction, each transaction can be stored on the database server as a stored procedure. With a stored procedure, all database processing done by the transaction can be performed directly at the database server. And because a stored procedure is invoked by a single SQL statement, fewer messages have to be transmitted across the network; only the data that is actually needed at the client workstation has to be sent across. Figure 8-2 illustrates how a DB2 UDB application using a stored procedure works in a client/server environment. Client/server applications that use stored procedures have the following advantages over client/server applications that do not:

Figure 8-1: How client/server applications work in a DB2 UDB client/server environment.

Reduced network traffic. Messages are not sent across the network for SQL statements that are coded in a stored procedure. If a stored procedure is designed correctly, only data that is needed by the client application will be sent across the network.

Improved performance of server-intensive work. Because less data has to be sent across the network, and because processing is done right at the server, complex queries and other server-intensive work can execute faster.

Ability to separate and reuse business logic. When business rules are incorporated into stored procedures, the logic can be reused multiple times by simply calling the stored procedure as needed. Because the logic used in a stored procedure can be modified independently, applications do not have to be recoded when business rules change.

Ability to access features that exist only on the server. Because stored procedures run directly on the server workstation, they can take advantage of any extra memory, faster processor(s), etc. that the database server might have. (Typically, database servers have more memory and multiple or faster processors than client workstations.) Additionally, stored procedures can take advantage of DB2's set of administrative APIs, which can be run only at the server. Finally, because stored procedures are not restricted to performing only database-related activities, they can take advantage of any additional software that has been installed on the server workstation.

Figure 8-2: How DB2 UDB applications using stored procedures work in a client/server environment.

Applications that use stored procedures do so under the restriction that all input data must be passed from the application to the stored procedure at invocation time and that any result data sets produced by the stored procedure will not be returned to the application until the stored procedure completes execution. In other words, no interaction can occur between the application and the stored procedure while the stored procedure is running.

Developing and Registering Stored Procedures


Just as there are different types of user-defined functions, there are two different types of stored procedures available: SQL and external. As the name implies, an SQL stored procedure is composed entirely of SQL statements. An external stored procedure, however, is constructed using a high-level programming language such as C, C++, Java, or COBOL. Regardless of how they are constructed, all stored procedures must be structured such that they perform three distinct tasks: First, the procedure must accept input parameter values, if any, from the calling application. Next, the procedure must perform whatever processing is appropriate. (Typically, this involves executing one or more SQL statements within a single transaction.) Finally, the procedure must return output data, if any, to the calling application. At the very least, a stored procedure should always return a value that indicates its success or failure.

Stored procedures are created by executing the CREATE PROCEDURE SQL statement. Two forms of this statement are available, and the appropriate form to use is determined by the type of stored procedure that is to be created. (We'll take a look at each form as we examine each type of procedure available.)

SQL Stored Procedures


As mentioned earlier, an SQL stored procedure is a procedure that consists entirely of SQL statements. The basic syntax for the form of the CREATE PROCEDURE statement that is used to create an SQL stored procedure looks something like this:

CREATE PROCEDURE [ProcedureName]
    ( [ParamType] [ParamName] [DataType] , ... )
    <SPECIFIC [SpecificName]>
    <DYNAMIC RESULT SETS [NumResultSets]>
    <NO SQL | CONTAINS SQL | READS SQL DATA>
    <DETERMINISTIC | NOT DETERMINISTIC>
    <CALLED ON NULL INPUT>
    <LANGUAGE SQL>
    [SQLStatement]

where:

ProcedureName: Identifies the name to be assigned to the procedure to be created.
ParamType: Indicates whether the parameter identified by ParamName is an input parameter (IN), an output parameter (OUT), or both an input parameter and an output parameter (INOUT). (Valid values include IN, OUT, and INOUT.)
ParamName: Identifies the name to be assigned to a procedure parameter.
DataType: Identifies the type of data the procedure expects to receive or send for the parameter identified by ParamName.
SpecificName: Identifies the specific name to be assigned to the stored procedure. This name can be used later to comment on the stored procedure or drop the stored procedure; however, it cannot be used to invoke the stored procedure.
NumResultSets: Identifies whether the stored procedure being registered returns result data sets and, if so, how many.
SQLStatement: Identifies a single SQL statement or a compound SQL statement (i.e., two or more SQL statements enclosed with the keywords BEGIN ATOMIC and END and terminated with a semicolon) that is to be executed when the stored procedure is invoked.

Note: When a specific name is assigned to a stored procedure, the procedure can be dropped by referencing the specific name in a special form of the DROP SQL statement (DROP SPECIFIC PROCEDURE [SpecificName]). However, if no specific name is assigned to a stored procedure, both the procedure name and the procedure's signature (a list of the data types used by each of the stored procedure's parameters) must be provided as input to the DROP PROCEDURE statement.
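For example, assuming one procedure was registered with the specific name SALESPROC and another, named GET_TOTALS, was registered without a specific name (both names are hypothetical and used only for illustration), the two procedures could be dropped like this:

-- Drop by specific name; no signature is required
DROP SPECIFIC PROCEDURE SALESPROC

-- Drop by procedure name; the parameter signature must be supplied
DROP PROCEDURE GET_TOTALS (INTEGER, CHAR(5))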

As you can see, some of the clauses that are used with this form of the CREATE PROCEDURE SQL statement are identical to those used by the CREATE FUNCTION SQL statement. A simple SQL stored procedure could be created by executing a CREATE PROCEDURE statement that looks something like this:

CREATE PROCEDURE GET_SALES
    (IN QUOTA INTEGER, OUT RETCODE CHAR(5))
    DYNAMIC RESULT SETS 1
    LANGUAGE SQL
BEGIN
    DECLARE SQLSTATE CHAR(5);
    DECLARE SALES_RESULTS CURSOR WITH RETURN FOR
        SELECT SALES_PERSON, SUM(SALES) AS TOTAL_SALES
        FROM SALES
        GROUP BY SALES_PERSON
        HAVING SUM(SALES) > QUOTA;
    DECLARE EXIT HANDLER FOR SQLEXCEPTION
        SET RETCODE = SQLSTATE;
    OPEN SALES_RESULTS;
    SET RETCODE = SQLSTATE;
END

The resulting SQL stored procedure, called GET_SALES, accepts an integer input value (in an input parameter called QUOTA) and returns a character value (in an output parameter called RETCODE) that reports the procedure's success or failure. The procedure body consists of a compound SQL statement that returns a result data set (i.e., an open cursor) containing the name and total sales figures for each salesperson whose total sales exceed the specified quota. This is done by:

1. Indicating that the SQL procedure is to return a result data set by specifying the DYNAMIC RESULT SETS clause of the CREATE PROCEDURE statement and assigning it the value 1.
2. Declaring a cursor within the procedure body (using the WITH RETURN FOR clause) for the result data set that is to be returned. (Earlier, we saw that a cursor is a named control structure that points to a specific row within a set of rows and is used by an application program to retrieve values from this set of rows.)
3. Opening the cursor, which produces the result data set that is to be returned.
4. Leaving the cursor open when the SQL procedure ends.

It is important to note that when an SQL stored procedure is used to implement a business rule, the logic used to apply that business rule can be incorporated into any application simply by invoking the stored procedure. Thus, the same business rule logic is guaranteed to be enforced across all applications. When business rules change, only the logic in the SQL stored procedure needs to be changed; applications that call the procedure do not have to be modified. (The same can be said for external stored procedures, but the steps required to change the business logic coded in an external stored procedure are a little more complex.)

External Stored Procedures


An external stored procedure is a stored procedure that is written using a high-level programming language such as C, C++, or Java. Whereas SQL procedures offer rapid application development and considerable flexibility, external stored procedures can be much more powerful than SQL stored procedures because they can take advantage of system calls and administrative APIs, as well as execute SQL statements. However, this increase in functionality makes them more difficult to create; to create any external procedure, the following steps must be performed:

1. Construct the body of the stored procedure, using a supported high-level programming language.
2. Compile the stored procedure.
3. Link the stored procedure to create a library (or dynamic-link library).
4. Debug the stored procedure and repeat steps 2 through 4 until all problems have been resolved.
5. Physically store the library containing the stored procedure on the database server. By default, the DB2 Database Manager looks for external stored procedures in the /sqllib/function and /sqllib/function/unfenced subdirectories (\sqllib\function and \sqllib\function\unfenced subdirectories on Windows). Additionally, the system permissions for the library file containing the stored procedure must be modified so that all users can execute it. For example, in a UNIX environment, the chmod command is used to make a file executable; in a Windows environment, the attrib command is used for the same purpose.
6. Register the stored procedure with a DB2 database using the appropriate form of the CREATE PROCEDURE SQL statement.

The basic syntax for the form of the CREATE PROCEDURE statement used to register an external stored procedure (that has been created as outlined above) looks something like this:

CREATE PROCEDURE [ProcedureName]
    ( [ParamType] [ParamName] [DataType] , ... )
    <SPECIFIC [SpecificName]>
    <DYNAMIC RESULT SETS [NumResultSets]>
    <NO SQL | CONTAINS SQL | READS SQL DATA>
    <DETERMINISTIC | NOT DETERMINISTIC>
    <CALLED ON NULL INPUT>
    LANGUAGE [C | JAVA | COBOL | OLE]
    EXTERNAL <NAME [ExternalName] | [Identifier]>
    <FENCED <THREADSAFE | NOT THREADSAFE> | NOT FENCED <THREADSAFE>>
    PARAMETER STYLE [DB2GENERAL | DB2SQL | GENERAL | GENERAL WITH NULLS | JAVA | SQL]
    <PROGRAM TYPE [SUB | MAIN]>
    <DBINFO | NO DBINFO>

where:

ProcedureName: Identifies the name to be assigned to the procedure to be created.
ParamType: Indicates whether the parameter identified by ParamName is an input parameter (IN), an output parameter (OUT), or both an input parameter and an output parameter (INOUT). (Valid values include IN, OUT, and INOUT.)
ParamName: Identifies the name to be assigned to a procedure parameter.
DataType: Identifies the type of data the procedure expects to receive or send for the parameter identified by ParamName.
SpecificName: Identifies the specific name to be assigned to the stored procedure. This name can be used later to comment on or drop the stored procedure; however, it cannot be used to invoke the stored procedure.
NumResultSets: Identifies whether the stored procedure being registered returns result data sets and, if so, how many.
ExternalName: Identifies the name of the library, along with the name of the function in the library, that contains the executable code of the stored procedure being registered.
Identifier: Identifies the name of the library that contains the executable code of the stored procedure being registered, but only if the procedure was written using C or C++. The DB2 Database Manager will look for a function that has the same name as the library name specified.

Again, many of the clauses used with this form of the CREATE PROCEDURE SQL statement are similar to those used by the CREATE FUNCTION SQL statement. For example:

<DETERMINISTIC | NOT DETERMINISTIC>. This clause is used to identify whether the stored procedure will always return the same results when passed the same parameter values (DETERMINISTIC) or not (NOT DETERMINISTIC). (A stored procedure that applies a 15% increase to any value passed to it would be considered DETERMINISTIC, while a stored procedure that generates a unique ID using the TIMESTAMP_ISO() function would be considered NOT DETERMINISTIC.)

<CALLED ON NULL INPUT>. When this clause is used, the stored procedure will always be invoked, even if a null value is passed for one or more of its input parameters.

<NO SQL | CONTAINS SQL | READS SQL DATA | MODIFIES SQL DATA>. This clause is used to identify which types of SQL statements have been coded in the body of the external stored procedure. Four different values are available:

NO SQL. The body of the stored procedure either does not contain any SQL or it contains non-executable SQL statements. (Examples of non-executable SQL statements are the INCLUDE and WHENEVER statements.)

CONTAINS SQL. The body of the stored procedure contains executable SQL statements that neither read nor modify data.

READS SQL DATA. The body of the stored procedure contains executable SQL statements that read but do not modify data.

MODIFIES SQL DATA. The body of the stored procedure contains executable SQL statements that both read and modify data.

PARAMETER STYLE [DB2GENERAL | DB2SQL | GENERAL | GENERAL WITH NULLS | JAVA | SQL]. This clause is used to identify the parameter passing style that the stored procedure expects the calling application to use when passing values to it. As you can see, there are six parameter passing styles available:

DB2GENERAL. Values are passed and returned using the calling conventions that are used to call a method in a Java class. (Can only be used when the stored procedure is written using Java.)

DB2SQL. Values are passed and returned using calling conventions that are defined in the SQL/Persistent Stored Modules ISO working draft; along with the arguments identified, the following are passed to the stored procedure when it is called: a null indicator for each parameter passed, a placeholder for the SQLSTATE to be returned in, the qualified name of the stored procedure, the specific name of the stored procedure, and a placeholder for the SQL diagnostic string to be returned in. (Can only be used when the stored procedure is written using C/C++, COBOL, or OLE.)

GENERAL. Values are passed and returned exactly as they are specified when the stored procedure is invoked. (Can only be used when the stored procedure is written using C/C++ or COBOL.)

GENERAL WITH NULLS. Same as GENERAL, with one major difference: an additional argument containing a vector of null indicators is also passed to and returned from the stored procedure. (Can only be used when the stored procedure is written using C/C++ or COBOL.)

JAVA. Values are passed and returned using calling conventions that conform to the Java language and SQLJ specifications. (Can only be used when the stored procedure is written using Java.)

SQL. Same as DB2SQL.

<PROGRAM TYPE [SUB | MAIN]>. This clause is used to identify whether the stored procedure was defined as a main routine (MAIN) or as a subroutine (SUB). If the PROGRAM TYPE SUB clause is specified, the DB2 Database Manager will pass all parameter values to the stored procedure as separate arguments. In this case, the stored procedure can be assigned any name that conforms to the function naming conventions allowed by the high-level programming language used. On the other hand, if the PROGRAM TYPE MAIN clause is specified, the DB2 Database Manager will pass all parameter values to the stored procedure as a combination of an argument counter and an array of argument values. In this case, the DB2 Database Manager expects the name of the stored procedure to be "main."

<DBINFO | NO DBINFO>. This clause is used to identify whether information known by DB2 is to be passed to the stored procedure as an additional argument when it is invoked (DBINFO) or not (NO DBINFO). If the DBINFO clause is used, the DB2 Database Manager will pass a data structure that contains the following information to the stored procedure at the time it is invoked:

The name of the currently connected database.
The unique application ID that is established for each connection to the database.
The application run-time authorization ID.
The database code page.
The version, release, and modification level of the database server invoking the stored procedure.
The operating system being used by the server.

Thus, if you wanted to register an external stored procedure named EXTRACT_RESUME that is stored as a function named "ExResume" in a library named "SProc" that resides in the directory "C:\StoredProcs", you could do so by executing a CREATE PROCEDURE SQL statement that looks something like this:

EXEC SQL CREATE PROCEDURE EXTRACT_RESUME
    (IN FILENAME VARCHAR(255), IN EMPNO CHAR(6))
    SPECIFIC EXTRACT_RESUME
    DYNAMIC RESULT SETS 0
    EXTERNAL NAME 'C:\StoredProcs\SProc!ExResume'
    LANGUAGE C
    PARAMETER STYLE GENERAL
    DETERMINISTIC
    FENCED
    CALLED ON NULL INPUT
    PROGRAM TYPE SUB;

When this particular CREATE PROCEDURE SQL statement is executed, an external stored procedure will be registered that:

Has been assigned the name EXTRACT_RESUME.
Has one input parameter called FILENAME that expects a VARCHAR(255) data value and another input parameter called EMPNO that expects a CHAR(6) data value.
Has been assigned the specific name EXTRACT_RESUME.
Does not return a result data set.
Was constructed using the C or C++ programming language.
Expects calling applications to use the GENERAL style when passing parameters.
Will always return the same results if called with the same parameter values.
Is to be run outside the DB2 Database Manager operating environment's address space.
Is to be called, even if one of its parameters contains a null value.
Was written as a subroutine.
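Once registered, this procedure could be invoked with a CALL statement like the one shown below (the CALL statement itself is covered in the next section); the file name and employee number shown are hypothetical values chosen only for illustration:

-- Both argument values are hypothetical
CALL EXTRACT_RESUME ('C:\Temp\resume.txt', '000130')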

Calling a Stored Procedure


Once a stored procedure has been registered with a database (by executing the CREATE PROCEDURE SQL statement), that procedure can be invoked, either interactively, using a utility such as the Command Line Processor, or from a client application. Registered stored procedures are invoked by executing the CALL SQL statement. The basic syntax for this statement is:

CALL [ProcedureName] ( <[InputParameter] | [OutputParameter] | NULL> , ... )

where:

ProcedureName: Identifies the name assigned to the procedure to be invoked. (Remember, the procedure name, not the specific name, must be used to invoke the procedure.)
InputParameter: Identifies one or more parameter values that are to be passed to the procedure being invoked.
OutputParameter: Identifies one or more parameter markers or host variables that are to receive return values from the procedure being invoked.

Thus, the SQL procedure named GET_SALES that we created earlier could be invoked by connecting to the appropriate database and executing a CALL statement from the Command Line Processor that looks something like this:

CALL GET_SALES (25, ?)

The same procedure could be invoked from an embedded SQL application by executing a CALL statement that looks something like this:

CALL GET_SALES (25, :RetCode)

where RetCode is the name of a valid host variable. When this CALL statement is executed, the value 25 is passed to the input parameter named QUOTA, and the question mark (?) or the host variable named RetCode is used as a placeholder for the value that will be returned in the output parameter named RETCODE.
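In an embedded SQL application, the host variable used in the CALL statement must first be declared in the application's declare section. The fragment below is a minimal sketch of that sequence in C; error checking has been omitted for brevity:

EXEC SQL BEGIN DECLARE SECTION;
    char RetCode[6];   /* receives RETCODE: CHAR(5) plus a null terminator */
EXEC SQL END DECLARE SECTION;

/* Invoke the GET_SALES procedure, passing 25 as the quota */
EXEC SQL CALL GET_SALES (25, :RetCode);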

The Development Center


As you might imagine, the complexity of developing, debugging, and deploying user-defined functions and stored procedures increases as the amount of work a user-defined function or stored procedure is expected to do increases. However, this complexity can be greatly reduced when user-defined functions and stored procedures are developed using a special tool known as the Development Center.

The Development Center is an easy-to-use, interactive GUI application that provides users with a development environment that supports the entire DB2 UDB Family. With it, an application developer can focus on creating and testing user-defined functions and stored procedures without having to address the details of registering, building, and installing them on a DB2 server. (When you use the Development Center to build a user-defined function or a stored procedure, it compiles the source code on the client workstation [Java routines] or on the server [SQL routines], copies the source code and resulting library to the server, and registers the routine in the system catalog of the database being used.)

With the Development Center, users can:

Create, build, and deploy user-defined functions, including SQL scalar functions, SQL table functions, OLE DB table functions, functions that read MQSeries messages, and functions that extract data from XML documents. (You can manually create user-defined functions using the built-in editor or automate the process by using one of the Wizards available.)
Create, build, and deploy SQL and Java stored procedures. (You can manually create SQL and Java stored procedures using the built-in editor or automate the process by using one of the Wizards available.)
Debug SQL stored procedures using the integrated debugger.
Create and build structured data types.
View and work with database objects such as tables, triggers, and views.
Export and import routines and project information.

The Development Center can be launched as a separate application from the DB2 Universal Database program group, or it can be started as a separate process from the Control Center toolbar, Tools menu, or Stored Procedures folder. Figure 8-3 shows what the Development Center looks like on a Windows 2000 server.

Figure 8-3: The Development Center.

The Development Center also provides a DB2 Development Add-In for each of the following interactive development environments:

Microsoft Visual Studio
Microsoft Visual Basic
Microsoft Visual InterDev
IBM VisualAge for Java

Using these add-ins, application developers can quickly and easily access the features of the Development Center directly from a Microsoft or IBM VisualAge development tool.

Practice Questions
Question 1
Given the following type definition:

CREATE DISTINCT TYPE speed_limit AS DECIMAL(5,0) WITH COMPARISONS

Which of the following is the best choice for the implementation of the addition operation for two values of the SPEED_LIMIT data type?
A. An SQL scalar UDF
B. An external table UDF
C. A sourced UDF
D. An SQL row UDF

Question 2
Which of the following is NOT a specific benefit of using stored procedures in a client/server environment?
A. Ability to reduce data flow over the network
B. Ability to return user-defined messages
C. Ability to separate and reuse business logic
D. Ability to improve performance of server-intensive work

Question 3
The Development Center can be used to develop which of the following?
A. External scalar UDFs
B. External table UDFs
C. SQL and Java stored procedures
D. C/C++ and Java stored procedures

Question 4
Which of the following is NOT a valid parameter passing style for a UDF?
A. GENERAL
B. JAVA
C. SQL
D. DB2GENERAL

Question 5
Given the statement:

CREATE PROCEDURE proc1 (IN p1 INTEGER)
    RESULT SETS 1
    LANGUAGE SQL
P1:BEGIN
    DECLARE c1 CURSOR WITH RETURN FOR SELECT * FROM tab1;
    OPEN c1;
P1:END

Which of the following accurately describes the database object that will be created?
A. An SQL stored procedure that accepts an integer value as input and returns a cursor for all rows found in table TAB1 to the procedure caller
B. An external stored procedure that accepts an integer value as input and returns a cursor for all rows found in table TAB1 to the procedure caller
C. An SQL stored procedure that accepts an integer value as input and returns an array containing all rows found in table TAB1 to the procedure caller
D. An SQL stored procedure that accepts an integer value as input and returns nothing to the procedure caller because the TO CALLER clause does not follow the WITH RETURN clause of the DECLARE CURSOR statement

Question 6
Which of the following does NOT accurately describe the Development Center?
A. The Development Center is an interactive GUI application
B. The Development Center can be used to develop applications for the entire DB2 Family, including DB2 UDB for iSeries (AS/400) and DB2 for zSeries (z/OS, OS/390)
C. The Development Center precompiles and compiles the source code for an SQL stored procedure on the server, copies the source code and resulting library to the server, and registers the routine in the system catalog of the database being used
D. Because the Development Center cannot be used to develop C/C++ stored procedures, it does not interact with Microsoft Visual Studio

Question 7
Four different applications need to make a business measurement calculation that involves performing several different SQL operations. These applications will be run from a number of different client workstations that communicate with a single server workstation over a 10/100 Ethernet network. Which of the following should be used to perform this calculation while providing maximum application performance?
A. An SQL stored procedure
B. An SQL scalar UDF
C. An external scalar UDF
D. An external table UDF

Question 8
Given the following SQL stored procedure:

CREATE PROCEDURE myStoredProc (IN p1 INTEGER, IN p2 CHAR(5))
    LANGUAGE SQL
    SPECIFIC SProc
BEGIN
    IF (p1 = 0 OR p1 = 1) THEN
        UPDATE tab1 SET col1 = p2;
    END IF;
END

Which of the following will remove this stored procedure from the database?
A. DROP PROCEDURE myStoredProc()
B. DROP SPECIFIC PROCEDURE SProc
C. DROP PROCEDURE myStoredProc(p1, p2)
D. DROP SPECIFIC PROCEDURE myStoredProc()

Answers

Question 1
The correct answer is C. A sourced UDF is a user-defined function that is constructed from a function that has already been registered with a database (referred to as the source function). Sourced functions can be columnar, scalar, or table in nature, or they can be designed to overload a specific operator such as +, -, *, and /. (SQL scalar UDFs, external table UDFs, and SQL row UDFs cannot be used to overload the addition operator.)

Question 2
The correct answer is B. Client/server applications that use stored procedures have the following advantages over client/server applications that do not:
Reduced network traffic
Improved performance of server-intensive work
Ability to separate and reuse business logic
Ability to access features that exist only on the server

Question 3
The correct answer is C. With the Development Center, you can create, build, and deploy SQL scalar UDFs, SQL table UDFs, OLE DB UDFs, UDFs that read MQSeries messages, UDFs that extract data from XML documents, SQL stored procedures, and Java stored procedures. (The Development Center cannot be used to create external UDFs or stored procedures that are to be written in C/C++ or COBOL.)

Question 4
The correct answer is A. Three parameter passing styles can be used when registering external user-defined functions. They are:
DB2GENERAL. Values are passed and returned using the calling conventions that are used to call a method in a Java class. (Can be used only when the external user-defined function is written using Java.)
JAVA. Values are passed and returned using calling conventions that conform to the Java language and SQLJ specifications. (Can be used only when the external user-defined function is written using Java.)
SQL. Values are passed and returned using calling conventions that are defined in the SQL/Persistent Stored Modules ISO working draft; along with the arguments identified, the following are passed to the function when it is called: a null indicator for each parameter passed, a placeholder for the SQLSTATE to be returned in, the qualified name of the function, the specific name of the function, and a placeholder for the SQL diagnostic string to be returned in. (Can be used only when the external user-defined function is written using C/C++ or OLE.)

Question 5
The correct answer is A. Because the body of the stored procedure contains nothing but SQL statements, and because the LANGUAGE SQL clause was used with the CREATE PROCEDURE statement, the resulting stored procedure must be an SQL stored procedure. The statement (IN p1 INTEGER) that follows the procedure name indicates that the procedure accepts an integer value as input, and the DECLARE CURSOR statement indicates that all rows found in the table TAB1 are to be copied to a cursor. Finally, the RESULT SETS 1 clause of the CREATE PROCEDURE statement, coupled with the WITH RETURN clause of the DECLARE CURSOR statement, indicates that the cursor containing all rows found in the table TAB1 will be returned to the procedure caller when the procedure ends.

Question 6
The correct answer is D. The Development Center provides a DB2 Development Add-In for each of the following interactive development environments:
Microsoft Visual Studio
Microsoft Visual Basic
Microsoft Visual InterDev
IBM VisualAge for Java
Using these add-ins, application developers can quickly and easily access the features of the Development Center directly from a Microsoft or IBM VisualAge development tool.

Question 7
The correct answer is A. In this case, an SQL stored procedure will eliminate duplication of code and will reduce network traffic over a relatively slow network.

Question 8
The correct answer is B. When a specific name is assigned to a stored procedure, the procedure can be dropped by referencing the specific name in a special form of the DROP SQL statement (DROP SPECIFIC PROCEDURE [SpecificName]). However, if no specific name is assigned to a stored procedure, both the procedure name and the procedure's signature (a list of the data types used by each of the stored procedure's parameters) must be provided as input to the DROP PROCEDURE statement.
