Pam Odden
1 12/11/2002
Objectives
What is an inner join?
How is an outer join different?
Which one should I use?
Joining a table to itself
Coding considerations for joining multiple tables
The process of forming pairs of rows by matching the contents of related columns is called joining the tables. The resulting table, which contains data from both of the original tables, is called a join between the two tables. An inner join returns a row for every pair of rows that are matched by the related columns.
For example, to see a list of students and their emergency contact numbers for a particular class, we can join the astu_student table of student information with the aemg_contact table of students' emergency contact information (see next slide).
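The statement this slide points to did not survive in this copy. As a rough sketch only, rebuilt from the comma-style query that appears later in the deck (same tables, columns, and selection criteria), an inner join in SQL2 notation could look like the following. It assumes the SSASIDB1.ASTU_STUDENT and SSASIDB1.AEMG_CONTACT tables exist as described, and it is not necessarily the author's exact statement.

```sql
-- Sketch: SQL2-style inner join, rebuilt from the comma-style query in this deck.
-- Join conditions go in the ON clause; selection criteria stay in the WHERE clause.
SELECT A.PERMNUM, A.FIRSTNAME, C.TELEPHONE1
  FROM SSASIDB1.ASTU_STUDENT A
 INNER JOIN SSASIDB1.AEMG_CONTACT C
    ON A.SCHOOLNUM = C.SCHOOLNUM
   AND A.PERMNUM   = C.PERMNUM
 WHERE A.SCHOOLNUM = '235'
   AND A.TRK       = '5'
   AND A.GRADE     = ' K';
```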
PERMNUM FIRSTNAME TELEPHONE1
400437  MARCUS    ( )
400439  TAYLOR    ( )
The select statement on the prior slide shows an inner join with SQL2 standard notation, which specifically uses the words INNER JOIN and puts the join conditions in the FROM clause. Before the SQL2 standard, inner joins were expressed more like a single-table select statement, with the tables in a list separated by commas and the join conditions included in the WHERE clause. The following query has the same result set as the one on the prior slide:
SELECT A.PERMNUM, A.FIRSTNAME, C.TELEPHONE1
FROM SSASIDB1.ASTU_STUDENT A, SSASIDB1.AEMG_CONTACT C
WHERE A.SCHOOLNUM = C.SCHOOLNUM
AND A.PERMNUM = C.PERMNUM
AND A.SCHOOLNUM = '235'
AND A.TRK = '5'
AND A.GRADE = ' K';
The SQL2 standard also includes other optional notation for inner joins which is not supported by DB2.
The inner join in the previous example is fine if we want only students with emergency contacts on our list. However, what if we wanted the list to include ALL students in the class, whether they have an emergency contact or not? Many students in the class are not listed in our inner join, because they are not paired with a matching emergency contact row.
SELECT COUNT(*)
FROM SSASIDB1.ASTU_STUDENT A
WHERE A.SCHOOLNUM = '235'
AND A.TRK = '5'
AND A.GRADE = ' K';
---------+---------+---------+
113
An outer join extends the standard inner join by retaining unmatched rows of one or both of the joined tables in the query results, and using null values for data from the other table.
Which table's unmatched rows are kept depends on the keywords LEFT or RIGHT OUTER JOIN. They literally refer to the table whose name would be on the left or right side of the JOIN keywords if the FROM clause were written out all on one line.
When the keywords FULL OUTER JOIN are used, unmatched rows from both tables are kept, with nulls in the columns of the table without a matching row. Full outer joins are not efficient in DB2 v5, so they are not used at CCSD.
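To make the LEFT and RIGHT keywords concrete, here is a minimal sketch that lists every student in the class, with or without an emergency contact. It reuses the tables, columns, and selection criteria from the inner join example and assumes they exist as described; it is an illustration, not necessarily the statement used in the original slides.

```sql
-- Sketch: ASTU_STUDENT is named to the left of the join keywords, so
-- LEFT OUTER JOIN keeps all of its rows. C.TELEPHONE1 is null for any
-- student who has no matching AEMG_CONTACT row.
SELECT A.PERMNUM, A.FIRSTNAME, C.TELEPHONE1
  FROM SSASIDB1.ASTU_STUDENT A
  LEFT OUTER JOIN SSASIDB1.AEMG_CONTACT C
    ON A.SCHOOLNUM = C.SCHOOLNUM
   AND A.PERMNUM   = C.PERMNUM
 WHERE A.SCHOOLNUM = '235'
   AND A.TRK       = '5'
   AND A.GRADE     = ' K';

-- The same request with the tables named in the opposite order:
-- ASTU_STUDENT is now on the right, so RIGHT OUTER JOIN keeps its rows.
SELECT A.PERMNUM, A.FIRSTNAME, C.TELEPHONE1
  FROM SSASIDB1.AEMG_CONTACT C
 RIGHT OUTER JOIN SSASIDB1.ASTU_STUDENT A
    ON A.SCHOOLNUM = C.SCHOOLNUM
   AND A.PERMNUM   = C.PERMNUM
 WHERE A.SCHOOLNUM = '235'
   AND A.TRK       = '5'
   AND A.GRADE     = ' K';
```

The selection criteria in the WHERE clause refer only to the table whose rows are preserved, so they do not discard the null-extended rows.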
464301 DEIRDRE ( )
465775 JOHNATHEN (702)
DSNE610I NUMBER OF ROWS DISPLAYED IS 113
Self Joins
Some multi-table queries involve a relationship that a table has with itself. For example, suppose an Employee table has the employee's manager's employee number as one of its columns. The manager's name and other information is a row in the table just like any other employee. If we wanted to list the employees with the name of their manager, we would match each employee's row with the manager's row in the same table:
SSN          NAME         MGR_SSN
111-11-1111  PAM ODDEN    222-22-2222
222-22-2222  KATHY JONES  333-33-3333
333-33-3333  PHIL BRODY   ???-??-????

SELECT EMPS.NAME, MGRS.NAME
FROM MHRMSDB1.EMPLOYEE EMPS
INNER JOIN MHRMSDB1.EMPLOYEE MGRS
ON EMPS.MGR_SSN = MGRS.SSN
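An inner self-join leaves out any employee whose manager cannot be matched. The sketch below shows this on a small scratch table and then uses a left outer self-join to keep such employees. The unqualified EMPLOYEE table, its column types, and the choice to store the unknown manager as a null are assumptions made for illustration only.

```sql
-- Hypothetical scratch table holding the three sample rows.
CREATE TABLE EMPLOYEE (
    SSN     CHAR(11)    NOT NULL,
    NAME    VARCHAR(30) NOT NULL,
    MGR_SSN CHAR(11)                -- assumed null when the manager is unknown
);

INSERT INTO EMPLOYEE (SSN, NAME, MGR_SSN) VALUES ('111-11-1111', 'PAM ODDEN',   '222-22-2222');
INSERT INTO EMPLOYEE (SSN, NAME, MGR_SSN) VALUES ('222-22-2222', 'KATHY JONES', '333-33-3333');
INSERT INTO EMPLOYEE (SSN, NAME, MGR_SSN) VALUES ('333-33-3333', 'PHIL BRODY',  NULL);

-- Inner self-join: only employees with a matching manager row are listed
-- (PAM ODDEN and KATHY JONES).
SELECT EMPS.NAME AS EMPLOYEE_NAME, MGRS.NAME AS MANAGER_NAME
  FROM EMPLOYEE EMPS
 INNER JOIN EMPLOYEE MGRS
    ON EMPS.MGR_SSN = MGRS.SSN;

-- Left outer self-join: PHIL BRODY is also listed, with a null MANAGER_NAME.
SELECT EMPS.NAME AS EMPLOYEE_NAME, MGRS.NAME AS MANAGER_NAME
  FROM EMPLOYEE EMPS
  LEFT OUTER JOIN EMPLOYEE MGRS
    ON EMPS.MGR_SSN = MGRS.SSN;
```

The two aliases, EMPS and MGRS, let the same table play both roles; without them the column references would be ambiguous.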
Qualified column names are needed to eliminate ambiguous column references. In the inner join examples shown earlier, all column names are prefixed (or qualified) by A. or C. This is only required for columns that appear by the same name in both tables, schoolnum and permnum. However, it is good practice to qualify all column names.
Table aliases can be used in the FROM clause to simplify qualifying column names, and are needed when joining a table with itself.
SELECT * has a special meaning for multi-table queries. When used as we use it for a single-table query, it selects all columns of all tables in the query. It can also be used with a qualifier to mean all the columns of one table. The following query selects two columns from table A and all columns from table C.
SELECT A.PERMNUM, A.FIRSTNAME, C.*
FROM SSASIDB1.ASTU_STUDENT A
INNER JOIN SSASIDB1.AEMG_CONTACT C
ON . . .
Checking for nulls is necessary even for columns that don't allow nulls in the database. For any table that is not the first or major table in an outer join, columns will be null when there was no matching row in that table. In COBOL, a null indicator must be defined and checked, just as for any other possible null value.
Join using SQL instead of program logic. The DB2 optimizer can usually perform a join faster than a programming language can process separate cursors for each table.
Use joins instead of subqueries when possible. Even when only columns from one of the tables are needed, it is usually more efficient to use a join.
Join on clustered or indexed columns when possible, for better efficiency.
Use caution when using ORDER BY. If the columns in the ORDER BY clause are all from one table, DB2 might avoid a sort.
Provide as many search criteria as possible in addition to the join criteria. Additional criteria in the WHERE clause, preferably for each of the tables, give DB2 the best opportunity to rank the tables for joining in the most efficient manner.
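The first consideration above, nulls produced by an outer join, can be handled in the SQL itself. The following is a sketch under assumptions: it reuses the student and emergency contact tables from the earlier examples, assumes TELEPHONE1 is a character column, and uses the literal 'NO CONTACT' purely as an illustrative display value.

```sql
-- Sketch 1: substitute a display value where the outer join produced a null.
SELECT A.PERMNUM, A.FIRSTNAME,
       COALESCE(C.TELEPHONE1, 'NO CONTACT') AS TELEPHONE1
  FROM SSASIDB1.ASTU_STUDENT A
  LEFT OUTER JOIN SSASIDB1.AEMG_CONTACT C
    ON A.SCHOOLNUM = C.SCHOOLNUM
   AND A.PERMNUM   = C.PERMNUM
 WHERE A.SCHOOLNUM = '235'
   AND A.TRK       = '5'
   AND A.GRADE     = ' K';

-- Sketch 2: list only the students with no emergency contact row.
-- C.PERMNUM is a join column, so it is null only when no match was found.
SELECT A.PERMNUM, A.FIRSTNAME
  FROM SSASIDB1.ASTU_STUDENT A
  LEFT OUTER JOIN SSASIDB1.AEMG_CONTACT C
    ON A.SCHOOLNUM = C.SCHOOLNUM
   AND A.PERMNUM   = C.PERMNUM
 WHERE A.SCHOOLNUM = '235'
   AND A.TRK       = '5'
   AND A.GRADE     = ' K'
   AND C.PERMNUM IS NULL;
```

When the null is not replaced in the SQL, the host program still needs a null indicator for C.TELEPHONE1, as described above.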
Multitable Joins
When multiple tables are joined using the SQL2 notation, the first join occurs, producing a result set. Then the next table is joined with the result set from the first join, and so on. Join expressions may be enclosed in parentheses, and the resulting table used in another join expression.
The processing specified in the FROM clause occurs first, including all joins. Then the WHERE clause conditions are applied to the resulting table. In the WHERE clause, consideration must be given to the fact that some columns may be null in the result set even though those columns do not allow nulls in the database.
400506 SARAH ( ) 645-0003
400514 SAVANNAH ---------------
407217 ZOIE ( ) 646-6156
413048 ALEX ---------------
413111 RACHEL ( ) 645-8081
499390 SPENCER ---------------
500507 GRAYDON ( ) 656-2589
DSNE610I NUMBER OF ROWS DISPLAYED IS 113
Note that the last line of the query is needed to include the rows where there was no APRN_PARENT row, so that P.CCSD_SEQUENCE is null.
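The query this note refers to is not reproduced in this copy. The sketch below illustrates the same point on hypothetical miniature tables: the table names, columns, sample rows, and the sequence value 1 are all made up for illustration and are not the real ASTU_STUDENT, AEMG_CONTACT, or APRN_PARENT definitions.

```sql
-- Hypothetical miniature tables; every column is declared NOT NULL.
CREATE TABLE STUDENT (PERMNUM INTEGER NOT NULL, FIRSTNAME VARCHAR(20) NOT NULL);
CREATE TABLE CONTACT (PERMNUM INTEGER NOT NULL, TELEPHONE1 VARCHAR(14) NOT NULL);
CREATE TABLE PARENT  (PERMNUM INTEGER NOT NULL, CCSD_SEQUENCE INTEGER NOT NULL,
                      PARENT_NAME VARCHAR(30) NOT NULL);

INSERT INTO STUDENT (PERMNUM, FIRSTNAME) VALUES (1, 'ANN');
INSERT INTO STUDENT (PERMNUM, FIRSTNAME) VALUES (2, 'BEN');
INSERT INTO STUDENT (PERMNUM, FIRSTNAME) VALUES (3, 'CAL');
INSERT INTO CONTACT (PERMNUM, TELEPHONE1) VALUES (1, '555-0100');
INSERT INTO PARENT (PERMNUM, CCSD_SEQUENCE, PARENT_NAME) VALUES (1, 1, 'ANN PARENT ONE');
INSERT INTO PARENT (PERMNUM, CCSD_SEQUENCE, PARENT_NAME) VALUES (1, 2, 'ANN PARENT TWO');
INSERT INTO PARENT (PERMNUM, CCSD_SEQUENCE, PARENT_NAME) VALUES (2, 1, 'BEN PARENT ONE');

-- The FROM clause is processed first: STUDENT is joined to CONTACT, and that
-- result is then joined to PARENT. The WHERE clause filters the joined rows.
-- Without "OR P.CCSD_SEQUENCE IS NULL", CAL (no PARENT row) would be dropped,
-- because a null never compares equal to 1.
SELECT S.PERMNUM, S.FIRSTNAME, C.TELEPHONE1, P.PARENT_NAME
  FROM STUDENT S
  LEFT OUTER JOIN CONTACT C
    ON S.PERMNUM = C.PERMNUM
  LEFT OUTER JOIN PARENT P
    ON S.PERMNUM = P.PERMNUM
 WHERE (P.CCSD_SEQUENCE = 1 OR P.CCSD_SEQUENCE IS NULL);
```

With these sample rows the query returns one row each for ANN, BEN, and CAL. CAL's row has nulls in both TELEPHONE1 and PARENT_NAME, even though neither column allows nulls in its own table.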
Here are two new types of joins defined in the SQL2 standard, but not yet adopted by DB2 (as of v.7).
Cross join: a Cartesian product of two tables
SELECT * FROM TABLEA CROSS JOIN TABLEB
Union join: all the rows of the first table, null-extended with columns of the second table, plus all the rows of the second table, null-extended with columns of the first table.
SELECT * FROM TABLEA UNION JOIN TABLEB
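Where the CROSS JOIN keywords are not available, the same Cartesian product can be written in the older comma-separated notation by simply leaving out the join condition. A minimal sketch, using the placeholder table names TABLEA and TABLEB from the cross join example:

```sql
-- Comma-style equivalent of TABLEA CROSS JOIN TABLEB:
-- every row of TABLEA is paired with every row of TABLEB.
SELECT *
  FROM TABLEA, TABLEB;
```

The result has as many rows as the row count of TABLEA multiplied by the row count of TABLEB, which is why a join condition left out by accident can produce a very large result.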
Summary
In a multi-table query (a join), the tables containing the data are named in the FROM clause.
Join criteria (called the join predicate) may be in the FROM clause (SQL2 notation) or included in the WHERE clause.
The selection criteria (called the local predicate) are named in the WHERE clause, and are applied to the results of the join.
Outer joins extend the inner join by retaining unmatched rows of one or both of the joined tables in the query results, and using null values for data from the other table.
A table may be joined to itself. Self-joins require the use of a table alias.