
Sub queries are also known as nested queries and are used to answer multi-part questions.

They are often interchangeable with a join in SQL. In fact, when executed, a query containing a sub-query may well be treated by the Oracle optimiser exactly as if it were a join. To illustrate this point, let's use a trivial example: finding the names of everybody who works in the same department as a person called Jones. The SQL could be written using a sub query as follows:

    SELECT name FROM emp WHERE dept_no = (SELECT dept_no FROM emp WHERE name = 'JONES')

or as a join statement, like this:

    SELECT e1.name FROM emp e1, emp e2 WHERE e1.dept_no = e2.dept_no AND e2.name = 'JONES'

With a trivial example like this there would probably be very little difference in performance, but with more complex queries there could well be performance implications. For this reason, unless the queries are very simple, it is worth trying a few variations of the SQL and examining the execution plans before deciding on a particular approach.
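The equivalence of the two forms can be checked on a small data set. The following is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for Oracle and a made-up set of emp rows; it only shows that both statements return the same names and says nothing about how either would perform.

```python
import sqlite3

# In-memory SQLite database standing in for Oracle; the emp rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (name TEXT, dept_no INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?)",
    [("JONES", 10), ("SMITH", 10), ("ADAMS", 20), ("CLARK", 10)],
)

# Sub query form: the inner query finds Jones's department.
subquery_sql = """
    SELECT name FROM emp
    WHERE dept_no = (SELECT dept_no FROM emp WHERE name = 'JONES')
    ORDER BY name
"""

# Join form: emp is joined to itself on dept_no.
join_sql = """
    SELECT e1.name
    FROM emp e1, emp e2
    WHERE e1.dept_no = e2.dept_no
      AND e2.name = 'JONES'
    ORDER BY e1.name
"""

subquery_names = [row[0] for row in conn.execute(subquery_sql)]
join_names = [row[0] for row in conn.execute(join_sql)]
print(subquery_names)
print(join_names)
```

Both lists contain CLARK, JONES and SMITH. The ORDER BY clauses are only there to make the two results directly comparable.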

Non Correlated Sub-Queries


There are, in fact, two types of sub query: correlated and non-correlated. The example shown above is a non-correlated sub query. The difference between them is that a correlated sub query refers to a column from a table in the parent query, whereas a non-correlated sub query doesn't. This means that a non-correlated sub query is executed just once for the whole SQL statement, whereas a correlated sub query is executed once per row in the parent query. See advanced SQL tutorial (part 4) for more on correlated sub queries.

Let's use the same example as we did to illustrate SQL joins:

Table Store_Information

    store_name     Sales    Date
    Los Angeles    $1500    Jan-05-1999
    San Diego      $250     Jan-07-1999
    Los Angeles    $300     Jan-08-1999
    Boston         $700     Jan-08-1999

Table Geography

    region_name    store_name
    East           Boston
    East           New York
    West           Los Angeles
    West           San Diego

and we want to use a subquery to find the sales of all stores in the West region. To do so, we use the following SQL statement:

    SELECT SUM(Sales) FROM Store_Information
    WHERE Store_name IN
        (SELECT store_name FROM Geography WHERE region_name = 'West')

Result:

    SUM(Sales)
    2050

In this example, instead of joining the two tables directly and then adding up only the sales amount for stores in the West region, we first use the subquery to find out which stores are in the West region, and then we sum up the sales amount for these stores.

In the above example, the inner query is executed first, and the result is then fed into the outer query. This type of subquery is called a simple subquery. If the inner query is dependent on the outer query, we will have a correlated subquery. An example of a correlated subquery is shown below:

    SELECT SUM(a1.Sales) FROM Store_Information a1
    WHERE a1.Store_name IN
        (SELECT store_name FROM Geography a2 WHERE a2.store_name = a1.store_name)

Notice the WHERE clause in the inner query, where the condition involves a table from the outer query.
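Both statements can be run against the sample data. This is a minimal sketch, assuming Python's built-in sqlite3 module in place of the original database; the Date column is renamed to sale_date (an assumption made here to avoid a name that is reserved in some databases) and the dollar signs are dropped so that Sales is numeric.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Store_Information (store_name TEXT, Sales INTEGER, sale_date TEXT);
    INSERT INTO Store_Information VALUES
        ('Los Angeles', 1500, 'Jan-05-1999'),
        ('San Diego',    250, 'Jan-07-1999'),
        ('Los Angeles',  300, 'Jan-08-1999'),
        ('Boston',       700, 'Jan-08-1999');
    CREATE TABLE Geography (region_name TEXT, store_name TEXT);
    INSERT INTO Geography VALUES
        ('East', 'Boston'), ('East', 'New York'),
        ('West', 'Los Angeles'), ('West', 'San Diego');
""")

# Simple subquery: the inner query runs independently of the outer query.
west_total = conn.execute("""
    SELECT SUM(Sales) FROM Store_Information
    WHERE store_name IN
        (SELECT store_name FROM Geography WHERE region_name = 'West')
""").fetchone()[0]

# Correlated subquery: the inner query refers to a1, the outer table.
# It has no region filter, so every store listed in Geography qualifies.
correlated_total = conn.execute("""
    SELECT SUM(a1.Sales) FROM Store_Information a1
    WHERE a1.store_name IN
        (SELECT store_name FROM Geography a2 WHERE a2.store_name = a1.store_name)
""").fetchone()[0]

print(west_total)        # 2050
print(correlated_total)  # 2750
```

The correlated form returns 2750 here because all three stores with sales also appear in Geography; it filters on membership, not on region.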

EXISTS

EXISTS simply tests whether the inner query returns any rows. If it does, the outer query proceeds. If not, the condition is false for every row of the outer query, and the statement returns no rows (or, for an aggregate such as SUM, a single NULL). The syntax for EXISTS is:

    SELECT "column_name1"
    FROM "table_name1"
    WHERE EXISTS
        (SELECT * FROM "table_name2" WHERE [Condition])

Please note that instead of *, you can select one or more columns in the inner query. The effect will be identical.

Using the same Store_Information and Geography tables as above, we issue the following SQL query:

    SELECT SUM(Sales) FROM Store_Information
    WHERE EXISTS
        (SELECT * FROM Geography WHERE region_name = 'West')

We'll get the following result:

    SUM(Sales)
    2750

At first, this may appear confusing, because the subquery includes the [region_name = 'West'] condition, yet the query summed up sales for stores in all regions. Upon closer inspection, we find that since the subquery returns at least one row, the EXISTS condition is true for every row of the outer query, and the condition placed inside the inner query does not influence which outer rows are selected.
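To make the behaviour of EXISTS concrete, the sketch below runs three variants against the sample data. It assumes Python's built-in sqlite3 module as a stand-in database; the total() helper, the alias names s and g, the sale_date column name and the 'North' region are all introduced here for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Store_Information (store_name TEXT, Sales INTEGER, sale_date TEXT);
    INSERT INTO Store_Information VALUES
        ('Los Angeles', 1500, 'Jan-05-1999'),
        ('San Diego',    250, 'Jan-07-1999'),
        ('Los Angeles',  300, 'Jan-08-1999'),
        ('Boston',       700, 'Jan-08-1999');
    CREATE TABLE Geography (region_name TEXT, store_name TEXT);
    INSERT INTO Geography VALUES
        ('East', 'Boston'), ('East', 'New York'),
        ('West', 'Los Angeles'), ('West', 'San Diego');
""")


def total(where_clause):
    """Return SUM(Sales) over the Store_Information rows that satisfy where_clause."""
    sql = "SELECT SUM(Sales) FROM Store_Information s WHERE " + where_clause
    return conn.execute(sql).fetchone()[0]


# Uncorrelated: true for every outer row as long as any West row exists.
uncorrelated = total("EXISTS (SELECT * FROM Geography WHERE region_name = 'West')")

# Correlated: tied to the outer row through store_name, so only West stores count.
correlated = total(
    "EXISTS (SELECT * FROM Geography g "
    "WHERE g.region_name = 'West' AND g.store_name = s.store_name)"
)

# No matching row at all: no outer row qualifies, and SUM over no rows is NULL.
no_match = total("EXISTS (SELECT * FROM Geography WHERE region_name = 'North')")

print(uncorrelated, correlated, no_match)
```

Correlating the inner query with the outer row is what turns EXISTS into a per-row filter; without the correlation it acts as an all-or-nothing switch.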

CASE

CASE is used to provide if-then-else type of logic to SQL. Its syntax is:

    SELECT CASE ("column_name")
        WHEN "condition1" THEN "result1"
        WHEN "condition2" THEN "result2"
        ...
        [ELSE "resultN"]
        END
    FROM "table_name"

"condition" can be a static value or an expression. The ELSE clause is optional. In our Table Store_Information example,

Table Store_Information

    store_name       Sales    Date
    Los Angeles      $1500    Jan-05-1999
    San Diego        $250     Jan-07-1999
    San Francisco    $300     Jan-08-1999
    Boston           $700     Jan-08-1999

if we want to multiply the sales amount from 'Los Angeles' by 2 and the sales amount from 'San Diego' by 1.5, we key in,

    SELECT store_name,
        CASE store_name
            WHEN 'Los Angeles' THEN Sales * 2
            WHEN 'San Diego' THEN Sales * 1.5
            ELSE Sales
        END "New Sales",
        Date
    FROM Store_Information

"New Sales" is the name given to the column with the CASE statement.

Result:

    store_name       New Sales    Date
    Los Angeles      $3000        Jan-05-1999
    San Diego        $375         Jan-07-1999
    San Francisco    $300         Jan-08-1999
    Boston           $700         Jan-08-1999
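The CASE example can be tried out as follows. This is a minimal sketch, assuming Python's built-in sqlite3 module as the database; the Date column is renamed to sale_date and Sales is stored as a plain number, both assumptions made for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Store_Information (store_name TEXT, Sales INTEGER, sale_date TEXT);
    INSERT INTO Store_Information VALUES
        ('Los Angeles',   1500, 'Jan-05-1999'),
        ('San Diego',      250, 'Jan-07-1999'),
        ('San Francisco',  300, 'Jan-08-1999'),
        ('Boston',         700, 'Jan-08-1999');
""")

# Simple CASE: store_name is compared with each WHEN value in turn;
# rows that match none of them fall through to ELSE.
rows = conn.execute("""
    SELECT store_name,
           CASE store_name
               WHEN 'Los Angeles' THEN Sales * 2
               WHEN 'San Diego'   THEN Sales * 1.5
               ELSE Sales
           END "New Sales"
    FROM Store_Information
""").fetchall()

new_sales = dict(rows)
for store, value in new_sales.items():
    print(store, value)
```

Note that San Diego comes back as a floating-point value because of the multiplication by 1.5, while the other rows stay integers.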

RANK

Displaying the rank associated with each row is a common request. Where analytic functions such as RANK() are not available, there is no straightforward way to do so in SQL. The idea is to do a self-join and, for each record of interest, count the number of records that are listed ahead of it, plus the record itself. Let's use an example to illustrate. Say we have the following table,

Table Total_Sales

    Name        Sales
    John        10
    Jennifer    15
    Stella      20
    Sophia      40
    Greg        50
    Jeff        20

we would type,

    SELECT a1.Name, a1.Sales, COUNT(a2.Sales) Sales_Rank
    FROM Total_Sales a1, Total_Sales a2
    WHERE a1.Sales < a2.Sales OR (a1.Sales = a2.Sales AND a1.Name = a2.Name)
    GROUP BY a1.Name, a1.Sales
    ORDER BY a1.Sales DESC, a1.Name DESC;

Result:

    Name        Sales    Sales_Rank
    Greg        50       1
    Sophia      40       2
    Stella      20       3
    Jeff        20       3
    Jennifer    15       5
    John        10       6

Let's focus on the WHERE clause. The first part of the clause, (a1.Sales < a2.Sales), counts the rows whose Sales value is strictly greater than that of the row being ranked. The second part of the clause, (a1.Sales = a2.Sales AND a1.Name = a2.Name), adds the row itself, so that the top row gets rank 1 rather than 0. Because rows with the same Sales value are not counted against each other, duplicates share a rank: Stella and Jeff are both ranked 3, and the next rank is 5. Had the first part used <= instead of <, tied rows would count each other, and Stella and Jeff would both be ranked 4.
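The self-join ranking can be verified on the Total_Sales data. The sketch below assumes Python's built-in sqlite3 module as the database and uses a strict < in the first comparison, which is what produces the tied rank of 3 for Stella and Jeff shown in the result table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Total_Sales (Name TEXT, Sales INTEGER);
    INSERT INTO Total_Sales VALUES
        ('John', 10), ('Jennifer', 15), ('Stella', 20),
        ('Sophia', 40), ('Greg', 50), ('Jeff', 20);
""")

# For each row a1, count the rows a2 with strictly higher sales, plus a1 itself.
ranking = conn.execute("""
    SELECT a1.Name, a1.Sales, COUNT(a2.Sales) AS Sales_Rank
    FROM Total_Sales a1, Total_Sales a2
    WHERE a1.Sales < a2.Sales
       OR (a1.Sales = a2.Sales AND a1.Name = a2.Name)
    GROUP BY a1.Name, a1.Sales
    ORDER BY a1.Sales DESC, a1.Name DESC
""").fetchall()

for name, sales, rank in ranking:
    print(name, sales, rank)
```

On databases that support analytic functions, RANK() OVER (ORDER BY Sales DESC) expresses the same ranking more directly; the self-join is shown here because it is the technique the text describes.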

MINUS

MINUS operates on two SQL statements. It takes all the results from the first SQL statement, and then subtracts the ones that are present in the second SQL statement to get the final answer. If the second SQL statement includes results not present in the first SQL statement, such results are ignored. The syntax is as follows:

    [SQL Statement 1]
    MINUS
    [SQL Statement 2]

Let's continue with the same Store_Information table as above, together with the following table:

Table Internet_Sales

    Date           Sales
    Jan-07-1999    $250
    Jan-10-1999    $535
    Jan-11-1999    $320
    Jan-12-1999    $750

and we want to find out all the dates where there are store sales, but no internet sales. To do so, we use the following SQL statement:

    SELECT Date FROM Store_Information
    MINUS
    SELECT Date FROM Internet_Sales

Result:

    Date
    Jan-05-1999
    Jan-08-1999

"Jan-05-1999", "Jan-07-1999", and "Jan-08-1999" are the distinct values returned from "SELECT Date FROM Store_Information." "Jan-07-1999" is also returned from the second SQL statement, "SELECT Date FROM Internet_Sales," so it is excluded from the final result set.

Please note that the MINUS command will only return distinct values. Some databases may use EXCEPT instead of MINUS. Please check the documentation for your specific database for the correct usage.
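SQLite is one of the databases that spells this operator EXCEPT, so the example can be reproduced with Python's built-in sqlite3 module. This is a minimal sketch; the Date column is renamed to sale_date and the dollar signs are dropped, both assumptions made for the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Store_Information (store_name TEXT, Sales INTEGER, sale_date TEXT);
    INSERT INTO Store_Information VALUES
        ('Los Angeles', 1500, 'Jan-05-1999'),
        ('San Diego',    250, 'Jan-07-1999'),
        ('Los Angeles',  300, 'Jan-08-1999'),
        ('Boston',       700, 'Jan-08-1999');
    CREATE TABLE Internet_Sales (sale_date TEXT, Sales INTEGER);
    INSERT INTO Internet_Sales VALUES
        ('Jan-07-1999', 250), ('Jan-10-1999', 535),
        ('Jan-11-1999', 320), ('Jan-12-1999', 750);
""")

# EXCEPT is SQLite's spelling of MINUS: rows of the first query that do not
# appear in the second, with duplicates removed.
store_only_dates = sorted(
    row[0]
    for row in conn.execute("""
        SELECT sale_date FROM Store_Information
        EXCEPT
        SELECT sale_date FROM Internet_Sales
    """)
)
print(store_only_dates)
```

Jan-08-1999 appears twice in Store_Information but only once in the output, which shows the distinct-values behaviour mentioned above.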

SubQueries

Uses of Sub Queries


The most common use of sub queries is in the WHERE clause of queries to define the limiting condition for the rows returned (i.e. what value(s) the rows must have to be of interest), as in the above example. However, they can also be used in other parts of the query. Specifically, sub queries can be used:

- To define the limiting conditions for SELECT, UPDATE and DELETE statements in the following clauses:
    o WHERE
    o HAVING
    o START WITH
- Instead of a table name in:
    o INSERT statements
    o UPDATE statements
    o DELETE statements
    o the FROM clause of SELECT statements
- To define the set of rows to be created in the target table of a CREATE TABLE AS or INSERT INTO SQL statement.
- To define the set of rows to be included by a view or a snapshot in a CREATE VIEW or CREATE SNAPSHOT statement.
- To provide the new values for the specified columns in an UPDATE statement.

The first example of a sub query shown above used a simple equality expression, as we were interested in only one row, but we can also use the sub query to provide a set of rows. For example, to find the names of all employees in the same departments as Smith and Jones, we could use the following SQL statement:

    SELECT name FROM emp WHERE dept_no IN
        (SELECT dept_no FROM emp WHERE name = 'JONES' OR name = 'SMITH')

In fact, the original example could also return more than one row from the sub query if there were two or more people called Jones working in different departments. In that case the first example would generate a run-time SQL error, because by using '=' it specified that the sub query should produce no more than one row (it is perfectly legitimate for a sub query to return no rows).

We can reverse the question to ask for the names of all the employees that are NOT in the same department as Jones. To do this, the sense of the comparison just has to be reversed by prefixing the operator with 'NOT' or '!'. Again, depending on whether there might be more than one Jones, we would use either 'IN' or '=':

    SELECT name FROM emp WHERE dept_no NOT IN
        (SELECT dept_no FROM emp WHERE name = 'JONES')

Or

    SELECT name FROM emp WHERE dept_no !=
        (SELECT dept_no FROM emp WHERE name = 'JONES')
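The IN and NOT IN forms can be exercised on a few rows. This is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for Oracle and invented emp rows. The run-time error for '=' with a multi-row sub query is not demonstrated, because SQLite does not raise one in that situation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (name TEXT, dept_no INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?)",
    [("JONES", 10), ("SMITH", 20), ("ADAMS", 10), ("CLARK", 30), ("KING", 20)],
)

# IN: the sub query returns a set of departments (10 and 20 with this data).
same_depts = [row[0] for row in conn.execute("""
    SELECT name FROM emp
    WHERE dept_no IN
        (SELECT dept_no FROM emp WHERE name = 'JONES' OR name = 'SMITH')
    ORDER BY name
""")]

# NOT IN: everybody outside Jones's department.
other_depts = [row[0] for row in conn.execute("""
    SELECT name FROM emp
    WHERE dept_no NOT IN
        (SELECT dept_no FROM emp WHERE name = 'JONES')
    ORDER BY name
""")]

print(same_depts)
print(other_depts)
```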

Nested Sub-Queries
The syntax of SQL allows queries to be nested, which means that a sub query itself can contain a sub query, enabling very complex queries to be built. For example, the SQL statement to find the departments that have employees with a salary higher than the average employee salary could be written as:

    SELECT name FROM dept WHERE id IN
        (SELECT dept_id FROM emp WHERE sal >
            (SELECT avg(sal) FROM emp))

Any of the other comparison operators, such as '>' or '<', can also be used with a sub query instead of '=' or 'IN'.
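The nested statement can be traced with a small data set. This is a minimal sketch, assuming Python's built-in sqlite3 module as the database; the dept and emp rows are invented, giving an average salary of 2500.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept (id INTEGER, name TEXT);
    INSERT INTO dept VALUES (1, 'SALES'), (2, 'IT'), (3, 'HR');
    CREATE TABLE emp (name TEXT, dept_id INTEGER, sal INTEGER);
    INSERT INTO emp VALUES
        ('ADAMS', 1, 1000), ('BLAKE', 1, 3000),
        ('CLARK', 2, 5000), ('DAVIS', 3, 1000);
""")

# Innermost query: the average salary (2500 here).
# Middle query: departments of employees paid more than that average.
# Outer query: the names of those departments.
departments = sorted(
    row[0]
    for row in conn.execute("""
        SELECT name FROM dept
        WHERE id IN
            (SELECT dept_id FROM emp
             WHERE sal > (SELECT AVG(sal) FROM emp))
    """)
)
print(departments)
```

Only BLAKE and CLARK earn more than the average, so SALES and IT are returned and HR is not.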

Sub Queries In The From Clause


The previous SQL examples used sub queries in the where clause, but sub queries can also be used in the from clause instead of a table name. In these circumstances the sub query acts as if it had been predefined as a view.

For example, the following SQL statement returns the free space, in megabytes, for each tablespace in a database:

    SELECT ts.tablespace_name
          ,ROUND(fs.mbytes,2) "Free (Mbytes)"
    FROM dba_tablespaces ts
        ,(SELECT tablespace_name
                ,SUM(bytes)/1024/1024 mbytes
          FROM dba_free_space
          GROUP BY tablespace_name) fs
    WHERE ts.tablespace_name = fs.tablespace_name
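The same shape of query can be tried without access to the Oracle data dictionary. The sketch below assumes Python's built-in sqlite3 module and two hypothetical stand-in tables, tablespaces and free_space, that mimic only the columns used from dba_tablespaces and dba_free_space; the byte counts are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-ins for dba_tablespaces and dba_free_space.
    CREATE TABLE tablespaces (tablespace_name TEXT);
    INSERT INTO tablespaces VALUES ('SYSTEM'), ('USERS');
    CREATE TABLE free_space (tablespace_name TEXT, bytes INTEGER);
    INSERT INTO free_space VALUES
        ('SYSTEM', 1048576), ('SYSTEM', 2097152), ('USERS', 524288);
""")

# The sub query in the FROM clause behaves like a view named fs.
# 1024.0 is used because SQLite would otherwise perform integer division.
free_mb = dict(conn.execute("""
    SELECT ts.tablespace_name,
           ROUND(fs.mbytes, 2) "Free (Mbytes)"
    FROM tablespaces ts,
         (SELECT tablespace_name,
                 SUM(bytes) / 1024.0 / 1024.0 AS mbytes
          FROM free_space
          GROUP BY tablespace_name) fs
    WHERE ts.tablespace_name = fs.tablespace_name
"""))
print(free_mb)
```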

Sub Queries That Return No Rows


Up until now the queries shown have all been expected to produce a result, but when creating tables it can be very useful to use a sub query which will not return any rows, for when just the table structure is required and not any of the data. In the following example we create a copy of the policy table with no rows, using a condition that can never be true:

    CREATE TABLE new_policy AS (SELECT * FROM policy WHERE 1 = 0);
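The effect of an always-false condition can be checked as follows. This is a minimal sketch, assuming Python's built-in sqlite3 module as the database and a hypothetical two-column policy table; SQLite does not accept the parentheses around the SELECT, so they are omitted here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical policy table; the columns are invented for illustration.
    CREATE TABLE policy (policy_no INTEGER, holder TEXT);
    INSERT INTO policy VALUES (1, 'JONES'), (2, 'SMITH');

    -- 1 = 0 is never true, so no rows are copied, only the structure.
    CREATE TABLE new_policy AS SELECT * FROM policy WHERE 1 = 0;
""")

# PRAGMA table_info lists one row per column; index 1 holds the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(new_policy)")]
row_count = conn.execute("SELECT COUNT(*) FROM new_policy").fetchone()[0]

print(columns)
print(row_count)
```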

The sub query returns no data but does return the column names and data types to the 'create table' statement.

Sub Queries (ctd)

Correlated Sub-Queries

There are two types of sub query: correlated and non-correlated. We've already looked at non-correlated sub queries (in part 1). All of the examples of sub queries up until now have been non-correlated sub queries. Just like non-correlated sub queries, correlated sub queries are used to answer multi-part questions, but they are most often used to check for the existence or absence of matching records in the parent table and the related table in the sub query. A correlated sub query refers to a column from a table in the parent query. This type of query can often be performed just as easily by a join query or a non-correlated sub query, but the SQL may be significantly faster when a correlated sub-query is used. As correlated sub queries refer to a column from their parent queries, they are executed once per row in the parent query, whereas non-correlated sub queries are executed once for the whole statement.

For example, using the emp and dept tables from before, to find out which departments have no employees assigned to them, we can write the SQL statement in 3 different ways - as a non-correlated sub query, as an outer join, or as a correlated sub-query.

Example 1 - non-correlated sub query


SELECT dept.name FROM dept WHERE dept.id NOT IN ( SELECT dept_id FROM emp WHERE dept_id IS NOT NULL)

Example 2 - outer join


SELECT dept.name FROM dept,emp WHERE emp.dept_id (+) = dept.id

Example 3 - correlated sub query


SELECT dept.name FROM dept WHERE NOT EXISTS (SELECT dept_id FROM emp WHERE emp.dept_id = dept.id)

The second example is an outer join SQL statement. As written, it does more than just return the names of departments which have no employees assigned to them; it also returns the names of those departments that do have employees assigned to them. This is because an outer join returns both the matching rows and the non-matching rows on one side of the join. Adding the condition AND emp.dept_id IS NULL would restrict the result to the departments without employees.

The first and the third SQL statements would produce exactly the same results, but the first would probably be slower than the third if the dept_id column in the emp table were indexed (depending on the sizes of the tables). The sub query in the first SQL statement cannot use any indexes, because its where clause is just checking for NOT NULL rows, so a full table scan would be performed. Depending on how the optimiser handles the NOT IN, that sub query may also be evaluated once for each row in the dept table. On the other hand, the sub query in the third example can use the index, and since only the dept_id is returned by the sub query, there is no need for any subsequent table access. For these reasons, the third query would normally perform better than the first.

Find the offerings of courses that have an attendance below the average attendance for offerings of that course.
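The three ways of finding departments without employees, described in Examples 1 to 3, can be compared on a small data set. This is a minimal sketch, assuming Python's built-in sqlite3 module as the database and invented dept and emp rows. SQLite does not support Oracle's (+) notation, so the outer join is written with LEFT OUTER JOIN and includes the IS NULL filter needed to keep only unmatched departments. The sketch checks results only and says nothing about relative performance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept (id INTEGER, name TEXT);
    INSERT INTO dept VALUES (1, 'SALES'), (2, 'IT'), (3, 'HR');
    CREATE TABLE emp (name TEXT, dept_id INTEGER);
    -- DAVIS has no department, so emp.dept_id contains a NULL.
    INSERT INTO emp VALUES
        ('ADAMS', 1), ('BLAKE', 1), ('CLARK', 2), ('DAVIS', NULL);
""")


def names(sql):
    """Run a query and return its first column as a list."""
    return [row[0] for row in conn.execute(sql)]


# Example 1: non-correlated sub query with NOT IN.
not_in = names("""
    SELECT dept.name FROM dept
    WHERE dept.id NOT IN
        (SELECT dept_id FROM emp WHERE dept_id IS NOT NULL)
""")

# Without the IS NOT NULL filter, the NULL makes NOT IN unknown for every row.
not_in_unfiltered = names("""
    SELECT dept.name FROM dept
    WHERE dept.id NOT IN (SELECT dept_id FROM emp)
""")

# Example 2: outer join, restricted to departments with no matching employee.
outer_join = names("""
    SELECT dept.name
    FROM dept LEFT OUTER JOIN emp ON emp.dept_id = dept.id
    WHERE emp.dept_id IS NULL
""")

# Example 3: correlated sub query with NOT EXISTS.
not_exists = names("""
    SELECT dept.name FROM dept
    WHERE NOT EXISTS
        (SELECT dept_id FROM emp WHERE emp.dept_id = dept.id)
""")

print(not_in, outer_join, not_exists, not_in_unfiltered)
```

All three forms return HR. The unfiltered NOT IN returns nothing at all, which is why Example 1 excludes NULL values in its sub query.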
