sorting collections
Steven Feuerstein once wrote that, having spent years persuading PL/SQL developers to modularise their code, he was moving on to encourage their use of collections. Indeed, collections have become more widely used by developers in SQL and PL/SQL code, possibly due to the bulk PL/SQL techniques introduced in 8i and the associative array features of 9i. Recent Oracle releases have also added more features for working with collections, such as the multiset operations introduced in 10g. In fact, Oracle is now rich in features for working with collections, with one important exception: sorting arrays of data. This article demonstrates a small number of techniques that we can adopt for sorting our collections.
setup
All of the examples in this article will use a nested table type, because this can be used in both SQL and PL/SQL (unlike associative arrays, which are PL/SQL-only). Where alternative collection types can be used (i.e. associative arrays or VARRAYs), this will be noted. Our collection type is defined as follows.
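A minimal sketch of such a type is shown below. The element size of VARCHAR2(4000) is an assumption, chosen to match the VARCHAR2(4000) index used by the PL/SQL sorter later in this article.

```sql
-- Nested table type, usable from both SQL and PL/SQL.
-- VARCHAR2(4000) is an assumed element size.
CREATE TYPE varchar2_ntt AS TABLE OF VARCHAR2(4000);
/
```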
Type created.

To keep the initial examples short and simple, we will wrap a single small collection in a view, as follows.

SQL> CREATE VIEW v_collection
     AS
        SELECT varchar2_ntt(
                  'Bananas', 'Oranges', 'Apples', 'Toaster Ovens'
                  ) AS collection
        FROM   dual;

View created.

Our collection contains four unordered elements and it is returned to us in the same order when we query it from the view, as follows.
SQL> SELECT collection FROM v_collection;

COLLECTION
----------------------------------------------------------------
VARCHAR2_NTT('Bananas', 'Oranges', 'Apples', 'Toaster Ovens')

1 row selected.

We will now demonstrate some techniques for reordering the four elements in our sample collection.
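As a sketch of the simplest approach, assuming the v_collection view defined above, the collection can be unnested into rows with the TABLE() operator and ordered directly in SQL. The aliases v and t are illustrative only.

```sql
-- Unnest the collection into a rowsource, then sort the rows.
-- COLUMN_VALUE is the pseudo-column exposed for scalar collection elements.
SELECT t.column_value
FROM   v_collection v
,      TABLE(v.collection) t
ORDER  BY
       t.column_value;
```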
This method is extremely simple. We have used the TABLE() operator to convert the collection to a rowsource, meaning of course that it can be ordered with an ORDER BY clause. The TABLE() operator has been available since Oracle 8i and works with nested tables and VARRAYs (but not with associative arrays). If we want to retain the collection itself (albeit with sorted elements), we can wrap this SQL technique in a simple function, as follows.
SQL> CREATE FUNCTION sort_collection (
                     p_collection IN varchar2_ntt
                     ) RETURN varchar2_ntt IS

        v_collection varchar2_ntt;

     BEGIN

        SELECT column_value BULK COLLECT INTO v_collection
        FROM   TABLE(p_collection)
        ORDER  BY
               column_value;

        RETURN v_collection;

     END sort_collection;
     /
Function created.

In this collection sorter, we convert the original collection into an ordered rowsource with the TABLE() operator and an ORDER BY clause, then bulk fetch it into a collection of the same type for returning. We use it as follows.
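A call of the following form, mirroring the later examples in this article, passes the view's collection to the function and returns the sorted copy. The column alias is an assumption.

```sql
-- Sort the sample collection and return it as a collection again.
SELECT sort_collection(v.collection) AS sorted_collection
FROM   v_collection v;
```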
1 row selected.

As expected, the elements of the collection are reordered alphabetically. Note that to keep the example simple, it only supports ascending sorts. For a link to a version of this example that supports ascending, descending and distinct sorts, see the further reading section at the end of this article.
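To illustrate how a descending sort might be added, the sketch below extends the SQL-based sorter with a direction flag. This is not the utility referenced in the further reading section; the function name sort_collection_dir and the p_descending parameter are hypothetical.

```sql
CREATE FUNCTION sort_collection_dir (
                p_collection IN varchar2_ntt,
                p_descending IN VARCHAR2 DEFAULT 'N'  -- hypothetical flag: 'Y' or 'N'
                ) RETURN varchar2_ntt IS

   v_collection varchar2_ntt;

BEGIN

   IF p_descending = 'Y' THEN
      /* Descending sort... */
      SELECT column_value BULK COLLECT INTO v_collection
      FROM   TABLE(p_collection)
      ORDER  BY
             column_value DESC;
   ELSE
      /* Ascending sort (the default)... */
      SELECT column_value BULK COLLECT INTO v_collection
      FROM   TABLE(p_collection)
      ORDER  BY
             column_value;
   END IF;

   RETURN v_collection;

END sort_collection_dir;
/
```

Two static statements are used rather than dynamic SQL so that the sort direction is fixed at compile time in each branch.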
SQL> CREATE FUNCTION sort_collection_plsql (
  2                  p_collection IN varchar2_ntt
  3                  ) RETURN varchar2_ntt IS
  4
  5     TYPE sorter_aat IS TABLE OF PLS_INTEGER
  6        INDEX BY VARCHAR2(4000);
  7
  8     v_collection varchar2_ntt := varchar2_ntt();
  9     v_sorter     sorter_aat;
 10     v_sorter_idx VARCHAR2(4000);
 11
 12  BEGIN
 13
 14     /* Sort the collection using the sorter array... */
 15     FOR i IN 1 .. p_collection.COUNT LOOP
 16        v_sorter_idx := p_collection(i);
 17        v_sorter(v_sorter_idx) := CASE
 18                                     WHEN v_sorter.EXISTS(v_sorter_idx)
 19                                     THEN v_sorter(v_sorter_idx) + 1
 20                                     ELSE 1
 21                                  END;
 22     END LOOP;
 23
 24     /* Assign sorted elements back to collection... */
 25     v_sorter_idx := v_sorter.FIRST;
 26     WHILE v_sorter_idx IS NOT NULL LOOP
 27
 28        /* Handle multiple copies of same value... */
 29        FOR i IN 1 .. v_sorter(v_sorter_idx) LOOP
 30           v_collection.EXTEND;
 31           v_collection(v_collection.LAST) := v_sorter_idx;
 32        END LOOP;
 33
 34        v_sorter_idx := v_sorter.NEXT(v_sorter_idx);
 35
 36     END LOOP;
 37
 38     RETURN v_collection;
 39
 40  END sort_collection_plsql;
 41  /
Function created.

Some notes on the techniques used are as follows:

Lines 5-6: we use a string-indexed associative array to sort our collection and keep a log of the number of times each collection element occurs. The associative array index allows for collection elements of VARCHAR2(4000) and is of course ordered by definition;
Lines 15-22: we loop through our collection (this example assumes that it is densely packed) and assign each element to the index of the associative array. We also increment a count for the number of times each collection element appears (i.e. to handle duplicate values);
Lines 25-36: once our collection has been copied to the associative array, we can build our sorted return collection. We do this by cycling through the associative array in index order. Each index value (remember that these are our original collection values) is assigned to our return collection the same number of times it originally occurred;
Line 38: we return a sorted copy of our original collection.

We use our PL/SQL-based sort function in the same way as our original SQL-based function, as follows.
SQL> SELECT sort_collection_plsql(collection) AS sorted_collection
     FROM   v_collection;

SORTED_COLLECTION
----------------------------------------------------------------
VARCHAR2_NTT('Apples', 'Bananas', 'Oranges', 'Toaster Ovens')

1 row selected.

Note that this technique also supports the sorting and return of associative arrays, in addition to nested tables and VARRAYs. If the sorted collection is never to be referenced or used in SQL, then an associative array can be used in place of a nested table type. It is only when using the collection in SQL that we require a SQL type such as a nested table.
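As a rough illustration of the associative array case, the anonymous block below applies the same counting technique to a PL/SQL-only type. The names varchar2_aat and sort_aat are hypothetical, and the sketch assumes that no element is NULL (a NULL cannot be used as an associative array index).

```sql
DECLARE

   TYPE varchar2_aat IS TABLE OF VARCHAR2(4000)
      INDEX BY PLS_INTEGER;

   v_unsorted varchar2_aat;
   v_sorted   varchar2_aat;

   FUNCTION sort_aat ( p_collection IN varchar2_aat ) RETURN varchar2_aat IS
      TYPE sorter_aat IS TABLE OF PLS_INTEGER
         INDEX BY VARCHAR2(4000);
      v_result     varchar2_aat;
      v_sorter     sorter_aat;
      v_sorter_idx VARCHAR2(4000);
      v_idx        PLS_INTEGER;
   BEGIN
      /* Count each value; FIRST/NEXT also copes with sparse arrays... */
      v_idx := p_collection.FIRST;
      WHILE v_idx IS NOT NULL LOOP
         v_sorter_idx := p_collection(v_idx);
         IF v_sorter.EXISTS(v_sorter_idx) THEN
            v_sorter(v_sorter_idx) := v_sorter(v_sorter_idx) + 1;
         ELSE
            v_sorter(v_sorter_idx) := 1;
         END IF;
         v_idx := p_collection.NEXT(v_idx);
      END LOOP;
      /* Read the values back in index order, repeating duplicates... */
      v_sorter_idx := v_sorter.FIRST;
      WHILE v_sorter_idx IS NOT NULL LOOP
         FOR i IN 1 .. v_sorter(v_sorter_idx) LOOP
            v_result(v_result.COUNT + 1) := v_sorter_idx;
         END LOOP;
         v_sorter_idx := v_sorter.NEXT(v_sorter_idx);
      END LOOP;
      RETURN v_result;
   END sort_aat;

BEGIN

   v_unsorted(1) := 'Bananas';
   v_unsorted(2) := 'Oranges';
   v_unsorted(3) := 'Apples';
   v_unsorted(4) := 'Toaster Ovens';

   v_sorted := sort_aat(v_unsorted);

   FOR i IN 1 .. v_sorted.COUNT LOOP
      DBMS_OUTPUT.PUT_LINE(v_sorted(i));
   END LOOP;

END;
/
```

Because the type is declared inside the block, the result can only be consumed in PL/SQL, which is the trade-off described above.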
SQL> SELECT sort_collection(
               collection MULTISET UNION collection
               ) AS sorted_collection
     FROM   v_collection;

SORTED_COLLECTION
----------------------------------------------------------------
VARCHAR2_NTT('Apples', 'Apples', 'Bananas', 'Bananas', 'Oranges',
'Oranges', 'Toaster Ovens', 'Toaster Ovens')

1 row selected.

We can see that the collection is sorted and each element appears twice. We will repeat the test with our PL/SQL implementation, as follows.
SQL> SELECT sort_collection_plsql(
               collection MULTISET UNION collection
               ) AS sorted_collection
     FROM   v_collection;

SORTED_COLLECTION
----------------------------------------------------------------
VARCHAR2_NTT('Apples', 'Apples', 'Bananas', 'Bananas', 'Oranges',
'Oranges', 'Toaster Ovens', 'Toaster Ovens')

1 row selected.

Again, our collection is sorted and repeating elements are handled correctly.
performance considerations
We have two implementations of a collection sorting function, but which should we use? We will run two simple performance tests to determine which is the more efficient. First we will compare many sorts of a tiny collection and second we will compare a small number of sorts of a large collection. We will use a version of Tom Kyte's RUNSTATS utility to compare the resources and time used by each method. Before we begin our tests, however, we will recompile our functions to use native compilation. This will make them as efficient as possible (although we would expect this to have a greater positive impact on our PL/SQL-based function).
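One way to request native compilation for the two functions is sketched below. It assumes a database release that supports the PLSQL_CODE_TYPE compiler setting and that native compilation is configured on the server.

```sql
-- Recompile each sorter natively; each statement reports "Function altered."
ALTER FUNCTION sort_collection COMPILE PLSQL_CODE_TYPE = NATIVE;

ALTER FUNCTION sort_collection_plsql COMPILE PLSQL_CODE_TYPE = NATIVE;
```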
Function altered.
Function altered.
SQL> SELECT name
     ,      plsql_code_type
     FROM   user_plsql_object_settings
     WHERE  name LIKE 'SORT_COLLECTION%';

NAME                           PLSQL_CODE_TYPE
------------------------------ --------------------
SORT_COLLECTION                NATIVE
SORT_COLLECTION_PLSQL          NATIVE
SQL> DECLARE

        v_small_collection  varchar2_ntt := varchar2_ntt();
        v_sorted_collection varchar2_ntt := varchar2_ntt();
        v_iterations        PLS_INTEGER := 10000;

     BEGIN

        /* Assign small collection... */
        v_small_collection := varchar2_ntt('B','A','D','C');

        runstats_pkg.rs_start();

        /* Run 1: sort in SQL... */
        FOR i IN 1 .. v_iterations LOOP
           v_sorted_collection := sort_collection(v_small_collection);
        END LOOP;

        runstats_pkg.rs_middle();

        /* Run 2: sort in PL/SQL... */
        FOR i IN 1 .. v_iterations LOOP
           v_sorted_collection := sort_collection_plsql(v_small_collection);
        END LOOP;

        runstats_pkg.rs_stop(100);

     END;
     /
================================================================================
--------------------------------------------------------------------------------
1. Summary timings
--------------------------------------------------------------------------------
Run1 ran in 243 hsecs
Run2 ran in 21 hsecs
Run2 was 91.4% quicker than Run1
Type  Name                                         Run1         Diff
----- ------------------------------------ ------------ ------------
STAT  recursive cpu usage                           192         -190
STAT  CPU used by this session                      221         -202
STAT  execute count                              10,001      -10,000
STAT  opened cursors cumulative                  10,001      -10,000
STAT  recursive calls                            11,144      -10,000
STAT  session cursor cache hits                  10,001      -10,000
STAT  sorts (memory)                             10,000      -10,000
STAT  workarea executions - optimal              10,001      -10,000
...   ...                                        10,003      -10,002
STAT  index fetch by key                         20,000      -20,000
STAT  rows fetched via callback                  20,000      -20,000
STAT  table fetch by rowid                       20,000      -20,000
STAT  calls to get snapshot scn: kcmgss          30,001      -30,000
STAT  buffer is not pinned count                 40,000      -40,000
STAT  sorts (rows)                               40,000      -40,000
STAT  consistent gets                            60,000      -60,000
STAT  consistent gets - examination              60,000      -60,000
STAT  consistent gets from cache                 60,000      -60,000
STAT  session logical reads                      60,000      -60,000
...   ...                                        60,005      -60,004
STAT  session uga memory                              0      246,880
STAT
262,144
262,144
--------------------------------------------------------------------------------
3. Latching report
--------------------------------------------------------------------------------
Run1 used 70,405 latches
Run2 used 271 latches
Run2 used 99.6% fewer latches than Run1

================================================================================
End of report
================================================================================

PL/SQL procedure successfully completed.

As we can see, the time-spans for this test are small, but the PL/SQL method took less than one-tenth of the time of the SQL method. The report shows the higher LIOs and latching that the SQL method incurred (but also demonstrates that all sorts took place in memory). To sort small collections, therefore, it appears that the PL/SQL method is the more efficient.
SQL> DECLARE

        v_large_collection  varchar2_ntt := varchar2_ntt();
        v_sorted_collection varchar2_ntt := varchar2_ntt();
        v_iterations        PLS_INTEGER := 100;

     BEGIN

        /* Assign large collection... */
        SELECT object_name BULK COLLECT INTO v_large_collection
        FROM   all_objects;

        runstats_pkg.rs_start;

        /* Run 1: sort in SQL... */
        FOR i IN 1 .. v_iterations LOOP
           v_sorted_collection := sort_collection(v_large_collection);
        END LOOP;

        runstats_pkg.rs_middle;

        /* Run 2: sort in PL/SQL... */
        FOR i IN 1 .. v_iterations LOOP
           v_sorted_collection := sort_collection_plsql(v_large_collection);
        END LOOP;

        runstats_pkg.rs_stop(100);

     END;
     /
--------------------------------------------------------------------------------
1. Summary timings
--------------------------------------------------------------------------------
Run1 ran in 1321 hsecs
Run2 ran in 3209 hsecs
Run1 was 58.8% quicker than Run2
Type  Name                                         Run1         Run2         Diff
----- ------------------------------------ ------------ ------------ ------------
...   ...                                         1,244        1,144         -100
...   ...                                           101            1         -100
...   ...                                           100            0         -100
LATCH channel operations parent latch                65          187          122
LATCH checkpoint queue latch                         81          226          145
LATCH JS queue state obj latch                       72          252          180
LATCH messages                                      108          289          181
STAT  index fetch by key                            200            0         -200
STAT  rows fetched via callback                     200            0         -200
STAT  table fetch by rowid                          200            0         -200
STAT  workarea executions - optimal                 201            1         -200
LATCH SQL memory manager workarea list lat          471          740          269
STAT  calls to get snapshot scn: kcmgss             301            1         -300
LATCH row cache objects                              31          419          388
STAT  buffer is not pinned count                    400            0         -400
LATCH enqueues                                      160          570          410
LATCH enqueue hash chains                           163          598          435
STAT  consistent gets                               600            0         -600
STAT  consistent gets - examination                 600            0         -600
STAT  consistent gets from cache                    600            0         -600
STAT  session logical reads                         600            0         -600
STAT  recursive cpu usage                         1,126            2       -1,124
STAT  CPU used by this session                    1,202        2,883        1,681
STAT  session uga memory max                     65,512            0      -65,512
STAT  session pga memory max                     65,536            0      -65,536
STAT  session uga memory                              0      246,880      246,880
STAT  sorts (rows)                            7,149,800            0   -7,149,800
STAT  session pga memory                     -9,240,576    2,686,976   11,927,552
--------------------------------------------------------------------------------
3. Latching report
--------------------------------------------------------------------------------
Run1 used 2,117 latches
Run2 used 4,816 latches
Run1 used 56% fewer latches than Run2

================================================================================
End of report
================================================================================

PL/SQL procedure successfully completed.

This time the results are very different. The SQL method is far more efficient at sorting large collections and is over twice as fast as the PL/SQL-based method. The statistics report looks slightly strange in places (particularly the memory statistics) but we can conclude that we should use the SQL technique for sorting larger collections.
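If both sorters are installed, the two results could be combined in a small dispatching function along the following lines. The name sort_collection_auto and the default threshold of 1,000 elements are hypothetical: the tests above only cover a four-element collection and one built from ALL_OBJECTS, so the real crossover point would need to be measured on the target system.

```sql
CREATE FUNCTION sort_collection_auto (
                p_collection IN varchar2_ntt,
                p_threshold  IN PLS_INTEGER DEFAULT 1000  -- assumed crossover, not measured
                ) RETURN varchar2_ntt IS
BEGIN

   IF p_collection.COUNT <= p_threshold THEN
      /* Small collection: avoid the SQL engine overhead... */
      RETURN sort_collection_plsql(p_collection);
   ELSE
      /* Large collection: let the SQL engine do the sort... */
      RETURN sort_collection(p_collection);
   END IF;

END sort_collection_auto;
/
```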
SQL> SELECT CAST(
               MULTISET(
                  SELECT ename FROM emp ORDER BY ename
                  ) AS varchar2_ntt
               ) AS ordered_emps
     FROM   dual;

ORDERED_EMPS
----------------------------------------------------------------
VARCHAR2_NTT('ADAMS', 'ALLEN', 'BLAKE', 'CLARK', 'FORD', 'JAMES',
'JONES', 'KING', 'MARTIN', 'MILLER', 'SCOTT', 'SMITH', 'TURNER',
'WARD')

1 row selected.
(although unlike MULTISET, we don't need to use a subquery). Since 11g Release 2, COLLECT also officially supports the ordering of the collection elements, as follows.
SQL> SELECT CAST(
               COLLECT(ename ORDER BY ename) AS varchar2_ntt
               ) AS ordered_emps
     FROM   emp;

ORDERED_EMPS
----------------------------------------------------------------
VARCHAR2_NTT('ADAMS', 'ALLEN', 'BLAKE', 'CLARK', 'FORD', 'JAMES',
'JONES', 'KING', 'MARTIN', 'MILLER', 'SCOTT', 'SMITH', 'TURNER',
'WARD')

1 row selected.

Note the use of the word "officially" above. The COLLECT function has supported the ordering technique used in our example since its introduction in 10g Release 1, but Oracle has only documented this feature in 11g Release 2.
SQL> DECLARE
        v_enames varchar2_ntt;
     BEGIN
        SELECT ename BULK COLLECT INTO v_enames
        FROM   emp
        ORDER  BY
               ename;
        FOR i IN 1 .. v_enames.COUNT LOOP
           DBMS_OUTPUT.PUT_LINE(v_enames(i));
        END LOOP;
     END;
     /

ADAMS
ALLEN
BLAKE
CLARK
FORD
JAMES
JONES
KING
MARTIN
MILLER
SCOTT
SMITH
TURNER
WARD

PL/SQL procedure successfully completed.

Note that because BULK COLLECT is a PL/SQL construct, it supports associative arrays in addition to nested tables and VARRAYs.
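The associative array variant might look like the following sketch, which fetches the same ordered names into a locally declared, integer-indexed array. The type name ename_aat is hypothetical.

```sql
DECLARE
   /* Local associative array type; usable in PL/SQL only... */
   TYPE ename_aat IS TABLE OF emp.ename%TYPE
      INDEX BY PLS_INTEGER;
   v_enames ename_aat;
BEGIN
   /* BULK COLLECT fills the array densely from index 1 in fetch order... */
   SELECT ename BULK COLLECT INTO v_enames
   FROM   emp
   ORDER  BY
          ename;
   FOR i IN 1 .. v_enames.COUNT LOOP
      DBMS_OUTPUT.PUT_LINE(v_enames(i));
   END LOOP;
END;
/
```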
further reading
A package for sorting collections, with support for descending and distinct sorts, is available as an oracle-developer.net utility. The version of RUNSTATS used in the performance examples is available here. For more information on the COLLECT function and its use, see this oracle-developer.net article. The 11g Release 2 new features for COLLECT are discussed in this oracle-developer.net article.
source code
The source code for the examples in this article can be downloaded from here.
oracle-developer.net 2002-2012 copyright Adrian Billington all rights reserved | original template by SmallPark | last updated 02 April 2012