Single Row vs Array Interface vs Parallelism

Michael Hallas and Mark Ashdown


Real-World Performance
Oracle Database Development
December 4, 2018

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |


Safe Harbor Statement
The following is intended to outline our general product direction. It is intended for
information purposes only, and may not be incorporated into any contract. It is not a
commitment to deliver any material, code, or functionality, and should not be relied upon
in making purchasing decisions. The development, release, and timing of any features or
functionality described for Oracle’s products remains at the sole discretion of Oracle.


Real-World Performance Team
Who We Are
• Part of Oracle Database Development
• Team members in HQ, USA, Europe, and Asia
• Over four hundred years of experience combined
How We Work
• Use the product as designed
• Aim for best performance
• Apply data-driven analysis
• Share what we learn


Root Causes of Suboptimal Database Performance
• The database is not being used as it was designed to be used
• The application architecture or code is suboptimal
• There is a suboptimal algorithm in the database


Real-World Performance at UKOUG
Day        Time            Stream      Room  Topic
Tuesday    09:00 to 09:45  Database 2  1C    Controlling the Chaos - Using Resource Management for Predictable Performance
Tuesday    14:25 to 15:10  Database 3  11B   Single Row vs Array Interface vs Parallelism
Wednesday  09:00 to 09:45  Database 2  1C    Successful Star Schemas
Wednesday  15:20 to 16:05  Database 4  11C   Indexes – the Good, the Bad and the Ugly


Single Row vs Array Interface vs Parallelism
1 Why round trips are important for your application
2 Single Row vs Array Interface
3 Array Interface vs Manual Parallelism
4 Database Parallelism


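Oracle Client and Server
Client (top of stack to bottom)
• Application Code
• Driver or API
• Oracle Net
• Operating System
Server (top of stack to bottom)
• Database
• Oracle Net
• Operating System
The client and server stacks are connected by the network.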


Round Trip Time with Distance
• International: 100 ms (6000 miles / 10000 kilometres)
• Regional: 10 ms (600 miles / 1000 kilometres)
• Metropolitan: 1 ms (60 miles / 100 kilometres)
• Local: < 1 ms (propagation delay is small relative to other costs in the network stack)
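To see why these figures matter, the network wait can be estimated as the number of round trips multiplied by the round trip time. The sketch below is only an illustration: the workload of one million rows and the array size of 100 are assumed values, not measurements from this talk, and the class and method names are hypothetical.

```java
public class RoundTripCost {
    // Round trips needed to move `rows` rows when each call carries up to `arraySize` rows.
    static long roundTrips(long rows, int arraySize) {
        return (rows + arraySize - 1) / arraySize; // ceiling division
    }

    // Total time spent waiting on the network, in seconds.
    static double networkSeconds(long rows, int arraySize, double roundTripMillis) {
        return roundTrips(rows, arraySize) * roundTripMillis / 1000.0;
    }

    public static void main(String[] args) {
        long rows = 1_000_000;           // assumed workload, for illustration only
        double regionalRttMillis = 10.0; // the "Regional" round trip time listed above

        System.out.printf("single row:   %.0f s%n", networkSeconds(rows, 1, regionalRttMillis));
        System.out.printf("array of 100: %.0f s%n", networkSeconds(rows, 100, regionalRttMillis));
    }
}
```

Under these assumptions, one row per call spends about 10,000 seconds on round trips alone, while 100 rows per call spends about 100 seconds. The estimate ignores server and client processing time.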


2 Single Row vs Array Interface


What is the Array Interface?
• To use the array interface, the application developer allocates host arrays
– For columns in the SELECT list of queries
– For bind variables in DML statements
• For queries, the application can fetch multiple rows in a single call
• For INSERT … VALUES, the application can process multiple rows in a single call
• For other DML statements, the application can process multiple statements
in a single call


Where is the Array Interface used?
• Database tools
– Classic Export and Import
– SQL*Loader
– SQL*Plus
• ODP.NET and other Windows APIs
• Oracle Call Interface (OCI)
• Precompilers
• Many drivers and APIs are built on top of OCI
– Oracle
– Open source
– Third party
• Many tools are built on top of these drivers and APIs
• Some tools are built on top of OCI


What about …?
PL/SQL
• PL/SQL executes inside the database
• Array interface via specific syntax
– BULK COLLECT
– FORALL
– Cursor loop
• A client can also exchange arrays and more with PL/SQL
JDBC
• JDBC does not expose the Oracle array interface directly
– Prefetch
– Batch updates


What is Prefetch?
• Prefetch is simpler for the application developer
• Behind the scenes, the driver allocates host arrays
• The driver fetches multiple rows in a single call
• The driver supplies rows to the application on request
• The application developer no longer needs to be aware of arrays


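Prefetch in JDBC
PreparedStatement statement = connection.prepareStatement("SELECT ...");
// Hint to the driver: fetch up to 100 rows per round trip
statement.setFetchSize(100);
ResultSet resultSet = statement.executeQuery();
while (resultSet.next()) {
    // Rows are served from the driver's buffer until it needs refilling
    ...
}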
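The behind-the-scenes behaviour of prefetch can be sketched without a database. The example below is a simulation under stated assumptions: FakeServer and PrefetchingCursor are hypothetical classes invented for illustration and are not part of JDBC or of any Oracle driver. It counts how many calls reach the "server" while the application simply loops over next().

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class PrefetchSketch {
    // Hypothetical stand-in for a server-side cursor over `totalRows` rows.
    static class FakeServer {
        private final int totalRows;
        private int nextRow = 0;
        int roundTrips = 0;

        FakeServer(int totalRows) { this.totalRows = totalRows; }

        // One call represents one round trip and returns up to `maxRows` rows.
        int[] fetch(int maxRows) {
            roundTrips++;
            int n = Math.min(maxRows, totalRows - nextRow);
            int[] rows = new int[n];
            for (int i = 0; i < n; i++) rows[i] = nextRow++;
            return rows;
        }
    }

    // Hypothetical driver-side cursor: it owns the array so the application does not have to.
    static class PrefetchingCursor {
        private final FakeServer server;
        private final int fetchSize;
        private final Deque<Integer> buffer = new ArrayDeque<>();
        private boolean exhausted = false;

        PrefetchingCursor(FakeServer server, int fetchSize) {
            this.server = server;
            this.fetchSize = fetchSize;
        }

        // Refills the buffer only when it is empty, then hands out one row.
        boolean next() {
            if (buffer.isEmpty() && !exhausted) {
                int[] rows = server.fetch(fetchSize);
                for (int r : rows) buffer.add(r);
                if (rows.length < fetchSize) exhausted = true; // short fetch means no more rows
            }
            return buffer.poll() != null;
        }
    }

    // Reads every row through the cursor and reports how many round trips were needed.
    static int countRoundTrips(int totalRows, int fetchSize) {
        FakeServer server = new FakeServer(totalRows);
        PrefetchingCursor cursor = new PrefetchingCursor(server, fetchSize);
        int seen = 0;
        while (cursor.next()) seen++;
        if (seen != totalRows) throw new IllegalStateException("rows lost: " + seen);
        return server.roundTrips;
    }

    public static void main(String[] args) {
        System.out.println("fetch size 1:   " + countRoundTrips(1000, 1) + " round trips");
        System.out.println("fetch size 100: " + countRoundTrips(1000, 100) + " round trips");
    }
}
```

In this simulation, 1000 rows cost 1001 calls at fetch size 1 and 11 calls at fetch size 100; the extra call in each case is the final empty fetch that detects the end of the data. A real driver may signal end-of-data differently.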


What is Batch?
• Batch is simpler for the application developer
• Behind the scenes, the driver allocates host arrays
• The application adds an execution of a SQL statement to a batch
• The application submits the batch
• The application developer needs to be aware of batching
• The application developer no longer needs to be aware of arrays


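Batching in JDBC
PreparedStatement statement = connection.prepareStatement("INSERT INTO … VALUES (?,?,?)");
for (...) {
    // Bind the values for one execution of the statement
    statement.setInt(1, ...);
    statement.setString(2, ...);
    statement.setString(3, ...);
    // Add this execution to the batch; nothing is sent to the database yet
    statement.addBatch();
}
// Execute batch: submit all queued executions together
statement.executeBatch();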
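The loop above queues every row before a single executeBatch() call. For large inputs, one common variation is to submit the batch every N rows so that client memory stays bounded. The sketch below illustrates only that chunking logic: submitInBatches and its Consumer argument are hypothetical stand-ins for the addBatch() and executeBatch() calls, and the batch size of 100 is an assumed value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class BatchFlushSketch {
    // Groups rows into batches of at most `batchSize` and hands each batch to `submit`.
    // Returns the number of batches submitted.
    static <T> int submitInBatches(List<T> rows, int batchSize, Consumer<List<T>> submit) {
        if (batchSize <= 0) throw new IllegalArgumentException("batchSize must be positive");
        List<T> pending = new ArrayList<>(batchSize);
        int batches = 0;
        for (T row : rows) {
            pending.add(row);                            // plays the role of addBatch()
            if (pending.size() == batchSize) {
                submit.accept(new ArrayList<>(pending)); // plays the role of executeBatch()
                pending.clear();
                batches++;
            }
        }
        if (!pending.isEmpty()) {                        // submit the final partial batch
            submit.accept(new ArrayList<>(pending));
            batches++;
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < 250; i++) rows.add(i);

        int batches = submitInBatches(rows, 100,
                batch -> System.out.println("submitting " + batch.size() + " rows"));
        System.out.println("batches: " + batches);
    }
}
```

With 250 rows and a batch size of 100, the sketch submits batches of 100, 100, and 50 rows. The final partial batch is easy to forget when this pattern is written by hand.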


Row-by-Row
declare
  -- ext_scan_events is an external table pointing to a file in the file system
  cursor c is select s.* from ext_scan_events s;
  r c%rowtype;
begin
  open c;
  loop
    fetch c into r;                              -- retrieve one row
    exit when c%notfound;
    insert into stage1_scan_events d values r;   -- load one row
  end loop;
  close c;
  commit;
end;


Array/Bulk Processing
declare
  cursor c is select * from ext_scan_events;
  type t is table of c%rowtype index by binary_integer;
  a t;
  array_size binary_integer := 1024;               -- array size
begin
  open c;
  loop
    fetch c bulk collect into a limit array_size;  -- retrieve set of rows
    exit when a.count = 0;
    -- load set of rows
    forall i in 1..a.count insert into stage1_scan_events d values a(i);
  end loop;
  close c;
  commit;
end;


Demo
Single Row vs Array Interface


Array Racing With No Auto Commit


Array Racing With Auto Commit


3 Array Interface vs Manual Parallelism


What is Manual Parallelism?
• The application developer distributes work over multiple threads or
processes
– Popular techniques include queues and hashing
– Often achieves reuse of existing code
• Some challenges to be resolved
– Even distribution of work
– Recovery from failures
– Concurrency in the database


Manual Parallelism: Code For Thread One
declare
  -- Each thread uses its own external table definition
  cursor c is select s.* from ext_scan_events_1 s;
  type t is table of c%rowtype index by binary_integer;
  a t;
  array_size binary_integer := 1024;
begin
  -- Files are distributed across threads using hashing
  for r in (select ext_file_name from ext_scan_files
            where mod(ora_hash(file_seq_nbr),16) = 1)
  loop
    execute immediate 'alter table ext_scan_events_1 location ' ||
                      '(' || r.ext_file_name || ')';
    open c;
    loop
      fetch c bulk collect into a limit array_size;
      exit when a.count = 0;
      forall i in 1..a.count insert into stage1_scan_events values a(i);
    end loop;
    close c;
    commit;
  end loop;
end;
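The predicate mod(ora_hash(file_seq_nbr),16) = 1 selects the files that belong to thread one out of 16. The same assignment idea can be sketched on the client side. This is a hedged illustration only: Long.hashCode is used as a stand-in for ora_hash and does not produce the same values, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class HashDistributionSketch {
    // Maps a file sequence number to one of `buckets` workers.
    // Long.hashCode stands in for ora_hash; Math.floorMod keeps the result non-negative.
    static int bucketFor(long fileSeqNbr, int buckets) {
        return Math.floorMod(Long.hashCode(fileSeqNbr), buckets);
    }

    // Builds the per-worker lists of file sequence numbers.
    static List<List<Long>> distribute(List<Long> fileSeqNbrs, int buckets) {
        List<List<Long>> assignment = new ArrayList<>();
        for (int i = 0; i < buckets; i++) assignment.add(new ArrayList<>());
        for (long seq : fileSeqNbrs) {
            assignment.get(bucketFor(seq, buckets)).add(seq);
        }
        return assignment;
    }

    public static void main(String[] args) {
        List<Long> files = new ArrayList<>();
        for (long seq = 0; seq < 160; seq++) files.add(seq); // assumed: 160 sequentially numbered files

        List<List<Long>> assignment = distribute(files, 16);
        for (int worker = 0; worker < assignment.size(); worker++) {
            System.out.println("worker " + worker + ": " + assignment.get(worker).size() + " files");
        }
    }
}
```

With 160 sequentially numbered files, each of the 16 workers receives 10 files. Equal file counts do not guarantee equal work if file sizes differ, which is the "even distribution of work" challenge listed earlier.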


Demo
Array Interface vs Manual Parallelism


Arrays vs Manual Parallelism
Scenario
• Already have a service that processes a single message
• Process more messages concurrently by using multiple threads, or change the service to process a set of messages
• Which is better?


Arrays vs Manual Parallelism
Analysis
• Not much of a contest
• Array takes 1 minute
• Threads take 14 minutes
• With higher load on server
• But still resource available
• Larger pool size for threads?


Arrays vs Manual Parallelism
Analysis
• 4x threads improved performance
• Array takes 1 minute
• Threads take 5 minutes
• Much higher load on server
• This is ONE dedicated process vs 64 parallel threads


Arrays vs Manual Parallelism
Scenario
• What happens when we add an index to both systems?


Arrays vs Manual Parallelism
Analysis
• Both systems slowed by index
• Multiple threads suffer from contention: buffer busy waits and TX index contention


4 Database Parallelism


Set-based SQL
insert into stage1_scan_events
select s.*
from ext_scan_events s;

commit;


Parallel Set-based SQL
-- Simple code compared to manual parallelism
alter session enable parallel dml;

insert /*+ APPEND */ into stage1_scan_events
select s.*
from ext_scan_events s;

commit;


Demo
Database Parallelism


Transforming using Row-by-Row Method
declare
  cursor c is select
     s.*
    ,cast(null as number(10)) as error_ind
  from
    stage2_scan_events_ref s;
  r c%rowtype;
  dk dim_day.day_key%type;
  lk dim_loc.loc_key%type;
  pk dim_prod.prod_key%type;
  day_code dim_day.day_code%type := '20130922';
  day_key dim_day.day_key%type :=
    sd_convert.day_code_to_key(day_code);
  error_ind number(10);
begin
  open c;
  loop
    fetch c into r;
    exit when c%notfound;
    error_ind := 0;
    if r.day_code != day_code
    then
      dk := null;
    else
      dk := day_key;
    end if;
    if dk is null then error_ind := error_ind + 1; end if;
    lk := sd_convert.loc_code_to_key(r.loc_code);
    if lk is null then error_ind := error_ind + 2; end if;
    pk := sd_convert.prod_code_to_key(r.prod_code);
    if pk is null then error_ind := error_ind + 4; end if;
    if error_ind = 0
    then
      insert into stage3_scan_events d values (r ... );
    else
      r.error_ind := error_ind;
      insert into stage2_scan_events_err d values r;
    end if;
    commit;
  end loop;
  close c;
end;


Transforming using Set-based Method
alter session enable parallel dml;

insert /*+ APPEND */ first
  when error_ind = 0 then into
    stage3_scan_events
    values ( ... )
  else into
    stage2_scan_events_err
    values ( ... )
with
  dim_day_current
    as (select * from dim_day where day_code = '20130922')
select
   s.*
  ,d.day_key
  ,l.loc_key
  ,p.prod_key
  ,(
      case when day_key is not null then 0 else 1 end
    + case when loc_key is not null then 0 else 2 end
    + case when prod_key is not null then 0 else 4 end
   ) as error_ind
from
  stage2_scan_events_ref s
  left outer join
  dim_day_current d
  on s.day_code = d.day_code
  left outer join
  dim_loc l
  on s.loc_code = l.loc_code
  left outer join
  dim_prod p
  on s.prod_code = p.prod_code;

commit;


Transforming
Transform time in seconds for 100M de-duplicated scan events
– Array size 100
– Jobs 32
– Degree of Parallelism

Method       Transform Time (seconds)
Row-by-Row   20143
Array        2059
Home-grown   117
Set          11
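The gap between the methods is easier to read as a ratio against the row-by-row baseline. The short calculation below uses only the four transform times reported on this slide; the class and method names are hypothetical.

```java
public class TransformSpeedup {
    // How many times faster a method is than the baseline.
    static double speedup(double baselineSeconds, double methodSeconds) {
        return baselineSeconds / methodSeconds;
    }

    public static void main(String[] args) {
        double rowByRow = 20143; // seconds, from the slide
        String[] methods = {"Array", "Home-grown", "Set"};
        double[] seconds = {2059, 117, 11};

        for (int i = 0; i < methods.length; i++) {
            System.out.printf("%-10s %.1fx faster than Row-by-Row%n",
                    methods[i], speedup(rowByRow, seconds[i]));
        }
    }
}
```

On these figures, the array method is roughly 10 times faster than row-by-row, the home-grown parallel method roughly 170 times faster, and the set-based method roughly 1800 times faster.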


Summary
• When you need to process rows via the application, think about
– Arrays
– Prefetch
– Batch
• Scale-out of an application that performs row-by-row processing
– Improves performance
– May be limited by contention
• When you can achieve your goal using SQL …
use SQL


