[Diagram: SRS component overview - Stable Device, RS System DB (RSSD), Stable Queue Manager (SQM), SQT cache, Outbound Queue (OBQ), Data Server Interface (DSI/DSI-S), DSI Executor(s) (DSIEXEC)]
SRS data flow - normal
[Diagram: normal SRS data flow - RS System DB (RSSD), SQM, SQT cache, DIST, OBQ, DSI/DSIEXEC, and the WS-DSI path with its own SQT cache]
2. RepAgent User parses and normalizes data and then submits it to the SQM
5. WS-DSI sorts replicated changes into commit sequence and determines how to apply the data (language, bulk, grouping)
• SQT cache is used to re-sort the transactions
[Diagram: route data flow (steps 1-12) - EXEC → SQM → IBQ → SQT → DIST → SQM → OBQ → RSI → RSI User → route DIST → SQM → route OBQ]
7. SQM writes data into the outbound queue for the route
• Aka the route queue or RSI queue
9. RSI sends the data to the next SRS along the route using RTL
• Replication Transfer Language - similar to LTL
10. The RSI User thread receives packets and sends them to the route DIST module
11. The route DIST module separates local from re-routed subscribers and writes commands to
a. local database outbound queues for subscribers in that RS
b. route queues for routes to other destinations if further routing is required
Think of SRS as multiple pipelines with disks at each end of the pipeline
• Inbound (capture) pipeline: (log) → RA → SRS RepAgent User → SQM → (inbound queue)
• Distribution pipeline: (IBQ) → SQMR/SQT → DIST → SQM → (OBQ)
• Route pipeline: (PRS OBQ) → PRS RSI → RRS RSI User → DIST/MD → SQM → (OBQ)
• DSI (apply) pipeline: (OBQ) → DSI → DSIEXEC → RDS.RDB (log)
• WarmStandby pipeline: (IBQ) → WS/DSI → DSIEXEC → RDS.RDB (log)
Compare processing rates across the threads/modules
• If the RA is sending 1000 cmds/sec, then the SQM, DIST, RSI, DSI and DSIEXECs all need to be able to process 1000 cmds/sec or latency will develop
If they can't and are running slower, the latency will show up as disk space in the primary txn log, inbound queue or outbound queue
Slowness in one module will quickly set the rate for all the modules in the same pipeline (see the sketch after this list)
– E.g. if DIST can only process 900 cmds/sec, as soon as SQT cache is full, SQT will be forced to slow to 900 cmds/sec as well
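To make this concrete, here is a minimal Python sketch of the rate comparison. All numbers are hypothetical; in practice each rate comes from a module's CmdsRead-style counter divided by the sample duration:

# A minimal sketch (hypothetical numbers): compare per-module rates in one
# pipeline; the slowest module sets the pipeline rate and backlog growth.
pipeline = {
    "RepAgent User": 1000,   # cmds/sec = counter delta / sample seconds
    "SQM (inbound)": 1000,
    "SQT/DIST": 900,         # the slow module
    "SQM (outbound)": 900,   # downstream can only match the slowest upstream
    "DSI/DSIEXEC": 900,
}

bottleneck = min(pipeline, key=pipeline.get)
deficit = pipeline["RepAgent User"] - pipeline[bottleneck]
print(f"bottleneck: {bottleneck} at {pipeline[bottleneck]} cmds/sec")
print(f"backlog grows at ~{deficit} cmds/sec upstream of the bottleneck")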
• What to look for:
Where disk space is piling up - the problem is in the next pipeline segment!
Where time is spent in each thread/module
Rate
• Compare the rates across the pipelines - when and where it slows indicates a bottleneck
• Remember that the rate across a segment of the pipeline will often be driven by the slowest thread
You may need to look at each thread in that segment to find the root cause
Space
• If the above fails, use disk space. The problem area is in the segment of the pipeline AFTER the backlog
Time
• Each module has a lot of time counters. Use them to find out where time is being spent and focus on the
largest time chunks.
Objects
• Since a common backlog is in the DSIEXEC, look at which objects are involved and what DML (I/U/D) – then
sanity check repdefs and replicate db indexes as well as repdef owner vs. table owner marking
Bell Ringers
• Look for common faults that almost always cause significant backlog such as TransRemoved, RAWriteWaits,
DSIEXEC ResultTime, etc.
Lab #1
Analysis DB setup, running reports, summary report (latency & memory)
Lab instructions: setup
Summary Report:
1. How is data distributed (table repdefs/subscriptions, WS, db repdefs/subscr, routes)?
If you have time, draw a quick sketch
2. Which connections are showing latency/backlog?
Where do you suspect the problem(s) is/are?
Which ones are more key?
3. How is memory allocated among the threads/caches?
Is there a real potential for over-allocation or is it unlikely?
Detailed Report:
4. Look at the performance summary: how do the module throughput rates compare?
5. Where is there latency? At a high level, why?
[Diagram: RepAgent User processing - normalization (cmd structure) against the STS cache/RSSD, then a write request placed on the exec_sqm_write_request queue to the SQM]
The ACK of packets from RepAgent waits until RepAgent User is finished
• Depending on the type of error, SRS can request RepAgent resend or rescan or stop.
• Once write request is put on the write request queue, RepAgent user sends ACK
• However, this waiting contributes to RepAgent latency (more on this later)
• Resolution is to use Async Parsing (ASO option required)
Pipeline scalability
• Normally, when people think of scalability, the first thought is parallel processing
Which in a sense is inherent in SRS via multiple connections, just as in a DBMS
• However, in data movement an important aspect is "pipelining"
In pipelining, multiple threads - each performing a small task - are used
As soon as each thread does its task, it pushes the job down the pipe to the next thread (see the sketch after this list)
This technique is especially prevalent in Complex Event Processing/SAP Event Stream Processor
• The advantage of pipelining over parallelism is that serialization is maintained
Theoretically, if you used parallel processes you could achieve the same throughput
In reality, much less, as
– Some processes would be waiting for data to arrive to process
– Parallel processes would have to synchronize data flow to maintain serialization/transaction order
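A generic sketch of that pipelining pattern using Python threads and bounded queues. This is purely illustrative; SRS implements the same idea with OpenServer threads and message queues:

import queue
import threading

def stage(name, inbox, outbox):
    # Each stage performs one small task, then pushes the job down the pipe.
    while True:
        item = inbox.get()
        if item is None:                  # shutdown marker
            if outbox is not None:
                outbox.put(None)
            return
        item.append(name)                 # stand-in for parse/normalize/pack work
        if outbox is not None:
            outbox.put(item)              # FIFO queues preserve commit order
        else:
            print("applied:", item)

# Bounded queues play the role of the *_request_limit buffers between threads.
q1, q2 = queue.Queue(maxsize=100), queue.Queue(maxsize=100)
workers = [
    threading.Thread(target=stage, args=("parse", q1, q2)),
    threading.Thread(target=stage, args=("apply", q2, None)),
]
for w in workers:
    w.start()
for cmd in ("cmd1", "cmd2", "cmd3"):
    q1.put([cmd])
q1.put(None)
for w in workers:
    w.join()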
[Diagram: RepAgent User with NRM thread (steps 1-5) - parsing (cmds) → nrm_write_request queue (exec_nrm_request_limit) → NRM normalization → packing (LTL batch) → exec_sqm_write_request queue (exec_sqm_request_limit) → SQM]
Monitoring
• RAWaitNRMTime (58038) – exec_nrm_request_limit reached
Counter obs value is the number of times this happened
Counter total value is the total time during the sample period spent waiting
• RAWriteWaitsTime (58019) – exec_sqm_write_request_limit reached (see the averaging sketch below)
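Since these wait counters expose both an occurrence count (counter_obs) and a total time (counter_total), the average wait per occurrence is an easy derived value. A small sketch with made-up sample numbers:

# Most SRS wait counters expose counter_obs (occurrences) and counter_total
# (total milli-seconds); the average wait per occurrence is a useful derived value.
def avg_wait_ms(counter_obs, counter_total):
    return counter_total / counter_obs if counter_obs else 0.0

# Hypothetical RAWriteWaitsTime sample: 1250 waits totalling 8400 ms
print(f"avg wait: {avg_wait_ms(1250, 8400):.2f} ms per occurrence")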
[Diagram: async parsing (steps 1-6) - RepAgent User (EXEC) → PRS (async parser threads) → NRM via the nrm_write_request queue (exec_nrm_request_limit) → packing (LTL batch, batch SYNC) → exec_sqm_write_request queue → SQM]
Configuration Parameters
• exec_prs_num_threads
Specifies the number of async parser threads for the connection.
Default is 0. Max is 20.
• ascii_pack_ibq
Specifies whether the messages written to the IBQ of a connection are ASCII Packed.
Default is “off” (so Binary Pack by default).
• async_parser
An “alias” configuration (like parallel_dsi) that when set to “on” simultaneously sets configurations
– exec_prs_num_threads to 2
– ascii_pack_ibq to “on”
– cmd_direct_replicate to “on”
– dist_cmd_direct_replicate to “on”
When set to “off”, all of those configurations are set to their defaults.
• exec_nrm_request_limit
Controls both the buffer between the PRS and NRM threads as well as the LTL batch buffer sizes
However, the total memory consumption for SRS is
– exec_prs_num_threads * exec_nrm_request_limit, plus the NRM thread buffering (a worked example follows)
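A quick worked example of that memory formula. The values below are purely illustrative, not recommendations:

# Illustrative values only - not recommendations.
exec_prs_num_threads = 2                   # async parser threads
exec_nrm_request_limit = 2 * 1024 * 1024   # assume a 2MB limit for this example

# Per the note above: roughly threads * limit, plus the NRM thread's own buffering.
prs_to_nrm_bytes = exec_prs_num_threads * exec_nrm_request_limit
print(f"PRS->NRM buffering: {prs_to_nrm_bytes / (1024 * 1024):.0f} MB"
      " (+ NRM thread buffering)")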
Sequence to Enable
• Stop the RepAgent via sp_stop_rep_agent
Not totally necessary as configuring will cause it to disconnect
• Suspend the distributor (suspend distributor in SRS)
• Configure async parsing
• Resume the distributor
• Restart the RepAgent
Limitations and comments
• Currently not compatible with exec command cache
• Currently not compatible with non-passthru RepAgents
As a result, this works only with ASE sources – RAX does not connect via Passthru (not in JDBC)
• Due to IO processing and reparsing being large bottlenecks, no benefit will be noticed if SQM command
cache is not used
Make sure cmd_direct_replicate and dist_cmd_direct_replicate are enabled
Watch SQM command cache sizes
We will discuss these concepts more in SQM section
Multiple RepAgents with MPR 15.7 sp100+ Yes 15.7.1+ Need to create rep filters in ASE
58013 BuffersReceived Number of command buffers received by a RepAgent thread. Buffers are broken into packets when in 'passthru' mode, or language 'chunks' when not in 'passthru' mode. See counter 'PacketsReceived' for these numbers. Authors Note: In later revs of SRS, PacketsReceived was deprecated and is replaced by RepAgentRecvTime counter_obs value.
58014 EmptyPackets Number of empty packets received in 'passthru' mode by a Rep Agent thread. These are 'forced' EOM's. See counter 'PacketsReceived' for these numbers.
58022 RSTicket rs_ticket markers processed by a Rep Agent's executor thread.
58023 RepAgentRecvTime The amount of time, in milli-seconds, spent receiving network packets or language commands. Authors Note: In later revs of SRS, PacketsReceived was deprecated and is replaced by RepAgentRecvTime counter_obs value - so this is the focus in the discussion on throughput.
58037 TotalBytesReceived Accumulated total bytes received by a Rep Agent thread so far.
58005 CmdsDumpLoadDB 'dump database log' (in ASE, SYNCDPDB records) and 'load database log' (in ASE, SYNCLDDB records) processed by a Rep Agent thread. Authors Note: This likely will be seen during materialization/re-sync times or when using replicated coordinated dumps.
58006 CmdsPurgeOpen CHECKPOINT records processed by a Rep Agent thread. CHECKPOINT instructs Repserver to purge to a specific OQID value. Authors Note: This refers to normal checkpoints, so you should see 1 every 60 seconds or so from ASE. Only when the RA first connects are the open transactions actually purged.
58007 CmdsRouteRCL Create, drop, and alter route requests written into an inbound queue by a Rep Agent thread. Route requests are issued by the RS user. Authors Note: This helps prevent data loss/duplicate commands during a topology change when an IRS is swapped.
58008 CmdsEnRepMarker Enable replication markers written into an inbound queue by a Rep Agent thread. The enable marker is sent by executing the rs_marker stored procedure at the active DB. Authors Note: This should primarily be seen only during subscription creation, dropping, etc.
58009 UpdsRslocater Updates to RSSD..rs_locater where type = 'e' executed by a Rep Agent thread. Authors Note: This will happen with each gettrunc() request - which is driven by RA scan_batch_size, etc.
UpdsRslocater/min
• If too frequent, processing is slowed by constant OQID updates to the RSSD (10's of ms each)
• Since this controls how fast recovery can proceed, even 1 per minute would support <1 minute recovery
• Consequently, you only want to see ~<10 per minute (which is 6 second recovery) - see the rate sketch below
• To change, increase the scan batch size up to 10K or 20K max
If already at that level, do not increase further
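A sketch of turning a raw UpdsRslocater sample into the per-minute rate discussed above (sample values are hypothetical):

# Hypothetical sample: UpdsRslocater delta and sample length.
upds_rslocater = 180
sample_seconds = 900        # a 15-minute sample

per_min = upds_rslocater / (sample_seconds / 60.0)
print(f"{per_min:.1f} OQID updates/min")
if per_min > 10:            # guidance above: ~<10/min (each update costs 10's of ms)
    print("too frequent - consider raising RepAgent scan_batch_size (10K-20K max)")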
58019 RAWriteWaitsTime The amount of time the RepAgent spent waiting for the SQM Writer thread to drain the number of outstanding write requests to get the number of outstanding bytes to be written under the threshold.
58023 RepAgentRecvTime The amount of time, in milli-seconds, spent receiving network packets or language commands.
58025 RepAgentExecTime The amount of time, in milli-seconds, the Repagent User thread is scheduled by OCS.
58031 RepAgentParseTime The amount of time, in milli-seconds, spent parsing commands.
58033 RepAgentNrmTime The amount of time, in milli-seconds, spent normalizing commands.
58035 RepAgentPackTime The amount of time, in milli-seconds, spent packing commands.
58038 RAWaitNRMTime The amount of time the RepAgent spent waiting for the NRM thread to drain the number of commands on the message queue to get the number of outstanding bytes to be written under the threshold.
58040 RACmdsDirectRepSend Number of commands directly sent from the Executor with pre-packing data format.
58041 RepAgentExecutorTime The amount of time, in milli-seconds, spent in function _ex_executor_cmd().
58043 RAControlMem Number of times the memory control is executed from the Executor.
58044 RAWaitMemTime The amount of time the RepAgent thread spent waiting for memory usage to fall under the memory control threshold.
RepAgentExecTime
• RepAgentExecTime is a good indication of how much CPU time the RepAgent is getting.
This is sort of a sum of all the CPU time for the RepAgent User thread
– ….or at least until you start splitting out the NRM, async parser, etc.
• Compare to sample interval and look at CPU usage on host
If there are more CPU resources and this number is NOT the full sample period and you have RepAgent latency but no RA*Waits counters, then the likely cause is the ASE process kernel
RAYieldTime
• RAYieldTime is tied to the legacy exec_cmds_per_timeslice
In older pre-SMP versions, this was used to try to govern the RepAgent User from consuming too much CPU time
Unfortunately, with a common cause of problems being RepAgent latency, it was counter-productive in most cases
• If you see this, you likely upgraded from an older release – just set to 2B and forget it
Later versions default this to 2B
Warning!!!
• ONLY use the configure replication server command
do not update RSSD directly
• Make sure the queues are empty and RS is quiescent
Suspend log transfer from all
Wait for DSI and RSI to drain - and save interval to expire
admin quiesce_force_rsi
(optional) sysadmin sqm_purge_queue
Configure replication server set block_size to ‘256’ with shutdown
(restart RS)
Resume log transfer from all
• Change happens on reboot of RS - so it may take a while to come up
OS Changes!!!
• You may need to change OS kernel settings to allow larger IOs
set vxio:vol_maxio=16384 (max of 16K IO before the IO is broken up)
set maxphys=8388608 (largest IO)
Check sd/ssd.conf (Solaris)
When to use larger block sizes
[Diagram: inbound SQM write path - RepAgent User (LTL ASCII stream packets) → exec_sqm_write_request queue → packed binary cmd → SQM page cache current block, read behind by SQT]
Notes
• Not supported on HPUX (PA-RISC or HPIA)
62002 BlocksRead Number of blocks read from a stable queue by an SQM Reader thread.
62004 BlocksReadCached Number of blocks from cache read by an SQM Reader thread.
*CachedPct=(BlocksReadCached/BlocksRead)*100%
SQT is seriously lagging the inbound queue
• Almost no SQMR cached reads - all physical reads
No transactions were removed (SQT.TransRemoved)
• So the problem is SQM write cache or SQM/SQMR delay tuning
(SQT cache looks fine, but really the problem is being masked by low throughput due to physical reads)
Analysis: SQT (SQMR) read speed is fine up until some limit, at which point degradation happens
This limit turns out to be DIST throughput, as we will see later
[Diagrams: inbound SQM - RepAgent User LTL ASCII stream → exec_sqm_write_request queue → packed binary cmd → SQM page cache current block, read by SQT; outbound SQM - packed ascii cmd → md_sqm_write_request queue → SQM page cache current block, read by SQT/DSI]
Some notes
• Due to SQT sleeping a lot (contention, PIO, etc.) this works best with DIST if dist_direct_cache_read is
enabled (requires ASO)
• For DSI, reparsing may happen a lot if there is considerable latency
cmd_direct_replicate (default: off) - Set cmd_direct_replicate on for the Executor thread to send parsed data directly to the Distributor thread along with binary data. When required, the Distributor thread can retrieve and process data directly from parsed data, and improve replication performance by saving time otherwise spent parsing data again.
sqm_cmd_cache_size (default: 1MB on 32-bit, 20MB on 64-bit) - The maximum size, in bytes, of parsed data that Replication Server can store in the SQM command cache. Ignored if cmd_direct_replicate is off.
sqm_max_cmd_in_block (default: 320) - Specifies, in each SQM block, the maximum number of entries with which the parsed data can associate. Set the value of sqm_max_cmd_in_block to the number of entries in the SQM block. Depending on the data profile, each block has a different number of entries because the block size is fixed, and the message size is unpredictable. If you set a value that is too large, there is memory waste. If you set a value that is too small, replication performance is compromised.
So….why sqm_max_cmd_in_block???
• There really are two parts to the SQM command cache
The command cache itself
An array of pointers
• sqm_max_cmd_in_block is used to dimension the array of pointers
• If you think about it, we want to cache every command that is in SQM page cache
So….if we are averaging 10 commands per block and have a 4 block page (default) and a 16 page cache, we need to dimension an array of 10*4*16=640 (see the sketch below)
In this case, we would set the sqm_max_cmd_in_block to 10
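The slide's arithmetic as a small sketch. The 10 cmds/block average is an assumed workload figure:

avg_cmds_per_block = 10   # observed average commands per SQM block (workload dependent)
blocks_per_page = 4       # default SQM page = 4 blocks
pages_in_cache = 16       # SQM page cache, in pages

# One pointer array slot for every command that can sit in the page cache:
print("pointer array entries:", avg_cmds_per_block * blocks_per_page * pages_in_cache)
print("set sqm_max_cmd_in_block =", avg_cmds_per_block)   # per-block entry count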
30037 DIST DISTCmdsDirectRepRecv Number of commands received by DIST directly from EXEC
Large block sizes Any Any 15.5+ 16, 32, 64, 128, 256
6029 TimeSeg Elapsed time, in milli-seconds, to process a segment. Timer starts when a segment is allocated or Repserver starts. Timer stops when the segment is deleted.
6057 SQMWriteTime The amount of time taken for SQM to write a block.
6059 SQMWaitSegTime The amount of time waiting for allocating segments. Authors Note: This is similar to SleepsWaitSeg (e.g. counter_obs should be identical); however, the time is useful in showing how bad the situation is as well as RSSD response time.
6064 SQMCMDCLONETIME Time spent on cloning CMD_COMMAND.
6017 SleepsWriteRScmd srv_sleep() calls by an SQM Writer client while waiting to write a special message, such as a synthetic rs_marker.
6018 SleepsWriteDRmarker srv_sleep() calls by an SQM Writer client while waiting to write a drop repdef rs_marker into the inbound queue.
6019 SleepsWriteEnMarker srv_sleep() calls by an SQM Writer client while waiting to write an enable rs_marker into the inbound queue.
6037 WritesFailedLoss Writes failed by an SQM thread due to loss detection, SQM_WRITE_LOSS_I, which is typically associated with a rebuild queues operation.
6050 XNLWrites Large messages written successfully so far. This does not count skipped large messages in mixed version situations.
6052 XNLSkips Large messages skipped so far. This only happens when the site version is lower than 12.5.
6053 XNLSize The size of large messages written so far.
6061 SQMNoDirectReplicateInCache Number of commands excluded from direct replication limited by SQM command cache memory size. Author: SQM command cache too small.
6062 SQMNoDirectReplicateInBlock Number of commands excluded from direct replication limited by the number of commands that can be stored in each SQM block. Author: This probably won't happen unless you have a lot of empty transactions or a large block size and really narrow row widths.
6063 SQMCacheCollisions Count of cache collisions. Author: cache overwritten before read - SQM cmd cache too small or there is latency (e.g. SQT cache is full and DIST lagging).
6066 SQMNoDirectReplicateInSQMCache Number of commands excluded from direct replication limited by SQM cache memory size. Author: SQM page cache too small for SQM cmd cache.
You probably don’t need to be concerned about most of these except the SQM command cache ones
(highlighted - discussed previously)
RepAgent
1. Should we change any RepAgent configs (packet size, scan_batch_size)?
2. How much would using the NRM thread help?
3. Would async parsing help more or not so much?
4. Were there any issues with write waits between RAUser & SQM?
SQM (writer)
5. Should the sqm_recover_seg be adjusted?
6. Would a larger block_size configuration help?
7. How did the rate of segment allocations/deallocations appear?
8. Was SQM page cache sufficient?
The problem
• If we sleep too long,
we may have to read the block from disk
…and since SQMR is in SQT, the SQT will sleep and not serve DIST
• If we sleep too short, we will block the writer a lot
….but keep the DIST fed….so not quite as bad as above – hence start low and raise configs
….but might result in RAWriteWaits as exec_sqm_write_request_limit reached
– Either increase exec_sqm_write_request_limit…..or increase SQT read delays by a few milliseconds (10-20)
[Chart annotations: SQMR is reading the current SQM block being written; SQMR lagging but 100% cached reads as it is reading from SQM cache; SQMR is lagging - SQT cache is full; SQM write cache too small??? (No); primary input slows/stops - SQMR starts catching back up]
Tuning tips
• During low or medium activity, ignore
• During peak activity, watch how many sleeps there are per attempt to read
• SleepsPerBlock = SleepForWriteQTime.counter_obs/BlocksRead (or better yet …/CmdsRead) - see the sketch below
If >1 per cmd, consider increasing sqt_init_read_delay to the average and sqt_max_read_delay to 2x the average
– We would like to grab a full txn per read - e.g. 3 cmds/read….or probably 5-10 sleeps/block is normal
If 1 per block, that is acceptable…but edgy…anything <2 per block is probably getting worrisome
0 or near 0 per block is only acceptable if BlocksReadCached is 70%+ of BlocksRead
– If the cache hit ratio is high, then we are reading out of page cache and that is great
– If the cache hit ratio drops, that indicates we need to do more physical reads than desirable
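A small sketch of the SleepsPerBlock derivation and the interpretation rules above (all counter values invented):

# Invented counter values over one sample interval.
sleep_for_writeq_obs = 4200   # SleepForWriteQTime.counter_obs
blocks_read = 3900            # BlocksRead
blocks_read_cached = 3100     # BlocksReadCached

sleeps_per_block = sleep_for_writeq_obs / blocks_read
cached_pct = blocks_read_cached * 100.0 / blocks_read
print(f"SleepsPerBlock: {sleeps_per_block:.2f}  CachedPct: {cached_pct:.0f}%")
if sleeps_per_block < 2 and cached_pct < 70:
    print("worrisome: SQMR is likely lagging into physical reads")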
[Diagram: queue rows 0.3.0 through 0.3.4, from beginning of queue to end of queue]
SQT Txn Sorting: Seg 0 Block 2 Read (3)
Empty Transactions
• Due to explicit begin/commits in the PDB but no DML (e.g. chained mode/ISO3 queries, etc.)
Used to be a big problem, but ASE 15 reduced a lot of these by not flushing empty BT/CT pairs from the ULC to the primary log
• Can be due to DML on unreplicated tables
• Can also be caused by system task events such as reorgs
However, system level empty transactions were filtered in ASE 12.5.x
• SQT simply discards them
Depending on the SRS version, they may or may not be counted in CLOSED first
Transactions Removed
• Really large transactions with a lot of commands could fill SQT cache
• To prevent this, the SQT will remove large transactions from the OPEN list if cache is low
Note that large transactions in CLOSED or READ will not be removed from cache
• When COMMIT is seen for a removed transaction, the SQT moves it to CLOSED
• When SQT client wants to process it, SQT must rescan the transaction from disk
During this rescan, not only is it most likely a lot of physical reads, but SQT cannot simultaneously be reading new
transactions, so ongoing reads stop
• An occasional removed large transaction is not a problem….10’s of them are.
24002 CmdsTran Commands in transactions completely scanned by an SQT thread. Authors Note: This counter is useful for spotting large transactions (counter_max) as well as the average cmds in a transaction (counter_total/counter_obs).
24005 CacheMemUsed SQT thread memory use. Each command structure allocated by an SQT thread is freed when its transaction context is removed. For this reason, if no transactions are active in SQT, SQT cache usage is zero. Authors Note: If this reaches the maximum and remains there for any period of time without any large transactions, it is indicative of DIST being slow and SQT just buffering (can be proven by the number of CLOSED transactions).
24023 SQTAddCacheTime The time taken by an SQT thread (or the thread running the SQT library functions) to add messages to SQT cache.
24025 SQTDelCacheTime The time taken by an SQT thread (or the thread running the SQT library functions) to delete messages from SQT cache.
24031 CacheLow SQT cache is too low to load more than one transaction into cache.
24032 SQTResyncPurgedTrans Transactions purged by the resync database command.
24033 SQTControlMem Number of times the memory control is executed in an SQT.
24038 SQTParseTime The amount of time, in milli-seconds, spent by SQT in parsing commands.
SQT Performance Notes
24028 SQTClosedTrans Current closed transaction count. Authors Note: This should never be allowed to climb that high - possibly a few hundred for the DSI, but in the case of the SQT, this should be not even that high. High values (e.g. thousands) simply indicate that the next thread is slow in processing and the SQT cache is being used as a buffer.
24029 SQTReadTrans Current read transaction count. Authors Note: Typically, this will be pretty low as it represents the transactions that have been read that are still in cache and NOT the total transactions read. As transactions are truncated, they will no longer be in cache and consequently not counted.
24030 SQTTruncTrans Current truncation queue transaction count. Authors Note: This should be the sum of Open, Closed and Read as transactions are added to this as soon as they are read into SQT cache (still OPEN) and will be counted until truncated from cache.
Adding more SQT cache shouldn’t even be considered due to no TransRemoved. However, in interval 1, the
DIST was lagging as can be seen by the lower value for ReadTransAdd vs. ClosedTransAdd – but it started
catching up by interval 2 & 3 – and especially by mid-way through the sample periods
Example SQT: Impact of empty transactions (cont)
This must be a huge SQT cache as we were simply buffering ~30,000 transactions due to latency in the DIST
processing. Note that ReadTrans will be low as transactions are discarded from cache as soon as truncated.
However, given the resurging increases in ClosedTrans at #6 and #10, the problem is more frequent vs. acute
DSI/SQT HVAR/RTL command parsing
24037 SQTPrsMemory Memory size consumed by SQT pre-parsed commands whose memory is counted against sqt_max_prs_size and currently kept in memory waiting to be read/consumed by the client.
24038 SQTParseTime The amount of time, in milli-seconds, spent by SQT in parsing commands.
Tuning tips
• DO NOT ADD SQT CACHE UNLESS TransRemoved > ~3-5+
….especially if SQTClosedTrans > 100
If you do, you are just masking the real problem by increasing the “buffering”…it will still fill
• Use SQM's CmdSize * SQT's CmdsTran.counter_max * max(3,SQTOpenTrans) to compute the SQT cache necessary (see the sizing sketch below)
Note that some transactions simply are too large to reasonably cache (e.g. 1M cmds)
Consider a 100K cmd transaction with a 1K cmd size…it would take 102,400,000 bytes (~100MB) to cache…consequently a 512MB cache is probably sufficient as an absolute upper bound for an inbound SQT cache unless you have a lot of really large transactions and a lot of memory.
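That sizing rule as a sketch; all inputs are assumed sample values:

cmd_size = 1024            # SQM CmdSize: average command size in bytes
cmds_tran_max = 100_000    # SQT CmdsTran.counter_max: largest transaction seen
sqt_open_trans = 2         # typical concurrent OPEN transactions

sqt_cache_bytes = cmd_size * cmds_tran_max * max(3, sqt_open_trans)
print(f"suggested SQT cache: ~{sqt_cache_bytes / (1024 * 1024):.0f} MB")
# ~293MB here; per the note above, ~512MB is a reasonable upper bound for an
# inbound SQT cache unless very large transactions are common.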
[Diagram: DIST internals (steps 1-5) - SQT (O/C/R/T lists, cmd structure) read request → SRE → TD → MD → packing → md_sqm_write_request queue → outbound SQM (packed ascii cmd); the PRS path supplies packed binary cmds]
OpenServer Messaging
• In-Memory message queues
Provides a means for asynchronous processing
Each thread may interact with 1 or more message queues
Provides a cache to reduce the impact of 'surges' in processing
– exec_sqm_write_request_limit
– md_sqm_write_request_limit
– exec_nrm_request_limit
• RS configurations:
num_msgqueues
num_msgs
• Configuring:
If too low, RS will crash
Shouldn't have to tune, unless running a lot of connections
RS Latency
• The queues do not 'belong' to a thread…
and therefore the thread does not have to be running for another thread to act on the queue
• …but, for a back & forth exchange, thread sequencing & execution becomes an issue
[Diagram labels: (A) Memory/Caches, (B) OpenClient Callback, (C) OpenServer Message Queues]
[Chart annotations: ~85% increase, ~45% increase, 0:02:00 decrease in latency]
The overall key for DIST (including SRE, TD & MD) is to focus on the time counters
Derived counters (see the sketch below)
• Cmds/sec = CmdsRead/(sample duration in seconds)
• DirectRepPct = (DISTCmdsDirectRepRecv/RACmdsDirectRepSend *100.0)
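The two DIST derived values as a sketch (sample numbers invented):

# Invented sample values.
cmds_read = 540_000           # DIST CmdsRead over the sample
sample_seconds = 600
ra_direct_sent = 540_000      # RACmdsDirectRepSend
dist_direct_recv = 486_000    # DISTCmdsDirectRepRecv

print(f"DIST rate: {cmds_read / sample_seconds:.0f} cmds/sec")
print(f"DirectRepPct: {dist_direct_recv * 100.0 / ra_direct_sent:.1f}%")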
30016 SREcreate SRE creation requests performed by a DIST thread. This counter is incremented for each new SUB.
30017 SREdestroy SRE destroy requests performed by a DIST thread. This counter is incremented each time a SUB is dropped.
30018 SREget SRE requests performed by a DIST thread to fetch an SRE object. This counter is incremented each time a DIST thread fetches an SRE object from the SRE cache.
30019 SRErebuild SRE rebuild requests performed by a DIST thread.
30020 SREstmtsInsert Insert commands encountered by a DIST thread and resolved by SRE.
30021 SREstmtsUpdate Update commands encountered by a DIST thread and resolved by SRE.
30022 SREstmtsDelete Delete commands encountered by a DIST thread and resolved by SRE.
30023 SREstmtsDiscard DIST commands with no subscription resolution that are discarded by a DIST thread. This implies either there is no subscription or the 'where' clause associated with the subscription does not result in row qualification. Author's Note: If you see a lot of these (e.g. 10% or more of total), this is an indication that you might want to consider using sp_setreptable <tablename>, 'never' on tables you aren't replicating….although it also may point out a narrow subscription (e.g. where status='complete').
30027 dist_stop_unsupported_cmd dist_stop_unsupported_cmd config parameter.
30033 DISTSreTime The amount of time taken by a Distributor to do SRE resolution.
[Diagram: end-to-end flow across a route (steps 1-15) - EXEC → SQM → IBQ → SQT → DIST → OBQ → RSI → RSI User → DIST → OBQ, with DSI-S → DSIEXEC applying from the OBQ at each site]
Use a route any time a WAN is involved – no matter how short the WAN
• Using routes generally benefits performance and reliability in WAN environments
Consider a route in a LAN for performance and robustness if between buildings, etc.
• It also can alleviate contention between DSI and other threads on internal structures (e.g. STS cache)
If you want really low-latency in a LAN, a route may not help
• Direct command replication is not possible across a route
RepAgent → DIST: direct command replication possible
DIST → DSI: direct command replication possible
RSI → network (RTL) → RSI User: direct command replication not possible due to the network hop
• So, if source & target are in the same datacenter, a route may add a bit of latency
• HOWEVER: This needs to be carefully considered
If there is any latency at all in the DSI/DSIEXEC due to RDB execution speed, this argument fails quickly
Due to inherent network speed issues, this likely only works best when source/target are on the same subnet
4007 BlockReads Number of blocking (SQM_WAIT_C) reads performed by an RSI thread against the SQM thread that manages an RSI queue.
4009 SendPTTime Time, in milli-seconds, spent in sending packets of data to the RRS.
4015 RSIReadSQMTime The time taken by an RSI thread to read messages from SQM.
4017 RSTicket rs_ticket markers processed by a RSI thread.
4018 UnpackedCmd Total commands unpacked by a RSI thread.
59000 RSIUCmdsRecv Commands received by a RSI User thread. Includes RSI message, get truncation requests, etc.
59001 RSIUMsgRecv RSI messages received by a RSI User.
59002 RSIUGetTRecv Get truncation requests received by an RSI User.
59003 RSIURebldQRecv Rebuild queues requests received by an RSI User.
59004 RSIUSetTRecv Set truncation requests received by an RSI User.
59005 RSIUCmdLen Length of an RSI command.
59006 RSIUSendGetT The amount of time, in milli-seconds, spent responding to 'get truncation' requests.
59008 RSIUSendSetT The amount of time, in milli-seconds, spent responding to 'set truncation' requests.
59010 RSIURecvPckt The amount of time, in milli-seconds, spent receiving network packets.
59013 RSIUBuffsRcvd Number of command buffers received by an RSI User thread. Buffers are broken into packets when in 'passthru' mode, or language 'chunks' when not in 'passthru' mode. See counter 'RSIUPcktsRcvd' for these numbers.
59014 RSIUEmptyPckts Number of empty packets received in 'passthru' mode by an RSI User thread. These are 'forced' EOM's. See counter 'RSIUPcktsRcvd' for these numbers.
59015 RSIUConnPcktSz The connection packet size for the RSI User.
59016 RSIUBytsRcvd Bytes received by an RSI User thread. This size includes the TDS header size when in 'passthru' mode.
59017 RSIUExecTime The amount of time, in milli-seconds, RSI User thread is scheduled by OCS.
Route guidance is very simple
Soooo….the key focus is to load the counters for the SRS where the latency is
• ….and then look at how many truncation synchronizations per sec there are
Sender or receiver, the effect would be the same
DIST Analysis
6. Would dist_direct_cache_read help? How can you tell?
7. Are there any issues with repdefs
e.g. are we lacking any???
Is it possible that tables are marked for rep that shouldn’t be??
8. Is the MD write queue sized correctly??
9. How is the write speed to the outbound queue??
DSI/SQMR
• Just like with SQT/SQMR, it reads transactions from the queue
DSI/SQT
• Just like with the DIST/SQT, it sorts transactions into commit sequence order
Remember, a single destination may be the target of multiple sources....
The transactions from different sources may be intermingled
In addition, in the case of a WarmStandby DSI, the DSI SQT does the heavy-lifting sort a la the SQT
• One difference from DIST/SQT is that the SQT cache is used for ….
DSI transaction grouping
DSI compilation for HVAR/RTL
….the CLOSED list is the source for both of these functions
• Another subtle difference is the transaction profiling
Determines whether contiguous inserts in same transaction can be sent via bulk inserts
Arguably, this is a module unto itself….but …..it is implemented within the core SQT module
– Note that the inbound queue DIST/SQT really doesn’t need this functionality, so, it skips it….mostly – except SQLDML
• Thirdly, the sqt_prs_cache_size is used to cache parsed commands for DSIHQ
DSIHQ
• This is the module that actually does the HVAR/RTL processing
• Compiles the net changes into a Compilation Database (CDB)
The CDB is an in-memory collection of data – not an end-user database
The compiled net changes are in the CDB
– With pointers back to the parsed commands in the SQT/PRS cache
[Diagram: outbound SQM - md_sqm_write_request queue → SQM page cache current block (packed ascii cmd), read by SQT/DSI]
Frequently oversized.
The problem is due to monitoring via admin who,sqm
• Admin commands only report that module's statistics and not the entire system
• RS courses teach that admin who,sqm can measure latency via Last Seg.Block vs. Next Read.
This was reasonably true when SRS didn't use much memory for SQT cache, as the amount of backlog cached was minimal compared to the disk space often in the queue due to latency
One of the most difficult aspects is getting admins to stop relying on admin who,sqm
Example DSI SQT cache: The mythical full cache
We have a 1GB DSI SQT cache and it is fully used – likely messages in errorlog about being full or admin
who,sqt will show filled=1 which may cause DBA to consider adding more cache…..but….
…(cont) In reality, we are simply buffering transactions in the DSI SQT cache because the DSIEXEC can’t
replicate them any faster to the RDB – most likely due to slow RDB execution. In this case ReadTrans is high
due to the large transaction groups used by HVAR
DSI Transaction grouping (w/ DSIHQ enabled)
By comparing the various ungrouped vs. grouped transaction counters, we can derive an effective
dsi_max_xacts_in_group setting. Normally, it would be about 20. However, with HVAR active we can easily
bypass this restriction. All is not well as the commits vs. sent are out of whack suggesting retries or other issues
DSI transaction profiler
The key is to look at the DSI group closure counters and see if….
• Closed prematurely due to silly default configs
Obvious fix – increase the configuration setting
• Closed prematurely due to interspersed transactions from multiple sources
If this is the case, consider using Multiple DSI’s – one for each source
Derived counters (see the sketch below):
• dsi_rate = DSICmdsRead/<seconds>
• Effective dsi_max_xacts_in_group = DSITransUngroupedSent/DSITranGroupsSent
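And the DSI equivalents as a sketch, again with invented sample values:

# Invented sample values.
dsi_cmds_read = 300_000     # DSICmdsRead over the sample
sample_seconds = 600
ungrouped_sent = 45_000     # DSITransUngroupedSent
groups_sent = 2_500         # DSITranGroupsSent

print(f"dsi_rate: {dsi_cmds_read / sample_seconds:.0f} cmds/sec")
print(f"effective dsi_max_xacts_in_group: {ungrouped_sent / groups_sent:.1f}")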
Comments
• DSICmdsRead refers to the DSI-Scheduler aspect
The other modules (such as SQMR and SQT) have counters for CmdsRead as well
SQMR & SQT CmdsRead may be higher until the SQT cache is full
DSICmdsRead may be slightly artificially high at first due to DSIEXEC batching, but should quickly drop down to the same rate as DSIEXEC processing
At a stable point, DSICmdsRead is the effective throughput of the DSI/DSIEXEC pipeline - which is most often controlled by the speed of the replicate database in processing transactions or by network overhead.
• The READ, SENT, Succeeded, Committed DSITransGroup differences
READ refers to the initial grouping as constructed by the DSI-S
SENT could include retries due to failures
Succeeded is groups that finished but didn't FAIL (includes retries)
COMMIT is those actually committed without a retry
Observations:
– READ is likely higher than COMMIT due to some still in cache (not yet sent)
– SENT = Committed + DSIAttemptsTranRetry + DSITransFailed
These are mainly ignorable except for certain debugging situations – listed here
mainly for completeness
GroupsClosedTrans is normal
• Except for DSIHQ
GroupsCloseNoneOrig can be a bad thing
• If low volume, this just indicates that the previous group was already sent and there weren’t any groups open
at this time to append to…..this is okay.
• If mid-high volume, this likely indicates that the previous transaction was from a different source and MDSI
likely should be used to reduce/eliminate latency
For DSIHQ, verify with HQ counters
5050 PartitioningWaits Transaction groups forced to wait for another group to complete (processed serially based on Transaction Partitioning rule).
5051 UserRuleMatchGroup Times Transaction Partitioning rule USER was checked and found to be 'parallel' for GROUPING decision.
5052 UserRuleMatchDist Times Transaction Partitioning rule USER was checked and found to be 'serial' for DISTRIBUTION decision.
5053 TimeRuleMatchGroup Times Transaction Partitioning rule TIME was checked and found to be 'parallel' for GROUPING decision.
5054 TimeRuleMatchDist Times Transaction Partitioning rule TIME was checked and found to be 'serial' for DISTRIBUTION decision.
5055 NameRuleMatchGroup Times Transaction Partitioning rule NAME was checked and found to be 'parallel' for GROUPING decision.
5056 NameRuleMatchDist Times Transaction Partitioning rule NAME was checked and found to be 'serial' for DISTRIBUTION decision.
5057 AllThreadsInUse This counter is incremented each time a Parallel Transaction must wait because there are no available parallel DSI threads.
5058 AllLargeThreadsInUse This counter is incremented each time a Large Parallel Transaction must wait because there are no available parallel DSI threads.
5059 ExecsCheckThrdLock Invocations of rs_dsi_check_thread_lock by a DSI thread. This function checks for locks held by a transaction that may cause a deadlock.
5060 TrueCheckThrdLock Number of rs_dsi_check_thread_lock invocations returning true. The function determined the calling thread holds locks required by other threads. A rollback and retry occurred.
5062 CommitChecksExceeded Number of times transactions exceeded the maximum allowed executions of rs_dsi_check_thread_lock specified by parameter dsi_commit_check_locks_max. A rollback occurred.
5064 CmdGroupsRollback Command groups rolled back successfully by a DSI thread.
5066 RollbacksInCmdGroup Transactions in groups sent by a DSI thread that rolled back successfully.
5072 OriginRuleMatchGroup Times Transaction Partitioning rule ORIGIN was checked and found to be 'parallel' for GROUPING decision.
5073 OriginRuleMatchDist Times Transaction Partitioning rule ORIGIN was checked and found to be 'serial' for DISTRIBUTION decision.
5074 OSessIDRuleMatchGroup Times Transaction Partitioning rule ORIGIN_SESSID was checked and found to be 'parallel' for GROUPING decision.
5075 OSessIDRuleMatchDist Times Transaction Partitioning rule ORIGIN_SESSID was checked and found to be 'serial' for DISTRIBUTION decision.
5076 IgOrigRuleMatchGroup Times Transaction Partitioning rule IGNORE_ORIGIN was checked and found to be 'parallel' for GROUPING decision.
5077 IgOrigRuleMatchDist Times Transaction Partitioning rule IGNORE_ORIGIN was checked and found to be 'parallel' for DISTRIBUTION decision.
5087 DSIPutToSleep Number of DSI/E threads put to sleep by the DSI/S prior to loading SQT cache. These DSI/E threads have just completed their transaction.
5088 DSIPutToSleepTime Time spent by the DSI/S putting free DSI/E threads to sleep.
5092 DSIThrdRdyMsg ''Thread Ready'' messages received by a DSI/S thread from its associated DSI/E threads.
5093 DSIThrdCmmtMsgTime Time spent by the DSI/S handling a ''Thread Commit'' message from its associated DSI/E threads.
5095 DSIThrdSRlbkMsgTime Time spent by the DSI/S handling a ''Thread Single Rollback'' message from its associated DSI/E threads.
5097 DSIThrdRlbkMsgTime Time spent by the DSI/S handling a ''Thread Rollback'' message from its associated DSI/E threads.
5100 DSINoDsqlDatatype Number of commands that cannot use dynamic SQL statements because of TEXT, IMAGE, JAVA and ineligible UDDs.
5101 DSINoDsqlRepdef Number of commands excluded from dynamic SQL by replication definition.
5102 DSINoDsqlColumnCount Number of commands excluded from dynamic SQL because the number of parameters would exceed 255.
5104 DSINoBulkDatatype Number of bulk operations skipped because the tables have datatypes incompatible with bulk.
5105 DSINoBulkFstr Number of bulk operations skipped because the tables have customized function strings for rs_insert or rs_writetext.
5106 DSINoBulkAutoc Number of bulk operations skipped because the tables have autocorrection turned on.
5107 DSINoDsqlMinColNoRepdef Number of commands excluded from dynamic SQL because minimal columns is on for the update and at least some columns of the table are not in the repdef.
5108 DSIPendingTimeOut Number of times DSI timed out waiting for the next batch of commands while previous batch results are pending.
[Diagram: HVAR apply (RS 15.7.1+) - (1) in-memory consolidated net changes; (2) bulk load of net changes - inserts go to prod tables, updates and deletes go via temp tables]
Derived counters (see the sketch below)
• Compile % = (HQCmdsCompiled *100.0)/DSICmdsRead
• Reduced % = (HQCmdsReduced*100.0)/ HQCmdsCompiled
• Language % = (HQLangCmds*100.0)/ HQCmdsCompiled
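The three DSIHQ percentages as a sketch, with the rules of thumb from the Comments that follow encoded alongside (sample values invented):

# Invented sample values; thresholds follow the rules of thumb in the Comments.
dsi_cmds_read = 200_000        # DSICmdsRead
hq_cmds_compiled = 198_000     # HQCmdsCompiled
hq_cmds_reduced = 3_000        # HQCmdsReduced
hq_lang_cmds = 1_500           # HQLangCmds

print(f"Compile%:  {hq_cmds_compiled * 100.0 / dsi_cmds_read:.1f} (expect ~100)")
print(f"Reduced%:  {hq_cmds_reduced * 100.0 / hq_cmds_compiled:.1f} (usually <<10)")
print(f"Language%: {hq_lang_cmds * 100.0 / hq_cmds_compiled:.1f} (want <2, ideally <1)")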
Comments
• Typically, unless disabled for key tables, Compile% will be very high (~100%)
• Usually, Reduced% will be very low – most often <<10% and frequently <2%
So don’t expect huge gains here except on certain key tables (sequential key tables, queues, etc.)
• Ideally, Language% should be very low – hopefully <2% and better if <1%
If there is a constant dribble of language commands, the time spent processing these in the RDB may exceed the bulk
commands….the “constant dribble of death”
Large numbers of language commands suggest dsi_bulk_threshold is too high or a lot of tables with small numbers of changes.
Note that language commands are truly sent via SQL language text – DSQL is not used even if enabled
– Rationale is that one of the most common failures is due to datatype translation in bulk/DSQL vs. implicit with SQL language
• Otherwise mainly focus on compilation failure reasons, bulk sizes, execution time and retries.
Trying to improve the command reduction is a waste of time and effort
DSIHQ transaction closure and noncompilation reasons
Counter_id Display_name Description
5118 HQGroupsClosedNoneOrig Description is too long. Authors Note: This is actually the description in the RSSD. The real definition is that the transactions are from different origins and can't be grouped; MDSI should be used to avoid the problem.
5119 HQGroupsClosedCmd HQ Transaction groups closed by a DSI thread due to the next transaction satisfying the criteria of being large (number of cmds). Authors Note: This cryptic description simply means that the number of commands in the CDB would have exceeded dsi_compile_max_cmds. If less than 50,000, consider increasing.
5120 HQGroupsClosedSize HQ Transaction groups closed by a DSI thread due to the next transaction satisfying the criteria of being large (CDB size). Authors Note: This cryptic description simply means that the transaction data size would have exceeded dsi_cdb_max_size. If less than 2048 (2GB), consider increasing.
5121 TranNonHQForTPF Number of transactions declared non-compilable by transaction profiling processing. Transactions contain commands other than insert, update or delete.
5122 TranNonHQForNoSize Number of transactions declared non-compilable because their size is unknown and the incremental compilation mechanism is deactivated.
5123 TranNonHQForTooBig Number of transactions declared non-compilable because their size is too large and the incremental compilation mechanism is deactivated.
5124 HQGroupsClosedSQTSize HQ Transaction groups closed by a DSI thread due to the limitation of the max SQT cache size. Authors Note: DSI SQT cache size kept us from compiling more….should be increased.
5125 GroupsClosedSQTSize Transaction groups closed by a DSI thread due to the limitation of the max SQT cache size. Authors Note: Non-DSIHQ, but similar - e.g. 20 10MB txns and only 128MB in DSI SQT cache.
5126 GroupsClosedDispatch Transaction groups closed by a DSI thread because there is a free DSI/E ready to accept the group. Authors Note: This is how RS avoids excess latency with HVAR…if a DSIEXEC is ready, it sends the group rather than waiting forever to hit one of the limits.
5127 GroupsClosedDispatch Transaction groups closed by a DSI thread due to the DSI/S switching to a different SQT cache. Authors Note: happens on materialization, DDL, large transactions, or system transactions.
5128 GroupsClosedDispatch Transaction groups closed by a DSI thread due to an rs_update_lastcommit command in the group.
We are well below the dsi_cdb_max_size and dsi_compile_max_commands ….and DSI SQT cache is not a limiting
factor. However, we did have ~18 transactions from other source or proc – we can determine when HQNoneOrig is
tripped by another source/proc vs. by simply none open from same source by looking at Dispatch
DSIHQ time & bulk size counters
Interval LangCmds Bulk <100 100 -> 500 500 -> 1K 1K -> 5K 5K -> 10K 10K -> UP
------------------------------ ---------- ---------- ---------- ---------- ---------- ---------- ----------
(1) 11:31:17 -> 11:46:48 129 3391 37066 5273 50818 0 0
(2) 11:46:49 -> 11:55:04 146 4034 45489 3858 59357 0 0
(3) 11:55:05 -> 12:03:24 137 4120 41503 5254 54169 0 0
(4) 12:03:25 -> 12:11:42 112 3153 39600 8154 57050 0 0
(5) 12:11:43 -> 12:20:00 168 3839 39566 10133 60283 0 0
(6) 12:20:01 -> 12:28:23 160 3791 44468 0 54929 0 0
(7) 12:28:24 -> 12:36:40 146 3963 43336 1186 54457 0 0
(8) 12:36:41 -> 12:44:58 157 3320 37769 4271 50456 0 0
(9) 12:44:59 -> 12:53:15 184 3694 39956 1701 50981 0 0
(10) 12:53:16 -> 13:01:30 140 5284 40347 589 52295 0 0
(11) 13:01:31 -> 13:09:46 170 3695 41939 2726 52648 0 0
(12) 13:09:47 -> 13:18:01 186 4484 39323 573 49764 0 0
(13) 13:18:02 -> 13:26:18 176 6639 38132 4770 46230 0 0
(14) 13:26:19 -> 13:34:35 219 8369 36225 8260 42244 0 0
(15) 13:34:36 -> 13:42:51 193 8053 35747 10058 39641 0 0
------------------------------ ---------- ---------- ---------- ---------- ---------- ---------- ----------
2423 69829 600466 66806 775322 0 0
We aren’t hitting any config limits and all the group closures are due to NoneOrig……we can verify all from same source,
but if we look at SQMR counters in this case, we notice no backlog, so the smaller sizes are simply due to the fact we
really aren’t pushing things that hard and don’t have any latency (hence NoneOrig’s other cause – no latency)
Notice the transactions per group….it is ~100-200x the config…this is good. If we were only 10x or less of the config, it
would point to grouping problems
DSI Bulk Inserts (non-ASO) ECx 15.1+ ASE, ECO, ECH, RTL/IQ
SQLDML ? 15.2+
Direct load subscription ? HANA 15.7.1 SP100+ Tested, but not QA’d for ASE or non-HANA
targets, but works from all sources
[Diagram: DSI → DSI message queue → DSIEXEC(s) → RDS.RDB]
Txn Begin
• DSIEXEC reads from the DSI message queue which transaction group it is supposed to work on
• Once the first batch of SQL has been sent to RDB, it responds with a ‘begin’ message so that DSI can coordinate
other DSI’s if parallel DSI’s are used (e.g. think wait_for_start)
• If not using parallel DSI, this is minimal time
Batch
• DSIEXEC groups multiple statements together to send in small multi-statement chunks to the RDB for network
efficiency
• If there is a significant amount of time spent in batching, it could be due to any of the following reasons:
Premature batch flushing due to misconfiguration of system or batching is disabled
Batch SQL is not being used (e.g. Dynamic SQL uses RPC mechanism which cannot batch SQL)
A bulk insert operation is being used (bulk operations use a slightly different path vs. the normal one)
Send
• This is the time spent sending the data to the RDB – not execution time
• Should be very low unless a lot of text/image data or slow network
Results
• This is the time spent waiting for execution to complete and the typical ct_results() loop for each statement in
the batch
• This often will be the largest block of time
• See comments on next page as to how to reduce
Txn Commit
• This is the typical ‘commit tran’ and post-commit internal SRS cleanup
• Should not take much time unless parallel DSI’s are being used
Commit sequencing time may be fairly high in such cases.
This is not a problem unique to SRS – afflicts all applications at any DBMS
Target is ASE 12.5.4 (or a heterogeneous DBMS) or statement cache cannot be used at all
• Note that a login trigger can be used to disable statement cache for all connections other than SRS, or to enable literal auto parameterization for SRS connections/sessions only
• Enable dynamic SQL in SRS for that connection
dynamic_sql 'on' (default is off)
• Tune for a fairly large number of statements
dynamic_sql_cache_management ‘mru’ (default is ‘fixed’)
dynamic_sql_cache_size 1000 (default is 100)
• Monitor via RS MC
Target is ASE 15.x+
• Use statement cache with literal auto parameterization
Use a login trigger to enable literal parameterization for SRS sessions if not able to use for entire server (e.g. SAP apps)
• This leverages SQL batching for network efficiency and is quite a bit faster than DSQL (up to 50%)
Impact on replication
• ASE replication agent often may forward whole numbers as literals (“1”) vs. numerics(“1.0”)
• DSIEXEC doesn’t convert data into SQL language text based on datatype but based on literal values
• Result is often a connection may cause a lot of extraneous statements in the statement cache
Work-around/resolution
• Use RS dynamic SQL
• Decrease dsi_bulk_threshold to force more bulk inserts/HVAR consideration
Commit Blocking
• When a transaction in a DBMS issues a commit, it typically blocks until all the changes are recorded in the
transaction log
• To ensure full ACID, this requires that the physical IO’s to write the changes complete successfully prior to returning
control to the client
• This can have extreme performance penalties
In ASE, most of the contention on the log semaphore is due to each pending commit waiting on the current transaction’s physical
writes.
For short (e.g. atomic) transactions, most of the time is spent waiting on log writes
monSysWaits/monProcessWaits WaitEventID’s 54 (semaphore) & 55 (last log page)
Non-Blocking Commits
• Both Sybase and Oracle support a form of non-blocking commit in which the commit proceeds as soon as the writes are scheduled (but before they are confirmed)
• Sybase ASE 15.0+
set delayed_commit { on | off }
Also available as database option via sp_dboption
• Oracle 10gv2+
alter session set commit_write = { nowait | immediate }
Benchmark Configuration:
• Sun RISC - 4 1592MHz CPUs, 24GB memory
• Transaction profile: TPC-C
Results:
• Without non-blocking commit feature: 16,574 sec
• With non-blocking commit feature: 12,727 sec
• Improvement of 30.2%
Comments:
• Feature will demonstrate better performance on slower devices
DBMS log on RAID 5 or RAID 6
• Feature will also improve performance with smaller transactions
either due to transaction grouping rules
or if transaction grouping disabled
Prior to RS 15.7
• Table repdefs were required even in standby implementations (both WS & MSA) to
Identify primary key columns
Identify quoted identifier columns
• Lack of repdefs often resulted in:
Dismal performance for updates & deletes as where clause was constructed using all non-BLOB columns. Replicate
DBMS had to compare & check all supplied columns not just primary key values (including long comment fields).
Database inconsistencies as approximate numeric (float/real) columns were included in where clause and different
hardware FPU processing resulted in slightly different values
Errors due to unexpected reserved words in SQL
RS 15.7
• ASE 15.7 now includes pkey & quoted identifier bits in column metadata.
• Requires primary key constraint or a unique index on tables in primary database.
Consider
• Once RepAgent User/Executor normalizes to a replication definition…
That repdef is associated with the command for the rest of processing
The lack of a repdef means there is no association with a repdef
If later you need to add a repdef (for function string manipulation), existing data in the queues will not be re-normalized to the
repdef
In other words, it may be too late to add a repdef when needed
• Best Practice: continue to create table repdefs
Use procs/scripts/PowerDesigner to avoid hand creation (en masse definition creation)
DSIEXEC overall
1. Where was time spent the most? What are the possible ways it could be reduced?
2. Where was the second most time spent? What are the possible ways it could be reduced?
3. Was time spent somewhere that you likely couldn’t affect much???
DSIEXEC configuration
4. Was dsi_batch_size set correctly?
5. How many commands per batch were being sent? Do you consider this effective?
6. At the packet size, how many packets would it take?
Pre-15.7 RepAgent
• Each database repagent ran as a separate session (spid) in ASE
• Sequence
Scan from log to fill buffer
Once operations buffer filled, convert buffer from log records to RepAgent LTL
Once LTL buffer filled, send packets
Scan next log page(s)
Multiple scanners
• Syntax:
sp_config_rep_agent dbname, 'multiple_scanners', {'true' | 'false'}
• Builds on multiple senders/multi-threaded RepAgents
• Adds an additional coordination thread
• Additional RepAgent configuration (see the considerations below)
Additional configuration considerations
• sp_config_rep_agent dbname, 'trunc point request interval', '##'
This is necessary or otherwise the path with lowest change volume would drive truncation point location frequency and
could cause log to fill
• sp_configure 'replication agent memory size', ####
Determines the size of the global pool for RepAgent schema caches, etc.
• sp_config_rep_agent 'max schema cache per scanner', ####
Specifies how much schema cache is allocated per scanner
Default is 512KB
• sp_config_rep_agent 'multipath distribution model', {'object'|'connection'|'filter'}
ASE 15.7.1 Replication Filters
Replication Filters
• Creates a filter that can be used by multiple scanners to filter out log records that don't apply
• Syntax:
create replication filter filter_name on table_name as filter_clause
Filter clause can be any valid SQL expression
– Formula
– built-in function (e.g. hash), in(), like(), between…
• Filters are then bound to a replication path via sp_replication_path
Restrictions
• Filters can only be created on tables/procs in database filter is created in
• No complex expressions such as:
Joins
Subqueries
User-defined functions
create replication filter filter_in_sales on sales as total_sales between 4095 and 12000
create replication filter filter_out_sales on sales as total_sales not between 4095 and 12000
create replication filter state_out on sales as state not in ("CA", "IN", "MD")
Determine the degree of parallelism likely needed and configure the logical paths
• Configure the scanners/senders, etc.
• Add logical paths
Scenario 1: 4 huge tables we want to split by value (hashed) and 3000 other tables
• Large tables are bulk loaded – otherwise readonly
• We want to use 5 way parallelism on the huge tables
Transaction serialization on large tables irrelevant due to bulkload.
• Target DBMS doesn’t support dsi_bulk_copy or HVAR, so we need to use parallelism to keep up with SRS (see the filter sketch below)
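One possible way to get the 5-way split for one of the huge tables is a set of modulo filters on a numeric key, one per path (table, column, database and path names are hypothetical):
create replication filter bigtab_p0 on bigtab as (row_id % 5 = 0)
go
create replication filter bigtab_p1 on bigtab as (row_id % 5 = 1)
go
-- ...and likewise for p2-p4, then bind each filter to its own path:
sp_replication_path pdb, 'bind', 'filter', 'bigtab_p0', 'path_0'
go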
Scenario 2: 700 concurrent users doing batch processing after market close
• We think we need ~6 parallel paths to keep latency to a minimum
Yet retain transaction serialization with respect to the same session.
…
Scenario 1: 4 large tables and bulk loads (cont)
Syntax:
-- Subscription IDS_aseserver1.IDS.SAPSR3.KEKO -> HAN_DB.SAP_ECC.SAP_ECC.KEKO
-- Table Replication Definition: IDS1_SAPSR3_KEKO_rd
create subscription IDS1_2_HAN_KEKO_sub
for IDS1_SAPSR3_KEKO_rd
with replicate at HAN_DB.SAP_ECC
without holdlock direct_load user sapsa password Sybase1234
subscribe to truncate table
go
The materialization user here (sapsa) could be SAPSR3 if you know that password
Checking subscription progress
When you create a subscription with direct load, SRS does the following
1. Creates a catch-up queue for inflight DMLs
2. Sends usual begin subscription marker to primary ASE
3. Opens a direct connection from the RRS to the primary database/RAX to select the data
4. As the data is retrieved, it splits the rows round-robin across parallel apply threads
Number of threads controlled by max_mat_load_threads
Each thread uses bulk insert to send data to replicate
Each thread commits every mat_max_tran_size rows
The apply threads are separate from normal DSI
– no need to suspend it as with normal subscriptions using atomic materialization
5. Any new transaction/DML that happens during materialization is held in catch-up queue
So…if a tran affects 3 tables (A, B, C) and we are materializing C
– A&B are applied as normal by the DSI
– Rows for C are sent to the side to the catch-up queue
6. When select completes, SRS starts to apply catch-up queue
Inserts are sent as delete/insert. Updates as normal (no D/I conversion)
As a result, updates on PKey will cause subscription to fail (row missing)
7. When SRS is nearly complete with catch-up queue, end marker is put into primary log
8. When end-marker is seen by SRS, it directs new DML to normal DSI queue
And then tears down the catch-up queue.
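Progress of the materialization itself can be watched with check subscription, using the names from the example above:
check subscription IDS1_2_HAN_KEKO_sub
for IDS1_SAPSR3_KEKO_rd
with replicate at HAN_DB.SAP_ECC
go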
[Diagram, four animation frames: direct_load materialization flow. A Select thread reads table C from the source DB and feeds max_mat_apply_threads parallel bulk_insert threads that load C into HANA, coordinated by a Monitor thread. Meanwhile, concurrent transactions (BT…Ins A/Ins B/Ins C…CT) arrive via the RSI User thread: DIST routes the commands for tables A and B through the outbound SQM/outbound queue to the normal DSI and DSI-EXEC, while the commands for C are diverted through the catch-up SQM into the catch-up queue, from which the catch-up DSI/DSI-EXEC applies them once the select completes.]
Recommendations
• Set num_concurrent_subs to 40+ (possibly 50)
• Increase num_threads by at least num_concurrent_subs x 10
• Set num_stable_queues to (connections x 2 + num_concurrent_subs + 10) or 50, whichever is greater
• Increase cm_max_connections by ~num_concurrent_subs x 5 (e.g. the default is 64 – set to 300)
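Expressed as RCL, the server-level pieces of this might look like the following (parameter names as given above; the values are illustrative):
configure replication server set num_concurrent_subs to '40'
go
configure replication server set num_threads to '400'
go
configure replication server set cm_max_connections to '300'
go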
Admin who with concurrent direct_load
Spid Name State Info
---- ---------- -------------------- ------------------------------------------------------------
28 DSI EXEC Awaiting Command 112(1) HAN_REP_RSSD.HAN_REP_RSSD
19 DSI Awaiting Message 112 HAN_REP_RSSD.HAN_REP_RSSD
23 DIST Awaiting Wakeup 112 HAN_REP_RSSD.HAN_REP_RSSD
27 SQT Awaiting Wakeup 112:1 DIST HAN_REP_RSSD.HAN_REP_RSSD
13 SQM Awaiting Message 112:1 HAN_REP_RSSD.HAN_REP_RSSD
11 SQM Awaiting Message 112:0 HAN_REP_RSSD.HAN_REP_RSSD
32 REP AGENT Awaiting Command HAN_REP_RSSD.HAN_REP_RSSD
109 DSI EXEC Awaiting Command 114(1) HAN_DB.SAP_ECC
108 DSI Active 114 HAN_DB.SAP_ECC
15 SQM Awaiting Message 114:0 HAN_DB.SAP_ECC
Normal DSI
68 DSI EXEC Awaiting Command 937(1) HAN_DB.rs_0100006780000627IDS1_2_HAN_
123 DSI Awaiting Message 937 HAN_DB.rs_0100006780000627IDS1_2_HAN_
126 SQM Awaiting Message 937:0 HAN_DB.rs_0100006780000627IDS1_2_HAN_
Catch-up DSI for Sub1
184 DSI EXEC Awaiting Command 938(1) HAN_DB.rs_0100006780000628IDS1_2_HAN_
94 DSI Awaiting Message 938 HAN_DB.rs_0100006780000628IDS1_2_HAN_
103 SQM Awaiting Message 938:0 HAN_DB.rs_0100006780000628IDS1_2_HAN_
Catch-up DSI for Sub2
17 RSI Awaiting Wakeup IDS_REP_aseserver1
12 SQM Awaiting Message 16777317:0 IDS_REP_aseserver1
18 RSI Awaiting Wakeup IDS_REP_aseserver2
14 SQM Awaiting Message 16777318:0 IDS_REP_aseserver2
31 RSI USER Active IDS_REP_aseserver1
29 RSI USER Awaiting Command IDS_REP_aseserver2
21 dSUB Sleeping
7 dCM Awaiting Message
9 dAIO Awaiting Message
24 dREC Sleeping dREC
10 dDELSEG Awaiting Message
202 USER Sleeping sa
187 USER Active sa
89 SUB Awaiting Wakeup IDS1_2_HAN_JEST_sub Sub3 waiting (post catch-up??)
6 dALARM Awaiting Wakeup
25 dSYSAM Sleeping
In all cases
• Make sure sp_dboption 'bulkcopy', true & sp_dboption 'trunc log on checkpoint', true
• Set target system sp_configure 'number of locks' to 5 million or so…
• In SRS, set num_concurrent_subs to 30+ but no higher than 50 or so.
If using allpage locking (APL)
• Set connection max_mat_load_threads to 1
• Load subscriptions from a large script….goal is highly concurrent bulkloads across multiple tables
If using datapage locking (DPL)
• Change the target table to datarows locking, materialize the subscription using the DRL settings below, then change the table back to datapage locking
If using datarows locking (DRL)
• Set connection max_mat_load_threads between 3 and 5
• Set num_concurrent_subs to 3 to 5 (this is dynamic)
• Load subscriptions from file….goal is a few large tables with high insert concurrency on each table
Run update index statistics as each batch of subscriptions completes
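A sketch of the DRL preparation, combining the settings above (server, database and table names are hypothetical):
-- on the target ASE:
sp_dboption rdb, 'bulkcopy', true
go
alter table bigtab lock datarows
go
-- in SRS, allow parallel apply threads for direct_load on this connection:
suspend connection to RDS.rdb
go
alter connection to RDS.rdb set max_mat_load_threads to '5'
go
resume connection to RDS.rdb
go
-- on the target, after each batch of subscriptions completes:
update index statistics bigtab
go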
Operation Processor
• This thread tracks the transactional context (which records belong to which transaction)
• Determines from the log records whether an operation is to be replicated, and which operation it is
E.g. if update was logged as a delete/insert pair, the operation processor creates a single update operation
• Operations are stored in the operation queue
Sender Thread
• Copies operations from the operation queue and writes them to the LTL Formatter’s unformatted queue in the LTI
buffer
• Performs necessary processing for DDL, LOB (text/image) and other special use cases
LTL Formatter
• Converts the unformatted operations into LTL formatted records understood by SRS
RepServer Interface
• Sends the records to Replication Server in batches (for efficiency)
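These threads can be observed from ASE with sp_help_rep_agent (a sketch; the option names are assumed from the ASE documentation, and the database name is hypothetical):
sp_help_rep_agent pdb, 'scan'
go
sp_help_rep_agent pdb, 'config'
go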
[Diagram: multi-threaded RepAgent internals. The Log Scanner walks the transaction log page chain (page 100 ts 1000 → page 101 → … → page n) into the Scan Buffer of log records. The Operation Processor keeps one TransactionContext per transaction and emits Operations into the Operation Queue; from there, unformatted commands (aka Change Sets) flow through the LTI Queue to multiple LTL Formatters, which place LTL (formatted) commands into the LTL Buffer for the RepServer Interface to send.]
http://saptechedhandson.sap.com/ http://sapteched.com/online