Академический Документы
Профессиональный Документы
Культура Документы
juliandyke.com
About me...
20 years Oracle experience as DBA, developer and consultant Independent Consultant specializing in Kernel Performance Tuning RAC and High Availability Chair of UKOUG RAC & HA SIG Regular presenter at conferences, seminars and user group meetings in UK, Europe and USA Member of Oak Table Network Website http://www.juliandyke.com specializing in Oracle internals
2
juliandyke.com
juliandyke.com
juliandyke.com
Agenda
juliandyke.com
juliandyke.com
Interconnect Overview
Instances communicate with each other over the interconnect (network) Information transferred between instances includes data blocks locks SCNs Typically 1Gb Ethernet UDP protocol Often teamed in pairs to avoid SPOFs Can also use Infiniband Fewer levels in stack Other proprietary protocols are available
7
juliandyke.com
juliandyke.com
juliandyke.com
Clusterware
TCP
UDP
RAC
IP
Ethernet
Physical Layer
10
juliandyke.com
Interconnect Encapsulation
Data UDP Header IP Header Ethernet Header IP Header UDP Header UDP Header
Data
Data
14 bytes
20 bytes
8 bytes
4 bytes
MTU Size
11
juliandyke.com
Node 1 Node 2
Node 3 Node 4
Outgoing Incoming 12
juliandyke.com
Packets per hour 14,400 28,800 43,200 57,600 72,000 86,400 100,800 216,000 446,400
juliandyke.com
14
juliandyke.com
juliandyke.com
Instance 1
Instance 2
Redo Logs
juliandyke.com
juliandyke.com
58 is real time; 75 or 76 is time share You can also check process scheduling policies using chrt
oracle@server3 ~]$ chrt -p 8601 Time pid 8601's current scheduling policy: SCHED_RR pid 8601's current scheduling priority: 1 [oracle@server3 ~]$ chrt -p 8596 Share pid 8596's current scheduling policy: SCHED_OTHER pid 8596's current scheduling priority: 0 18 # lms0 - Real
# lmon - Time
juliandyke.com
juliandyke.com
20
juliandyke.com
21
juliandyke.com
22
juliandyke.com
23
juliandyke.com
Instance 2
2 Request granted
1318
24
STOP
juliandyke.com
1318
Instance 2
Instance 3
N X
1320
Instance 1 Instance 1 requests exclusive read on block
1318
25
STOP
juliandyke.com
1318
Instance 2
N X N
N X
1320
3 Block and resource status Instance 1 Instance 4 requests exclusive read on block
1323
Instance 4
1318
Note that Instance 1 will create a past image (PI) of the dirty block
26
STOP
juliandyke.com
Instance 2
4 Resource status
Instance 3
N X N
1320
Instance 1 Instance 2 requests current read on block
1323
Instance 4
1318
In Oracle 8.1.5 and above _fairness_threshold is used to avoid unnecessary lock conversions
27
STOP
juliandyke.com
Instance 2
4 Resource status
Instance 3
N X N
1320
Instance 1 Instance 2 requests current read on block
1323
Instance 4
1318
In Oracle 8.1.5 and above _fairness_threshold is used to avoid unnecessary lock conversions
28
STOP
juliandyke.com
29
juliandyke.com
1,42 2,50
RAC1 2
1,40 1318 2,44
30
juliandyke.com
No statistics so dynamic sampling required No indexes so full table scan required Steps are:
Dynamic Sampling 3-way 2-way 2-way 3-way 2-way Current Read 3-way
31
Consistent Read Consistent Read Consistent Read Consistent Read Consistent Read Current Read
Table block 15 Undo block 89 Undo block 239 Table block 15 Undo block 89 Table block 15
Consistent Read
juliandyke.com
juliandyke.com
33
juliandyke.com
34
juliandyke.com
3
1,42 2,44
5 RAC3
10
35
juliandyke.com
36
juliandyke.com
Resource Master 3 4 5 6 7 8
RAC2 1
RAC3 2
1,40 2,44
RAC1
1,40 1318 2,44
37
STOP
juliandyke.com
Destination RAC2 - LMS1 RAC4 - Server RAC3 - LMS1 RAC2 - LMS1 RAC4 - Server RAC4 - Server RAC4 - Server RAC4 - Server RAC4 - Server RAC4 - Server RAC2 - LMS1 RAC4 - LMS1
Description Request file 8 block 15 OK Send file 8 block 15 to RAC4 OK Block file 8 block 15 part 1 Block file 8 block 15 part 2 Block file 8 block 15 part 3 Block file 8 block 15 part 4 Block file 8 block 15 part 5 Block file 8 block 15 part 6 Received file 8 block 15 OK
Bytes 456 212 480 212 1500 1500 1500 1500 1500 868 244 212
juliandyke.com
3
1,42 2,44
RAC3
RAC4
1,40 1318 2,44
RAC3 saves past image of the dirty block until RAC4 writes the block to disk
39
STOP
juliandyke.com
40
juliandyke.com
7128 7129
Instance 1
Instance 2
7128
7129
7129 7123
Redo Log 1
41
STOP
BlockUndo/Redoappliedchanges DBWR hasis updatedcolumn a Instance 42updates perform to Instance 1subsequentlybuffer Block 42 1table t1 contains to Block 422 1 1is2 writtenfrom Assumeis needs recovery Instance notmust column Undo/redoupdated in from GCS transferswritten to 42 Block updates block Instance makes in buffer Undo/Redo Crashes 42 is Undo/redo block written Instance written Undo/Redo written to Block 42 is read from to disk ContentsPastdisk Instance 2 lost to recovery1for by block yet Instance 42cachePast Image Instance cachecache 42 backof 1 uses block are blockRedo Logto disk single row Log 1 a tobuffer 1 Redo in 2 Image DBWR 7127 7126 2 7124 1 1329 7128 7125 back to Log Redo
Redo Log 2
juliandyke.com
42
juliandyke.com
RAC2 1
RAC3 2
1,40 2,44
RAC1
1,40 1318 2,44
43
STOP
juliandyke.com
44
juliandyke.com
RAC2 1
RAC3 2
1,40 1,40 1,40 2,44 1,40 2,44 1,40 2,44 2,44 2,44
RAC1
1,40 1,40 1,40 2,44 1318 1,40 2,44 1,40 2,44 2,44 2,44
45
STOP
juliandyke.com
This trace can be misleading because: the gc cr multi block request specifies the LAST block in the range the gc cr multi block request does not specify how many blocks should be read the gc cr multi block request does not specify how many blocks have been returned from another instance
46
juliandyke.com
juliandyke.com
juliandyke.com
49
juliandyke.com
50
juliandyke.com
Buffer Header
Buffer Header
Buffer Header
Buffer Header
51
juliandyke.com
GCS Client
LE KJBL
LE KJBL KJBL
GCS Shadow
GCS Shadow describes blocks held by other instances, but mastered locally
52
juliandyke.com
juliandyke.com
54
juliandyke.com
55
juliandyke.com
For example
[0x137][0x40000][BL]
is file# 4, block# 311 Ordering by X$KJBR.KJBRNAME is difficult because the resource names do not collate when sorted e.g.:
[0x12E][0x40000][BL] [0x12F][0x40000][BL] [0x13][0x40000][BL] [0x130][0x40000][BL] [0x131][0x40000][BL] etc...
56
juliandyke.com
juliandyke.com
juliandyke.com
juliandyke.com
juliandyke.com
61
juliandyke.com
All blocks now mastered by the current instance To redistribute masters to all available instances use:
ORADEBUG LKDEBUG -m dpkey 52084
juliandyke.com
Object ID 52084
Current Master 0
Current Master 1
Previous Master 0
juliandyke.com
64
juliandyke.com
juliandyke.com
66
juliandyke.com
juliandyke.com
juliandyke.com
69
juliandyke.com
juliandyke.com
Destination RAC1-LMS0 RAC4-LMS0 RAC2-LMS0 RAC4-LMS0 RAC3-LMS0 RAC4-LMS0 RAC4-LMS0 RAC1-LMS0 RAC4-LMS0 RAC2-LMS0 RAC4-LMS0 RAC3-LMS0
Description Send current SCN OK Send current SCN OK Send current SCN OK Send current SCN OK Send current SCN OK Send current SCN OK
Bytes 192 212 192 212 192 212 192 212 192 212 192 212
juliandyke.com
72
juliandyke.com
73
juliandyke.com
74
juliandyke.com
75
juliandyke.com
juliandyke.com
2 110 3 110
109
108 Buffer Cache Redo Buffer RAC1 1 108 Buffer Cache Redo Buffer RAC2
Redo Log
77
STOP
juliandyke.com
3 110 110
109
108 Buffer Cache Redo Buffer RAC1 1 110 4 2 Buffer Cache Redo Buffer RAC2
Redo Log
78
STOP
juliandyke.com
4 5
109
108 Buffer Cache Redo Buffer RAC1 2 108 Buffer Cache Redo Buffer RAC2
Redo Log
79
STOP
juliandyke.com
2 108 109 110 5 7 6 8 108 Buffer Cache Redo Buffer RAC1 3 110 4 1 Buffer Cache Redo Buffer RAC2 110
109
Redo Log
80
STOP
juliandyke.com
or in /etc/sysconfig/ifcfg-eth<x>
MTU=9000
81
juliandyke.com
MTU=9000
Frame# 1 Total 82 Ethernet Header 14 14 IP Header 20 20 UDP Header 8 8 Data 8200 8200 Ethernet Trailer 4 4 Total 8246 8246
juliandyke.com
83
juliandyke.com
Any questions?
info@juliandyke.com
84
juliandyke.com