Aries

Review: The ACID properties
A tomicity: All actions in the Xact happen, or none happen.

ARIES: Logging and C onsistency: If each Xact is consistent, and the DB starts
consistent, it ends up consistent.
Recovery
I solation: Execution of one Xact is isolated from that of other
Slides derived from Joe Hellerstein; Xacts.
Updated by A. Fekete D urability: If a Xact commits, its effects persist.
If you are going to be in the
logging business, one of the
The Recovery Manager guarantees Atomicity & Durability.
things that you have to do is to
learn about heavy equipment.
- Robert VanNatta,
Logging History of
Columbia County
Motivation Intended Functionality

Atomicity:
Transactions may abort (Rollback). At any time, each data item contains the value
produced by the most recent update done by a
Durability:
transaction that committed
What if DBMS stops running? (Causes?)
Desired Behavior after

system restarts: crash!
T1
T1, T2 & T3 should be T2
durable. T3
T4 & T5 should be T4
aborted (effects not seen). T5
Assumptions Challenge: REDO
Action Buffer Disk
Essential concurrency control is in effect. Initially 0
For read/write items: Write locks taken and held T1 writes 1 1 0
till commit T1 commits 1 0
Eg Strict 2PL, but read locks not important for recovery. CRASH 0
For more general types: operations of concurrent
transactions commute Need to restore value 1 to item
Updates are happening in place. Last value written by a committed transaction
i.e. data is overwritten on (deleted from) its
location.
Unlike multiversion approaches
Buffer in volatile memory; data persists on
disk
Challenge: UNDO Handling the Buffer Pool

Action Buffer Disk
Can you think of a simple scheme
Initially 0
to guarantee Atomicity & No Steal Steal
T1 writes 1 1 0
Durability?
Page flushed 1
Force Trivial
CRASH 1
Force write to disk at commit?
Need to restore value 0 to item Poor response time.
Last value from a committed transaction But provides durability. No Force Desired
No Steal of buffer-pool frames
from uncommited Xacts (pin)?
Poor throughput.
But easily ensure atomicity
More on Steal and Force Basic Idea: Logging
STEAL (why enforcing Atomicity is hard)
To steal frame F: Current page in F (say P) is Record REDO and UNDO information, for every
written to disk; some Xact holds lock on P. update, in a log.
What if the Xact with the lock on P aborts? Sequential writes to log (put it on a separate disk).
Must remember the old value of P at steal time (to Minimal info (diff) written to log, so multiple updates
support UNDOing the write to page P). fit in a single log page.
NO FORCE (why enforcing Durability is hard) Log: An ordered list of REDO/UNDO actions
What if system crashes before a modified page is Log record contains:
written to disk? <XID, pageID, offset, length, old data, new data>
Write as little as possible, in a convenient place, at and additional control info (which well see soon)
commit time,to support REDOing modifications.
For abstract types, have operation(args) instead of
old value new value.
Write-Ahead Logging (WAL) ARIES

The Write-Ahead Logging Protocol:
Exactly how is logging (and recovery!) done?
1. Must force the log record for an update before
Many approaches (traditional ones used in
the corresponding data page gets to disk. relational systems of 1980s);
2. Must write all log records for a Xact before ARIES algorithms developed by IBM used many of
commit. the same ideas, and some novelties that were
#1 (undo rule) allows system to have quite radical at the time
Atomicity. Research report in 1989; conference paper on an
#2 (redo rule) allows system to have extension in 1989; comprehensive journal publication in
Durability. 1992
10 Year VLDB Award 1999
DB RAM
Key ideas of ARIES WAL & the Log LSNs pageLSNs flushedLSN
Each log record has a unique

Log every change (even undos during txn Log Sequence Number (LSN). Log records
abort) LSNs always increasing. flushed to disk
In restart, first repeat history without Each data page contains a

backtracking pageLSN.
Even redo the actions of loser transactions The LSN of the most recent log
Then undo actions of losers record
for an update to that page.
System keeps track of pageLSN Log tail
LSNs in pages used to coordinate state
flushedLSN. in RAM
between log, buffer, disk
The max LSN flushed so far.
Novel features of ARIES in italics
WAL constraints Log Records

Possible log record types:
Before a page is written, LogRecord fields: Update
pageLSN flushedLSN prevLSN Commit
Commit record included in log; all related XID Abort
update log records precede it in log type End (signifies end of commit
pageID or abort)
update length Compensation Log Records
records offset (CLRs)
only before-image for UNDO actions
after-image (and some other tricks!)
Other Log-Related State Normal Execution of an Xact
Transaction Table: Series of reads & writes, followed by commit or

One entry per active Xact. abort.
Contains XID, status (running/commited/aborted), We will assume that page write is atomic on disk.
and lastLSN. In practice, additional details to deal with non-atomic writes.
Dirty Page Table: Strict 2PL (at least for writes).
One entry per dirty page in buffer pool. STEAL, NO-FORCE buffer management, with Write-
Contains recLSN -- the LSN of the log record which Ahead Logging.
first caused the page to be dirty.
Checkpointing The Big Picture: Whats Stored Where

Periodically, the DBMS creates a checkpoint, in order to
minimize the time taken to recover in the event of a system LOG
crash. Write to log: DB RAM
begin_checkpoint record: Indicates when chkpt began.
LogRecords
end_checkpoint record: Contains current Xact table and dirty page Xact Table
prevLSN
table. This is a `fuzzy checkpoint: Data pages lastLSN
XID
Other Xacts continue to run; so these tables only known to reflect each status
type
some mix of state after the time of the begin_checkpoint record. with a
pageID
No attempt to force dirty pages to disk; effectiveness of checkpoint pageLSN Dirty Page Table
length
limited by oldest unwritten change to a dirty page. (So its a good idea recLSN
to periodically flush dirty pages to disk!) offset
before-image master record
Store LSN of chkpt record in a safe place (master record).
after-image flushedLSN
Simple Transaction Abort Abort, cont.
For now, consider an explicit abort of a Xact.
No crash involved. To perform UNDO, must have a lock on data!
No problem!
We want to play back the log in reverse
order, UNDOing updates. Before restoring old value of a page, write a CLR:
You continue logging while you UNDO!!
Get lastLSN of Xact from Xact table. CLR has one extra field: undonextLSN
Can follow chain of log records backward via the Points to the next LSN to undo (i.e. the prevLSN of the record were
prevLSN field. currently undoing).
Note: before starting UNDO, could write an Abort CLR contains REDO info
log record. CLRs never Undone
Undo neednt be idempotent (>1 UNDO wont happen)
Why bother?
But they might be Redone when repeating history (=1 UNDO
guaranteed)
At end of all UNDOs, write an end log record.
Transaction Commit Crash Recovery: Big Picture

Oldest log
Write commit record to log. rec. of Xact Start from a checkpoint (found
active at crash
All log records up to Xacts lastLSN are flushed. via master record).
Guarantees that flushedLSN lastLSN. Smallest Three phases. Need to:
recLSN in
Note that log flushes are sequential, synchronous dirty page Figure out which Xacts
writes to disk. table after committed since checkpoint,
Analysis
Many log records per log page. which failed (Analysis).
Make transaction visible REDO all actions.
Commit() returns, locks dropped, etc. Last chkpt (repeat history)
Write end record to log. UNDO effects of failed Xacts.

CRASH
A R U
Recovery: The Analysis Phase Recovery: The REDO Phase
We repeat History to reconstruct state at crash:
Reapply all updates (even of aborted Xacts!), redo
Reconstruct state at checkpoint.
CLRs.
via end_checkpoint record.
Scan forward from log rec containing smallest
Scan log forward from begin_checkpoint. recLSN in D.P.T. For each CLR or update log rec LSN,
End record: Remove Xact from Xact table. REDO the action unless page is already more
Other records: Add Xact to Xact table, set uptodate than this record:
lastLSN=LSN, change Xact status on commit. REDO when Affected page is in D.P.T., and has
Update record: If P not in Dirty Page Table, pageLSN (in DB) < LSN. [if page has recLSN > LSN no
Add P to D.P.T., set its recLSN=LSN. need to read page in from disk to check pageLSN]
To REDO an action:
This phase could be skipped; Reapply logged action.
information can be regained in REDO pass following Set pageLSN to LSN. No additional logging!
Invariant Recovery: The UNDO Phase
State of page P is the outcome of all changes Key idea: Similar to simple transaction abort,
of relevant log records whose LSN is <= for each loser transaction (that was in flight or
P.pageLSN aborted at time of crash)
During redo phase, every page P has Process each loser transactions log records
P.pageLSN >= redoLSN backwards; undoing each record in turn and
generating CLRs
Thus at end of redo pass, the database has a But: loser may include partial (or complete)
state that reflects exactly everything on the rollback actions
(stable) log Avoid to undo what was already undone
undoNextLSN field in each CLR equals prevLSN
field from the original action
UndoNextLSN Recovery: The UNDO Phase
ToUndo={ l | l a lastLSN of a loser Xact}
Repeat:
Choose largest LSN among ToUndo.
If this LSN is a CLR and undonextLSN==NULL
Write an End record for this Xact.
If this LSN is a CLR, and undonextLSN != NULL
Add undonextLSN to ToUndo
(Q: what happens to other CLRs?)
Else this LSN is an update. Undo the update,
write a CLR, add prevLSN to ToUndo.
Until ToUndo is empty.
From Mohan et al, TODS 17(1):94-162
Example of Recovery Example: Crash During Restart!

LSN LOG LSN LOG
00,05 begin_checkpoint, end_checkpoint
RAM 00 begin_checkpoint RAM 10 update: T1 writes P5
05 end_checkpoint 20 update T2 writes P3
undonextLSN
Xact Table 10 update: T1 writes P5 prevLSNs Xact Table 30 T1 abort
lastLSN 20 update T2 writes P3 lastLSN
40,45 CLR: Undo T1 LSN 10, T1 End
status status
30 T1 abort 50 update: T3 writes P1
Dirty Page Table Dirty Page Table
recLSN 40 CLR: Undo T1 LSN 10 recLSN 60 update: T2 writes P5
flushedLSN 45 T1 End flushedLSN CRASH, RESTART
50 update: T3 writes P1 70 CLR: Undo T2 LSN 60
ToUndo 60 update: T2 writes P5 ToUndo 80,85 CLR: Undo T3 LSN 50, T3 end
CRASH, RESTART CRASH, RESTART
90 CLR: Undo T2 LSN 20, T2 end
Additional Crash Issues Parallelism during restart
What happens if system crashes during
Analysis? During REDO? Activities on a given page must be processed
How do you limit the amount of work in REDO? in sequence
Flush asynchronously in the background. Activities on different pages can be done in
parallel
Watch hot spots!
How do you limit the amount of work in UNDO?
Avoid long-running Xacts.
Log record contents Physical logging

Describe the bits (optimization: only those that
What is actually stored in a log record, to allow change)
REDO and UNDO to occur?
Eg
Many choices, 3 main types
OLD STATE: 0x47A90E.
PHYSICAL
NEW STATE: 0x632F00
LOGICAL
So REDO: set to NEW; UNDO: set to OLD
PHYSIOLOGICAL
Or just delta (OLD XOR NEW)
DELTA: 0x24860E
So REDO=UNDO=xor with delta
Ponder: XOR is not idempotent, but redo and
undo must be; why is this OK?
Logical Logging Physiological Logging
Describe the operation and arguments Describe changes to a specified page, logically
Eg Update field 3 of record whose key is 37, by within that page
adding 32 Goes with common page layout, with records
We need a programmer supplied inverse indexed from a page header
operation to undo this Allows movement within the page (important
for records whose length varies over time)
Eg on page 298, replace record at index 17
from old state to new state
Eg on page 35, insert new record at index 20
ARIES logging Interactions
ARIES allows different log approaches; Recovery is designed with deep awareness of
common choice is: access methods (eg B-trees) and concurrency
Physiological REDO logging control
Independence of REDO (e.g. indexes & tables) And vice versa
Can have concurrent commutative logical operations like Need to handle failure during page split,
increment/decrement (escrow transactions) reobtaining locks for prepared transactions
Logical UNDO during recovery, etc
To allow for simple management of physical
structures that are invisible to users
CLR may act on different page than original action
To allow for escrow
Nested Top Actions Summary of Logging/Recovery
Trick to support physical operations you do not Recovery Manager guarantees Atomicity &
want to ever be undone Durability.
Example? Use WAL to allow STEAL/NO-FORCE w/o
Basic idea sacrificing correctness.
At end of the nested actions, write a dummy CLR LSNs identify log records; linked into
Nothing to REDO in this CLR backwards chains per transaction (via
Its UndoNextLSN points to the step before the prevLSN).
nested action. pageLSN allows comparison of data page and
log records.
Summary, Cont. Further reading
Checkpointing: A quick way to limit the Repeating History Beyond ARIES,

amount of log to scan on recovery. C. Mohan, Proc VLDB99
Recovery works in 3 phases: Reflections on the work 10 years later
Analysis: Forward from checkpoint. Model and Verification of a Data Manager
Redo: Forward from oldest recLSN. Based on ARIES
Undo: Backward from end to first LSN of oldest D. Kuo, ACM TODS 21(4):427-479
Xact alive at crash. Proof of a substantial subset
Upon Undo, write CLRs. A Survey of B-Tree Logging and Recovery
Redo repeats history: Simplifies the logic! Techniques
G. Graefe, ACM TODS 37(1), article 1

Aries

Загружено:

Сведения о документе

Авторское право

Доступные форматы

Поделиться этим документом

Поделиться или встроить документ

Параметры публикации

Этот документ был вам полезен?

Это неприемлемый материал?

Авторское право:

Доступные форматы

Aries

Загружено:

Авторское право:

Доступные форматы

Review: The ACID properties

A tomicity: All actions in the Xact happen, or none happen.

Motivation Intended Functionality

Desired Behavior after

Challenge: UNDO Handling the Buffer Pool

Write-Ahead Logging (WAL) ARIES

Each log record has a unique

In restart, first repeat history without Each data page contains a

WAL constraints Log Records

Transaction Table: Series of reads & writes, followed by commit or

Checkpointing The Big Picture: Whats Stored Where

Transaction Commit Crash Recovery: Big Picture

Write end record to log. UNDO effects of failed Xacts.

Invariant Recovery: The UNDO Phase

Example of Recovery Example: Crash During Restart!

Log record contents Physical logging

ARIES logging Interactions

Summary, Cont. Further reading

Checkpointing: A quick way to limit the Repeating History Beyond ARIES,

Вам также может понравиться