1. Lost Update Problem
Example:
Here,
o At time t2, Transaction-X reads A's value.
o At time t3, Transaction-Y reads A's value.
o At time t4, Transaction-X writes A's value on the basis of the value seen at time t2.
o At time t5, Transaction-Y writes A's value on the basis of the value seen at time t3.
o So at time t5, the update made by Transaction-X is lost, because Transaction-Y overwrites it
without ever looking at A's current value.
o This type of problem is known as the Lost Update Problem, as the update made by one
transaction is lost.
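The interleaving above can be sketched as straight-line code. The item A, the timestamps t2..t5, and the roles of the two transactions come from the example; the concrete amounts (a withdrawal of 50 by X, a deposit of 20 by Y) are illustrative assumptions.

```python
# Sketch of the lost-update interleaving described above.
# The amounts (withdraw 50, deposit 20) are illustrative assumptions.

def lost_update_demo():
    A = 100                      # committed value of data item A

    x_seen = A                   # t2: Transaction-X reads A
    y_seen = A                   # t3: Transaction-Y reads A
    A = x_seen - 50              # t4: X writes A based on the value seen at t2
    A = y_seen + 20              # t5: Y writes A based on the value seen at t3,
                                 #     overwriting X's update without seeing it
    return A                     # X's withdrawal of 50 is lost

print(lost_update_demo())        # 120, not the correct serial result of 70
```

Run serially, the correct result would be 100 - 50 + 20 = 70; the interleaving yields 120 because Y never sees X's write.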
2. Dirty Read
o A dirty read occurs when one transaction updates a database item and then fails for some
reason. The updated item is accessed by another transaction before it is changed back to its
original value.
o For example, a transaction T1 updates a record which is then read by T2. If T1 aborts, T2 now
holds values which have never formed part of the stable database.
Here,
o At time t2, Transaction-Y writes A's value.
o At time t3, Transaction-X reads A's value.
o At time t4, Transaction-Y rolls back, changing A's value back to what it was prior to t1.
o Transaction-X now holds a value which has never become part of the stable database.
o This type of problem is known as the Dirty Read Problem, as one transaction reads a dirty
value which has not been committed.
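The dirty-read interleaving can also be sketched in code. The item A and the t2..t4 ordering come from the example; the concrete values (100 committed, 150 uncommitted) are illustrative assumptions.

```python
# Sketch of the dirty-read interleaving: Y writes A, X reads the
# uncommitted value, then Y rolls back. Values are assumed for illustration.

def dirty_read_demo():
    A = 100                  # committed value before t1
    before = A               # saved so Y can roll back

    A = 150                  # t2: Transaction-Y writes A (uncommitted)
    x_seen = A               # t3: Transaction-X reads A -- a dirty read
    A = before               # t4: Transaction-Y rolls back

    # X now holds a value that was never part of the stable database
    return x_seen, A

seen, committed = dirty_read_demo()
print(seen, committed)       # 150 100
```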
3. Inconsistent Retrievals Problem
Example:
Suppose two transactions operate on three accounts.
Here,
o Transaction-X is computing the sum of all balances while Transaction-Y is transferring an
amount of 50 from Account-1 to Account-3.
o Transaction-X produces a result of 550, which is incorrect. If we write this result to the
database, the database will be in an inconsistent state, because the actual sum is 600.
o Here, Transaction-X has seen an inconsistent state of the database.
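One interleaving that reproduces the 550-versus-600 discrepancy can be sketched as follows. The initial balances (200, 250, 150) are assumed so that the true sum is 600; X reads the accounts after Y's debit of Account-1 but before Y's credit of Account-3.

```python
# Sketch of the inconsistent-retrievals interleaving: X sums the three
# balances while Y transfers 50 from Account-1 to Account-3.
# Initial balances are assumed (200/250/150), so the true sum is 600.

def inconsistent_sum():
    acct = {1: 200, 2: 250, 3: 150}           # actual sum = 600

    acct[1] -= 50                             # Y debits Account-1 (200 -> 150)
    total = acct[1] + acct[2] + acct[3]       # X sums: 150 + 250 + 150 = 550
    acct[3] += 50                             # Y credits Account-3 (too late for X)

    return total, sum(acct.values())          # X's view vs. the real total

print(inconsistent_sum())                     # (550, 600)
```

X misses the 50 that is "in flight" between the two accounts, so its total is 50 short.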
Lock-based Protocols
A lock is a mechanism to control concurrent access to a data item. Lock requests are made to the
concurrency-control manager, and a transaction can proceed only after its request is granted. Data
items can be locked in two modes:
1. Shared lock:
o It is also known as a read-only lock. Under a shared lock, the data item can only be read by
the transaction.
o It can be shared between transactions because, while a transaction holds only a shared lock, it
cannot update the data item.
o A shared lock is requested using the lock-S instruction.
For example, consider a case where two transactions are reading the account balance of a person. The
database will let them read by placing a shared lock. However, if another transaction wants to update
that account's balance, the shared lock prevents it until the reading process is over.
2. Exclusive lock:
o Under an exclusive lock, the data item can be both read and written by the transaction.
o This lock is exclusive: it ensures that multiple transactions cannot modify the same data item
simultaneously.
o A transaction may unlock the data item after finishing its write operation so that another
transaction can acquire a lock on that data item for its own operations.
o An exclusive lock is requested using the lock-X instruction.
For example, suppose a transaction needs to update the account balance of a person. The database
allows this transaction by placing an exclusive (X) lock on the item. Therefore, when a second
transaction wants to read or write, the exclusive lock prevents that operation.
Lock-compatibility matrix
o A transaction may be granted a lock on an item if the requested lock is compatible with the
locks already held on the item by other transactions.
o Any number of transactions can hold shared locks on an item, but if any transaction holds an
exclusive lock on the item, no other transaction may hold any lock on it.
o If a lock cannot be granted, the requesting transaction is made to wait until all incompatible
locks held by other transactions have been released. The lock is then granted.
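The S/X compatibility rules above can be sketched as a small grant check. The table and function names are illustrative, not part of any real lock manager's API.

```python
# Minimal sketch of the lock-compatibility matrix: any number of shared (S)
# locks may coexist, but an exclusive (X) lock is incompatible with all others.

COMPATIBLE = {("S", "S"): True, ("S", "X"): False,
              ("X", "S"): False, ("X", "X"): False}

def can_grant(requested, held_modes):
    """Grant the request only if it is compatible with every held lock."""
    return all(COMPATIBLE[(held, requested)] for held in held_modes)

print(can_grant("S", ["S", "S"]))   # True  (lock-S alongside shared locks)
print(can_grant("X", ["S"]))        # False (lock-X must wait)
print(can_grant("S", ["X"]))        # False (reader waits for the writer)
```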
Starvation
Starvation is the situation in which a transaction has to wait for an indefinite period to acquire a lock.
The following are common causes of starvation:
o The waiting scheme for locked items is not properly managed.
o There is a resource leak.
o The same transaction is repeatedly selected as a victim.
Deadlock
Deadlock refers to the situation in which two or more processes are each waiting for the other to
release a resource, or in which several processes are waiting for resources in a circular chain.
Two-Phase Locking (2PL)
Growing phase:
o In the growing phase, the transaction may acquire new locks on data items, but none can be
released.
Shrinking phase:
o In the shrinking phase, existing locks held by the transaction may be released, but no new
locks can be acquired.
The 2PL protocol ensures serializability, but it does not ensure freedom from deadlocks. Cascading
rollback is also possible under two-phase locking.
Timestamp Ordering Protocol
Let's assume there are two transactions, T1 and T2. Suppose transaction T1 entered the system at
time 0007 and transaction T2 entered the system at time 0009. T1 has the higher priority, so it
executes first, as it entered the system first.
The timestamp ordering protocol also maintains the timestamps of the last 'read' and 'write'
operations on each data item. Here, R_TS(X) denotes the largest timestamp of any transaction that
successfully executed read(X), and W_TS(X) denotes the largest timestamp of any transaction that
successfully executed write(X).
Advantages:
o Schedules are serializable, just as under 2PL protocols.
o Transactions never wait for locks, which eliminates the possibility of deadlocks.
Disadvantages:
o Starvation is possible if the same transaction is repeatedly restarted and aborted.
o If TS(T) < R_TS(X), then abort and roll back transaction T, and reject the operation.
o If TS(T) < W_TS(X), then do not execute the write_item(X) operation of the transaction and
continue processing (the write is obsolete and is simply ignored).
o If neither condition 1 nor condition 2 holds, then execute the write_item(X) operation of
transaction T and set W_TS(X) to TS(T).
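The three write rules above can be sketched directly. The dictionary layout for a data item (keys `value`, `r_ts`, `w_ts`) is an assumption made for illustration.

```python
# Sketch of the write rules above, including the Thomas write rule:
# a write with TS(T) < W_TS(X) is silently ignored instead of aborting.

def write_item(ts_t, item):
    """item is a dict with keys 'value', 'r_ts', 'w_ts' (assumed layout)."""
    if ts_t < item["r_ts"]:
        return "abort"               # rule 1: a younger txn already read X
    if ts_t < item["w_ts"]:
        return "ignore"              # rule 2: obsolete write (Thomas rule)
    item["w_ts"] = ts_t              # rule 3: perform the write
    return "write"

X = {"value": 0, "r_ts": 5, "w_ts": 8}
print(write_item(3, X))              # abort  (TS(T) < R_TS(X))
print(write_item(6, X))              # ignore (TS(T) < W_TS(X))
print(write_item(9, X))              # write
```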
If we use the Thomas write rule, then some schedules can be permitted that are not conflict
serializable, as illustrated by the schedule in the given figure:
In the above figure, T1's read of the data item precedes T2's write of the same data item, yet this
schedule is not conflict serializable.
The Thomas write rule checks that T2's write is never seen by any transaction. If we delete the write
operation in transaction T2, then a conflict-serializable schedule is obtained, as shown in the figure
below.
Validation Based Protocol
1. Read phase: In this phase, transaction T reads the values of the various data items and stores
them in temporary local variables. It performs all of its write operations on these temporary
variables without updating the actual database.
2. Validation phase: In this phase, the temporary variable values are validated against the
actual data to check whether serializability would be violated.
3. Write phase: If the validation of the transaction succeeds, the temporary results are written
to the database; otherwise, the transaction is rolled back.
In this protocol, the timestamp used for serialization is the timestamp of the validation phase, as
the validation phase is what determines whether the transaction will commit or roll back.
Serializability is determined during the validation process and cannot be decided in advance. By
deferring this check to validation time, the protocol permits a greater degree of concurrency with
fewer conflicts, and thus results in fewer transaction rollbacks.
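The three phases can be sketched as a small optimistic-transaction class. The class name, the `committed_writes` validation input, and the backward-validation check (abort if something we read was overwritten by a committed transaction) are simplifying assumptions for illustration.

```python
# Sketch of validation-based (optimistic) control: reads go to local
# variables, validation checks for conflicting committed writes, and only
# then are the local writes applied to the database.

class OptimisticTxn:
    def __init__(self, db):
        self.db, self.local, self.read_set = db, {}, set()

    def read(self, key):                   # read phase
        self.read_set.add(key)
        return self.local.get(key, self.db[key])

    def write(self, key, value):           # writes stay in local variables
        self.local[key] = value

    def commit(self, committed_writes):
        # validation phase: fail if a committed txn wrote anything we read
        if self.read_set & committed_writes:
            return False                   # roll back: discard self.local
        self.db.update(self.local)         # write phase
        return True

db = {"A": 1}
t = OptimisticTxn(db)
t.write("A", t.read("A") + 1)
print(t.commit(committed_writes={"B"}))    # True; db["A"] is now 2
```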
Multiple Granularity
The concurrency-control schemes described so far use each individual data item as the unit on which
synchronization is performed. A drawback of this approach is that if a transaction Ti needs to access
the entire database and a locking protocol is used, then Ti must lock every item in the database. This
is inefficient; it would be simpler if Ti could lock the entire database with a single lock.
However, this second proposal has a flaw of its own. Suppose another transaction needs to access
only a few data items; locking the entire database is then unnecessary, and moreover it may cost us
concurrency, which was our primary goal in the first place. To strike a bargain between efficiency and
concurrency, we use granularity.
Granularity: the size of the data item that is allowed to be locked.
o It can be defined as hierarchically breaking up the database into blocks which can be locked.
o The multiple granularity protocol enhances concurrency and reduces lock overhead.
o It keeps track of what to lock and how to lock.
o It makes it easy to decide whether to lock or unlock a data item. This type of hierarchy can be
represented graphically as a tree.
a) Intention-Shared (IS): It indicates explicit locking at a lower level of the tree, but only with
shared locks.
b) Intention-Exclusive (IX): It indicates explicit locking at a lower level, with exclusive or
shared locks.
c) Shared & Intention-Exclusive (SIX): In this mode, the node is locked in shared mode, and
some lower-level node is locked in exclusive mode by the same transaction.
Compatibility Matrix with Intention Lock Modes: The table below describes the compatibility
matrix for these lock modes:
The protocol uses the intention lock modes to ensure serializability. It requires that if a transaction
attempts to lock a node, then that node must be locked as follows:
o If transaction T1 reads record Ra9 in file Fa, then T1 needs to lock the database, area A1, and
file Fa in IS mode. Finally, it needs to lock Ra9 in S mode.
o If transaction T2 modifies record Ra9 in file Fa, then it can do so after locking the database,
area A1, and file Fa in IX mode. Finally, it needs to lock Ra9 in X mode.
o If transaction T3 reads all the records in file Fa, then T3 needs to lock the database and area
A1 in IS mode. At last, it needs to lock Fa in S mode.
o If transaction T4 reads the entire database, then T4 needs to lock the database in S mode.
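The standard compatibility matrix for the five modes can be written out explicitly; this sketch encodes it as a lookup table so a lock manager could answer "may these two modes coexist on one node?". The table itself is the textbook IS/IX/S/SIX/X matrix.

```python
# Sketch of the intention-mode compatibility matrix (IS, IX, S, SIX, X).
# True means the two modes may be held on the same node simultaneously.

COMPAT = {
    "IS":  {"IS": True,  "IX": True,  "S": True,  "SIX": True,  "X": False},
    "IX":  {"IS": True,  "IX": True,  "S": False, "SIX": False, "X": False},
    "S":   {"IS": True,  "IX": False, "S": True,  "SIX": False, "X": False},
    "SIX": {"IS": True,  "IX": False, "S": False, "SIX": False, "X": False},
    "X":   {"IS": False, "IX": False, "S": False, "SIX": False, "X": False},
}

# e.g. T1 holds IS on file Fa while T2 asks for IX on the same file:
print(COMPAT["IS"]["IX"])   # True  (intention modes coexist)
print(COMPAT["SIX"]["IX"])  # False (SIX blocks other writers' intents)
```

Note the matrix is symmetric, and X is compatible with nothing, matching the plain S/X matrix when restricted to those two modes.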
Multiversion Technique Based on Timestamp Ordering
1. If transaction T issues a write_item(X) operation, and version i of X has the highest
write_TS(Xi) of all versions of X that is also less than or equal to TS(T), and read_TS(Xi) >
TS(T), then abort and roll back transaction T; otherwise, create a new version Xj of X with
read_TS(Xj) = write_TS(Xj) = TS(T).
2. If transaction T issues a read_item(X) operation, find the version i of X that has the highest
write_TS(Xi) of all versions of X that is also less than or equal to TS(T); then return the value
of Xi to transaction T, and set the value of read_TS(Xi) to the larger of TS(T) and the current
read_TS(Xi).
As we can see in case 2, a read_item(X) is always successful, since it finds the appropriate version Xi
to read based on the write_TS of the various existing versions of X.
In case 1, however, transaction T may be aborted and rolled back. This happens if T is attempting to
write a version of X that should have been read by another transaction T' whose timestamp is
read_TS(Xi); T' has already read version Xi, which was written by the transaction with timestamp
equal to write_TS(Xi). If this conflict occurs, T is rolled back; otherwise, a new version of X, written
by transaction T, is created. Notice that if T is rolled back, cascading rollback may occur. Hence, to
ensure recoverability, a transaction T should not be allowed to commit until all the transactions that
have written some version that T has read have themselves committed.
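The two rules can be sketched over a list of versions. The per-version dictionary layout (`value`, `r_ts`, `w_ts`) and the helper names are assumptions made for illustration.

```python
# Sketch of the two multiversion timestamp-ordering rules above.
# Each version is a dict with 'value', 'r_ts', 'w_ts' (assumed layout).

def pick_version(versions, ts_t):
    """The version with the highest write_TS <= TS(T)."""
    return max((v for v in versions if v["w_ts"] <= ts_t),
               key=lambda v: v["w_ts"])

def read_item(versions, ts_t):          # rule 2: always succeeds
    v = pick_version(versions, ts_t)
    v["r_ts"] = max(v["r_ts"], ts_t)
    return v["value"]

def write_item_mv(versions, ts_t, value):   # rule 1: may abort
    v = pick_version(versions, ts_t)
    if v["r_ts"] > ts_t:
        return "abort"                  # a younger txn already read v
    versions.append({"value": value, "r_ts": ts_t, "w_ts": ts_t})
    return "ok"

X = [{"value": 10, "r_ts": 0, "w_ts": 0}]
print(read_item(X, 5))                  # 10; sets that version's r_ts to 5
print(write_item_mv(X, 3, 99))          # abort (version already read at ts 5)
print(write_item_mv(X, 7, 42))          # ok: creates a new version at ts 7
```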
Multiversion Two-Phase Locking Using Certify Locks
o In this multiple-mode locking scheme, there are three locking modes for an item: read, write,
and certify, instead of just the two modes (read, write). Hence, the state of LOCK(X) for an
item X can be one of read-locked, write-locked, certify-locked, or unlocked.
o In the standard locking scheme, once a transaction obtains a write lock on an item, no other
transaction can access that item. The idea behind multiversion 2PL is to allow other
transactions T' to read an item X while a single transaction T holds a write lock on X. This is
accomplished by maintaining two versions of each item X; one version must always have been
written by some committed transaction.
o The second version, X', is created when a transaction T acquires a write lock on the item.
Other transactions can continue to read the committed version of X while T holds the write
lock.
o Transaction T can write the value of X' as needed, without affecting the value of the committed
version X. However, once T is ready to commit, it must obtain a certify lock on all items on
which it currently holds write locks before it can commit.
o The certify lock is not compatible with read locks, so the transaction may have to delay its
commit until all its write-locked items are released by any reading transactions before it can
obtain the certify locks.
o Once the certify locks (which are exclusive locks) are acquired, the committed version X of the
data item is set to the value of version X', version X' is discarded, and the certify locks are
then released.
o In this multiversion 2PL scheme, reads can proceed concurrently with a single write operation,
an arrangement not permitted under the standard 2PL schemes.
Checkpoint
Log-based recovery and shadow paging work well when there is a single transaction, such as an
update of an address. But what happens when multiple transactions occur concurrently? The same
method of logging can be followed, but since the transactions are concurrent, the order and timing of
each transaction make a great difference: failing to maintain the order of transactions will lead to
wrong data during recovery. Also, transactions may have many steps, and maintaining a log entry for
each step increases the log file size, so maintaining the log file alongside these transactions becomes
an overhead. In addition, performing the redo operation is itself an overhead, because it executes
already-executed transactions again, which is not actually necessary. So our goal here should be a
small log file with easy recovery of data in case of failure. To handle this situation, checkpoints are
introduced during the transaction.
A checkpoint acts like a bookmark. During the execution of a transaction, checkpoints are marked and
the transaction is executed, and the log files are created as usual with the steps of the transactions.
When execution reaches a checkpoint, the transaction's updates are written to the database, and all
the logs up to that point are removed from the file. The log files are then updated with the new steps
of the transaction until the next checkpoint, and so on. Care should be taken when creating a
checkpoint: if a checkpoint is created before a transaction has fully completed and its data is written
to the database, it defeats the purpose of the log file and the checkpoint. Checkpoints are useful when
they are created after each transaction completes, or wherever the database is in a consistent state.
Suppose there are 4 concurrent transactions: T1, T2, T3 and T4. A checkpoint is added in the middle
of T1, and there is a failure while executing T4. Let us see how the recovery system recovers the
database from this failure.
o It starts reading the log files from the end to the start, so that it can reverse the transactions;
i.e., it reads the log files from transaction T4 back to T1.
o The recovery system always maintains an undo list and a redo list. The log entries in the undo
list will be used to undo transactions, whereas the entries in the redo list will be re-executed.
A transaction is put into the redo list if the recovery system finds the pair (<Tn, Start>, <Tn,
Commit>) in the log, or only <Tn, Commit>. That is, it puts every fully completed transaction
into the redo list, to be re-executed after recovery. In the above example, transactions T2 and
T3 will have (<Tn, Start>, <Tn, Commit>) in the log file. Transaction T1 will have only <Tn,
Commit>, because it committed after the checkpoint was crossed: its earlier steps, beginning
with <Tn, Start>, were already written to the database and removed from the log file. Hence
the recovery system puts T1, T2 and T3 into the redo list.
o The logs with only <Tn, Start> are put into the undo list, because those transactions are not
complete and can lead to an inconsistent state of the database. In the above example, T4 is
put into the undo list, since that transaction is not yet complete and failed midway.
This is how a DBMS recovers the data in case of concurrent transaction failure.
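The redo/undo classification described above can be sketched as a scan over the log. The tuple layout for log records and the T1..T4 scenario mirror the example; the exact record format is an assumption.

```python
# Sketch of building the redo and undo lists from a log scanned after a
# crash, as in the T1..T4 example above (log record layout is assumed).

def classify(log):
    started, committed = set(), set()
    for kind, txn in log:
        if kind == "start":
            started.add(txn)
        elif kind == "commit":
            committed.add(txn)
    redo = committed                       # has <Tn, Commit> in the log
    undo = started - committed             # only <Tn, Start>: incomplete
    return sorted(redo), sorted(undo)

# T1 committed after the checkpoint (only its commit record survives);
# T2 and T3 are fully logged; T4 started but never committed.
log = [("commit", "T1"), ("start", "T2"), ("commit", "T2"),
       ("start", "T3"), ("commit", "T3"), ("start", "T4")]
print(classify(log))   # (['T1', 'T2', 'T3'], ['T4'])
```

T1 lands in the redo list even without a start record, matching the rule that a lone <Tn, Commit> after a checkpoint still means "redo".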