You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Lock-based Concurrency Control and Recovery From Failures
Lesson Introduction: Lock-based Concurrency Control
Database systems are equipped with lock-based protocols to ensure that any transaction to a database cannot read or write data until it acquires an appropriate lock on the transaction. In a big data system where thousands of transactions can be executed simultaneously, it is critical to control the concurrency of transactions.
We will learn about the family of approaches called Lock-Based Concurrency Control. You will learn about the most widely used protocol, the Strict Two-phase Locking, or Strict 2PL.
Topic: Lock-Based Concurrency Control
First, using a dependency graph, we can tell whether a schedule is conflict serializable or not.
Lock based concurrency control
Strict Two-phase Locking (Strict 2PL) Protocol
Each transaction must obtain a shared lock on the object before reading, and an exclusive lock on the object before writing.
All locks held by a transaction are released when the transaction completes.
Strict 2PL allows only serializable schedules, it simplifies transaction aborts
To be more clear, before reading a transaction A, R(A), it needs to acquire shared lock, S(A). Whenever a transaction wants to write a data item W(A), it needs to acquire exclusive lock X(A).
It calls Two-phase because the transaction goes to two main phases.
A phase where the transaction keeps acquiring locks, either shared locks or exclusive locks.
Release locks at the end of the transaction. Here is a less strict version where you can release it in the second phase, start releasing locks step by step.
There is a separate module in the database system called lock manager, it controls all the locking going up, all the transactions, or the concurrency control module in the database system has to call the lock manager to acquire locks. So the lock manager basically receives locks and releases all locks as well.
Simple rule:
if one transaction already holds a shared lock on item A, if another transaction comes and wants to also acquire a shared lock on the same item, this is allowed. So the lock manager will grant that lock to the other transaction as well.
If the transaction has a shared lock on item A, but another transaction came and it wants to acquire an exclusive lock, that is Not allowed
If the original transaction already has an exclusive lock on item A, and another transaction comes in and wants a shared lock. It is not allowed
If an original transaction has an exclusive lock and another transaction also wants an exclusive lock on the data item
As we can see. Shared on Shared is allowed, otherwise, not allowed
We have to wait for the transaction to release the lock first if it has an exclusive lock in order to acquire a shared or exclusive lock on it.
And if we want to have an exclusive lock, we have to wait for a shared lock also to be released.
Deadlock
A deadlock means that one transaction wants to acquire a lock on a data item A, and this data item A is already locked by another transaction B. At the same time, this other transaction B also wants to acquire a lock on a data item A that is already also locked by transaction B. They both waiting for each other, and it leads to a deadlock because nobody will release the lock until the other one does.
Deadlock Detection
Create a data structure called the waits-for graph. In the graph, there is an edge from Ti to Tj if Ti is waiting for Tj to release a lock
Ti => Tj
To detect deadlock, we periodically check for cycles in the graph. if there is a cycle, it means we have a deadlock and we have to break the cycle by aborting one of the transactions that are causing the cycle
No deadlocks because of no cycle ⬆️
deadlocks because of cycle ⬆️
Topic: Database Recovery
As we already know from the previous unit, a transaction is a collection of actions that make consistent transformations of system states while preserving system consistency. There might be temporarily in an inconsistent state during execution in the middle, but that is fine. As long as the beginning and the end of the transaction are consistent states.
If we have a deadlock, for example, we may need to abort the transaction that causes the deadlock.
However, aborting a transaction is a failure.
Type of Failures:
Transaction failures
Transaction aborts
Avg 3% of transactions abort abnormally.
System failure
Failure of processor, main memory loss due to power failure
Main memory contents are lost, but secondary storage contents are safe
partial vs. total failure
Media failures
Failure of secondary storage devices such that the stored data is lost
Communication failures
Lost/undeliverable messages due to network unstable
Local Recovery Management - Architecture
Volatile storage
consists of the main memory of the computer system (RAM).
Stable / Persistent storage
Resilient to failures and loses its contents only in the presence of media failures
Main memory is smaller than the persistent storage so you have to select a few pages that we can actually store in the buffer. When computers shut down, all data in RAM is wiped out.
That is why we have a Recovery manager
Before sending data to the new stable database state to do recovery, we have a database log, which logs every single operation that is happening in transactions into persistent storage. Database Log is an append-only file
Recovering From a Crash
There are 3 phases in the Aries recovery algorithm:
Analysis: Scan the log forward (from the most recent checkpoint)
Redo: For all the transactions committed before the crash, we will apply for a redo of all the operations of this transaction.
Undo: For transactions that did not commit before the crash, undo all operations to the old values
This way, we ensure that we recover from failure. Transactions that are committed, are redone again, we are fine. For transactions that did not commit before the crash, they don't have a commit record in the log, undo them, and run them again later
Knowledge Check: Lock-Based Concurrency Control
What kind of lock must a transaction acquire in order to read from a database?
Concurrency Lock
[Correct] Shared Lock
(This is the type of lock a transaction needs to acquire in order to read from a database)
Exclusive Lock
Read Lock
What kind of lock must a transaction acquire in order to write to a database?
[Correct!] Exclusive Lock
(This is the type of lock a transaction needs to acquire in order to write to a database.)
Shared Lock
Write Lock
Atomicity Lock
What is the name of the database module that manages lock-based concurrency control for transactions?
Concurrency Manager
Transaction Manager
[Correct] Lock Manager
(Lock-based concurrency control is managed by a specific database module called lock manager)
Control Manager
Unit 5 & 6: Practice Quiz
What is the result of the following database transaction if the initial values of both A and B are 100?
BEGIN
A=A+100
B=A+100
END
A = 100 + 100 = 200 B = 200 + 100 = 300
A=200, B=300
How does a database achieve concurrency?
By putting transactions in a queue.
[Correct] By interleaving transactions.
(A database can process multiple transactions concurrently by interleaving them.)
By processing transactions at the same time.
By processing transactions as they come in.
Which of the following describes the atomicity principle of database transactions?
A database can only serve one user at a time.
A database can only process one query at a time.
[Correct]All of the components of a transaction run in their entirety, or none of them run at all.
A database can only process one transaction at a time.
What kind of lock must a transaction acquire in order to read from a database?
Read Lock
Concurrency Lock
[Correct] Shared Lock
(This is the type of lock a transaction needs to acquire in order read from a database.)
Exclusive Lock
Consider the following two database transactions with initial values of A=500 and B=500:
T1: A=A+100, B=A-100
T2: A=A1.06, B=B1.06
Assuming T1 arrives first, what will be the final values of A and B if the database management system interleaves transactions?
T1: A = 500 + 100 = 600
T2: A = 600 * 1.06 = 636
T1: B = 636 - 100 = 536
T2: B = 536 * 1.06 = 568.16
A=636, B=568.16
Unit 5 & 6: Graded Quiz
Database transactions are processed according to a principle that all of the components of a transaction run in their entirety or none of them run at all. What is the name of this principle?
Concurrency
Durability
[Correct] Atomicity
(The atomicity principle states that all of the components of a transaction run in their entirety or none of them run at all)
Isolation
Which of the following database operations requires a transaction to obtain a shared lock?
Delete Data
Update Data
[Correct] Read Data
(A database transaction needs to obtain a shared lock in order to read data from a database)
Write Data
Which of the following database operations does not require a transaction to obtain an exclusive lock?
[Correct] Read Data
(Reading data from a database does not require an exclusive lock; a shared lock is sufficient)
Write Data
Delete Data
Update Data
Consider the following two database transactions with initial values of A=0 and B=0:
T1: A=A+10, B=A+10
T2: A=A*1.1, B=B*1.1
Assuming T1 arrives first, what will be the final values of A and B if the two transactions are processed using a serial schedule?
T1: A = 0 + 10 = 10
T1: B = 10 + 10 = 20
T2: A = 10 * 1.1 = 11
T2: B = 20 * 1.1 = 22
A=11, B=22
Consider the following two database transactions with initial values of A=0 and B=0:
T1: A=A+10, B=A+10
T2: A=A1.1, B=B1.1
Assuming T1 arrives first, what will be the final values of A and B if the database management system interleaves transactions?
T1: A = 0 + 10 = 10
T2: A = 10 = 10 * 1.1 = 11
T1: B = 11 + 10 = 21
T2: B = 21 * 1.1 = 23.1
A=11, B=23.1
Why do database management systems implement transaction interleaving?
To achieve atomicity
To achieve isolation
[Correct] To achieve concurrency
(Database management systems implement transaction interleaving in order to achieve concurrency, which is the ability of a database to execute multiple transactions simultaneously)
To achieve consistency
Which of the following describes the isolation principle of database transactions?
[Correct] Each transaction runs as if there is no other transaction running in the database system simultaneously.
(This is the definition of the isolation principle of database transactions)
All of the components of a database transaction run in their entirety, or none of them run at all.
A database management system can only process one transaction at a time.
A database management system can only process one query at a time.
Which of the following describes the serializability property of database transactions?
All of the components of a transaction run in their entirety or none of them run at all.
Each transaction runs as if there are no other transactions running in the database system simultaneously.
A database management system can only process transactions in series, not concurrently.
[Correct] If several transactions are executed concurrently, the results must be the same as if they were executed one after another.
Which of the following statements about the ACID properties (Atomicity, Consistency, Isolation, Durability) is correct?
[Correct] Isolation implies Serializability
(Isolation means that concurrent changes are invisible, which implies serializability)
Atomicity implies Serializability
Consistency implies Serializability
Durability implies Serializability
Which of the following describes the durability property of database transactions?
Each transaction runs as if there are no other transactions running in the database system simultaneously.
[Correct] Once a transaction commits, the system guarantees the results will never be lost, in spite of subsequent failures.
All of the components of a transaction run in their entirety, or none of them run at all.
If several transactions are executed concurrently, the results must be the same as if they were executed one after another.
The text was updated successfully, but these errors were encountered:
Lock-based Concurrency Control and Recovery From Failures
Lesson Introduction: Lock-based Concurrency Control
Database systems are equipped with lock-based protocols to ensure that any transaction to a database cannot read or write data until it acquires an appropriate lock on the transaction. In a big data system where thousands of transactions can be executed simultaneously, it is critical to control the concurrency of transactions.
We will learn about the family of approaches called Lock-Based Concurrency Control. You will learn about the most widely used protocol, the Strict Two-phase Locking, or Strict 2PL.
Topic: Lock-Based Concurrency Control
First, using a dependency graph, we can tell whether a schedule is conflict serializable or not.
Lock based concurrency control
Strict Two-phase Locking (Strict 2PL) Protocol
To be more clear, before reading a transaction A, R(A), it needs to acquire shared lock, S(A). Whenever a transaction wants to write a data item W(A), it needs to acquire exclusive lock X(A).
It calls Two-phase because the transaction goes to two main phases.
There is a separate module in the database system called lock manager, it controls all the locking going up, all the transactions, or the concurrency control module in the database system has to call the lock manager to acquire locks. So the lock manager basically receives locks and releases all locks as well.
Simple rule:
As we can see. Shared on Shared is allowed, otherwise, not allowed
We have to wait for the transaction to release the lock first if it has an exclusive lock in order to acquire a shared or exclusive lock on it.
And if we want to have an exclusive lock, we have to wait for a shared lock also to be released.
Deadlock
A deadlock means that one transaction wants to acquire a lock on a data item A, and this data item A is already locked by another transaction B. At the same time, this other transaction B also wants to acquire a lock on a data item A that is already also locked by transaction B. They both waiting for each other, and it leads to a deadlock because nobody will release the lock until the other one does.
Deadlock Detection
Ti => Tj
To detect deadlock, we periodically check for cycles in the graph. if there is a cycle, it means we have a deadlock and we have to break the cycle by aborting one of the transactions that are causing the cycle
No deadlocks because of no cycle ⬆️
deadlocks because of cycle ⬆️
Topic: Database Recovery
As we already know from the previous unit, a transaction is a collection of actions that make consistent transformations of system states while preserving system consistency. There might be temporarily in an inconsistent state during execution in the middle, but that is fine. As long as the beginning and the end of the transaction are consistent states.
If we have a deadlock, for example, we may need to abort the transaction that causes the deadlock.
However, aborting a transaction is a failure.
Type of Failures:
Local Recovery Management - Architecture
Main memory is smaller than the persistent storage so you have to select a few pages that we can actually store in the buffer. When computers shut down, all data in RAM is wiped out.
That is why we have a Recovery manager
Recovering From a Crash
This way, we ensure that we recover from failure. Transactions that are committed, are redone again, we are fine. For transactions that did not commit before the crash, they don't have a commit record in the log, undo them, and run them again later
Knowledge Check: Lock-Based Concurrency Control
(This is the type of lock a transaction needs to acquire in order to read from a database)
(This is the type of lock a transaction needs to acquire in order to write to a database.)
(Lock-based concurrency control is managed by a specific database module called lock manager)
Unit 5 & 6: Practice Quiz
BEGIN
A=A+100
B=A+100
END
A = 100 + 100 = 200
B = 200 + 100 = 300
A=200, B=300
(A database can process multiple transactions concurrently by interleaving them.)
(This is the type of lock a transaction needs to acquire in order read from a database.)
T1: A=A+100, B=A-100
T2: A=A1.06, B=B1.06
Assuming T1 arrives first, what will be the final values of A and B if the database management system interleaves transactions?
T1: A = 500 + 100 = 600
T2: A = 600 * 1.06 = 636
T1: B = 636 - 100 = 536
T2: B = 536 * 1.06 = 568.16
A=636, B=568.16
Unit 5 & 6: Graded Quiz
(The atomicity principle states that all of the components of a transaction run in their entirety or none of them run at all)
(A database transaction needs to obtain a shared lock in order to read data from a database)
(Reading data from a database does not require an exclusive lock; a shared lock is sufficient)
T1: A=A+10, B=A+10
T2: A=A*1.1, B=B*1.1
Assuming T1 arrives first, what will be the final values of A and B if the two transactions are processed using a serial schedule?
T1: A = 0 + 10 = 10
T1: B = 10 + 10 = 20
T2: A = 10 * 1.1 = 11
T2: B = 20 * 1.1 = 22
A=11, B=22
T1: A=A+10, B=A+10
T2: A=A1.1, B=B1.1
Assuming T1 arrives first, what will be the final values of A and B if the database management system interleaves transactions?
T1: A = 0 + 10 = 10
T2: A = 10 = 10 * 1.1 = 11
T1: B = 11 + 10 = 21
T2: B = 21 * 1.1 = 23.1
A=11, B=23.1
(Database management systems implement transaction interleaving in order to achieve concurrency, which is the ability of a database to execute multiple transactions simultaneously)
(This is the definition of the isolation principle of database transactions)
(Isolation means that concurrent changes are invisible, which implies serializability)
The text was updated successfully, but these errors were encountered: