Transactions and Concurrency Control
Overview
- What a transaction is and how ACID keeps it reliable.
- Where concurrency goes wrong: lost updates, non-repeatable reads and phantoms.
- How to judge correctness with serializability and precedence graphs.
- How DBMSs enforce order: timestamp ordering, locks, two-phase locking and deadlock handling.
1. Transactions and ACID
1.1 What counts as a transaction?
A transaction is a sequence of operations treated as a single unit of work:
- It can read or modify the database,
- It ends with either COMMIT (make changes permanent) or ROLLBACK (undo them).
Elementary actions inside a transaction:
- Read: fetch data (SELECT).
- Write: change data (INSERT, UPDATE, DELETE).
- Commit: persist every change of this transaction.
- Rollback: cancel every change of this transaction.
Generic form:
-- Oracle starts a transaction implicitly on the first change
-- or when you set autocommit off.
-- Reads and writes here...
COMMIT; -- or ROLLBACK;
Example - bank transfer in one transaction:
UPDATE accounts SET balance = balance - 100 WHERE account_id = 'A';
UPDATE accounts SET balance = balance + 100 WHERE account_id = 'B';
COMMIT; -- or ROLLBACK if any step fails
1.2 ACID properties
- Atomicity - all or nothing: partial effects are not allowed.
- Consistency - every constraint stays valid after a correct transaction.
- Isolation - intermediate results stay private; the outcome must match some serial order.
-
Durability - once committed, data survives crashes.
-
Reference: Ensuring ACID semantics (Oracle docs)
- Video overview: ACID properties in databases
2. Concurrency problems
When transactions overlap, their operations can interleave and create anomalies.
2.1 Lost update
- Two transactions write the same data; one overwrites the other.
- Timeline (balance starts at 600):
T1reads 600 and plans to subtract 100.T2reads 600 and plans to add 100.T2writes 700.T1writes 500 ->T2's update is lost.
2.2 Non-repeatable read
- A transaction reads a row twice and gets two different values because another transaction updated it between the reads.
- Example:
T2sums accounts A and B. After reading A,T1withdraws from A and B. WhenT2reads B, the total depends on timing.
2.3 Phantom read (invalid dependency)
- A transaction reads a set of rows that satisfy a condition.
- Another transaction inserts or deletes rows matching that condition.
- Re-running the same query returns a different set (extra or missing "phantom" rows).
Example: counting employees in a department while another session inserts one more employee there.
These issues motivate formal correctness criteria and control mechanisms.
Video on read anomalies: Dirty, inconsistent and phantom reads
3. Schedules, serializability and the scheduler
- Schedule: the exact order of all reads, writes, commits and rollbacks from concurrent transactions.
- Serial schedule: transactions run one after another with no interleaving (always correct but slow).
- Interleaved schedule: operations are mixed to improve throughput.
- Serializable schedule: interleaved execution that produces the same final database state as some serial order.
Scheduler: the DBMS component that orders operations to avoid incorrect interleavings.
3.1 Testing serializability with precedence (conflict) graphs
Conflicting operations are on the same data item where at least one is a write:
- Read-Write, Write-Read, Write-Write.
To check a schedule:
1. Create one node per transaction.
2. For each conflict, draw an edge Ti -> Tj if Ti's operation must occur before Tj's.
3. If the graph has no cycle, the schedule is conflict-serializable.
Another intuition: operations on different data can often be permuted (swapped) without changing the result; conflicts mark operations that cannot be swapped.
- Video: Serializability and recoverability
4. Timestamp ordering
Goal: impose a global time order so operations "respect" who arrived first.
- Each transaction gets a unique timestamp
TS(Ti)on start (clock time or counter). - For every data item
X, the system tracks: TSR(X)- timestamp of the last transaction that readX,TSW(X)- timestamp of the last transaction that wroteX.
Rules
- Read Ti on X: allowed if TS(Ti) >= TSW(X); then set TSR(X) = max(TSR(X), TS(Ti)). Otherwise abort and restart Ti.
- Write Ti on X: allowed if TS(Ti) >= TSR(X) and TS(Ti) >= TSW(X); then set TSW(X) = TS(Ti). Otherwise abort and restart Ti.
Intuition: no operation is allowed to "go back in time" and overwrite or see something that a newer transaction already produced.
- Video: Timestamp ordering protocols
5. Lock-based protocols
Transactions must request locks before using data:
- SLOCK (shared/read) - many readers allowed; no writers.
- XLOCK (exclusive/write) - single owner; blocks all other access.
- UNLOCK - releases the item.
Compatibility:
- If an item is free: grant requested lock.
- If item has SLOCKs: new SLOCK allowed; XLOCK must wait.
- If item has XLOCK: everyone else waits.
Simple locking alone can still lead to non-serializable schedules, deadlocks, or cascading rollbacks (if a transaction reads data that is later rolled back).
6. Two-phase locking (2PL) and variants
Two phases per transaction:
- Growing: acquire locks, release none.
- Shrinking: release locks, acquire none.
Breaking the rule (releasing then acquiring) risks non-serializable schedules.
Variants:
- Strict 2PL: keep XLOCKs until COMMIT/ROLLBACK. Ensures serializability.
- Rigorous 2PL: keep both SLOCKs and XLOCKs until the end. Prevents cascading rollbacks.
- Conservative 2PL: request all needed locks before starting; if any lock is unavailable, wait before doing anything. Avoids deadlocks but can reduce parallelism.
- Concurrencyr overview (Arabic): Concurrency control
7. Deadlocks: detection and prevention
Deadlock: transactions wait for each other's locks forever (e.g., T1 holds a and wants b; T2 holds b and wants a).
Detection: build a wait-for graph (edge Ti -> Tj if Ti waits on Tj). A cycle means deadlock; abort one transaction to break it.
Prevention (use timestamps to decide who waits vs aborts):
- Wait-Die: older waits for younger; younger requesting a lock held by older is aborted ("dies") and restarts with the same timestamp.
- Wound-Wait: older aborts ("wounds") younger that block it; younger waits if the blocker is older.
8. Key takeaways
- ACID keeps each transaction reliable; isolation matters most under concurrency.
- Lost updates, non-repeatable reads and phantoms show why ordering rules are needed.
- Conflict serializability (no cycles in the precedence graph) formalizes “safe” interleavings.
- Timestamp ordering and lock-based (2PL) protocols are standard enforcement tools; each has trade-offs in aborts, waiting and throughput.
- Deadlocks are a natural side effect of locking and must be detected or prevented.