Imagine trying to book the very last seat on a popular flight online. At the exact same moment, another person clicks “confirm” for the same seat. How does the system ensure only one booking goes through and the database remains accurate? This scenario highlights the core challenge of database concurrency.

In today’s world, almost every application interacts with databases accessed by multiple users or processes simultaneously. Database concurrency is the ability of a Database Management System (DBMS) to handle these simultaneous operations efficiently while maintaining data integrity and consistency. For developers, DevOps engineers, and SREs, understanding concurrency is vital for building reliable and performant applications. Let’s explore what concurrency entails, the problems it can cause if unmanaged, and the techniques used to control it.

TL;DR Summary

  • Database concurrency is how a DBMS lets multiple transactions read and write shared data at the same time, while still preserving ACID and keeping the database consistent.
  • Without proper control, concurrency can cause dirty reads, lost updates, unrepeatable reads, phantom reads, and deadlocks, leading to inaccurate results or stalled transactions.
  • DBMSs prevent these issues with concurrency control (aiming for serializability) using approaches like locking (e.g., 2PL), timestamp ordering, MVCC, and optimistic control, plus isolation levels that trade performance for stronger consistency.
  • Because concurrency problems often show up as lock contention, deadlocks, or rollbacks, monitoring helps you spot bottlenecks and keep performance stable under real-world load.

What Is Database Concurrency?

At its heart, database concurrency refers to the capability of a database to allow multiple transactions (sequences of operations) to access and modify shared data seemingly at the same time. Without concurrency, transactions would have to run one after another (serially), severely limiting the throughput and responsiveness of any multi-user application.

The goal isn’t just to allow simultaneous access but to do so while preserving the fundamental ACID properties (Atomicity, Consistency, Isolation, Durability) of transactions. This ensures that even with multiple operations happening in parallel, the database transitions from one consistent state to another.

Why Is Concurrency Needed?

Managing concurrency is essential for several reasons:

  • Increased Throughput: More transactions can be processed in a given amount of time, improving overall system performance.
  • Better Resource Utilization: System resources like CPU, memory, and I/O are used more effectively instead of sitting idle while waiting for one transaction to complete.
  • Reduced Waiting Time: Users experience shorter delays as their transactions don’t necessarily have to wait for others to finish completely.
  • Improved Response Time: Transactions begin processing sooner, so users see initial results faster.

In essence, concurrency transforms a database from a single-lane road into a multi-lane highway, allowing more traffic (transactions) to flow simultaneously.

Common Concurrency Problems

Allowing transactions to execute concurrently without proper management can lead to several data inconsistency issues. These problems arise when transactions interfere with each other’s operations on shared data:

Dirty Reads

A dirty read occurs when a transaction (say, T2) reads data that has been modified by another transaction (T1) but has not yet been committed. If T1 subsequently fails and rolls back its changes, T2 is left holding incorrect, “dirty” data that never technically existed in the stable database state.

  • Example: Transaction T1 updates a product’s stock level from 10 to 8. Before T1 commits, Transaction T2 reads the stock level as 8 for a report. T1 then fails and rolls back, restoring the stock level to 10. T2’s report now contains an inaccurate stock level of 8.
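The sequence above can be sketched with a toy in-memory store (no real DBMS involved; the names and structures are purely illustrative) in which uncommitted changes are visible to other readers, which is exactly the isolation failure a dirty read represents:

```python
store = {"stock": 10}   # committed state
uncommitted = {}        # T1's pending, uncommitted writes

def read(key):
    # With no isolation, readers see uncommitted changes first.
    return uncommitted.get(key, store[key])

# T1 updates the stock level from 10 to 8 but has not committed yet.
uncommitted["stock"] = 8

# T2 reads the stock level for a report: it sees the dirty value 8.
report_value = read("stock")

# T1 fails and rolls back: the uncommitted change is simply discarded.
uncommitted.clear()

print(report_value, store["stock"])  # report saw 8, but the real value is 10
```

A real DBMS prevents this at Read Committed and above by only letting readers see committed versions.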

Lost Updates

This problem happens when two transactions read the same data item, modify it independently, and then write their changes back. The second transaction to write effectively overwrites the update made by the first transaction, causing the first update to be “lost.”

  • Example: Both T1 and T2 read an account balance of $500.
    • T1 calculates a new balance of $400 (a $100 withdrawal).
    • T2 calculates a new balance of $600 (a $100 deposit).
    • T1 writes its $400 balance.
    • T2 writes its $600 balance, overwriting T1’s update.

  The final balance is $600, incorrectly reflecting only the deposit; the $100 withdrawal is lost. The correct balance should have been $500 - $100 + $100 = $500.
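The same interleaving in a minimal Python sketch, with two "transactions" operating on a shared variable from stale reads:

```python
balance = 500  # shared account balance

# Both transactions read the same starting value.
t1_read = balance
t2_read = balance

# T1 withdraws $100; T2 deposits $100, each computing from its stale read.
t1_new = t1_read - 100   # 400
t2_new = t2_read + 100   # 600

# T1 writes first, then T2 blindly overwrites it: the withdrawal is lost.
balance = t1_new
balance = t2_new

print(balance)  # 600, not the correct 500
```

Locking, MVCC write-conflict detection, or an atomic `UPDATE ... SET balance = balance - 100` style statement would each prevent this.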

Unrepeatable Reads (Non-Repeatable Reads)

An unrepeatable read occurs when a transaction reads the same data item twice but gets different values because another committed transaction modified the data between the reads. The original read becomes “unrepeatable.”

  • Example: Transaction T1 reads the price of an item as $20. Transaction T2 then updates the price to $25 and commits. If T1 reads the price of the same item again within the same transaction, it will now see $25. This inconsistency within T1 might cause issues if calculations were based on the initial $20 value.
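A minimal sketch of the same sequence, where T1's two reads of one committed item straddle T2's committed update (the data names are illustrative):

```python
items = {"price": 20}          # committed data, visible to all readers

first_read = items["price"]    # T1's first read: 20

items["price"] = 25            # T2 updates the price and commits

second_read = items["price"]   # T1's second read: 25, not repeatable

print(first_read, second_read)
```

Under Repeatable Read or a snapshot-based reader, both reads within T1 would return 20.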

Phantom Reads

A phantom read is similar to an unrepeatable read but applies to a set of rows. It occurs when a transaction executes a query twice, and the second execution returns a different set of rows satisfying the query criteria because another committed transaction inserted or deleted rows that match the criteria. The newly appeared or disappeared rows are called “phantoms.”

  • Example: Transaction T1 queries for all employees in the ‘Sales’ department and finds 5 employees. Transaction T2 then inserts a new employee record for the ‘Sales’ department and commits. If T1 repeats its query within the same transaction, it will now find 6 employees. The new employee is the “phantom.”
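A toy sketch of the phantom: the same predicate query, run twice inside one transaction, matches a different set of rows after another transaction inserts one (the schema and names are made up for illustration):

```python
employees = [
    {"name": "Ann", "dept": "Sales"},
    {"name": "Bob", "dept": "Sales"},
]

def sales_employees():
    # T1's query: all rows matching the 'Sales' predicate.
    return [e for e in employees if e["dept"] == "Sales"]

first_count = len(sales_employees())       # T1's first read: 2 rows

# T2 inserts a new 'Sales' employee and commits.
employees.append({"name": "Cara", "dept": "Sales"})

second_count = len(sales_employees())      # T1's re-read: 3 rows, one phantom

print(first_count, second_count)
```

Preventing phantoms requires locking the *predicate* (or a serializable snapshot), not just the rows already read, which is why only the Serializable level rules them out in the ANSI model.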

Deadlocks

A deadlock occurs when two or more transactions are blocked indefinitely, each waiting for a resource (like a lock on a data row) held by the other.

  • Example: Transaction T1 locks resource A and requests resource B. Transaction T2 locks resource B and requests resource A. Neither can proceed because each is waiting for the other to release its lock.
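DBMSs commonly detect this situation as a cycle in a "wait-for" graph, where an edge T1 → T2 means T1 is waiting for a lock T2 holds. A minimal cycle-detection sketch (the graph shape is an assumption for illustration, not any DBMS's actual structure):

```python
# Edge "T1" -> {"T2"} means T1 waits for a lock held by T2.
waits_for = {"T1": {"T2"}, "T2": {"T1"}}

def has_deadlock(graph):
    """Detect a cycle in the wait-for graph with depth-first search."""
    visiting, done = set(), set()

    def dfs(node):
        if node in visiting:
            return True          # back edge: a cycle (deadlock) exists
        if node in done:
            return False
        visiting.add(node)
        if any(dfs(n) for n in graph.get(node, ())):
            return True
        visiting.discard(node)
        done.add(node)
        return False

    return any(dfs(t) for t in graph)

print(has_deadlock(waits_for))  # True: T1 and T2 wait on each other
```

On detecting the cycle, the DBMS typically aborts one participant (the "victim") so the other can proceed.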

Concurrency Control Techniques

To prevent these problems and ensure data integrity while allowing concurrent execution, DBMSs employ concurrency control techniques. The goal is typically to achieve serializability – ensuring that the outcome of executing transactions concurrently is identical to some serial (one-after-another) execution of those same transactions.

Common techniques include:

Lock-Based Protocols

This is one of the most common methods. Transactions must acquire locks on data items before accessing them. Locks can be:

  • Shared (Read) Locks: Multiple transactions can hold a shared lock on the same item simultaneously, allowing them to read but not write.
  • Exclusive (Write) Locks: Only one transaction can hold an exclusive lock on an item at a time. This is required for writing (updating or deleting). If a transaction holds an exclusive lock, no other transaction can acquire any lock (shared or exclusive) on that item.

By managing how locks are acquired and released (e.g., Two-Phase Locking - 2PL), these protocols prevent conflicts. However, they can lead to deadlocks, which the DBMS must detect and resolve (often by aborting one of the deadlocked transactions).
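The shared/exclusive compatibility rules can be illustrated with a toy lock table (a deliberate simplification: no queueing, blocking, or 2PL growing/shrinking phases):

```python
class LockManager:
    """Toy lock table: many shared holders, or a conflicting request fails."""

    def __init__(self):
        self.locks = {}  # item -> (mode, set of transaction ids)

    def acquire(self, txn, item, mode):
        held = self.locks.get(item)
        if held is None:
            self.locks[item] = (mode, {txn})
            return True
        held_mode, holders = held
        # Shared locks are compatible with each other; everything else conflicts.
        if mode == "S" and held_mode == "S":
            holders.add(txn)
            return True
        # A transaction that is the sole holder may proceed (lock upgrade).
        return holders == {txn}

lm = LockManager()
print(lm.acquire("T1", "row1", "S"))  # True
print(lm.acquire("T2", "row1", "S"))  # True: shared locks coexist
print(lm.acquire("T3", "row1", "X"))  # False: exclusive conflicts with readers
```

A real lock manager would block T3 and wake it when the readers finish, rather than failing immediately.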

Timestamp-Based Protocols

Instead of locks, transactions are assigned unique, typically increasing timestamps when they start. The DBMS uses these timestamps to determine the execution order for conflicting operations. If two transactions conflict, the operation of the transaction with the older timestamp is generally given priority, potentially causing the younger transaction to roll back and restart. This avoids deadlocks but can lead to more transaction restarts.
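The core acceptance rules of basic timestamp ordering can be sketched as follows, assuming each item tracks the largest timestamps that have read and written it (rollback and restart handling is omitted for brevity):

```python
read_ts, write_ts = {}, {}   # item -> largest read / write timestamp so far

def try_read(txn_ts, item):
    # A read is rejected if a younger transaction already wrote the item.
    if txn_ts < write_ts.get(item, 0):
        return False
    read_ts[item] = max(read_ts.get(item, 0), txn_ts)
    return True

def try_write(txn_ts, item):
    # A write is rejected (the txn must roll back and restart) if a younger
    # transaction has already read or written the item.
    if txn_ts < read_ts.get(item, 0) or txn_ts < write_ts.get(item, 0):
        return False
    write_ts[item] = txn_ts
    return True

print(try_read(5, "x"))   # True: T(ts=5) reads x
print(try_write(3, "x"))  # False: older T(ts=3) conflicts with the younger read
```

Because a rejected transaction restarts with a fresh (newer) timestamp rather than waiting, cycles of mutual waiting cannot form, which is why this scheme is deadlock-free.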

Multi-Version Concurrency Control (MVCC)

Popular in databases like PostgreSQL, Oracle, and MySQL (with InnoDB), MVCC maintains multiple versions of data items. When a transaction needs to read data, it reads the version that was current when the transaction started. When a transaction writes data, it creates a new version instead of overwriting the old one immediately.

This approach allows readers and writers to operate concurrently without blocking each other, significantly improving performance for read-heavy workloads. Old versions are eventually cleaned up (e.g., by VACUUM in PostgreSQL). MVCC prevents dirty reads and unrepeatable reads for snapshot readers without taking read locks; lost updates are typically handled separately, through write-time conflict detection or row-level locks on the write path.
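The snapshot-read idea can be sketched minimally as follows (assuming versions are appended in commit-timestamp order; real engines also track in-progress transactions and garbage-collect old versions):

```python
# Each item maps to a list of (commit_ts, value) versions, oldest first.
versions = {"price": [(1, 20)]}

def read(item, snapshot_ts):
    # A reader sees the newest version committed at or before its snapshot.
    visible = [v for ts, v in versions[item] if ts <= snapshot_ts]
    return visible[-1]

def write(item, commit_ts, value):
    # A writer appends a new version instead of overwriting the old one.
    versions[item].append((commit_ts, value))

# T1 started with snapshot_ts=2; T2 later commits a new price at ts=3.
write("price", 3, 25)
print(read("price", snapshot_ts=2))  # 20: T1 still sees its own snapshot
print(read("price", snapshot_ts=4))  # 25: a later transaction sees the update
```

Note how this directly prevents the unrepeatable-read example earlier: T1's snapshot timestamp never changes, so both of its reads return 20.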

Optimistic Concurrency Control

This technique operates under the assumption that conflicts between transactions are rare. Transactions proceed without acquiring locks. Before committing, the transaction checks if any other committed transaction has modified the data it accessed. If a conflict is detected, the transaction rolls back; otherwise, it commits. This can be efficient when conflicts are infrequent but can lead to high rollback rates in high-contention scenarios.
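The validate-at-commit pattern can be sketched with a per-record version counter, similar in spirit to compare-and-swap (the record shape and names are illustrative):

```python
record = {"value": 500, "version": 1}

def commit(read_version, new_value):
    # Validation: fail if someone else committed since we read.
    if record["version"] != read_version:
        return False
    record["value"] = new_value
    record["version"] += 1
    return True

v = record["version"]        # our transaction reads value + version, no locks
ok_other = commit(v, 600)    # a competing transaction commits first
ok_mine = commit(v, 400)     # our commit now fails validation: we must retry

print(ok_other, ok_mine, record["value"])  # True False 600
```

The failed transaction re-reads the current value and version and tries again, which is cheap when conflicts are rare but wasteful under heavy contention.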

Choosing The Right Technique

The specific concurrency control mechanism used often depends on the database system itself (e.g., PostgreSQL heavily relies on MVCC). Database administrators and developers can sometimes influence concurrency behavior by setting transaction isolation levels. These levels (like Read Uncommitted, Read Committed, Repeatable Read, Serializable) define the degree to which one transaction must be isolated from the effects of others, effectively allowing a trade-off between performance and the types of concurrency problems permitted. Higher isolation levels prevent more problems but can reduce concurrency.
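As a quick reference, the ANSI SQL standard defines the levels by which anomalies they may still permit; actual engines, particularly MVCC-based ones, are often stricter than this table suggests:

```python
# Anomalies each ANSI SQL isolation level is permitted to exhibit.
anomalies_allowed = {
    "READ UNCOMMITTED": {"dirty read", "unrepeatable read", "phantom read"},
    "READ COMMITTED":   {"unrepeatable read", "phantom read"},
    "REPEATABLE READ":  {"phantom read"},
    "SERIALIZABLE":     set(),   # no anomalies: full serializability
}

print("dirty read" in anomalies_allowed["READ COMMITTED"])  # False
```

Moving down the table prevents more anomalies at the cost of more blocking or conflict handling, which is the performance/consistency trade-off described above.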

Database concurrency is fundamental to the performance and usability of modern multi-user database systems. While allowing simultaneous operations boosts efficiency and throughput, it introduces potential pitfalls like dirty reads, lost updates, unrepeatable reads, and phantom reads that can compromise data integrity.

Database concurrency control techniques, such as locking, timestamp ordering, and MVCC, are essential mechanisms implemented by DBMSs to manage parallel access safely, typically aiming for serializability while preserving ACID properties. Understanding these concepts helps developers and SREs design better applications, diagnose performance issues, and configure database systems appropriately.

Effective monitoring is crucial for understanding how concurrency affects your database performance, identifying bottlenecks, and diagnosing issues like lock contention or high transaction rollback rates.

Gain deep insights into your database performance and concurrency behavior with Netdata’s real-time monitoring. Sign up for free today and take control of your database health.

Database Concurrency FAQs

What Is Database Concurrency?

Database concurrency is a DBMS’s ability to let multiple transactions read and write shared data at roughly the same time, while still preserving correctness (ACID) so the database moves from one consistent state to another.

Why Is Concurrency Important To Database Systems?

Concurrency boosts throughput and responsiveness by keeping CPU, memory, and I/O busy instead of forcing transactions to wait in line, which is essential for modern multi-user apps where many requests hit the database at once.

What Problems Can Happen If Concurrency Is Not Controlled?

Without controls, transactions can interfere and produce inconsistencies such as dirty reads, lost updates, unrepeatable reads, phantom reads, or even deadlocks where work gets stuck waiting on locks.

What Is A Dirty Read?

A dirty read happens when one transaction reads changes made by another transaction that hasn’t committed yet; if the writer rolls back, the reader has consumed data that never truly became part of the database state.

What Is A Lost Update?

A lost update occurs when two transactions read the same value, both compute a new value, and the later write overwrites the earlier one, effectively erasing one change unless the DBMS prevents it via locking, MVCC rules, or conflict detection.

What’s The Difference Between Unrepeatable Reads And Phantom Reads?

An unrepeatable read is when re-reading the same row returns a different value because another transaction committed an update; a phantom read is when re-running a query returns a different set of rows because rows were inserted or deleted that now match the predicate.

What Is Serializability, And Why Do DBMSs Aim For It?

Serializability means the end result of concurrent transactions matches the result you’d get if those same transactions ran one-by-one in some order, which is the gold standard for correctness when transactions touch the same data.

How Do Lock-Based Protocols Control Concurrency?

Locking requires transactions to acquire shared (read) or exclusive (write) locks before accessing data; rules like Two-Phase Locking (2PL) prevent conflicting access patterns, but locking can create contention and can lead to deadlocks that must be detected and resolved.

How Do Timestamp-Based Protocols Work?

Timestamp ordering assigns each transaction a timestamp and enforces a consistent “happens-before” order for conflicting operations; this avoids deadlocks, but can cause rollbacks and retries when a newer transaction conflicts with an older one’s permitted order.

How Does Multi-Version Concurrency Control (MVCC) Work?

Multi-Version Concurrency Control (MVCC) keeps multiple versions of rows so readers can see a consistent snapshot while writers create new versions, which reduces read-write blocking and improves performance in read-heavy systems (commonly used in PostgreSQL and InnoDB-based MySQL).

What Is Optimistic Concurrency Control, And When Does It Fit?

Optimistic control assumes conflicts are rare, so transactions proceed without locks and validate at commit time; it’s efficient under low contention, but can cause frequent rollbacks when many transactions hit the same records.

How Do Isolation Levels Relate To Concurrency Problems?

Isolation levels define how much a transaction can “see” other transactions’ effects, trading performance for consistency: lower isolation may allow anomalies, while higher isolation prevents more issues but can reduce concurrency or increase conflict handling work.

What Are The Key Signs That Concurrency Is Hurting Database Performance?

Common symptoms include elevated lock waits/lock contention, deadlocks, rising transaction latency, and increasing rollback or retry rates, especially during traffic spikes or batch jobs.

How Can Netdata Help You Understand Concurrency Issues?

Netdata focuses on real-time visibility into database behavior, including per-second query and lock insight (depending on your setup), so you can spot lock contention, deadlocks, slow queries, and other bottlenecks early and tie symptoms to root causes.