1. Introduction
In high-concurrency PostgreSQL environments such as financial systems, e-commerce platforms, and large-scale SaaS products, deadlocks are among the most disruptive blocking issues: they can halt mission-critical workloads and severely impact system reliability.
A deadlock occurs when two or more transactions create a circular dependency by waiting indefinitely for resources (rows, tables, or advisory locks) that are locked by each other. While PostgreSQL includes sophisticated built-in deadlock detection mechanisms, relying solely on detection means that downtime, failed transactions, and performance degradation have already occurred by the time the system responds.
The Business Impact
Deadlocks in production systems can cause:
- Revenue Loss: Failed payment transactions, abandoned shopping carts
- Data Integrity Issues: Partial updates leaving systems in inconsistent states
- Cascading Failures: One deadlock triggering a cascade of timeouts and retries
- Poor User Experience: Slow response times and transaction failures
- Operational Overhead: Manual intervention and emergency hotfixes
For enterprise systems processing thousands of transactions per second, the goal should be prevention first, detection second. A proactive approach combining architectural patterns, monitoring, and tuning can eliminate the vast majority of deadlock incidents.
PostgreSQL’s Lock Architecture
PostgreSQL uses a multi-layered locking system:
- MVCC (Multi-Version Concurrency Control): Allows concurrent reads without blocking
- Row-level locks: Protect individual rows during updates
- Table-level locks: Control schema changes and bulk operations
- Advisory locks: Application-defined custom locks
- Page-level locks: Internal buffer management (usually transparent)
Understanding these layers is crucial for effective deadlock prevention.
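Each layer can be observed directly in pg_locks. A quick sketch (the accounts table and the advisory key 42 are placeholders):
sql
BEGIN;
SELECT * FROM accounts WHERE id = 1 FOR UPDATE; -- row lock, recorded on the tuple itself;
                                                -- pg_locks shows a RowShareLock on the table
SELECT pg_advisory_lock(42);                    -- advisory lock (locktype = 'advisory')
SELECT locktype, mode, granted
FROM pg_locks
WHERE pid = pg_backend_pid();
COMMIT;
SELECT pg_advisory_unlock(42);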
2. How Deadlocks Form in PostgreSQL
PostgreSQL’s MVCC system allows concurrent reads and writes without blocking in most scenarios. However, when transactions require exclusive locks—whether on rows, tables, or custom advisory resources—contention can escalate into deadlock cycles.
2.1 The Anatomy of a Deadlock
Deadlocks require four conditions (Coffman conditions):
- Mutual Exclusion: Resources cannot be shared
- Hold and Wait: Processes hold resources while waiting for others
- No Preemption: Resources cannot be forcibly taken away
- Circular Wait: A circular chain of processes waiting for each other
2.2 Common Deadlock Scenarios
Classic Two-Transaction Deadlock
sql
-- Transaction A (Session 1)
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
-- Acquires exclusive lock on row 1
-- ... some business logic ...
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
-- Waits for exclusive lock on row 2

-- Transaction B (Session 2) - running concurrently
BEGIN;
UPDATE accounts SET balance = balance - 50 WHERE id = 2;
-- Acquires exclusive lock on row 2
-- ... some business logic ...
UPDATE accounts SET balance = balance + 50 WHERE id = 1;
-- Waits for exclusive lock on row 1
Result: Transaction A holds lock on row 1 and waits for row 2. Transaction B holds lock on row 2 and waits for row 1. Circular dependency detected.
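Once deadlock_timeout elapses, PostgreSQL aborts one of the two transactions, and that session receives an error similar to the following (PIDs and transaction IDs will vary):
ERROR: deadlock detected
DETAIL: Process 12345 waits for ShareLock on transaction 67890; blocked by process 12346.
Process 12346 waits for ShareLock on transaction 67891; blocked by process 12345.
HINT: See server log for query details.
CONTEXT: while updating tuple (0,2) in relation "accounts"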
Foreign Key Deadlock
sql
-- Transaction A
BEGIN;
INSERT INTO orders (customer_id, amount) VALUES (123, 500.00);
-- Acquires shared lock on customers(123)
UPDATE customers SET last_order_date = NOW() WHERE id = 123;
-- Waits for exclusive lock

-- Transaction B
BEGIN;
UPDATE customers SET credit_limit = 5000 WHERE id = 123;
-- Acquires exclusive lock on customers(123)
INSERT INTO orders (customer_id, amount) VALUES (123, 200.00);
-- Waits for shared lock
Index Deadlock (B-tree Splits)
sql
-- High-frequency inserts on indexed columns can cause deadlocks during B-tree page splits
-- Transaction A inserts into page X, triggers split, needs lock on page Y
-- Transaction B inserts into page Y, triggers split, needs lock on page X
SELECT FOR UPDATE Deadlock
sql
-- Transaction A
BEGIN;
SELECT * FROM inventory WHERE product_id = 'PROD1' FOR UPDATE;
-- Locks PROD1
SELECT * FROM inventory WHERE product_id = 'PROD2' FOR UPDATE;
-- Waits for PROD2

-- Transaction B
BEGIN;
SELECT * FROM inventory WHERE product_id = 'PROD2' FOR UPDATE;
-- Locks PROD2
SELECT * FROM inventory WHERE product_id = 'PROD1' FOR UPDATE;
-- Waits for PROD1
2.3 Lock Granularity and Deadlock Risk
A common misconception is that PostgreSQL escalates row locks to table locks under memory pressure. It does not: row locks are recorded on the tuples themselves, so they never exhaust the shared lock table. Two related effects do matter:
- Lock table exhaustion: the shared lock table (sized roughly by max_locks_per_transaction × max_connections) holds table-level and advisory locks; running out raises an "out of shared memory" error rather than escalating
- Coarser application locking: explicit LOCK TABLE statements, DDL, and ORM fallbacks acquire stronger, broader locks than row updates need
Coarse-grained locking increases deadlock probability by reducing granularity and creating more contention points.
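A sketch of how exhausting the lock table surfaces in practice; the heavily partitioned measurements table is hypothetical:
sql
-- Each relation touched in a transaction consumes one shared lock-table slot
BEGIN;
SELECT count(*) FROM measurements; -- scans 1,000 partitions
-- ERROR:  out of shared memory
-- HINT:  You might need to increase max_locks_per_transaction.

-- Check per-backend lock consumption before it becomes a problem:
SELECT pid, count(*) AS locks_held
FROM pg_locks
GROUP BY pid
ORDER BY locks_held DESC;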
3. Detection Mechanisms
PostgreSQL’s deadlock detector uses a sophisticated algorithm to identify cycles in the wait-for graph. Understanding its behavior is essential for tuning detection performance.
3.1 Deadlock Detection Algorithm
Deadlock checks do not run on a global timer; a backend that has waited on a lock for deadlock_timeout runs the detector, which:
- Builds a Wait-For Graph: Maps which transactions are waiting for which resources
- Detects Cycles: Uses depth-first search to find circular dependencies
- Selects a Victim: The transaction that ran the detection aborts itself with a deadlock error (SQLSTATE 40P01)
- Cleans Up: Releases the victim's locks so the remaining transactions can proceed
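While PostgreSQL keeps its wait-for graph internal, the same edges can be approximated from SQL with pg_blocking_pids() (available since 9.6); a cycle among these edges is a deadlock in the making:
sql
SELECT waiter.pid AS waiting_pid,
       blocker AS blocking_pid,
       LEFT(waiter.query, 60) AS waiting_query
FROM pg_stat_activity waiter,
     unnest(pg_blocking_pids(waiter.pid)) AS blocker
WHERE cardinality(pg_blocking_pids(waiter.pid)) > 0;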
3.2 Key Configuration Parameters
deadlock_timeout (default: 1s)
sql
-- View current setting
SHOW deadlock_timeout;

-- Tune for faster detection in high-frequency systems
ALTER SYSTEM SET deadlock_timeout = '200ms';
SELECT pg_reload_conf();
Considerations:
- Lower values (100-500ms): Faster detection but more CPU overhead
- Higher values (2-5s): Less overhead but longer wait times
- Very low values (<100ms): May cause false positives on heavily loaded systems
log_lock_waits (default: off)
sql
-- Enable lock wait logging
ALTER SYSTEM SET log_lock_waits = 'on';
SELECT pg_reload_conf();
Log Output Example:
LOG: process 12345 still waiting for ShareLock on transaction 67890 after 1000.123 ms
DETAIL: Process holding the lock: 12346. Wait queue: 12345.
STATEMENT: UPDATE accounts SET balance = balance + 100 WHERE id = 2;
Additional Monitoring Parameters
sql
-- Log slow queries that might hold locks too long
ALTER SYSTEM SET log_min_duration_statement = '1s';
-- Log all DDL operations that acquire strong locks
ALTER SYSTEM SET log_statement = 'ddl';
SELECT pg_reload_conf();
3.3 Real-Time Lock Monitoring Queries
Current Lock Status
sql
SELECT
    pl.pid,
    pl.locktype,
    pl.mode,
    pl.granted,
    pl.relation::regclass AS table_name,
    pl.page,
    pl.tuple,
    pl.virtualtransaction,
    psa.query,
    psa.state,
    psa.wait_event_type,
    psa.wait_event,
    NOW() - psa.query_start AS duration
FROM pg_locks pl
JOIN pg_stat_activity psa ON pl.pid = psa.pid
ORDER BY pl.granted ASC, psa.query_start ASC;
Blocking Relationships
sql
SELECT
    blocked_locks.pid AS blocked_pid,
    blocked_activity.usename AS blocked_user,
    blocking_locks.pid AS blocking_pid,
    blocking_activity.usename AS blocking_user,
    blocked_activity.query AS blocked_statement,
    blocking_activity.query AS blocking_statement,
    blocked_activity.application_name AS blocked_application,
    blocking_activity.application_name AS blocking_application,
    blocked_locks.locktype AS blocked_locktype,
    blocked_locks.mode AS blocked_mode,
    blocking_locks.mode AS blocking_mode,
    NOW() - blocked_activity.query_start AS blocked_duration
FROM pg_catalog.pg_locks blocked_locks
JOIN pg_catalog.pg_stat_activity blocked_activity
    ON blocked_activity.pid = blocked_locks.pid
JOIN pg_catalog.pg_locks blocking_locks
    ON blocking_locks.locktype = blocked_locks.locktype
    AND blocking_locks.database IS NOT DISTINCT FROM blocked_locks.database
    AND blocking_locks.relation IS NOT DISTINCT FROM blocked_locks.relation
    AND blocking_locks.page IS NOT DISTINCT FROM blocked_locks.page
    AND blocking_locks.tuple IS NOT DISTINCT FROM blocked_locks.tuple
    AND blocking_locks.virtualxid IS NOT DISTINCT FROM blocked_locks.virtualxid
    AND blocking_locks.transactionid IS NOT DISTINCT FROM blocked_locks.transactionid
    AND blocking_locks.classid IS NOT DISTINCT FROM blocked_locks.classid
    AND blocking_locks.objid IS NOT DISTINCT FROM blocked_locks.objid
    AND blocking_locks.objsubid IS NOT DISTINCT FROM blocked_locks.objsubid
    AND blocking_locks.pid != blocked_locks.pid
JOIN pg_catalog.pg_stat_activity blocking_activity
    ON blocking_activity.pid = blocking_locks.pid
WHERE NOT blocked_locks.granted;
Lock Wait Statistics
sql
-- Create a monitoring view for regular checking
CREATE OR REPLACE VIEW lock_monitor AS
SELECT
    locktype,
    mode,
    COUNT(*) AS lock_count,
    COUNT(*) FILTER (WHERE NOT granted) AS waiting_count,
    AVG(EXTRACT(EPOCH FROM (NOW() - query_start))) AS avg_wait_time
FROM pg_locks pl
JOIN pg_stat_activity psa ON pl.pid = psa.pid
GROUP BY locktype, mode
ORDER BY waiting_count DESC, avg_wait_time DESC;

-- Usage
SELECT * FROM lock_monitor WHERE waiting_count > 0;
4. Prevention Strategies
Prevention is always more effective than detection. Here are comprehensive strategies organized by complexity and impact.
4.1 Enforce Consistent Lock Ordering
The most effective deadlock prevention technique is establishing and enforcing a system-wide lock ordering convention.
Implementation Approaches
Primary Key Ordering:
sql
-- Bad: Inconsistent ordering
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 456;
UPDATE accounts SET balance = balance + 100 WHERE id = 123;
COMMIT;

-- Good: Always lock lower ID first
BEGIN;
UPDATE accounts SET balance = balance + 100 WHERE id = 123; -- Lower ID first
UPDATE accounts SET balance = balance - 100 WHERE id = 456; -- Higher ID second
COMMIT;
Application-Level Implementation:
python
def transfer_funds(from_id, to_id, amount):
    # Always process accounts in ascending ID order
    first_id, second_id = (from_id, to_id) if from_id < to_id else (to_id, from_id)
    with transaction():
        # Lock accounts in consistent order
        if first_id == from_id:
            debit_account(from_id, amount)
            credit_account(to_id, amount)
        else:
            credit_account(to_id, amount)
            debit_account(from_id, amount)
Table-Level Ordering:
sql
-- Define explicit table hierarchy
-- Level 1: Reference tables (customers, products)
-- Level 2: Transactional tables (orders, payments)
-- Level 3: Derived tables (analytics, logs)

-- Always lock in hierarchical order:
BEGIN;
LOCK TABLE customers IN ROW EXCLUSIVE MODE;
LOCK TABLE orders IN ROW EXCLUSIVE MODE;
LOCK TABLE order_items IN ROW EXCLUSIVE MODE;
-- ... perform operations ...
COMMIT;
Complex Multi-Table Operations:
sql
-- Create a stored procedure that enforces ordering
CREATE OR REPLACE FUNCTION safe_order_update(
    p_customer_id INT,
    p_product_ids INT[],
    p_quantities INT[]
) RETURNS VOID AS $$
DECLARE
    sorted_products INT[];
BEGIN
    -- Sort product IDs to ensure consistent locking order
    SELECT ARRAY_AGG(unnest ORDER BY unnest)
    INTO sorted_products
    FROM unnest(p_product_ids);

    -- Lock customer first (lowest hierarchy)
    PERFORM * FROM customers WHERE id = p_customer_id FOR UPDATE;

    -- Lock products in sorted order
    PERFORM * FROM products
    WHERE id = ANY (sorted_products)
    ORDER BY id
    FOR UPDATE;

    -- Perform business logic...
    -- UPDATE operations here
END;
$$ LANGUAGE plpgsql;
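Hypothetical usage; the product IDs arrive unsorted, and the function locks them in ascending order regardless:
sql
SELECT safe_order_update(42, ARRAY[9, 3, 7], ARRAY[1, 2, 1]);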
4.2 Minimize Transaction Scope and Duration
Long-running transactions dramatically increase deadlock probability by holding locks longer and creating more opportunities for conflicts.
Transaction Boundary Optimization
Anti-Pattern: Long Transaction:
sql
BEGIN;
-- Step 1: Update inventory (holds locks)
UPDATE inventory SET quantity = quantity - 1 WHERE product_id = 'PROD123';

-- Step 2: Call external payment API (network I/O - very slow!)
SELECT process_payment_external('card123', 99.99); -- 2-5 seconds

-- Step 3: Update order status
UPDATE orders SET status = 'paid' WHERE id = 12345;

-- Step 4: Send confirmation email (more network I/O)
SELECT send_email_external('user@email.com', 'Order confirmed'); -- 1-3 seconds
COMMIT;
-- Total transaction time: 3-8 seconds with locks held
Best Practice: Minimal Transaction:
sql
-- Step 1: Pre-validate outside transaction
SELECT quantity FROM inventory WHERE product_id = 'PROD123';
-- Check if sufficient inventory available

-- Step 2: Minimal transaction for data changes only
BEGIN;
UPDATE inventory SET quantity = quantity - 1 WHERE product_id = 'PROD123';
INSERT INTO orders (customer_id, product_id, status)
VALUES (123, 'PROD123', 'pending')
RETURNING id; -- capture the new order ID instead of a separate lookup
COMMIT;
-- Transaction time: 10-50ms

-- Step 3: External operations outside transaction
SELECT process_payment_external('card123', 99.99);

-- Step 4: Final update transaction
BEGIN;
UPDATE orders SET status = 'paid' WHERE id = :order_id; -- bound by the application
COMMIT;

-- Step 5: Async email sending
SELECT queue_email_async('user@email.com', 'Order confirmed');
Connection Pooling Considerations
python
# Bad: Long-lived connection holds a transaction across user interaction
def process_order():
    conn = get_connection()
    conn.execute("BEGIN")
    # Multiple operations with user input
    product_id = input("Enter product ID: ")  # User interaction!
    conn.execute(
        "UPDATE inventory SET quantity = quantity - 1 WHERE product_id = %s",
        [product_id])
    # More user interaction
    confirm = input("Confirm order? (y/n): ")
    if confirm == 'y':
        conn.execute("INSERT INTO orders ...")
        conn.execute("COMMIT")
    else:
        conn.execute("ROLLBACK")

# Good: Acquire connection only when needed
def process_order():
    product_id = input("Enter product ID: ")   # Outside transaction
    confirm = input("Confirm order? (y/n): ")  # Outside transaction
    if confirm == 'y':
        with get_connection() as conn:  # Brief connection usage
            conn.execute("BEGIN")
            conn.execute(
                "UPDATE inventory SET quantity = quantity - 1 WHERE product_id = %s",
                [product_id])
            conn.execute("INSERT INTO orders ...")
            conn.execute("COMMIT")
4.3 Reduce Lock Granularity and Scope
Using the most specific locks possible reduces contention and deadlock probability.
Row-Level Locking Strategies
SELECT FOR UPDATE Variants:
sql
-- Standard FOR UPDATE: Blocks other transactions
SELECT * FROM inventory WHERE product_id = 'PROD123' FOR UPDATE;

-- FOR UPDATE SKIP LOCKED: Skip locked rows (useful for job queues)
SELECT * FROM job_queue
WHERE status = 'pending'
ORDER BY priority DESC, created_at ASC
LIMIT 1
FOR UPDATE SKIP LOCKED;

-- FOR UPDATE NOWAIT: Fail immediately if row is locked
BEGIN;
SELECT * FROM accounts WHERE id = 123 FOR UPDATE NOWAIT;
-- If row is locked, get immediate error instead of waiting
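Tying SKIP LOCKED together into a complete worker transaction; a sketch assuming the job_queue table above, with :claimed_id bound by the application from the claimed row:
sql
BEGIN;
-- Claim one job without waiting on rows other workers already hold
SELECT id FROM job_queue
WHERE status = 'pending'
ORDER BY priority DESC, created_at ASC
LIMIT 1
FOR UPDATE SKIP LOCKED;
-- The application processes the job, then marks it done in the same transaction
UPDATE job_queue SET status = 'completed' WHERE id = :claimed_id;
COMMIT;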
Conditional Locking:
sql
-- Lock only when necessary
DO $$
DECLARE
    current_balance DECIMAL;
    account_id INT := 123;
    transfer_amount DECIMAL := 500.00;
BEGIN
    -- Check balance first (no lock needed for read)
    SELECT balance INTO current_balance
    FROM accounts WHERE id = account_id;

    -- Only acquire lock if transfer is viable
    IF current_balance >= transfer_amount THEN
        -- Lock and double-check (balance might have changed)
        SELECT balance INTO current_balance
        FROM accounts WHERE id = account_id
        FOR UPDATE;

        IF current_balance >= transfer_amount THEN
            UPDATE accounts
            SET balance = balance - transfer_amount
            WHERE id = account_id;
        ELSE
            RAISE EXCEPTION 'Insufficient funds after lock acquisition';
        END IF;
    ELSE
        RAISE EXCEPTION 'Insufficient funds';
    END IF;
END;
$$;
Index Design for Reduced Contention
Partition Indexes by Access Pattern:
sql
-- Bad: Single index on timestamp creates hotspot
CREATE INDEX idx_orders_created_at ON orders(created_at);

-- Good: Composite index spreads load
CREATE INDEX idx_orders_customer_created ON orders(customer_id, created_at);

-- Better: Partial indexes reduce index contention
CREATE INDEX idx_orders_pending ON orders(created_at)
WHERE status = 'pending';
CREATE INDEX idx_orders_active ON orders(created_at)
WHERE status IN ('processing', 'shipped');
UUID vs Sequential IDs:
sql
-- Sequential IDs create contention on index tail
CREATE TABLE orders_sequential (
    id SERIAL PRIMARY KEY, -- All inserts contend for same index page
    created_at TIMESTAMP DEFAULT NOW()
);

-- UUIDs distribute inserts across index
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
CREATE TABLE orders_distributed (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(), -- Distributed across index
    created_at TIMESTAMP DEFAULT NOW()
);
4.4 Optimistic Concurrency Control
Instead of acquiring locks early, use version checking and retry logic to handle concurrent modifications.
Version-Based Optimistic Locking
Using xmin for Version Control:
sql
-- Step 1: Read data with version info
SELECT id, name, balance, xmin AS version
FROM accounts WHERE id = 123;
-- Returns: id=123, name="John", balance=1000, version=12345

-- Step 2: Update with version check
UPDATE accounts
SET balance = 900, updated_at = NOW()
WHERE id = 123 AND xmin = '12345'; -- Only update if version matches

-- Step 3: Check if update succeeded (inside a PL/pgSQL block,
-- with rows_affected declared as INTEGER)
GET DIAGNOSTICS rows_affected = ROW_COUNT;
IF rows_affected = 0 THEN
    RAISE EXCEPTION 'Concurrent modification detected - please retry';
END IF;
Application-Level Version Fields:
sql
-- Add version column to tables
ALTER TABLE accounts ADD COLUMN version_number INTEGER DEFAULT 1;

-- Update with version increment
UPDATE accounts
SET balance = 900,
    version_number = version_number + 1,
    updated_at = NOW()
WHERE id = 123
  AND version_number = :expected_version; -- bound by the application
Retry Logic Implementation:
python
import random
import time

class ConcurrentModificationError(Exception):
    pass

def optimistic_update(cursor, account_id, new_balance, max_retries=3):
    # cursor: an open DB-API cursor (e.g. psycopg2)
    for attempt in range(max_retries):
        # Read current state
        cursor.execute("""
            SELECT balance, version_number
            FROM accounts WHERE id = %s
        """, [account_id])
        current_balance, current_version = cursor.fetchone()

        # Perform business logic validation
        if new_balance < 0:
            raise ValueError("Negative balance not allowed")

        # Attempt optimistic update
        cursor.execute("""
            UPDATE accounts
            SET balance = %s, version_number = version_number + 1
            WHERE id = %s AND version_number = %s
        """, [new_balance, account_id, current_version])

        if cursor.rowcount == 0:
            # Concurrent modification detected
            if attempt < max_retries - 1:
                # Exponential backoff with jitter
                delay = (2 ** attempt) + random.uniform(0, 1)
                time.sleep(delay)
                continue
            raise ConcurrentModificationError("Max retries exceeded")

        # Success
        return True
    return False
4.5 Advisory Locks for Custom Synchronization
PostgreSQL’s advisory locks provide application-level synchronization without interfering with data locks.
Basic Advisory Lock Usage
Session-Level Advisory Locks:
sql
-- Acquire lock (blocks if already held)
SELECT pg_advisory_lock(12345);

-- Try to acquire lock (returns immediately)
SELECT pg_try_advisory_lock(12345);
-- Returns true if acquired, false if not

-- Release lock
SELECT pg_advisory_unlock(12345);

-- Check current advisory locks
SELECT locktype, classid, objid, mode, granted
FROM pg_locks
WHERE locktype = 'advisory';
Transaction-Level Advisory Locks:
sql
BEGIN
;
-- Lock is automatically released at transaction end
SELECT
pg_advisory_xact_lock(
67890
);
-- Perform operations that need coordination
UPDATE
global_counters
SET
value
=
value
+
1
WHERE
name
=
'order_sequence'
;
COMMIT
;
-- Lock automatically released
Practical Advisory Lock Patterns
Singleton Process Enforcement:
sql
-- Ensure only one instance of a batch job runs
CREATE OR REPLACE FUNCTION run_daily_reports() RETURNS VOID AS $$
BEGIN
    -- Try to acquire exclusive lock for this job
    IF NOT pg_try_advisory_lock(hashtext('daily_reports')) THEN
        RAISE NOTICE 'Daily reports job already running, skipping...';
        RETURN;
    END IF;

    -- Perform the work
    RAISE NOTICE 'Starting daily reports generation...';
    -- Generate reports here
    INSERT INTO daily_sales_report SELECT ...;
    UPDATE report_status SET last_run = NOW()
    WHERE report_name = 'daily_sales';
    RAISE NOTICE 'Daily reports completed';

    -- Session-level advisory locks are NOT released automatically when the
    -- function ends, so release explicitly (an EXCEPTION handler should do
    -- the same on failure)
    PERFORM pg_advisory_unlock(hashtext('daily_reports'));
END;
$$ LANGUAGE plpgsql;
Resource Pool Management:
sql
-- Implement a custom semaphore for limiting concurrent operations
CREATE OR REPLACE FUNCTION acquire_processing_slot(max_slots INT DEFAULT 5)
RETURNS INT AS $$
DECLARE
    slot_id INT;
BEGIN
    -- Try to acquire one of N available slots
    FOR slot_id IN 1..max_slots LOOP
        IF pg_try_advisory_lock(hashtext('processing_slot'), slot_id) THEN
            RETURN slot_id; -- Successfully acquired slot
        END IF;
    END LOOP;
    RETURN 0; -- No slots available
END;
$$ LANGUAGE plpgsql;

-- Usage in application
DO $$
DECLARE
    my_slot INT;
BEGIN
    my_slot := acquire_processing_slot(3); -- Max 3 concurrent processes
    IF my_slot > 0 THEN
        RAISE NOTICE 'Acquired processing slot %', my_slot;
        -- Do heavy processing work
        PERFORM heavy_data_processing();
        -- Release slot
        PERFORM pg_advisory_unlock(hashtext('processing_slot'), my_slot);
    ELSE
        RAISE NOTICE 'No processing slots available, deferring work';
    END IF;
END;
$$;
Ordered Queue Processing:
sql
-- Process queue items in strict order using advisory locks
CREATE OR REPLACE FUNCTION process_queue_item() RETURNS VOID AS $$
DECLARE
    queue_item RECORD;
    lock_key BIGINT;
BEGIN
    -- Find next item to process
    SELECT * INTO queue_item
    FROM job_queue
    WHERE status = 'pending'
    ORDER BY priority DESC, created_at ASC
    LIMIT 1;

    IF NOT FOUND THEN
        RETURN; -- No work to do
    END IF;

    -- Create unique lock key for this item
    lock_key := hashtext('queue_item_' || queue_item.id::TEXT);

    -- Try to acquire exclusive lock for this specific item
    IF NOT pg_try_advisory_lock(lock_key) THEN
        RETURN; -- Another worker is processing this item
    END IF;

    BEGIN
        -- Mark as processing
        UPDATE job_queue
        SET status = 'processing', started_at = NOW()
        WHERE id = queue_item.id;

        -- Perform the actual work
        PERFORM execute_job(queue_item.job_type, queue_item.job_data);

        -- Mark as completed
        UPDATE job_queue
        SET status = 'completed', completed_at = NOW()
        WHERE id = queue_item.id;
    EXCEPTION WHEN OTHERS THEN
        -- Mark as failed
        UPDATE job_queue
        SET status = 'failed', error_message = SQLERRM
        WHERE id = queue_item.id;
        PERFORM pg_advisory_unlock(lock_key); -- release before re-raising
        RAISE; -- Re-raise the exception
    END;

    -- Release the lock
    PERFORM pg_advisory_unlock(lock_key);
END;
$$ LANGUAGE plpgsql;
5. Monitoring and Troubleshooting
Effective monitoring is essential for maintaining deadlock-free systems. This section covers comprehensive monitoring strategies, alerting, and troubleshooting techniques.
5.1 Real-Time Monitoring Queries
Comprehensive Lock Analysis View
sql
CREATE OR REPLACE VIEW comprehensive_lock_analysis AS
WITH lock_details AS (
    SELECT
        pl.pid,
        pl.locktype,
        pl.mode,
        pl.granted,
        pl.database, -- needed for the self-join below
        pl.relation,
        pl.relation::regclass AS table_name,
        pl.page,
        pl.tuple,
        pl.virtualxid,
        pl.transactionid,
        pl.classid,
        pl.objid,
        pl.objsubid,
        psa.query,
        psa.state,
        psa.application_name,
        psa.client_addr,
        psa.usename,
        psa.datname,
        psa.backend_start,
        psa.xact_start,
        psa.query_start,
        psa.state_change,
        NOW() - psa.query_start AS query_duration,
        NOW() - psa.xact_start AS transaction_duration,
        psa.wait_event_type,
        psa.wait_event
    FROM pg_locks pl
    LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid
),
blocking_info AS (
    SELECT
        blocked.pid AS blocked_pid,
        blocked.locktype AS blocked_locktype,
        blocked.mode AS blocked_mode,
        blocked.table_name AS blocked_table,
        blocking.pid AS blocking_pid,
        blocking.mode AS blocking_mode,
        blocking.query AS blocking_query,
        blocking.application_name AS blocking_app,
        blocking.transaction_duration AS blocking_duration
    FROM lock_details blocked
    JOIN lock_details blocking ON (
        blocking.locktype = blocked.locktype
        AND blocking.database IS NOT DISTINCT FROM blocked.database
        AND blocking.relation IS NOT DISTINCT FROM blocked.relation
        AND blocking.page IS NOT DISTINCT FROM blocked.page
        AND blocking.tuple IS NOT DISTINCT FROM blocked.tuple
        AND blocking.virtualxid IS NOT DISTINCT FROM blocked.virtualxid
        AND blocking.transactionid IS NOT DISTINCT FROM blocked.transactionid
        AND blocking.classid IS NOT DISTINCT FROM blocked.classid
        AND blocking.objid IS NOT DISTINCT FROM blocked.objid
        AND blocking.objsubid IS NOT DISTINCT FROM blocked.objsubid
        AND blocking.pid != blocked.pid
        AND blocking.granted = TRUE
        AND blocked.granted = FALSE
    )
)
SELECT
    ld.*,
    bi.blocking_pid,
    bi.blocking_mode,
    bi.blocking_query,
    bi.blocking_app,
    bi.blocking_duration,
    CASE
        WHEN ld.granted = FALSE THEN 'WAITING'
        WHEN bi.blocking_pid IS NOT NULL THEN 'BLOCKING'
        ELSE 'ACTIVE'
    END AS lock_status
FROM lock_details ld
LEFT JOIN blocking_info bi ON ld.pid = bi.blocked_pid
ORDER BY
    CASE WHEN ld.granted = FALSE THEN 1 ELSE 2 END,
    ld.query_duration DESC;
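During an incident, the most useful slice of the view is usually the waiters:
sql
SELECT pid, lock_status, table_name, mode, blocking_pid, query_duration
FROM comprehensive_lock_analysis
WHERE lock_status = 'WAITING'
ORDER BY query_duration DESC;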
Lock Wait Chain Analysis
sql
-- Recursive CTE to find complete lock wait chains
WITH RECURSIVE lock_chain AS (
    -- Base case: find root blockers (blocking but not blocked)
    SELECT
        blocking.pid AS blocker_pid,
        blocked.pid AS blocked_pid,
        blocking.query AS blocker_query,
        blocked.query AS blocked_query,
        blocking.application_name AS blocker_app,
        blocked.application_name AS blocked_app,
        1 AS chain_depth,
        ARRAY[blocking.pid] AS chain_pids,
        blocking.pid::TEXT AS chain_path
    FROM comprehensive_lock_analysis blocking
    JOIN comprehensive_lock_analysis blocked
        ON blocking.pid = blocked.blocking_pid
    WHERE blocking.blocking_pid IS NULL -- Root blocker

    UNION ALL

    -- Recursive case: extend chains
    SELECT
        lc.blocker_pid,
        cla.pid AS blocked_pid,
        lc.blocker_query,
        cla.query AS blocked_query,
        lc.blocker_app,
        cla.application_name AS blocked_app,
        lc.chain_depth + 1,
        lc.chain_pids || cla.pid,
        lc.chain_path || ' -> ' || cla.pid::TEXT
    FROM lock_chain lc
    JOIN comprehensive_lock_analysis cla
        ON lc.blocked_pid = cla.blocking_pid
    WHERE cla.pid != ALL (lc.chain_pids) -- Prevent infinite loops
      AND lc.chain_depth < 10 -- Reasonable depth limit
)
SELECT
    blocker_pid,
    blocked_pid,
    chain_depth,
    chain_path,
    blocker_query,
    blocked_query,
    blocker_app,
    blocked_app
FROM lock_chain
ORDER BY blocker_pid, chain_depth;
Historical Lock Statistics
sql
-- Create table to store lock statistics over time
CREATE TABLE IF NOT EXISTS lock_statistics (
    recorded_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    locktype TEXT,
    mode TEXT,
    total_locks INTEGER,
    waiting_locks INTEGER,
    avg_wait_duration_ms NUMERIC,
    max_wait_duration_ms NUMERIC,
    blocked_sessions INTEGER,
    blocking_sessions INTEGER
);
Automated Lock Statistics Collection
-- Function to collect and store lock statistics
CREATE OR REPLACE FUNCTION collect_lock_statistics() RETURNS VOID AS $$
BEGIN
INSERT INTO lock_statistics (
locktype, mode, total_locks, waiting_locks,
avg_wait_duration_ms, max_wait_duration_ms,
blocked_sessions, blocking_sessions
)
SELECT
pl.locktype,
pl.mode,
COUNT(*) as total_locks,
COUNT(*) FILTER (WHERE NOT pl.granted) as waiting_locks,
AVG(EXTRACT(EPOCH FROM (NOW() - psa.query_start)) * 1000)
FILTER (WHERE NOT pl.granted) as avg_wait_duration_ms,
MAX(EXTRACT(EPOCH FROM (NOW() - psa.query_start)) * 1000)
FILTER (WHERE NOT pl.granted) as max_wait_duration_ms,
COUNT(DISTINCT pl.pid) FILTER (WHERE NOT pl.granted) as blocked_sessions,
COUNT(DISTINCT blocking.pid) as blocking_sessions
FROM pg_locks pl
LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid
LEFT JOIN (
SELECT DISTINCT blocking_pid as pid
FROM comprehensive_lock_analysis
WHERE blocking_pid IS NOT NULL
) blocking ON pl.pid = blocking.pid
GROUP BY pl.locktype, pl.mode
HAVING COUNT(*) FILTER (WHERE NOT pl.granted) > 0
OR COUNT(DISTINCT blocking.pid) > 0;
END;
$$ LANGUAGE plpgsql;
-- Schedule collection every minute (requires the pg_cron extension)
SELECT cron.schedule('collect-lock-stats', '* * * * *', 'SELECT collect_lock_statistics()');
Deadlock History Analysis
-- Parse deadlock information from PostgreSQL logs
CREATE TABLE IF NOT EXISTS deadlock_incidents (
id SERIAL PRIMARY KEY,
occurred_at TIMESTAMP WITH TIME ZONE,
process_1_pid INTEGER,
process_1_query TEXT,
process_1_application TEXT,
process_2_pid INTEGER,
process_2_query TEXT,
process_2_application TEXT,
victim_pid INTEGER,
involved_tables TEXT[],
lock_types TEXT[],
resolution_time_ms INTEGER,
log_entry TEXT
);
-- Function to analyze deadlock patterns
CREATE OR REPLACE FUNCTION analyze_deadlock_patterns(
days_back INTEGER DEFAULT 7
) RETURNS TABLE (
pattern_type TEXT,
frequency INTEGER,
avg_resolution_time_ms NUMERIC,
involved_tables TEXT[],
common_applications TEXT[]
) AS $$
BEGIN
RETURN QUERY
WITH deadlock_analysis AS (
SELECT
CASE
WHEN array_length(involved_tables, 1) = 1 THEN 'Single Table'
WHEN array_length(involved_tables, 1) = 2 THEN 'Two Table'
ELSE 'Multi Table'
END as pattern_type,
resolution_time_ms,
involved_tables,
ARRAY[process_1_application, process_2_application] as applications
FROM deadlock_incidents
WHERE occurred_at >= NOW() - (days_back || ' days')::INTERVAL
)
SELECT
da.pattern_type,
COUNT(*)::INTEGER as frequency,
AVG(da.resolution_time_ms) as avg_resolution_time_ms,
array_agg(DISTINCT involved_table_unnest) as involved_tables,
array_agg(DISTINCT app_unnest) as common_applications
FROM deadlock_analysis da,
unnest(da.involved_tables) as involved_table_unnest,
unnest(da.applications) as app_unnest
GROUP BY da.pattern_type
ORDER BY frequency DESC;
END;
$$ LANGUAGE plpgsql;
5.2 Alerting and Notification Systems
Real-Time Deadlock Alerts
-- Create alert function for immediate deadlock detection
CREATE OR REPLACE FUNCTION send_deadlock_alert(
alert_message TEXT,
severity TEXT DEFAULT 'HIGH'
) RETURNS VOID AS $$
DECLARE
webhook_url TEXT := 'https://hooks.slack.com/services/YOUR/WEBHOOK/URL';
payload JSON;
BEGIN
-- Construct alert payload
payload := json_build_object(
'text', alert_message,
'username', 'PostgreSQL Monitor',
'channel', '#alerts',
'attachments', json_build_array(
json_build_object(
'color', CASE
WHEN severity = 'CRITICAL' THEN 'danger'
WHEN severity = 'HIGH' THEN 'warning'
ELSE 'good'
END,
'fields', json_build_array(
json_build_object(
'title', 'Database',
'value', current_database(),
'short', true
),
json_build_object(
'title', 'Timestamp',
'value', NOW()::TEXT,
'short', true
)
)
)
)
);
-- Send webhook notification (requires http extension)
PERFORM http_post(webhook_url, payload::TEXT, 'application/json');
-- Also log to PostgreSQL log
RAISE WARNING 'DEADLOCK ALERT [%]: %', severity, alert_message;
EXCEPTION WHEN OTHERS THEN
-- If webhook fails, at least log locally
RAISE WARNING 'Failed to send deadlock alert: % - Original message: %', SQLERRM, alert_message;
END;
$$ LANGUAGE plpgsql;
-- Monitoring function for long-running lock waits
CREATE OR REPLACE FUNCTION monitor_lock_waits() RETURNS VOID AS $$
DECLARE
long_wait_threshold INTERVAL := '30 seconds';
critical_wait_threshold INTERVAL := '2 minutes';
wait_record RECORD;
alert_message TEXT;
BEGIN
FOR wait_record IN
SELECT
pid,
application_name,
query,
locktype,
mode,
table_name,
query_duration,
blocking_pid,
blocking_query,
blocking_app
FROM comprehensive_lock_analysis
WHERE lock_status = 'WAITING'
AND query_duration > long_wait_threshold
LOOP
alert_message := format(
'Lock wait detected: PID %s (%s) waiting %s for %s lock on %s. Blocked by PID %s (%s)',
wait_record.pid,
wait_record.application_name,
wait_record.query_duration,
wait_record.mode,
COALESCE(wait_record.table_name::TEXT, wait_record.locktype),
wait_record.blocking_pid,
wait_record.blocking_app
);
IF wait_record.query_duration > critical_wait_threshold THEN
PERFORM send_deadlock_alert(alert_message, 'CRITICAL');
ELSE
PERFORM send_deadlock_alert(alert_message, 'HIGH');
END IF;
END LOOP;
END;
$$ LANGUAGE plpgsql;
Automated Deadlock Resolution
-- Function to automatically resolve certain types of deadlocks
CREATE OR REPLACE FUNCTION auto_resolve_deadlocks() RETURNS TEXT[] AS $$
DECLARE
resolution_actions TEXT[] := ARRAY[]::TEXT[];
deadlock_session RECORD;
kill_query TEXT;
BEGIN
-- Find sessions that have been waiting too long and are safe to kill
FOR deadlock_session IN
SELECT
pid,
application_name,
query,
query_duration,
transaction_duration,
usename
FROM comprehensive_lock_analysis
WHERE lock_status = 'WAITING'
AND query_duration > INTERVAL '5 minutes'
AND application_name NOT IN ('critical_app', 'admin_console') -- Protect critical apps
AND query !~* '(backup|restore|reindex|vacuum)' -- Don't kill maintenance
ORDER BY query_duration DESC
LIMIT 5 -- Limit resolution actions per run
LOOP
-- Attempt to cancel the query first (graceful)
PERFORM pg_cancel_backend(deadlock_session.pid);
resolution_actions := resolution_actions ||
format('Cancelled query for PID %s (app: %s, duration: %s)',
deadlock_session.pid,
deadlock_session.application_name,
deadlock_session.query_duration);
-- Wait a moment and check if it worked
PERFORM pg_sleep(2);
-- If still running, terminate the connection
IF EXISTS (SELECT 1 FROM pg_stat_activity WHERE pid = deadlock_session.pid) THEN
PERFORM pg_terminate_backend(deadlock_session.pid);
resolution_actions := resolution_actions ||
format('Terminated connection for PID %s', deadlock_session.pid);
END IF;
-- Log the action
RAISE WARNING 'Auto-resolved potential deadlock: PID %, Query: %',
deadlock_session.pid,
left(deadlock_session.query, 100);
END LOOP;
RETURN resolution_actions;
END;
$$ LANGUAGE plpgsql;
5.3 Performance Tuning for Deadlock Prevention
Connection Pool Optimization
-- Function to analyze connection pool efficiency
CREATE OR REPLACE FUNCTION analyze_connection_patterns() RETURNS TABLE (
application_name TEXT,
avg_connections INTEGER,
max_connections INTEGER,
avg_transaction_duration INTERVAL,
long_transaction_count INTEGER,
idle_in_transaction_count INTEGER,
deadlock_involvement_rate NUMERIC
) AS $$
BEGIN
RETURN QUERY
SELECT
psa.application_name,
COUNT(*)::INTEGER as avg_connections,
COUNT(*)::INTEGER as max_connections, -- This would need historical data
AVG(NOW() - psa.xact_start) as avg_transaction_duration,
COUNT(*) FILTER (WHERE NOW() - psa.xact_start > INTERVAL '1 minute')::INTEGER as long_transaction_count,
COUNT(*) FILTER (WHERE psa.state = 'idle in transaction')::INTEGER as idle_in_transaction_count,
0.0 as deadlock_involvement_rate -- Would need to join with deadlock history
FROM pg_stat_activity psa
WHERE psa.state NOT IN ('idle', '')
GROUP BY psa.application_name
ORDER BY long_transaction_count DESC, avg_transaction_duration DESC;
END;
$$ LANGUAGE plpgsql;
-- Recommended connection pool settings based on analysis
CREATE OR REPLACE FUNCTION recommend_pool_settings(app_name TEXT)
RETURNS TABLE (
setting_name TEXT,
current_value TEXT,
recommended_value TEXT,
reasoning TEXT
) AS $$
DECLARE
app_stats RECORD;
BEGIN
SELECT * INTO app_stats FROM analyze_connection_patterns() WHERE application_name = app_name;
IF app_stats IS NULL THEN
RAISE EXCEPTION 'Application % not found', app_name;
END IF;
-- Pool size recommendation
RETURN QUERY VALUES
('max_pool_size',
'unknown',
CASE
WHEN app_stats.avg_connections > 50 THEN '20-30'
WHEN app_stats.avg_connections > 20 THEN '10-15'
ELSE '5-10'
END,
format('Based on %s average connections', app_stats.avg_connections));
-- Connection lifetime
RETURN QUERY VALUES
('max_connection_lifetime',
'unknown',
CASE
WHEN app_stats.avg_transaction_duration > INTERVAL '5 minutes' THEN '30 minutes'
WHEN app_stats.avg_transaction_duration > INTERVAL '1 minute' THEN '15 minutes'
ELSE '10 minutes'
END,
format('Based on %s average transaction duration', app_stats.avg_transaction_duration));
-- Idle timeout
RETURN QUERY VALUES
('idle_timeout',
'unknown',
CASE
WHEN app_stats.idle_in_transaction_count > 5 THEN '30 seconds'
WHEN app_stats.idle_in_transaction_count > 2 THEN '60 seconds'
ELSE '120 seconds'
END,
format('Based on %s idle-in-transaction connections', app_stats.idle_in_transaction_count));
END;
$$ LANGUAGE plpgsql;
Index Optimization for Deadlock Reduction
-- Analyze index contention and suggest optimizations
CREATE OR REPLACE FUNCTION analyze_index_contention()
RETURNS TABLE (
schema_name TEXT,
table_name TEXT,
index_name TEXT,
index_size TEXT,
index_scans BIGINT,
index_tup_read BIGINT,
index_tup_fetch BIGINT,
contention_score NUMERIC,
recommendation TEXT
) AS $$
BEGIN
RETURN QUERY
SELECT
psi.schemaname::TEXT,
psi.relname::TEXT,
psi.indexrelname::TEXT,
pg_size_pretty(pg_relation_size(indexrelid)) as index_size,
idx_scan as index_scans,
idx_tup_read as index_tup_read,
idx_tup_fetch as index_tup_fetch,
-- Simple contention score based on read/fetch ratio
CASE
WHEN idx_tup_fetch = 0 THEN 0
ELSE ROUND((idx_tup_read::NUMERIC / idx_tup_fetch::NUMERIC), 2)
END as contention_score,
CASE
WHEN idx_scan = 0 THEN 'Consider dropping - unused index'
WHEN idx_tup_read::NUMERIC / NULLIF(idx_tup_fetch, 0) > 100 THEN 'High contention - consider partitioning'
WHEN pg_relation_size(indexrelid) > 1024*1024*100 THEN 'Large index - monitor for hotspots'
ELSE 'Normal usage pattern'
END as recommendation
FROM pg_stat_user_indexes psi
JOIN pg_indexes pi ON psi.schemaname = pi.schemaname
AND psi.relname = pi.tablename
AND psi.indexrelname = pi.indexname
WHERE idx_scan > 0 OR idx_tup_read > 0
ORDER BY contention_score DESC NULLS LAST;
END;
$$ LANGUAGE plpgsql;
-- Generate index optimization suggestions
CREATE OR REPLACE FUNCTION suggest_index_optimizations(
target_table TEXT DEFAULT NULL
) RETURNS TABLE (
optimization_type TEXT,
current_situation TEXT,
suggested_action TEXT,
expected_benefit TEXT
) AS $$
BEGIN
-- Missing indexes for foreign keys
RETURN QUERY
SELECT
'Missing FK Index'::TEXT,
format('Table %s.%s, column %s', tc.table_schema, tc.table_name, kcu.column_name),
format('CREATE INDEX idx_%s_%s ON %s.%s (%s);',
tc.table_name, kcu.column_name, tc.table_schema, tc.table_name, kcu.column_name),
'Reduce deadlocks on foreign key lookups'::TEXT
FROM information_schema.table_constraints tc
JOIN information_schema.key_column_usage kcu USING (constraint_name, table_schema, table_name)
WHERE tc.constraint_type = 'FOREIGN KEY'
AND (target_table IS NULL OR tc.table_name = target_table)
AND NOT EXISTS (
SELECT 1 FROM pg_indexes
WHERE schemaname = tc.table_schema
AND tablename = tc.table_name
AND indexdef ~* ('\\(' || kcu.column_name || '\\)')
);
-- Partial indexes for common WHERE conditions
RETURN QUERY
SELECT
'Partial Index Opportunity'::TEXT,
format('Frequent queries on %s with WHERE conditions', target_table),
'CREATE INDEX idx_partial ON table (column) WHERE condition;'::TEXT,
'Reduce index size and contention'::TEXT
WHERE target_table IS NOT NULL;
END;
$$ LANGUAGE plpgsql;
6. Advanced Prevention Techniques
6.1 Database Design Patterns for Deadlock Prevention
Event Sourcing Pattern
-- Instead of updating records directly, append events
CREATE TABLE account_events (
id BIGSERIAL PRIMARY KEY,
account_id INTEGER NOT NULL,
event_type TEXT NOT NULL,
amount DECIMAL(15,2),
metadata JSONB,
created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
sequence_number BIGSERIAL
);
-- Create index for efficient event replay
CREATE INDEX idx_account_events_account_sequence
ON account_events (account_id, sequence_number);
-- Function to append events (no locks on existing data)
CREATE OR REPLACE FUNCTION append_account_event(
p_account_id INTEGER,
p_event_type TEXT,
p_amount DECIMAL DEFAULT NULL,
p_metadata JSONB DEFAULT NULL
) RETURNS BIGINT AS $$
DECLARE
event_id BIGINT;
BEGIN
INSERT INTO account_events (account_id, event_type, amount, metadata)
VALUES (p_account_id, p_event_type, p_amount, p_metadata)
RETURNING id INTO event_id;
-- Optionally update materialized view asynchronously
PERFORM pg_notify('account_changed', p_account_id::TEXT);
RETURN event_id;
END;
$$ LANGUAGE plpgsql;
-- Materialized view for current state (updated asynchronously)
CREATE MATERIALIZED VIEW account_balances AS
SELECT
account_id,
SUM(CASE
WHEN event_type = 'credit' THEN amount
WHEN event_type = 'debit' THEN -amount
ELSE 0
END) as current_balance,
COUNT(*) as transaction_count,
MAX(created_at) as last_transaction_at
FROM account_events
GROUP BY account_id;
CREATE UNIQUE INDEX idx_account_balances_account_id
ON account_balances (account_id);
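The pg_notify call above can drive a listener that refreshes the view outside any writer's transaction; a minimal sketch of the refresh side (CONCURRENTLY depends on the unique index just created):
sql
-- In a background worker connected to the same database
LISTEN account_changed;
-- On each notification (or on a debounce timer):
REFRESH MATERIALIZED VIEW CONCURRENTLY account_balances;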
Queue-Based Processing
-- Serialize conflicting operations through queues
CREATE TABLE operation_queue (
id BIGSERIAL PRIMARY KEY,
resource_key TEXT NOT NULL, -- What resource this operation affects
operation_type TEXT NOT NULL,
operation_data JSONB NOT NULL,
priority INTEGER DEFAULT 0,
status TEXT DEFAULT 'pending',
created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
started_at TIMESTAMP WITH TIME ZONE,
completed_at TIMESTAMP WITH TIME ZONE,
worker_id TEXT,
retry_count INTEGER DEFAULT 0,
error_message TEXT
);
CREATE INDEX idx_operation_queue_processing
ON operation_queue (resource_key, status, priority DESC, created_at);
-- Worker function to process operations sequentially per resource
CREATE OR REPLACE FUNCTION process_operation_queue(
worker_id TEXT,
max_operations INTEGER DEFAULT 10
) RETURNS INTEGER AS $$
DECLARE
operation RECORD;
operations_processed INTEGER := 0;
BEGIN
-- Process operations one resource at a time to avoid deadlocks
FOR operation IN
-- DISTINCT ON cannot be combined with FOR UPDATE, so lock inside a
-- LATERAL subquery: claim the top pending operation per resource_key
SELECT oq.*
FROM (SELECT DISTINCT resource_key
      FROM operation_queue
      WHERE status = 'pending') rk
CROSS JOIN LATERAL (
    SELECT * FROM operation_queue
    WHERE resource_key = rk.resource_key AND status = 'pending'
    ORDER BY priority DESC, created_at ASC
    LIMIT 1
    FOR UPDATE SKIP LOCKED
) oq
LIMIT max_operations
LOOP
-- Mark as in progress
UPDATE operation_queue
SET status = 'processing',
started_at = NOW(),
worker_id = process_operation_queue.worker_id
WHERE id = operation.id;
BEGIN
-- Process the operation
CASE operation.operation_type
WHEN 'transfer' THEN
PERFORM execute_transfer(operation.operation_data);
WHEN 'update_inventory' THEN
PERFORM update_inventory(operation.operation_data);
ELSE
RAISE EXCEPTION 'Unknown operation type: %', operation.operation_type;
END CASE;
-- Mark as completed
UPDATE operation_queue
SET status = 'completed', completed_at = NOW()
WHERE id = operation.id;
operations_processed := operations_processed + 1;
EXCEPTION WHEN OTHERS THEN
-- Mark as failed and potentially retry
UPDATE operation_queue
SET status = CASE
WHEN retry_count < 3 THEN 'pending'
ELSE 'failed'
END,
retry_count = retry_count + 1,
error_message = SQLERRM
WHERE id = operation.id;
END;
END LOOP;
RETURN operations_processed;
END;
$$ LANGUAGE plpgsql;
6.2 Application-Level Deadlock Prevention
Distributed Lock Manager
-- Distributed lock management using PostgreSQL
CREATE TABLE distributed_locks (
lock_name TEXT PRIMARY KEY,
owner_id TEXT NOT NULL,
acquired_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
expires_at TIMESTAMP WITH TIME ZONE NOT NULL,
metadata JSONB,
heartbeat_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);
CREATE INDEX idx_distributed_locks_expires
ON distributed_locks (expires_at);
-- Function to acquire a distributed lock
CREATE OR REPLACE FUNCTION acquire_distributed_lock(
p_lock_name TEXT,
p_owner_id TEXT,
p_timeout_seconds INTEGER DEFAULT 300,
p_metadata JSONB DEFAULT NULL
) RETURNS BOOLEAN AS $$
DECLARE
lock_acquired BOOLEAN := FALSE;
expiry_time TIMESTAMP WITH TIME ZONE;
BEGIN
expiry_time := NOW() + (p_timeout_seconds || ' seconds')::INTERVAL;
-- Clean up expired locks first
DELETE FROM distributed_locks
WHERE expires_at < NOW();
BEGIN
-- Try to acquire the lock
INSERT INTO distributed_locks (lock_name, owner_id, expires_at, metadata)
VALUES (p_lock_name, p_owner_id, expiry_time, p_metadata);
lock_acquired := TRUE;
EXCEPTION WHEN unique_violation THEN
-- Lock already exists, check if we own it
UPDATE distributed_locks
SET expires_at = expiry_time,
heartbeat_at = NOW(),
metadata = COALESCE(p_metadata, metadata)
WHERE lock_name = p_lock_name
AND owner_id = p_owner_id;
lock_acquired := FOUND;
END;
RETURN lock_acquired;
END;
$$ LANGUAGE plpgsql;
-- Function to release a distributed lock
CREATE OR REPLACE FUNCTION release_distributed_lock(
p_lock_name TEXT,
p_owner_id TEXT
) RETURNS BOOLEAN AS $$
BEGIN
DELETE FROM distributed_locks
WHERE lock_name = p_lock_name
AND owner_id = p_owner_id;
RETURN FOUND;
END;
$$ LANGUAGE plpgsql;
-- Heartbeat function to keep locks alive
CREATE OR REPLACE FUNCTION heartbeat_distributed_lock(
p_lock_name TEXT,
p_owner_id TEXT,
p_extend_seconds INTEGER DEFAULT 300
) RETURNS BOOLEAN AS $$
BEGIN
UPDATE distributed_locks
SET heartbeat_at = NOW(),
expires_at = NOW() + (p_extend_seconds || ' seconds')::INTERVAL
WHERE lock_name = p_lock_name
AND owner_id = p_owner_id
AND expires_at > NOW(); -- Only extend if not expired
RETURN FOUND;
END;
$$ LANGUAGE plpgsql;
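A hypothetical client session tying the three functions together (the lock name and owner ID are examples; owner IDs would normally encode host and process):
sql
SELECT acquire_distributed_lock('nightly-billing', 'worker-7', 600);
-- ... perform long-running work, periodically extending the lease ...
SELECT heartbeat_distributed_lock('nightly-billing', 'worker-7', 600);
-- ... when finished ...
SELECT release_distributed_lock('nightly-billing', 'worker-7');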
6.3 Microservices and Deadlock Prevention
Saga Pattern Implementation
-- Saga orchestration for distributed transactions
CREATE TABLE sagas (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
saga_type TEXT NOT NULL,
status TEXT NOT NULL DEFAULT 'started',
data JSONB NOT NULL,
created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
updated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
completed_at TIMESTAMP WITH TIME ZONE,
compensated_at TIMESTAMP WITH TIME ZONE
);
CREATE TABLE saga_steps (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
saga_id UUID NOT NULL REFERENCES sagas(id),
step_number INTEGER NOT NULL,
step_name TEXT NOT NULL,
step_type TEXT NOT NULL, -- 'action' or 'compensation'
status TEXT NOT NULL DEFAULT 'pending',
input_data JSONB,
output_data JSONB,
error_message TEXT,
started_at TIMESTAMP WITH TIME ZONE,
completed_at TIMESTAMP WITH TIME ZONE,
UNIQUE(saga_id, step_number)
);
-- Function to execute saga steps
CREATE OR REPLACE FUNCTION execute_saga_step(
p_saga_id UUID,
p_step_number INTEGER
) RETURNS BOOLEAN AS $$
DECLARE
saga_record RECORD;
step_record RECORD;
success BOOLEAN := FALSE;
BEGIN
-- Get saga and step information
SELECT * INTO saga_record FROM sagas WHERE id = p_saga_id;
SELECT * INTO step_record FROM saga_steps
WHERE saga_id = p_saga_id AND step_number = p_step_number;
IF saga_record IS NULL OR step_record IS NULL THEN
RETURN FALSE;
END IF;
-- Mark step as started
UPDATE saga_steps
SET status = 'running', started_at = NOW()
WHERE id = step_record.id;
BEGIN
-- Execute the step based on step_name
CASE step_record.step_name
WHEN 'reserve_inventory' THEN
PERFORM reserve_inventory_saga(step_record.input_data);
WHEN 'charge_payment' THEN
PERFORM charge_payment_saga(step_record.input_data);
WHEN 'create_shipment' THEN
PERFORM create_shipment_saga(step_record.input_data);
-- Compensation steps
WHEN 'release_inventory' THEN
PERFORM release_inventory_saga(step_record.input_data);
WHEN 'refund_payment' THEN
PERFORM refund_payment_saga(step_record.input_data);
WHEN 'cancel_shipment' THEN
PERFORM cancel_shipment_saga(step_record.input_data);
ELSE
RAISE EXCEPTION 'Unknown saga step: %', step_record.step_name;
END CASE;
-- Mark step as completed
UPDATE saga_steps
SET status = 'completed', completed_at = NOW()
WHERE id = step_record.id;
success := TRUE;
EXCEPTION WHEN OTHERS THEN
-- Mark step as failed
UPDATE saga_steps
SET status = 'failed',
error_message = SQLERRM,
completed_at = NOW()
WHERE id = step_record.id;
-- Trigger compensation
PERFORM compensate_saga(p_saga_id);
success := FALSE;
END;
RETURN success;
END;
$$ LANGUAGE plpgsql;
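The compensate_saga() routine invoked on failure is not shown above; a minimal sketch under the same schema, which marks the saga and replays pending compensation steps newest-first (error handling for failed compensations is omitted):
sql
CREATE OR REPLACE FUNCTION compensate_saga(p_saga_id UUID) RETURNS VOID AS $$
DECLARE
    comp_step RECORD;
BEGIN
    UPDATE sagas SET status = 'compensating', updated_at = NOW()
    WHERE id = p_saga_id;

    -- Run compensation steps for completed actions, newest first
    FOR comp_step IN
        SELECT * FROM saga_steps
        WHERE saga_id = p_saga_id
          AND step_type = 'compensation'
          AND status = 'pending'
        ORDER BY step_number DESC
    LOOP
        PERFORM execute_saga_step(p_saga_id, comp_step.step_number);
    END LOOP;

    UPDATE sagas SET status = 'compensated', compensated_at = NOW()
    WHERE id = p_saga_id;
END;
$$ LANGUAGE plpgsql;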
7. Production Deployment Checklist
7.1 Configuration Validation
-- Function to validate deadlock prevention configuration
CREATE OR REPLACE FUNCTION validate_deadlock_config()
RETURNS TABLE (
check_name TEXT,
status TEXT,
current_value TEXT,
recommended_value TEXT,
severity TEXT
) AS $$
BEGIN
-- Check deadlock_timeout
RETURN QUERY
SELECT
'deadlock_timeout'::TEXT,
CASE
WHEN current_setting('deadlock_timeout')::INTERVAL > INTERVAL '2 seconds'
THEN 'WARNING'
ELSE 'OK'
END,
current_setting('deadlock_timeout'),
'200ms - 1s'::TEXT,
'MEDIUM'::TEXT;
-- Check log_lock_waits
RETURN QUERY
SELECT
'log_lock_waits'::TEXT,
CASE
WHEN current_setting('log_lock_waits') = 'off'
THEN 'WARNING'
ELSE 'OK'
END,
current_setting('log_lock_waits'),
'on'::TEXT,
'HIGH'::TEXT;
-- Check max_connections vs max_locks_per_transaction
RETURN QUERY
SELECT
'lock_memory_sizing'::TEXT,
CASE
WHEN current_setting('max_connections')::INTEGER *
current_setting('max_locks_per_transaction')::INTEGER > 100000
THEN 'WARNING'
ELSE 'OK'
END,
format('%s connections × %s locks = %s total',
current_setting('max_connections'),
current_setting('max_locks_per_transaction'),
(current_setting('max_connections')::INTEGER *
current_setting('max_locks_per_transaction')::INTEGER)::TEXT),
'Consider reducing if memory limited'::TEXT,
'LOW'::TEXT;
-- Check statement_timeout
RETURN QUERY
SELECT
'statement_timeout'::TEXT,
CASE
WHEN current_setting('statement_timeout')::INTERVAL = INTERVAL '0'
THEN 'WARNING'
ELSE 'OK'
END,
current_setting('statement_timeout'),
'30s - 300s depending on workload'::TEXT,
'MEDIUM'::TEXT;
END;
$$ LANGUAGE plpgsql;
7.2 Deployment Monitoring Script
-- Create comprehensive monitoring dashboard
CREATE OR REPLACE VIEW deadlock_prevention_dashboard AS
SELECT
'Configuration' as category,
check_name as metric,
status as value,
severity as priority,
current_value as details
FROM validate_deadlock_config()
UNION ALL
SELECT
'Current Locks' as category,
lock_status as metric,
COUNT(*)::TEXT as value,
CASE
WHEN lock_status = 'WAITING' AND COUNT(*) > 5 THEN 'HIGH'
WHEN lock_status = 'WAITING' AND COUNT(*) > 0 THEN 'MEDIUM'
ELSE 'LOW'
END as priority,
string_agg(DISTINCT table_name::TEXT, ', ') as details
FROM comprehensive_lock_analysis
GROUP BY lock_status
UNION ALL
SELECT
'Performance' as category,
'Average Transaction Duration' as metric,
ROUND(EXTRACT(EPOCH FROM AVG(NOW() - xact_start))::NUMERIC, 2)::TEXT || 's' as value,
CASE
WHEN AVG(NOW() - xact_start) > INTERVAL '1 minute' THEN 'HIGH'
WHEN AVG(NOW() - xact_start) > INTERVAL '10 seconds' THEN 'MEDIUM'
ELSE 'LOW'
END as priority,
COUNT(*)::TEXT || ' active transactions' as details
FROM pg_stat_activity
WHERE state = 'active' AND xact_start IS NOT NULL
UNION ALL
SELECT
'Connection Health' as category,
'Idle in Transaction' as metric,
COUNT(*)::TEXT as value,
CASE
WHEN COUNT(*) > 10 THEN 'HIGH'
WHEN COUNT(*) > 5 THEN 'MEDIUM'
ELSE 'LOW'
END as priority,
string_agg(DISTINCT application_name, ', ') as details
FROM pg_stat_activity
WHERE state = 'idle in transaction'
GROUP BY state
ORDER BY
CASE priority
WHEN 'HIGH' THEN 1
WHEN 'MEDIUM' THEN 2
WHEN 'LOW' THEN 3
END,
category, metric;
-- Health check function for automated monitoring
CREATE OR REPLACE FUNCTION deadlock_health_check()
RETURNS TABLE (
overall_status TEXT,
critical_issues INTEGER,
warnings INTEGER,
recommendations TEXT[]
) AS $$
DECLARE
critical_count INTEGER := 0;
warning_count INTEGER := 0;
rec_array TEXT[] := ARRAY[]::TEXT[];
dashboard_row RECORD;
BEGIN
-- Count issues from dashboard
FOR dashboard_row IN SELECT * FROM deadlock_prevention_dashboard LOOP
CASE dashboard_row.priority
WHEN 'HIGH' THEN
critical_count := critical_count + 1;
rec_array := rec_array || format('CRITICAL: %s - %s',
dashboard_row.metric, dashboard_row.details);
WHEN 'MEDIUM' THEN
warning_count := warning_count + 1;
rec_array := rec_array || format('WARNING: %s - %s',
dashboard_row.metric, dashboard_row.details);
ELSE
    NULL; -- 'LOW' priority entries require no action
END CASE;
END LOOP;
-- Return overall assessment
RETURN QUERY SELECT
CASE
WHEN critical_count > 0 THEN 'CRITICAL'
WHEN warning_count > 3 THEN 'WARNING'
ELSE 'HEALTHY'
END::TEXT,
critical_count,
warning_count,
rec_array;
END;
$$ LANGUAGE plpgsql;
7.3 Automated Testing Framework
-- Create deadlock simulation for testing
CREATE OR REPLACE FUNCTION simulate_deadlock_scenario(
scenario_name TEXT,
concurrent_sessions INTEGER DEFAULT 2
) RETURNS TEXT AS $$
DECLARE
result TEXT;
session_count INTEGER;
BEGIN
-- Validate scenario
IF scenario_name NOT IN ('two_table_transfer', 'foreign_key_insert', 'index_contention') THEN
RAISE EXCEPTION 'Unknown scenario: %', scenario_name;
END IF;
-- Check current load
SELECT COUNT(*) INTO session_count
FROM pg_stat_activity
WHERE state = 'active';
IF session_count > 10 THEN
RETURN 'SKIPPED: High system load detected';
END IF;
-- Execute scenario
CASE scenario_name
WHEN 'two_table_transfer' THEN
result := simulate_transfer_deadlock(concurrent_sessions);
WHEN 'foreign_key_insert' THEN
result := simulate_fk_deadlock(concurrent_sessions);
WHEN 'index_contention' THEN
result := simulate_index_deadlock(concurrent_sessions);
END CASE;
RETURN result;
END;
$$ LANGUAGE plpgsql;
-- Two-table transfer deadlock simulation
CREATE OR REPLACE FUNCTION simulate_transfer_deadlock(sessions INTEGER)
RETURNS TEXT AS $$
DECLARE
start_time TIMESTAMP;
end_time TIMESTAMP;
deadlock_detected BOOLEAN := FALSE;
BEGIN
-- Create test tables if they don't exist
CREATE TABLE IF NOT EXISTS test_accounts (
id SERIAL PRIMARY KEY,
balance DECIMAL(10,2) DEFAULT 1000.00,
updated_at TIMESTAMP DEFAULT NOW()
);
-- Ensure test data
INSERT INTO test_accounts (id, balance)
VALUES (1, 1000.00), (2, 1000.00)
ON CONFLICT (id) DO UPDATE SET balance = EXCLUDED.balance;
start_time := clock_timestamp();
BEGIN
-- Simulate concurrent transfers in opposite directions
-- This would normally require multiple connections
-- For testing, we'll use advisory locks to simulate the pattern
IF pg_try_advisory_lock(1) THEN
-- Session 1 simulation
UPDATE test_accounts SET balance = balance - 100 WHERE id = 1;
PERFORM pg_sleep(0.1); -- Simulate processing delay
IF pg_try_advisory_lock(2) THEN
UPDATE test_accounts SET balance = balance + 100 WHERE id = 2;
PERFORM pg_advisory_unlock(2);
PERFORM pg_advisory_unlock(1);
deadlock_detected := FALSE;
ELSE
PERFORM pg_advisory_unlock(1);
deadlock_detected := TRUE;
END IF;
END IF;
EXCEPTION WHEN deadlock_detected THEN
deadlock_detected := TRUE;
END;
end_time := clock_timestamp();
-- Clean up test data
DELETE FROM test_accounts WHERE id IN (1, 2);
RETURN format('Simulation completed in %s ms. Deadlock detected: %s',
EXTRACT(MILLISECONDS FROM (end_time - start_time))::INTEGER,
deadlock_detected);
END;
$$ LANGUAGE plpgsql;
8. Emergency Response Procedures
8.1 Deadlock Incident Response Plan
-- Emergency deadlock resolution function
CREATE OR REPLACE FUNCTION emergency_deadlock_resolution(
max_kill_count INTEGER DEFAULT 5
) RETURNS TABLE (
action_taken TEXT,
pid INTEGER,
application_name TEXT,
query_duration INTERVAL,
query_preview TEXT
) AS $$
DECLARE
victim RECORD;
kill_count INTEGER := 0;
BEGIN
-- First, try graceful query cancellation
FOR victim IN
SELECT
psa.pid,
psa.application_name,
NOW() - psa.query_start as duration,
LEFT(psa.query, 100) as query_preview
FROM pg_stat_activity psa
JOIN comprehensive_lock_analysis cla ON psa.pid = cla.pid
WHERE cla.lock_status = 'WAITING'
AND psa.query_start < NOW() - INTERVAL '2 minutes'
AND psa.application_name NOT LIKE '%admin%'
ORDER BY psa.query_start ASC
LIMIT max_kill_count
LOOP
-- Try to cancel the query first
IF pg_cancel_backend(victim.pid) THEN
kill_count := kill_count + 1;
RETURN QUERY SELECT
'CANCELLED'::TEXT,
victim.pid,
victim.application_name,
victim.duration,
victim.query_preview;
-- Wait briefly and check if it worked
PERFORM pg_sleep(1);
-- If still running, terminate the connection
IF EXISTS (SELECT 1 FROM pg_stat_activity WHERE pid = victim.pid) THEN
PERFORM pg_terminate_backend(victim.pid);
RETURN QUERY SELECT
'TERMINATED'::TEXT,
victim.pid,
victim.application_name,
victim.duration,
victim.query_preview;
END IF;
END IF;
IF kill_count >= max_kill_count THEN
EXIT;
END IF;
END LOOP;
-- Log the emergency action
INSERT INTO deadlock_incidents (
occurred_at,
victim_pid,
log_entry
) VALUES (
NOW(),
NULL,
format('Emergency resolution killed %s sessions', kill_count)
);
IF kill_count = 0 THEN
RETURN QUERY SELECT
'NO_ACTION'::TEXT,
NULL::INTEGER,
'No long-running waiting sessions found'::TEXT,
NULL::INTERVAL,
NULL::TEXT;
END IF;
END;
$$ LANGUAGE plpgsql;
8.2 Post-Incident Analysis
-- Function to analyze deadlock incidents after resolution
CREATE OR REPLACE FUNCTION analyze_deadlock_incident(
incident_start TIMESTAMP,
incident_end TIMESTAMP DEFAULT NOW()
) RETURNS TABLE (
analysis_category TEXT,
finding TEXT,
impact_assessment TEXT,
prevention_recommendation TEXT
) AS $$
BEGIN
-- Analyze lock wait patterns during incident
RETURN QUERY
SELECT
'Lock Patterns'::TEXT,
format('Peak waiting sessions: %s across lock types: %s',
MAX(waiting_locks), string_agg(DISTINCT locktype, ', ')),
CASE
WHEN COUNT(*) > 20 THEN 'High impact - system-wide slowdown likely'
WHEN COUNT(*) > 10 THEN 'Medium impact - affected multiple users'
ELSE 'Low impact - isolated incident'
END::TEXT,
'Implement consistent lock ordering for affected tables'::TEXT
FROM lock_statistics
WHERE recorded_at BETWEEN incident_start AND incident_end
AND waiting_locks > 0;
-- Analyze application patterns
RETURN QUERY
SELECT
'Application Impact'::TEXT,
format('Applications affected: %s', string_agg(DISTINCT application_name, ', ')),
'Multiple applications involved - suggests systemic issue'::TEXT,
'Review transaction boundaries across all applications'::TEXT
FROM pg_stat_activity
WHERE backend_start BETWEEN incident_start AND incident_end;
-- Timeline analysis
RETURN QUERY
SELECT
'Timeline'::TEXT,
format('Incident duration: %s', incident_end - incident_start),
CASE
WHEN incident_end - incident_start > INTERVAL '5 minutes'
THEN 'Extended outage - immediate action required'
ELSE 'Brief incident - monitor for recurrence'
END::TEXT,
'Implement faster deadlock detection timeout'::TEXT;
END;
$$ LANGUAGE plpgsql;
9. Best Practices Summary
9.1 Development Guidelines
-- Code review checklist function
CREATE OR REPLACE FUNCTION deadlock_code_review_checklist()
RETURNS TABLE (
category TEXT,
checkpoint TEXT,
description TEXT,
sql_example TEXT
) AS $$
BEGIN
RETURN QUERY VALUES
('Lock Ordering', 'Consistent Resource Access',
'Always access resources (tables, rows) in the same order across all code paths',
'SELECT * FROM accounts WHERE id IN (1,2) ORDER BY id FOR UPDATE'),
('Transaction Scope', 'Minimize Transaction Time',
'Keep transactions as short as possible, avoid I/O within transactions',
'BEGIN; UPDATE accounts SET balance = balance - 100 WHERE id = 1; COMMIT;'),
('Error Handling', 'Deadlock Retry Logic',
'Implement exponential backoff retry for deadlock errors',
'EXCEPTION WHEN deadlock_detected THEN PERFORM pg_sleep(random() * power(2, attempt));'),
('Locking Strategy', 'Use Appropriate Lock Levels',
'Use the least restrictive lock level that maintains data integrity',
'SELECT * FROM inventory WHERE product_id = ? FOR UPDATE NOWAIT'),
('Index Design', 'Support Query Patterns',
'Ensure indexes support your query patterns to minimize lock duration',
'CREATE INDEX idx_orders_customer_date ON orders(customer_id, order_date)'),
('Advisory Locks', 'Custom Synchronization',
'Use advisory locks for application-level synchronization instead of table locks',
'SELECT pg_advisory_lock(hashtext(''resource_'' || resource_id))');
END;
$$ LANGUAGE plpgsql;
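The 'Deadlock Retry Logic' checkpoint deserves a fuller illustration. The following is a minimal sketch of subtransaction-level retry with exponential backoff; transfer_funds is a hypothetical helper, and most production systems retry the whole transaction from the client side instead.
DO $retry$
DECLARE
    attempt INTEGER := 0;
BEGIN
    LOOP
        BEGIN
            PERFORM transfer_funds(1, 2, 100.00);  -- hypothetical helper
            EXIT;  -- success, leave the loop
        EXCEPTION WHEN deadlock_detected THEN
            attempt := attempt + 1;
            IF attempt > 3 THEN
                RAISE;  -- give up after three retries
            END IF;
            PERFORM pg_sleep(random() * power(2, attempt));  -- exponential backoff
        END;
    END LOOP;
END;
$retry$;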
9.2 Operations Guidelines
-- Operational monitoring checklist
CREATE OR REPLACE FUNCTION operational_monitoring_checklist()
RETURNS TABLE (
frequency TEXT,
check_name TEXT,
action TEXT,
alert_threshold TEXT
) AS $$
BEGIN
RETURN QUERY VALUES
('Real-time', 'Active Lock Waits',
'Monitor pg_locks for waiting sessions', '> 5 waiting sessions'),
('Every minute', 'Long Running Transactions',
'Check for transactions running longer than expected', '> 2 minutes'),
('Every 5 minutes', 'Deadlock Detection Timeout',
'Verify deadlock_timeout is appropriate for workload', 'Current: 1s'),
('Hourly', 'Lock Statistics Trends',
'Review lock_statistics table for patterns', 'Increasing wait times'),
('Daily', 'Deadlock Incident Analysis',
'Review deadlock_incidents for patterns', '> 0 incidents'),
('Weekly', 'Index Contention Analysis',
'Analyze index usage and contention patterns', 'High read/fetch ratio');
END;
$$ LANGUAGE plpgsql;
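The 'Active Lock Waits' check can be implemented directly against pg_locks; a minimal sketch of the real-time probe:
-- Count sessions currently waiting on a lock; alert when > 5
SELECT COUNT(*) AS waiting_sessions
FROM pg_locks
WHERE NOT granted;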
10. Performance Optimization Strategies
10.1 Connection Pool Tuning
-- Function to generate connection pool recommendations
CREATE OR REPLACE FUNCTION connection_pool_recommendations()
RETURNS TABLE (
pool_parameter TEXT,
current_load_pattern TEXT,
recommended_value TEXT,
rationale TEXT
) AS $$
DECLARE
avg_connections INTEGER;
avg_transaction_time INTERVAL;
idle_connections INTEGER;
BEGIN
-- Gather current metrics (client backends only; background processes
-- report a NULL state)
SELECT COUNT(*), AVG(NOW() - xact_start),
       COUNT(*) FILTER (WHERE state = 'idle')
INTO avg_connections, avg_transaction_time, idle_connections
FROM pg_stat_activity
WHERE state IS NOT NULL;
RETURN QUERY VALUES
('max_pool_size',
format('%s avg connections', avg_connections),
CASE
WHEN avg_connections > 100 THEN '50-75'
WHEN avg_connections > 50 THEN '25-35'
ELSE '10-20'
END,
'Based on current connection patterns and deadlock reduction'),
('connection_timeout',
format('%s avg transaction time', avg_transaction_time),
CASE
WHEN avg_transaction_time > INTERVAL '1 minute' THEN '60s'
WHEN avg_transaction_time > INTERVAL '10 seconds' THEN '30s'
ELSE '15s'
END,
'Prevent long-running transactions from holding connections'),
('pool_mode',
format('%s idle connections', idle_connections),
CASE
WHEN idle_connections > avg_connections * 0.3 THEN 'transaction'
ELSE 'session'
END,
'Optimize based on connection reuse patterns');
END;
$$ LANGUAGE plpgsql;
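The parameter names here are generic, so map the output onto your pooler of choice; for PgBouncer, pool_mode corresponds directly to its session and transaction pooling modes.
SELECT pool_parameter, recommended_value, rationale
FROM connection_pool_recommendations();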
10.2 Query Optimization for Deadlock Prevention
-- Function to identify queries that may cause deadlocks
CREATE OR REPLACE FUNCTION identify_deadlock_prone_queries()
RETURNS TABLE (
query_pattern TEXT,
frequency BIGINT,
avg_duration_ms NUMERIC,
tables_accessed TEXT[],
lock_risk_score INTEGER,
optimization_suggestion TEXT
) AS $$
BEGIN
RETURN QUERY
WITH query_analysis AS (
    SELECT
        regexp_replace(query, '\s+', ' ', 'g') as normalized_query,
        SUM(calls)::BIGINT as frequency,
        ROUND(AVG(mean_exec_time)::NUMERIC, 2) as avg_duration_ms,
        -- Simplified placeholder - real table extraction would need query parsing
        ARRAY['unknown'] as tables_accessed,
        CASE
            WHEN query ~* 'UPDATE.*UPDATE' THEN 8 -- Multiple updates
            WHEN query ~* 'SELECT.*FOR UPDATE.*ORDER BY' THEN 6 -- Ordered locking
            WHEN query ~* 'INSERT.*SELECT.*JOIN' THEN 7 -- Complex inserts
            WHEN query ~* 'DELETE.*EXISTS' THEN 5 -- Existence checks
            ELSE 2
        END as lock_risk_score
    FROM pg_stat_statements
    WHERE calls > 10 -- Only frequently executed queries
    GROUP BY query
)
SELECT
LEFT(qa.normalized_query, 100)::TEXT,
qa.frequency,
qa.avg_duration_ms,
qa.tables_accessed,
qa.lock_risk_score,
CASE
WHEN qa.lock_risk_score >= 8 THEN 'Break into smaller transactions'
WHEN qa.lock_risk_score >= 6 THEN 'Add consistent ordering clauses'
WHEN qa.lock_risk_score >= 4 THEN 'Consider optimistic locking'
ELSE 'Monitor for patterns'
END::TEXT
FROM query_analysis qa
ORDER BY qa.lock_risk_score DESC, qa.frequency DESC
LIMIT 20;
END;
$$ LANGUAGE plpgsql;
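This function assumes the pg_stat_statements extension is installed and loaded via shared_preload_libraries. A typical triage pass surfaces only the riskiest patterns:
-- Focus on the highest-risk query patterns first
SELECT query_pattern, lock_risk_score, optimization_suggestion
FROM identify_deadlock_prone_queries()
WHERE lock_risk_score >= 6;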
11. Conclusion and Maintenance
11.1 Long-term Maintenance Strategy
-- Automated maintenance schedule
CREATE OR REPLACE FUNCTION schedule_deadlock_maintenance()
RETURNS TEXT AS $$
BEGIN
-- Daily cleanup (multiple statements in one pg_cron job run as a single
-- transaction, so only transaction-safe commands belong here; note the
-- inner $job$ tags, which must differ from the function's own $$ delimiters)
PERFORM cron.schedule('deadlock-daily-cleanup', '0 2 * * *', $job$
    DELETE FROM lock_statistics WHERE recorded_at < NOW() - INTERVAL '30 days';
    DELETE FROM deadlock_incidents WHERE occurred_at < NOW() - INTERVAL '90 days';
$job$);
-- REFRESH ... CONCURRENTLY cannot run inside a transaction block,
-- so it gets a single-statement job of its own
PERFORM cron.schedule('deadlock-daily-refresh', '30 2 * * *',
    'REFRESH MATERIALIZED VIEW CONCURRENTLY account_balances');
-- Weekly analysis
PERFORM cron.schedule('deadlock-weekly-analysis', '0 3 * * 0', $job$
    INSERT INTO maintenance_log (task_name, results, created_at)
    SELECT 'weekly_deadlock_analysis',
           jsonb_agg(to_jsonb(analysis)) as results,
           NOW()
    FROM analyze_deadlock_patterns(7) analysis;
$job$);
-- Monthly optimization: VACUUM and REINDEX CONCURRENTLY are also
-- non-transactional, so each statement is scheduled separately
PERFORM cron.schedule('deadlock-monthly-vacuum-stats', '0 4 1 * *',
    'VACUUM ANALYZE lock_statistics');
PERFORM cron.schedule('deadlock-monthly-vacuum-incidents', '15 4 1 * *',
    'VACUUM ANALYZE deadlock_incidents');
PERFORM cron.schedule('deadlock-monthly-reindex', '30 4 1 * *',
    'REINDEX INDEX CONCURRENTLY idx_account_events_account_sequence');
RETURN 'Maintenance schedule configured successfully';
END;
$$ LANGUAGE plpgsql;
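This scheduler assumes the pg_cron extension plus the maintenance_log table and analyze_deadlock_patterns() function defined earlier. After running it once, the registered jobs can be inspected in pg_cron's catalog:
SELECT schedule_deadlock_maintenance();
-- Verify the registered jobs
SELECT jobid, schedule, command FROM cron.job;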
11.2 Success Metrics
Monitor these key metrics to measure the effectiveness of your deadlock prevention strategy:
- Deadlock Frequency: Target < 0.1 deadlocks per 1,000 transactions
- Average Lock Wait Time: Target < 100 ms at the 95th percentile
- Transaction Duration: Target < 1 second at the 99th percentile
- Long-running Transactions: Target < 5 concurrent transactions exceeding 30 seconds
- System Recovery Time: Target < 30 seconds from deadlock detection to resolution
-- Success metrics dashboard
CREATE OR REPLACE VIEW deadlock_success_metrics AS
WITH metrics AS (
SELECT
'Deadlock Frequency' as metric_name,
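-- Approximation: last-24h incidents divided by cumulative
-- transactions since the last statistics reset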
COALESCE(
(SELECT COUNT(*) FROM deadlock_incidents WHERE occurred_at >= NOW() - INTERVAL '24 hours'),
0
)::NUMERIC / NULLIF(
(SELECT SUM(xact_commit + xact_rollback) FROM pg_stat_database WHERE datname = current_database()),
0
) * 1000 as current_value,
0.1 as target_value,
'per 1000 transactions' as unit
UNION ALL
SELECT
'Average Lock Wait Time',
COALESCE(AVG(avg_wait_duration_ms), 0),
100.0,
'milliseconds'
FROM lock_statistics
WHERE recorded_at >= NOW() - INTERVAL '1 hour'
UNION ALL
SELECT
'Long Running Transactions',
COUNT(*)::NUMERIC,
5.0,
'transactions > 30s'
FROM pg_stat_activity
WHERE NOW() - xact_start > INTERVAL '30 seconds'
AND state = 'active'
)
SELECT
metric_name,
ROUND(current_value, 2) as current_value,
target_value,
unit,
CASE
WHEN current_value <= target_value THEN 'PASS'
WHEN current_value <= target_value * 1.5 THEN 'WARNING'
ELSE 'FAIL'
END as status,
ROUND(((target_value - current_value) / target_value * 100), 1) as performance_pct
FROM metrics;
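A quick way to use the view day to day is to surface only the metrics that miss their targets:
-- Show only off-target metrics
SELECT metric_name, current_value, target_value, unit, status
FROM deadlock_success_metrics
WHERE status <> 'PASS';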
This guide provides a complete framework for PostgreSQL deadlock prevention in high-concurrency environments. Regular monitoring, proactive prevention strategies, and quick incident response will help maintain system reliability and performance while minimizing the impact of deadlocks on your critical business operations.
Remember that deadlock prevention is an ongoing process that requires continuous monitoring, analysis, and optimization based on your specific workload patterns and business requirements.