When it comes to PostgreSQL performance tuning, work_mem is one of those parameters that can make a significant difference. It’s a straightforward concept—memory allocated for query operations—but understanding how to use it effectively can help you optimize your database performance.
A Typical Performance Investigation
Let’s start with a scenario many database administrators encounter. An analytics dashboard is running slower than you’d like. The query performing the heavy lifting sorts 10 million customer records by purchase date, groups them by region, and calculates aggregates. It’s taking about 8 seconds to complete.
Looking at the query plan with EXPLAIN ANALYZE, you notice: “Sort Method: external merge Disk: 245MB.”
This tells you that PostgreSQL couldn’t fit the sort operation in memory and had to use temporary disk files instead. You decide to test something—you increase work_mem from 4MB to 256MB for that session and run the query again. It now completes in 380 milliseconds.
That’s a 20× improvement from a single configuration adjustment. This example shows why work_mem is worth understanding in depth.
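A session-level experiment like the one above might look like this. The table and column names are hypothetical stand-ins for the analytics query described; the technique itself—raising work_mem for one session, re-running EXPLAIN ANALYZE, then resetting—is the standard way to test without touching the global configuration:

```sql
-- Raise work_mem for this session only; other connections are unaffected
SET work_mem = '256MB';

-- Re-run the slow query and inspect the plan
-- (customer_orders and its columns are hypothetical)
EXPLAIN (ANALYZE, BUFFERS)
SELECT region, count(*) AS orders, sum(order_total) AS revenue
FROM customer_orders
GROUP BY region
ORDER BY revenue DESC;

-- Return to the configured default when done
RESET work_mem;
```

If the plan now reads “Sort Method: quicksort” instead of “external merge Disk: …”, the sort fit in memory.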
What work_mem Actually Controls
When PostgreSQL executes a query, it often needs temporary workspace for data manipulation. Not the final results, but the intermediate shuffling, sorting, and combining that happens behind the scenes.
work_mem defines how much RAM each of these operations gets before PostgreSQL gives up and writes to disk instead. The operations that depend on it are everywhere:
Sorting operations power every ORDER BY clause. When you sort millions of rows by timestamp or alphabetically by name, PostgreSQL needs space to organize that data.
Hash tables enable the fastest joins and aggregations. When you JOIN two tables or GROUP BY a column with many distinct values, PostgreSQL builds hash tables in memory—if it can.
Deduplication for DISTINCT operations needs space to track which values have already been seen.
Materialization in complex query plans requires temporary storage for intermediate result sets.
These aren’t edge cases. They’re the fundamental building blocks of SQL.
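A single query often exercises several of these building blocks at once. In this sketch (the orders and customers tables are hypothetical), each clause maps to one of the operations above:

```sql
SELECT c.region,
       sum(o.total)         AS revenue,   -- hash aggregation (GROUP BY)
       count(DISTINCT o.id) AS n_orders   -- deduplication workspace
FROM orders o
JOIN customers c ON c.id = o.customer_id  -- hash join builds an in-memory table
GROUP BY c.region                         -- hash table keyed by region
ORDER BY revenue DESC;                    -- sort of the final result
```

Each of those operations can claim its own work_mem allocation, which is why the multiplication discussed later matters.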
The Death Spiral: When Memory Runs Out
Here’s what happens when work_mem is too small:
PostgreSQL starts a sort operation. It fills the allocated memory buffer. The data keeps coming—there’s more to sort than fits in RAM. PostgreSQL doesn’t fail; it adapts. It writes chunks to temporary files on disk, sorts what it can in memory, writes more chunks, then performs a multi-pass external merge sort.
Disk I/O is roughly 100× slower than RAM access for random operations. That sort that would have taken 200 milliseconds in memory now takes 5 seconds. But it doesn’t stop there.
The query has three more sorts and two hash joins. Each one spills to disk. The temporary files pile up—sometimes gigabytes worth. Your disk I/O saturates. Other queries waiting for disk access slow down. The problem cascades.
How to spot disk spills:
Enable log_temp_files = 0 in postgresql.conf. Your logs will show entries like:
LOG: temporary file: path "base/pgsql_tmp/pgsql_tmp12345.0", size 245234688
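Rather than editing postgresql.conf by hand, you can enable this from a SQL session:

```sql
-- Log every temporary file, regardless of size (0 = no minimum threshold)
ALTER SYSTEM SET log_temp_files = 0;

-- Apply the change without restarting the server
SELECT pg_reload_conf();
```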
Run EXPLAIN ANALYZE on slow queries. Look for:
Sort Method: external merge Disk: 245MB
The word “Disk” is your red flag. It means PostgreSQL ran out of work_mem and performance cratered.
The Memory Explosion: When work_mem Is Too High
Now imagine the opposite mistake. Your analytics are slow, so you do what seems logical: increase work_mem globally to 512MB. Problem solved, right?
Wrong. Catastrophically wrong.
Here’s the trap: work_mem isn’t a total limit. It’s a per-operation, per-worker, per-connection limit. The multiplication gets scary fast.
A worked example:
- You set work_mem = 512MB
- A reporting query performs 3 sorts and 2 hash joins (5 operations)
- It runs with 4 parallel workers
- You have 30 active database connections running similar queries
Theoretical peak memory: 512MB × 5 × 4 × 30 = 307GB
Your server has 64GB of RAM.
What happens next:
PostgreSQL allocates memory aggressively. The system exhausts physical RAM. The Linux kernel starts swapping. Performance doesn’t just degrade—it collapses. Every query becomes glacially slow as the system thrashes. In extreme cases, the OOM killer terminates PostgreSQL to save the system.
One “helpful” configuration change has taken down your database.
The Right Approach: Workload-Specific Tuning
There’s no magic number for work_mem. The correct value depends entirely on what your database does.
High-Concurrency OLTP: Stay Conservative
Transaction processing systems handle hundreds of small, fast queries per second. A checkout process, a user login, a status update—each is tiny but there are many happening simultaneously.
Strategy: Set a low global default (4–8MB) to prevent memory exhaustion. Most OLTP queries don’t need more—they touch small datasets and complete in milliseconds anyway.
When an occasional heavy operation needs more memory, grant it at the session level, not globally:
SET work_mem = '128MB';
-- run heavy operation
RESET work_mem;
Analytics and Reporting: Be Generous
Data warehouses and analytics platforms are the opposite. Few concurrent queries, but each one is complex—joining millions of rows, calculating aggregates across entire datasets, sorting massive result sets.
Strategy: Use moderate global settings (32–64MB) but override liberally for specific workloads. ETL scripts and reporting tools should explicitly set higher values:
-- In your ETL script
SET work_mem = '512MB';
-- Complex analytics queries here
Some organizations create separate user roles for analytics with elevated work_mem defaults.
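One way to set that up—the role name here is hypothetical—is a role-level default, which applies automatically to every session that role opens:

```sql
-- Dedicated analytics role with a more generous default
CREATE ROLE analytics_user LOGIN;
ALTER ROLE analytics_user SET work_mem = '256MB';
```

Sessions for analytics_user start at 256MB while everyone else keeps the conservative global default.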
Mixed Workloads: Layer Your Defenses
Real production systems rarely fit one category. You have transactional traffic during business hours and batch reporting overnight. API calls mix with dashboard queries.
Strategy: Use multiple configuration layers:
- Conservative global default (16–32MB)
- Connection pooler settings for different application types
- User-level or database-level settings for analytics roles
- Session-level overrides in known heavy operations
The goal is defense in depth—protect against runaway memory usage while enabling performance where it matters.
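With several layers in play, it helps to verify which one actually won for the current session. The pg_settings view reports both the effective value and its source:

```sql
-- 'source' shows where the value came from:
-- configuration file, database, user, session, etc.
SELECT name, setting, unit, source
FROM pg_settings
WHERE name = 'work_mem';
```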
Measuring What Matters
Tuning work_mem without measurement is guesswork. Here’s what to monitor:
Temporary file usage tells you when operations spill to disk. Set log_temp_files = 0 and review logs daily. Any temporary file usage is a candidate for investigation—it means a query couldn’t fit its work in memory.
Query plans reveal the details. Run EXPLAIN ANALYZE on your slowest queries:
EXPLAIN (ANALYZE, BUFFERS)
SELECT ...
Look for “external merge Disk” in sort nodes or “Batches” greater than 1 in hash nodes. These indicate memory pressure.
System memory should never be exhausted. Monitor available RAM and swap usage. If PostgreSQL is causing system-wide memory pressure, your work_mem is too aggressive given your concurrency.
Query performance trends matter more than point-in-time measurements. A query that suddenly slows down might indicate data growth that now exceeds work_mem, even though the parameter hasn’t changed.
Real-World War Stories
The invisible bottleneck: A company’s analytics queries were mysteriously slow. Infrastructure looked fine—plenty of CPU and disk headroom. The culprit: work_mem was set to 4MB on a server with 128GB of RAM. Every single analytical query was spilling to disk. The solution: increasing work_mem to 64MB specifically for analytics sessions (not globally) reduced average query time by 80%—queries that took 10 seconds now completed in 2 seconds. The lesson: being too conservative with work_mem can leave significant performance on the table, especially when you have abundant RAM available.
The gradual degradation: A different organization had a system that ran fine for months with work_mem set to 16MB, then performance slowly declined over a quarter. Users complained that reports were getting slower, but there was no obvious cause—no code changes, no infrastructure issues. The DBA finally checked the PostgreSQL logs and discovered hundreds of temporary file entries that hadn’t been there before. The data volume had grown from 50 million to 200 million rows. Queries that once fit comfortably in 16MB now needed 40-60MB and were consistently spilling to disk. The solution was straightforward: increase work_mem to 64MB for the reporting database role. Performance immediately returned to normal. The lesson: work_mem requirements aren’t static—they grow with your data. Regular monitoring of temporary file usage (log_temp_files = 0) would have caught this trend early, allowing proactive tuning instead of reactive firefighting.
The midnight disaster: At another company, a DBA noticed slow nightly reports on a system where work_mem was set to 4MB. Wanting to fix the problem quickly, they increased it to 256MB globally—reasoning that if more is better, much more must be much better. At 2 AM, batch processing jobs started—20 concurrent ETL processes, each running queries with multiple sorts and hash joins. Each query used 4-5 operations, and several ran with 4 parallel workers each. The math became brutal: 256MB × 5 operations × 4 workers × 20 concurrent jobs = potential 102GB memory usage on a 64GB server. Memory consumption exploded. The system exhausted RAM and started swapping heavily. Performance ground to a halt. Finally, the Linux OOM killer struck, terminating PostgreSQL to save the system. The database crashed, taking down the entire application. The lesson: the mistake wasn’t increasing work_mem—it was jumping too high and setting it globally instead of per-session for specific workloads.
The Counterintuitive Truth
Here’s what makes work_mem different from most configuration parameters: more isn’t better, less isn’t safer, and the right answer changes.
A database handling 1,000 concurrent connections needs very different settings than one serving 10 analytical users. The same database needs different settings during business hours versus batch processing windows. A query that fits in memory today might not fit tomorrow when the data grows.
This dynamic nature means work_mem demands ongoing attention. It’s not something you configure once during initial setup and forget. It’s a parameter you revisit when workloads change, when data volumes grow, when performance problems emerge.
The Bottom Line
PostgreSQL gives you three parameters that matter more than all the others combined: shared_buffers, effective_cache_size, and work_mem. Of these, work_mem is the most dangerous because it multiplies unpredictably across your workload.
Get it right, and your queries fly. Complex sorts complete in milliseconds instead of seconds. Hash joins process millions of rows without breaking a sweat. Your analytics dashboards are responsive, your ETL pipelines finish on schedule, and your users are happy.
Get it wrong in one direction, and every analytical query grinds to a halt, spilling gigabytes to disk and saturating your I/O. Get it wrong in the other direction, and your database crashes under memory pressure, taking your application down with it.
The key insight: work_mem isn’t a number. It’s a strategy that varies by workload, by user, by time of day, and by query. Treat it with the respect it deserves—measure constantly, tune carefully, and never assume that what worked yesterday will work tomorrow.
Your database’s performance depends on it.