Most Django applications slow down at the database layer, not the view layer. A Gunicorn worker that takes 800ms to respond is almost never slow because Python is slow — it is slow because the database query it runs is doing a sequential scan on a million-row table. Here is how to find and fix that.
Step 1: Identify Slow Queries
Before adding any indexes, measure which queries are actually slow.
Django Debug Toolbar (development)
Install django-debug-toolbar and watch the SQL panel. Queries taking more than 50ms in development will be proportionally worse in production.
PostgreSQL pg_stat_statements (production)
This extension tracks all executed queries and their aggregate statistics:
-- Enable once per database
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
-- Find the slowest queries
SELECT query, calls, mean_exec_time, total_exec_time
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 20;
This shows you the actual slow queries in production — not the ones you guess are slow.
Step 2: Read EXPLAIN ANALYZE Output
Once you have a slow query, run EXPLAIN ANALYZE to see the execution plan:
EXPLAIN ANALYZE
SELECT * FROM orders WHERE user_id = 42 AND status = 'pending';
Output to understand:
Seq Scan on orders (cost=0.00..8432.00 rows=1 width=80) (actual time=45.123..45.123 rows=1)
Filter: ((user_id = 42) AND (status = 'pending'))
Rows Removed by Filter: 250000
Planning Time: 0.5 ms
Execution Time: 45.6 ms
The key signals:
- Seq Scan — reading every row. Bad on large tables. Add an index.
- Index Scan — using an index. Good.
- Rows Removed by Filter — how many rows were discarded. High number with Seq Scan = needs index.
- cost (estimated) vs actual time — large discrepancy means stale table statistics. Run ANALYZE orders;.
Step 3: Add the Right Indexes
Single-column index
class Order(models.Model):
user = models.ForeignKey(User, on_delete=models.CASCADE, db_index=True)
status = models.CharField(max_length=20)
created_at = models.DateTimeField(auto_now_add=True, db_index=True)
db_index=True creates a B-tree index on that column. Django does this automatically for ForeignKey fields.
A B-tree index stores keys in a sorted balanced tree. Each internal node contains key ranges and pointers to child nodes; each leaf node contains a (key, heap pointer) pair pointing to the actual row's location in the table's physical heap. PostgreSQL stores nodes in 8KB pages. On a table with one million rows, the tree is roughly 4 levels deep, so finding a single row requires 4 page reads (O(log n)). Without the index, PostgreSQL's sequential scan reads every page in the table — for a million rows at ~100 bytes per row, that's roughly 12,500 8KB pages to read (O(n)). At 10 million rows, that's 125,000 pages, and the difference between the indexed and unindexed case is the difference between 4ms and 4 seconds.
Composite index (for multi-column filters)
When you always query by both user_id AND status, a composite index is more efficient than two separate indexes:
class Order(models.Model):
user = models.ForeignKey(User, on_delete=models.CASCADE)
status = models.CharField(max_length=20)
class Meta:
indexes = [
models.Index(fields=["user", "status"], name="order_user_status_idx"),
]
Column order matters in composite indexes: put the most selective column first (the one that filters out the most rows). If user_id filters better than status, put user first.
Partial index (index only a subset of rows)
If you frequently query for a specific status (like pending orders), a partial index is smaller and faster than a full index:
from django.db.models import Q
class Meta:
indexes = [
models.Index(
fields=["user", "created_at"],
name="pending_orders_idx",
condition=Q(status="pending"),
)
]
This index only includes rows where status = 'pending'. If 90% of orders are completed, you have a tiny index that covers the hot path.
GIN index for full-text search
For text search on large string fields:
from django.contrib.postgres.indexes import GinIndex
from django.contrib.postgres.search import SearchVectorField
class Article(models.Model):
title = models.CharField(max_length=255)
search_vector = SearchVectorField(null=True)
class Meta:
indexes = [
GinIndex(fields=["search_vector"]),
]
Step 4: Connection Pooling with pgBouncer
Every Django request opens and closes a database connection. At low traffic this is fine. At high traffic, the connection overhead becomes significant — PostgreSQL has a hard limit on concurrent connections (typically 100–200 on a VPS), and opening a new connection for each request can exhaust the limit.
What pgBouncer does
pgBouncer is a lightweight proxy that sits between Django and PostgreSQL. It maintains a pool of persistent connections and reuses them across requests. From Django's perspective, nothing changes.
Install pgBouncer (Ubuntu/Debian)
sudo apt install pgbouncer
Configure /etc/pgbouncer/pgbouncer.ini:
[databases]
mydb = host=127.0.0.1 port=5432 dbname=mydb
[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 25
Point Django to pgBouncer instead of PostgreSQL:
DATABASES = {
"default": {
"ENGINE": "django.db.backends.postgresql",
"NAME": "mydb",
"USER": "myuser",
"PASSWORD": "mypassword",
"HOST": "127.0.0.1",
"PORT": "6432", # pgBouncer port, not 5432
"CONN_MAX_AGE": 0, # Let pgBouncer manage pooling
}
}
With pool_mode = transaction, each transaction gets a connection from the pool and returns it when the transaction ends. This is the most efficient mode for Django.
Django's built-in persistent connections
If pgBouncer is overkill for your scale, Django's CONN_MAX_AGE keeps connections alive across requests:
DATABASES = {
"default": {
# ...
"CONN_MAX_AGE": 60, # Reuse connections for 60 seconds
}
}
Set this to None for permanent connections (be careful — one per worker, times number of workers, must not exceed PostgreSQL's max_connections).
Step 5: Keep Statistics Fresh
PostgreSQL's query planner uses table statistics to choose execution plans. Outdated statistics lead to bad plans. Run ANALYZE regularly:
-- Manual
ANALYZE orders;
-- Autovacuum handles this automatically in most setups
-- Check if autovacuum is running
SELECT relname, last_autovacuum, last_autoanalyze
FROM pg_stat_user_tables
ORDER BY last_autoanalyze ASC;
If last_autoanalyze is null or very old for large tables, autovacuum may need tuning.
Quick Reference
| Problem | Fix |
|---|---|
| Seq Scan on large table | Add index on filtered columns |
| Multi-column filter slow | Add composite index (most selective column first) |
| Index too large / rarely needed rows | Partial index with condition=Q(...) |
| Too many connections / connection refused | Add pgBouncer |
| Query plan seems wrong | Run ANALYZE tablename; to refresh statistics |
| Slow text search | Use SearchVectorField + GIN index |
If you need a Django + PostgreSQL application built for real-world scale — with proper indexing, connection pooling, and query optimization from the start — hire me as a Django developer or as a backend developer.