Optimization for <10ms p99 API response times with Django
Table of Contents
- ASGI vs WSGI
- Production Server Configuration
- Async Views
- Async ORM Operations
- Middleware Optimization
- Django Ninja vs DRF
- Caching with Django-Redis
- Database Connection Optimization
- Response Streaming
- Profiling
- Quick Reference
1. ASGI vs WSGI
┌─────────────────────────────────────────────────────────────────────────────────┐
│ WSGI vs ASGI │
├─────────────────────────────────────────────────────────────────────────────────┤
│ │
│ WSGI (Synchronous): │
│ ├── Request → Process → Response → Next Request │
│ ├── One request at a time per worker │
│ ├── Blocking I/O │
│ └── Servers: Gunicorn, uWSGI │
│ │
│ ASGI (Asynchronous): │
│ ├── Request → Start Processing → Await I/O → Continue → Response │
│ ├── Many concurrent requests per worker │
│ ├── Non-blocking I/O │
│ └── Servers: Uvicorn, Daphne, Hypercorn │
│ │
│ PERFORMANCE: │
│ ├── ASGI: ~3,000 req/s (Django with Uvicorn) │
│ ├── WSGI: ~1,500 req/s (Django with Gunicorn) │
│ └── Note: ~15ms overhead observed in some ASGI setups │
│ │
└─────────────────────────────────────────────────────────────────────────────────┘
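The contrast between the two models can be illustrated with a standard-library sketch. It uses no Django: `fake_io`, `handle_sequentially`, `handle_concurrently`, and the 50 ms wait are hypothetical stand-ins, with `asyncio.sleep` playing the role of a database or network wait.

```python
import asyncio
import time

IO_WAIT = 0.05  # assumed I/O wait per request, in seconds


async def fake_io(request_id: int) -> int:
    """Stand-in for a non-blocking database or HTTP call."""
    await asyncio.sleep(IO_WAIT)
    return request_id


async def handle_sequentially(n: int) -> float:
    """One request at a time, as a single sync worker would process them."""
    start = time.perf_counter()
    for i in range(n):
        await fake_io(i)
    return time.perf_counter() - start


async def handle_concurrently(n: int) -> float:
    """All requests overlap their waits on one event loop."""
    start = time.perf_counter()
    await asyncio.gather(*(fake_io(i) for i in range(n)))
    return time.perf_counter() - start


if __name__ == "__main__":
    sequential = asyncio.run(handle_sequentially(10))
    concurrent = asyncio.run(handle_concurrently(10))
    print(f"sequential: {sequential:.3f}s, concurrent: {concurrent:.3f}s")
```

The sequential total grows with the number of requests, while the concurrent total stays close to a single wait. Real gains depend on how much of each request is spent waiting on I/O.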
2. Production Server Configuration
Uvicorn (ASGI) - Recommended
# settings.py - ASGI configuration
ASGI_APPLICATION = 'myproject.asgi.application'
# asgi.py
import os
from django.core.asgi import get_asgi_application
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'myproject.settings')
application = get_asgi_application()
Gunicorn with Uvicorn Workers (Production)
# Production command
gunicorn myproject.asgi:application \
    --worker-class uvicorn.workers.UvicornWorker \
    --workers 4 \
    --bind 0.0.0.0:8000 \
    --timeout 30 \
    --keep-alive 5 \
    --max-requests 1000 \
    --max-requests-jitter 50
Worker Count Formula
Workers = (2 × CPU cores) + 1
Example: 4-core server
Workers = (2 × 4) + 1 = 9
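The formula can be computed at deploy time rather than hard-coded. The helper below is a minimal sketch; `recommended_workers` is a hypothetical name, and the result is only a starting point that should be tuned against measurements (async workers may need fewer processes than this).

```python
import os
from typing import Optional


def recommended_workers(cpu_cores: Optional[int] = None) -> int:
    """Return (2 x cores) + 1, the formula quoted above."""
    cores = cpu_cores if cpu_cores is not None else (os.cpu_count() or 1)
    if cores < 1:
        raise ValueError("cpu_cores must be at least 1")
    return 2 * cores + 1


if __name__ == "__main__":
    # Uses the core count of the current machine.
    print(recommended_workers())
```

For the 4-core example, `recommended_workers(4)` returns 9, which would then be passed to `--workers`.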
3. Async Views
Async Views (Django 3.1+; async ORM methods require 4.1+)
import aiohttp
from django.http import JsonResponse

# ❌ SYNC VIEW: Blocks worker during I/O
def sync_view(request):
    data = fetch_from_external_api()  # Blocks!
    items = list(Item.objects.values())  # Blocks!
    return JsonResponse({'data': data, 'items': items})

# ✅ ASYNC VIEW: Non-blocking I/O
async def async_view(request):
    async with aiohttp.ClientSession() as session:
        # Read the response body while the session is still open
        async with session.get('https://api.example.com/data') as api_response:
            data = await api_response.json()
    # Async ORM iteration (Django 4.1+); values() yields JSON-serializable dicts
    items = [item async for item in Item.objects.values().aiterator()]
    return JsonResponse({'data': data, 'items': items})
When to Use Async
USE ASYNC FOR:
├── External API calls
├── Multiple database queries (run in parallel)
├── File I/O operations
├── WebSocket connections
└── Long-polling endpoints
USE SYNC FOR:
├── Simple CRUD operations
├── CPU-bound processing
├── Third-party libraries without async support
└── Legacy code compatibility
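When a library without async support has to be called from async code, the usual approach is to push the blocking call onto a worker thread. The sketch below shows the idea with the standard library's `asyncio.to_thread`; in a Django project, `asgiref.sync.sync_to_async` fills the same role. `legacy_blocking_call` and `handler` are hypothetical names.

```python
import asyncio
import time


def legacy_blocking_call(x: int) -> int:
    """Hypothetical stand-in for a third-party library without async support."""
    time.sleep(0.05)  # blocks the calling thread
    return x * 2


async def handler(x: int) -> int:
    # Run the blocking call in a worker thread so the event loop stays free.
    return await asyncio.to_thread(legacy_blocking_call, x)


async def main() -> list:
    # Three handlers overlap because each blocking call has its own thread.
    return await asyncio.gather(*(handler(i) for i in range(3)))


if __name__ == "__main__":
    print(asyncio.run(main()))
```

This keeps the event loop responsive, but each call still occupies a thread, so CPU-bound work gains nothing from it.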
4. Async ORM Operations
Django 4.1+ Async ORM Methods
from asgiref.sync import sync_to_async
from django.http import JsonResponse

async def get_items(request):
    # Async get
    item = await Item.objects.aget(pk=1)
    # Evaluate a queryset in a worker thread via sync_to_async
    first_ten = await sync_to_async(list)(
        Item.objects.filter(active=True).values('id', 'name')[:10]
    )
    # Equivalent async iteration; values() keeps the rows JSON-serializable
    items = []
    async for row in Item.objects.filter(active=True).values('id', 'name')[:10]:
        items.append(row)
    # Async count
    count = await Item.objects.acount()
    # Async exists
    exists = await Item.objects.filter(pk=1).aexists()
    return JsonResponse({'items': items, 'count': count, 'exists': exists})
Async Aggregation
from django.db.models import Avg, Sum
from django.http import JsonResponse

async def get_stats(request):
    stats = await Item.objects.aaggregate(
        avg_price=Avg('price'),
        total=Sum('quantity'),
    )
    return JsonResponse(stats)
Parallel Database Queries
import asyncio
from django.db.models import Sum
from django.http import JsonResponse

async def dashboard_view(request):
    # gather() schedules the three queries concurrently on the event loop.
    # Caveat: Django's async ORM methods currently wrap the sync ORM via
    # sync_to_async, so the queries may still execute one after another on
    # a single database connection. Measure before relying on a speedup.
    user_count, order_count, revenue = await asyncio.gather(
        User.objects.filter(active=True).acount(),
        Order.objects.filter(status='pending').acount(),
        Order.objects.aaggregate(total=Sum('amount')),
    )
    return JsonResponse({
        'users': user_count,
        'orders': order_count,
        'revenue': revenue['total'],
    })
5. Middleware Optimization
Minimal Middleware Stack for APIs
# settings.py
MIDDLEWARE = [
    # Essential only
    'django.middleware.security.SecurityMiddleware',
    'django.contrib.sessions.middleware.SessionMiddleware',  # Only if using sessions
    'django.middleware.common.CommonMiddleware',
    # Skip CSRF only for token-authenticated APIs; keep it if clients
    # authenticate with session cookies.
    # 'django.middleware.csrf.CsrfViewMiddleware',
    # 'django.contrib.auth.middleware.AuthenticationMiddleware',  # Only if needed
    # 'django.contrib.messages.middleware.MessageMiddleware',  # Not needed for APIs
]
Custom Timing Middleware
import time

class TimingMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        start = time.perf_counter()
        response = self.get_response(request)
        duration = time.perf_counter() - start
        response['X-Request-Duration'] = f'{duration:.3f}'  # seconds
        return response
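The class above is sync-only. Under ASGI, Django has to adapt sync-only middleware around async views, which adds overhead. A middleware that supports both modes avoids that. The sketch below shows the dispatch pattern with the standard library only: a plain `dict` stands in for the response object, and `timing_middleware` is a hypothetical name. In a real project the factory would typically be decorated with `django.utils.decorators.sync_and_async_middleware`.

```python
import inspect
import time


def timing_middleware(get_response):
    """Return a sync or async wrapper, matching the type of get_response."""
    if inspect.iscoroutinefunction(get_response):
        async def middleware(request):
            start = time.perf_counter()
            response = await get_response(request)
            # Duration in seconds, as in the class-based version.
            response["X-Request-Duration"] = f"{time.perf_counter() - start:.3f}"
            return response
    else:
        def middleware(request):
            start = time.perf_counter()
            response = get_response(request)
            response["X-Request-Duration"] = f"{time.perf_counter() - start:.3f}"
            return response
    return middleware
```

The wrapper is chosen once, when the middleware chain is built, so no type check runs on each request.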
6. Django Ninja vs DRF
Django REST Framework (DRF)
from rest_framework.decorators import api_view
from rest_framework.response import Response

@api_view(['GET'])
def drf_items(request):
    items = Item.objects.all()[:100]
    serializer = ItemSerializer(items, many=True)
    return Response(serializer.data)
Django Ninja (Faster)
from typing import List
from ninja import NinjaAPI, Schema

api = NinjaAPI()

class ItemSchema(Schema):
    id: int
    name: str
    price: float

@api.get('/items', response=List[ItemSchema])
def ninja_items(request):
    return Item.objects.all()[:100]
Performance Comparison
Django Ninja: ~4,000 req/s
DRF: ~1,500 req/s
Raw Django: ~6,000 req/s
7. Caching with Django-Redis
Configuration
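# settings.py
CACHES = {
    'default': {
        'BACKEND': 'django_redis.cache.RedisCache',
        'LOCATION': 'redis://127.0.0.1:6379/1',
        'OPTIONS': {
            'CLIENT_CLASS': 'django_redis.client.DefaultClient',
            'CONNECTION_POOL_KWARGS': {'max_connections': 50},
            # Faster parsing; requires the hiredis package. Newer redis-py
            # releases may select hiredis automatically and no longer expose
            # this class path, in which case drop this option.
            'PARSER_CLASS': 'redis.connection.HiredisParser',
        }
    }
}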
View-Level Caching
from django.http import JsonResponse
from django.views.decorators.cache import cache_page

@cache_page(60 * 5)  # 5 minutes
def cached_view(request):
    return JsonResponse(expensive_computation())
Manual Caching
from django.core.cache import cache
from django.http import JsonResponse

def get_items(request):
    cache_key = 'items:all'
    items = cache.get(cache_key)
    if items is None:
        items = list(Item.objects.values()[:100])
        cache.set(cache_key, items, timeout=300)
    return JsonResponse({'items': items})

# Pattern-based invalidation (requires django-redis)
def invalidate_items_cache():
    cache.delete_pattern('items:*')
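The manual caching view follows the cache-aside pattern: read from the cache, fall back to the source on a miss, then store the result. The sketch below isolates that flow with a tiny in-memory stand-in so it runs without Redis. `TTLCache` and `get_or_load` are hypothetical names, not django-redis APIs; Django's own `cache.get_or_set` offers a similar shortcut.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Tiny in-memory stand-in for a Redis-backed cache (illustration only)."""

    def __init__(self) -> None:
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Any:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key: str, value: Any, timeout: float) -> None:
        self._store[key] = (time.monotonic() + timeout, value)


def get_or_load(cache: TTLCache, key: str,
                loader: Callable[[], Any], timeout: float) -> Any:
    """Cache-aside: return the cached value, or load, store, and return it."""
    value = cache.get(key)
    if value is None:
        value = loader()
        cache.set(key, value, timeout)
    return value
```

As in the view above, a stored `None` is indistinguishable from a miss, so loaders that can legitimately return `None` need a sentinel value.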
8. Database Connection Optimization
Basic Configuration
# settings.py
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql',
        'NAME': 'mydb',
        # Persistent connections (seconds). Django's documentation advises
        # disabling this (CONN_MAX_AGE = 0) under ASGI and using a pool instead.
        'CONN_MAX_AGE': 60,
        'CONN_HEALTH_CHECKS': True,  # Django 4.1+
        'OPTIONS': {
            'connect_timeout': 5,
            'options': '-c statement_timeout=30000',  # 30s query timeout
        }
    }
}
Connection Pooling
# pip install django-db-connection-pool
DATABASES = {
    'default': {
        'ENGINE': 'dj_db_conn_pool.backends.postgresql',
        'POOL_OPTIONS': {
            'POOL_SIZE': 10,
            'MAX_OVERFLOW': 10,
            'RECYCLE': 300,
        }
    }
}
9. Response Streaming
import json
from django.http import StreamingHttpResponse

def stream_large_response(request):
    def generate():
        yield '['
        first = True
        for item in Item.objects.iterator(chunk_size=100):
            if not first:
                yield ','
            yield json.dumps({'id': item.id, 'name': item.name})
            first = False
        yield ']'

    return StreamingHttpResponse(
        generate(),
        content_type='application/json',
    )
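The generator logic can be checked on its own, without a database or an HTTP response. The sketch below factors it into a standalone function over any iterable of dicts; `stream_json_array` is a hypothetical name.

```python
import json
from typing import Iterable, Iterator


def stream_json_array(rows: Iterable[dict]) -> Iterator[str]:
    """Yield a JSON array piece by piece without building it in memory."""
    yield "["
    for index, row in enumerate(rows):
        if index:
            yield ","  # separator before every element except the first
        yield json.dumps(row)
    yield "]"


if __name__ == "__main__":
    sample = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
    print("".join(stream_json_array(sample)))
```

Joining the chunks must always give valid JSON, including for an empty input, which makes the function easy to unit test before wiring it into `StreamingHttpResponse`.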
10. Profiling
Django-Silk
# settings.py
INSTALLED_APPS = [
    # ...
    'silk',
]
MIDDLEWARE = [
    'silk.middleware.SilkyMiddleware',  # Add early in middleware
    # ...
]
# Limit profiling to API paths
SILKY_INTERCEPT_FUNC = lambda request: request.path.startswith('/api/')
# Enable Python profiling
SILKY_PYTHON_PROFILER = True
SILKY_PYTHON_PROFILER_BINARY = True
# Access: /silk/ for profiling dashboard
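Outside a request cycle, a similar function-level view can be obtained with the standard library's `cProfile`. This is a minimal sketch that does not involve Silk; `slow_serializer` and `profile_call` are hypothetical names.

```python
import cProfile
import io
import pstats


def slow_serializer(n: int) -> list:
    """Hypothetical hot spot: builds strings one at a time."""
    return [str(i) for i in range(n)]


def profile_call(func, *args) -> str:
    """Profile one call and return the ten most expensive entries as text."""
    profiler = cProfile.Profile()
    profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(10)
    return buffer.getvalue()


if __name__ == "__main__":
    print(profile_call(slow_serializer, 100_000))
```

Sorting by cumulative time puts the functions that dominate a call tree at the top, which is usually the first thing to look at when a view exceeds its budget.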
Django Debug Toolbar (Development)
# settings.py
INSTALLED_APPS = [
    # ...
    'debug_toolbar',
]
MIDDLEWARE = [
    'debug_toolbar.middleware.DebugToolbarMiddleware',
    # ...
]
INTERNAL_IPS = ['127.0.0.1']
11. Quick Reference
Optimization Checklist
├── [ ] Use ASGI (Uvicorn) instead of WSGI
├── [ ] Enable async views for I/O operations
├── [ ] Minimize middleware stack
├── [ ] Use Django Ninja over DRF for speed
├── [ ] Configure CONN_MAX_AGE for persistent connections, or use a connection pool
├── [ ] Enable Redis caching (django-redis)
├── [ ] Profile with django-silk
├── [ ] Use iterator() for large querysets
├── [ ] Implement response streaming for large data
└── [ ] Set query timeouts
10ms Latency Budget
DJANGO (Budget: 10ms total):
├── Middleware: 1ms
├── Auth check: 1ms
├── ORM query: 4ms (use cache if possible)
├── Serialization: 2ms
└── Response: 2ms
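To check the budget against real traffic, request durations can be collected and reduced to a p99. The sketch below assumes samples in milliseconds, for example from an `X-Request-Duration` header converted from seconds, and uses a nearest-rank percentile. `BUDGET_MS`, `percentile`, and `within_budget` are hypothetical names.

```python
import math
from typing import List

# Budget from the breakdown above, in milliseconds.
BUDGET_MS = {
    "middleware": 1,
    "auth": 1,
    "orm": 4,
    "serialization": 2,
    "response": 2,
}


def percentile(samples_ms: List[float], percent: int) -> float:
    """Nearest-rank percentile of the samples."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_ms)
    rank = math.ceil(percent * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def within_budget(samples_ms: List[float], budget_ms: float = 10.0) -> bool:
    """True when the p99 of the samples is under the budget."""
    return percentile(samples_ms, 99) < budget_ms


if __name__ == "__main__":
    samples = [4.0] * 98 + [12.0, 30.0]
    print(sum(BUDGET_MS.values()), percentile(samples, 99), within_budget(samples))
```

With 100 samples, a single slow request does not move the p99, but two do, which is why tail latency needs a reasonably large sample before it means anything.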
Performance Benchmarks
Django + Gunicorn (WSGI): ~1,500 req/s
Django + Uvicorn (ASGI): ~3,000 req/s
Django Ninja: ~4,000 req/s
Raw Django: ~6,000 req/s