How I Reduced AWS ElastiCache Connections by 97.9%
Bao Dang
Jun 1, 2024
When our enterprise e-commerce platform started throwing Redis connection errors under peak load, I was handed the on-call alert at 2am. ElastiCache was reporting 65,000 active connections. The limit was 20,000.
Here’s exactly what caused it, and how I fixed it.
The Root Cause: A New Connection on Every Invocation
AWS Lambda functions are stateless by design — each invocation can spin up a new execution context. Our code looked like this:
```javascript
// ❌ BAD: Client created inside the handler
const Redis = require("ioredis"); // client library assumed from the `new Redis()` constructor

exports.handler = async (event) => {
  const redis = new Redis(process.env.REDIS_URL); // opens a new TCP connection on every invocation
  const result = await redis.get(event.key);
  await redis.quit();
  return result;
};
```
Because the client is constructed inside the handler, every invocation — warm or cold — opens its own TCP connection to ElastiCache and tears it down afterward. At 60 concurrent executions, that meant 60 simultaneous connections plus constant churn; under traffic bursts, new connections were opened faster than closed ones were released, and the count climbed past the 20,000 limit.
The Fix: Singleton Pattern + Concurrency Limits
Part 1: Move the Client Outside the Handler
```javascript
// ✅ GOOD: Client created once, reused across warm invocations
const Redis = require("ioredis"); // client library assumed from the `new Redis()` constructor

// Created at module scope: runs once per execution context, not per invocation
const redis = new Redis(process.env.REDIS_URL);

exports.handler = async (event) => {
  const result = await redis.get(event.key);
  return result;
};
```
Lambda execution contexts are reused for warm invocations. By initializing the Redis client at module scope, the same connection is reused across all subsequent invocations on a warm context.
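The same idea can be expressed as a generic memoizing helper — a minimal sketch, where `createClient` stands in for `() => new Redis(process.env.REDIS_URL)`; the helper name is hypothetical, not part of our codebase:

```javascript
// Hypothetical sketch: memoize the client so the factory runs at most
// once per execution context, no matter how many invocations follow.
function makeSingleton(createClient) {
  let client = null;
  return function getClient() {
    if (client === null) {
      client = createClient(); // only the first call per context pays this cost
    }
    return client;
  };
}
```

Calling `getClient()` inside the handler then always returns the same instance, which is exactly what module-scope initialization gives you for free.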
Part 2: Enforce Reserved Concurrency
Even with the Singleton pattern, uncapped concurrency will eventually create too many connections. We set:
```
Reserved Concurrency = ElastiCache max_connections / connections_per_function
```
This creates a hard ceiling that prevents connection exhaustion regardless of traffic spikes.
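The sizing formula can be sketched as a small helper — hypothetical code, with a `headroom` parameter I'm adding as an assumption (it leaves a fraction of the connection limit free for other clients; the original formula has no such margin):

```javascript
// Hypothetical helper illustrating the sizing formula above.
// maxConnections:   the ElastiCache connection limit (20,000 in our case)
// connsPerFunction: connections held by each warm context (1 with the singleton)
// headroom:         assumed safety margin, fraction of the limit kept free
function reservedConcurrency(maxConnections, connsPerFunction, headroom = 0.2) {
  return Math.floor((maxConnections * (1 - headroom)) / connsPerFunction);
}
```

With a 20,000-connection limit, one connection per warm context, and 20% headroom, the function would be capped at 16,000 concurrent execution contexts.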
Results
| Metric | Before | After |
|---|---|---|
| Peak ElastiCache connections | 65,000 | 1,350 |
| Email delivery rate | 68% | 99.9% |
| EMFILE errors | Frequent | 0 |
The fix took two hours to implement and deploy. The lesson: serverless doesn't mean connection-free.