Framework Deep Dives · Framework Series #6

Caching Strategies for Serverless, Part 2: ElastiCache Redis for Low-Latency Session and API Caching

When DynamoDB TTL caches are not fast enough, Redis steps in. How we integrated ElastiCache Redis for sub-millisecond session lookups, API response caching, and rate limit counters — all from Lambda inside a VPC, with pluggable storage backends and circuit breaker protection.

June 25, 2026 · 13 min read

TCTF Editorials

TCTF Newsletter

Redis latency: < 1 ms
DynamoDB latency: 1–5 ms
Storage backends: 3
Cluster mode: supported
Health checks: built-in
Circuit breaker: integrated

Part 1 of this series covered DynamoDB-backed TTL caches. They are reliable, durable, and sufficient for the majority of caching needs. However, certain operations demand faster response times. Session validation runs on every authenticated request. Rate limit checks run on every API call. Geolocation lookups run on every login. DynamoDB serves these in single-digit milliseconds. Redis serves them in under one millisecond. At scale, that gap compounds across millions of requests into measurable overhead. This article explains how we integrated Amazon ElastiCache Redis into the TCTF caching layer, how it coexists with DynamoDB caching, and why the pluggable storage interface reduces the backend choice to a single environment variable.

01 Why Redis When You Already Have DynamoDB

DynamoDB is an excellent cache backend. It is durable, scales automatically, and requires no operational overhead. For most caching at TCTF — geolocation lookups, circuit breaker state, configuration data — DynamoDB is the correct choice.

However, DynamoDB has a latency floor. Even with on-demand capacity and optimized clients, a read takes 1 to 5 milliseconds. For operations that execute on every inbound request — session validation, rate limit checks, API response caching — those milliseconds accumulate.

Redis via Amazon ElastiCache operates below that floor. A Redis GET completes in 0.1 to 0.5 milliseconds. For session validation on every authenticated request, moving from DynamoDB to Redis saves 1 to 4 milliseconds per call. At 1,000 requests per second, that translates to 1 to 4 seconds of aggregate latency eliminated each second.

This speed comes with a cost: Redis is volatile. If the node restarts, the cache is empty. During a cluster failover, there is a brief window of cache misses. Redis is not a source of truth. It is a speed layer positioned in front of one. At TCTF, DynamoDB remains the durable store. Redis absorbs the high-frequency read load.

⚡
DynamoDB: 1-5ms reads, durable, zero ops. Redis: < 1ms reads, volatile, requires VPC. Use DynamoDB for durability. Use Redis for speed. The pluggable interface lets you choose per service.

02 Lambda in a VPC: The Redis Requirement

ElastiCache Redis runs inside a VPC. Lambda functions that access Redis must also be attached to that VPC. Attaching a function to the VPC is therefore an architectural decision with trade-offs in both directions.

On the positive side, Lambda functions inside the VPC reach ElastiCache, RDS, and other VPC-bound resources directly. All traffic remains within AWS — no internet traversal, no NAT gateway required for cache calls.

On the negative side, VPC-attached Lambda functions historically had slower cold starts because they allocated an Elastic Network Interface. AWS has significantly reduced this penalty with Hyperplane ENI improvements, though cold starts remain slightly longer than non-VPC functions.

There is a second cost: VPC Lambda functions cannot reach the public internet by default. Calls to external services such as ipinfo.io or Stripe require a NAT Gateway or VPC endpoints. This adds monthly cost and configuration complexity.

At TCTF, we split along that boundary. Services that need Redis run inside the VPC. Services that do not need Redis run outside it. The pluggable cache interface means the same CacheService code operates identically in both environments. VPC services use the Redis backend. Non-VPC services use DynamoDB. The only difference is an environment variable.
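To make the split concrete, here is a minimal CDK sketch, assuming one VPC-attached service and one non-VPC service. The stack name, construct IDs, asset paths, table name, and the endpoint placeholder are hypothetical; only the three cache environment variables come from this article.

```typescript
import { Stack, StackProps } from 'aws-cdk-lib';
import * as ec2 from 'aws-cdk-lib/aws-ec2';
import * as lambda from 'aws-cdk-lib/aws-lambda';
import { Construct } from 'constructs';

export class CacheSplitStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    // Note: the default Vpc construct also provisions NAT gateways,
    // which is where the monthly cost mentioned above comes from.
    const vpc = new ec2.Vpc(this, 'ServiceVpc', { maxAzs: 2 });

    // VPC-attached service: can reach ElastiCache, uses the Redis backend.
    new lambda.Function(this, 'AuthFn', {
      runtime: lambda.Runtime.NODEJS_20_X,
      handler: 'index.handler',
      code: lambda.Code.fromAsset('dist/auth'), // hypothetical path
      vpc,
      environment: {
        CACHE_STORAGE_TYPE: 'redis',
        REDIS_URL: 'redis://<your-elasticache-endpoint>:6379', // placeholder
        CACHE_KEY_PREFIX: 'auth',
      },
    });

    // Non-VPC service: keeps public internet access, uses the DynamoDB backend.
    new lambda.Function(this, 'GeoFn', {
      runtime: lambda.Runtime.NODEJS_20_X,
      handler: 'index.handler',
      code: lambda.Code.fromAsset('dist/geo'), // hypothetical path
      environment: {
        CACHE_STORAGE_TYPE: 'dynamodb',
        CACHE_TABLE_NAME: 'cache-table', // hypothetical table name
        CACHE_KEY_PREFIX: 'geo',
      },
    });
  }
}
```

A real stack would also need a security group rule that allows the function to reach the Redis port, which this sketch leaves out.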

03 The Pluggable Cache Architecture

The CacheService is the single entry point for all caching operations. It exposes a consistent API — get, set, delete, getMany, setMany, has, expire, getTtl, getStats, and clear — and every method behaves identically regardless of the active storage backend.

Three backends implement the ICacheStorage interface:

RedisCacheStorage connects to ElastiCache Redis. It supports standalone and cluster mode. Keys are prefixed to prevent collisions between services sharing a cluster. TTL is set per key. Batch operations use Redis pipelines for efficiency, and the clear operation uses SCAN-based iteration rather than KEYS to avoid blocking the server.

DynamoDBCacheStorage uses a DynamoDB table with TTL-based expiration. Items are stored with a partition key, a JSON-serialized value, and a Unix timestamp TTL attribute. DynamoDB handles automatic removal of expired items. This backend needs no VPC and no additional infrastructure.

MemoryCacheStorage uses an in-memory Map with TTL tracking. It is fast but resets on every Lambda cold start. It is suitable for testing and for caching data that is acceptable to recompute (parsed configuration, schema validation results).
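For the DynamoDB backend, the item layout described above could look like the following sketch. The attribute names (pk, value, ttl) and both helper functions are assumptions for illustration, not the actual table schema. One detail worth showing: DynamoDB removes expired items in the background rather than at the exact expiry time, so the read path should treat an expired item as a miss.

```typescript
// Hypothetical item shape; attribute names are assumptions.
interface CacheItem {
  pk: string;    // partition key: prefixed cache key
  value: string; // JSON-serialized payload
  ttl: number;   // expiry as Unix epoch seconds, the unit DynamoDB TTL expects
}

// Builds the item to write. nowMs is injectable so the result is deterministic in tests.
function toCacheItem<T>(
  prefix: string,
  key: string,
  value: T,
  ttlSeconds: number,
  nowMs: number = Date.now(),
): CacheItem {
  return {
    pk: `${prefix}:${key}`,
    value: JSON.stringify(value),
    ttl: Math.floor(nowMs / 1000) + ttlSeconds,
  };
}

// Converts a fetched item back to a value, treating expired-but-not-yet-deleted
// items as cache misses.
function fromCacheItem<T>(item: CacheItem | undefined, nowMs: number = Date.now()): T | null {
  if (!item) return null;
  if (item.ttl <= Math.floor(nowMs / 1000)) return null;
  return JSON.parse(item.value) as T;
}
```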

The backend is selected by the CACHE_STORAGE_TYPE environment variable. The CacheStorageFactory reads this value and instantiates the appropriate class:

// Environment variables (set per service in the CDK stack)
// CACHE_STORAGE_TYPE = 'redis' | 'dynamodb' | 'memory'
// REDIS_URL = 'redis://elasticache-cluster.abc123.use1.cache.amazonaws.com:6379'
// CACHE_KEY_PREFIX = 'auth'

import { CacheService } from '@tctf/cache';
import { CacheStorageFactory } from '@tctf/cache/storage';

const storageType = process.env.CACHE_STORAGE_TYPE || 'dynamodb';
const storage = CacheStorageFactory.create(storageType, {
  redisUrl: process.env.REDIS_URL,
  keyPrefix: process.env.CACHE_KEY_PREFIX || 'default',
  tableName: process.env.CACHE_TABLE_NAME,
});

export const cacheService = new CacheService(storage, {
  defaultTtlSeconds: 300,
  enableMonitoring: true,
});

Switching from DynamoDB to Redis is a configuration change: update CACHE_STORAGE_TYPE and REDIS_URL on a function that is already attached to the VPC. The application code and business logic stay the same.
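To illustrate why the switch needs no code changes, here is a minimal sketch of a storage contract, an in-memory backend, and a factory lookup. It is not the real @tctf/cache code: the interface is reduced to three methods, and createStorage and its builder map are hypothetical stand-ins for CacheStorageFactory.

```typescript
// Reduced storage contract; the real ICacheStorage exposes more methods.
interface ICacheStorage {
  get<T>(key: string): Promise<T | null>;
  set<T>(key: string, value: T, ttlSeconds: number): Promise<void>;
  delete(key: string): Promise<void>;
}

// In-memory backend: a Map plus an absolute expiry timestamp per entry.
class MemoryCacheStorage implements ICacheStorage {
  private readonly entries = new Map<string, { json: string; expiresAtMs: number }>();
  private readonly now: () => number;

  // The clock is injectable so expiry can be tested without waiting.
  constructor(now: () => number = Date.now) {
    this.now = now;
  }

  async get<T>(key: string): Promise<T | null> {
    const entry = this.entries.get(key);
    if (!entry) return null;
    if (entry.expiresAtMs <= this.now()) {
      this.entries.delete(key); // lazily evict expired entries on read
      return null;
    }
    return JSON.parse(entry.json) as T;
  }

  async set<T>(key: string, value: T, ttlSeconds: number): Promise<void> {
    this.entries.set(key, {
      json: JSON.stringify(value),
      expiresAtMs: this.now() + ttlSeconds * 1000,
    });
  }

  async delete(key: string): Promise<void> {
    this.entries.delete(key);
  }
}

// Maps the CACHE_STORAGE_TYPE value to a backend builder and fails fast on unknown values.
function createStorage(
  type: string,
  builders: Map<string, () => ICacheStorage>,
): ICacheStorage {
  const build = builders.get(type);
  if (!build) {
    throw new Error(`Unsupported CACHE_STORAGE_TYPE: ${type}`);
  }
  return build();
}
```

Because callers only ever see ICacheStorage, the environment variable decides which builder runs and nothing else in the service has to know.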

🔌
Set CACHE_STORAGE_TYPE=redis for sub-millisecond caching, dynamodb for durable caching, memory for testing. Same code, same API, different performance characteristics.

04 Redis Cache Storage: Under the Hood

The RedisCacheStorage class encapsulates the complexity of communicating with ElastiCache Redis from Lambda.

Connection management uses lazy initialization. The Redis client is created on the first cache access and reused across subsequent invocations during Lambda warm starts. Configuration supports both URL-based connection strings and explicit host/port/password parameters. Credentials are resolved from environment variables or AWS Secrets Manager.

Key prefixing prevents namespace collisions. Every key is prepended with a service-specific prefix (for example, auth:, session:, or geo:). Multiple services share the same Redis cluster without risk of overwriting each other's data.

Serialization uses JSON. Values are stringified before storage and parsed on retrieval. This handles objects, arrays, numbers, and strings transparently. Serialization failures are caught and wrapped in a RedisCacheSerializationError that includes the key and operation context.

Here is a simplified view of how get and set operations work internally:

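import { createClient, RedisClientType } from 'redis';
// ICacheStorage and RedisStorageConfig are the cache package's own types; their import is omitted in this simplified view.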

export class RedisCacheStorage implements ICacheStorage {
  private client: RedisClientType | null = null;
  private readonly prefix: string;

  constructor(private readonly config: RedisStorageConfig) {
    this.prefix = config.keyPrefix ? `${config.keyPrefix}:` : '';
  }

  private async getClient(): Promise<RedisClientType> {
    if (!this.client) {
      this.client = createClient({ url: this.config.redisUrl });
      this.client.on('error', (err) => console.error('Redis connection error:', err));
      await this.client.connect();
    }
    return this.client;
  }

  async get<T>(key: string): Promise<T | null> {
    const client = await this.getClient();
    const raw = await client.get(`${this.prefix}${key}`);
    if (!raw) return null;
    return JSON.parse(raw) as T;
  }

  async set<T>(key: string, value: T, ttlSeconds: number): Promise<void> {
    const client = await this.getClient();
    const serialized = JSON.stringify(value);
    await client.set(`${this.prefix}${key}`, serialized, { EX: ttlSeconds });
  }

  async shutdown(): Promise<void> {
    if (this.client) {
      // quit() lets queued commands finish before closing; disconnect() drops the socket immediately
      await this.client.quit();
      this.client = null;
    }
  }
}
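The simplified class above omits the clear operation. Here is a minimal sketch of SCAN-based clearing by prefix. The ScanCapableClient interface is a hypothetical adapter, not the node-redis API; map its two methods onto your client's SCAN and DEL commands.

```typescript
// Hypothetical minimal client surface for illustration.
interface ScanCapableClient {
  scan(
    cursor: string,
    matchPattern: string,
    count: number,
  ): Promise<{ cursor: string; keys: string[] }>;
  del(keys: string[]): Promise<number>;
}

// Walks the keyspace in small pages and deletes matching keys page by page.
// Unlike KEYS, each SCAN call does a bounded amount of work on the server.
async function clearByPrefix(
  client: ScanCapableClient,
  prefix: string,
  batchSize = 100,
): Promise<number> {
  let cursor = '0';
  let removed = 0;
  do {
    // Assumes the prefix contains no glob characters (*, ?, [).
    const page = await client.scan(cursor, `${prefix}*`, batchSize);
    if (page.keys.length > 0) {
      removed += await client.del(page.keys);
    }
    cursor = page.cursor;
  } while (cursor !== '0'); // Redis returns cursor "0" once the iteration is complete
  return removed;
}
```

Two caveats apply to this sketch. SCAN may return an empty page before the iteration is finished, which is why the loop tests the cursor and not the page size. In cluster mode, SCAN runs per node and multi-key deletes must stay within one hash slot, so a cluster-aware version needs more than this.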

Error handling is thorough. Connection errors throw RedisCacheConnectionError. Invalid keys throw RedisCacheKeyError. Configuration problems throw RedisCacheConfigurationError. Each error carries context — the failed operation, the key involved, and the underlying cause. This makes debugging straightforward even in a distributed Lambda environment.
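As an illustration of errors that carry context, here is a minimal sketch of an error hierarchy and the serialization wrappers. The class names come from the description above. The base class, the field names (operation, key, causedBy), and the serialize and deserialize helpers are assumptions.

```typescript
// Hypothetical base class; field names are assumptions.
class RedisCacheError extends Error {
  readonly operation: string;
  readonly key: string | undefined;
  readonly causedBy: unknown;

  constructor(message: string, operation: string, key?: string, causedBy?: unknown) {
    super(message);
    Object.setPrototypeOf(this, new.target.prototype); // keeps instanceof working on ES5 targets
    this.name = new.target.name;
    this.operation = operation;
    this.key = key;
    this.causedBy = causedBy;
  }
}

class RedisCacheSerializationError extends RedisCacheError {}
class RedisCacheConnectionError extends RedisCacheError {}

// Stringifies a value, wrapping any failure with the key and operation.
function serialize(key: string, value: unknown): string {
  try {
    const json = JSON.stringify(value);
    if (json === undefined) {
      // JSON.stringify returns undefined for undefined, functions, and symbols
      throw new TypeError('value is not JSON-serializable');
    }
    return json;
  } catch (err) {
    throw new RedisCacheSerializationError(
      `Failed to serialize value for key "${key}"`, 'set', key, err,
    );
  }
}

// Parses a stored string, wrapping malformed JSON the same way.
function deserialize<T>(key: string, raw: string): T {
  try {
    return JSON.parse(raw) as T;
  } catch (err) {
    throw new RedisCacheSerializationError(
      `Failed to parse cached value for key "${key}"`, 'get', key, err,
    );
  }
}
```

A caller can then branch on the error type, for example treating RedisCacheConnectionError as a cache miss while logging RedisCacheSerializationError with its key.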

The shutdown method closes the connection cleanly. It is invoked during Lambda graceful shutdown to prevent connection leaks.

05 What We Cache in Redis

Not everything belongs in Redis. We use it for data that is read frequently, changes infrequently, and is acceptable to lose on a cache miss because the source of truth exists elsewhere.

Session lookups are the primary use case. Every authenticated request validates the session. The canonical session record lives in DynamoDB. Redis holds a cached copy. A cache hit returns the session in under 1 millisecond. A miss falls back to DynamoDB, then repopulates the cache for subsequent requests.

Here is the pattern in practice:

import { cacheService } from './cache';
import { SessionRepository } from './repositories/session';

interface UserSession {
  userId: string;
  roles: string[];
  expiresAt: number;
}

export async function getSession(sessionId: string): Promise<UserSession | null> {
  const cacheKey = `session:${sessionId}`;

  // 1. Check Redis first
  const cached = await cacheService.get<UserSession>(cacheKey);
  if (cached) {
    return cached;
  }

  // 2. Cache miss — read from DynamoDB (source of truth)
  const session = await SessionRepository.findById(sessionId);
  if (!session) {
    return null;
  }

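  // 3. Repopulate Redis for subsequent requests.
  //    Assumes expiresAt is a Unix timestamp in milliseconds; the TTL is the remaining lifetime in whole seconds.
  const ttl = Math.floor((session.expiresAt - Date.now()) / 1000);
  if (ttl <= 0) {
    // Expired record that has not been removed from DynamoDB yet: treat it as no session
    return null;
  }
  await cacheService.set(cacheKey, session, ttl);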

  return session;
}

Rate limit counters are another natural fit. Rate limiting requires atomic increment operations with TTL expiration. Redis INCR combined with EXPIRE handles this natively. The rate limit service tracks per-user, per-endpoint, and per-IP request counts, with counters expiring automatically at the end of each time window.
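A fixed-window counter built on INCR and EXPIRE can be sketched as follows. This is a minimal illustration, not the TCTF rate limit service: the CounterClient interface, the key layout, and the checkRateLimit function are all hypothetical.

```typescript
// Hypothetical minimal client surface; the methods map to the Redis INCR and EXPIRE commands.
interface CounterClient {
  incr(key: string): Promise<number>;
  expire(key: string, seconds: number): Promise<unknown>;
}

interface RateLimitResult {
  allowed: boolean;
  count: number;
  limit: number;
}

// scope identifies what is being limited, e.g. "user:42:/login" or "ip:203.0.113.7".
async function checkRateLimit(
  client: CounterClient,
  scope: string,
  limit: number,
  windowSeconds: number,
  nowMs: number = Date.now(),
): Promise<RateLimitResult> {
  // Bucket by window index so each time window gets its own counter key.
  const windowIndex = Math.floor(nowMs / (windowSeconds * 1000));
  const key = `ratelimit:${scope}:${windowIndex}`;

  // INCR is atomic: concurrent Lambda invocations never lose an increment.
  const count = await client.incr(key);
  if (count === 1) {
    // First hit in this window: start the expiry clock.
    await client.expire(key, windowSeconds);
  }
  return { allowed: count <= limit, count, limit };
}
```

INCR and EXPIRE are two separate round trips here. If the function dies between them, the counter key never expires. Because the key includes the window index it is never reused, so the cost is leaked memory and not a wrong decision. Wrapping both commands in a transaction or a Lua script closes that gap.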

API response caching stores expensive computations — leaderboard rankings, search results, feed aggregations — in Redis with short TTLs ranging from 30 seconds to 5 minutes. This absorbs read spikes without repeatedly hitting downstream services.
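The response caching pattern reduces to a small cache-aside helper. The sketch below assumes only the get and set signatures used elsewhere in this article; the CacheLike interface and the getOrCompute name are hypothetical.

```typescript
// Reduced cache surface matching the get/set calls shown in the session example.
interface CacheLike {
  get<T>(key: string): Promise<T | null>;
  set<T>(key: string, value: T, ttlSeconds: number): Promise<void>;
}

// Returns the cached value when present; otherwise computes it, stores it, and returns it.
async function getOrCompute<T>(
  cache: CacheLike,
  key: string,
  ttlSeconds: number,
  compute: () => Promise<T>,
): Promise<T> {
  const cached = await cache.get<T>(key);
  if (cached !== null) {
    return cached;
  }
  const fresh = await compute();
  await cache.set(key, fresh, ttlSeconds);
  return fresh;
}
```

A leaderboard endpoint could then call getOrCompute(cache, 'leaderboard:top100', 30, loadLeaderboard), where loadLeaderboard is whatever function runs the expensive query. This sketch does not deduplicate concurrent misses, so a burst of requests on a cold key can run the computation more than once.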

Geolocation data uses a two-layer approach. Redis caches lookups for hot IPs (the same address making multiple requests in quick succession). DynamoDB stores longer-lived entries for returning users. The result is sub-millisecond lookups for active sessions and durable caching for everyone else.
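The two-layer lookup can be sketched as a walk over ordered tiers, fastest first, with backfill of the tiers that missed. The CacheTier interface, the tieredLookup function, and the idea of a per-tier TTL are assumptions for illustration; the article does not state the TTLs used for either layer.

```typescript
// Hypothetical tier abstraction: one instance wraps Redis, another wraps DynamoDB.
interface CacheTier {
  name: string;
  ttlSeconds: number;
  get(key: string): Promise<string | null>;
  set(key: string, value: string, ttlSeconds: number): Promise<void>;
}

// Checks tiers in order. On a hit, copies the value into the faster tiers that
// missed. On a full miss, loads from the origin and writes to every tier.
async function tieredLookup(
  tiers: CacheTier[],
  key: string,
  load: () => Promise<string>,
): Promise<{ value: string; source: string }> {
  const missed: CacheTier[] = [];
  for (const tier of tiers) {
    const value = await tier.get(key);
    if (value !== null) {
      for (const faster of missed) {
        await faster.set(key, value, faster.ttlSeconds);
      }
      return { value, source: tier.name };
    }
    missed.push(tier);
  }
  const value = await load();
  for (const tier of tiers) {
    await tier.set(key, value, tier.ttlSeconds);
  }
  return { value, source: 'origin' };
}
```

With a short TTL on the Redis tier and a long one on the DynamoDB tier, hot IPs stay in Redis while returning users are still served without a call to the geolocation provider.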

Circuit breaker state can also be stored in Redis when faster state checks are needed. This is configurable per circuit breaker instance.

📊
Sessions, rate limits, API responses, geolocation, circuit breaker state: all cached in Redis for sub-millisecond access. Anything that must survive a restart keeps its durable copy in DynamoDB.

06 CacheMonitor: Health Checks and Circuit Breaker

Redis is an external dependency. It can become unavailable. The CacheMonitor provides production-grade resilience for exactly this scenario.

Health checks run on a configurable interval. The monitor writes a test key, reads it back, and deletes it. If the check fails beyond the configured retry threshold, it logs an error and emits a CloudWatch metric. The operations team gains visibility into Redis health before users experience errors.
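The write, read back, delete probe with a retry threshold can be sketched as below. This is a minimal illustration and not the CacheMonitor implementation: the ProbeTarget interface, the HealthResult shape, the probe key format, and the default of three attempts are assumptions. Metric emission is left to the caller.

```typescript
// Hypothetical minimal surface of the cache being probed.
interface ProbeTarget {
  set(key: string, value: string, ttlSeconds: number): Promise<void>;
  get(key: string): Promise<string | null>;
  delete(key: string): Promise<void>;
}

interface HealthResult {
  healthy: boolean;
  attempts: number;
  lastError?: string;
}

// Runs the probe up to maxAttempts times and reports the first success
// or the last failure reason.
async function checkCacheHealth(target: ProbeTarget, maxAttempts = 3): Promise<HealthResult> {
  let lastError = 'unknown';
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const key = `__health:${Date.now()}:${attempt}`;
    try {
      // Short TTL so a failed delete cannot leave the probe key behind for long.
      await target.set(key, 'ok', 10);
      const echoed = await target.get(key);
      await target.delete(key);
      if (echoed === 'ok') {
        return { healthy: true, attempts: attempt };
      }
      lastError = 'read-back mismatch';
    } catch (err) {
      lastError = err instanceof Error ? err.message : String(err);
    }
  }
  return { healthy: false, attempts: maxAttempts, lastError };
}
```

A scheduler would call checkCacheHealth on the configured interval and publish the result as a metric, logging lastError when healthy is false.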

Circuit breaker integration protects against prolonged Redis outages. When Redis fails repeatedly, the circuit breaker transitions to the Open state. Cache operations then fall back to the DynamoDB backend or return cache misses directly. The circuit breaker follows the same three-state model (Closed, Open, Half-Open) used throughout TCTF, with configurable failure thresholds and recovery windows.

The executeWithCircuitBreaker method wraps any cache operation. When the circuit is open, the operation is skipped and the caller receives a cache miss. From the caller's perspective, the behavior is transparent — no code changes are required.
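The three-state behavior can be sketched as a small class. This is an illustration under stated assumptions, not the TCTF circuit breaker: the class name, the execute signature with an explicit fallback value, and the default thresholds are hypothetical.

```typescript
type CircuitState = 'closed' | 'open' | 'half-open';

class CacheCircuitBreaker {
  private state: CircuitState = 'closed';
  private failures = 0;
  private openedAtMs = 0;
  private readonly failureThreshold: number;
  private readonly recoveryMs: number;
  private readonly now: () => number;

  // The clock is injectable so the recovery window can be tested without waiting.
  constructor(failureThreshold = 3, recoveryMs = 30000, now: () => number = Date.now) {
    this.failureThreshold = failureThreshold;
    this.recoveryMs = recoveryMs;
    this.now = now;
  }

  getState(): CircuitState {
    return this.state;
  }

  // Runs the operation unless the circuit is open; returns the fallback
  // (for a cache read, typically null to signal a miss) when skipped or failed.
  async execute<T>(operation: () => Promise<T>, fallback: T): Promise<T> {
    if (this.state === 'open') {
      if (this.now() - this.openedAtMs < this.recoveryMs) {
        return fallback; // still inside the recovery window: skip the call entirely
      }
      this.state = 'half-open'; // window elapsed: allow a trial call
    }
    const wasTrial = this.state === 'half-open';
    try {
      const result = await operation();
      this.state = 'closed';
      this.failures = 0;
      return result;
    } catch {
      this.failures += 1;
      if (wasTrial || this.failures >= this.failureThreshold) {
        this.state = 'open';
        this.openedAtMs = this.now();
      }
      return fallback;
    }
  }
}
```

A read wrapped this way looks like breaker.execute(() => cache.get(key), null), so the caller sees an ordinary miss while Redis is down. This sketch lets every concurrent caller through in the half-open state; a stricter version would admit a single trial call.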

Graceful shutdown cleans up health check intervals, resets circuit breaker state, and closes the Redis connection. This prevents resource leaks in Lambda functions running with provisioned concurrency over extended periods.

07 When to Use Redis vs DynamoDB Caching

The decision framework is straightforward.

Choose Redis when the data is read on every request (sessions, rate limits), sub-millisecond latency is a requirement, you need atomic operations (INCR, EXPIRE), and the service already runs inside a VPC.

Choose DynamoDB when the data is read occasionally (geolocation, configuration), single-digit millisecond latency is acceptable, you need durability across restarts, and the service runs outside a VPC.

Choose in-memory when the data is cheap to recompute after a cold start (parsed configuration, validated schemas), you are running tests, or the data is small enough that external caching adds more overhead than it saves.

The pluggable interface removes the pressure to decide upfront. Start with DynamoDB caching, which requires zero additional infrastructure. If profiling reveals cache latency as a bottleneck, switch to Redis by changing an environment variable. If Redis introduces too much operational complexity for a given service, revert. The application code remains unchanged.

That flexibility is the purpose of the pluggable architecture. The optimal caching strategy depends on the service, its access patterns, and its performance requirements. The interface allows per-service optimization without rewriting anything.

🎯
Start with DynamoDB caching (zero infra). Switch to Redis when sub-millisecond latency matters. Switch back if Redis adds too much complexity. The code never changes.

Redis is not a replacement for DynamoDB caching. It is a complement — a speed layer dedicated to operations that demand sub-millisecond response times. The pluggable cache architecture allows every service to select the backend that fits its needs, and that selection is a configuration decision rather than a code change. Sessions use Redis. Geolocation uses DynamoDB. Tests use in-memory. The CacheService API remains consistent everywhere. That uniformity — same interface, different backends, per-service optimization — is what allows the caching layer to scale across 34 microservices without becoming a maintenance burden.

Editor's Note: This is Framework Series #6 in the TCTF Newsletter, Part 2 of the caching series. Part 1 covered DynamoDB-backed TTL caches. Next in the series: DynamoDB Framework Part 3 — Transaction Builder and Advanced Patterns.
