Framework Deep Dives · Framework Series #9

Error Handling Architecture: From Custom Errors to Automatic Recovery

How we structured error handling across 34 microservices — custom error hierarchies, the withErrorHandling wrapper, priority-based error routing, the ErrorResponseBuilder, and the factory pattern for specialized handlers.

July 20, 2026 · 12 min read

TCTF Editorials

TCTF Newsletter

15+ Error Classes

8 levels Handler Priority

Consistent Response Format

withErrorHandling Wrapper

4 Factory Handlers

34 Services Using

Every Lambda function can fail. The database times out. The user sends invalid input. The authentication token expires. The external API returns a 503. The question is not whether errors happen, but how they are handled when they do. At TCTF, error handling is not an afterthought bolted onto each Lambda function. It is a shared architecture — a hierarchy of custom error classes, a centralized handler that routes errors to the right response, and a wrapper that ensures every Lambda function handles errors consistently. This article explains how we built an error handling system that works across 34 microservices without duplicating a single line of error-handling code.

01 The Problem: Error Handling Without Structure

Without a shared error handling architecture, every Lambda function reinvents the wheel. One function catches errors and returns a 500 with a generic message. Another returns the raw error message (leaking internal details). A third forgets to catch errors entirely and lets the Lambda runtime return its default error response.

The result is inconsistent API responses. The frontend cannot reliably parse error responses because every endpoint formats them differently. The monitoring team cannot aggregate errors because there is no common error code scheme. And debugging is painful because error messages do not include correlation IDs, timestamps, or context.

Worse, some errors need special handling. A rate limit error should return a 429 with a Retry-After header. An authentication error should return a 401 and invalidate the session. A validation error should return a 400 with field-level details. A database error should return a 500 but never expose the raw DynamoDB error to the client.

Handling all of this correctly in every Lambda function is error-prone and tedious. The solution: handle it once, in a shared library, and wrap every Lambda function with it.

🎯
Without shared error handling, every Lambda function reinvents the wheel. Inconsistent responses, leaked internal details, missing correlation IDs. The solution: handle it once, wrap every function.

02 The Error Class Hierarchy

At the foundation is CustomError — a base class that extends JavaScript's native Error with three additional properties: errorCode (a machine-readable string like AUTHENTICATION_ERROR), statusCode (the HTTP status code to return), and additionalData (a key-value map of context).

Every error in the system extends CustomError. The hierarchy is organized by domain:

Authentication errors: AuthenticationError (401), AuthorizationError (403), UserExistsError (409), UserNotFoundError (404). These cover the auth flow — wrong credentials, insufficient permissions, duplicate signups, missing users.

Token errors: AccessTokenError, RefreshTokenError, MissingTokenError, TokenValidationError. These cover JWT lifecycle — expired tokens, invalid signatures, missing headers, malformed payloads.

Resource errors: ResourceNotFoundException (404) is the base, with domain-specific subclasses like UserNotFoundError. Any entity that can be looked up and not found gets its own error class.

System errors: TimeoutError (408), RateLimitError (429), DatabaseError (500), DynamodbError (500). These cover infrastructure-level failures — slow responses, throttling, database issues.

Validation errors: BadRequestException (400) with field-level details. Request body validation, query parameter validation, path parameter validation.

Crypto errors: EncryptionError, DecryptionError, KeyRotationError. These cover the encryption service failures.

The hierarchy matters because the error handler uses instanceof checks to route errors. An AuthenticationError is handled differently from a DatabaseError, which is handled differently from a ValidationError. The class hierarchy makes this routing clean and extensible.
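The article does not show the class source, so the following is only a minimal sketch of what such a hierarchy could look like. The class names and the three properties come from the text above; the constructor signatures, default messages, and the error code strings other than AUTHENTICATION_ERROR are assumptions made for illustration.

```typescript
// Hypothetical sketch: constructor shapes and most error code strings are assumed.
class CustomError extends Error {
  constructor(
    message: string,
    public readonly errorCode: string,
    public readonly statusCode: number,
    public readonly additionalData: Record<string, unknown> = {},
  ) {
    super(message);
    // Keep instanceof working even when compiled to older JavaScript targets.
    Object.setPrototypeOf(this, new.target.prototype);
    this.name = new.target.name;
  }
}

class AuthenticationError extends CustomError {
  constructor(message = "Authentication failed", data: Record<string, unknown> = {}) {
    super(message, "AUTHENTICATION_ERROR", 401, data);
  }
}

class ResourceNotFoundException extends CustomError {
  constructor(
    message = "Resource not found",
    data: Record<string, unknown> = {},
    errorCode = "RESOURCE_NOT_FOUND",
  ) {
    super(message, errorCode, 404, data);
  }
}

// Domain-specific subclass: still matches instanceof ResourceNotFoundException.
class UserNotFoundError extends ResourceNotFoundException {
  constructor(userId: string) {
    super("User not found", { userId }, "USER_NOT_FOUND");
  }
}

class RateLimitError extends CustomError {
  constructor(public readonly retryAfterSeconds: number) {
    super("Too many requests", "RATE_LIMIT_EXCEEDED", 429, { retryAfterSeconds });
  }
}
```

Because UserNotFoundError extends ResourceNotFoundException, a handler that checks for the base class also catches every domain-specific subclass, which is what makes instanceof routing extensible.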

🏗️
15+ error classes organized by domain. Every error carries an errorCode, statusCode, and additionalData. The class hierarchy enables clean, extensible error routing.

03 The withErrorHandling Wrapper

Every Lambda handler at TCTF is wrapped with withErrorHandling. This is a higher-order function that takes a handler function and returns a new function with error handling built in.

The wrapper does three things. First, it generates a correlation ID for the request — a unique identifier that traces the request through every log entry, every error response, and every downstream service call. Second, it records the start time for performance tracking. Third, it wraps the handler in a try-catch that routes any thrown error to the centralized handleError function.

The handler function receives the API Gateway event, the correlation ID, and the start time as parameters. It does not need to worry about error handling — it just throws errors when something goes wrong, and the wrapper catches them.

The wrapper also supports an optional cleanup function that runs in the finally block — useful for releasing resources, closing connections, or flushing metrics regardless of whether the handler succeeded or failed.

This pattern means every Lambda function in the platform has consistent error handling, consistent correlation IDs, consistent performance tracking, and consistent response formatting — without any of that code being duplicated in the handler itself.
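As a rough illustration of the wrapper described in this section, here is a sketch of a higher-order function with the same responsibilities. The event and response shapes, the options object, and the ID generator are all assumptions; the real TCTF signatures are not shown in the article, and the centralized handleError is passed in here only to keep the sketch self-contained.

```typescript
// Assumed, simplified shapes; a real API Gateway event carries many more fields.
interface ApiEvent { path: string; body: string | null }
interface ApiResponse { statusCode: number; headers: Record<string, string>; body: string }

type Handler = (event: ApiEvent, correlationId: string, startTime: number) => Promise<ApiResponse>;
type ErrorRouter = (error: unknown, correlationId: string) => ApiResponse;

interface WrapperOptions {
  handleError: ErrorRouter;               // stand-in for the centralized handler
  cleanup?: () => Promise<void> | void;   // optional, runs in the finally block
  newCorrelationId?: () => string;        // injectable so tests can be deterministic
}

// Stand-in ID generator; a production service would more likely use a UUID.
const defaultCorrelationId = (): string =>
  `${Date.now().toString(36)}-${Math.random().toString(36).slice(2, 10)}`;

function withErrorHandling(handler: Handler, options: WrapperOptions) {
  return async (event: ApiEvent): Promise<ApiResponse> => {
    const correlationId = (options.newCorrelationId ?? defaultCorrelationId)();
    const startTime = Date.now();
    try {
      // Business logic only throws; it never formats error responses itself.
      return await handler(event, correlationId, startTime);
    } catch (error) {
      return options.handleError(error, correlationId);
    } finally {
      // Runs whether the handler succeeded or failed.
      await options.cleanup?.();
    }
  };
}
```

In this sketch a handler is wrapped once at module load, for example `export const handler = withErrorHandling(getUser, { handleError })`, where getUser is a hypothetical business function. One design caveat: an exception thrown by the cleanup function would replace the response, so cleanup code should not throw.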

🔧
withErrorHandling wraps every Lambda handler. It generates correlation IDs, records timing, catches all errors, and routes them to the centralized handler. Zero error-handling code in the business logic.

04 Priority-Based Error Routing

The handleError function is the brain of the error handling system. It receives an error and routes it to the appropriate response builder based on a priority order.

The priority is deliberate. Validation errors are checked first because they are the most common — bad input from the client. Rate limit errors come next because they need a Retry-After header. Timeout errors follow because they indicate infrastructure stress. Authentication and authorization errors come next because they need security context (was the token invalid? was it missing?).

Resource not found errors, bad request errors, and database errors follow in decreasing frequency. Custom application errors with their own status codes are handled next. AWS service errors (from the SDK) are mapped using an error registry. And finally, unknown errors — anything that does not match any of the above — get a generic 500 response.

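This order means the most common error is handled with the fewest checks: a validation error hits on the first check, while an unknown error falls through every check to the bottom. The order also matters for correctness. Because routing relies on instanceof, a specific class must be checked before any class it extends, which is why generic custom application errors come after the specialized categories. The system is optimized for the common case while still handling every edge case.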
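The routing order can be sketched as a chain of instanceof checks. This is a shortened, hypothetical version: only a few of the categories are shown, the class definitions are minimal stand-ins, and the response shape and messages are assumptions rather than the platform's actual output.

```typescript
// Minimal stand-in classes so the sketch runs on its own.
class CustomError extends Error {
  constructor(message: string, readonly errorCode: string, readonly statusCode: number) {
    super(message);
    Object.setPrototypeOf(this, new.target.prototype);
  }
}
class ValidationError extends CustomError {
  constructor(message: string) { super(message, "VALIDATION_ERROR", 400); }
}
class RateLimitError extends CustomError {
  constructor(readonly retryAfterSeconds: number) {
    super("Too many requests", "RATE_LIMIT_EXCEEDED", 429);
  }
}
class DatabaseError extends CustomError {
  constructor(message: string) { super(message, "DATABASE_ERROR", 500); }
}

interface Routed {
  statusCode: number;
  errorCode: string;
  message: string;
  headers: Record<string, string>;
}

function handleError(error: unknown): Routed {
  // 1. Validation errors first: the most common case.
  if (error instanceof ValidationError) {
    return { statusCode: 400, errorCode: error.errorCode, message: error.message, headers: {} };
  }
  // 2. Rate limit errors: need a Retry-After header.
  if (error instanceof RateLimitError) {
    return {
      statusCode: 429,
      errorCode: error.errorCode,
      message: error.message,
      headers: { "Retry-After": String(error.retryAfterSeconds) },
    };
  }
  // (Timeout, auth, not-found, and bad-request checks would sit here.)
  // Database errors: always 500, and the raw message is never echoed back.
  if (error instanceof DatabaseError) {
    return { statusCode: 500, errorCode: error.errorCode, message: "A database error occurred", headers: {} };
  }
  // Generic custom errors come after every subclass, since subclasses also match this check.
  if (error instanceof CustomError) {
    return { statusCode: error.statusCode, errorCode: error.errorCode, message: error.message, headers: {} };
  }
  // Anything else falls through to a generic 500.
  return { statusCode: 500, errorCode: "INTERNAL_ERROR", message: "An unexpected error occurred", headers: {} };
}
```

Moving the CustomError check above the subclass checks would silently swallow the specialized handling, so the position of the generic check is part of the contract, not just an optimization.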

05 The ErrorResponseBuilder

Every error response in the platform follows the same JSON structure. The ErrorResponseBuilder ensures this consistency.

Every response includes: the HTTP status code, a machine-readable error code, a human-readable message (safe to show to users), the correlation ID (for support tickets and debugging), and a timestamp. Some responses include additional fields — a Retry-After header for rate limit errors, field-level details for validation errors, a security context for auth errors.

The builder has specialized methods for each error category: buildValidationErrorResponse, buildRateLimitErrorResponse, buildTimeoutErrorResponse, buildSecurityErrorResponse, buildResourceNotFoundErrorResponse, buildCustomErrorResponse, buildAWSServiceErrorResponse, and buildUnknownErrorResponse.
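To make the shared response shape concrete, here is a sketch of a builder with three of the methods named above. The method names and the five common fields come from the article; the constructor, the field names in the JSON body, the messages, and the injectable clock are assumptions.

```typescript
interface ErrorBody {
  statusCode: number;
  errorCode: string;
  message: string;
  correlationId: string;
  timestamp: string;
  details?: Record<string, string>; // field-level details, validation errors only
}

interface ErrorResponse {
  statusCode: number;
  headers: Record<string, string>;
  body: string;
}

class ErrorResponseBuilder {
  constructor(
    private readonly correlationId: string,
    private readonly now: () => Date = () => new Date(), // injectable for tests
  ) {}

  buildValidationErrorResponse(fieldErrors: Record<string, string>): ErrorResponse {
    return this.build(400, "VALIDATION_ERROR", "Request validation failed", { details: fieldErrors });
  }

  buildRateLimitErrorResponse(retryAfterSeconds: number): ErrorResponse {
    return this.build(429, "RATE_LIMIT_EXCEEDED", "Too many requests", {}, {
      "Retry-After": String(retryAfterSeconds),
    });
  }

  buildUnknownErrorResponse(_rawError: unknown): ErrorResponse {
    // The raw error is deliberately not copied into the response body.
    return this.build(500, "INTERNAL_ERROR", "An unexpected error occurred");
  }

  // Every public method funnels through here, so the common fields cannot be forgotten.
  private build(
    statusCode: number,
    errorCode: string,
    message: string,
    extra: Partial<ErrorBody> = {},
    headers: Record<string, string> = {},
  ): ErrorResponse {
    const body: ErrorBody = {
      statusCode,
      errorCode,
      message,
      correlationId: this.correlationId,
      timestamp: this.now().toISOString(),
      ...extra,
    };
    return {
      statusCode,
      headers: { "Content-Type": "application/json", ...headers },
      body: JSON.stringify(body),
    };
  }
}
```

Routing every specialized method through one private build step is the design choice that keeps the format consistent: a new error category only decides its status, code, message, and extras.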

Critically, the builder never exposes internal error details to the client. A DynamoDB ConditionalCheckFailedException becomes a generic database error. A Cognito NotAuthorizedException becomes an authentication error. The raw error is logged server-side with the correlation ID, but the client only sees a safe, formatted response.

This separation — detailed logging server-side, safe responses client-side — is essential for security. Internal error messages can reveal database schemas, service names, and infrastructure details that an attacker could exploit.
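One way to picture the split between server-side detail and client-side safety is a small registry keyed by the raw exception name. The two entries below mirror the examples in this section; the registry structure, the function name toSafeError, the log entry shape, and the safe messages are assumptions for illustration.

```typescript
interface SafeError { statusCode: number; errorCode: string; message: string }
interface LogEntry { correlationId: string; name: string; rawMessage: string }

// Illustrative entries only; the article does not list the real registry contents.
const awsErrorRegistry = new Map<string, SafeError>([
  ["ConditionalCheckFailedException", {
    statusCode: 500, errorCode: "DATABASE_ERROR", message: "A database error occurred",
  }],
  ["NotAuthorizedException", {
    statusCode: 401, errorCode: "AUTHENTICATION_ERROR", message: "Authentication failed",
  }],
]);

const UNKNOWN_ERROR: SafeError = {
  statusCode: 500, errorCode: "INTERNAL_ERROR", message: "An unexpected error occurred",
};

function toSafeError(
  error: unknown,
  correlationId: string,
  log: (entry: LogEntry) => void,
): SafeError {
  const name = error instanceof Error ? error.name : "UnknownError";
  const rawMessage = error instanceof Error ? error.message : String(error);
  // Full detail goes to the server-side log, tagged with the correlation ID.
  log({ correlationId, name, rawMessage });
  // The client only ever receives a pre-written, safe entry.
  return awsErrorRegistry.get(name) ?? UNKNOWN_ERROR;
}
```

Since the client-facing message is looked up rather than derived from the raw error, nothing from the original exception text can reach the response, even for exception types nobody anticipated.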

🔒
The ErrorResponseBuilder never exposes internal details to the client. Raw errors are logged server-side with correlation IDs. The client sees safe, formatted responses. Security by design.

06 Error Handler Factory

Some services need specialized error handling beyond the default. The error handler factory provides pre-built handlers for common scenarios.

createAPIErrorHandler wraps errors with API-specific context — the API name, the endpoint, the HTTP method. This context appears in logs and metrics, making it easy to identify which API endpoint is generating errors.

createAuthErrorHandler adds authentication-specific handling — logging failed login attempts, tracking suspicious patterns, and enriching error responses with security context.

createDatabaseErrorHandler adds database-specific handling — distinguishing between transient errors (throttling, timeout) that should be retried and permanent errors (validation, not found) that should not.

createValidationErrorHandler adds field-level validation context — which field failed, what the expected format was, what the actual value was (sanitized). This makes validation errors actionable for the frontend.

These factory handlers compose with the default handler. A service can use createAuthErrorHandler for its authentication endpoints and the default handler for everything else. The factory pattern keeps the specialization modular and reusable.
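The composition idea can be sketched with one of the four factories. Here createDatabaseErrorHandler classifies errors as transient or permanent and defers everything else to a default handler. The function name is from the article; the handler signature, the retryable flag, and the particular exception names chosen for each set are assumptions.

```typescript
interface HandledError {
  statusCode: number;
  errorCode: string;
  message: string;
  retryable: boolean;
}
type ErrorHandler = (error: unknown) => HandledError;

// Stand-in for the platform's default handler.
const defaultHandler: ErrorHandler = () => ({
  statusCode: 500,
  errorCode: "INTERNAL_ERROR",
  message: "An unexpected error occurred",
  retryable: false,
});

// Illustrative classification by exception name (assumed, not the real TCTF lists).
const TRANSIENT_NAMES = new Set(["ThrottlingException", "ProvisionedThroughputExceededException"]);
const PERMANENT_NAMES = new Set(["ValidationException", "ResourceNotFoundException"]);

function createDatabaseErrorHandler(fallback: ErrorHandler = defaultHandler): ErrorHandler {
  return (error) => {
    const name = error instanceof Error ? error.name : "";
    if (TRANSIENT_NAMES.has(name) || PERMANENT_NAMES.has(name)) {
      return {
        statusCode: 500,
        errorCode: "DATABASE_ERROR",
        message: "A database error occurred",
        // Only transient failures are worth retrying.
        retryable: TRANSIENT_NAMES.has(name),
      };
    }
    // Not a database error this handler recognizes: compose with the default.
    return fallback(error);
  };
}
```

A service could then pass the handler returned by createDatabaseErrorHandler() to its data-heavy endpoints and keep the default handler everywhere else, which is the modularity the factory pattern is meant to provide.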

07 What This Enables

With this architecture in place, several things become possible that would be difficult without it.

Consistent API responses across all 34 services. The frontend team writes one error parser that works with every endpoint. No special cases, no per-service error formats.
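A single frontend parser for that format might look like the sketch below. The five fields match the ones listed in the ErrorResponseBuilder section; the exact JSON key names and the function names are assumptions.

```typescript
interface ApiError {
  statusCode: number;
  errorCode: string;
  message: string;
  correlationId: string;
  timestamp: string;
}

// Type guard: checks the five fields every error response is expected to carry.
function isApiError(value: unknown): value is ApiError {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v["statusCode"] === "number" &&
    typeof v["errorCode"] === "string" &&
    typeof v["message"] === "string" &&
    typeof v["correlationId"] === "string" &&
    typeof v["timestamp"] === "string"
  );
}

// Returns undefined for anything that is not a well-formed error body,
// such as an HTML error page from a proxy.
function parseApiError(rawBody: string): ApiError | undefined {
  try {
    const parsed: unknown = JSON.parse(rawBody);
    return isApiError(parsed) ? parsed : undefined;
  } catch {
    return undefined;
  }
}
```

With one guard like this, UI code can branch on errorCode and show correlationId to the user without knowing which of the 34 services produced the response.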

Correlation ID tracing from request to response. When a user reports an error, the support team searches by correlation ID and sees the entire request path — the Lambda handler, the database query, the external service call, and the exact error that occurred.

Metrics aggregation by error type. CloudWatch dashboards show authentication errors, validation errors, rate limit errors, and database errors as separate metrics. Alerts trigger on error rate spikes per category, not just overall error rates.
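The article does not say how these metrics are emitted. One possible approach, shown purely as an assumption, is the CloudWatch Embedded Metric Format, where a Lambda function writes a structured JSON log line and CloudWatch extracts the metric from it. The namespace, dimension names, metric name, and category values below are hypothetical.

```typescript
type ErrorCategory = "validation" | "authentication" | "rate_limit" | "database" | "unknown";

// Builds one Embedded Metric Format log line counting a single error.
function buildErrorMetricLine(service: string, category: ErrorCategory, timestampMs: number): string {
  return JSON.stringify({
    _aws: {
      Timestamp: timestampMs,
      CloudWatchMetrics: [
        {
          Namespace: "TCTF/Errors", // hypothetical namespace
          Dimensions: [["Service", "ErrorCategory"]],
          Metrics: [{ Name: "ErrorCount", Unit: "Count" }],
        },
      ],
    },
    // Dimension values and the metric value live at the top level of the document.
    Service: service,
    ErrorCategory: category,
    ErrorCount: 1,
  });
}
```

Writing the returned line to standard output from the centralized handler would give each error category its own metric series, so alarms can be set per category rather than on the overall error rate.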

Safe error responses by default. No developer can accidentally leak a DynamoDB error message or a Cognito exception to the client. The ErrorResponseBuilder enforces the safety boundary.

And zero error-handling boilerplate in business logic. Lambda handlers throw errors. The wrapper catches them. The handler routes them. The builder formats them. The handler function itself is pure business logic — clean, focused, and testable.

✅
Consistent responses across 34 services. Correlation ID tracing. Metrics by error type. Safe responses by default. Zero boilerplate in business logic. One architecture, used everywhere.

Error handling is the least glamorous part of any platform. Nobody writes blog posts about their error classes. Nobody tweets about their error response format. But when a user hits an error and sees a clear message with a correlation ID they can send to support — and the support team finds the exact issue in seconds — that is the architecture paying off. Every error, handled consistently, across every service, every time. That is what shared error handling gives you.

Editor's Note: This is Framework Series #9 in the TCTF Newsletter. Next in the series: Circuit Breaker Pattern for Serverless — how we protect external service calls.
