
The final piece of our DynamoDB framework — transactional writes for atomic multi-item operations, batch operations for throughput, the retry and circuit breaker integration, and the real-world patterns that tie it all together across 34 services.
In Part 1, we covered single-table design: the PK/SK patterns, GSI strategies, and entity prefixing that let one DynamoDB table serve dozens of access patterns. In Part 2, we built a fluent query builder that eliminates raw expression strings and makes queries readable and composable. Now, in Part 3, we tackle the operations that go beyond simple reads and writes: transactions that guarantee all-or-nothing atomicity across multiple items, batch operations that maximize throughput, and the production hardening (retry logic, circuit breakers, and validation) that makes the framework reliable at scale.
Single-table design stores multiple entity types in the same table. A user, their sessions, their settings, and their activity records all live side by side, differentiated by key prefixes. This is powerful for read performance — one query can fetch related entities in a single round trip.
But it creates a consistency challenge for writes. When you create a new session, you also need to check the session count and potentially evict the oldest session. When you release an escrow payment, you need to debit the escrow account, credit the recipient's wallet, and log the transaction. When you complete MFA setup, you need to update the session and enable the MFA flag on the user record.
These are multi-item operations that must succeed or fail as a unit. If the escrow debit succeeds but the wallet credit fails, money disappears. If the session creation succeeds but the eviction fails, the user exceeds their session limit. Partial writes are worse than no writes.
DynamoDB transactions solve this. TransactWriteItems accepts up to 100 operations — Put, Update, Delete, and ConditionCheck — and executes them atomically. All succeed or all fail. No partial writes. No inconsistent state.
⚡ Transactions guarantee all-or-nothing atomicity. If the escrow debit succeeds but the wallet credit fails, the entire transaction rolls back. No partial writes, no inconsistent state.
The ProductionDynamoDBService exposes transactWrite as a first-class method. You pass an array of operations — each specifying a table, a key, and the action (Put, Update, Delete, or ConditionCheck) — and the service executes them as a single atomic transaction.
ConditionCheck is the operation that makes transactions powerful beyond simple writes. It verifies a condition on an item without modifying it. For example, before releasing an escrow payment, you can include a ConditionCheck that verifies the escrow status is still FUNDED. If someone else already released it, the entire transaction fails — including the wallet credit and the audit log entry.
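To make the ConditionCheck idea concrete, here is a minimal sketch of how an escrow release could be assembled as a list of transaction operations. The operation shapes (ConditionCheck, Update, Put with TableName, Key, ConditionExpression, and expression attribute maps) follow the DynamoDB TransactWriteItems request format. Everything else is an assumption made for illustration: the function name buildEscrowRelease, the key layout (ESCROW#, WALLET#, AUDIT#), and the attribute names are hypothetical, and the release is reduced to three operations.

```typescript
// Hypothetical key layout and attribute names, for illustration only.
type Key = { PK: string; SK: string };

interface Expressions {
  ExpressionAttributeNames?: Record<string, string>;
  ExpressionAttributeValues?: Record<string, unknown>;
}

type TransactItem =
  | { ConditionCheck: Expressions & { TableName: string; Key: Key; ConditionExpression: string } }
  | { Update: Expressions & { TableName: string; Key: Key; UpdateExpression: string } }
  | { Put: { TableName: string; Item: Record<string, unknown>; ConditionExpression?: string } };

function buildEscrowRelease(
  table: string,
  escrowId: string,
  walletId: string,
  amount: number,
  now: string,
): TransactItem[] {
  return [
    // 1. Verify the escrow is still FUNDED without modifying it.
    //    "status" is a DynamoDB reserved word, hence the #status alias.
    {
      ConditionCheck: {
        TableName: table,
        Key: { PK: `ESCROW#${escrowId}`, SK: "META" },
        ConditionExpression: "#status = :funded",
        ExpressionAttributeNames: { "#status": "status" },
        ExpressionAttributeValues: { ":funded": "FUNDED" },
      },
    },
    // 2. Credit the recipient's wallet.
    {
      Update: {
        TableName: table,
        Key: { PK: `WALLET#${walletId}`, SK: "BALANCE" },
        UpdateExpression: "ADD #balance :amount",
        ExpressionAttributeNames: { "#balance": "balance" },
        ExpressionAttributeValues: { ":amount": amount },
      },
    },
    // 3. Write the audit log entry; refuse to overwrite an existing one.
    {
      Put: {
        TableName: table,
        Item: {
          PK: `ESCROW#${escrowId}`,
          SK: `AUDIT#${now}`,
          type: "ESCROW_RELEASED",
          amount,
          walletId,
        },
        ConditionExpression: "attribute_not_exists(PK)",
      },
    },
  ];
}
```

One design constraint worth knowing: a single DynamoDB transaction cannot contain two operations on the same item. If the release also updated the escrow item itself, the FUNDED check would move into that Update's ConditionExpression instead of a separate ConditionCheck. The resulting array is the kind of input the service's transactWrite method would receive, although its exact signature is not shown in this article.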
The transaction builder integrates with the retry logic. If a transaction fails due to a transient error (throttling, internal server error), it retries with exponential backoff. If it fails because a condition was not met, it does not retry: resending the same request would fail the same way, so the caller needs to handle it. DynamoDB reports this case as a TransactionCanceledException whose cancellation reasons include ConditionalCheckFailed.
Transactions have a 100-operation limit. For operations that exceed this — bulk imports, mass updates — we use batch operations instead, accepting the trade-off of non-atomic execution.
🔒 ConditionCheck verifies a condition without modifying the item. Before releasing escrow, verify the status is still FUNDED. If not, the entire transaction rolls back.

Not every multi-item operation needs atomicity. Sometimes you need throughput — reading 50 items or writing 25 items as fast as possible, without the overhead of transaction coordination.
batchGet reads up to 100 items in parallel. You provide an array of keys, and the service returns all matching items. Unlike transactions, batch gets are not atomic — individual items may fail independently. The service automatically retries unprocessed keys with exponential backoff.
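The retry of unprocessed keys can be sketched as a small loop. This is an illustration under assumptions, not the framework's code: batchGetAll, BatchGetFn, and the simplified page shape are hypothetical names. The real BatchGetItem response reports found items under Responses and leftovers under UnprocessedKeys, which the injected send function is assumed to have flattened.

```typescript
type Key = Record<string, string>;
type Item = Record<string, unknown>;

// Simplified view of one BatchGetItem round trip (hypothetical shape).
interface BatchGetPage {
  items: Item[];
  unprocessedKeys: Key[];
}

type BatchGetFn = (keys: Key[]) => Promise<BatchGetPage>;
type SleepFn = (ms: number) => Promise<void>;

const defaultSleep: SleepFn = (ms) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function batchGetAll(
  keys: Key[],
  send: BatchGetFn,
  maxAttempts = 3,
  baseDelayMs = 100,
  sleep: SleepFn = defaultSleep,
): Promise<Item[]> {
  if (keys.length > 100) {
    throw new Error("BATCH_SIZE_EXCEEDED: batchGet accepts at most 100 keys");
  }
  const found: Item[] = [];
  let pending = keys;
  for (let attempt = 1; pending.length > 0; attempt++) {
    const page = await send(pending);
    found.push(...page.items);
    // Only the keys DynamoDB did not process are sent again.
    pending = page.unprocessedKeys;
    if (pending.length === 0) break;
    if (attempt >= maxAttempts) {
      throw new Error(
        `${pending.length} keys still unprocessed after ${maxAttempts} attempts`,
      );
    }
    // Exponential backoff: 100 ms, 200 ms, 400 ms, ... (jitter omitted here).
    await sleep(baseDelayMs * Math.pow(2, attempt - 1));
  }
  return found;
}
```

Injecting send and sleep keeps the loop testable without a live table. Items that simply do not exist are not "unprocessed"; they are just absent from the result.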
batchWrite handles up to 25 mixed Put and Delete operations. Like batchGet, it is not atomic — individual operations may fail. The service retries unprocessed items automatically. This is used for bulk data loading, cache warming, and cleanup operations where atomicity is not required.
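Workloads larger than 25 writes have to be split before they reach batchWrite. A minimal sketch of that splitting step follows. The PutRequest and DeleteRequest shapes match the BatchWriteItem request format; the helper names chunk and toWriteBatches are hypothetical.

```typescript
type Item = Record<string, unknown>;

type WriteRequest =
  | { PutRequest: { Item: Item } }
  | { DeleteRequest: { Key: Item } };

// BatchWriteItem accepts at most 25 put/delete requests per call.
const MAX_BATCH_WRITE = 25;

// Split a list into consecutive slices of at most `size` elements.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("chunk size must be a positive integer");
  }
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Mix puts and deletes into one request list, then cut it into batches.
function toWriteBatches(puts: Item[], deleteKeys: Item[]): WriteRequest[][] {
  const requests: WriteRequest[] = [
    ...puts.map((Item): WriteRequest => ({ PutRequest: { Item } })),
    ...deleteKeys.map((Key): WriteRequest => ({ DeleteRequest: { Key } })),
  ];
  return chunk(requests, MAX_BATCH_WRITE);
}
```

Each resulting batch would be sent separately, and any unprocessed items from a batch retried on their own, since no atomicity spans the batches.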
The key difference: transactions guarantee consistency (all or nothing). Batch operations maximize throughput (as fast as possible, retry failures). Choose transactions when correctness requires atomicity. Choose batch operations when speed matters more than all-or-nothing guarantees.
Both batch operations validate inputs before execution. Keys are checked for structure and size. Items are validated against the 400 KB DynamoDB item size limit. Batch sizes are enforced (100 for gets, 25 for writes). Invalid inputs fail fast with clear error messages.
Every DynamoDB operation in the ProductionDynamoDBService goes through the retry layer. The retry logic uses exponential backoff with jitter — the delay doubles with each attempt, plus a random jitter to prevent thundering herd problems when multiple Lambda instances retry simultaneously.
The configuration is tunable: maxAttempts (default 3), baseDelay (default 100ms), maxDelay (default 5000ms), and jitter factor (default 0.1). For critical operations like payment processing, we increase maxAttempts to 5. For non-critical operations like analytics writes, we keep the defaults.
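The delay calculation can be sketched in a few lines using the defaults listed above. How the jitter factor is applied is an assumption here: the sketch adds a random amount of up to jitterFactor times the current delay, then caps the total at maxDelay. The names RetryConfig and backoffDelay are hypothetical.

```typescript
interface RetryConfig {
  maxAttempts: number;
  baseDelayMs: number;
  maxDelayMs: number;
  jitterFactor: number;
}

// Defaults taken from the article's description of the retry layer.
const DEFAULT_RETRY: RetryConfig = {
  maxAttempts: 3,
  baseDelayMs: 100,
  maxDelayMs: 5000,
  jitterFactor: 0.1,
};

// Delay to wait after the given failed attempt (1-based).
// `random` is injectable so the calculation can be tested deterministically.
function backoffDelay(
  attempt: number,
  cfg: RetryConfig = DEFAULT_RETRY,
  random: () => number = Math.random,
): number {
  // Doubles on every attempt: base, 2x base, 4x base, ...
  const exponential = cfg.baseDelayMs * Math.pow(2, attempt - 1);
  // Assumed jitter model: add between 0% and jitterFactor of the delay.
  const jitter = exponential * cfg.jitterFactor * random();
  return Math.min(cfg.maxDelayMs, Math.round(exponential + jitter));
}
```

With the jitter fixed at zero, the defaults yield 100 ms, 200 ms, and 400 ms for attempts one through three. The random component spreads out instances that failed at the same moment so they do not all retry together.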
Not every error is retryable. The retry logic checks the error type: ProvisionedThroughputExceededException and ThrottlingException are retryable (the table is temporarily overloaded). ServiceUnavailable and InternalServerError are retryable (DynamoDB is having a bad moment). ConditionalCheckFailedException is not retryable (the condition was not met, and resending the same request will not change that). ResourceNotFoundException is not retryable (the table does not exist).
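The classification above could be expressed roughly as follows. This sketch assumes errors carry the exception name in a name property, as AWS SDK for JavaScript v3 errors do. The helper names isRetryable and withRetry are hypothetical, and the wait function is injected so the loop stays independent of any particular backoff formula.

```typescript
// Error names the article lists as transient.
const RETRYABLE_ERRORS: readonly string[] = [
  "ProvisionedThroughputExceededException",
  "ThrottlingException",
  "ServiceUnavailable",
  "InternalServerError",
];

function isRetryable(err: unknown): boolean {
  const name = (err as { name?: unknown } | null)?.name;
  return typeof name === "string" && RETRYABLE_ERRORS.indexOf(name) !== -1;
}

type WaitFn = (attempt: number) => Promise<void>;

// Plain exponential wait; jitter is left out of this sketch.
const defaultWait: WaitFn = (attempt) =>
  new Promise<void>((resolve) =>
    setTimeout(resolve, 100 * Math.pow(2, attempt - 1)),
  );

async function withRetry<T>(
  op: () => Promise<T>,
  maxAttempts = 3,
  wait: WaitFn = defaultWait,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await op();
    } catch (err) {
      // Anything not on the retryable list (for example
      // ConditionalCheckFailedException or ResourceNotFoundException)
      // is rethrown immediately, as is the final failed attempt.
      if (!isRetryable(err) || attempt >= maxAttempts) throw err;
      await wait(attempt);
    }
  }
}
```

Using an allow-list means unknown errors fail fast by default, which is the safer choice when a blind retry could repeat a side effect.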
The retry layer also enforces operation timeouts. If an operation does not complete within the configured timeout (default 30 seconds), it is aborted and counted as a failure. This prevents Lambda functions from hanging on slow DynamoDB responses — a common cause of Lambda timeout errors.
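One way to enforce such a timeout is to race the operation against a timer and signal cancellation when the timer wins. The sketch below is an assumption about the mechanism, not the framework's implementation: withTimeout and OperationTimeoutError are hypothetical names, and the operation is expected to honor the AbortSignal it receives.

```typescript
class OperationTimeoutError extends Error {
  constructor(timeoutMs: number) {
    super(`operation timed out after ${timeoutMs} ms`);
    this.name = "OperationTimeoutError";
  }
}

function withTimeout<T>(
  op: (signal: AbortSignal) => Promise<T>,
  timeoutMs = 30000, // default from the article: 30 seconds
): Promise<T> {
  const controller = new AbortController();
  let timer: ReturnType<typeof setTimeout> | undefined;

  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => {
      // Tell the operation to stop, then fail the race.
      controller.abort();
      reject(new OperationTimeoutError(timeoutMs));
    }, timeoutMs);
  });

  // Start the operation in a promise chain so a synchronous throw
  // inside `op` still becomes a rejection.
  const work = Promise.resolve().then(() => op(controller.signal));

  return Promise.race([work, timeout]).finally(() => {
    // Always clear the timer so it cannot keep the process alive.
    if (timer !== undefined) clearTimeout(timer);
  });
}
```

The race only stops waiting. Whether the underlying request is actually cancelled depends on the operation reacting to the signal.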
🔄 Exponential backoff with jitter prevents thundering herd. Retryable errors (throttling, service unavailable) are retried. Non-retryable errors (condition failed, not found) fail immediately.
The ProductionDynamoDBService integrates with the circuit breaker pattern covered in Framework Series #10. Every operation passes through the circuit breaker before reaching DynamoDB.
If DynamoDB is experiencing sustained failures — a regional outage, a table-level issue, or persistent throttling — the circuit breaker opens and subsequent operations fail immediately without calling DynamoDB. This prevents Lambda functions from accumulating timeout errors and burning through their execution budget.
The circuit breaker state is stored in a separate DynamoDB table (or the same table with a different key prefix). This means the circuit breaker state persists across Lambda invocations — if one instance detects a failure pattern, all instances benefit from the circuit opening.
The circuit breaker initializes gracefully. If the circuit breaker table is not available (misconfigured, permissions issue), the service continues without circuit breaking and logs a warning. This ensures that a circuit breaker configuration problem does not prevent the service from starting.
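For readers who have not seen Framework Series #10, the general shape of a circuit breaker is sketched below. It is a generic closed/open/half-open state machine, not the TCTF implementation: state is kept in memory rather than in a DynamoDB table, and the threshold of 5 failures and the 30-second reset window are hypothetical values.

```typescript
type CircuitState = "CLOSED" | "OPEN" | "HALF_OPEN";

class CircuitBreaker {
  private state: CircuitState = "CLOSED";
  private failures = 0;
  private openedAt = 0;
  private readonly failureThreshold: number;
  private readonly resetAfterMs: number;
  private readonly now: () => number;

  constructor(
    failureThreshold = 5, // hypothetical default
    resetAfterMs = 30000, // hypothetical default
    now: () => number = Date.now,
  ) {
    this.failureThreshold = failureThreshold;
    this.resetAfterMs = resetAfterMs;
    this.now = now;
  }

  current(): CircuitState {
    return this.state;
  }

  async execute<T>(op: () => Promise<T>): Promise<T> {
    if (this.state === "OPEN") {
      // While open, fail immediately without calling the dependency.
      if (this.now() - this.openedAt < this.resetAfterMs) {
        throw new Error("CIRCUIT_OPEN");
      }
      // Reset window elapsed: let one trial call through.
      this.state = "HALF_OPEN";
    }
    try {
      const result = await op();
      this.state = "CLOSED";
      this.failures = 0;
      return result;
    } catch (err) {
      this.failures++;
      // A failed trial call, or too many consecutive failures, opens the circuit.
      if (this.state === "HALF_OPEN" || this.failures >= this.failureThreshold) {
        this.state = "OPEN";
        this.openedAt = this.now();
      }
      throw err;
    }
  }
}
```

Persisting the three private fields to shared storage, as the article describes, is what lets separate Lambda instances see the same circuit state. This in-memory version only protects the instance it runs in.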
Every operation validates its inputs before touching DynamoDB. Table names are checked for valid characters and length. Keys are validated for structure and sanitized for injection patterns. Items are checked against the 400 KB size limit. Query conditions are validated for type correctness. Pagination tokens are verified for integrity using the encrypted pagination system from the crypto module.
The validation is defensive by design. String values are checked for script injection patterns. Attribute names are length-limited. Batch sizes are enforced. Invalid inputs throw DynamoDBServiceError with clear error codes — INVALID_TABLE_NAME, INVALID_KEY, ITEM_TOO_LARGE, BATCH_SIZE_EXCEEDED — so callers know exactly what went wrong.
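A few of these checks can be sketched as plain functions that throw a coded error. The error class name and the four codes come from the article; the constructor shape, the PK/SK key rule, and the size estimate are assumptions. The table name pattern reflects DynamoDB's documented rule of 3 to 255 characters drawn from letters, digits, underscore, hyphen, and dot.

```typescript
type ErrorCode =
  | "INVALID_TABLE_NAME"
  | "INVALID_KEY"
  | "ITEM_TOO_LARGE"
  | "BATCH_SIZE_EXCEEDED";

// Constructor shape is assumed; only the class name and codes are from the article.
class DynamoDBServiceError extends Error {
  readonly code: ErrorCode;
  constructor(code: ErrorCode, message: string) {
    super(message);
    this.name = "DynamoDBServiceError";
    this.code = code;
  }
}

const MAX_ITEM_BYTES = 400 * 1024; // DynamoDB item size limit (400 KB)

function validateTableName(name: string): void {
  if (!/^[a-zA-Z0-9_.-]{3,255}$/.test(name)) {
    throw new DynamoDBServiceError("INVALID_TABLE_NAME", `invalid table name: ${name}`);
  }
}

// Assumes a composite key with string attributes named PK and SK.
function validateKey(key: Record<string, unknown>): void {
  for (const attr of ["PK", "SK"]) {
    const value = key[attr];
    if (typeof value !== "string" || value.length === 0) {
      throw new DynamoDBServiceError("INVALID_KEY", `${attr} must be a non-empty string`);
    }
  }
}

function validateItemSize(item: Record<string, unknown>): void {
  // Rough approximation: length of the serialized JSON in UTF-16 code units.
  // DynamoDB's own accounting counts UTF-8 bytes of attribute names and values.
  const approxSize = JSON.stringify(item).length;
  if (approxSize > MAX_ITEM_BYTES) {
    throw new DynamoDBServiceError("ITEM_TOO_LARGE", `item is about ${approxSize} bytes`);
  }
}

function validateBatchSize(count: number, limit: number): void {
  if (count > limit) {
    throw new DynamoDBServiceError("BATCH_SIZE_EXCEEDED", `${count} exceeds batch limit of ${limit}`);
  }
}
```

Because every failure carries a code, a caller can branch on err.code instead of parsing message text.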
This validation layer means that even if a Lambda handler passes unsanitized user input to the DynamoDB service, the service catches it before it reaches the database. Defense in depth — the handler should validate, the service validates again, and DynamoDB validates at the API level. Three layers, each catching what the others miss.
🛡️ Three layers of validation: the Lambda handler, the DynamoDB service, and DynamoDB itself. Even if the handler passes unsanitized input, the service catches it before it reaches the database.
The transaction and batch capabilities enable several patterns used across the platform.
Session creation with LRU eviction: When a user creates their 6th session, a transaction creates the new session and deletes the oldest in a single atomic operation. The user never exceeds 5 sessions, and the creation never partially succeeds.
Escrow release: When a milestone is approved, a transaction debits the escrow account, credits the recipient's wallet, updates the milestone status, and creates an audit log entry. If any step fails, the entire release rolls back.
MFA setup completion: A transaction updates the user's session with the MFA setup ID, enables the MFA flag on the user record, and logs the security event. The user's MFA status is never in an inconsistent state.
Bulk notification delivery: When a campaign sends notifications to thousands of users, batchWrite creates the notification records in batches of 25. Atomicity is not needed — a failed notification can be retried independently.
Cache warming: On cold start, batchGet loads frequently accessed configuration items in a single call instead of 50 individual gets. This reduces cold start latency by parallelizing the reads.
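As an illustration of the first pattern, session creation with eviction might be assembled like this. It is a sketch under assumptions: the key layout (USER#, SESSION#), the Session shape, and the function name buildCreateSession are hypothetical, and the caller is assumed to have already queried the user's existing sessions.

```typescript
type Key = { PK: string; SK: string };

type TransactItem =
  | { Put: { TableName: string; Item: Record<string, unknown>; ConditionExpression?: string } }
  | { Delete: { TableName: string; Key: Key; ConditionExpression?: string } };

interface Session {
  sessionId: string;
  createdAt: string; // ISO 8601 in UTC, so string order matches time order
}

function buildCreateSession(
  table: string,
  userId: string,
  existing: Session[],
  next: Session,
  maxSessions = 5,
): TransactItem[] {
  if (maxSessions < 1) throw new RangeError("maxSessions must be at least 1");
  const pk = `USER#${userId}`;

  const ops: TransactItem[] = [
    {
      Put: {
        TableName: table,
        Item: { PK: pk, SK: `SESSION#${next.sessionId}`, createdAt: next.createdAt },
        // Never overwrite a session that already exists.
        ConditionExpression: "attribute_not_exists(PK)",
      },
    },
  ];

  if (existing.length >= maxSessions) {
    // Pick the session with the earliest creation time.
    const oldest = existing.reduce((a, b) => (a.createdAt <= b.createdAt ? a : b));
    ops.push({
      Delete: {
        TableName: table,
        Key: { PK: pk, SK: `SESSION#${oldest.sessionId}` },
        // If another request already evicted it, fail the whole transaction
        // so the caller can re-read the session list and try again.
        ConditionExpression: "attribute_exists(PK)",
      },
    });
  }
  return ops;
}
```

The condition on the Delete is what ties the two writes together: the new session is only created if the eviction it was paired with actually happens. Strictly enforcing the limit when several sessions are created at once while the user is below the limit would need an additional check, such as a counter item, which this sketch leaves out.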
These patterns are not unique to TCTF. They are common in any DynamoDB application. What the framework provides is a consistent, validated, retry-protected way to execute them — so every service uses the same patterns with the same safety guarantees.
This concludes the three-part DynamoDB series. Part 1 covered single-table design, an established DynamoDB pattern that we adopted for all 34 services. Part 2 gave us the query language: a fluent builder that eliminates expression strings. Part 3 gave us the operations: transactions for atomicity, batches for throughput, retries for resilience, and validation for safety.
Together, the fluent API, the transaction builder, the repository pattern, and the provider architecture form the TCTF DynamoDB framework. The framework is not clever. It is consistent. And in a platform with hundreds of Lambda functions, consistency is worth more than cleverness.
One more thing: we plan to release the TCTF DynamoDB framework as a public open-source package once it matures. That covers the fluent query builder, the transaction builder, the repository pattern, the provider architecture, and the encrypted pagination system, the tools that make working with DynamoDB at scale manageable. Single-table design is an established pattern you can adopt today. The framework is what makes it practical across dozens of services.
If you are building serverless applications on DynamoDB, stay tuned. We will announce the public release in a future newsletter.