
How we structured data access across 34 services using the repository pattern — DynamoDB providers, the BaseRepository abstraction, the RepositoryFactory, and the layered architecture that keeps business logic clean.
In the first three parts of this DynamoDB series, we covered single-table design, the fluent query builder, and the transaction builder. Those are the tools for talking to DynamoDB. But tools alone do not create clean architecture. You also need a structure that keeps database access organized, testable, and consistent across 34 microservices. That structure is the provider and repository pattern — a layered approach where each service gets its own database provider, each entity gets its own repository, and business logic never touches DynamoDB directly. This article explains how it works and why it matters.
When you have 34 microservices, each with its own Lambda functions, each talking to DynamoDB, the temptation is to just import the DynamoDB client and write queries inline. It works. It ships. And then it becomes unmaintainable.
Inline database access scatters DynamoDB-specific code across every Lambda handler. The same query patterns get duplicated in different services. Table names are hardcoded in multiple places. When you need to change a table name or add a new index, you are hunting through dozens of files.
Worse, business logic gets tangled with data access. A Lambda handler that validates input, queries DynamoDB, applies business rules, and returns a response is doing four jobs. Testing any one of those jobs requires mocking the others. Refactoring is risky because everything is coupled.
The provider and repository pattern solves this by introducing clear layers. Each layer has one job. Each layer is independently testable. And the pattern is consistent across all 34 services.
🏗️ Without structure, database access scatters across every Lambda handler. The provider and repository pattern introduces clear layers — each with one job, each independently testable.
The first layer is the database provider. Every service has one. It extends BaseDatabaseProvider from tctf-utils and adds project-specific table name getters.
BaseDatabaseProvider is a static class that manages the singleton DynamoDB service instance. It creates the ProductionDynamoDBService on first access and caches it for subsequent calls — critical for Lambda warm starts where you want to reuse connections across invocations.
The provider resolves table names from environment variables with optional defaults. This means the same code works in development (where table names might be different) and production (where they come from CDK-generated environment variables). If a required table name is missing, the provider throws a clear error with the missing variable name.
Each project extends the base with its own table name getters. The billing service has getBillingTableName(), getInvoiceTableName(), getSubscriptionTableName(). The social network service has getPostsTableName(), getConnectionsTableName(). The user management service has getUsersTableName(), getProfilesTableName().
The provider also exposes a convenience method — getDatabase() — that returns both the DynamoDB service instance and the table name in a single object. This is the most common access pattern: get the service and the table, pass them to a repository.
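To make the provider layer concrete, here is a minimal sketch of how it could look. It is not the tctf-utils source: the `ProductionDynamoDBService` stand-in, the `resolveTableName` helper, the `getService` name, and the environment variable names are all assumptions made for illustration.

```typescript
// Hypothetical stand-in for ProductionDynamoDBService from tctf-utils.
class ProductionDynamoDBService {}

// In Lambda this resolves to process.env. Reading it through globalThis keeps
// the sketch compilable without Node typings.
const env: Record<string, string | undefined> =
  (globalThis as any).process?.env ?? {};

class BaseDatabaseProvider {
  private static service: ProductionDynamoDBService | undefined;

  // Created on first access, then reused by later calls in the same container.
  static getService(): ProductionDynamoDBService {
    if (!BaseDatabaseProvider.service) {
      BaseDatabaseProvider.service = new ProductionDynamoDBService();
    }
    return BaseDatabaseProvider.service;
  }

  // Resolve a table name from the environment, with an optional default.
  // A missing (or empty) value fails loudly and names the variable.
  protected static resolveTableName(envVar: string, defaultName?: string): string {
    const value = env[envVar] ?? defaultName;
    if (!value) {
      throw new Error(`Missing required environment variable: ${envVar}`);
    }
    return value;
  }
}

// Project-specific provider: only adds table name getters.
class BillingDatabaseProvider extends BaseDatabaseProvider {
  static getBillingTableName(): string {
    return this.resolveTableName("BILLING_TABLE_NAME"); // assumed variable name
  }

  static getInvoiceTableName(): string {
    return this.resolveTableName("INVOICE_TABLE_NAME", "invoices-dev"); // assumed default
  }

  // Convenience accessor: service and table name in one object.
  static getDatabase(): { service: ProductionDynamoDBService; tableName: string } {
    return { service: this.getService(), tableName: this.getBillingTableName() };
  }
}
```

A handler would then call `BillingDatabaseProvider.getDatabase()` once and hand the result to a repository.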
📦 Every service extends BaseDatabaseProvider with its own table name getters. Table names come from environment variables, so the same code works in dev and production.
The second layer is the repository. BaseRepository is an abstract class that provides typed CRUD operations, query methods, and batch operations — all without exposing DynamoDB internals to the caller.
A repository is created for a specific entity type and table. It receives the ProductionDynamoDBService in its constructor and exposes protected methods that subclasses use to build their public API. The protected methods include getItem, putItem, updateItem, deleteItem, query, scan, batchGet, batchWrite, transactWrite, and transactRead.
Beyond basic CRUD, the base repository provides higher-level methods that handle common patterns: queryByPartitionKey for single-table access patterns, queryByGSI for index queries, paginated variants of both, findFirst for getting the first matching item, count and countByGSI for aggregation, exists for existence checks, and conditional writes with putWithCondition and updateWithCondition.
Every method is typed. A UserRepository extends BaseRepository with the User type, so getItem returns User or null, queryByPartitionKey returns User arrays, and putItem accepts User objects. The TypeScript compiler catches type mismatches at build time, not at runtime.
The key principle: repositories expose domain-specific methods. A UserRepository has getUserById(), getUsersByStatus(), createUser(), updateProfile(). It does not expose query() or scan(). The caller never sees DynamoDB expressions, key conditions, or projection expressions. They see domain language.
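The split between protected building blocks and a public domain API can be sketched as follows. This is a simplified illustration, not the real BaseRepository: the in-memory service stands in for ProductionDynamoDBService, and the `USER#<id>` / `PROFILE` key scheme is an assumption.

```typescript
type Key = { PK: string; SK: string };
type Attributes = Record<string, unknown>;

// Hypothetical in-memory stand-in for ProductionDynamoDBService, so the sketch
// runs without AWS. Only the two operations used below are modelled.
class InMemoryDynamoService {
  private readonly rows = new Map<string, Attributes>();

  private rowId(table: string, key: Key): string {
    return `${table}|${key.PK}|${key.SK}`;
  }

  async get(table: string, key: Key): Promise<Attributes | null> {
    return this.rows.get(this.rowId(table, key)) ?? null;
  }

  async put(table: string, key: Key, attributes: Attributes): Promise<void> {
    this.rows.set(this.rowId(table, key), { ...attributes });
  }
}

// Typed, protected building blocks. Subclasses decide what becomes public.
abstract class BaseRepository<T extends Attributes> {
  constructor(
    protected readonly db: InMemoryDynamoService,
    protected readonly tableName: string,
  ) {}

  protected async getItem(key: Key): Promise<T | null> {
    const attributes = await this.db.get(this.tableName, key);
    return attributes ? (attributes as T) : null;
  }

  protected async putItem(key: Key, entity: T): Promise<void> {
    await this.db.put(this.tableName, key, entity);
  }
}

type User = { id: string; email: string; status: "active" | "suspended" };

// Public API in domain language; key construction stays private.
class UserRepository extends BaseRepository<User> {
  private static keyFor(id: string): Key {
    return { PK: `USER#${id}`, SK: "PROFILE" }; // assumed key scheme
  }

  async getUserById(id: string): Promise<User | null> {
    return this.getItem(UserRepository.keyFor(id));
  }

  async createUser(user: User): Promise<void> {
    await this.putItem(UserRepository.keyFor(user.id), user);
  }
}
```

Because `getItem` and `putItem` are protected, code outside the repository cannot reach them; it can only call `getUserById()` and `createUser()`.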
🔒 Repositories expose domain-specific methods like getUserById() and createUser(). The caller never sees DynamoDB expressions. They see domain language.
The third layer is the factory. There are two factories in the system, each serving a different purpose.
DynamoDBFactory is a static utility class that creates fresh instances of DynamoDB clients, query builders, and production services. Every call returns a new instance — no caching, no shared state. This is critical for the query builder, which accumulates state as you chain methods. Reusing a query builder across operations would leak state from one query into the next. The factory prevents this by always returning fresh instances.
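The state leak is easy to reproduce with a toy builder. The `QueryBuilder` below is a heavily simplified, hypothetical stand-in for the fluent query builder from earlier in the series, and `createQueryBuilder` is an assumed method name.

```typescript
// Hypothetical, simplified builder: every chained call mutates internal state.
class QueryBuilder {
  private readonly conditions: string[] = [];
  private indexName: string | undefined;

  where(condition: string): this {
    this.conditions.push(condition);
    return this;
  }

  usingIndex(name: string): this {
    this.indexName = name;
    return this;
  }

  build(): { keyCondition: string; indexName: string | undefined } {
    return { keyCondition: this.conditions.join(" AND "), indexName: this.indexName };
  }
}

class DynamoDBFactory {
  // Always a new instance: no caching, no shared state.
  static createQueryBuilder(): QueryBuilder {
    return new QueryBuilder();
  }
}

// Reusing one builder: the second query inherits the first query's state.
const shared = new QueryBuilder();
shared.where("PK = :user").usingIndex("GSI1").build();
const leaked = shared.where("PK = :order").build();
// leaked.keyCondition is "PK = :user AND PK = :order", and it still targets GSI1.

// A fresh builder per query starts from a clean slate.
const fresh = DynamoDBFactory.createQueryBuilder().where("PK = :order").build();
// fresh.keyCondition is "PK = :order", with no index set.
```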
RepositoryFactory is the higher-level factory that manages repository creation. You initialize it once with the DynamoDB configuration, and then use createRepository() to instantiate typed repositories. The factory holds the shared DynamoDB service instance and passes it to each repository it creates.
The factory pattern means Lambda handlers never construct repositories directly. They call the factory, get a repository, and use it. If the underlying DynamoDB configuration changes — region, retry policy, timeout — it changes in one place. Every repository created by the factory picks up the new configuration automatically.
The factory also exposes metrics — operation counts, latency distributions, error rates — aggregated across all repositories it has created. This gives you a single view of database health for the entire service.
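A rough sketch of the shape this could take is shown below. Only `createRepository()` is named in the text above; `initialize`, `getMetrics`, the `run` wrapper, the config fields, and the metric fields are assumptions, and the metrics are reduced to simple counters rather than real latency distributions.

```typescript
interface DynamoConfig {
  region: string;
  maxRetries: number;
}

interface Metrics {
  operations: number;
  errors: number;
  totalLatencyMs: number;
}

// Hypothetical stand-in for the shared DynamoDB service. Every operation runs
// through `run`, so counts and timings accumulate in one place.
class DynamoService {
  readonly metrics: Metrics = { operations: 0, errors: 0, totalLatencyMs: 0 };

  constructor(readonly config: DynamoConfig) {}

  async run<R>(operation: () => Promise<R>): Promise<R> {
    const started = Date.now();
    this.metrics.operations += 1;
    try {
      return await operation();
    } catch (error) {
      this.metrics.errors += 1;
      throw error;
    } finally {
      this.metrics.totalLatencyMs += Date.now() - started;
    }
  }
}

abstract class BaseRepository {
  constructor(
    protected readonly db: DynamoService,
    protected readonly tableName: string,
  ) {}
}

type RepositoryConstructor<R> = new (db: DynamoService, tableName: string) => R;

class RepositoryFactory {
  private static service: DynamoService | undefined;

  // Called once per service with the DynamoDB configuration.
  static initialize(config: DynamoConfig): void {
    RepositoryFactory.service = new DynamoService(config);
  }

  private static requireService(): DynamoService {
    if (!RepositoryFactory.service) {
      throw new Error("RepositoryFactory.initialize() must be called first");
    }
    return RepositoryFactory.service;
  }

  // Every repository receives the same shared service instance.
  static createRepository<R extends BaseRepository>(
    Repo: RepositoryConstructor<R>,
    tableName: string,
  ): R {
    return new Repo(RepositoryFactory.requireService(), tableName);
  }

  // Aggregated across all repositories, because they share one service.
  static getMetrics(): Metrics {
    return { ...RepositoryFactory.requireService().metrics };
  }
}

class InvoiceRepository extends BaseRepository {
  // A real implementation would issue a DynamoDB query inside the callback.
  async getOpenInvoiceIds(): Promise<string[]> {
    return this.db.run(async () => ["inv-1", "inv-2"]);
  }
}

class SubscriptionRepository extends BaseRepository {
  // Simulates a failing write so the error counter has something to count.
  async cancelSubscription(id: string): Promise<void> {
    await this.db.run(async () => {
      throw new Error(`subscription ${id} not found`);
    });
  }
}
```

With this shape, changing the region or retry policy means changing the single `initialize()` call, and a metrics reader does not need to know which repositories exist.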
Here is the flow in a typical Lambda handler.
The handler receives an API Gateway event. It validates the input using the request validation middleware. Then it calls the database provider to get the DynamoDB service and table name. It passes these to the repository factory, which creates a typed repository. The handler calls domain-specific methods on the repository — getUserById(), updateProfile(), getAchievementsByTier(). The repository translates these into DynamoDB operations using the query builder and production service. Results come back as typed domain objects.
The handler never imports DynamoDB clients. It never writes expression strings. It never constructs key objects. It works entirely in domain language — users, profiles, achievements, milestones — and the repository layer handles the translation to DynamoDB.
This separation means you can test the handler by mocking the repository. You can test the repository by mocking the DynamoDB service. You can test the DynamoDB service against a local DynamoDB instance. Each layer is independently testable, and each test is focused on one concern.
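As an illustration of the first of those tests, the sketch below exercises a handler against a stubbed repository. It assumes the handler receives its repository through a small factory function (`createGetUserHandler`); the article does not say how the real handlers are wired, and the event and response types are trimmed-down stand-ins for the API Gateway shapes.

```typescript
type User = { id: string; displayName: string };

// The handler depends only on this domain-level interface.
interface UserReader {
  getUserById(id: string): Promise<User | null>;
}

// Minimal stand-ins for the API Gateway event and response shapes.
interface ApiEvent {
  pathParameters?: { id?: string };
}

interface ApiResponse {
  statusCode: number;
  body: string;
}

// Hypothetical wiring: the repository is passed in, so a test can swap it out.
function createGetUserHandler(users: UserReader) {
  return async (event: ApiEvent): Promise<ApiResponse> => {
    const id = event.pathParameters?.id;
    if (!id) {
      return { statusCode: 400, body: JSON.stringify({ message: "id is required" }) };
    }
    const user = await users.getUserById(id);
    if (!user) {
      return { statusCode: 404, body: JSON.stringify({ message: "user not found" }) };
    }
    return { statusCode: 200, body: JSON.stringify(user) };
  };
}

// In a test, a plain object replaces the repository. No DynamoDB is involved.
const stubUsers: UserReader = {
  async getUserById(id) {
    return id === "u-1" ? { id, displayName: "Ada" } : null;
  },
};

const handler = createGetUserHandler(stubUsers);
```

In production the same factory function would receive a real UserRepository instead of the stub.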
When a new developer joins the team and needs to add a new Lambda function, they follow the same pattern: extend the provider if they need a new table, create a repository for their entity, use the factory to instantiate it, and write their handler in domain language. The pattern is consistent across all 34 services.
✅ Lambda handlers work in domain language: users, profiles, achievements. The repository layer handles the translation to DynamoDB. Each layer is independently testable.
A common question: why build a custom repository layer instead of using an ORM like DynamoDB Toolbox, ElectroDB, or Dynamoose?
The answer is control and fit. ORMs add abstraction, but they also add opinions. They define how you model entities, how you handle relationships, and how you construct queries. When those opinions align with your needs, ORMs are great. When they do not, you fight the framework.
TCTF uses single-table design with generic PK/SK keys, entity prefixing, and GSI overloading. This pattern does not map cleanly onto every DynamoDB ORM. Some assume one entity per table or one entity per index, and even those built with single-table design in mind, such as ElectroDB and DynamoDB Toolbox, bring their own conventions for composing keys and modeling entities. Our query builder and repository pattern are designed specifically for our single-table access patterns: they understand PK/SK prefixing, GSI overloading, and the sparse index patterns that make single-table design work.
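For readers who have not seen these patterns, here is a small illustration of prefixed keys, an overloaded GSI, and a sparse index. The entity types, prefixes, and attribute names (`GSI1PK`, `GSI1SK`, `entityType`) are assumptions chosen for the example, not TCTF's actual schema.

```typescript
// All names, prefixes, and attributes here are illustrative assumptions.
type TableItem = {
  PK: string;
  SK: string;
  entityType: string;
  GSI1PK?: string; // left out on items that should stay out of GSI1
  GSI1SK?: string;
};

function userProfileItem(userId: string, email: string): TableItem {
  return {
    PK: `USER#${userId}`,
    SK: "PROFILE",
    entityType: "User",
    // On user items, GSI1 answers "find a user by email".
    GSI1PK: `EMAIL#${email.toLowerCase()}`,
    GSI1SK: `USER#${userId}`,
  };
}

function postItem(userId: string, postId: string, createdAt: string): TableItem {
  return {
    PK: `USER#${userId}`,
    SK: `POST#${createdAt}#${postId}`,
    entityType: "Post",
    // On post items, the same GSI1 attributes answer "look up a post by id".
    GSI1PK: `POST#${postId}`,
    GSI1SK: `POST#${postId}`,
  };
}

function draftItem(userId: string, draftId: string): TableItem {
  // No GSI1 attributes: drafts never appear in the index (sparse index).
  return { PK: `USER#${userId}`, SK: `DRAFT#${draftId}`, entityType: "Draft" };
}

// Mimics GSI behaviour: only items carrying both index key attributes
// are present in the index.
function projectedIntoGsi1(items: TableItem[]): TableItem[] {
  return items.filter((item) => item.GSI1PK !== undefined && item.GSI1SK !== undefined);
}
```

All three items share the partition `USER#<id>`, so one query on the base table can fetch a user together with their posts and drafts, while GSI1 serves two unrelated lookups.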
The custom approach also gives us full control over the DynamoDB client configuration: retry policies, timeout settings, X-Ray tracing, metrics collection, and circuit breaker integration. These are production concerns that ORMs typically either do not expose or handle differently than we need.
The trade-off is that we maintain more code. But that code is tailored to our exact needs, fully typed, and consistent across all 34 services. For a platform at this scale, the control is worth the maintenance cost.
Building this pattern across 34 services taught us several lessons.
First, fresh instances matter. Early versions of the factory cached query builders for performance. This caused subtle bugs where state from one query leaked into the next. The fix was simple: always return fresh instances. The performance cost is negligible — creating a query builder is cheap. The correctness benefit is enormous.
Second, table name resolution should fail loudly. Early versions returned empty strings for missing table names. This caused confusing DynamoDB errors deep in the call stack. Now the provider throws a clear error with the missing environment variable name. Fail fast, fail clearly.
Third, the repository's public API should use domain language, not database language. A method called queryByPartitionKeyWithSortKeyBeginsWith is a database method. A method called getMessagesByConversation is a domain method. The repository's job is to translate between the two.
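The two method names from this lesson can sit in the same class hierarchy, one protected and one public. The sketch below assumes a `CONV#<id>` partition key and `MSG#<timestamp>` sort keys, and it filters an in-memory array instead of calling DynamoDB.

```typescript
type Row<T> = { PK: string; SK: string; data: T };

abstract class BaseRepository<T> {
  // Rows are held in memory so the sketch runs without DynamoDB.
  constructor(private readonly rows: Row<T>[]) {}

  // Database language: partition key equality plus begins_with on the sort
  // key, with results in ascending sort key order.
  protected async queryByPartitionKeyWithSortKeyBeginsWith(
    pk: string,
    skPrefix: string,
  ): Promise<T[]> {
    return this.rows
      .filter((row) => row.PK === pk && row.SK.startsWith(skPrefix))
      .sort((a, b) => (a.SK < b.SK ? -1 : a.SK > b.SK ? 1 : 0))
      .map((row) => row.data);
  }
}

type Message = { conversationId: string; sentAt: string; body: string };

class MessageRepository extends BaseRepository<Message> {
  // Domain language: the caller asks for messages, not for key conditions.
  async getMessagesByConversation(conversationId: string): Promise<Message[]> {
    return this.queryByPartitionKeyWithSortKeyBeginsWith(
      `CONV#${conversationId}`, // assumed key scheme
      "MSG#",
    );
  }
}
```

Only `getMessagesByConversation` is visible to handlers; the database-flavoured name never leaves the repository layer.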
Fourth, metrics at the factory level are more useful than metrics at the repository level. You want to know how the database is performing across all operations, not just within one repository. The factory aggregates metrics from every repository it creates, giving you a single dashboard for database health.
📊 Key lessons: always return fresh query builder instances, fail loudly on missing table names, use domain language in repository APIs, and aggregate metrics at the factory level.
The provider and repository pattern is not glamorous. It does not make DynamoDB faster or cheaper. What it does is make 34 microservices maintainable. Every service follows the same structure. Every Lambda handler reads the same way. Every new developer can navigate the codebase because the pattern is consistent. In a serverless architecture where you have hundreds of Lambda functions across dozens of services, that consistency is worth more than any clever optimization.