
A behind-the-scenes look at how TCTF runs 34 independent services on AWS Lambda — from event-driven communication and database-per-service to deployment pipelines that ship to production in under 8 minutes.
The Cometbid Technology Foundation platform runs on 34 independent serverless microservices. No containers. No servers to patch. No clusters to manage. Every service is a self-contained unit — its own Lambda functions, its own API Gateway, its own DynamoDB tables, its own deployment pipeline. This article explains why we chose this architecture, the patterns that make it work, and the lessons we learned building it.
Traditional microservices run on containers — Kubernetes, ECS, Docker Compose. You get service independence but you also get cluster management, container orchestration, load balancer configuration, and a team that spends half its time on infrastructure.
We chose a different path. Every TCTF service is an AWS CDK project — a self-contained TypeScript application that defines its own Lambda functions, API Gateway routes, DynamoDB tables, SQS queues, and IAM roles. The CDK synthesizes CloudFormation and deploys directly. No containers. No servers. No clusters.
Each service lives in its own directory under a monorepo: cdk-auth-core, cdk-billing, cdk-social-network, cdk-messaging-consumers, and so on. Each has its own CDK stack, its own Lambda handlers, its own OpenAPI spec, and its own CI/CD pipeline. A change to billing never triggers a build of authentication.
⚡ Every service is a CDK project. Lambda functions, API Gateway routes, DynamoDB tables, queues, and permissions: all defined in TypeScript, all deployed independently.

Each of our 34 services owns a single business domain. Authentication is six services (core, MFA, OAuth, password, admin, user) because auth is complex enough to warrant separation. Messaging is three services (consumers, communication, user-message) because sending a campaign email is fundamentally different from delivering a real-time chat message.
The rule is simple: if two capabilities change for different reasons, they belong in different services. Authentication changes when security requirements evolve. Billing changes when payment providers update their APIs. Social networking changes when users want new engagement features. None of these should force a deployment of the others.
That separation extends to tooling. Each service has its own package.json and its own test suite, and its GitHub Actions pipeline runs only when that service changes.
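To make the build isolation concrete, here is a minimal sketch of path-based build selection. GitHub Actions workflows commonly achieve this with path filters on their triggers; the article does not say which mechanism TCTF uses, so the function below only illustrates the idea. The function name `servicesToBuild` and the assumption that every service is a top-level directory starting with `cdk-` are hypothetical.

```typescript
// Hypothetical helper: given the files changed in a commit, return the
// service directories whose pipelines should run.
// Assumption: each service is a top-level directory named "cdk-<something>".
function servicesToBuild(changedFiles: string[]): string[] {
  const services = new Set<string>();
  for (const file of changedFiles) {
    const slash = file.indexOf("/");
    if (slash <= 0) continue; // file sits at the repository root
    const topLevelDir = file.slice(0, slash);
    if (topLevelDir.startsWith("cdk-")) {
      services.add(topLevelDir);
    }
  }
  return Array.from(services).sort();
}
```

With this rule, a commit that touches only files under cdk-billing selects cdk-billing and nothing else. How changes to shared code should fan out to dependent services is a separate decision that this sketch does not address.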
Services do not call each other directly to trigger work. When a user signs up, the user-management service does not call the messaging service to send a welcome email. Instead, it publishes a USER_CREATED event to SNS. The messaging service subscribes to that topic and handles the email independently.
This pattern — publish events, let subscribers react — is the backbone of the platform. It reduces coupling, improves resilience, and makes the system easier to extend. Adding a new reaction to a signup event means adding a new subscriber, not modifying the signup code.
We use SQS for task queues (campaign delivery, failed message retry), SNS for fan-out notifications (one event, many subscribers), and EventBridge for complex routing rules. Dead-letter queues catch failures. CloudWatch alarms trigger when queues back up.
📨 Design events as immutable facts, not commands. A USER_CREATED event is a fact. A SEND_WELCOME_EMAIL is a command. Facts are replayable. Commands are not.
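The publish-and-react pattern can be sketched without any AWS dependency. The `Topic` class below is an in-memory stand-in for an SNS topic, not the real service, and the `FactEvent` shape (`type`, `occurredAt`, `payload`) is an assumed envelope rather than TCTF's actual schema. It shows two things: events are frozen once published, and a new reaction is added by subscribing, without touching the publisher.

```typescript
// A fact event describes something that has already happened.
interface FactEvent<T extends object> {
  readonly type: string;       // e.g. "USER_CREATED"
  readonly occurredAt: string; // ISO-8601 timestamp
  readonly payload: Readonly<T>;
}

type Subscriber<T extends object> = (event: FactEvent<T>) => void;

// In-memory stand-in for an SNS topic (illustration only).
class Topic<T extends object> {
  private readonly subscribers: Subscriber<T>[] = [];

  subscribe(subscriber: Subscriber<T>): void {
    this.subscribers.push(subscriber);
  }

  publish(type: string, payload: T, occurredAt: string): FactEvent<T> {
    // Freeze a shallow copy so no subscriber can alter the fact for the others.
    const event: FactEvent<T> = Object.freeze({
      type,
      occurredAt,
      payload: Object.freeze({ ...payload }),
    });
    for (const subscriber of this.subscribers) {
      subscriber(event);
    }
    return event;
  }
}

// Usage: two independent reactions to one signup event.
const userEvents = new Topic<{ userId: string; email: string }>();
const welcomeEmails: string[] = [];
const auditTrail: string[] = [];

userEvents.subscribe((event) => {
  welcomeEmails.push(event.payload.email);
});
userEvents.subscribe((event) => {
  auditTrail.push(`${event.type}:${event.payload.userId}`);
});

const published = userEvents.publish(
  "USER_CREATED",
  { userId: "u-1", email: "ada@example.com" },
  "2026-01-01T00:00:00Z",
);
```

The publisher never names its consumers. Adding the audit subscriber required no change to the code that publishes the signup event.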
Every service owns its data. The social-network service has its own DynamoDB tables. The billing service has its own. The achievement-engine has its own. No service reads another service's tables. If billing needs a user's name, it gets it from an event payload or calls the user-management API.
Why DynamoDB? Three reasons. First, single-digit millisecond latency for key-based reads and writes, whether you have 10 users or 10 million. Second, very little operational overhead: no connection pools to tune, no vacuuming, no replica lag to manage. Third, on-demand (pay-per-request) pricing, where request charges drop to zero when a table is idle, which matches the serverless model.
Within each service, we use single-table design with generic partition keys (PK) and sort keys (SK). A single table stores users, sessions, settings, and activity records, differentiated by key prefixes. Global secondary indexes (GSIs) enable access patterns that would require separate tables in a traditional design. This approach minimizes the number of tables to manage while still serving every access pattern the service needs.
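As a rough illustration of key prefixes, the sketch below builds PK/SK pairs for three item types that share one partition. The prefixes (`USER#`, `SESSION#`, `ACTIVITY#`, `PROFILE`) are hypothetical, since the article does not list the real ones. The `queryBySortKeyPrefix` function is an in-memory stand-in for a DynamoDB Query that matches one partition key and uses `begins_with` on the sort key; it does not call DynamoDB.

```typescript
interface Key {
  PK: string;
  SK: string;
}

// Hypothetical key builders: every item for a user shares the same partition.
const profileKey = (userId: string): Key => ({
  PK: `USER#${userId}`,
  SK: "PROFILE",
});

const sessionKey = (userId: string, sessionId: string): Key => ({
  PK: `USER#${userId}`,
  SK: `SESSION#${sessionId}`,
});

const activityKey = (userId: string, isoTime: string): Key => ({
  PK: `USER#${userId}`,
  SK: `ACTIVITY#${isoTime}`,
});

// In-memory stand-in for: PK = :pk AND begins_with(SK, :prefix).
// Results are returned in ascending sort-key order, as a Query does by default.
function queryBySortKeyPrefix<T extends Key>(
  items: T[],
  pk: string,
  skPrefix: string,
): T[] {
  return items
    .filter((item) => item.PK === pk && item.SK.startsWith(skPrefix))
    .sort((a, b) => (a.SK < b.SK ? -1 : a.SK > b.SK ? 1 : 0));
}
```

Because sessions and activity records live under the same partition key as the profile, "all sessions for user 42" is a single query on `USER#42` with the prefix `SESSION#`. Using ISO timestamps in the activity sort key also keeps those records in chronological order.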
For services that need graph queries — like social connections and network traversal — we use Amazon Neptune. For full-text search across posts and profiles, we use Amazon OpenSearch. But DynamoDB is the primary data store for every service, and single-table design is the pattern that makes it work at scale.
🗄️ Single-table design with PK/SK keys lets one DynamoDB table serve dozens of access patterns. Fewer tables, lower cost, faster queries.
Every service deploys independently through its own GitHub Actions pipeline. A developer merges to the dev branch. The pipeline runs lint, unit tests, and integration tests. If everything passes, CDK synthesizes the CloudFormation template and deploys directly to the dev environment. Promotion to staging and production follows the same pattern with additional gates.
The entire flow — from merge to production — takes under 8 minutes for most services. There is no monolithic deployment. There is no release train. Each service ships when it is ready.
Infrastructure is defined entirely in AWS CDK (TypeScript). Every Lambda function, API Gateway route, DynamoDB table, SQS queue, and IAM role is code. Nothing is configured manually in the AWS console. If it is not in CDK, it does not exist.
Starting in September 2026, all services will use canary deployments: progressive rollouts with automatic rollback based on CloudWatch alarm thresholds. A bad deployment reaches at most 10% of traffic for up to 5 minutes before it is automatically rolled back.
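The rollback rule can be sketched as a pure function. This is only a model of the decision, not a deployment tool: the type names, the error-rate threshold, and the per-minute sample shape are all assumptions made for illustration. In a real rollout the equivalent check would come from CloudWatch alarms rather than from an array of samples.

```typescript
interface CanaryConfig {
  canaryPercent: number; // share of traffic sent to the new version, e.g. 10
  bakeMinutes: number;   // how long the canary is observed, e.g. 5
  maxErrorRate: number;  // assumed alarm threshold, e.g. 0.01 for 1%
}

interface MinuteSample {
  minute: number;         // minutes since the canary started (1-based)
  canaryRequests: number; // requests served by the new version
  canaryErrors: number;   // errors returned by the new version
}

type CanaryOutcome =
  | { action: "promote" }
  | { action: "rollback"; atMinute: number; errorRate: number };

// Samples are assumed to be in chronological order.
function evaluateCanary(
  config: CanaryConfig,
  samples: MinuteSample[],
): CanaryOutcome {
  for (const sample of samples) {
    if (sample.minute > config.bakeMinutes) continue; // outside the bake window
    const errorRate =
      sample.canaryRequests === 0
        ? 0
        : sample.canaryErrors / sample.canaryRequests;
    if (errorRate > config.maxErrorRate) {
      return { action: "rollback", atMinute: sample.minute, errorRate };
    }
  }
  // No threshold breach during the bake window: shift all traffic.
  return { action: "promote" };
}
```

The key property is that the blast radius is bounded twice: by `canaryPercent` in traffic and by `bakeMinutes` in time.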
🚀 If it is not in CDK, it does not exist. Infrastructure as code is not a best practice; it is the only practice.
Every Lambda function logs structured JSON with a correlation ID that traces a request across services. If a user reports a problem, we search by correlation ID and see the entire request path — from API Gateway through Lambda to DynamoDB to SQS to the downstream service.
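Here is a dependency-free sketch of the correlation ID idea. The header name `x-correlation-id`, the function names, and the log field names are assumptions, not TCTF's actual conventions. The point is that a service reuses an incoming ID when one exists, generates one only at the edge, and attaches it to every log line.

```typescript
type Headers = Record<string, string | undefined>;

// Assumed header name; the article does not specify one.
const CORRELATION_HEADER = "x-correlation-id";

// Reuse the caller's correlation ID if present, otherwise create a new one.
// Header names are compared case-insensitively.
function resolveCorrelationId(headers: Headers, newId: () => string): string {
  for (const name of Object.keys(headers)) {
    const value = headers[name];
    if (name.toLowerCase() === CORRELATION_HEADER && value) {
      return value;
    }
  }
  return newId();
}

// Build one structured JSON log line. Fixed fields are written last so that
// extra fields cannot overwrite them.
function logLine(
  service: string,
  correlationId: string,
  message: string,
  extra: Record<string, unknown> = {},
): string {
  return JSON.stringify({
    ...extra,
    level: "INFO",
    service,
    correlationId,
    message,
  });
}
```

A handler would call `resolveCorrelationId` once per request, pass the result to every `logLine` call, and include it in any event or queue message it emits so the next service can pick it up.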
We use Powertools for AWS Lambda (TypeScript) for structured logging, custom metrics, and distributed tracing. Every service emits business metrics: not just technical metrics like latency and error rate, but domain metrics like campaigns sent, messages delivered, and signups completed.
CloudWatch dashboards give us real-time visibility. Alarms trigger on error rate spikes, latency degradation, and queue depth anomalies. The goal is to know about a problem before any user reports it.
Cold starts matter less than you think. With Node.js 24 and provisioned concurrency on critical paths (authentication, payment processing), cold starts are a non-issue for our use case.
Service boundaries matter more than you think. We split authentication into six services early. That decision has paid for itself every month — MFA changes never risk breaking OAuth, password resets never touch session management.
Event schemas need versioning from day one. We learned this the hard way when a schema change in user-management broke three downstream consumers. Now every event has a version field and consumers handle unknown versions gracefully.
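A minimal sketch of a version-aware consumer follows. The payload shapes are invented for illustration: version 1 is assumed to carry `email` and version 2 is assumed to rename it to `emailAddress`. Neither is taken from TCTF's real schemas. What matters is the default branch, which skips an unknown version instead of throwing.

```typescript
interface VersionedEvent {
  type: string;
  version: number;
  payload: Record<string, unknown>;
}

type HandleResult =
  | { status: "handled"; email: string }
  | { status: "skipped"; reason: string };

// Read a field only if it is actually a string.
function readString(
  payload: Record<string, unknown>,
  field: string,
): string | undefined {
  const value = payload[field];
  return typeof value === "string" ? value : undefined;
}

function handleUserCreated(event: VersionedEvent): HandleResult {
  let email: string | undefined;
  switch (event.version) {
    case 1: // hypothetical v1 shape: { email }
      email = readString(event.payload, "email");
      break;
    case 2: // hypothetical v2 shape: { emailAddress }
      email = readString(event.payload, "emailAddress");
      break;
    default:
      // Unknown version: do not guess at the shape and do not crash.
      return {
        status: "skipped",
        reason: `unsupported version ${event.version}`,
      };
  }
  return email !== undefined
    ? { status: "handled", email }
    : { status: "skipped", reason: "missing email field" };
}
```

Returning a "skipped" result rather than throwing lets the caller decide what to do with the message, for example logging it or routing it to a dead-letter queue for later inspection.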
Monorepo structure helps. All 34 services live in one repository with shared utilities (tctf-utils). This makes cross-service refactoring possible and keeps dependency versions aligned. The trade-off is a larger repository, but the benefits far outweigh the cost.

Thirty-four services sounds like a lot. It is. But each one is small, focused, and independently deployable. That is the point. When the billing team ships a fix, the social network keeps running. When the messaging system handles a campaign spike, authentication is unaffected. Serverless microservices give us the freedom to move fast without breaking each other's work. That is the architecture we bet on, and so far, it is paying off.