We use serverless for specific patterns, not as a default. This post covers the patterns where it shines, the ones where it doesn't, and the gotchas at production scale.

Serverless Architecture: When It Fits, When It Doesn't

We've used serverless heavily for some workloads (event handlers, scheduled jobs, async processing) and avoided it for others (long-running APIs, latency-sensitive services). After a few years, this is our working model: serverless is a tool with specific strengths, not a default architectural choice. This post covers when we reach for it and when we don't.

What we mean by serverless #

We mean the functions-as-a-service (FaaS) flavor: AWS Lambda, GCP Cloud Functions, Azure Functions, Cloudflare Workers. The defining characteristics:

Per-invocation billing
Auto-scaling from 0 to many
No persistent server / process lifetime
Stateless (state lives in databases, queues, or external storage)

This post is mostly about Lambda since that's where our experience is, but the patterns apply to other FaaS platforms.

Where serverless wins #

Specific patterns where serverless is clearly the right answer:

Event-triggered handlers. S3 upload triggers a Lambda that processes the file. SQS messages trigger Lambdas that process them. EventBridge events trigger workflow steps. The "respond to event" pattern fits Lambda's invocation model perfectly.

Scheduled jobs. Cron jobs run on Lambda via EventBridge schedules. Cheaper than running a small server 24/7 to do work every 5 minutes. We have ~30 such jobs.

Async pipelines with bursty load. A document processing pipeline that handles 10 docs/minute most of the time and 1000 docs/minute occasionally. Lambda scales seamlessly; a fixed-capacity service would either over-provision or hit limits.

Glue between AWS services. Tiny pieces of code that bridge two services. "When a row appears in DynamoDB, write a copy to OpenSearch." A 30-line Lambda is the right tool.

Webhooks. Receive a webhook, validate, queue for downstream processing. Webhooks are bursty and event-shaped — Lambda is a natural fit.

For all of these, per-invocation billing means you pay for actual work, not for capacity. At low volume that's "essentially free." At high volume, check the bill to confirm it's still cheaper than the equivalent always-on infrastructure.

Where serverless gets expensive or awkward #

Patterns where we don't use serverless:

High-throughput sustained APIs. A service handling 1000+ req/s sustained: the per-invocation cost crosses over and ECS/EKS becomes cheaper. Plus the cold-start latency variance hurts user experience. We use Lambda for APIs only at low and unpredictable traffic.

Long-running tasks. Lambda has a 15-minute max. If your job might take longer, you need to chunk it (often awkward) or use a different platform. Step Functions help orchestrate long-running work but add complexity.

Latency-sensitive interactive services. Cold starts hurt p99 latency. Provisioned concurrency helps but partially defeats the cost benefit. For sub-100ms p99 requirements, traditional servers are easier.

Workloads with persistent connections. WebSocket servers, gRPC streaming. Lambda has limited support; the connection lifecycle doesn't fit Lambda's invocation model. Use ECS/EKS.

Workloads with heavy local state. A service that loads a 5GB ML model: cold-start is brutal, every Lambda execution environment loads the model from scratch. Better to keep the model in a long-running pod.

Heavy stateful operations. Workflow engines, real-time collaboration servers. Lambda is the wrong tool.

The cost crossover point #

A back-of-envelope rule we use:

For sustained traffic, Lambda is roughly cost-comparable to ECS Fargate at ~5-10 req/s per function. Below that, Lambda is cheaper. Above that, ECS Fargate is cheaper. EKS on EC2 spot is cheaper than both above ~20 req/s sustained.
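
To make the rule above easier to re-derive for your own workload, here is a rough sketch of the arithmetic. Every price constant is an assumption (approximate us-east-1 list prices that may be out of date), and the function names are ours for illustration. It ignores the free tier, API Gateway or load balancer charges, and data transfer, and it assumes the fixed set of Fargate tasks can actually absorb the load.

```python
# Assumed prices (approximate us-east-1 list prices; check the current pricing pages).
LAMBDA_PRICE_PER_MILLION_REQUESTS = 0.20
LAMBDA_PRICE_PER_GB_SECOND = 0.0000166667
FARGATE_PRICE_PER_VCPU_HOUR = 0.04048
FARGATE_PRICE_PER_GB_HOUR = 0.004445

SECONDS_PER_MONTH = 30 * 24 * 3600
HOURS_PER_MONTH = 30 * 24


def lambda_monthly_cost(req_per_sec: float, avg_duration_s: float, memory_gb: float) -> float:
    """Request charge plus compute charge for a steady request rate."""
    requests = req_per_sec * SECONDS_PER_MONTH
    request_cost = requests / 1_000_000 * LAMBDA_PRICE_PER_MILLION_REQUESTS
    compute_cost = requests * avg_duration_s * memory_gb * LAMBDA_PRICE_PER_GB_SECOND
    return request_cost + compute_cost


def fargate_monthly_cost(tasks: int, vcpu: float, memory_gb: float) -> float:
    """Always-on cost of a fixed number of identically sized tasks."""
    hourly = vcpu * FARGATE_PRICE_PER_VCPU_HOUR + memory_gb * FARGATE_PRICE_PER_GB_HOUR
    return tasks * hourly * HOURS_PER_MONTH


def breakeven_req_per_sec(avg_duration_s: float, lambda_memory_gb: float,
                          tasks: int, vcpu: float, task_memory_gb: float) -> float:
    """Request rate at which the two monthly bills are equal.

    Lambda cost is linear in request rate, so the break-even is a simple ratio.
    """
    cost_per_rps = lambda_monthly_cost(1.0, avg_duration_s, lambda_memory_gb)
    return fargate_monthly_cost(tasks, vcpu, task_memory_gb) / cost_per_rps
```

The result is very sensitive to average duration and memory size, so plug in measured values rather than guesses before drawing conclusions.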

Plus operational considerations: at low volume, Lambda's "no servers to manage" is real value. At high volume, you have a fleet to manage either way.

Specific numbers from our bill:

Lambdas processing < 5 req/s: ~$5-50/month each, very low operational overhead.
Services on ECS Fargate at 50+ req/s sustained: ~$80-200/month + minimal operational overhead.
Services on EKS spot at 200+ req/s sustained: ~$50-150/month per service + some operational overhead.

The "operational overhead" piece matters. Lambda is the lowest-toil option. EKS is the most flexible but requires platform engineering.

Common patterns we use #

Event-driven processing pipeline #

S3 upload → S3 event → SQS → Lambda → DynamoDB

An S3 event notification puts a message on SQS, SQS triggers the Lambda, and the Lambda processes the file and writes the results. SQS in the middle gives us:

Buffering during burst load
Retry on failure (with DLQ for dead letters)
Visibility into queue depth as a backpressure signal

Cost at our scale (~5,000 uploads/hr): ~$30/month. Trivial.
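
A minimal sketch of what the Lambda in this pipeline might look like, not our production code. It assumes the SQS message body is the raw S3 event notification and that `ReportBatchItemFailures` is enabled on the event source mapping, so only the failed messages are retried. `process_object` is a hypothetical placeholder for the real work.

```python
import json
from urllib.parse import unquote_plus


def process_object(bucket: str, key: str) -> None:
    """Hypothetical placeholder: parse the file and write results to DynamoDB."""
    if not bucket or not key:
        raise ValueError("missing bucket or key")


def handler(event, context):
    """Process each SQS message; report only the failed ones back to SQS."""
    failures = []
    for record in event.get("Records", []):
        try:
            # Assumption: the SQS body is the S3 event notification JSON.
            s3_event = json.loads(record["body"])
            for s3_record in s3_event.get("Records", []):
                bucket = s3_record["s3"]["bucket"]["name"]
                # Object keys arrive URL-encoded in S3 notifications.
                key = unquote_plus(s3_record["s3"]["object"]["key"])
                process_object(bucket, key)
        except Exception:
            # Failed messages return to the queue and eventually reach the DLQ.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}
```

Reporting per-message failures means one bad file does not force the whole batch to be reprocessed.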

Webhook receiver #

HTTP webhook → API Gateway → Lambda (validate, queue) → SQS → Lambda (process)

Two Lambdas: one validates and acks fast (sub-100ms), the other does the actual work asynchronously. Why split: webhook senders care about response time. The processing might be slow; we don't want them retrying because of our slow response.
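
Here is a sketch of the first Lambda under some assumptions: the sender signs the raw body with HMAC-SHA256, the signature arrives hex-encoded in a header we call `X-Signature` (hypothetical; every sender names this differently), and the body is not base64-encoded. `enqueue` is injected so the example runs without AWS; in practice it would wrap an SQS send.

```python
import hashlib
import hmac
import json


def verify_signature(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Constant-time comparison of the expected and received HMAC-SHA256."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected.encode("utf-8"), signature_hex.encode("utf-8"))


def make_receiver(secret: bytes, enqueue):
    """Build a handler that validates, queues, and acknowledges quickly.

    `enqueue` is a hypothetical callable taking the raw body as a string.
    """
    def handler(event, context):
        body = event.get("body") or ""
        # Header casing differs between API Gateway flavors, so normalize it.
        headers = {k.lower(): v for k, v in (event.get("headers") or {}).items()}
        signature = headers.get("x-signature", "")
        if not verify_signature(secret, body.encode("utf-8"), signature):
            return {"statusCode": 401, "body": json.dumps({"error": "invalid signature"})}
        enqueue(body)  # the slow work happens later, in the second Lambda
        return {"statusCode": 202, "body": json.dumps({"status": "queued"})}

    return handler
```

Returning 202 as soon as the message is queued keeps the sender's view of our latency independent of how long processing takes.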

Scheduled job #

EventBridge schedule (cron) → Lambda → various AWS services

This replaces what would otherwise be a cron job on a long-running server. It's cheaper, and there's no server to manage. Most of our admin / batch jobs run this way.

Glue function #

DynamoDB Stream → Lambda → OpenSearch (denormalize)

Real-time cross-system sync. Stream events into OpenSearch as they happen. ~$5/month for the Lambda; would be a meaningful project to build with custom infrastructure.
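
As an illustration of how small this glue can be, here is a sketch that turns DynamoDB stream records into OpenSearch bulk actions. It assumes the stream view type includes new images, handles only the common attribute types, and leaves out the HTTP call to the bulk endpoint. The function names and the way the document ID is built from the key attributes are our assumptions.

```python
def from_attr(value):
    """Convert a DynamoDB-typed attribute ({"S": "x"}, {"N": "1"}, ...) to plain Python."""
    type_tag, inner = next(iter(value.items()))
    if type_tag == "S":
        return inner
    if type_tag == "N":
        try:
            return int(inner)
        except ValueError:
            return float(inner)
    if type_tag == "BOOL":
        return inner
    if type_tag == "NULL":
        return None
    if type_tag == "L":
        return [from_attr(v) for v in inner]
    if type_tag == "M":
        return {k: from_attr(v) for k, v in inner.items()}
    raise ValueError(f"unsupported attribute type: {type_tag}")


def to_bulk_actions(event, index: str):
    """Map stream records to a flat list of bulk action and document dicts."""
    actions = []
    for record in event.get("Records", []):
        data = record["dynamodb"]
        # Deterministic ID from the key attributes, so replays overwrite
        # the same document instead of creating duplicates.
        doc_id = "|".join(str(from_attr(v)) for _, v in sorted(data["Keys"].items()))
        if record["eventName"] == "REMOVE":
            actions.append({"delete": {"_index": index, "_id": doc_id}})
        else:  # INSERT or MODIFY
            actions.append({"index": {"_index": index, "_id": doc_id}})
            actions.append({k: from_attr(v) for k, v in data["NewImage"].items()})
    return actions
```

Each dict would be serialized as one line of newline-delimited JSON in the bulk request.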

Specific Lambda gotchas #

A few things that bite teams new to Lambda:

Concurrency limits. The account-wide default is 1000 concurrent executions per region. A burst that exceeds this gets throttled (TooManyRequestsException). For burst-heavy workloads, request a limit increase or use SQS to smooth the burst.
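
A quick way to sanity-check a workload against the limit is the usual estimate that concurrency is roughly request rate times average duration. The sketch below is only that arithmetic; the function names are ours, and the 1000 default is taken from the paragraph above.

```python
import math


def required_concurrency(req_per_sec: float, avg_duration_s: float) -> int:
    """Estimate steady-state concurrent executions: rate x average duration."""
    return math.ceil(req_per_sec * avg_duration_s)


def exceeds_limit(req_per_sec: float, avg_duration_s: float, limit: int = 1000) -> bool:
    """True if the estimate is above the account concurrency limit."""
    return required_concurrency(req_per_sec, avg_duration_s) > limit
```

This is a steady-state estimate. A batch that launches thousands of invocations at the same instant can hit the limit even when the average rate looks comfortable.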

Cold-start variability. Cold starts depend on runtime, package size, and module-level setup. A small Go function might cold-start in 100ms; a Node function with 50MB of dependencies might take 3 seconds. Provisioned concurrency removes the variance for critical functions.

Timeout handling. Default Lambda timeout is 3 seconds. Many users miss this and wonder why their Lambda mysteriously dies. Set timeout based on the actual work; don't accept the default.
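
Beyond raising the timeout, a handler can watch the clock itself. The sketch below uses the context object's `get_remaining_time_in_millis()` method to stop before the deadline and hand back unfinished work. The `items` event shape, `process_item`, and the 2-second margin are assumptions for illustration.

```python
SAFETY_MARGIN_MS = 2000  # assumed buffer; tune to how long one item can take


def process_item(item):
    """Hypothetical placeholder for the real per-item work."""
    return item * 2


def handler(event, context):
    """Process items until the remaining time drops to the safety margin."""
    items = list(event.get("items", []))
    processed = []
    while items and context.get_remaining_time_in_millis() > SAFETY_MARGIN_MS:
        processed.append(process_item(items.pop(0)))
    # Anything left over can be re-queued by the caller instead of dying mid-item.
    return {"processed": processed, "remaining": items}
```

Stopping early turns a hard timeout, where in-flight work is lost silently, into an explicit "not finished" result the caller can act on.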

Idempotency. Lambda can invoke your handler more than once for the same event: asynchronous invocations and queue or stream sources are retried, so delivery is at-least-once. Code defensively: assume the same event might be processed twice, and dedupe at the destination via idempotency keys.
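
A minimal sketch of the dedupe pattern. The in-memory store is a stand-in so the example runs locally; across execution environments you would need a shared store, for example a conditional write to a database table. All class and function names here are hypothetical.

```python
import hashlib
import json


class InMemoryIdempotencyStore:
    """Stand-in for a shared store with an atomic 'insert if absent' operation."""

    def __init__(self):
        self._seen = set()

    def claim(self, key: str) -> bool:
        """Return True if this caller is the first to claim the key."""
        if key in self._seen:
            return False
        self._seen.add(key)
        return True

    def release(self, key: str) -> None:
        self._seen.discard(key)


def idempotency_key(event: dict) -> str:
    """Use the event's own ID if present, else a hash of its content."""
    if "id" in event:
        return str(event["id"])
    payload = json.dumps(event, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def handle_once(event: dict, store, side_effect) -> str:
    """Run side_effect at most once per key; release the claim if it fails."""
    key = idempotency_key(event)
    if not store.claim(key):
        return "duplicate"
    try:
        side_effect(event)
    except Exception:
        store.release(key)  # let a retry attempt the work again
        raise
    return "processed"
```

Releasing the claim on failure keeps retries useful. Without it, a crash after the claim would turn a transient error into a permanently skipped event.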

State across invocations. Module-level variables persist across warm invocations within the same execution environment. This is a feature (cache DB clients, etc.) but can be a bug if you accidentally rely on per-invocation isolation.
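
The sketch below shows both sides of this. The expensive client lives at module level and is created once per execution environment, while per-request state is created inside the handler. The dict standing in for a client and the `item` event field are placeholders.

```python
import time

_client = None      # survives across warm invocations in the same environment
_init_count = 0     # counts how often the expensive setup actually ran


def get_client():
    """Create the client lazily, then reuse it on every warm invocation."""
    global _client, _init_count
    if _client is None:
        _init_count += 1
        _client = {"created_at": time.time()}  # placeholder for a real DB client
    return _client


def handler(event, context):
    get_client()
    # Per-invocation state belongs inside the handler. A module-level list
    # here would leak items from one request into the next.
    request_items = [event.get("item")]
    return {"client_inits": _init_count, "items": request_items}
```

Calling the handler twice in one process reuses the same client, and each call still gets its own `request_items` list.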

Specific incidents we've had #

Production issues we've hit:

Lambda invocation cost surprise. A new feature got more traffic than expected. The Lambda's cost jumped from $50/month to $1500/month. We had no per-function budget alerts. Now we have CloudWatch alarms on Lambda invocation cost per function.

Concurrency hit. A batch job triggered 2000 Lambdas simultaneously. Hit the 1000 concurrency limit; half failed. Fix: SQS in front to smooth the burst, plus reserved concurrency on the source Lambda.

Data egress charges. A Lambda was downloading large objects from a different region. Cross-region transfer at $0.02/GB added up. Moved the Lambda to the same region as the data source.

Long cold starts on a customer-facing API. A Node Lambda with 80MB of dependencies had 4-second cold starts. Customer-facing latency was bad. Switched to esbuild bundling (6MB final), cold start dropped to 800ms. Provisioned concurrency for the remaining latency-sensitive paths.

Beyond Lambda: serverless containers #

Serverless containers (AWS Fargate, Google Cloud Run, Azure Container Apps) are the next step up:

Pay-for-use billing, though usually metered on vCPU and memory time rather than per invocation
Run any container (no language limits)
Longer max duration (sometimes effectively unlimited)
Slower to start than Lambda but more flexible

We use Fargate for some containerized workloads where Lambda's constraints (15-min limit, language support, package size) don't fit. At low or spiky volume the cost is higher than Lambda for the same throughput; the flexibility is worth it for the right workloads.

What we don't bother with #

A few serverless patterns we've abandoned:

Lambda layers for shared dependencies. They sounded good; in practice, they complicate deploy pipelines and lock you to specific Lambda runtimes. We bundle dependencies into each function instead.

Step Functions for everything. Step Functions are great for clear state-machine workflows. We use them for our document processing pipeline. But for general orchestration, they're heavyweight and the visual editor doesn't really help debugging.

API Gateway in front of every Lambda. API Gateway adds latency and cost. For internal-only or simple webhook receivers, Lambda Function URLs (cheaper, no API Gateway features) are better.

Lambda for absolutely everything. "Serverless-first" as a philosophy creates teams that rebuild things badly because of Lambda's constraints. Use Lambda where it fits; use other tools where they fit.

What I'd tell a team starting #

Default to Lambda for event handlers, scheduled jobs, and glue. It's the right tool. Per-invocation billing wins at low volume.

Don't default to Lambda for sustained APIs. ECS/EKS is usually cheaper and gives more flexibility above the crossover point.

Watch for cold-start sensitivity. If p99 latency matters, plan for it (provisioned concurrency, smaller packages, faster runtimes).

Set budget alerts per function. Cost surprises are the #1 reason teams sour on serverless.

Idempotency from day one. Assume invocations can repeat. Design destinations to dedupe.

Right-size memory empirically. The 128 MB default is rarely optimal. The open-source AWS Lambda Power Tuning tool finds the cost-vs-latency sweet spot.
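
The trade-off behind memory tuning is simple arithmetic: more memory costs more per second but often shortens the run. The sketch below picks the cheapest setting from a set of measured durations, optionally under a latency cap. It illustrates the idea rather than how any particular tool works; the price constant is an assumed list price, and the measurements must come from your own function.

```python
PRICE_PER_GB_SECOND = 0.0000166667  # assumption: approximate x86 list price


def cost_per_invocation(memory_mb: int, duration_ms: float) -> float:
    """Compute charge for one invocation at the given memory and duration."""
    return (memory_mb / 1024) * (duration_ms / 1000) * PRICE_PER_GB_SECOND


def cheapest_config(measurements: dict, max_duration_ms: float = None) -> int:
    """Return the memory size (MB) with the lowest cost per invocation.

    `measurements` maps memory in MB to measured average duration in ms.
    Settings slower than `max_duration_ms` are excluded when a cap is given.
    """
    candidates = {
        mem: dur for mem, dur in measurements.items()
        if max_duration_ms is None or dur <= max_duration_ms
    }
    if not candidates:
        raise ValueError("no memory setting meets the latency cap")
    return min(candidates, key=lambda mem: cost_per_invocation(mem, candidates[mem]))
```

The latency cap matters because the cheapest setting is often not the fastest, and a customer-facing function may justify paying for the faster one.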

Serverless is a great tool for specific patterns. The mistake is treating it as a default architectural choice. The teams that get the most out of it are the ones who use it where it fits and use other tools where they fit better. The "serverless-first" or "all-in on Lambda" framings tend to lead to architectural distortions; "right tool for the job" leads to better outcomes.

About Kiril Urbonas