We use serverless for specific patterns, not as a default. The patterns where it shines, the ones where it doesn't, and the gotchas at production scale.
We've used serverless heavily for some workloads (event handlers, scheduled jobs, async processing) and avoided it for others (long-running APIs, latency-sensitive services). After a few years, this is the working model: serverless is a tool with specific strengths, not a default architectural choice. This post covers when we reach for it and when we don't.
By "serverless" we mean the functions-as-a-service flavor: AWS Lambda, GCP Cloud Functions, Azure Functions, Cloudflare Workers. The defining characteristics:
This post is mostly about Lambda since that's where our experience is, but the patterns apply to other FaaS platforms.
Specific patterns where serverless is clearly the right answer:
Event-triggered handlers. S3 upload triggers a Lambda that processes the file. SQS messages trigger Lambdas that process them. EventBridge events trigger workflow steps. The "respond to event" pattern fits Lambda's invocation model perfectly.
Scheduled jobs. Cron jobs run on Lambda via EventBridge schedules. Cheaper than running a small server 24/7 to do work every 5 minutes. We have ~30 such jobs.
Async pipelines with bursty load. A document processing pipeline that handles 10 docs/minute most of the time and 1000 docs/minute occasionally. Lambda scales seamlessly; a fixed-capacity service would either over-provision or hit limits.
Glue between AWS services. Tiny pieces of code that bridge two services. "When a row appears in DynamoDB, write a copy to OpenSearch." A 30-line Lambda is the right tool.
Webhooks. Receive a webhook, validate, queue for downstream processing. Webhooks are bursty and event-shaped, so Lambda is a natural fit.
For all these, the per-invocation billing means you pay for actual work, not for capacity. At low volume that's "essentially free." At high volume, run the numbers to confirm it's still cheaper than the equivalent always-on infrastructure.
Patterns where we don't use serverless:
High-throughput sustained APIs. A service handling 1000+ req/s sustained: the per-invocation cost crosses over and ECS/EKS becomes cheaper. Plus the cold-start latency variance hurts user experience. We use Lambda for APIs only at low and unpredictable traffic.
Long-running tasks. Lambda has a 15-minute maximum execution time per invocation. If your job might take longer, you need to chunk it (often awkward) or use a different platform. Step Functions help orchestrate long-running work but add complexity.
Latency-sensitive interactive services. Cold starts hurt p99 latency. Provisioned concurrency helps but partially defeats the cost benefit. For sub-100ms p99 requirements, traditional servers are easier.
Workloads with persistent connections. WebSocket servers, gRPC streaming. Lambda has limited support; the connection lifecycle doesn't fit Lambda's invocation model. Use ECS/EKS.
Workloads with heavy local state. A service that loads a 5GB ML model: cold-start is brutal, every Lambda execution environment loads the model from scratch. Better to keep the model in a long-running pod.
Heavy stateful operations. Workflow engines, real-time collaboration servers. Lambda is the wrong tool.
A back-of-envelope rule we use:
For sustained traffic, Lambda is roughly cost-comparable to ECS Fargate at ~5-10 req/s per function. Below that, Lambda is cheaper. Above that, ECS Fargate is cheaper. EKS on EC2 spot is cheaper than both above ~20 req/s sustained.
Plus operational considerations: at low volume, Lambda's "no servers to manage" is real value. At high volume, you have a fleet to manage either way.
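The crossover rule above can be sanity-checked with a few lines of arithmetic. The sketch below is an illustration, not our billing model: the unit prices are assumed us-east-1 x86 list prices and should be checked against the current pricing pages, and the function names and example workload (100 ms average duration, 512 MB, one 0.5 vCPU / 1 GB Fargate task) are hypothetical. It ignores the free tier, API Gateway, data transfer, and Fargate autoscaling.

```python
# Rough monthly cost comparison: Lambda vs. always-on Fargate tasks.
# All unit prices below are ASSUMPTIONS (us-east-1, x86 list prices);
# verify them against the current AWS pricing pages before relying on them.

LAMBDA_PER_REQUEST = 0.20 / 1_000_000   # USD per invocation
LAMBDA_PER_GB_SECOND = 0.0000166667     # USD per GB-second of duration
FARGATE_PER_VCPU_HOUR = 0.04048         # USD per vCPU-hour
FARGATE_PER_GB_HOUR = 0.004445          # USD per GB-hour of memory

SECONDS_PER_MONTH = 30 * 24 * 3600
HOURS_PER_MONTH = 30 * 24


def lambda_monthly_cost(req_per_sec: float, avg_ms: float, memory_mb: int) -> float:
    """Request charge plus duration charge for a sustained request rate."""
    invocations = req_per_sec * SECONDS_PER_MONTH
    gb_seconds = invocations * (avg_ms / 1000) * (memory_mb / 1024)
    return invocations * LAMBDA_PER_REQUEST + gb_seconds * LAMBDA_PER_GB_SECOND


def fargate_monthly_cost(tasks: int, vcpu: float, memory_gb: float) -> float:
    """Fixed cost of running `tasks` identical tasks for the whole month."""
    hourly = vcpu * FARGATE_PER_VCPU_HOUR + memory_gb * FARGATE_PER_GB_HOUR
    return tasks * hourly * HOURS_PER_MONTH


def crossover_req_per_sec(avg_ms: float, memory_mb: int,
                          tasks: int, vcpu: float, memory_gb: float) -> float:
    """Sustained request rate at which Lambda costs the same as the Fargate fleet."""
    # Lambda cost is linear in request rate, so divide by the cost of 1 req/s.
    return fargate_monthly_cost(tasks, vcpu, memory_gb) / lambda_monthly_cost(1.0, avg_ms, memory_mb)


if __name__ == "__main__":
    rate = crossover_req_per_sec(avg_ms=100, memory_mb=512, tasks=1, vcpu=0.5, memory_gb=1.0)
    print(f"Lambda and Fargate cost the same at about {rate:.1f} req/s")
```

Under these assumed prices the example workload breaks even at roughly 6 to 7 req/s, which is in line with the 5-10 req/s rule of thumb. The result moves a lot with function duration and memory size, so plug in your own numbers.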
Specific numbers from our bill:
The "operational overhead" piece matters. Lambda is the lowest-toil option. EKS is the most flexible but requires platform engineering.
S3 upload → S3 event → SQS → Lambda → DynamoDB
S3 event triggers SQS message; SQS triggers Lambda; Lambda processes and writes results. SQS in the middle gives us:
Cost at our scale (~5,000 uploads/hr): ~$30/month. Trivial.
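A minimal sketch of what the Lambda at the end of this pipeline might look like, assuming S3 event notifications are delivered as JSON in the SQS message body and that partial batch responses (`ReportBatchItemFailures`) are enabled on the event source mapping. `process_object` is a hypothetical placeholder for the real work (parsing the file, writing to DynamoDB); it is not our production code.

```python
import json
from urllib.parse import unquote_plus


def extract_s3_objects(sqs_record: dict) -> list[tuple[str, str]]:
    """Return (bucket, key) pairs from one SQS record carrying an S3 event."""
    body = json.loads(sqs_record["body"])
    objects = []
    # Messages without "Records" (for example S3 test events) yield nothing.
    for rec in body.get("Records", []):
        bucket = rec["s3"]["bucket"]["name"]
        # S3 URL-encodes object keys in event notifications.
        key = unquote_plus(rec["s3"]["object"]["key"])
        objects.append((bucket, key))
    return objects


def process_object(bucket: str, key: str) -> None:
    """Hypothetical placeholder: download, process, and store the result."""
    print(f"processing s3://{bucket}/{key}")


def handler(event: dict, context: object, process=process_object) -> dict:
    """SQS-triggered entry point that reports failures per message.

    Only the failed message IDs are returned, so SQS redelivers those
    messages instead of the whole batch.
    """
    failures = []
    for record in event.get("Records", []):
        try:
            for bucket, key in extract_s3_objects(record):
                process(bucket, key)
        except Exception:
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}
```

The `process` parameter exists only to make the handler easy to exercise locally; Lambda itself calls `handler(event, context)`.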
HTTP webhook → API Gateway → Lambda (validate, queue) → SQS → Lambda (process)
Two Lambdas: one validates and acks fast (sub-100ms), the other does the actual work asynchronously. Why split: webhook senders care about response time. The processing might be slow; we don't want them retrying because of our slow response.
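The first Lambda in that split can stay very small. The sketch below assumes the sender signs the raw request body with HMAC-SHA256 and sends the hex digest in a header; the header name `x-signature-sha256`, the `enqueue` callable (which would wrap an SQS send in a real deployment), and the plain-text body are all assumptions for illustration. Real providers each define their own signing scheme.

```python
import hashlib
import hmac
import json

SIGNATURE_HEADER = "x-signature-sha256"  # hypothetical header name


def is_valid_signature(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Check an HMAC-SHA256 hex signature using a constant-time comparison."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected.encode("utf-8"), signature_hex.encode("utf-8"))


def receive_webhook(event: dict, secret: bytes, enqueue) -> dict:
    """Validate, queue, and acknowledge. No slow work happens here.

    `event` is assumed to be an HTTP proxy-style event with "headers" and a
    non-base64 "body". `enqueue` is a hypothetical callable that hands the
    raw payload to the queue read by the processing Lambda.
    """
    body = (event.get("body") or "").encode("utf-8")
    headers = {k.lower(): v for k, v in (event.get("headers") or {}).items()}
    signature = headers.get(SIGNATURE_HEADER, "")

    if not is_valid_signature(secret, body, signature):
        return {"statusCode": 401, "body": json.dumps({"error": "invalid signature"})}

    enqueue(body.decode("utf-8"))
    # 202 Accepted: the payload is queued, processing happens asynchronously.
    return {"statusCode": 202, "body": json.dumps({"status": "queued"})}
```

Returning 202 as soon as the message is queued keeps the response time independent of how long the downstream processing takes.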
EventBridge schedule (cron) → Lambda → various AWS services
Replace what would be a cron job on a long-running server. Cheaper, no server to manage. Most of our admin / batch jobs run this way.
DynamoDB Stream → Lambda → OpenSearch (denormalize)
Real-time cross-system sync. Stream events into OpenSearch as they happen. ~$5/month for the Lambda; would be a meaningful project to build with custom infrastructure.
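As a rough illustration of the sync step, the sketch below turns a DynamoDB Streams event into a newline-delimited body for the OpenSearch `_bulk` API. It assumes the stream is configured to include new images, handles only a subset of DynamoDB attribute types, and leaves out the HTTP call and request signing entirely. The function names, index name, and key attribute are hypothetical.

```python
import json


def from_dynamodb_json(attr: dict):
    """Convert one DynamoDB-typed attribute (e.g. {"S": "x"}) to a plain value.

    Only a subset of types is handled; sets and binary are left out of this sketch.
    """
    tag, value = next(iter(attr.items()))
    if tag == "S":
        return value
    if tag == "N":
        try:
            return int(value)
        except ValueError:
            return float(value)
    if tag == "BOOL":
        return value
    if tag == "NULL":
        return None
    if tag == "L":
        return [from_dynamodb_json(v) for v in value]
    if tag == "M":
        return {k: from_dynamodb_json(v) for k, v in value.items()}
    raise ValueError(f"unsupported DynamoDB type: {tag}")


def to_bulk_body(event: dict, index: str, id_attr: str) -> str:
    """Build an OpenSearch bulk request body from a DynamoDB Streams event."""
    lines = []
    for record in event.get("Records", []):
        data = record["dynamodb"]
        doc_id = str(from_dynamodb_json(data["Keys"][id_attr]))
        if record["eventName"] == "REMOVE":
            lines.append(json.dumps({"delete": {"_index": index, "_id": doc_id}}))
        else:
            # INSERT and MODIFY both become a full overwrite of the document.
            doc = {k: from_dynamodb_json(v) for k, v in data["NewImage"].items()}
            lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
            lines.append(json.dumps(doc))
    # The bulk API expects every line, including the last, to end with a newline.
    return "".join(line + "\n" for line in lines)
```

Using the table key as the document `_id` and always overwriting means a replayed stream record produces the same document, which keeps the sync safe under retries.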
A few things that bite teams new to Lambda:
Concurrency limits. The default account-wide limit is 1,000 concurrent executions per Region, shared by every function in the account. A burst that exceeds this gets throttled (TooManyRequestsException, HTTP 429). For burst-heavy workloads, request a limit increase or use SQS to smooth the burst.
Cold-start variability. Cold starts depend on runtime, package size, and module-level setup. A small Go function might cold-start in 100ms; a Node function with 50MB of dependencies might take 3 seconds. Provisioned concurrency removes the variance for critical functions.
Timeout handling. Default Lambda timeout is 3 seconds. Many users miss this and wonder why their Lambda mysteriously dies. Set timeout based on the actual work; don't accept the default.
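Beyond raising the configured timeout, a handler can watch its own deadline and stop cleanly instead of being killed mid-item. The sketch below relies on the Python runtime's `context.get_remaining_time_in_millis()`; the function name, the 5-second safety margin, and the idea of returning unprocessed items for re-queueing are illustrative assumptions.

```python
def process_until_deadline(items: list, context, work, safety_margin_ms: int = 5_000):
    """Process items until the invocation is close to its timeout.

    Returns (done, remaining). `context` is the Lambda context object, or
    anything exposing get_remaining_time_in_millis(). `work` is a
    hypothetical per-item callable. The caller decides what to do with
    `remaining`, for example re-queueing it for another invocation.
    """
    done = []
    for i, item in enumerate(items):
        # Stop before starting an item we may not have time to finish.
        if context.get_remaining_time_in_millis() < safety_margin_ms:
            return done, list(items[i:])
        work(item)
        done.append(item)
    return done, []
```

The safety margin should be larger than the slowest single item, otherwise the function can still time out in the middle of one.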
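Idempotency. Lambda might invoke your handler more than once for the same event: asynchronous invocations and SQS event sources are delivered at least once, and retries after errors or timeouts replay the event. Code defensively. Assume the same event might be processed twice, and dedupe at the destination via idempotency keys.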
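One way to sketch the dedupe step: claim an idempotency key with a write that only succeeds if the key is absent, then perform the side effect. The in-memory store below is a stand-in so the example runs anywhere; in a real destination the same role could be played by a conditional write (for example a DynamoDB put with an `attribute_not_exists` condition). The class and function names are hypothetical, and the `"id"` field on the event is an assumption.

```python
import hashlib
import json


class InMemoryDedupeStore:
    """Stand-in for a durable store that supports put-if-absent semantics."""

    def __init__(self) -> None:
        self._seen = set()

    def put_if_absent(self, key: str) -> bool:
        """Return True if the key was newly claimed, False if already present."""
        if key in self._seen:
            return False
        self._seen.add(key)
        return True

    def remove(self, key: str) -> None:
        self._seen.discard(key)


def idempotency_key(event: dict) -> str:
    """Derive a stable key from event content when no natural ID exists."""
    canonical = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def handle_once(event: dict, store, side_effect) -> bool:
    """Run side_effect at most once per event. Returns False for duplicates."""
    key = event.get("id") or idempotency_key(event)
    if not store.put_if_absent(key):
        return False  # duplicate delivery: already handled, or in progress
    try:
        side_effect(event)
    except Exception:
        # Release the claim so a retry can process the event again.
        store.remove(key)
        raise
    return True
```

Claiming the key before doing the work means two concurrent deliveries cannot both run the side effect; releasing it on failure keeps retries possible. A crash between the claim and the release would still leave a stale claim, so production versions usually add an expiry on the key.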
State across invocations. Module-level variables persist across warm invocations within the same execution environment. This is a feature (cache DB clients, etc.) but can be a bug if you accidentally rely on per-invocation isolation.
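A small sketch of both sides of this behavior. Calling the handlers repeatedly in one Python process behaves like repeated warm invocations in one execution environment. All names here are hypothetical, and `factory` stands in for whatever builds an expensive client.

```python
_CLIENT = None   # created once per execution environment, reused while warm
_seen_ids = []   # module-level list: survives across warm invocations


def get_client(factory):
    """The feature: build an expensive client once and reuse it."""
    global _CLIENT
    if _CLIENT is None:
        _CLIENT = factory()
    return _CLIENT


def buggy_handler(event, context=None):
    """The bug: treats module state as if it were reset per invocation."""
    _seen_ids.append(event["id"])
    return len(_seen_ids)  # keeps growing while the environment stays warm


def fixed_handler(event, context=None):
    """Per-invocation state lives inside the handler."""
    seen_ids = [event["id"]]
    return len(seen_ids)
```

On a cold start the module is imported fresh, so `buggy_handler` looks correct in a quick test and only misbehaves once the same environment serves several invocations.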
Production issues we've hit:
Lambda invocation cost surprise. A new feature got more traffic than expected. The Lambda's cost jumped from $50/month to $1500/month. We had no per-function budget alerts. Now we have CloudWatch alarms on Lambda invocation cost per function.
Concurrency hit. A batch job triggered 2000 Lambdas simultaneously. Hit the 1000 concurrency limit; half failed. Fix: SQS in front to smooth the burst, plus reserved concurrency on the source Lambda.
Data egress charges. A Lambda was downloading large objects from a different region. Cross-region transfer at $0.02/GB added up. Moved the Lambda to the same region as the data source.
Long cold starts on a customer-facing API. A Node Lambda with 80MB of dependencies had 4-second cold starts. Customer-facing latency was bad. Switched to esbuild bundling (6MB final), cold start dropped to 800ms. Provisioned concurrency for the remaining latency-sensitive paths.
Serverless containers (AWS Fargate, Cloud Run, Container Apps) are the next step up:
We use Fargate for some containerized workloads where Lambda's constraints (15-min limit, language support, package size) don't fit. The cost is higher than Lambda for the same throughput; the flexibility is worth it for the right workloads.
A few serverless patterns we've abandoned:
Lambda layers for shared dependencies. They sounded good; in practice, they complicate deploy pipelines and lock you to specific Lambda runtimes. We bundle dependencies into each function instead.
Step Functions for everything. Step Functions are great for clear state-machine workflows. We use them for our document processing pipeline. But for general orchestration, they're heavyweight and the visual editor doesn't really help debugging.
API Gateway in front of every Lambda. API Gateway adds latency and cost. For internal-only or simple webhook receivers, Lambda Function URLs (cheaper, no API Gateway features) are better.
Lambda for absolutely everything. "Serverless-first" as a philosophy creates teams that rebuild things badly because of Lambda's constraints. Use Lambda where it fits; use other tools where they fit.
Default to Lambda for event handlers, scheduled jobs, and glue. It's the right tool. Per-invocation billing wins at low volume.
Don't default to Lambda for sustained APIs. ECS/EKS is usually cheaper and gives more flexibility above the crossover point.
Watch for cold-start sensitivity. If p99 latency matters, plan for it (provisioned concurrency, smaller packages, faster runtimes).
Set budget alerts per function. Cost surprises are the #1 reason teams sour on serverless.
Idempotency from day one. Assume invocations can repeat. Design destinations to dedupe.
Right-size memory empirically. The default is rarely optimal. A tool such as AWS Lambda Power Tuning runs the function at several memory settings and finds the cost-vs-latency sweet spot.
Serverless is a great tool for specific patterns. The mistake is treating it as a default architectural choice. The teams that get the most out of it are the ones who use it where it fits and use other tools where they fit better. The "serverless-first" or "all-in on Lambda" framings tend to lead to architectural distortions; "right tool for the job" leads to better outcomes.