We deployed the same edge function on both Cloudflare Workers and Vercel Edge for a full quarter. Same code, same inputs, same upstream. Both running globally, both production-grade. Here's the breakdown of where each wins, where each loses, and what surprised us most.
A simple but representative edge function: an A/B-test cohort assigner.
Code: about 80 lines of TypeScript. Runs on every request to the marketing site (~14M requests/day at peak).
```typescript
export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const cohort = await getOrCreateCohort(request, env);
    const variant = (await env.KV.get(`cohort:${cohort}`)) ?? "control";

    // Headers isn't a plain object, so spreading it silently drops everything;
    // build a real Headers instance and set the variant header on it.
    const headers = new Headers(request.headers);
    headers.set("x-variant", variant);
    const response = await fetch(request, { headers });

    // Responses from fetch are immutable; re-wrap to attach the cohort cookie.
    const newResp = new Response(response.body, response);
    newResp.headers.set("set-cookie", `cohort=${cohort}; Path=/; Max-Age=31536000`);
    return newResp;
  },
};
```
Deployed via wrangler publish. KV reads from Cloudflare KV.
```typescript
import { NextRequest, NextResponse } from "next/server";

export const config = { runtime: "edge" };

export async function middleware(request: NextRequest) {
  const cohort = await getOrCreateCohort(request);

  // Upstash's REST API needs a bearer token and returns JSON ({ result: ... }).
  // UPSTASH_TOKEN here stands in for whatever env var holds your REST token.
  const variant = await fetch(`https://kv.upstash.io/get/cohort:${cohort}`, {
    headers: { Authorization: `Bearer ${process.env.UPSTASH_TOKEN}` },
  })
    .then((r) => r.json())
    .then((body) => body.result ?? "control");

  // Headers can't be spread as a plain object; copy it and forward the
  // modified request headers via NextResponse.next.
  const headers = new Headers(request.headers);
  headers.set("x-variant", variant);
  const response = NextResponse.next({ request: { headers } });

  response.cookies.set("cohort", cohort, { maxAge: 31536000, path: "/" });
  return response;
}
```
Deployed as Next.js middleware. KV reads go through Upstash, since Vercel Edge doesn't ship a native KV store in the same shape as Cloudflare's.
Measurements are p50/p95/p99 of total edge-function execution time, not including upstream-fetch time. Sampled across a quarter, geographic distribution roughly weighted to our user base (~60% NA, ~30% EU, ~10% rest).
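For reference, the figures below are nearest-rank percentiles over those samples. Our real aggregation happens in our observability pipeline; this is just the definition, as a minimal sketch:

```typescript
// Nearest-rank percentile: the smallest sample such that at least p% of
// samples are <= it. This is the definition behind the p50/p95/p99 figures.
function percentile(samples: number[], p: number): number {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(0, rank - 1)];
}
```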
| Metric | Cloudflare Workers | Vercel Edge |
|---|---|---|
| Cold start (p95) | 4ms | 38ms |
| Hot p50 | 2.1ms | 9.2ms |
| Hot p95 | 5.4ms | 21ms |
| Hot p99 | 12ms | 42ms |
| KV read p95 (regional) | 1.8ms | 8.4ms |
| Global PoPs | 330+ | ~70 |
Cloudflare wins on raw latency by a wide margin, for two reasons: its functions run closer to users, and its isolates start faster. More on both below.
| Cost item | CF Workers | Vercel Edge |
|---|---|---|
| Base plan | $5/mo | included with hosting |
| Per-request beyond plan | $0.30/M | $2.00/M (rounded across products) |
| KV read | $0.50/M | depends on KV provider |
| KV write | $5/M | depends on KV provider |
| Outbound bandwidth | included | included |
For our 14M req/day workload we needed paid tiers on both platforms, and CF came out about 2.5× cheaper.
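To see roughly where that multiple comes from, here's a back-of-envelope estimate from the unit prices above. The traffic number is from this post; the simplifications (one KV read per request, included quotas ignored, the Upstash bill on the Vercel side excluded) are ours:

```typescript
// Rough monthly cost from the table's unit prices.
// Assumptions: 14M req/day, one KV read per request, free/included
// quotas ignored, Vercel's KV-provider (Upstash) costs excluded.
const reqPerMonth = 14e6 * 30.4; // ≈ 425.6M requests/month

const cfMonthly =
  5 + // base plan
  (reqPerMonth / 1e6) * 0.3 + // requests
  (reqPerMonth / 1e6) * 0.5; // KV reads

const vercelMonthly = (reqPerMonth / 1e6) * 2.0; // requests only

console.log(cfMonthly.toFixed(0), vercelMonthly.toFixed(0)); // roughly 345 vs 851
```

The ratio lands near 2.5× even before adding Upstash on the Vercel side, which matches what we saw on the invoices.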
Vercel Edge runs in regional zones, not literally at every CDN PoP. For requests far from those zones, the "edge function" is sometimes 30+ms away. CF Workers actually run at the PoP serving the user.
This matters most for users in regions like LATAM, Africa, and parts of Asia. CF served them in single-digit milliseconds; Vercel sometimes routed them to a US zone with 80+ms RTT.
CF Workers' V8 isolates spin up in roughly 4ms. Vercel Edge cold starts (when the function had been idle for a while) regularly hit 38ms+. For a function that runs on every request this matters, because traffic rotates through enough PoPs that you hit fresh isolates often.
We tried Vercel's "regional functions" mode to keep functions warm. It helped (p95 cold start dropped from 38ms to 18ms) but cost more.
Wrangler is fine. The web UI is unfortunate. Cloudflare's docs are thorough but sometimes inconsistent (multiple ways to set the same thing; some guides outdated). Vercel's developer experience for the same complexity is meaningfully better.
These paper cuts cost us roughly three more days of setup on CF than on Vercel.
If your app is Next.js and you're using middleware.ts, Vercel Edge is essentially "the obvious place" for that code. Code, deploys, env vars all unify.
CF requires you to write the middleware as a separate Worker, point your DNS at it, and proxy back. The proxy adds another hop and another failure mode.
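To make that extra hop concrete, here's a hedged sketch of what such a standalone Worker looks like. `ORIGIN_HOST` and the `x-edge-middleware` header are made-up names for illustration, not anything Cloudflare prescribes:

```typescript
// Hypothetical origin behind the Worker, e.g. your Vercel/Next.js deployment.
const ORIGIN_HOST = "origin.example.com";

// Rewrite the incoming request's hostname and tag it, preserving the
// method, headers, and body of the original request.
function toOrigin(request: Request, originHost: string): Request {
  const url = new URL(request.url);
  url.hostname = originHost;
  const proxied = new Request(url.toString(), request);
  proxied.headers.set("x-edge-middleware", "1");
  return proxied;
}

// Worker entry point: run middleware logic, then proxy back to the origin.
// This proxy hop is the extra latency and failure mode mentioned above.
export default {
  async fetch(request: Request): Promise<Response> {
    return fetch(toOrigin(request, ORIGIN_HOST));
  },
};
```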
CF KV is eventually consistent across PoPs. We had a "user just signed up, immediately hits another page" flow that occasionally read stale state. We mitigated by also writing to a strongly-consistent store and using KV as a cache.
Upstash (with Vercel) was strongly consistent by default in our config. One less footgun.
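A minimal sketch of that mitigation, with a hypothetical `Store` interface standing in for both the strongly-consistent store and KV (neither platform's real API):

```typescript
// Hypothetical stand-in for both backends; not a real platform API.
interface Store {
  get(key: string): Promise<string | null>;
  put(key: string, value: string): Promise<void>;
}

async function writeCohort(kv: Store, strong: Store, cohort: string, variant: string) {
  // Source of truth first, eventually-consistent cache second.
  await strong.put(`cohort:${cohort}`, variant);
  await kv.put(`cohort:${cohort}`, variant);
}

async function readCohort(kv: Store, strong: Store, cohort: string): Promise<string> {
  // Fast path: regional KV read.
  const cached = await kv.get(`cohort:${cohort}`);
  if (cached !== null) return cached;

  // On a miss (e.g. the just-signed-up replication-lag window), fall back
  // to the strong store and repopulate the cache.
  const fresh = await strong.get(`cohort:${cohort}`);
  if (fresh !== null) {
    await kv.put(`cohort:${cohort}`, fresh);
    return fresh;
  }
  return "control";
}
```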
After the quarter we kept Cloudflare Workers for the cohort assigner and any other always-on edge logic. The cost and latency math was decisive for that workload.
But we kept Vercel Edge for two other functions: a redirect handler tied to our marketing CMS and an auth-token verifier for our staff app. Both benefit more from Vercel's ecosystem fit than from CF's latency.
The lesson: don't pick one for everything. Pick per workload based on what the workload's bottleneck actually is.
Once you accept multi-platform edge, a few patterns simplify life:
Send api.example.com to one platform, assets.example.com to another. Don't try to load-balance between them in code; route at DNS.
Whatever KV provider you use on each platform, agree on a key schema across them. We use namespace:type:id everywhere; same lookup logic, different backends.
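In practice that's nothing more than a shared key builder used verbatim on both platforms (the specific namespace and type values here are illustrative):

```typescript
// Shared key schema, namespace:type:id, identical across backends.
function kvKey(namespace: string, type: string, id: string): string {
  return `${namespace}:${type}:${id}`;
}

// kvKey("marketing", "cohort", "u_123") === "marketing:cohort:u_123"
```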
Both platforms log differently. We pipe both into Datadog with platform-tagged metrics so a dashboard can show "edge p99 by platform" side by side.
Both platforms can be run locally with their CLIs. Pin those CLIs in your dev container so the local dev experience is the same regardless of where the function ends up running.
The right answer isn't a platform; it's the right tool per workload. Most teams don't need both. The teams that do should structure for it from day one rather than discover it later.