We adopted Backstage for service catalogs and templates. What works, what was over-engineered for our size, and what we'd do differently.
We adopted Backstage about 18 months ago to give engineers a single place to find services, scaffold new ones, and read internal docs. The promise — one portal for "everything about your service" — partially landed. Some pieces are genuinely useful; some were over-engineered for our size; one piece we abandoned. This post is the honest assessment.
The term is overloaded. In practice, an IDP is some combination of:
Backstage covers #1–4 well. #5 is where teams over-invest and where most "IDP project failed" stories come from.
Of the five categories, two earn their place daily:
Service catalog (1). Every service has a catalog-info.yaml in its repo. It declares ownership, dependencies, lifecycle (production / experimental / deprecated), and links (runbook, dashboard, on-call). The catalog UI lets an engineer find "who owns this service" in one click instead of grepping commit history. Used daily.
Software templates (2). When a new service is created, we use a Backstage template that scaffolds the repo with our standard Dockerfile, GitHub Actions workflow, Argo CD manifest, OpenTelemetry init, and so on. The template lives in a separate repo; we update it whenever our service standards change. Engineers run it from the Backstage UI; ~5 minutes from "I need a new service" to "PR open with all the boilerplate."
The rest:
The promise: "engineers can self-serve database creation, feature-flag toggling, secret rotation, etc., all from Backstage." Sounds great.
Reality: each plugin needed:
Each was a small project. We built two (DB provisioning, S3 bucket creation) and used them maybe ten times each before realizing the engineers preferred the CLI tools they already had. The Backstage UI was a thin layer; the actual work was the integrations behind it.
We retired both plugins. New self-service work goes into the CLI tools directly. Backstage links to the docs.
Backstage is a real Node.js app you have to run. We deployed it on EKS:
Upgrades are the biggest pain. Backstage releases monthly with frequent breaking changes in plugin APIs. We upgrade quarterly, treating it like a real maintenance task. Some quarters we skip a release because the changelog is heavy.
Operational time: ~6 hours per quarter on upgrades + occasional plugin issues. Manageable but real.
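For context, the deployment itself is unremarkable: Backstage's backend listens on port 7007 and, in production, is typically backed by Postgres. A minimal sketch of the kind of Deployment involved; the namespace, image, and secret name are placeholders, not our actual manifests:

```yaml
# Hypothetical sketch: names, image tag, and secret are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: backstage
  namespace: backstage
spec:
  replicas: 1
  selector:
    matchLabels:
      app: backstage
  template:
    metadata:
      labels:
        app: backstage
    spec:
      containers:
        - name: backstage
          image: registry.example.com/backstage-app:2024-06   # image built from the Backstage app repo
          ports:
            - containerPort: 7007            # default Backstage backend port
          envFrom:
            - secretRef:
                name: backstage-postgres     # Postgres credentials referenced from app-config.yaml
```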
The pattern that worked: catalog entries live in the same repo as the code. Each service has catalog-info.yaml:
```yaml
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: checkout-api
  description: Customer checkout API
  annotations:
    backstage.io/source-location: url:https://github.com/company/checkout-api
    grafana/dashboard-selector: "tag:checkout-api"
    pagerduty.com/service-id: P12345
    github.com/project-slug: company/checkout-api
spec:
  type: service
  lifecycle: production
  owner: team-checkout
  system: checkout
  dependsOn:
    - component:default/checkout-db
    - component:default/payments-api
```
Backstage scans configured repos and ingests these. Updates take ~5 minutes to propagate. Engineers update catalog-info.yaml in PRs alongside code changes; ownership info stays current automatically.
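The scanning side is plain configuration: Backstage's GitHub discovery provider walks an org and registers any repo containing a catalog-info.yaml. A hedged sketch of what that looks like in app-config.yaml; the org name, filters, and schedule are illustrative rather than our exact values, and it assumes a GitHub integration token is configured separately:

```yaml
catalog:
  providers:
    github:
      companyOrg:                        # arbitrary provider id
        organization: company            # scan every repo in this org
        catalogPath: /catalog-info.yaml  # file to look for in each repo
        filters:
          branch: main
          repository: '.*'               # regex; narrow it if only some repos hold services
        schedule:
          frequency: { minutes: 5 }      # roughly the ~5 minute propagation mentioned above
          timeout: { minutes: 3 }
```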
What we don't do: manual entry of services through the UI. That drifts. Repo-based catalog declarations stay in sync with reality.
Our Backstage software template for a new web service generates:
Engineer answers ~5 questions (service name, team, language, etc.) in the Backstage UI. Template generates a PR against a new repo. Engineer reviews, merges, gets to feature work.
The template lives in version control. When we change our standards (new logging library, new image base), we update the template; existing services pick up the changes through their own PRs over time.
Time from "I want a new service" to "running in staging" used to be ~half a day of copying from another service. Now it's 30 minutes including code review of the generated PR.
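For readers who haven't written one: a scaffolder template is itself YAML, with parameters (the questions the engineer answers) and steps (the actions that run). Below is a simplified, illustrative sketch; the parameter set, skeleton path, and repo host are assumptions, and our real template opens a PR for review rather than publishing directly:

```yaml
apiVersion: scaffolder.backstage.io/v1beta3
kind: Template
metadata:
  name: web-service
  title: New web service
spec:
  owner: team-platform
  type: service
  parameters:
    - title: Service basics
      required: [name, owner]
      properties:
        name:
          title: Service name
          type: string
        owner:
          title: Owning team
          type: string
          ui:field: OwnerPicker
  steps:
    - id: fetch
      name: Fill in the skeleton
      action: fetch:template
      input:
        url: ./skeleton                  # Dockerfile, CI workflow, Argo CD manifest, OTel init, catalog-info.yaml
        values:
          name: ${{ parameters.name }}
          owner: ${{ parameters.owner }}
    - id: publish
      name: Create the repo
      action: publish:github
      input:
        repoUrl: github.com?owner=company&repo=${{ parameters.name }}
    - id: register
      name: Register in the catalog
      action: catalog:register
      input:
        repoContentsUrl: ${{ steps['publish'].output.repoContentsUrl }}
        catalogInfoPath: /catalog-info.yaml
```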
Two things in hindsight:
Start with just the catalog. Trying to do catalog + templates + tech docs + plugins simultaneously meant none of them got polished. Catalog alone, done deeply, then templates, then whatever else is needed. Phased adoption beats a wholesale launch.
Treat TechDocs as optional. We tried to migrate all docs into TechDocs and the migration stalled because the Notion → TechDocs friction was higher than the value. If we'd left docs where they were and just linked to them from Backstage, we'd have spent less effort and gotten the same outcome.
If your team is small (< 5 engineers), Backstage is overkill. A README and a services.md in a docs repo do the job.
For 20+ engineers, multiple services, ownership questions like "who do I page for X" — Backstage starts paying back. The catalog and templates are the highest-ROI features.
For 100+ engineers, true platform-engineering territory, Backstage is one of several tools you'd consider (Port, Cortex, custom). The operational cost is justified.
We're around 30 engineers across 5 teams. Backstage is net-positive for us but barely. If we were 15 engineers we'd probably skip it.
Worth knowing about:
services.md with a YAML table. Sounds primitive; works fine for many teams. We did this for 18 months before Backstage.
Validate the catalog use case first. Can your engineers answer "who owns this service" in 30 seconds today? If yes, Backstage's catalog adds little. If no, that's the value to test.
Start small. Catalog + one good template. Don't try to do plugins on day one.
Catalog-info.yaml in-repo, not UI-managed. Drift is the enemy.
Plan for upgrade cost. Quarterly maintenance, real ops time.
Skip self-service plugins unless they save real time. Most are thin UI on top of CLI tools engineers already use.
Backstage is one of those tools where the marketing oversells and the operational reality is more nuanced. Used carefully, for the things it's actually good at, it's a useful platform layer. Used to its full advertised potential, it becomes a small platform-engineering team's full-time job. Pick what serves you.