K8s Secrets are only base64-encoded by default. We moved every secret to Vault with the Vault Agent injector and never went back. This is the setup checklist.

Kubernetes Secrets and External Vault Integration: Operational Checklist

Kubernetes Secrets are not really secret. They're base64-encoded strings stored in etcd (unencrypted unless you configure encryption at rest), readable by anyone whose RBAC allows reading Secrets in the namespace. For development that's fine; for production it's a gap that bothered us enough to fix. About a year ago we moved every production secret to HashiCorp Vault, fetched at runtime via the Vault Agent injector.

This is the checklist of what made the integration actually work, in the order we'd tackle it on a new cluster.

Pre-work: decide what's actually a secret

Before installing anything, we made an inventory. A "secret" is anything where leakage causes harm — API keys, DB passwords, signing keys, OAuth client secrets, encryption keys. Random configuration values (feature flags, log levels) aren't secrets and don't belong in Vault.

The inventory landed at about 40 distinct secrets across our 14 production services. Smaller than we expected. Most "secrets" people had been treating as such were actually configuration.

Step 1: Vault is provisioned and reachable

We use HCP Vault (HashiCorp Cloud Platform) Plus tier. Self-hosted is fine but adds operational cost we didn't want. Whatever you pick, the cluster needs network access to Vault's API endpoint.

Verify before continuing:

bash.bash

# from inside the cluster
kubectl run -it --rm vault-test --image=hashicorp/vault \
  --restart=Never -- vault status -address=https://your-vault.example.com

Expect Initialized: true and Sealed: false in the output. If the pod can't reach Vault, fix the networking before trying anything else.
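If you want this check in a script rather than read by eye, a minimal sketch follows. It assumes the default table output of vault status (a Key column and a Value column); the helper name vault_ready is ours, not part of the Vault CLI.

```shell
#!/bin/sh
# vault_ready: reads `vault status` table output on stdin and succeeds
# only when it reports Initialized=true and Sealed=false.
vault_ready() {
  awk '
    $1 == "Initialized" { init = $2 }
    $1 == "Sealed"      { sealed = $2 }
    END { exit !(init == "true" && sealed == "false") }
  '
}

# Demo on canned output (trimmed to the two rows that matter).
sample_status='Key             Value
---             -----
Initialized     true
Sealed          false'

printf '%s\n' "$sample_status" | vault_ready && echo "ready"
```

Against a real server, pipe the command from above into it: vault status -address=https://your-vault.example.com | vault_ready.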

Step 2: Configure Kubernetes auth in Vault

Vault needs to trust the cluster's service account tokens. From the Vault side:

bash.bash

# Enable the Kubernetes auth method
vault auth enable kubernetes

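# Tell Vault about the cluster. kubernetes_host must be reachable from Vault:
# with an external Vault (HCP), use the cluster's API endpoint (placeholder below),
# not the in-cluster name kubernetes.default.svc, which only resolves inside the cluster.
vault write auth/kubernetes/config \
  kubernetes_host="https://your-cluster-api.example.com:443" \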
  kubernetes_ca_cert=@/path/to/ca.crt \
  token_reviewer_jwt=@/path/to/token-reviewer-jwt

The token_reviewer_jwt is a long-lived service account token from the cluster that Vault uses to validate other service accounts' tokens through the TokenReview API. We create a dedicated vault-auth-delegator service account for this, bound to the system:auth-delegator cluster role.
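A sketch of how that service account could be created. The namespace argument and the Secret name vault-auth-delegator-token are assumptions for illustration; the Secret of type kubernetes.io/service-account-token is how a long-lived token is requested on clusters that no longer auto-create one.

```shell
#!/bin/sh
# render_token_reviewer NAMESPACE: prints manifests for the reviewer
# service account, a long-lived token Secret, and the cluster role binding.
render_token_reviewer() {
  ns="$1"
  cat <<EOF
apiVersion: v1
kind: ServiceAccount
metadata:
  name: vault-auth-delegator
  namespace: ${ns}
---
apiVersion: v1
kind: Secret
metadata:
  name: vault-auth-delegator-token   # hypothetical name
  namespace: ${ns}
  annotations:
    kubernetes.io/service-account.name: vault-auth-delegator
type: kubernetes.io/service-account-token
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: vault-auth-delegator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:auth-delegator
subjects:
  - kind: ServiceAccount
    name: vault-auth-delegator
    namespace: ${ns}
EOF
}

# Usage (not run here):
#   render_token_reviewer vault-system | kubectl apply -f -
#   kubectl get secret vault-auth-delegator-token -n vault-system \
#     -o jsonpath='{.data.token}' | base64 -d > token-reviewer-jwt
```

Printing the manifests instead of applying them directly keeps the function reviewable and lets you diff the output before it touches the cluster.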

Step 3: Install the Vault Agent injector

The injector is a mutating webhook that watches pod annotations and injects an init container + sidecar that fetches secrets and writes them to a shared volume. Install via Helm:

bash.bash

helm repo add hashicorp https://helm.releases.hashicorp.com
helm install vault hashicorp/vault \
  --namespace vault-system --create-namespace \
  --set "injector.enabled=true" \
  --set "server.enabled=false" \
  --set "injector.externalVaultAddr=https://your-vault.example.com"

server.enabled=false because we use HCP Vault, not in-cluster Vault. The injector still runs in-cluster.

Verify:

bash.bash

kubectl get pods -n vault-system
# expects: vault-agent-injector-xxxxx  Running

Step 4: Create a Vault role per app

For each app/service that needs secrets, create a Vault role bound to a Kubernetes service account:

bash.bash

vault write auth/kubernetes/role/checkout-app \
  bound_service_account_names=checkout \
  bound_service_account_namespaces=production \
  policies=checkout-app-policy \
  ttl=1h

The role says "any pod running with the checkout service account in the production namespace can authenticate, with these policies, for 1 hour."

Then write the policy that grants read access to specific secret paths:

hcl.hcl

# checkout-app-policy.hcl
path "secret/data/production/checkout/*" {
  capabilities = ["read"]
}

Apply it:

bash.bash

vault policy write checkout-app-policy checkout-app-policy.hcl

The naming convention matters. We store secrets at secret/data/{env}/{app}/{secret-name}, and each policy is scoped to {env}/{app}/*, so an application can read its own secrets and nothing else.
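Because every policy follows the same shape, it can be generated rather than hand-written. A minimal sketch, assuming lowercase alphanumeric-and-hyphen names for env and app; the helper render_policy is illustrative, not something Vault ships.

```shell
#!/bin/sh
# render_policy ENV APP: prints a read-only policy for secret/data/ENV/APP/*.
# Rejects empty segments and anything outside [a-z0-9-] so a stray "/" or "*"
# cannot widen the policy scope.
render_policy() {
  env_name="$1"
  app="$2"
  for part in "$env_name" "$app"; do
    case "$part" in
      ""|*[!a-z0-9-]*)
        echo "invalid path segment: '$part'" >&2
        return 1
        ;;
    esac
  done
  cat <<EOF
# ${app}-app-policy.hcl
path "secret/data/${env_name}/${app}/*" {
  capabilities = ["read"]
}
EOF
}

render_policy production checkout
```

The output can be piped straight into Vault: render_policy production checkout | vault policy write checkout-app-policy -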

Step 5: Annotate the pod template

In the application's Deployment manifest:

yaml.yaml

spec:
  template:
    metadata:
      annotations:
        vault.hashicorp.com/agent-inject: "true"
        vault.hashicorp.com/role: "checkout-app"
        vault.hashicorp.com/agent-inject-secret-db: "secret/data/production/checkout/db"
        vault.hashicorp.com/agent-inject-template-db: |
          {{- with secret "secret/data/production/checkout/db" -}}
          DB_HOST={{ .Data.data.host }}
          DB_USER={{ .Data.data.user }}
          DB_PASSWORD={{ .Data.data.password }}
          {{- end }}
    spec:
      serviceAccountName: checkout
      containers:
      - name: app
        image: ourorg/checkout
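        # "." is the POSIX form of source; set -a exports the sourced variables
        # so the exec'd process actually inherits them.
        command: ["/bin/sh", "-c", "set -a && . /vault/secrets/db && set +a && exec /app/checkout"]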

The injector adds an init container that fetches the secret and writes it to /vault/secrets/db before the app starts, plus a sidecar that keeps the file current. The application reads it from there. The application code doesn't know about Vault; it just reads an env-style file from a shared volume.
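One detail worth seeing in isolation: sourcing a file of plain KEY=value lines sets shell variables but does not export them, so a process started afterwards will not see them unless allexport (set -a) is on. The sketch below uses a temporary file with made-up values as a stand-in for /vault/secrets/db.

```shell
#!/bin/sh
unset DB_HOST DB_USER

# Stand-in for the rendered secrets file (values are made up).
secrets_file="$(mktemp)"
printf 'DB_HOST=db.internal\nDB_USER=checkout\n' > "$secrets_file"

# Plain sourcing: the variable exists in the sourcing shell only,
# so the child process prints an empty string.
plain="$(sh -c '. "$1"; sh -c "printf %s \"\$DB_HOST\""' _ "$secrets_file")"

# With set -a, every assignment made while sourcing is exported,
# so the child process inherits it.
exported="$(sh -c 'set -a; . "$1"; set +a; sh -c "printf %s \"\$DB_HOST\""' _ "$secrets_file")"

echo "plain source:  '${plain}'"
echo "set -a source: '${exported}'"

rm -f "$secrets_file"
```

An alternative with the same effect is to write export DB_HOST=... lines in the template itself; either way, values containing spaces or shell metacharacters need quoting in the rendered file.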

Step 6: Verify the pod gets secrets

After deploying, check:

bash.bash

kubectl logs deployment/checkout -c vault-agent
# Should show a log line reporting that /vault/secrets/db was rendered

kubectl exec deployment/checkout -c app -- ls -la /vault/secrets/
# Should list the rendered file: db

If the pod is in CrashLoopBackOff, check the init container logs first:

bash.bash

kubectl logs deployment/checkout -c vault-agent-init

Common failure: the role binding doesn't match (a typo in the service account name or namespace). Vault returns 403, the init container fails, and the main container never starts.
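The check Vault performs here is simple: the pod's service account and namespace must both appear in the role's bound lists. A small sketch of that comparison, useful for sanity-checking values copied from vault read auth/kubernetes/role/checkout-app. The helper names are ours, and the sketch ignores the "*" wildcard that Vault also accepts.

```shell
#!/bin/sh
# in_csv VALUE LIST: succeeds when VALUE is one item of the comma-separated LIST.
in_csv() {
  case ",$2," in
    *",$1,"*) return 0 ;;
    *) return 1 ;;
  esac
}

# role_allows SA NAMESPACE BOUND_NAMES BOUND_NAMESPACES
# Succeeds only when both the service account and the namespace are bound.
role_allows() {
  in_csv "$1" "$3" && in_csv "$2" "$4"
}

# The pod side can be read with, for example:
#   kubectl get pod <pod> -o jsonpath='{.spec.serviceAccountName}'
if role_allows checkout production "checkout" "production"; then
  echo "binding matches"
fi
```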

Step 7: Set up the agent template renewal

With the entrypoint from Step 5, the app loads the rendered file once at startup and never re-reads it. For most secrets that's fine. For credentials with short TTLs (like those from Vault's database secrets engine), you want the agent to keep renewing and re-rendering:

yaml.yaml

vault.hashicorp.com/agent-inject-template-db: |
  {{- with secret "database/creds/checkout-role" -}}
  DB_USER={{ .Data.username }}
  DB_PASSWORD={{ .Data.password }}
  {{- end }}

The agent will renew the secret as it approaches expiration and re-render the file. The application has to either re-read the file periodically or get a SIGHUP-style reload signal (the injector's vault.hashicorp.com/agent-inject-command-<name> annotation can run a command after each render).

We don't use this for most apps; the rotation complexity isn't worth it for static API keys. We do use it for our DB credentials, where Vault issues short-lived per-pod credentials (5-minute TTL) and the agent rotates them automatically.
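For the "re-read the file periodically" option, a polling watcher is one low-tech approach. This is a sketch, not what our services run: the function names, the 10-second interval, and the APP_PID variable in the usage line are all illustrative.

```shell
#!/bin/sh
# fingerprint FILE: prints a checksum that changes when the content changes.
fingerprint() {
  cksum < "$1"
}

# watch_secret FILE INTERVAL COMMAND...: polls FILE every INTERVAL seconds
# and runs COMMAND each time the content differs from the last seen version.
watch_secret() {
  file="$1"
  interval="$2"
  shift 2
  last="$(fingerprint "$file")"
  while sleep "$interval"; do
    now="$(fingerprint "$file")"
    if [ "$now" != "$last" ]; then
      last="$now"
      "$@"
    fi
  done
}

# Usage (not run here): signal the app whenever the credentials rotate.
#   watch_secret /vault/secrets/db 10 kill -HUP "$APP_PID" &
```

Comparing checksums rather than modification times avoids reloading when the agent rewrites the file with identical content.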

Step 8: CI/CD and scripts that need secrets

CI doesn't run as a Kubernetes pod, so the Kubernetes auth method doesn't apply. We use Vault's JWT auth method with the OIDC tokens GitHub issues to workflow runs:

bash.bash

vault auth enable jwt
vault write auth/jwt/config \
  oidc_discovery_url="https://token.actions.githubusercontent.com" \
  bound_issuer="https://token.actions.githubusercontent.com"

vault write auth/jwt/role/github-actions-deploy \
  role_type="jwt" \
  user_claim="actor" \
  bound_audiences="https://github.com/yourorg" \
  bound_claims='{"repository_owner":"yourorg","ref":"refs/heads/main"}' \
  policies="ci-deploy" \
  ttl=15m

In GitHub Actions:

yaml.yaml

- uses: hashicorp/vault-action@<sha>
  with:
    url: https://your-vault.example.com
    method: jwt
    role: github-actions-deploy
    secrets: |
      secret/data/ci/aws-deploy-key access_key | AWS_ACCESS_KEY_ID ;
      secret/data/ci/aws-deploy-key secret_key | AWS_SECRET_ACCESS_KEY

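CI gets a short-lived Vault token (15 minutes here) that is only useful for the duration of the job. No long-lived static secrets in GitHub. The workflow job needs permissions: id-token: write so that GitHub will issue the OIDC token.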
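When a JWT login is rejected, the usual cause is a claim that does not match bound_claims or bound_audiences. Decoding the token's payload shows exactly what GitHub sent. A minimal sketch, assuming GNU base64 (the -d flag) and a token you have already obtained inside the job; the helper name jwt_payload is ours. It only decodes and does not verify the signature.

```shell
#!/bin/sh
# jwt_payload TOKEN: prints the decoded JSON payload (second segment) of a JWT.
jwt_payload() {
  # JWTs use base64url: swap the URL-safe characters back to standard base64.
  payload="$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')"
  # base64url drops '=' padding; restore it to a multiple of 4 characters.
  while [ $(( ${#payload} % 4 )) -ne 0 ]; do
    payload="${payload}="
  done
  printf '%s' "$payload" | base64 -d
}

# Usage (not run here): compare the claims with the role's bound_claims.
#   jwt_payload "$TOKEN"
```

Check the printed repository_owner, ref, and aud values against the role definition above before changing anything on the Vault side.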

Step 9: Audit log everything

Vault has an audit device for logging every secret access. Enable it before going live:

bash.bash

vault audit enable file file_path=/var/log/vault/audit.log

For HCP Vault, the audit log streams to the HashiCorp Cloud Platform's logging integration; we forward it to our SIEM.

Every read of every secret is logged with the requesting identity, timestamp, path, and outcome. When (not if) you have a security review, the audit log is what answers "did anyone read X secret outside normal operations?"
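A sketch of the simplest possible query against a file audit log. It assumes one compact JSON object per line with the request path stored as "path":"..."; the sample entries below are simplified, hypothetical stand-ins rather than real Vault output, and the helper name is ours. A JSON-aware tool such as jq is the more robust choice once you need fields like the requesting identity.

```shell
#!/bin/sh
# entries_for_path PATH: filters audit log lines on stdin down to those
# whose request path is exactly PATH (fixed-string match, no regex).
entries_for_path() {
  grep -F "\"path\":\"$1\""
}

# Simplified, hypothetical entries for demonstration only.
sample_log='{"time":"2024-01-09T10:00:00Z","type":"request","request":{"operation":"read","path":"secret/data/production/checkout/db"}}
{"time":"2024-01-09T10:00:05Z","type":"request","request":{"operation":"read","path":"secret/data/production/other/db"}}'

printf '%s\n' "$sample_log" | entries_for_path secret/data/production/checkout/db
```

Against a real log the input would be the file from the audit device, for example entries_for_path secret/data/production/checkout/db < /var/log/vault/audit.log.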

Common issues we've hit

A short list of issues people on the team have run into:

Pods stuck in Init. Vault Agent init container can't reach Vault. Network policy issue, DNS issue, or Vault itself unreachable. Check kubectl logs <pod> -c vault-agent-init.

Secrets not refreshing. The agent sidecar re-renders leased secrets as they approach expiry, and in current Vault versions it also re-reads static KV secrets on an interval (static_secret_render_interval, 5 minutes by default). A file that changes on disk does not help an app that loaded it once at startup, so restart the pod or have the app re-read the file.

RBAC permission denied. The Kubernetes auth role's bound_service_account_namespaces doesn't include the namespace. Easy to typo. Quick check: vault read auth/kubernetes/role/your-role.

Token TTL too short. Our ttl=1h is fine for most apps. For services that read secrets only at boot, the TTL is irrelevant because the token isn't needed after init. For services that re-fetch, ensure the TTL covers your refresh interval.

What we don't bother with

Static Vault tokens stored in K8s Secrets. Defeats the purpose. The only auth method our in-cluster workloads use is the Kubernetes service-account-token method.
Vault as a generic key-value store for non-secret config. Over-applied. Use ConfigMaps for non-secrets.
Vault Agent in standalone mode (without injector). Possible but adds complexity per app. Injector model is consistent and pod templates stay clean.

What we measure

Secret reads per minute, per app. Anomalies (a sudden spike) suggest a buggy retry loop or an attacker.
Failed authentications. A burst of 403s from one identity is worth investigating.
Token TTL utilization. If tokens are expiring frequently, either TTL is too short or apps aren't re-fetching properly.

These come from Vault's metrics endpoint scraped by Prometheus.

What I'd tell a team starting

Start with one app. Get the full path working — Kubernetes auth method, role, policy, agent injector, app reads secrets — before you try to roll out widely. The first one is the slowest; subsequent apps take maybe 30 minutes each.

Decide your secret-path naming convention early. We use {env}/{app}/{secret-name}. Changing convention later is annoying because you have to update both Vault paths AND application code.

Don't migrate everything at once. We did it service-by-service over a sprint. Each migration is small but the cumulative effect is significant. By the end, no one had to think "wait, where does this secret live" — the answer was always "Vault, fetched by the agent."

The biggest win isn't the cryptographic improvement (real but boring). It's the audit trail. Six months in, when somebody asks "who accessed the prod DB password during last Tuesday's incident," the Vault audit log answers in 5 minutes. Without it, that question is unanswerable.

About Kiril Urbonas