We have a private module registry with ~25 modules used across 12 accounts. Versioning, interface design, and the over-modularization mistake we keep making.
We have a private Terraform module registry with about 25 modules — VPCs, EKS clusters, RDS instances, ALBs, IAM patterns, Lambda functions, etc. They're used across 12 AWS accounts and dozens of services. Some modules are great. Some have caused more pain than they've saved. This post is what we've learned about which is which.
Not every piece of Terraform should be a module. Our criteria:
Wrapping aws_s3_bucket in a module that just renames the inputs is not worth it.
Modules that fail any of these tend to fight their users. The module exists, but every consumer overrides half its inputs because the module's opinions don't match their needs.
Our better modules share a few traits:
Strong defaults, optional overrides. A module like our-rds-postgres has default settings for backups, encryption, monitoring, parameter groups, etc. The user provides 4 things: name, instance class, allocated storage, and the VPC ID. Everything else is opinionated.
Outputs for everything downstream consumers might need. When a consumer needs the security group ID and the module doesn't expose it, they have to look it up with a data source elsewhere. Modules should export their resource IDs/ARNs as outputs even if "you might not need this."
Tags are forced, not optional. Each module declares a tags variable of type map(string) with no default. The module mandates certain tags (environment, team) by failing validation if they're missing. This sounds annoying; it has saved us from countless untagged resources.
Versioned semantically. Every module has a SemVer tag. Breaking changes bump major. Consumers pin a major version (source = "...//rds?ref=v3").
Documented interface. Every variable has a description. Every output has a description. We use terraform-docs to generate README.md from the variable definitions.
Example of a clean variable definition:
variable "instance_class" {
  description = "RDS instance class. Use db.t3.medium or larger for production. Smaller classes lack Performance Insights."
  type        = string

  # Reject anything that does not look like db.<family>.<size>.
  validation {
    condition     = can(regex("^db\\.[a-z0-9]+\\.[a-z0-9]+$", var.instance_class))
    error_message = "The instance_class value must look like 'db.t3.medium' or 'db.r6g.large'."
  }
}
Description, type, validation. Three sentences worth of self-documentation.
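Two of the other traits above, forced tags and generous outputs, can be sketched in the same style. This is a minimal illustration rather than our actual module code: the resource addresses aws_db_instance.this and aws_security_group.this are assumed to be declared elsewhere in the module, and the output names are hypothetical.

```hcl
# Required tags: no default, and validation fails if a mandatory key is missing.
variable "tags" {
  description = "Tags applied to every resource. Must include 'environment' and 'team'."
  type        = map(string)

  validation {
    condition = alltrue([
      for key in ["environment", "team"] : contains(keys(var.tags), key)
    ])
    error_message = "The tags map must include the keys 'environment' and 'team'."
  }
}

# Outputs: export IDs and ARNs even if no consumer needs them yet.
# Assumes aws_security_group.this and aws_db_instance.this exist in this module.
output "security_group_id" {
  description = "ID of the security group attached to the database instance."
  value       = aws_security_group.this.id
}

output "instance_arn" {
  description = "ARN of the database instance."
  value       = aws_db_instance.this.arn
}
```

With the validation in place, a caller that passes tags without a team key gets the error at plan time instead of an untagged resource after apply.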
The ones that have caused us pain:
Too many inputs. A module with 60 input variables is configurable but unusable. Every variable is a decision the user has to make. Better to make decisions in the module and offer a small surface.
Weak defaults. "Pass in the security group ID, the IAM role ARN, the parameter group, the CloudWatch log group..." If the module makes the user provide all of these, why is it a module?
Hidden coupling. Module A creates an SG. Module B "knows" the SG ID will be a specific value. When Module A changes how it names things, Module B silently breaks. Every cross-module relationship should go through outputs and inputs explicitly, not via guessed values.
Backward-incompatible changes without major bumps. A "small change" in a module breaks every consumer. SemVer is a contract — break it and consumers stop trusting your module.
We have a (private) graveyard of modules that became too painful to maintain. Most fell into one of these patterns.
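For the hidden-coupling case, the fix is to make the relationship visible in the root module. A rough sketch, assuming two hypothetical modules called network and app with the local paths shown; none of these names come from our registry:

```hcl
# --- modules/network/outputs.tf (hypothetical) ---
# Module A publishes the ID it creates instead of relying on a naming scheme.
# Assumes aws_security_group.app is declared in this module.
output "app_security_group_id" {
  description = "ID of the security group created for application traffic."
  value       = aws_security_group.app.id
}

# --- modules/app/variables.tf (hypothetical) ---
# Module B takes the ID as an explicit input and never guesses it.
variable "security_group_id" {
  description = "Security group to attach to the application."
  type        = string
}

# --- root module ---
# The dependency between A and B is wired in one visible place.
module "network" {
  source = "./modules/network"
}

module "app" {
  source            = "./modules/app"
  security_group_id = module.network.app_security_group_id
}
```

If the network module later renames its security group, the output still resolves and the app module keeps working.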
Our module sources look like:
module "vpc" {
  source = "git::ssh://git@github.com/company/tf-modules.git//vpc?ref=v3.2.1"
  ...
}
The ?ref=v3.2.1 pins to a specific version. Some teams pin only to major (v3), accepting any minor/patch update; others pin fully.
We've moved toward fully pinned. Reason: even minor changes can have unintended effects. A dependency update in the module might bump the AWS provider version, which might recompute a resource hash, which might cause a Terraform plan to show a (cosmetic but confusing) diff.
Updates happen explicitly, in PRs, with the diff visible.
Two ways to build complex infrastructure:
Composition: small modules that compose. A "service" stack uses a vpc module, an eks-cluster module, an rds module, an alb module, etc.
Configuration: a single module that takes a big config object describing the whole stack.
We started with composition. The right answer is mostly composition, with service-level "facade" modules that compose lower-level modules behind a simpler interface for common cases.
Example: our standard-web-service module takes 6 inputs and internally creates an ECS service, ALB target group, security groups, IAM roles, CloudWatch log group, and DNS record. The user doesn't see the lower-level modules; they see one module.
This is a mistake we made twice: trying to make standard-web-service configurable enough to handle every use case. The module ballooned to 40+ inputs and most teams had to fork it for their non-standard needs. We simplified back to "this module handles 80% of cases; for the rest, compose the lower-level modules yourself."
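To make the facade idea concrete, here is a heavily simplified sketch of what such a module could look like. The six input names, the lower-level module paths, and their inputs and outputs are all assumptions for illustration; the real standard-web-service module is not reproduced here.

```hcl
# modules/standard-web-service/main.tf (hypothetical layout and names)

variable "name" {
  description = "Service name, used to name every resource."
  type        = string
}

variable "vpc_id" {
  description = "VPC the service runs in."
  type        = string
}

variable "subnet_ids" {
  description = "Subnets for the service tasks."
  type        = list(string)
}

variable "container_image" {
  description = "Container image to deploy."
  type        = string
}

variable "container_port" {
  description = "Port the container listens on."
  type        = number
}

variable "tags" {
  description = "Tags applied to every resource."
  type        = map(string)
}

# The facade only wires lower-level modules together; it adds no new knobs.
module "security_groups" {
  source = "../security-groups" # hypothetical lower-level module
  name   = var.name
  vpc_id = var.vpc_id
  port   = var.container_port
  tags   = var.tags
}

module "ecs_service" {
  source             = "../ecs-service" # hypothetical lower-level module
  name               = var.name
  image              = var.container_image
  port               = var.container_port
  subnet_ids         = var.subnet_ids
  security_group_ids = [module.security_groups.service_security_group_id]
  tags               = var.tags
}
```

A team with non-standard needs skips the facade and calls the lower-level modules directly, which keeps the facade's interface small.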
A common confusion: modules don't define state boundaries. State boundaries are defined by where you run terraform apply. A single state file can contain resources from many modules.
This means moving a resource between modules within the same state is straightforward: the state file stays the same and only the resource address changes, which you record with terraform state mv or a moved block. Moving a resource between states is harder (terraform state mv with -state-out).
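For the same-state case, a moved block (available since Terraform 1.1) records the address change in code so the plan shows a move rather than a destroy and create. The addresses below are hypothetical: a security group that used to sit in the root module and now lives inside a module call named rds.

```hcl
# Before: declared directly in the root module as
#   resource "aws_security_group" "db" { ... }
# After: declared inside the module called "rds" as aws_security_group.this.
moved {
  from = aws_security_group.db
  to   = module.rds.aws_security_group.this
}
```

The CLI equivalent is terraform state mv with the same two addresses; the block has the advantage of going through code review with the rest of the change.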
Our practice: define state files at the "root module" level. A root module is a small piece of Terraform that calls reusable modules. Each root module has its own state file. We have 40+ root modules and 25 reusable modules.
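A root module in this sense can be very small. The sketch below assumes an S3 backend and uses a made-up bucket, key, path, and second module; only the vpc source string is taken from the example earlier in this post.

```hcl
# live/prod/orders-service/main.tf (hypothetical path)

terraform {
  backend "s3" {
    bucket = "example-terraform-state"               # hypothetical bucket
    key    = "prod/orders-service/terraform.tfstate" # one key per root module
    region = "us-east-1"
  }
}

# Both module calls below land in the same state file, because the state
# boundary is this root module, not the reusable modules it calls.
module "vpc" {
  source = "git::ssh://git@github.com/company/tf-modules.git//vpc?ref=v3.2.1"
  # inputs omitted for brevity
}

module "service" {
  source = "./modules/example-service" # hypothetical module
  # inputs omitted for brevity
}
```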
Testing is the hardest part of module hygiene. Options:
Terratest: Go-based testing framework. Spins up real AWS resources, asserts on state. We have Terratest for our most-used 5 modules. Tests take 10-30 minutes; we run them on PRs that touch the module, not on every commit.
terraform validate + terraform plan: catches syntax errors and plan-failures. Runs in seconds. Every PR runs this.
Linting via tflint and tfsec: catches common antipatterns and security issues. Required in CI.
Manual testing: spinning up a new instance of the module in a sandbox account before a release. We do this for major version bumps.
We don't aim for high test coverage. We aim for: every module has at least one passing Terratest run, validating it can be instantiated successfully. That alone catches 80% of module bugs.
count and for_each trap
When a module supports "create N of these," for_each is much safer than count. With count, removing item N from the middle of the list re-indexes everything after, causing Terraform to plan to destroy and recreate. With for_each, items are keyed by name; removal affects only that one.
Real example: a module created 5 IAM roles via count. We removed role index 2. Terraform planned to destroy and recreate roles 3 and 4 (because their indices shifted). Caught in plan; never applied. We rewrote the module to use for_each keyed by role name.
In modules, we now use for_each exclusively for collections. count = var.create ? 1 : 0 is fine for "should this resource exist" booleans.
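Both patterns side by side, as a minimal sketch. The variable names, the trust policy, and the log group are illustrative assumptions, not the module from the incident above.

```hcl
variable "role_names" {
  description = "Names of the IAM roles to create (hypothetical input)."
  type        = set(string)
}

variable "create_log_group" {
  description = "Whether to create the log group (hypothetical input)."
  type        = bool
  default     = true
}

# Each role is addressed by name, e.g. aws_iam_role.this["reader"], so
# removing one name from the set only affects that role.
resource "aws_iam_role" "this" {
  for_each = var.role_names

  name = each.key
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Action    = "sts:AssumeRole"
      Principal = { Service = "lambda.amazonaws.com" } # assumed trust policy
    }]
  })
}

# count is still fine as an on/off switch for a single resource.
resource "aws_cloudwatch_log_group" "this" {
  count = var.create_log_group ? 1 : 0

  name              = "/example/app" # hypothetical log group name
  retention_in_days = 30
}
```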
Modules should NOT include their own provider blocks. The convention:
The module declares required_providers and the root passes specific provider instances.
We had a module that included its own provider "aws" block. When we tried to use it with a non-default profile, we couldn't override. Removing the in-module provider was a breaking change for everyone.
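A sketch of that convention, with the module side and the root side shown together. The version constraint, alias, region, profile, and module path are placeholder assumptions.

```hcl
# --- inside the reusable module (e.g. versions.tf) ---
# The module states which provider it needs, but configures nothing.
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 5.0" # assumed constraint
    }
  }
}

# --- in the root module ---
# The root owns provider configuration, including non-default profiles.
provider "aws" {
  alias   = "prod"
  region  = "us-east-1"
  profile = "prod" # hypothetical profile name
}

module "vpc" {
  source = "./modules/vpc" # hypothetical path

  # Hand the module a specific provider instance.
  providers = {
    aws = aws.prod
  }
}
```

If the providers argument is omitted, the module inherits the root's default (unaliased) aws provider.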
Three over-modularization mistakes we keep making:
Modules for single resources. "I'll wrap aws_s3_bucket in our module to enforce our defaults." Often, the module wrapper adds little; a locals block in the consumer's code with the same defaults is simpler.
Modules to enforce conventions. "We'll write a module so everyone names things consistently." A linter or policy (Sentinel, Checkov, OPA) is a better tool — it can enforce conventions across all Terraform, not just users of the module.
Premature abstraction. A module written for a hypothetical second use case. Then the second use case never materializes, and the module has features no one uses, complicating the only real use case.
The pattern: we write the module on use #3, not on use #1. Two copy-pastes is fine. Three is the trigger.
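For the single-resource case, the locals alternative might look like the following. The tag values and bucket name are made up for illustration.

```hcl
locals {
  # Defaults we would otherwise have baked into a wrapper module.
  default_tags = {
    environment = "prod"     # illustrative value
    team        = "platform" # illustrative value
  }
}

# The resource stays a plain aws_s3_bucket, so every provider argument
# remains available without the wrapper having to pass it through.
resource "aws_s3_bucket" "logs" {
  bucket        = "example-company-logs" # hypothetical bucket name
  force_destroy = false
  tags          = merge(local.default_tags, { purpose = "logs" })
}
```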
Every module has:
README.md (auto-generated by terraform-docs from variable descriptions)
EXAMPLE.tf showing the most common usage
CHANGELOG.md with version history
We don't write the README; terraform-docs writes it. The example we maintain manually. The changelog gets a line per release with the SemVer change reason.
Things we don't bother with: a "design doc" per module, "future improvements" sections, ASCII art. Just enough to use the module.
Don't write modules until use #3. Two uses is a copy-paste; three is a module. Premature modules calcify around incorrect assumptions.
Strong defaults, small interfaces. The module's job is to encapsulate decisions. If you defer all decisions to the user, you've made a thin wrapper, not a module.
SemVer and pin to specific versions. The "always update to latest" strategy seems convenient and is a debugging nightmare.
for_each, not count, for collections. Your future self will thank you.
No provider blocks inside modules. Use the implicit provider from the caller.
Tags as a required input. Every resource gets tagged correctly without thinking.
Modules are leverage. Done well, one engineer's careful work compounds across many teams. Done poorly, they're an obligation everyone has to work around. The patterns above are the ones we keep coming back to — they're not exotic, they're just consistent.