A practical Linux hardening checklist for production hosts. The settings that earn their place via real production reasons, not the cargo-cult version.
Most Linux hardening guides are exhaustive lists of settings, each labeled "important" with no prioritization. After years of running production Linux and surviving one audit, this is our working version: the settings that earn their place because they prevent specific threats, along with the production reasons we tightened each.
Most Linux compromises follow a pattern:
Hardening makes each step harder. We don't pretend to make compromise impossible; we make the bad actor's job materially harder and we make their actions visible to detection.
SSH is usually the front door. Hardening it:
# /etc/ssh/sshd_config
PasswordAuthentication no
PermitRootLogin no
PubkeyAuthentication yes
AuthenticationMethods publickey
ChallengeResponseAuthentication no
UsePAM yes
ClientAliveInterval 300
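# On OpenSSH 8.2+, 0 disables connection termination; use 1 or higher there to drop unresponsive sessions
ClientAliveCountMax 0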
MaxAuthTries 3
LoginGraceTime 30
AllowUsers admin operations
Specific reasons:
AllowUsers: explicit allowlist. Users not in the list can't SSH in regardless of what's in /etc/passwd.
MaxAuthTries 3: limit attempts per connection. Combined with fail2ban (below), brute force becomes impractical.
For our cloud production hosts, we've replaced SSH entirely with AWS SSM Session Manager. No SSH port open, no SSH keys to manage, full audit log of every session. SSH stays available for break-glass via a separate bastion.
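To confirm that a host actually runs with the settings above, one option is to compare the effective configuration printed by `sshd -T` against the expected values. The sketch below is an illustration, not our production tooling: the function name `check_sshd_settings` and the choice of four keys are assumptions, and it relies on `sshd -T` emitting lowercase `key value` lines.

```shell
#!/bin/sh
# check_sshd_settings (hypothetical helper)
# stdin:  "key value" lines, e.g. the output of `sshd -T`
# stdout: one line per mismatching or missing key
# status: 0 if every expected key matches, 1 otherwise
check_sshd_settings() {
    awk '
        BEGIN {
            # Expected values, taken from the sshd_config shown earlier
            want["passwordauthentication"] = "no"
            want["permitrootlogin"]        = "no"
            want["pubkeyauthentication"]   = "yes"
            want["maxauthtries"]           = "3"
        }
        ($1 in want) {
            seen[$1] = 1
            if ($2 != want[$1]) {
                printf "MISMATCH %s: got %s, want %s\n", $1, $2, want[$1]
                bad = 1
            }
        }
        END {
            # A key that never appeared is reported too
            for (k in want) if (!(k in seen)) { printf "MISSING %s\n", k; bad = 1 }
            exit (bad ? 1 : 0)
        }
    '
}
```

Typical use would be `sudo sshd -T | check_sshd_settings`, run from configuration management or a periodic job.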
fail2ban watches logs and blocks IPs that exceed failure thresholds:
# /etc/fail2ban/jail.local
[sshd]
enabled = true
maxretry = 3
findtime = 10m
bantime = 1h
Three failed SSH attempts in 10 minutes means a 1-hour ban. Repeat offenders get longer, escalating bans; that behavior needs fail2ban's bantime.increment option enabled, which the snippet above omits.
For internet-facing SSH, fail2ban is essential. For SSM-only servers, it's irrelevant (no SSH to attack). Match the tool to the threat.
Default sudo: any sudoer can run anything as root. Better:
Grant specific commands instead of a blanket ALL=(ALL) ALL.
NOPASSWD: only for specific safe commands (like systemctl status).
# /etc/sudoers.d/operations
%operations ALL=(ALL) /usr/bin/systemctl restart myservice, /usr/bin/journalctl
%operations ALL=(ALL) NOPASSWD: /usr/bin/systemctl status
The operations group can restart services and read journals. They can't cat /etc/shadow. The blast radius of a compromised operations account is bounded.
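A quick way to spot rules that undo this bounding is to scan sudoers content for entries whose command list is just `ALL`. The sketch below is a rough lint under stated assumptions: `find_broad_sudo_rules` is a hypothetical name, the regular expression only covers the common one-line rule shapes, and it does not expand aliases or follow include directives.

```shell
#!/bin/sh
# find_broad_sudo_rules (hypothetical helper)
# stdin:  sudoers content
# stdout: "line_number:rule" for every rule whose command list is ALL
# status: 1 if any such rule was found, 0 otherwise
find_broad_sudo_rules() {
    # Match non-comment lines that end in ALL right after the run-as spec
    # or after tags such as NOPASSWD:
    if grep -En '^[^#].*[)=][[:space:]]*([A-Z_]+:[[:space:]]*)*ALL[[:space:]]*$'; then
        return 1
    fi
    return 0
}
```

For example, `cat /etc/sudoers.d/* | sudo sh -c '...'` style wrappers aside, the simplest use is `sudo cat /etc/sudoers.d/operations | find_broad_sudo_rules`. Syntax itself can be checked separately with `visudo -c -f <file>` before a drop-in is installed.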
Hardening permissions:
/etc/shadow: 0600, root:root (default; verify)
/etc/sudoers.d/*: 0440, root:root
/var/log/: 0750, root:adm
Specific things we check:
World-writable files outside /tmp and /var/tmp (find offenders with find / -perm -o+w -type f)
Setuid binaries beyond the expected set (sudo, passwd)
We have a script that checks these on every host quarterly.
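A minimal sketch of what such a check could look like follows. It is not our actual script: the function names `check_mode` and `list_world_writable` are hypothetical, `stat -c` assumes GNU coreutils, and ownership checks are left out for brevity.

```shell
#!/bin/sh
# check_mode FILE EXPECTED_MODE (hypothetical helper)
# Compares a file's octal mode with the expected value.
# stat prints modes without the leading zero, so pass 600 rather than 0600.
check_mode() {
    actual=$(stat -c '%a' "$1") || return 2
    if [ "$actual" = "$2" ]; then
        echo "OK   $1 ($actual)"
    else
        echo "FAIL $1 (mode $actual, want $2)"
        return 1
    fi
}

# list_world_writable DIR (hypothetical helper)
# Prints regular files under DIR that any user can write to,
# without crossing into other filesystems.
list_world_writable() {
    find "$1" -xdev -type f -perm -o+w
}
```

A quarterly job could call `check_mode /etc/shadow 600` and `list_world_writable /etc`, then alert on any FAIL line or any file listed.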
A few sysctl settings worth tightening:
# /etc/sysctl.d/99-security.conf
# Disable IP source routing
net.ipv4.conf.all.accept_source_route = 0
net.ipv6.conf.all.accept_source_route = 0
# Don't accept ICMP redirects
net.ipv4.conf.all.accept_redirects = 0
net.ipv6.conf.all.accept_redirects = 0
# Log martian packets
net.ipv4.conf.all.log_martians = 1
# Disable IP forwarding (unless needed for routing)
net.ipv4.ip_forward = 0
# TCP SYN cookie protection
net.ipv4.tcp_syncookies = 1
# Disable core dumps from setuid programs (info disclosure)
fs.suid_dumpable = 0
# Restrict ptrace access
kernel.yama.ptrace_scope = 1
# Hide kernel pointers and restrict dmesg to privileged users
kernel.kptr_restrict = 2
kernel.dmesg_restrict = 1
Each of these closes a class of attack:
These don't stop a determined attacker but they raise the bar.
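Settings in a sysctl.d file only matter if the running kernel agrees with them, so it can help to compare the file against live values. The sketch below reads each `key = value` line and checks the matching file under /proc/sys. The function name `check_sysctl_conf` and the optional second argument (an alternate root, handy for dry runs) are assumptions, and keys containing dots inside an interface name are not handled.

```shell
#!/bin/sh
# check_sysctl_conf CONF_FILE [ROOT] (hypothetical helper)
# Compares "key = value" lines in CONF_FILE with the values under ROOT
# (default /proc/sys). Prints DRIFT for mismatches and SKIP for keys the
# kernel does not expose. Returns 1 if any drift was found.
check_sysctl_conf() {
    conf=$1
    root=${2:-/proc/sys}
    status=0
    while IFS='=' read -r key value; do
        key=$(printf '%s' "$key" | tr -d ' \t')
        value=$(printf '%s' "$value" | tr -d ' \t')
        [ -z "$key" ] && continue
        # net.ipv4.ip_forward lives at net/ipv4/ip_forward
        path="$root/$(printf '%s' "$key" | tr '.' '/')"
        if [ ! -r "$path" ]; then
            echo "SKIP $key (not present)"
        elif [ "$(cat "$path")" != "$value" ]; then
            echo "DRIFT $key: running $(cat "$path"), want $value"
            status=1
        fi
    done <<EOF
$(grep -Ev '^[[:space:]]*(#|;|$)' "$conf")
EOF
    return $status
}
```

On a host, `check_sysctl_conf /etc/sysctl.d/99-security.conf` would report any setting that was changed at runtime or never applied.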
Audit what's listening:
ss -tlnp
Output lists every TCP listener. Each one is an attack surface. We periodically review:
Common findings:
Each exposure is either justified, restricted (firewall rule), or removed.
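The review can be partly automated by comparing listeners against an allowlist of expected ports. The sketch below parses `ss -tlnH` output (the `-H` flag drops the header row). The function name `unexpected_listeners` and the space-separated allowlist format are assumptions made for illustration.

```shell
#!/bin/sh
# unexpected_listeners "PORT PORT ..." (hypothetical helper)
# stdin:  output of `ss -tlnH`
# stdout: "port local_address" for every listener whose port is not allowed
unexpected_listeners() {
    awk -v allowed="$1" '
        BEGIN {
            n = split(allowed, a, " ")
            for (i = 1; i <= n; i++) ok[a[i]] = 1
        }
        $1 == "LISTEN" {
            # The port is whatever follows the last colon, which also
            # works for IPv6 addresses such as [::]:22
            m = split($4, parts, ":")
            port = parts[m]
            if (!(port in ok)) print port, $4
        }
    '
}
```

For a host that should expose only SSH and HTTPS, `ss -tlnH | unexpected_listeners "22 443"` prints nothing when the host is clean.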
Default-deny inbound, allow specific:
# Simplified nftables example (assumes the inet filter table and input chain already exist);
# in practice we use AWS Security Groups plus host-level rules
nft add rule inet filter input ct state established,related accept
nft add rule inet filter input ip protocol icmp accept
nft add rule inet filter input tcp dport 22 ip saddr 10.0.0.0/8 accept
nft add rule inet filter input drop
For cloud hosts: AWS Security Groups do the bulk. Host-level firewall is defense-in-depth.
For on-prem / standalone hosts: ufw or nftables is the primary firewall. Configure once via Ansible; review annually.
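The same default-deny policy can also be expressed as one ruleset file, which is easier to template from configuration management than individual `nft add rule` calls. The sketch below prints such a ruleset. Several details are assumptions rather than our actual config: the function name `render_input_ruleset`, the loopback accept rule, and the use of `policy drop` on the chain in place of a trailing drop rule.

```shell
#!/bin/sh
# render_input_ruleset SSH_CIDR (hypothetical helper)
# Prints an nftables ruleset that drops inbound traffic by default and
# allows loopback, established flows, ICMP, and SSH from SSH_CIDR.
render_input_ruleset() {
    ssh_cidr=$1
    cat <<EOF
table inet filter {
    chain input {
        type filter hook input priority 0; policy drop;
        iif "lo" accept
        ct state established,related accept
        ip protocol icmp accept
        ip saddr $ssh_cidr tcp dport 22 accept
    }
}
EOF
}
```

One way to use it is `render_input_ruleset 10.0.0.0/8 | sudo nft -c -f -` to check the syntax first, then the same command without `-c` to load it. Test from a second session before closing the first, since a mistake in the SSH rule locks you out.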
Auditd logs syscall-level events:
Access to sensitive files (/etc/shadow, AWS credentials)
Sample rules:
-w /etc/shadow -p wa -k shadow_access
-w /etc/sudoers.d/ -p wa -k sudo_changes
-a always,exit -F arch=b64 -S execve -F path=/usr/bin/curl -F auid>=1000 -k user_curl
Logs go to a central audit log collector. Anomalous events alert.
We use auditd selectively (heavy logging hurts performance). For containerized environments, we replaced most of this with eBPF-based tools (Falco, Tetragon) which have similar visibility with less overhead.
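The `-k` keys in the sample rules are what make audit logs searchable later. As a rough illustration, the sketch below tallies events per key from raw audit log lines, which is one way to see which rules are noisy before shipping everything to the collector. The function name `count_audit_keys` is hypothetical, and it assumes keys appear as `key="..."` in the records.

```shell
#!/bin/sh
# count_audit_keys (hypothetical helper)
# stdin:  raw audit log lines
# stdout: "count key" per rule key, busiest first
count_audit_keys() {
    awk '
        match($0, /key="[^"]+"/) {
            # Strip the leading key=" and the trailing quote
            k = substr($0, RSTART + 5, RLENGTH - 6)
            count[k]++
        }
        END { for (k in count) print count[k], k }
    ' | sort -k1,1nr -k2,2
}
```

On a host this would be run as `sudo cat /var/log/audit/audit.log | count_audit_keys`. For investigating a single key, `ausearch -k shadow_access -i` gives the full interpreted records.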
For server hosts, mandatory access control (SELinux on RHEL family, AppArmor on Debian/Ubuntu) is helpful but real work to maintain.
Our policy:
Operating SELinux well is a project; many teams opt out. We've found the default policies catch real issues occasionally without much tuning.
Unpatched systems are the most common compromise vector. Discipline:
Without this, hosts accumulate unpatched CVEs. Industry incident reports include hosts left unpatched for 18 months and carrying hundreds of high-severity CVEs. Don't be that team.
Hardening reduces the chance of compromise; it doesn't make compromise impossible. Detection catches the cases that get through:
A successful hardening + detection setup means: even if an attacker gets in, we know quickly and can respond before damage spreads.
A few things from older hardening guides we don't apply:
Disabling ICMP entirely. Breaks legitimate use cases (path MTU discovery). Limit, don't disable.
Removing all unused user accounts manually. Cloud images don't have many unnecessary accounts. Reviewing once at AMI build is enough.
Custom kernel builds. Mainline kernels with vendor patches are fine. Maintaining a custom kernel is a recipe for missing security updates.
Hand-curated AppArmor profiles per service. The default profiles work; per-service customization is a maintenance burden that doesn't add proportional security.
Manually disabling TLS protocols older than TLS 1.2. Current OpenSSL versions already do this by default. Don't pre-emptively block 1.3.
Replace SSH with SSM Session Manager if you're on AWS. Removes the SSH attack surface entirely.
Use cloud Security Groups + host firewall. Both, not either.
Patches are the most important thing. Most compromises target known CVEs. Stay current.
Audit what's listening on every host. Each open port is an attack surface.
Centralize logs and alert on anomalies. Detection is as important as prevention.
Cloud images / golden AMIs. Bake in the hardening; new hosts are hardened from the start.
Don't over-tighten. Restrictive settings can break legitimate workflows. Tighten until something breaks; back off carefully.
Linux hardening is a discipline, not a one-time project. The setup runs in the background; the discipline is in patching, monitoring, and reviewing periodically. Most of the wins come from a small number of high-leverage changes (SSH hardening, patching, default-deny firewalls, audit logging). The rest is incremental and useful but lower-priority. Get the fundamentals right and most threats become much harder.