We migrated 47 cron jobs to systemd timers across roughly 200 hosts. The mechanical conversion was a few weeks of Ansible work. The interesting part was what the migration forced us to confront: the bugs cron had been quietly hiding.
Three concrete reasons, in order of how often they bit us: silent failures, racing duplicate jobs, and timezone drift. More on each below.

The baseline problem was visibility. MAILTO= was empty on most boxes, so failing jobs were invisible until something downstream broke, and grep CRON /var/log/syslog is not a monitoring strategy.

Every cron job became two systemd units: a .service (what to run) and a .timer (when to run it). It feels heavier at first; it's worth it.
# /etc/systemd/system/db-backup.service
[Unit]
Description=Nightly database backup to S3
After=network-online.target
Wants=network-online.target
[Service]
Type=oneshot
User=backup
Group=backup
EnvironmentFile=/etc/db-backup.env
ExecStart=/usr/local/bin/db-backup.sh
StandardOutput=journal
StandardError=journal
SyslogIdentifier=db-backup
TimeoutStartSec=2h
# /etc/systemd/system/db-backup.timer
[Unit]
Description=Run db-backup nightly at 02:30
[Timer]
OnCalendar=*-*-* 02:30:00
Persistent=true
RandomizedDelaySec=15min
AccuracySec=1min
Unit=db-backup.service
[Install]
WantedBy=timers.target
Enable and start:
systemctl daemon-reload
systemctl enable --now db-backup.timer
systemctl list-timers --all
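A habit we picked up along the way: systemd-analyze verify catches typos and dangling references in unit files before the daemon-reload, something cron never offered:

```
$ systemd-analyze verify /etc/systemd/system/db-backup.service /etc/systemd/system/db-backup.timer
```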
Reason one: silent failures. A log-rotation job exited 1 every night because of a permission issue introduced during a hardening sweep. MAILTO was unset on that host, so nothing got mailed and the job just kept failing. Disk usage crept up, but nobody noticed because the alert threshold was 90% and we were at 84%.
The systemd version surfaced this immediately:
$ systemctl status log-rotate.service
● log-rotate.service - Log rotation
Active: failed (Result: exit-code) since Mon 2026-03-09 03:00:01
Plus we wired OnFailure=alert@%n.service into every service to ship each failed unit to PagerDuty.
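The alert hook is a templated unit. A sketch, where notify-pagerduty.sh is a hypothetical wrapper around whatever paging API you use; the instance name after the @ carries the failed unit's name:

```
# /etc/systemd/system/alert@.service
[Unit]
Description=Failure alert for %i

[Service]
Type=oneshot
# %i expands to the instance name (the failed unit), %H to the hostname
ExecStart=/usr/local/bin/notify-pagerduty.sh "unit %i failed on %H"
```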
Reason two: racing duplicates. Two crons were rotating the same log via different scripts that nobody knew about. They had been racing for years; the loser silently corrupted partial .gz files about once a week. We caught it during the migration, when writing an explicit unit for every job put the two scripts side by side and made the duplication impossible to miss.
Reason three: timezone drift. One box's cron ran in the user's local timezone (CDT) while everything else ran in UTC. Note that OnCalendar= is interpreted in the host's local timezone by default; to pin a spec you append an explicit zone, as in OnCalendar=*-*-* 02:30:00 UTC. Migration forced us to canonicalize: every timer line now ends in UTC, removing the ambiguity.
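Canonicalizing the schedules themselves was mostly mechanical. A minimal sketch of the kind of helper we used during the conversion (hypothetical; it only handles plain M H * * * entries and assumes fields without leading zeros):

```shell
#!/usr/bin/env bash
# Turn the minute/hour fields of a simple daily cron entry into an
# OnCalendar spec pinned to UTC; anything fancier got converted by hand.
cron_to_oncalendar() {
  local min hour rest
  read -r min hour rest <<<"$1"
  printf '*-*-* %02d:%02d:00 UTC\n' "$hour" "$min"
}

cron_to_oncalendar "30 2 * * *"   # -> *-*-* 02:30:00 UTC
```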
The [Timer] block we standardized on for periodic jobs:
[Timer]
OnCalendar=hourly
Persistent=true # run on next boot if we missed it while down
RandomizedDelaySec=10min # spread load across the fleet
AccuracySec=1min # default 1min is fine; tighten only if you need it
Persistent=true was the most underrated win. If the host was off for maintenance, the missed run fires once on boot. Cron has nothing like this without anacron, which is itself a third file to manage.
Retries were another win. For jobs that are safe to re-run:
[Service]
Type=oneshot
ExecStart=/usr/local/bin/sync-secrets.sh
Restart=on-failure
RestartSec=30s
# But: don't restart forever
StartLimitIntervalSec=10min
StartLimitBurst=3
If the script fails, systemd retries up to 3 times in 10 minutes, then gives up and leaves the unit in failed state for the alert hook.
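Once the underlying problem is fixed, the failed state has to be cleared (by hand or by remediation tooling) so the unit's history stays meaningful:

```
$ systemctl reset-failed sync-secrets.service
```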
Resource controls are something cron has no equivalent of at all. We add them to every job that touches CPU or memory:
[Service]
CPUQuota=50%
MemoryMax=512M
TasksMax=128
IOWeight=50
A backup job that used to spike load to 12 now stays under 4.
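We roll the limits out as drop-ins so they can be managed separately from the unit file itself; a sketch (values as above, path conventional):

```
# /etc/systemd/system/db-backup.service.d/limits.conf
[Service]
CPUQuota=50%
MemoryMax=512M
```

A systemctl daemon-reload is needed after adding or changing a drop-in.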
Chaining was the last piece: a verify job that must run only after the backup. Schedule a timer for the verify unit; starting it pulls the backup in first.

# /etc/systemd/system/db-backup-verify.service
[Unit]
Description=Verify last night's database backup
Requires=db-backup.service
After=db-backup.service

[Service]
Type=oneshot
# hypothetical verify script; substitute your own
ExecStart=/usr/local/bin/db-backup-verify.sh

For chains we prefer Requires= + After= + Type=oneshot over chaining timers. Easier to reason about.
Honest downsides: crontab -e is one command, while the systemd equivalent is vim /etc/systemd/system/foo.timer && systemctl daemon-reload. We hide this in Ansible. And crontab -l is universal, while systemctl list-timers is still unfamiliar to some sysadmins.

The Ansible glue:

- name: Install systemd timer + service
  copy:
    src: "{{ item }}"
    dest: "/etc/systemd/system/{{ item | basename }}"
    mode: "0644"
  loop:
    - "files/{{ job_name }}.service"
    - "files/{{ job_name }}.timer"
  notify:
    - daemon reload
    - enable timer

handlers:
  - name: daemon reload
    systemd: { daemon_reload: yes }
  - name: enable timer
    systemd:
      name: "{{ job_name }}.timer"
      enabled: yes
      state: started
A quick one-off job on a personal box? Keep cron. For everything else — production hosts, a fleet at any scale, anything you'd want to debug at 3am — systemd timers pay for themselves the first time something fails.
The checklist we now apply to every job:
- StandardOutput=journal and a SyslogIdentifier= so journalctl -u name -f works.
- Persistent=true for periodic jobs unless you have a specific reason not to.
- Resource limits (CPUQuota, MemoryMax). Future-you will thank present-you.
- OnFailure= wired to your alerting stack. Silent failures are the only failures that matter.
- UTC in OnCalendar=.