We migrated most of our scheduled jobs from cron to systemd timers over the past year. Not because cron is broken — it works fine for what it does — but because systemd timers fit the rest of our operational stack better. This post is what we kept, what we abandoned, and a few cases where cron is still the right answer.
Three real pain points:
Logging is awkward. Cron's default is to email job output to a local mailbox nobody reads. We piped stdout/stderr to log files manually; rotation was a separate concern; failures could go unnoticed if you didn't actively check. Years of accreted hacks across hosts.
No dependency awareness. "Run this job only after the database has come up" isn't a thing in cron. Boot-time cron jobs running before their dependencies were a recurring source of "why did the 3:01 AM job fail last night?"
No retry or backoff. A job that fails just fails. Nobody knows. The same job runs again at the next scheduled time, possibly compounding the problem.
We worked around all three. The workarounds added up to "we should probably move to something that handles these properly."
A few clear wins:
Logs go to journald. Same place service logs go. journalctl -u my-backup.service shows everything; cross-correlates with other system logs in time order. The unified view is the single biggest day-to-day improvement.
Dependency-aware scheduling. A timer's associated service can declare After=postgresql.service (or any other unit). After= only orders start-up; pair it with Requires= or Wants= if the dependency should also be pulled in. The job then won't run until its dependencies are up. Eliminates a class of boot-order bugs.
Real retry policies. The service can set Restart=on-failure and RestartSec, with StartLimitIntervalSec and StartLimitBurst in the [Unit] section: bounded retries, just like any other systemd-managed service.
Calendar-style scheduling expressions. Easier to read than cron's five-column syntax. OnCalendar=Mon..Fri 02:00 instead of 0 2 * * 1-5. Same flexibility; better readability.
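OnBootSec and OnUnitActiveSec. Relative-to-boot or relative-to-last-run scheduling. Useful for "run 5 minutes after the system comes up" or "run 30 minutes after the previous run started." OnUnitActiveSec counts from when the service was last activated; to count from when the previous run finished, use OnUnitInactiveSec instead.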
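A rough sketch of how the dependency, retry, and relative-schedule settings above fit together. The report-sync name and script path are hypothetical, and the numbers are placeholders, not values we run in production. First the service, report-sync.service:

```ini
[Unit]
Description=Sync reports (illustrative)
# After= only orders start-up; Requires= also pulls the dependency in.
Requires=postgresql.service
After=postgresql.service
# Rate limit: at most 3 start attempts per hour, then further starts are refused.
StartLimitIntervalSec=1h
StartLimitBurst=3

[Service]
Type=oneshot
# Hypothetical script path.
ExecStart=/usr/local/bin/report-sync.sh
# Retry on a non-zero exit, waiting 5 minutes between attempts.
# Assumption: a reasonably recent systemd; older releases reject Restart= on oneshot services.
Restart=on-failure
RestartSec=5min
```

Then the matching report-sync.timer:

```ini
[Unit]
Description=Run report-sync after boot, then every 30 minutes

[Timer]
# First run 5 minutes after boot.
OnBootSec=5min
# Later runs 30 minutes after the service was last started.
OnUnitActiveSec=30min

[Install]
WantedBy=timers.target
```

Note that Restart=always and Restart=on-success are not allowed with Type=oneshot, so on-failure is the usual choice for a job that is supposed to exit.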
The two-file pattern that took some getting used to:
/etc/systemd/system/db-backup.service:
[Unit]
Description=Database backup
After=postgresql.service
[Service]
Type=oneshot
User=postgres
ExecStart=/usr/local/bin/db-backup.sh
StandardOutput=journal
StandardError=journal
TimeoutStartSec=2h
/etc/systemd/system/db-backup.timer:
[Unit]
Description=Run db-backup daily
[Timer]
OnCalendar=daily
Persistent=true
[Install]
WantedBy=timers.target
Reload, then enable + start: systemctl daemon-reload, followed by systemctl enable --now db-backup.timer. The timer triggers the service at the scheduled time.
Persistent=true is the one to remember: if the machine was off when the job should have run, the job runs as soon as the machine comes back up. Plain cron simply skips a missed run; only anacron-style setups catch up. systemd timers need the explicit setting.
The two-file thing felt fussy at first. After a couple dozen of them, it's just a habit. The split makes sense: the service describes what to run; the timer describes when.
Three categories stayed on cron:
User-level scheduled tasks. A handful of admin scripts that run as a specific user, not root. Could be done with user-level systemd timers (yes, those exist) but the overhead of explaining "you have to enable lingering for your user account" was higher than just having a user crontab.
Very simple periodic things. "Once an hour, curl this endpoint to keep a session alive." A cron line is 5 seconds to write. A systemd timer + service pair is 30 seconds. For trivial periodic tasks, cron is still faster.
Things on shared servers we don't fully own. Some of our infrastructure runs on machines a partner team controls. We don't push our preferred patterns onto them; their crontab works fine for the few jobs we have there.
The principle: systemd timers are the default for new scheduled work on machines we control. Cron sticks for cases where the migration cost exceeds the benefit.
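For comparison, the user-level variant we decided against looks roughly like this. A sketch only: the weekly-report name, the script path, and the schedule are hypothetical. The service goes in ~/.config/systemd/user/weekly-report.service:

```ini
[Unit]
Description=Weekly admin report (illustrative)

[Service]
Type=oneshot
# No User= here: user units already run as the owning user.
# %h expands to that user's home directory.
ExecStart=%h/bin/weekly-report.sh
```

And the timer in ~/.config/systemd/user/weekly-report.timer:

```ini
[Unit]
Description=Run weekly-report every Monday morning

[Timer]
OnCalendar=Mon 06:00
Persistent=true

[Install]
WantedBy=timers.target
```

The user enables it with systemctl --user enable --now weekly-report.timer, and someone has to run loginctl enable-linger for that account so the timer fires when nobody is logged in. That last step is the part we found harder to explain than a crontab.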
A few things that bit us during the migration:
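Type=oneshot is the right type for these. With the default Type=simple, systemd treats the service as started the moment the process is spawned, so units ordered After= it proceed without waiting for the script to finish, and TimeoutStartSec no longer bounds the run. With Type=oneshot, the unit counts as started only once the script has run to completion and exited.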
OnCalendar is local time by default. A timer set to 02:00 runs at the host's local time. We standardize on UTC by setting OnCalendar=*-*-* 02:00:00 UTC. Otherwise daylight-saving transitions can cause two runs or zero runs on the transition day.
Persistent=true semantics. It runs once after the missed window, not once for each missed run. If a daily job hasn't run for 5 days, it runs once when the host comes up, not 5 times. Usually what you want; occasionally not.
Listing timers. systemctl list-timers shows every active timer with its next and previous run times. Way better than parsing cron's syntax to figure out when something will run.
Time zones and OnCalendar syntax. The syntax is weekday year-month-day hour:minute:second timezone. Verbose but unambiguous. systemd-analyze calendar --iterations=5 "Mon..Fri 02:00" shows the normalized form and the next 5 occurrences (without --iterations it prints only the next one), which is useful for sanity-checking your expression.
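Putting the UTC and Persistent=true points together, a timer section might look like the sketch below. The weekday range is just an example schedule, not one of our real jobs.

```ini
[Timer]
# Full form: weekday year-month-day hour:minute:second timezone.
# Weekdays at 02:00 UTC, independent of the host's local time zone.
OnCalendar=Mon..Fri *-*-* 02:00:00 UTC
# After downtime, run once on the next start rather than once per missed run.
Persistent=true
```

Passing the same expression to systemd-analyze calendar before deploying confirms that it parses and shows when it will next elapse.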
We didn't do a big-bang switch. Jobs moved to timers when we touched them, not as part of a dedicated migration project.
After 12 months, ~70% of scheduled work is on timers. The remaining 30% is mostly the user-level and shared-server cases above. Probably won't migrate further; the cost/benefit doesn't justify it.
Beyond journalctl, what we monitor:
systemctl --failed lists units in failed state. We scrape this via node_exporter; alert if any unit on a production host is in failed state.
systemctl list-timers. If Persistent=true jobs are running late or skipping, that's an alert. Usually means clock skew or the host was off.
Cron has none of this built in. We had to bolt on monitoring; with systemd, it comes from existing operational infra.
An honest list of what systemd timers don't replace:
Distributed cron-like scheduling (Kubernetes CronJobs). For Kubernetes workloads, use CronJob resources, not systemd timers. CronJobs run as Pods; the orchestration is correct for the platform.
Workflow with many steps and conditional logic. Step Functions or Airflow, not a timer.
Cross-host coordination. "Run this on whichever host has free capacity." Need a real scheduler; systemd timers are per-host.
Triggering on events, not time. Use systemd path units, inotify, or message queues — not timers.
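For the event-triggered case, a path unit is the closest systemd-native equivalent. A minimal sketch, assuming a hypothetical spool directory and a hypothetical process-incoming.service that drains it:

```ini
[Unit]
Description=Trigger processing when files arrive (illustrative)

[Path]
# Activates the service whenever the directory contains at least one entry.
DirectoryNotEmpty=/var/spool/incoming
# Hypothetical service; without Unit=, the same-named .service is activated.
Unit=process-incoming.service

[Install]
WantedBy=multi-user.target
```

The service has to empty the directory on each run; otherwise the path unit keeps re-triggering it.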
Use systemd timers as the default for new scheduled work on hosts you control. The logging and dependency story alone is worth the file-pair overhead.
Don't migrate existing cron jobs en masse. Migrate when touched. The cost/benefit on stable jobs is rarely worth a dedicated migration.
Persistent=true, UTC, Type=oneshot, journal output. Four settings that should be muscle memory.
systemd-analyze calendar before deploying. Easy way to verify your OnCalendar expression does what you think.
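Document the timer name in the service. Both files for the same job should have the same prefix: db-backup.service + db-backup.timer. A timer activates the same-named service by default (Unit= overrides that), so matching names also mean less configuration. Saves confusion six months later.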
systemd timers aren't a revolution. They're a small upgrade over cron that fits naturally if you're already on systemd. The migration is incremental; the wins are operational consistency, not raw capability. The teams that benefit most are the ones who were already working around cron's limitations.