Simple cron heartbeats — why low-friction monitoring wins

Scroll the r/selfhosted thread asking "how do you monitor your cron jobs?" and notice which answers the community actually upvotes. The top responses don't look like "install Prometheus and Grafana." They look like:

"I send a curl ping to Healthchecks.io from each job."
"I use ntfy for a start and finish notification."
"I use gotify and a tiny wrapper script."
"I use cron-mon or a small homemade collector that listens for posts."

The wisdom is consistent: for side projects and small VPS setups, the winning answer is low-friction visibility, not a heavyweight observability stack. The question "how do I monitor crons" reads as a request for tooling, but the upvoted answers reveal it's really a request for a signal.

The distinction matters. The signal is easy. The tooling is what people get stuck installing.

What "low-friction visibility" actually delivers

The value of a heartbeat-style cron monitor is that it answers exactly the question you're going to ask first.

Did the backup run today?
Did the cleanup job finish?
Did the sync push the files it was supposed to push?
If something failed, when?

A push notification at the end of a job, on success or failure, covers all four of these for almost zero engineering cost, as long as the job starts at all. A ping to Healthchecks at start and end covers them with better history and also flags a job that never ran. Neither requires you to stand up a metrics pipeline, a logs pipeline, or a dashboard.

The heavyweight monitoring stack answers a different set of questions: how many requests per second did your API handle, what's the p99 latency, which endpoints are slow, are any of them returning more 500s than usual. These are real questions in production systems. They are mostly not the questions side-project operators are asking when they ask about cron jobs.

Matching the answer to the actual question is what makes the simple stack win.

The three patterns that show up over and over

Three shapes do most of the work.

Pattern 1: ntfy or gotify for raw notifications

The job runs. At the end, it sends a push notification. "Backup finished — 42 files copied." Or, on failure: "Backup failed — exit 2, see logs at /tmp/backup.log."

The whole setup is two lines added to the job script. The operator gets a phone notification at the end of each run. Success is quiet (a small notification that fades). Failure is loud.

This pattern works because notifications are push. The operator doesn't have to remember to check anything. The system reaches out.
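As a rough sketch of the end-of-job notification, here is a helper that posts to a Gotify server through its message endpoint. The server URL, the token, the function name notify_result, and the priority values are placeholders chosen for illustration, not settings from any particular setup.

```shell
#!/bin/sh
# Hypothetical values: point these at your own Gotify server and app token.
GOTIFY_URL="${GOTIFY_URL:-https://gotify.example}"
GOTIFY_TOKEN="${GOTIFY_TOKEN:-changeme}"

# notify_result JOB EXIT_CODE [DETAIL]
# Sends a low-priority message on success and a high-priority one on failure.
notify_result() {
  job=$1
  code=$2
  detail=${3:-no details}
  if [ "$code" -eq 0 ]; then
    title="$job finished"
    priority=2   # assumed "quiet" level; tune to your client settings
  else
    title="$job failed (exit $code)"
    priority=8   # assumed "loud" level; tune to your client settings
  fi
  # "|| true": a notification problem should never change the job's outcome.
  curl -fsS -m 10 -o /dev/null \
    -F "title=$title" -F "message=$detail" -F "priority=$priority" \
    "$GOTIFY_URL/message?token=$GOTIFY_TOKEN" || true
}
```

A job script would call it last, for example notify_result backup "$?" "see logs at /tmp/backup.log" right after the real work.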

Pattern 2: Healthchecks-style start/finish pings

Each job hits a unique URL at start and end. A central service knows the schedule. If the job doesn't ping by the deadline, the service notifies the operator.

This pattern's advantage is that it catches non-runs as well as failures. The job didn't run at all? You hear about it. The job ran past its deadline? You hear about it. The job ran cleanly? You don't hear about it, which is the right default.
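A minimal sketch of the start/finish pings, assuming the hosted Healthchecks ping URL format, where appending /start marks a start and appending the exit status reports the result. The UUID is a placeholder, and ping_hc and run_with_pings are hypothetical helper names.

```shell
#!/bin/sh
# Hypothetical: PING_URL is the per-check URL shown in your Healthchecks project.
PING_URL="${PING_URL:-https://hc-ping.com/your-uuid-here}"

# ping_hc [SUFFIX]  e.g. "/start" or "/2"
ping_hc() {
  # Short timeout and a few retries; never fail the job because a ping failed.
  curl -fsS -m 10 --retry 3 -o /dev/null "$PING_URL${1:-}" || true
}

# run_with_pings COMMAND [ARGS...]
# Pings at start, runs the command, then reports its exit status.
run_with_pings() {
  ping_hc /start
  "$@"
  status=$?
  ping_hc "/$status"   # /0 signals success, any other value signals failure
  return "$status"
}
```

In a crontab-driven script this would wrap the real work, for example run_with_pings ./backup.sh. A job that never starts sends no ping at all, and that silence is what the service alerts on.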

Healthchecks.io is the canonical hosted version. The project is open source, so you can also run it yourself. A self-hosted alternative can be as small as a script or a basic dashboard the operator runs.
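To show how small a self-hosted check can be, here is a sketch of the deadline logic: each job records its last ping, and a checker flags any job whose last ping is older than its period plus a grace window. The state directory, the function names, and the use of epoch seconds from date +%s are all assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical layout: one file per job holding its last ping (epoch seconds).
STATE_DIR="${STATE_DIR:-/var/tmp/cron-heartbeats}"

# record_ping JOB: called by the job when it finishes.
record_ping() {
  mkdir -p "$STATE_DIR" && date +%s > "$STATE_DIR/$1"
}

# is_overdue LAST_PING NOW PERIOD GRACE (all in seconds)
# Succeeds when the last ping is older than PERIOD + GRACE.
is_overdue() {
  [ $(( $2 - $1 )) -gt $(( $3 + $4 )) ]
}

# check_job JOB PERIOD GRACE: prints one line only when the job is overdue.
check_job() {
  last=$(cat "$STATE_DIR/$1" 2>/dev/null || echo 0)
  last=${last:-0}   # a job that never pinged counts as infinitely late
  if is_overdue "$last" "$(date +%s)" "$2" "$3"; then
    echo "OVERDUE $1: no ping for more than $(( $2 + $3 ))s"
  fi
}
```

A checker run from cron every few minutes could call check_job backup 86400 3600 and forward any output as a notification. The checker itself has to run on a machine you trust to stay up, which is the main reason people use a hosted service for this part.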

Pattern 3: A tiny custom collector

For users who already have an ntfy or gotify server, the next step is often a tiny script that listens for posts, stores them in SQLite, and exposes a one-page dashboard.

The dashboard shows the last 24 hours of cron events: which jobs ran, which succeeded, which failed. It's not Grafana. It's a table. The table is enough.

The payoff for going this far is that the operator gets history. They can answer "did this fail last week too?" instead of having to guess.
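The storage half of such a collector can be sketched with the sqlite3 command-line tool. This covers only recording events and querying the last 24 hours; the part that listens for posts is left out. The database path, the events table, and the function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical database location; requires the sqlite3 CLI.
DB="${CRON_DB:-./cron-events.db}"

# Double any single quotes so a value is safe inside an SQL string literal.
sql_quote() {
  printf '%s' "$1" | sed "s/'/''/g"
}

# Create the events table if it does not exist yet.
init_db() {
  sqlite3 "$DB" "CREATE TABLE IF NOT EXISTS events (
    ts INTEGER NOT NULL, job TEXT NOT NULL, status TEXT NOT NULL, detail TEXT);"
}

# record_event JOB STATUS [DETAIL]: stores one row stamped with the current time.
record_event() {
  sqlite3 "$DB" "INSERT INTO events (ts, job, status, detail) VALUES (
    CAST(strftime('%s','now') AS INTEGER),
    '$(sql_quote "$1")', '$(sql_quote "$2")', '$(sql_quote "${3:-}")');"
}

# Print the last 24 hours of events as a plain table, newest first.
last_day() {
  sqlite3 -header -column "$DB" "SELECT datetime(ts,'unixepoch') AS at, job, status, detail
    FROM events
    WHERE ts >= CAST(strftime('%s','now','-1 day') AS INTEGER)
    ORDER BY ts DESC;"
}
```

After init_db, a call such as record_event backup FAIL "exit 2" adds a row, and last_day prints the table a one-page dashboard would render. Answering "did this fail last week too?" is then a matter of widening the time window in the query.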

All three patterns share the same properties: they're built in an afternoon, they don't pull in agents or sidecars, and they don't impose a mental model on the operator.

Why heavyweight stacks lose for this use case

When someone in r/selfhosted recommends Prometheus for cron monitoring, the responses are politely uncomfortable. The thread acknowledges Prometheus is great — and then explains why it's not the answer here.

Setup cost is out of proportion. Setting up Prometheus, Alertmanager, exporters, and a dashboard takes a weekend. Sending a curl ping takes thirty seconds.

The mental model doesn't fit. Prometheus thinks in time series; cron jobs are events. You can shoehorn events into time series, but the result is awkward and the alerts are unintuitive.

Operational burden. Prometheus and its friends are services that need monitoring of their own. For a side project, monitoring the monitoring is overhead nobody signed up for.

Alert fatigue is real. Heavy stacks tend to generate alerts that aren't actionable. The operator stops trusting them.

The simple heartbeat pattern dodges all four. It's not that Prometheus is wrong; it's that side-project cron is the wrong shape of problem for it.

The mental model behind the pattern

The deeper reason the simple stack wins is the mental model behind it. The simple stack treats job runs as events: a thing happened, here's what it was, here's when. The heavy stack treats system state as a continuous signal: how is the system doing right now.

Cron jobs are events. Trying to convert them into continuous signals adds friction without adding insight. Accepting that they're events — and using event-shaped tools — keeps the design honest.

This is part of why the upvoted answers cluster. They share a worldview: cron monitoring is event monitoring, not metric monitoring. Event monitoring deserves event tools.

Practical setup, ten minutes

If you want to deploy this pattern on your side projects today, the ten-minute version:

Run an ntfy server (or use ntfy.sh for free).

Run each cron job through a small wrapper script:

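#!/bin/sh
set -e
# JOB_NAME picks the ntfy topic; fail early if the caller forgot to set it.
: "${JOB_NAME:?set JOB_NAME before running this wrapper}"
URL="https://ntfy.example/cron-$JOB_NAME"
START=$(date -Iseconds)
if ./run-the-real-job.sh; then
  # -f fails on HTTP errors, -m 10 caps the wait; "|| true" keeps a
  # notification problem from turning a good run into a failed one.
  curl -fsS -m 10 -d "OK $START" "$URL" >/dev/null || true
else
  status=$?  # the job's exit code, saved before curl overwrites $?
  curl -fsS -m 10 -H "Priority: high" -d "FAIL $START exit=$status" "$URL" >/dev/null || true
  exit "$status"
fi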

Subscribe to the relevant topics on your phone.
Done.

A failed cron job now reaches you within seconds. A successful one logs quietly. You can extend this (add a Healthchecks ping for non-runs, a tiny SQLite collector for history, or tags per job), but the base setup is functional immediately.

When to upgrade to something heavier

The heartbeat pattern stops being enough when:

You have dozens of jobs with complex inter-dependencies.
You need historical analysis across long windows.
You're running cron on behalf of customers who have SLAs.
You're starting to see overlapping runs and concurrency issues.
The cron infrastructure has become the product, not the support.

When those conditions hit, upgrading to a self-hosted Healthchecks instance, then to Cronicle, then eventually to a proper job scheduler (Airflow, Temporal, Argo, etc.) starts to make sense.

The key is to upgrade in response to felt pain, not in anticipation of it. Premature upgrade is the failure mode that gets people stuck installing Grafana when they should be writing scripts.

What this says about side-project tooling generally

The broader pattern is worth naming. Side projects benefit disproportionately from tools that match the event-shaped, single-operator, low-stakes reality of side projects. They are penalized by tools designed for continuous-state, team-scale, high-stakes production.

The upvoted r/selfhosted answers know this. They're recommending tools that respect the operator's time and the operator's mental model. The heavyweight recommendations don't — and the upvotes reflect it.

For anyone choosing tooling for a side project, the rule of thumb is: prefer the tool that fits how big your problem is right now, not the tool that would fit a problem ten times larger. The latter ends up unused.

The summary

Developers asking about cron monitoring are mostly asking for signal, not infrastructure. Heartbeats (ntfy, gotify, Healthchecks pings) deliver the signal in minutes with almost no operational burden. Heavyweight monitoring is the right answer to a different question. Match the tool to the question, and the side project keeps working without cron observability becoming its main job.