Files
uptime/todo.md
Blake Ridgway d593744ff7 updated todo
2026-03-22 11:29:41 -05:00

2.7 KiB

arcline-uptime — Lightweight Uptime Monitor

Polls HTTP/TCP/TLS/DNS endpoints on a schedule, stores results in SQLite, sends alerts via webhook (Discord, Slack), email, ntfy, or Gotify. Single binary, no external services.

Stack

  • Language: Go
  • Storage: SQLite (via modernc.org/sqlite — pure Go, no CGO)
  • Config: YAML
  • Alerts: Discord/Slack webhook, SMTP email, ntfy.sh, Gotify
  • UI: embedded web dashboard (net/http + Go templates)

Done

  • Project scaffold
  • YAML config parser
  • HTTP monitor (status code, body contains, response time threshold)
  • HTTP POST/PUT with custom body and headers
  • TCP monitor (dial timeout)
  • TLS certificate expiry monitor
  • DNS resolution monitor (optional expected IP assertion)
  • Per-monitor interval and timeout overrides
  • Scheduler (ticker per monitor, immediate first check)
  • SQLite schema (checks, alerts_sent)
  • Result storage with configurable retention / auto-pruning
  • Discord / Slack webhook alerter
  • SMTP email alerter (multiple recipients)
  • ntfy.sh alerter
  • Gotify alerter
  • Per-monitor alert routing (named alerters)
  • Maintenance windows (suppress alerts on a schedule)
  • Alert cooldown logic (don't spam on sustained outage)
  • Recovery alert ("Main Website is back up, was down 12m 34s")
  • Web dashboard — current status page (24h / 7d / 30d uptime)
  • Web dashboard — history / SVG sparkline graph with down markers
  • Web dashboard — incident log
  • Public status page (no auth)
  • /metrics Prometheus endpoint (up, response_ms, uptime_24h, uptime_7d)
  • Basic auth for dashboard
  • systemd unit file example
  • README with self-hosting guide
  • Cross-compile Makefile
  • Structured logging (slog, text or JSON)
  • start, check, list, version CLI subcommands

Ideas

  • ICMP/ping monitor
  • HTTP response header assertion
  • HTTP JSON path check
  • SSH command check
  • Generic webhook alerter (configurable template body)
  • Telegram alerter
  • PagerDuty alerter
  • Escalation policy (alert A immediately, alert B after N minutes still down)
  • test-alert subcommand
  • validate subcommand (parse config, print summary, exit non-zero on errors)
  • --dry-run flag for start (run checks, no alerts)
  • Per-monitor detail page with full history and time-axis chart
  • Uptime calendar heatmap (GitHub-style, per day)
  • CSV / JSON export of check history
  • JSON API (/api/v1/monitors, /api/v1/monitors/{name}/checks)
  • Environment variable substitution in config (${VAR})
  • Config hot-reload on SIGHUP
  • TLS for the dashboard itself
  • Database backup command