---
title: "BOM radar rollout — Sydney and Brisbane on cremonde"
date-created: "2026-06-09"
type: "memoir"
status: "complete"
author:
- "st33v"
- "claude-opus-4-7"
model:
- "claude-opus-4-7"
tldr: "Took the bomsynoptic side-spec for IDR713 from a draft to a live, looping APNG on radar.pestrel.com in one session — and added Brisbane (IDR663) at the end. Most of the work was wiring, not coding; the spec turned out to be the easy part."
chat-url: "https://claude.ai/chat/..."
session-kind: "infrastructure"
side-quests: 1
reader-targets:
- "st33v"
- "claude-code"
- "scrivener"
related:
entities:
- "cremonde"
- "pestrel.com"
concepts:
- "static layers + dynamic layers"
- "lower plate / upper plate"
songs: []
tags:
- "bom"
- "radar"
- "deployment"
- "ffmpeg"
- "systemd"
provenance:
- "doc/bom-radar-spec.md (v0.1, drafted same day)"
---
# BOM radar rollout — Sydney and Brisbane on cremonde
## TL;DR
A spec drafted earlier on 2026-06-09 (`doc/bom-radar-spec.md`) called for adding a 6-minute rain-radar loop to the existing `bomsynoptic` deployment, beginning with Sydney (IDR713) and parameterising on product code so other radars could follow. By end of session both Sydney and Brisbane were live at `radar.pestrel.com/sydney/` and `/brisbane/`, served as proper animated APNGs from a 6-minute systemd timer. The spec was correct in every architectural choice; the friction was entirely in the plumbing — git remote shuffling, file ownership, ImageMagick lying about APNG support, and a palette-mismatch in ffmpeg's apng encoder.
## Context
`bomsynoptic` was already deployed on cremonde: a shell script on a 6-**hourly** systemd timer fetching the BOM MSLP synoptic PDF, rasterising it to PNG, and serving the latest as a single `<img>` on `pestrel.com`. The spec proposed a second product class, rain radar, on a 6-**minute** cadence, layered (transparencies + echo frames), with a rolling buffer of the last six frames as a loopable animation.
The spec opened with four §12 questions to the implementing agent. Those were worked through against the existing code first, then implementation proceeded.
## Spec triage — the §12 answers
In order:
1. **Scheduler reuse:** no shared in-code scheduler exists — each product is its own systemd timer. The radar pattern mirrors synoptic's: `radar.{service,timer}` + `radar-retry.{service,timer}` with `OnFailure` chaining, just at `OnUnitActiveSec=6min` instead of `OnCalendar=*:10:00`.
2. **Asset convention:** existing pattern is `/srv/www/pestrel/synopticLatest.png` served by nginx from `/srv/www/pestrel`. Decided radar gets its own subdomain (`radar.pestrel.com`), own web root (`/srv/www/radar/`), own working dir (`/var/lib/radar//`), own `/opt/radar/` for scripts. Mirroring the synoptic shape, but isolated.
3. **Pillow vs ImageMagick:** existing stack is pure shell + curl + magick + gs. Staying in shell+IM keeps the footprint identical. Recommended this; st33v confirmed.
4. **Loop format:** existing front-end is one `<img>` tag, no JS. APNG drops in unchanged; sequence+manifest would have meant introducing JS. Picked APNG.
The §10 copyright concern (BOM FTP is personal-use, not commercial-republish) was flagged and dismissed: *"No one is looking at the site right now."* Attribution to BOM is implicit by content; no formal gating.
## Implementation — first pass
Wrote in roughly this order:
- `radarFetch.sh`: parameterised on `RADAR_ID` (defaulting `IDR713`); refresh transparencies on 24h TTL with `.last_refreshed` marker; build two cached plates (lower = background + topography + optional feature overlays; upper = range + locations); fetch top-N echo frames by lexical sort; composite each as `lower → echo → upper → legend@SE`; evict frames outside the rolling buffer by set membership.
- Four systemd units (`radar.service`, `radar.timer`, `radar-retry.service`, `radar-retry.timer`).
- `nginx/radar.pestrel.com.conf` with `Cache-Control: no-cache` on `.apng`.
- A black-page `radar.index.html` with the single `<img>`.
- Extended `setup.sh` to provision radar dirs/units/scripts/index. Extended `deploy.sh` to take a `synoptic|radar` argument.
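
The buffer handling in `radarFetch.sh` can be sketched roughly as below. This is not the deployed script: the function names, the `IDR713.T.<timestamp>.png` filename shape, and the minutes-based TTL argument are all assumptions made for illustration.

```shell
# Hypothetical helpers; none of these names come from the real radarFetch.sh.

# Read frame names on stdin, print the newest N. Names are assumed to embed a
# sortable timestamp (e.g. IDR713.T.202606090306.png), so lexical order is
# also chronological order.
newest_frames() {
    LC_ALL=C sort | tail -n "$1"
}

# Remove every *.png in directory $1 whose basename is not in the
# newline-separated keep-list $2 (eviction by set membership, not by age).
evict_stale() {
    local dir="$1" keep="$2" path
    for path in "$dir"/*.png; do
        [ -e "$path" ] || continue   # the glob matched nothing
        printf '%s\n' "$keep" | grep -qxF -- "${path##*/}" || rm -f -- "$path"
    done
}

# Succeed when marker $1 is missing or older than $2 minutes (default 24h).
needs_refresh() {
    local marker="$1" ttl_min="${2:-1440}"
    [ -f "$marker" ] || return 0
    [ -n "$(find "$marker" -mmin "+$ttl_min")" ]
}
```

Evicting by membership rather than by age means a gap in the BOM feed never empties the buffer: a frame is dropped only once a newer one has displaced it from the top N.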
DNS: st33v added an `A` record for `radar` → 139.162.32.70 (cremonde) alongside the apex and `www`.
## Git remote shuffle
Existing remote was github (`f3rr3t/bomSynoptic`). User wanted cremonde as origin. First attempt set the URL to `cremonde:/home/git/bomSynoptic.git` — but cremonde's ssh alias logs in as `st33v`, and the bare repo is owned by user `git`. Hit two consecutive failures:
1. **"dubious ownership"** — git's anti-tampering check. Added `safe.directory` exception.
2. **"unable to create temporary object directory"** — st33v can't write into git-owned dirs. Real fix: change the URL to `git@cremonde:/home/git/bomSynoptic.git`, matching how the other ~30 bare repos in `/home/git/*.git` are reached. *"You can — I just set the remote wrong."* Lesson logged: "use ssh cremonde" can mean two things, host alias and user; check the existing convention rather than guess.
Reverted the safe.directory entry as no longer needed.
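For reference, the remote fix amounts to one command. The wrapper below is a hypothetical convenience (the name `repoint_origin` is not from the repo); it sets the URL and echoes it back so the `git@` prefix can be checked by eye.

```shell
# Hypothetical helper: repoint origin and print the resulting URL.
# Must be run from inside the working clone.
repoint_origin() {
    git remote set-url origin "$1" && git remote get-url origin
}

# Usage, with the URL from this session:
#   repoint_origin git@cremonde:/home/git/bomSynoptic.git
```

The `git@` prefix makes ssh log in as the `git` user that owns the bare repo, instead of falling back to whatever user the `cremonde` ssh alias configures.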
## Deployment — three real bugs
The deploy was bumpy and instructive. In order encountered:
### Bug 1 — empty deployment
After provisioning everything, ran setup.sh on cremonde and was told *"setup.sh did not install the radar units."* The clone on cremonde was at `c132b39` — predating all the radar work, because the radar files had never been committed or pushed. Stupid but easy fix: stage, commit (`f058e83`), push, pull on cremonde, re-run setup.
### Bug 2 — ImageMagick silently writing single-frame PNGs
First successful service run produced `/srv/www/radar/idr713-loop.apng` at 35,106 bytes — identical to `frame.00.png`. `file` confirmed: plain PNG, not animated. ImageMagick on Arch is built against an upstream libpng that lacks the APNG patch — so `magick -delay 50 -loop 0 frame*.png out.apng` happily produces a single-frame PNG with no warning. Switched to `ffmpeg`'s `concat` demuxer + `apng` muxer (also unlocking variable per-frame durations cleanly via the demuxer text format). Added `ffmpeg` to setup.sh's pacman line. st33v's reaction was just "but ffmpeg is useful" — fair, took the install.
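Since the failure mode is silent, a cheap post-build check helps. The sketch below is a heuristic and not part of the deployed script: an APNG is a normal PNG carrying an extra animation-control chunk named `acTL`, so a file without those four bytes cannot be animated. The function name `is_apng` is an assumption.

```shell
# Hypothetical check: succeed if the file contains an "acTL" chunk name.
# Heuristic only: it searches for the bytes and does not parse the PNG chunk
# structure, so a false positive on compressed data is possible in principle.
is_apng() {
    LC_ALL=C grep -q -a 'acTL' "$1"
}

# Usage, with the path from this session:
#   is_apng /srv/www/radar/idr713-loop.apng || echo "not animated" >&2
```

Run right after the encoder, a check like this would turn the single-frame output into a non-zero exit that shows up as a failed unit.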
### Bug 3 — ffmpeg palette mismatch
ffmpeg succeeded silently in test but exited 255 in production with:
> `Input contains more than one unique palette. APNG does not support multiple palettes.`
The composited frames were 8-bit palettized PNGs, and each frame's palette differed slightly. ffmpeg's apng encoder won't accept that. One-flag fix: `-vf format=rgba` in the ffmpeg invocation, which converts every frame to RGBA before encoding so no palettes reach the encoder. The file went from a stuck 35 KB single-frame PNG to a proper 91 KB 7-frame RGBA APNG (six frames + the trailing duplicate the concat demuxer requires).
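A rough sketch of the ffmpeg side follows, under stated assumptions: the function names, the per-frame durations, and the idea of holding the last frame longer are illustrative and not taken from `radarFetch.sh`. The pieces that come from this session are the `concat` demuxer, the repeated final entry, the `apng` muxer, and `-vf format=rgba`.

```shell
# Hypothetical helper: print a concat-demuxer list for the given frames.
# $1 = seconds per frame, $2 = seconds to hold the final frame, rest = files.
# The last file is repeated without a duration line so that its duration is
# honoured. Filenames containing single quotes are not handled.
write_concat_list() {
    local frame_secs="$1" hold_secs="$2" last=""
    shift 2
    while [ "$#" -gt 0 ]; do
        printf "file '%s'\n" "$1"
        if [ "$#" -gt 1 ]; then
            printf 'duration %s\n' "$frame_secs"
        else
            printf 'duration %s\n' "$hold_secs"
            last="$1"
        fi
        shift
    done
    if [ -n "$last" ]; then
        printf "file '%s'\n" "$last"
    fi
}

# Hypothetical wrapper: encode list $1 into APNG $2.
# -safe 0 permits absolute paths in the list, format=rgba sidesteps the
# per-frame palette mismatch, -plays 0 asks the apng muxer to loop forever.
encode_apng() {
    ffmpeg -loglevel error -y -f concat -safe 0 -i "$1" \
           -vf format=rgba -plays 0 -f apng "$2"
}
```

Relative paths in the list are resolved against the list file's own directory, so writing the list next to the frames keeps the entries short.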
Important meta-observation: between bug 2 and bug 3, the user noticed *"seems stuck on the first image"*. Without that, the failing systemd retry runs would have just kept hammering the BOM FTP every two minutes and the published file would have remained stuck at 13:05. The error was in the journal but nobody was watching the journal.
## Brisbane
Once Sydney was confirmed working, st33v provided the Brisbane code (IDR663, Mt Stapylton) and asked the design question: select between, or show both? Considered three shapes (subdirs, subdomains, stacked on one page), recommended subdirs (`/sydney/`, `/brisbane/`) with cross-links — minimal new infra, no JS, matches the existing one-img-per-page aesthetic. st33v picked that.
Implementation: changed `radar.service` to have two `ExecStart` lines (one per `RADAR_ID` arg), prefixed both with `-` so one radar's outage doesn't skip the other. Added `radar.sydney.html` and `radar.brisbane.html` (corner cross-link nav) and rewrote `radar.index.html` as a centred two-button landing. setup.sh provisions the subdirs and installs all three pages.
Deployed clean. Both APNGs publishing on the same 6-minute timer.
## Resolutions
- **Subdomain not subpath.** radar.pestrel.com instead of pestrel.com/radar — clean nginx separation, independent evolution.
- **APNG over GIF, despite GIF requiring no new deps.** st33v's "ffmpeg is useful" weighed against the GIF size penalty.
- **Sequence+manifest dropped.** Would have meant the first JS in the project. APNG keeps the black-page-one-img idiom intact.
- **Per-radar invocation of one script over a template service.** `ExecStart=` × N with `-` prefix is simpler than `radar@.service` template units for the current scale of two.
- **TLS deferred.** Plain HTTP. To revisit when adding certbot to cover both pestrel.com and radar.pestrel.com.
## Operational learnings
- **Arch ImageMagick has no APNG writer.** It does not warn; it just writes a single-frame PNG. If APNG matters, use ffmpeg or apngasm.
- **ffmpeg's apng muxer rejects palette mismatches.** Composite chains that go through palette-PNG intermediates need `-vf format=rgba` to be safe.
- **`git remote = git@host:path` for cremonde bare repos.** Not `host:path`, which logs in as st33v and can't write the git-owned object dirs.
- **systemd `ExecStart=-...` prefix** ignores the exit code of that one line — useful when a service runs several independent jobs and you want partial-success semantics.
- **The `concat` demuxer wants the last file repeated** without a `duration` line, otherwise the final frame's duration is ignored.
- **`Cache-Control: no-cache` on `.apng`** matters when the file updates every 6 minutes and the URL stays stable.
## Cheat sheet — adding a third radar
1. Confirm the IDR code (look up BOM's radar codes page).
2. Add `ExecStart=-/opt/radar/radarFetch.sh IDRxxx` to `systemd/radar.service` and `radar-retry.service`.
3. Write `radar..html` (copy of an existing one, swap the apng filename and the cross-link target).
4. Add `install -d /srv/www/radar/` and the html install line to `setup.sh`.
5. Update `radar.index.html` to add a third button (and rebalance the gap).
6. Commit; push; pull on cremonde; `sudo ./setup.sh`.
7. `sudo systemctl start radar.service` to publish immediately rather than waiting up to 6 min for the next timer fire.
Everything else — directories, naming, compositing — is identical across radars.
## Appendix A — side-quest: temp-file race hardening
*Surfaced when a manual `/opt/radar/radarFetch.sh` invocation collided with a service-triggered run at 14:05:31. Both wrote to the same `loop.apng.tmp`; one won, the other lost with `install: cannot stat …`.*
In production only the timer should invoke the script, so this race is unlikely under normal operation. But the failure mode is silent corruption rather than a clean fail, which makes it worth hardening:
- **Option a:** `mktemp` per invocation. `TMP_APNG=$(mktemp "${OUT_DIR}/loop.XXXXXX.apng")`, then `trap 'rm -f "$TMP_APNG"' EXIT`. Minimal change; each invocation has its own temp.
- **Option b:** `flock` over the whole script. `exec 9>/var/lock/radar-${RADAR_ID,,}.lock; flock -n 9 || exit 0`. Cleaner: it prevents concurrent runs entirely, which also satisfies the spec's stated "skip a poll cleanly if the previous one is still running" requirement.
Option b is closer to what the spec asked for. Worth doing.
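
A minimal sketch of option b, written as a wrapper function so it can be tried outside the script. This is a variant of the `exec 9>` form above, not what is deployed; the name `with_radar_lock` and the skip message are assumptions.

```shell
# Hypothetical wrapper: run a command under a non-blocking exclusive lock.
# If another process already holds the lock, skip cleanly with exit status 0.
with_radar_lock() {
    local lock_file="$1"
    shift
    (
        flock -n 9 || { echo "previous run still active; skipping" >&2; exit 0; }
        "$@"
    ) 9>"$lock_file"
}

# Usage, assuming the lock path convention from option b:
#   with_radar_lock /var/lock/radar-idr713.lock /opt/radar/radarFetch.sh IDR713
```

The lock is tied to file descriptor 9, so it is released when the subshell exits, including on a crash, and no stale-lock cleanup is needed.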