runner-image
Pre-warmed Forgejo Actions runner image for the tu-po cluster. Published to
code.podesta.ai/public/runner-image:latest and pulled by the Kata-VM runner
pods in infra/apps/forgejo/ (label runner).
Goals
- Fast "Set up environment" step. Bake rust, sccache, uv, bun into the image so rust-toolchain, setup-bun, setup-uv are instant (no GitHub API calls, no download, no rate limits).
- Drop-in replacement for ubuntu-24.04. Base on ghcr.io/catthehacker/ubuntu:full-24.04 so anything that worked under the stock label keeps working.
- Sane cache story. Ship sccache configured for S3; workflows opt in with RUSTC_WRAPPER=sccache and credentials from the runner env.
- Renovate-tracked versions. Every pinned tool has a # renovate: marker so renovate.json's custom manager bumps it automatically.
What's in the image
| Tool | Source | Notes |
|---|---|---|
| Base OS | catthehacker/ubuntu:act-24.04 | GitHub-Actions-compatible Ubuntu 24.04 (slim variant, ~1.5GB vs 15GB for :full) |
| rustup | sh.rustup.rs, minimal profile | + clippy, rustfmt |
| sccache | mozilla/sccache release (musl) | Default RUSTC_WRAPPER=sccache |
| uv / uvx | astral-sh/uv release | Replaces pip per global CLAUDE.md |
| bun / bunx | bun.sh install script | Replaces npm/npx per global CLAUDE.md |
| playwright | npm, via bun add -g | Browsers + OS deps pre-installed at /opt/pw-browsers |
| node | nodejs from Ubuntu noble | Needed for the Playwright CLI (#!/usr/bin/env node) |
RUSTUP_HOME=/opt/rust, CARGO_HOME=/opt/cargo (world-writable so any uid
can use them). SCCACHE_IDLE_TIMEOUT=0 keeps the daemon alive for the full
job. PLAYWRIGHT_BROWSERS_PATH=/opt/pw-browsers points workflows at the
pre-baked browsers so npx playwright install is a no-op.
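A quick way to confirm a job really sees these settings is a small smoke test. The sketch below is not part of the image; the function name check_runner_env is hypothetical, and the expected values are simply the ones listed above.

```shell
#!/usr/bin/env bash
# Sketch: compare the documented environment with the live one.
# check_runner_env is a hypothetical helper, not shipped in the image.
check_runner_env() {
    local failed=0 pair name want got
    for pair in \
        "RUSTUP_HOME=/opt/rust" \
        "CARGO_HOME=/opt/cargo" \
        "SCCACHE_IDLE_TIMEOUT=0" \
        "PLAYWRIGHT_BROWSERS_PATH=/opt/pw-browsers"
    do
        name=${pair%%=*}        # text before the first '='
        want=${pair#*=}         # text after the first '='
        got=${!name:-}          # indirect lookup; empty if unset
        if [ "$got" != "$want" ]; then
            echo "MISMATCH $name: want '$want', got '$got'"
            failed=1
        fi
    done
    return "$failed"
}
```

Calling check_runner_env as the first step of a job prints one line per mismatch and returns non-zero if any value differs.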
Build
podman build -t code.podesta.ai/public/runner-image:latest \
-t code.podesta.ai/public/runner-image:$(date +%Y%m%d) \
-f Containerfile .
podman push code.podesta.ai/public/runner-image:latest
podman push code.podesta.ai/public/runner-image:$(date +%Y%m%d)
CI does the same on push to main and on a weekly cron (see
.forgejo/workflows/build.yml). Secrets PUSH_USER /
PUSH_TOKEN must exist on the repo (names cannot be prefixed with
FORGEJO_ or GITHUB_ — Forgejo rejects those).
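The naming rule is easy to trip over when adding secrets, so a pre-check can help. This is a minimal sketch: secret_name_ok is a hypothetical helper, and it tests only the two prefixes named above, case-sensitively. Forgejo may apply further rules that are not modelled here.

```shell
#!/usr/bin/env bash
# Sketch: reject secret names with the prefixes this README says Forgejo refuses.
# secret_name_ok is a hypothetical helper; it does not talk to Forgejo.
secret_name_ok() {
    case "$1" in
        FORGEJO_*|GITHUB_*) return 1 ;;   # reserved prefixes per the text above
        *)                  return 0 ;;
    esac
}
```

For example, secret_name_ok PUSH_TOKEN succeeds, while secret_name_ok GITHUB_PUSH_TOKEN returns non-zero.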
Problems encountered / solutions
1. catthehacker base switches to USER runner
apt-get update and rustup need root. Symptom:
E: List directory /var/lib/apt/lists/partial is missing. - Acquire (13: Permission denied)
Fix: USER root right after FROM.
2. podman rootless: "history lists N non-empty layers, but we have M layers on disk"
Final COMMIT step fails on this error with the catthehacker base. The base
has ~18 layers with empty-layer history markers that podman's overlay driver
mishandles under rootless home-directory storage. --squash alone does not
fix it — the final commit still hits the check.
Fix: build in CI (Kata VM has a clean overlay store) or on a workstation
with a non-home storage root. If you must build locally, falling back to
--storage-driver=vfs works but is slow and disk-hungry.
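As a rough sketch of the local fallback, the wrapper below assembles the vfs build command and prints it by default. The function name vfs_build and the DRY_RUN variable are hypothetical; the image name and Containerfile path come from the Build section.

```shell
#!/usr/bin/env bash
# Sketch: local fallback build with the vfs storage driver.
# vfs_build and DRY_RUN are hypothetical names used for illustration.
IMAGE=code.podesta.ai/public/runner-image

vfs_build() {
    # --storage-driver is a global podman option, so it precedes "build".
    local cmd=(podman --storage-driver=vfs build -t "$IMAGE:latest" -f Containerfile .)
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "${cmd[*]}"        # default: only show what would run
    else
        "${cmd[@]}"             # DRY_RUN=0: actually build
    fi
}
```

Run vfs_build to inspect the command, then DRY_RUN=0 vfs_build to execute it once the output looks right.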
3. Kata sandbox crashes during long builds
CI builds that take >10 min have hit SandboxChanged events from
cloud-hypervisor on the Kata runtime. The Rust toolchain layer is the
long pole. Workarounds:
- Keep the RUN steps small and fast; every layer is a checkpoint.
- If builds start failing reproducibly, build locally and push; see infra/scripts/kata-capture.sh for post-mortem capture.
4. GitHub API rate limits on action setup
setup-bun, setup-node, dtolnay/rust-toolchain all hit
api.github.com/repos/*/git/refs/tags unauthenticated and get 403'd during
busy hours. That's the whole reason this image exists — bake the tools
in, skip those actions entirely in workflows that target runner.
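A workflow that drops the setup actions may want to fail fast if it lands on a runner without the baked tools. The sketch below assumes nothing about the image beyond the tool table above; missing_tools is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: print every requested tool that is not on PATH.
# missing_tools is a hypothetical helper, not part of the image.
missing_tools() {
    local tool rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            rc=1
        fi
    done
    return "$rc"   # non-zero when at least one tool is missing
}
```

A first workflow step could then run missing_tools rustc cargo sccache uv bun node and abort the job on a non-zero status. That binary list is an assumption drawn from the table, so adjust it to what the job actually uses.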
5. Forgejo registry truncates large blob uploads (UNRESOLVED)
Pushing this image to code.podesta.ai/public/runner-image fails on the
playwright-browser layer (~GB-sized) with one of:
- 499 Client Closed Request (podman, CI with docker)
- 504 Gateway Timeout (from the registry ingress)
- unknown: Client Closed Request after per-blob retry loop
Reproduces both from CI and from a local podman push over the same
public ingress, so the bottleneck is not in this repo — it's the
Forgejo registry (or the ingress-nginx/Traefik/whatever proxy sits in
front of it) cutting long blob uploads. Retrying doesn't help because
the connection is torn down before the blob finishes.
Actual fix lives in infra/apps/forgejo/ (or the ingress chart):
- Raise the proxy's proxy-read-timeout / proxy-send-timeout / proxy-request-timeout past the time it takes to upload the biggest single blob at realistic bandwidth. 10–15 min is a safe target.
- Raise proxy-body-size / client_max_body_size if set; individual blobs can be >500MB.
- Check Forgejo's own [packages] / [server] upload limits in app.ini (LFS_MAX_FILE_SIZE, STORAGE.MINIO timeouts if backed by MinIO, etc.).
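To size the timeout, it helps to estimate how long the largest blob takes to upload. The helper below is a back-of-the-envelope sketch: upload_seconds is a hypothetical name, and the blob size and link speed in the usage note are illustrative, not measurements from this cluster.

```shell
#!/usr/bin/env bash
# Sketch: upload_seconds SIZE_MB MBIT_PER_S -> whole seconds, rounded up.
# Ignores protocol overhead and retries; treat the result as a lower bound.
upload_seconds() {
    local size_mb=$1 mbit=$2
    # Megabytes to megabits (x8), then ceiling division by the link speed.
    echo $(( (size_mb * 8 + mbit - 1) / mbit ))
}
```

For instance, upload_seconds 1000 20 gives 400: a 1000 MB blob over a 20 Mbit/s uplink needs about 400 s. Doubling that for headroom lands near 13 min, inside the 10–15 min target above.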
Until that's done, builds fail at push even when the image itself is correct — there's no reasonable image-side workaround (splitting RUN steps doesn't shrink the browser blobs; playwright lays each browser down as one directory tree, and tar/zstd compresses the whole thing as a single blob per changed layer).
6. The act-24.04 base doesn't ship node in PATH
Unlike :full, the slim :act-24.04 variant has no system node. Playwright's
CLI is a #!/usr/bin/env node shebang script, so playwright install fails
with env: 'node': No such file or directory. Fix: install the nodejs
apt package in the same RUN as the rest of the apt deps.
Renovate
Pinned versions have # renovate: datasource=github-releases depName=…
markers immediately before their ARG. The renovate.json in this repo has
a custom manager that picks them up.
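To see which pins Renovate will pick up, the markers can be listed together with the line that follows each one. This is a sketch under the assumption that markers start at column 0 in the Containerfile; list_renovate_pins is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Sketch: show each "# renovate:" marker and the line right after it.
# list_renovate_pins is a hypothetical helper; defaults to ./Containerfile.
list_renovate_pins() {
    local file=${1:-Containerfile}
    # -A1 prints one line of trailing context; drop grep's "--" group separators.
    grep -A1 '^# renovate:' "$file" | grep -v '^--$'
}
```

Any marker whose following line is not an ARG is worth a second look, since the custom manager expects the two to be adjacent.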
Consumer
infra/apps/forgejo/runner-config.yaml exposes the image as a runner label:
- "runner:docker://code.podesta.ai/public/runner-image:latest"
Workflows opt in with runs-on: runner. Rust jobs additionally export:
env:
RUSTC_WRAPPER: sccache
SCCACHE_BUCKET: sccache
SCCACHE_ENDPOINT: http://minio.tu-po.svc.cluster.local:9000
SCCACHE_S3_USE_SSL: "false"
AWS creds reach the job via the runner StatefulSet's env_file: .env
(populated from a secretKeyRef in the runner pod).