unkinben 649f89f58b
ci/woodpecker/tag/docker Pipeline was successful
fix: make local docker uploads replica-independent (#104)
## Why

Chunked blob uploads kept the in-progress session in **process memory** keyed by upload UUID, so the `POST`/`PATCH`/`PUT` of a single `docker push` had to land on the same replica. The API runs at `minReplicas: 2` with no session affinity (see argocd-apps `api-hpa.yaml`), so a real push — which streams the layer via `PATCH` then finalises with `PUT` — intermittently 404s with `BLOB_UPLOAD_UNKNOWN` when a chunk hits a replica that never saw the `POST`. This was flagged when the local docker registry landed (#103).

## Changes

- Stage chunked uploads in object storage under `uploads/<uuid>` instead of an in-memory temp file. The UUID travels in the `Location` URL handed to the client, so any replica reconstructs the staging key with no shared in-process state. Finalise streams the staged bytes plus any trailing `PUT` body through the CAS in one pass; monolithic uploads are unchanged.
- Support `DELETE` of an in-progress upload (cancel) by dropping its staging object.
- Reap abandoned staging objects in the GC (`uploads/` older than 24h) via a new `S3.ListStaleObjects`, so cancelled/interrupted pushes don't leak.

## Verification

- Split a single push across **two instances sharing one Postgres+MinIO**: `POST`→A, `PATCH`→B, `PUT`→A finalises with the correct digest, and the blob pulls back **byte-identical from both** replicas. Config-blob and manifest pushes split the same way succeed; `tags/list` is correct. (Pre-fix, the cross-replica `PATCH` 404s.)
- `scripts/docker-e2e.sh` still passes (incl. `TestLocalDockerPushPull`); unit tests + `go vet` clean.

Reviewed-on: #104
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
2026-07-05 17:39:49 +10:00
2026-06-07 19:30:35 +10:00

ArtifactAPI

Caching proxy for package repositories. Single Go binary, 10 package types, content-addressable storage, managed by Terraform.

Quick Start

# Start backing services
docker compose up -d postgres redis minio

# Build and run
make build
./bin/artifactapi

# Frontend (separate container or dev server)
cd ui && npm install && npm run dev

API: http://localhost:8000 | Frontend: http://localhost:5173

Package Types

Type Mutable (auto-detected) Immutable (auto-detected)
generic nothing everything
docker tag manifests, /tags/list blobs, digest manifests
helm index.yaml .tgz charts
pypi simple/* index pages .whl, .tar.gz
npm package metadata .tgz tarballs
rpm repomd.xml, repodata/* .rpm
alpine APKINDEX.tar.gz .apk
puppet v3/modules/*, v3/releases* .tar.gz
terraform */versions */download/*/*
goproxy @v/list, @latest .info, .mod, .zip

Providers classify paths automatically. Users only configure what to proxy and TTLs.

Terraform

Remotes and virtuals are managed by Terraform. Each package type has its own resource:

resource "artifactapi_remote_generic" "github" {
  name     = "github"
  base_url = "https://github.com"

  immutable_ttl = 0
  mutable_ttl   = 7200

  patterns = [
    "ducaale/xh/.*/xh-.*-x86_64-unknown-linux-musl.tar.gz$",
    "mikefarah/yq/.*/yq_linux_amd64$",
  ]

  mutable_patterns = [
    ".*/archive/refs/heads/.*\\.tar\\.gz$",
  ]
}

resource "artifactapi_remote_docker" "dockerhub" {
  name     = "dockerhub"
  base_url = "https://registry-1.docker.io"

  immutable_ttl    = 0
  mutable_ttl      = 300
  ban_tags_enabled = true
  ban_tags         = ["latest"]

  patterns = [
    "^library/postgres",
    "^library/redis",
  ]
}

resource "artifactapi_remote_helm" "jetstack" {
  name     = "jetstack"
  base_url = "https://charts.jetstack.io"

  immutable_ttl = 0
  mutable_ttl   = 3600
}

resource "artifactapi_virtual" "helm" {
  name         = "helm"
  package_type = "helm"
  members      = [artifactapi_remote_helm.jetstack.name]
}

Provider: terraform-provider-artifactapi

Serving providers as a registry

A local terraform repo is a real provider registry: upload terraform-provider-{type}_{version}_{os}_{arch}.zip files under {namespace}/{type}/, and Terraform installs them from a bare source address — no .terraformrc mirror config:

terraform {
  required_providers {
    artifactapi = {
      source  = "artifactapi.k8s.syd1.au.unkin.net/<repo>/<type>"
      version = "0.1.2"
    }
  }
}

The Terraform namespace segment is the artifactapi repo name; the provider is matched by type. The registry serves service discovery (/.well-known/terraform.json), the providers.v1 version/download endpoints, and a GPG-signed SHA256SUMS per the provider registry protocol.

Signing needs a GPG key. By default artifactapi generates one on first start and stores it in the database (signing_keys table), so every replica shares it and there's nothing to provision. To bring your own key instead, point TF_SIGNING_KEY_PATH at an armored private key (optionally TF_SIGNING_KEY_PASSPHRASE), which takes precedence over the generated one. TF_PROVIDER_PROTOCOLS (default 5.0,6.0) sets the advertised plugin protocols.

Local docker registry

A local docker repo is a real container registry, not a mirror: it serves the Docker Registry HTTP API V2 for both push and pull, so any client (docker, podman, skopeo, buildah) can use it directly.

docker tag myapp:latest artifactapi.k8s.syd1.au.unkin.net/docker-internal/myapp:latest
docker push        artifactapi.k8s.syd1.au.unkin.net/docker-internal/myapp:latest
docker pull        artifactapi.k8s.syd1.au.unkin.net/docker-internal/myapp:latest

The first path segment after /v2/ is the artifactapi repo name; the remainder is the image name. Blobs and manifests are stored through the shared content-addressable store (deduplicated by digest, reaped by GC once unreferenced); tags are mutable references and re-pushing a tag moves it. Blob uploads support both the monolithic and chunked (POST/PATCH/PUT) flows.

Access Control

Field Default Behaviour
patterns empty (proxy all) If set, only matching paths are proxied. Acts as allowlist.
blocklist empty Matching paths always denied. Checked first.
mutable_patterns empty Override: force paths to mutable TTL.
immutable_patterns empty Override: force paths to immutable TTL.

No patterns + no blocklist = open proxy. Provider handles mutability classification automatically.

API

Proxy (v1)

GET /api/v1/remote/{name}/{path}     Proxy/cache artifact
GET /api/v1/virtual/{name}/{path}    Virtual repo (merged index)
GET /v2/{name}/{path}                Docker Registry v2

Management (v2)

GET/POST        /api/v2/remotes              List / create remotes
GET/PUT/DELETE  /api/v2/remotes/{name}       Read / update / delete remote
GET/DELETE      /api/v2/remotes/{name}/objects  Browse / evict cached objects
GET             /api/v2/stats                Overview stats
GET             /api/v2/health               Service health
POST            /api/v2/probe                Test a remote (fetch without streaming to client)
GET             /api/v2/events               SSE event stream

Architecture

PostgreSQL  ─── config (remotes, virtuals), artifact metadata, access log
Redis       ─── TTL keys, fetch locks, circuit breaker state
S3/MinIO    ─── content-addressable blob storage (blobs/sha256/{hash})

S3 client supports MinIO, Ceph RGW, and AWS S3 (via minio-go).

Environment Variables

Variable Default Description
LISTEN_ADDR :8000 Server listen address
DBHOST localhost PostgreSQL host
DBPORT 5432 PostgreSQL port
DBUSER artifacts PostgreSQL user
DBPASS PostgreSQL password
DBNAME artifacts PostgreSQL database
REDIS_URL redis://localhost:6379 Redis URL
MINIO_ENDPOINT localhost:9000 S3 endpoint
MINIO_ACCESS_KEY S3 access key
MINIO_SECRET_KEY S3 secret key
MINIO_BUCKET artifacts S3 bucket
MINIO_SECURE false Use HTTPS for S3
MINIO_REGION S3 region (AWS)

Development

make build       # Build binary
make test        # Unit tests
make e2e         # E2E tests (needs Docker)
make lint        # golangci-lint + go vet
make fmt         # gofmt + goimports

TUI

./bin/artifactapi tui --endpoint http://localhost:8000
S
Description
My terrible vibe coded artifact cache
Readme 1.9 MiB
Languages
Go 86.2%
TypeScript 10.1%
CSS 2.7%
Makefile 0.5%
Shell 0.3%