
Artifact Storage System

A FastAPI caching proxy that downloads files from remote sources and stores them in S3-compatible storage.

Features

  • Remote definitions via remotes.yaml — generic HTTP, Alpine APK, RPM, Docker, PyPI, npm, Helm
  • Immutable/mutable caching model with per-remote TTLs
  • Conditional revalidation (If-None-Match / If-Modified-Since) on TTL expiry
  • Stale-on-upstream-error: refreshes TTL when backend is unreachable rather than evicting
  • URL rewriting for PyPI simple index, npm metadata, and Helm index.yaml
  • Access control via regex patterns — unmatched paths return 403

Architecture

client → /api/v1/remote/{remote}/{path}
           ↓
         Redis: mutable TTL check
           ↓ miss / expired
         S3: object exists?
           ↓ no
         upstream remote → S3 + PostgreSQL metadata
           ↓
         response (X-Artifact-Source: cache|remote)

Docker Registry traffic uses the /v2/{remote}/{path} endpoint implementing the Docker Registry HTTP API v2.
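
The lookup order in the diagram above can be sketched with in-memory stand-ins. This is an illustration under stated assumptions, not the project's code: CacheSketch, fetch_upstream, and the two dicts are hypothetical names, with one dict playing the role of the Redis TTL keys and the other the S3 bucket. The PostgreSQL metadata write is left out.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Tuple


@dataclass
class CacheSketch:
    """In-memory stand-ins: ttl_expiry plays Redis, objects plays S3."""

    fetch_upstream: Callable[[str], bytes]  # hypothetical upstream fetcher
    mutable_ttl: float = 3600.0
    ttl_expiry: Dict[str, float] = field(default_factory=dict)
    objects: Dict[str, bytes] = field(default_factory=dict)

    def get(
        self, key: str, mutable: bool, now: Optional[float] = None
    ) -> Tuple[bytes, str]:
        """Return (body, source); source mirrors X-Artifact-Source."""
        now = time.time() if now is None else now
        # Step 1: a mutable entry is fresh only while its TTL key is live.
        fresh = (not mutable) or self.ttl_expiry.get(key, 0.0) > now
        # Step 2: a fresh entry present in the object store is a cache hit.
        if fresh and key in self.objects:
            return self.objects[key], "cache"
        # Step 3: otherwise fetch upstream and store the result.
        body = self.fetch_upstream(key)
        self.objects[key] = body
        if mutable:
            self.ttl_expiry[key] = now + self.mutable_ttl
        return body, "remote"
```

Passing now explicitly keeps the sketch deterministic; a real handler would read the clock and talk to Redis and S3 instead of dicts.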

Code layout

src/artifactapi/
├── main.py               — FastAPI app + thin route declarations only
├── config.py             — ConfigManager (loads remotes.yaml)
├── metrics.py            — Prometheus + Redis metrics
├── docker_auth.py        — backwards-compat shim → auth/docker.py
├── artifact/             — route handler implementations
│   ├── proxy.py          — GET /api/v1/remote (remote proxy, cache, revalidation)
│   ├── local.py          — PUT/HEAD/DELETE /api/v1/remote (local repos)
│   ├── docker.py         — /v2/ Docker Registry v2 proxy
│   ├── discovery.py      — /api/v1/artifacts discovery + bulk cache
│   └── flush.py          — PUT /cache/flush
├── auth/
│   ├── __init__.py       — re-exports Docker auth helpers
│   └── docker.py         — Bearer token fetching + in-memory cache
├── cache/
│   ├── __init__.py       — re-exports RedisCache
│   └── redis.py          — RedisCache (TTL keys, ETag metadata)
├── database/
│   ├── __init__.py       — re-exports DatabaseManager
│   └── postgres.py       — DatabaseManager (artifact + local-file tables)
├── storage/
│   ├── __init__.py       — re-exports S3Storage
│   └── s3.py             — S3Storage (MinIO/S3 abstraction)
└── remote/
    ├── __init__.py
    ├── base.py           — content-type detection
    ├── generic.py        — generic HTTP remotes
    ├── helm.py           — Helm index.yaml URL rewriting
    ├── npm.py            — npm metadata URL rewriting
    ├── python.py         — PyPI URL construction + HTML rewriting
    └── rpm.py            — RPM remotes

API Endpoints

Method  Path                            Description
GET     /api/v1/remote/{remote}/{path}  Fetch artifact (auto-cache on miss)
PUT     /api/v1/remote/{remote}/{path}  Upload to local remote
HEAD    /api/v1/remote/{remote}/{path}  Check existence (local remotes)
DELETE  /api/v1/remote/{remote}/{path}  Delete from local remote
GET     /v2/{remote}/{path}             Docker Registry v2 proxy
PUT     /cache/flush                    Flush cache entries
GET     /health                         Health check
GET     /config                         View loaded configuration
GET     /                               API info and available remotes

Configuration

Runtime settings come from environment variables; remote definitions live in remotes.yaml.

Environment Variables

Variable                                Description
DBHOST, DBPORT, DBUSER, DBPASS, DBNAME  PostgreSQL connection
REDIS_URL                               Redis URL (e.g. redis://localhost:6379)
MINIO_ENDPOINT                          MinIO/S3 endpoint
MINIO_ACCESS_KEY                        S3 access key
MINIO_SECRET_KEY                        S3 secret key
MINIO_BUCKET                            S3 bucket name
MINIO_SECURE                            Use HTTPS (true/false)

remotes.yaml Structure

remotes:
  remote-name:
    base_url: "https://example.com"
    type: "remote"            # "remote" or "local"
    package: "generic"        # generic, alpine, rpm, docker, pypi, npm, helm
    description: "..."
    immutable_patterns:       # regex — cached forever
      - ".*\\.tar\\.gz$"
    mutable_patterns:         # regex — expire after mutable_ttl
      - "index\\.yaml$"
    check_mutable_updates: false   # send HEAD (If-None-Match) on TTL expiry
    cache:
      immutable_ttl: 0        # 0 = indefinitely
      mutable_ttl: 3600

Remote Types

generic

Arbitrary HTTP file servers — GitHub releases, HashiCorp, custom servers.

remotes:
  github:
    base_url: "https://github.com"
    type: "remote"
    package: "generic"
    immutable_patterns:
      - "gruntwork-io/terragrunt/.*terragrunt_linux_amd64.*"
    cache:
      immutable_ttl: 0

  github-archive:
    base_url: "https://github.com"
    type: "remote"
    package: "generic"
    immutable_patterns:
      - ".*/archive/refs/tags/.*\\.tar\\.gz$"   # tag archives never change
    mutable_patterns:
      - ".*/archive/refs/heads/main\\.tar\\.gz$"  # branch archives can change
    check_mutable_updates: true
    cache:
      immutable_ttl: 0
      mutable_ttl: 86400

Access: GET /api/v1/remote/github/owner/repo/releases/download/v1.0/binary.tar.gz

alpine

remotes:
  alpine:
    base_url: "https://dl-cdn.alpinelinux.org"
    type: "remote"
    package: "alpine"
    immutable_patterns:
      - ".*/x86_64/.*\\.apk$"
    cache:
      immutable_ttl: 0
      mutable_ttl: 7200

APKINDEX.tar.gz is a built-in mutable pattern — no mutable_patterns entry needed.

rpm

remotes:
  almalinux:
    base_url: "https://mirror.example.com/almalinux"
    type: "remote"
    package: "rpm"
    immutable_patterns:
      - ".*/x86_64/.*\\.rpm$"
      - ".*/noarch/.*\\.rpm$"
    cache:
      immutable_ttl: 0
      mutable_ttl: 7200

repomd.xml and repodata/ metadata files are built-in mutable patterns.

docker

remotes:
  dockerhub:
    base_url: "https://registry-1.docker.io"
    type: "remote"
    package: "docker"
    # username / password optional for public images
    cache:
      immutable_ttl: 0
      mutable_ttl: 300

  ghcr:
    base_url: "https://ghcr.io"
    type: "remote"
    package: "docker"
    username: "your-github-username"
    password: "ghp_your_pat"   # read:packages scope
    cache:
      immutable_ttl: 0
      mutable_ttl: 300

Tag manifests and /tags/list are built-in mutable patterns. Digest-addressed blobs are immutable.
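
A minimal sketch of that tag-versus-digest split, assuming the path has already had the /v2/{remote}/ prefix removed. The function name and the digest regex are illustrative choices, not taken from the project.

```python
import re

# A digest reference looks like "sha256:<hex>"; tags cannot contain ":".
_DIGEST = re.compile(r"^[A-Za-z0-9_+.-]+:[0-9a-fA-F]{32,}$")
_MANIFEST = re.compile(r"/manifests/([^/]+)$")


def is_mutable_docker_path(path: str) -> bool:
    """Return True for paths that can change upstream.

    `path` is the part after /v2/{remote}/, for example
    "library/nginx/manifests/latest".
    """
    if path.endswith("/tags/list"):
        return True
    match = _MANIFEST.search(path)
    if match:
        # Manifests fetched by tag are mutable; by digest they are not.
        return _DIGEST.match(match.group(1)) is None
    # Blobs are always addressed by digest, so they never change.
    return False
```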

For RKE2/containerd, configure /etc/rancher/rke2/registries.yaml:

mirrors:
  docker.io:
    endpoint:
      - "https://artifacts.example.com"
    rewrite:
      "^(.*)$": "dockerhub/$1"
  ghcr.io:
    endpoint:
      - "https://artifacts.example.com"
    rewrite:
      "^(.*)$": "ghcr/$1"

pypi

remotes:
  pypi:
    base_url: "https://files.pythonhosted.org"
    type: "remote"
    package: "pypi"
    check_mutable_updates: true
    immutable_patterns:
      - "packages/.*\\.whl$"
      - "packages/.*\\.whl\\.metadata$"
      - "packages/.*\\.tar\\.gz$"
      - "packages/.*\\.zip$"
    cache:
      immutable_ttl: 0
      mutable_ttl: 600

Note: Simple index requests (/simple/{package}/) are always fetched from https://pypi.org, regardless of base_url. This is hardcoded; base_url only controls where package files are downloaded from. For self-hosted registries (Gitea, Nexus) where both index and files share the same host, set base_url to that host and the override does not apply.

URLs in simple index HTML are rewritten to route package file downloads back through the same remote.
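
The rewriting step might look roughly like the following. This is a sketch under assumptions: rewrite_simple_index is a hypothetical helper, and it only handles absolute href values that start with the files host, which is how pypi.org's simple index links to package files.

```python
import re


def rewrite_simple_index(
    html: str,
    proxy_base: str,
    files_host: str = "https://files.pythonhosted.org",
) -> str:
    """Point package-file links in a simple index page at the proxy.

    proxy_base is the remote's public URL, for example
    "https://artifacts.example.com/api/v1/remote/pypi".
    """
    prefix = proxy_base.rstrip("/") + "/"
    pattern = re.compile(
        r"(href=[\"'])" + re.escape(files_host.rstrip("/")) + "/"
    )
    # Keep the opening href quote, swap only the host part of the URL.
    # The path and the #sha256=... fragment are left untouched.
    return pattern.sub(lambda m: m.group(1) + prefix, html)
```

Because the fragment survives the rewrite, installers can still verify the downloaded file against the hash published in the index.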

Configure uv:

# /etc/uv/uv.toml or ~/.config/uv/uv.toml
[[index]]
url = "https://artifacts.example.com/api/v1/remote/pypi/simple"
default = true

npm

remotes:
  npm:
    base_url: "https://registry.npmjs.org"
    type: "remote"
    package: "npm"
    check_mutable_updates: true
    immutable_patterns:
      - "\\.tgz$"
    mutable_patterns:
      - "^(?!.*\\.tgz$).*"
    cache:
      immutable_ttl: 0
      mutable_ttl: 600

dist.tarball URLs in package metadata JSON are rewritten to route tarball downloads back through the same remote.
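
For illustration, the metadata rewrite could be sketched as below. rewrite_npm_metadata is a hypothetical helper; it assumes the usual registry document layout, where each entry under "versions" carries a "dist" object with a "tarball" URL.

```python
import json


def rewrite_npm_metadata(raw: bytes, upstream: str, proxy_base: str) -> bytes:
    """Rewrite versions.*.dist.tarball so downloads go through the proxy."""
    doc = json.loads(raw)
    upstream_prefix = upstream.rstrip("/") + "/"
    proxy_prefix = proxy_base.rstrip("/") + "/"
    for version in doc.get("versions", {}).values():
        dist = version.get("dist") or {}
        tarball = dist.get("tarball", "")
        # Only touch URLs that point at the configured upstream.
        if tarball.startswith(upstream_prefix):
            dist["tarball"] = proxy_prefix + tarball[len(upstream_prefix):]
    return json.dumps(doc).encode("utf-8")
```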

Configure npm / yarn / pnpm:

# .npmrc or ~/.npmrc
registry=https://artifacts.example.com/api/v1/remote/npm/

helm

remotes:
  hashicorp-helm:
    base_url: "https://helm.releases.hashicorp.com"
    type: "remote"
    package: "helm"
    check_mutable_updates: true
    immutable_patterns:
      - "\\.tgz$"
    cache:
      immutable_ttl: 0
      mutable_ttl: 3600

index.yaml is a built-in mutable pattern. Chart URLs inside index.yaml are rewritten to route tarball downloads back through the same remote.

Configure Helm:

helm repo add hashicorp https://artifacts.example.com/api/v1/remote/hashicorp-helm
helm repo update

local

remotes:
  local-generic:
    type: "local"
    package: "generic"
    description: "Local file repository"
    cache:
      immutable_ttl: 0
      mutable_ttl: 0

No base_url. Files are uploaded via PUT and served via GET.

Caching Model

Immutable patterns

Files matching immutable_patterns are cached for immutable_ttl seconds (0 = indefinitely). Use for versioned release artifacts that never change once published.

Access control: only paths matching an immutable or mutable pattern are served; all others return 403. Omitting immutable_patterns entirely allows all paths from that remote.

Mutable patterns

Files matching mutable_patterns expire after mutable_ttl seconds and are re-fetched on the next request. Mutable files are always served regardless of immutable_patterns.

Each package type has built-in defaults that are merged with any user-defined mutable_patterns:

Package type  Built-in mutable patterns
alpine        APKINDEX\.tar\.gz$
rpm           repomd\.xml$, repodata/ metadata variants, Packages\.gz$
docker        Tag manifests (non-digest refs), /tags/list
pypi          simple/ (per-package and top-level index pages)
helm          index\.yaml$
npm           (none built-in; define via mutable_patterns)
generic       (none)
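
Putting the two pattern lists together, the decision for a single path could be sketched as follows. The names are hypothetical, only the built-ins that the table gives as literal regexes are included, and the use of re.search (rather than a full match) is an assumption based on unanchored examples such as "\\.tgz$".

```python
import re
from typing import Dict, List, Optional, Sequence

# Subset of the table above: only entries listed as literal regexes.
BUILTIN_MUTABLE: Dict[str, List[str]] = {
    "alpine": [r"APKINDEX\.tar\.gz$"],
    "rpm": [r"repomd\.xml$", r"Packages\.gz$"],
    "helm": [r"index\.yaml$"],
    "npm": [],
    "generic": [],
}


def classify(
    path: str,
    package: str,
    immutable_patterns: Optional[Sequence[str]],
    mutable_patterns: Sequence[str] = (),
) -> str:
    """Return "mutable", "immutable", "allowed", or "forbidden" (HTTP 403)."""
    # Built-in defaults are merged with the user-defined mutable patterns.
    mutable = BUILTIN_MUTABLE.get(package, []) + list(mutable_patterns)
    # Mutable files are served regardless of immutable_patterns.
    if any(re.search(p, path) for p in mutable):
        return "mutable"
    # Omitting immutable_patterns entirely allows every path.
    if immutable_patterns is None:
        return "allowed"
    if any(re.search(p, path) for p in immutable_patterns):
        return "immutable"
    return "forbidden"
```

Checking mutable patterns first reflects the rule that mutable files are always served; how an "allowed" path is then cached is not specified here.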

Conditional revalidation

Set check_mutable_updates: true to send HEAD with If-None-Match / If-Modified-Since on TTL expiry. A 304 response refreshes the TTL without re-downloading. Only applies to user-defined mutable_patterns — built-in patterns are always re-fetched unconditionally.
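
A small sketch of the revalidation exchange, with hypothetical helper names. The stored ETag and Last-Modified values are assumed to come from the cached response, and treating every non-304 status as "refetch" is an assumption, since only the 304 case is described above.

```python
from typing import Dict, Optional


def conditional_headers(
    etag: Optional[str], last_modified: Optional[str]
) -> Dict[str, str]:
    """Build validators for the HEAD request sent on TTL expiry."""
    headers: Dict[str, str] = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def revalidation_action(status: int) -> str:
    """Map the upstream status code to what the cache does next."""
    if status == 304:
        # Not modified: keep the stored object and restart the TTL.
        return "refresh_ttl"
    # Assumption: anything else falls back to a normal download.
    return "refetch"
```

If neither validator was stored, the header dict is empty and the request is effectively unconditional.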

Stale-on-upstream-error

When a mutable file expires and the upstream is unreachable (connection refused, DNS failure, timeout), the cached copy is kept and its TTL refreshed. HTTP error responses (4xx, 5xx) are not treated as network failures and proceed with normal expiry.
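
The distinction between network failures and HTTP errors can be sketched like this. All names are hypothetical, the dicts stand in for S3 and Redis, and UpstreamHTTPError is an invented exception that a fetcher would raise for 4xx/5xx responses.

```python
import socket
from typing import Callable, Dict, Tuple


class UpstreamHTTPError(Exception):
    """Hypothetical: upstream answered, but with a 4xx/5xx status."""


# Connection refused, DNS failure, and timeout all land here.
NETWORK_ERRORS = (ConnectionError, TimeoutError, socket.gaierror)


def revalidate_expired(
    key: str,
    cache: Dict[str, bytes],
    expiry: Dict[str, float],
    fetch: Callable[[str], bytes],
    mutable_ttl: float,
    now: float,
) -> Tuple[bytes, str]:
    """Re-fetch an expired mutable entry, serving stale on network failure."""
    try:
        body = fetch(key)
    except NETWORK_ERRORS:
        # Upstream unreachable: keep the cached copy and refresh its TTL.
        expiry[key] = now + mutable_ttl
        return cache[key], "stale"
    # UpstreamHTTPError is deliberately not caught: an HTTP error is not a
    # network failure, so the caller proceeds with normal expiry handling.
    cache[key] = body
    expiry[key] = now + mutable_ttl
    return body, "remote"
```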

Quarantine (supply-chain protection)

Set quarantine_new: true and quarantine_days: N on a remote to block immutable artifacts published within the last N days. Requests return 404 until the quarantine period expires, giving time to detect malicious packages before they are consumed.

remotes:
  pypi:
    base_url: "https://files.pythonhosted.org"
    type: "remote"
    package: "pypi"
    quarantine_new: true
    quarantine_days: 3        # block packages published in the last 3 days
    immutable_patterns:
      - "packages/.*\\.whl$"
      - "packages/.*\\.tar\\.gz$"
    cache:
      immutable_ttl: 0
      mutable_ttl: 600

The upstream Last-Modified response header is used as a proxy for the publish date. Artifacts that have no Last-Modified header are allowed through (fail-open). Mutable files (index pages, tag manifests) are never quarantined.
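
The quarantine check described above might be sketched as follows. is_quarantined is a hypothetical name, and treating an unparseable header the same as a missing one is an assumption.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def is_quarantined(
    last_modified: Optional[str], quarantine_days: int, now: datetime
) -> bool:
    """Return True if the artifact was published within the last N days.

    `now` must be timezone-aware. A True result corresponds to a 404.
    """
    if not last_modified:
        return False  # fail-open: no header, no quarantine
    try:
        published = parsedate_to_datetime(last_modified)
    except (TypeError, ValueError):
        return False  # assumption: unparseable dates also fail open
    if published.tzinfo is None:
        # Some header forms parse as naive datetimes; treat them as UTC.
        published = published.replace(tzinfo=timezone.utc)
    return now - published < timedelta(days=quarantine_days)
```

With quarantine_days: 3, an artifact whose Last-Modified is one day old is blocked, while one that is eight days old is served.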
