Commit Graph

2 Commits

Author SHA1 Message Date
unkinben 649f89f58b fix: make local docker uploads replica-independent (#104)
ci/woodpecker/tag/docker Pipeline was successful
## Why

Chunked blob uploads kept the in-progress session in **process memory** keyed by upload UUID, so the `POST`/`PATCH`/`PUT` of a single `docker push` had to land on the same replica. The API runs at `minReplicas: 2` with no session affinity (see argocd-apps `api-hpa.yaml`), so a real push — which streams the layer via `PATCH` then finalises with `PUT` — intermittently 404s with `BLOB_UPLOAD_UNKNOWN` when a chunk hits a replica that never saw the `POST`. This was flagged when the local docker registry landed (#103).

## Changes

- Stage chunked uploads in object storage under `uploads/<uuid>` instead of an in-memory temp file. The UUID travels in the `Location` URL handed to the client, so any replica reconstructs the staging key with no shared in-process state. Finalise streams the staged bytes plus any trailing `PUT` body through the CAS in one pass; monolithic uploads are unchanged.
- Support `DELETE` of an in-progress upload (cancel) by dropping its staging object.
- Reap abandoned staging objects in the GC (`uploads/` older than 24h) via a new `S3.ListStaleObjects`, so cancelled/interrupted pushes don't leak.

## Verification

- Split a single push across **two instances sharing one Postgres+MinIO**: `POST`→A, `PATCH`→B, `PUT`→A finalises with the correct digest, and the blob pulls back **byte-identical from both** replicas. Config-blob and manifest pushes split the same way succeed; `tags/list` is correct. (Pre-fix, the cross-replica `PATCH` 404s.)
- `scripts/docker-e2e.sh` still passes (incl. `TestLocalDockerPushPull`); unit tests + `go vet` clean.

Reviewed-on: #104
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
2026-07-05 17:39:49 +10:00
benvin b46c116f6b Feat/v3 go rewrite (#47)
ci/woodpecker/tag/docker Pipeline was successful
Complete rewrite of ArtifactAPI from Python/FastAPI to Go as a single binary.

Core engine:
- 10 package providers: generic, docker, helm, pypi, npm, rpm, alpine,
  puppet, terraform, goproxy — each with built-in mutable patterns
- Content-addressable storage (SHA256 dedup across all remotes)
- Three-tier caching: Redis (TTL/locks) → S3/MinIO (blobs) → upstream
- Classifier with allowlist/blocklist per-remote (empty = allow all)
- Circuit breaker, conditional revalidation, stale-on-error
- Background garbage collection for orphaned blobs
- Access logging to PostgreSQL

API:
- v1 proxy endpoints (backwards compatible)
- v2 management API: CRUD remotes/virtuals, object browser, stats,
  health, SSE events, probe/test endpoint
- Virtual repos with index merging (Helm YAML + PyPI HTML)

Frontend (React + Vite, separate Dockerfile):
- Dashboard with stats, health indicators, top remotes
- Remotes list with type filter, remote detail with config/patterns
- Object browser with pagination and evict
- Test Remote page: probe any remote path, see headers/size/timing
- Virtuals page with expandable member lists

TUI (Bubble Tea):
- Dashboard, remotes list/detail, object browser, virtuals
- Vim-style navigation, artifactapi tui --endpoint <url>

Infrastructure:
- S3 client supports MinIO, Ceph RGW, AWS S3 (minio-go)
- PostgreSQL schema with migrations
- Docker Compose: API + UI + Postgres 17 + Redis 7 + MinIO
- Makefile with Go version check, build/test/lint/fmt/e2e targets
- Distroless Docker image (~15MB)

Testing:
- Unit tests for models, classifier, providers, mergers
- E2E tests with testcontainers-go (real Postgres/Redis/MinIO)

Terraform config:
- All 40 production remotes + helm virtual as HCL
- Provider repo: terraform-provider-artifactapi v0.0.1 (separate)

---------

Co-authored-by: Ben Vincent <ben@unkin.net>
Reviewed-on: #47
2026-06-07 19:30:35 +10:00