Compare commits

..

28 Commits

Author SHA1 Message Date
unkinben 1b585af14e feat: wire the circuit breaker into the proxy fetch path (#90)
Fixes #74

## Why
`internal/proxy/circuit.go` implemented and tested a circuit breaker, but nothing ever called it — a repeatedly-failing upstream was still hit on every request.

## Changes
- Construct a `CircuitBreaker` in `NewEngine`.
- In `Engine.Fetch`: short-circuit when the breaker is open (serve stale from the store if present, otherwise return 503), `RecordFailure` on each `UpstreamError`, and `RecordSuccess` on a successful fetch.

## Validation
- `go test ./internal/proxy/` and `make e2e` pass.

---------

Co-authored-by: BenVincent <benvin@main.unkin.net>
Reviewed-on: #90
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
2026-07-02 22:43:22 +10:00
unkinben e7c9387bcc fix: GC has no grace period (TOCTOU with dedup uploads) (#86)
Fixes #71

## Why
`FindOrphanedBlobs` returned any blob not currently referenced. Because CAS dedups (the blob row can exist before its artifact/local_files row is written), a concurrent upload reusing an existing hash could have its S3 object deleted mid-flight by the GC.

## Changes
- `FindOrphanedBlobs` now takes a `minAge` and only returns blobs with `created_at < now()-minAge`.
- The collector passes a 1h `blobGracePeriod`.

## Validation
- `go test ./internal/gc/...` and `make e2e` pass.

---------

Co-authored-by: BenVincent <benvin@main.unkin.net>
Reviewed-on: #86
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
2026-07-02 22:43:18 +10:00
unkinben 7e07eaa758 fix: repair master build after conflicting merges (#96)
## Why
`master` does not compile. Three PRs that each built individually combined badly:
- #92 changed `fetchBearerToken` to return `(string, time.Duration, error)` and added `cachedBearerToken` (which hashes the challenge via `sha256Hash`).
- #94 (streaming) removed the now-unused-in-that-PR `sha256Hash` helper and its `crypto/sha256` / `encoding/hex` imports.
- #89 (HEAD) added `headUpstream`, which calls `fetchBearerToken` expecting two return values.

Result on `master`: `internal/proxy/engine.go` fails to build (`assignment mismatch: 2 variables but fetchBearerToken returns 3 values`; `undefined: sha256Hash`).

## Changes
- Re-add the `sha256Hash` helper and its `crypto/sha256` + `encoding/hex` imports.
- Fix the `headUpstream` 401 path to handle `fetchBearerToken`s three return values.

## Validation
- `go build ./...`, `go vet`, and `make e2e` all pass.

Should merge before the other in-flight branches so they rebase onto a compiling `master`.

Reviewed-on: #96
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
2026-07-02 22:36:09 +10:00
unkinben f61ab99ae8 fix: set timeouts on the upstream HTTP client (#83)
Fixes #67

## Why
The proxy used `http.DefaultClient` for all upstream GET/HEAD and bearer-token requests. It has no timeouts, so a slow or hung upstream holds a goroutine and connection indefinitely.

## Changes
- Add a shared `upstreamClient` (`internal/proxy/httpclient.go`) with dial, TLS-handshake, response-header and idle-connection timeouts, plus connection pooling.
- Deliberately no overall `Client.Timeout`, so large artifact bodies can still stream; total time is bounded by the request context.
- Route all four upstream calls in the engine through it.

## Validation
- `make e2e` passes.

Reviewed-on: #83
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
2026-07-02 22:24:49 +10:00
unkinben c39703ed0d fix: getenv treats an explicitly-empty value as unset (#85)
Fixes #69

## Why
`getenv` returned the fallback whenever `os.Getenv` was empty, so an intentionally-empty env var could not override a non-empty default.

## Changes
- Use `os.LookupEnv` to distinguish unset from set-but-empty.

## Validation
- `make e2e` passes.

Reviewed-on: #85
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
2026-07-02 22:09:09 +10:00
unkinben 5261af4c63 fix: coalesce concurrent cache-miss fetches (thundering herd) (#93)
Fixes #75

## Why
On a fetch-lock miss, `Engine.Fetch` slept a flat 500ms once, tried the store, and otherwise fell through to fetch upstream unlocked. A cold-cache stampede therefore still hit upstream once per waiter.

## Changes
- Add `waitForStore`, which polls the store every 100ms for up to 5s (stopping on context cancellation) so waiters pick up the lock leaders populated result.
- Only fall through to an upstream fetch if the leader has not populated the store within the wait budget.

## Validation
- `make e2e` passes.

Reviewed-on: #93
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
2026-07-02 22:08:29 +10:00
unkinben 45d6cdbc64 perf: batch access-log writes instead of goroutine+insert per request (#91)
Fixes #76

## Why
Every proxied request spawned a goroutine running a 5s-timeout single-row INSERT. Under load this is unbounded goroutines and connection-pool pressure.

## Changes
- Add `database.AccessLogEntry` + `InsertAccessLogBatch` (bulk `COPY`).
- The engine starts one background writer that drains a buffered channel and flushes every 128 entries or 2s.
- `logAccess` is now a non-blocking channel send (drops on full buffer), so the request path never blocks on the DB. Best-effort telemetry: a small tail may be lost on abrupt shutdown.

## Validation
- `make e2e` passes.

Reviewed-on: #91
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
2026-07-02 22:07:56 +10:00
unkinben b59cc45765 fix: HEAD requests fetch and stream the full body (#89)
Fixes #70

## Why
Docker `HEAD` routes mapped to `handleProxy`, which ran a full `Fetch` + `io.Copy` — downloading the entire blob (and fetching upstream on a miss) only for net/http to discard the body. HEAD existence checks (manifests, blobs) are common.

## Changes
- Add `Engine.Head`: answers cached artifacts/indexes from store metadata (no blob download); on a miss issues an upstream `HEAD` (with bearer-token handling) and never caches a body.
- Route `HEAD /v2/{remote}/*` to a dedicated `handleProxyHead` that writes headers only.
- Add e2e tests for HEAD on a blocklisted path (403) and an unknown remote (404).

## Note
`headUpstream` uses `http.DefaultClient` to build cleanly on master; it will pick up the shared timeout-configured client from #67 once that merges.

## Validation
- `make e2e` passes (includes new HEAD tests).

Reviewed-on: #89
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
2026-07-02 22:06:50 +10:00
unkinben e7027c8ccc feat: cache upstream bearer tokens (#92)
Fixes #77

## Why
Each upstream 401 re-ran the full token-endpoint request, even though a single Docker pull triggers many blob/manifest requests sharing one scope.

## Changes
- Add Redis `GetToken`/`SetToken`.
- `fetchBearerToken` now also parses `expires_in` and returns a TTL.
- New `Engine.cachedBearerToken` reuses a cached token keyed by remote + challenge (hashed), caching for `expires_in` minus a safety margin (default 60s when absent).

## Validation
- `make e2e` passes.

Reviewed-on: #92
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
2026-07-02 21:35:46 +10:00
unkinben f3680951b7 perf: stream proxied artifacts instead of buffering the full body in memory (#94)
Fixes #66

## Why
`fetchFromUpstream` read every upstream response with `io.ReadAll`, hashed it in memory, uploaded from memory and served from memory. A single large immutable blob (Docker layer, RPM, tarball, Go module zip) — or several concurrent ones — could OOM the process. The streaming, tempfile-backed CAS already existed but the proxy path bypassed it (and `Engine.cas` was assigned but unused).

## Changes
- Immutable fetches now stream through `CAS.Store` (tempfile -> sha256 -> S3), so memory stays bounded regardless of artifact size, and are served back from the store.
- Mutable indexes stay on the in-memory path (small, and subject to `RewriteResponse`).
- Skipping `RewriteResponse` for immutable content is behaviour-preserving: the proxy path always passes an empty `proxyBaseURL`, under which every providers `RewriteResponse` is a no-op.
- Remove the now-unused in-memory `sha256Hash` helper.

## Validation
- `make e2e` passes.
- Live smoke test against Postgres/Redis/MinIO: proxied a 12 MB blob through a generic remote — fetch #1 `X-Artifact-Source: remote`, fetch #2 `X-Artifact-Source: cache`, both byte-identical (sha256) to the origin.

Reviewed-on: #94
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
2026-07-02 21:33:42 +10:00
unkinben 61a1a99112 perf: compile remote match patterns once instead of per-request (#88)
Fixes #73

## Why
`Classifier.Classify` runs on every proxied request and recompiled the Blocklist/Patterns/Immutable/Mutable regex lists each time. Regex compilation is expensive and fully redundant.

## Changes
- Memoise compilation in a `sync.Map` keyed by pattern text (`compileCached`); each distinct pattern compiles once and is reused. Patterns that fail to compile are cached as a typed nil so they are not retried. No invalidation needed since the pattern text is the key.

## Validation
- `go test ./internal/proxy/` and `make e2e` pass.

Reviewed-on: #88
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
2026-07-02 20:20:00 +10:00
unkinben f0e44d6810 fix: blocklist fails open when a regex fails to compile (#87)
Fixes #72

## Why
`compilePatterns` silently discards any pattern that fails to compile. A typo in a blocklist entry therefore turns a deny rule into a no-op — a fail-open with security impact.

## Changes
- Add `Remote.ValidatePatterns`, which compiles every pattern list (patterns, blocklist, mutable/immutable patterns, ban_tags) and returns an error on the first invalid regex.
- Reject invalid patterns with 400 at remote create and update time.
- Unit test for valid and invalid patterns.

## Validation
- `go test ./pkg/models/` and `make e2e` pass.

Reviewed-on: #87
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
2026-07-02 20:19:27 +10:00
unkinben 0a89b2005c fix: isNetworkError should use errors.As, not a bare type assertion (#84)
Fixes #68

## Why
`isNetworkError` type-asserted `err.(*UpstreamError)` directly. If the error is ever wrapped, stale-on-error handling silently stops triggering.

## Changes
- Use `errors.As` to detect `*UpstreamError` through wrapping.

## Validation
- `make e2e` passes.

Reviewed-on: #84
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
2026-07-02 20:18:23 +10:00
unkinben f23bf2a6d9 fix: serveFromStore does a guaranteed-miss S3 lookup on every cache hit (#82)
Fixes #78

## Why
`serveFromStore` first called `store.Download` with the bare content hash as the S3 key, which never matches real object keys (`blobs/sha256/<hash>`). Every cached blob serve therefore paid an extra guaranteed-404 round-trip before retrying with the correct `BlobKey`.

## Changes
- Remove the dead first `Download` attempt; go straight to the `BlobKey` lookup, then fall back to the index key.

## Validation
- `make e2e` passes (proxy cache-hit paths exercised end-to-end).

Reviewed-on: #82
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
2026-07-02 20:07:30 +10:00
unkinben b9098bf19c fix: e2e suite fails to build (stale server.New call) (#81)
Fixes #80

## Why
`make e2e` did not compile against master: `e2e/e2e_test.go` called `server.New(cfg)` but the signature is `New(cfg, version string)`. This blocked all end-to-end validation.

## Changes
- Pass a static `"e2e-test"` version to `server.New` in the e2e bootstrap.

## Validation
- `make e2e` builds and passes (testcontainers: postgres/redis/minio).

Reviewed-on: #81
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
2026-07-02 20:00:24 +10:00
unkinben 8d9bc1c422 feat: add bandwidth saved stat to dashboard (#65)
ci/woodpecker/tag/docker Pipeline was successful
Shows total bytes served from cache (instead of upstream) over the last 30 days. Queries `SUM(size_bytes) WHERE cache_hit = TRUE` from access_log.

Reviewed-on: #65
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
2026-06-27 22:18:02 +10:00
unkinben 30b7cef026 fix: strip base URL path prefix from helm chart download URLs (#64)
ci/woodpecker/tag/docker Pipeline was successful
When a helm repo base URL includes a path component (e.g. \`stakater.github.io/stakater-charts\`), the merger was extracting the full URL path (\`stakater-charts/reloader-2.2.8.tgz\`) and the proxy then constructed \`base_url/stakater-charts/reloader-2.2.8.tgz\` = double path = 404.

Fix: \`extractPathRelativeToBase()\` strips the shared base path prefix so only the filename portion is used as the proxy path.
Reviewed-on: #64
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
2026-06-27 08:02:52 +10:00
unkinben 603be5b989 fix: report actual version instead of hardcoded 3.0.0-dev (#63)
ci/woodpecker/tag/docker Pipeline was successful
The / endpoint was hardcoded to return 3.0.0-dev. Now uses the git tag version set via ldflags at build time.

Reviewed-on: #63
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
2026-06-27 00:51:26 +10:00
unkinben 9eba49500c feat: forward Accept header and fix Content-Type for Docker proxying (#62)
## Problems
1. Docker daemon sends specific Accept headers to negotiate manifest format, but the proxy dropped them — registries defaulted to OCI format, causing "mediaType should be manifest.v2+json not oci.image.index" errors
2. Upstream Content-Type was only used when the provider returned "application/octet-stream" — Docker manifests got the wrong Content-Type

## Fixes
- Forward client Accept header to upstream (both initial request and Bearer token retry)
- Always prefer upstream Content-Type when present
- Fetch signature now accepts variadic clientHeaders for backwards compat

## E2E tested
- DockerHub: redis:7-alpine, alpine:3 — skopeo inspect OK
- GHCR: OCI-only images work with docker pull (GHCR 404s Docker v2 Accept, which is expected)
- Quay: prometheus/node-exporter — skopeo inspect OK

Reviewed-on: #62
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
2026-06-27 00:45:23 +10:00
unkinben 0083d67272 fix: nginx config for UI serving under base path (#61)
Vite's \`base: /ui\` makes HTML reference \`/ui/assets/...\` but files are at \`/usr/share/nginx/html/assets/\` (no \`ui/\` subdir). The previous \`location /ui { try_files ... }\` couldn't find the files.

Fix: rewrite strips the base path prefix before try_files, so \`/ui/assets/foo.js\` resolves to \`/usr/share/nginx/html/assets/foo.js\`.
Reviewed-on: #61
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
2026-06-27 00:43:45 +10:00
unkinben 8ec7de50e3 feat: handle Docker Bearer token auth for upstream registries (#60)
ci/woodpecker/tag/docker Pipeline was successful
Docker Hub (and other registries) return 401 with a `Www-Authenticate: Bearer realm=...` challenge even for public images. The proxy now:

1. Detects 401 + Bearer challenge
2. Parses realm/service/scope from the header
3. Fetches an anonymous token (or authenticated if username/password configured)
4. Retries the original request with the Bearer token

Fixes: `docker pull artifactapi.../dockerhub/library/redis:latest` returning "unauthorized: upstream returned 401"
Reviewed-on: #60
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
2026-06-27 00:18:06 +10:00
unkinben 9c465cbd4c fix: use map format for docker-buildx build_args (#59)
The woodpecker docker-buildx plugin expects build_args as a YAML map (KEY: VALUE), not a list (- KEY=VALUE). The list format was silently ignored, so BASE_PATH was never passed to the Docker build.

Reviewed-on: #59
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
2026-06-27 00:12:34 +10:00
unkinben ee6e581b9d feat: configurable UI base path via BASE_PATH build arg (#58)
ci/woodpecker/tag/docker Pipeline was successful
Serves the UI under /ui instead of /. This pairs with the argocd route simplification (argocd-apps#201) where /ui → UI service and everything else → API.

- Vite: `base` set from `BASE_PATH` env var at build time
- React Router: `basename` set from injected `__BASE_PATH__`
- Nginx: location block uses `${BASE_PATH}`, substituted by sed at build
- Dockerfile: `ARG BASE_PATH=/` (default preserves existing behavior)
- Woodpecker: passes `BASE_PATH=/ui` to docker-web build

Tested: assets serve at `/ui/assets/...`, SPA routing works at `/ui/remotes`, etc.
Reviewed-on: #58
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
2026-06-26 23:50:17 +10:00
unkinben 2a8e544de3 feat: add Docker Registry V2 endpoint at /v2/ (#57)
The v3 Go rewrite removed the /v2/ Docker Registry compatibility endpoint. Docker clients need:
- GET/HEAD /v2/ → 200 (registry ping)
- GET/HEAD /v2/{remoteName}/* → proxy to the docker remote

Usage: `docker pull artifactapi.example.com/{remoteName}/image:tag`
Reviewed-on: #57
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
2026-06-26 23:37:52 +10:00
unkinben 847eeb839f fix: don't rewrite helm chart URLs pointing to a different host (#56)
## Problem
Helm charts like Intel device plugins have download URLs on `github.com` but the chart index is served from `intel.github.io`. The merger rewrites all URLs through the proxy, constructing:
```
https://artifactapi/api/v1/remote/intel-helm/intel/helm-charts/releases/download/...
```
Which proxies to `https://intel.github.io/helm-charts/intel/helm-charts/releases/download/...` — a 404.

## Fix
Compare the download URL host against the remote's base URL host. If they differ, leave the URL as-is so helm downloads directly from the source. Same-host URLs are still rewritten through the proxy.

Also adds `BaseURL` to `MemberIndex` so the merger has the context it needs, and uses the correct `/local/` vs `/remote/` route prefix.

Reviewed-on: #56
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
2026-06-26 23:34:00 +10:00
unkinben 74d9c0fa84 chore: add pre-commit config and update CI pipeline (#55)
ci/woodpecker/tag/docker Pipeline was successful
## Summary
- New `.pre-commit-config.yaml` with standard Go hooks (gofmt, go vet, go mod tidy) plus file hygiene checks (trailing whitespace, end-of-file, yaml, large files, merge conflicts)
- go vet runs as a local hook with `./...` since the dnephin per-file hook doesn't work with Go module layouts
- Woodpecker pre-commit pipeline updated to use `almalinux9-gobuilder` image with `uvx pre-commit run --all-files`
- Pre-commit hooks installed into the repo

Reviewed-on: #55
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
2026-06-23 23:21:09 +10:00
unkinben 097fbf0016 feat: UI separates locals, remotes, and virtuals (#54)
## Summary
- New "Locals" sidebar nav item with list + detail + browse pages
- Remotes page filters out local repos (repo_type=local hidden)
- LocalDetail: simplified view — just name, type, description + "Browse Files" button
- Virtuals: member links resolve to /locals/ or /remotes/ based on repo_type
- Objects page detects context for correct back-navigation

## Test plan
- [ ] Visual check: locals page shows only local repos
- [ ] Remotes page hides local repos
- [ ] Virtual member links point to correct pages
- [ ] Browse files works from local detail page

Reviewed-on: #54
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
2026-06-23 23:20:18 +10:00
unkinben 6f8e70c27a feat: add local RPM repository with on-demand repodata (#53)
## Summary
- Upload RPMs to local repos, metadata parsed async via cavaliergopher/rpm
- Repodata (repomd.xml, primary/filelists/other.xml.gz) generated on-demand from DB — nothing stored in S3
- RPM provider implements LocalUploader, PostUploadHook, and LocalIndexer
- New rpm_metadata table for parsed RPM header data (name, version, deps, etc.)
- New provider interfaces: PostUploadHook, BlobReader, MetadataStore, RPMMetadataReader

## Test plan
- [x] Upload cowsay RPM from epel → async metadata parse confirmed in logs
- [x] repomd.xml generated with correct hashes → primary.xml.gz has correct metadata
- [x] `dnf install` from local repo: download + install successful
- [x] Bad file rejection (.txt → 400), overwrite rejection (409)

Reviewed-on: #53
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
2026-06-23 23:20:05 +10:00
39 changed files with 1078 additions and 176 deletions
+24
View File
@@ -0,0 +1,24 @@
repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v5.0.0
hooks:
- id: trailing-whitespace
- id: end-of-file-fixer
- id: check-yaml
- id: check-added-large-files
- id: check-merge-conflict
- repo: https://github.com/dnephin/pre-commit-golang
rev: v0.5.1
hooks:
- id: go-fmt
- id: go-mod-tidy
- repo: local
hooks:
- id: go-vet
name: go vet
entry: go vet ./...
language: system
types: [go]
pass_filenames: false
+4
View File
@@ -8,6 +8,8 @@ steps:
settings:
registry: git.unkin.net
repo: git.unkin.net/unkin/artifactapi
build_args:
VERSION: ${CI_COMMIT_TAG}
username: droneci
password:
from_secret: DRONECI_PASSWORD
@@ -22,6 +24,8 @@ steps:
repo: git.unkin.net/unkin/artifactapi-ui
dockerfile: ui/Dockerfile.ui
context: ui
build_args:
BASE_PATH: /ui
username: droneci
password:
from_secret: DRONECI_PASSWORD
+11 -3
View File
@@ -3,7 +3,15 @@ when:
steps:
- name: pre-commit
image: golang:1.25
image: git.unkin.net/unkin/almalinux9-gobuilder:20260606
commands:
- test -z "$(gofmt -l .)"
- go vet ./...
- uvx pre-commit run --all-files
backend_options:
kubernetes:
resources:
requests:
memory: 512Mi
cpu: 1
limits:
memory: 2Gi
cpu: 2
+2 -1
View File
@@ -9,7 +9,8 @@ RUN go mod download
COPY . .
RUN CGO_ENABLED=0 go build -ldflags="-s -w" -o artifactapi ./cmd/artifactapi
ARG VERSION=dev
RUN CGO_ENABLED=0 go build -ldflags="-s -w -X main.version=${VERSION}" -o artifactapi ./cmd/artifactapi
FROM gcr.io/distroless/static-debian12:nonroot
+1 -1
View File
@@ -12,7 +12,7 @@ check-go:
fi
build: check-go tidy
go build -ldflags="-s -w" -o $(BINARY) ./cmd/artifactapi
go build -ldflags="-s -w -X main.version=$(VERSION)" -o $(BINARY) ./cmd/artifactapi
test: check-go
go test -race -count=1 ./pkg/... ./internal/...
+3 -1
View File
@@ -13,6 +13,8 @@ import (
"git.unkin.net/unkin/artifactapi/internal/tui"
)
var version = "dev"
func main() {
if len(os.Args) > 1 && os.Args[1] == "tui" {
endpoint := os.Getenv("ARTIFACTAPI_ENDPOINT")
@@ -42,7 +44,7 @@ func main() {
ctx, cancel := signal.NotifyContext(context.Background(), syscall.SIGINT, syscall.SIGTERM)
defer cancel()
srv, err := server.New(cfg)
srv, err := server.New(cfg, version)
if err != nil {
slog.Error("failed to create server", "error", err)
os.Exit(1)
+1 -1
View File
@@ -95,7 +95,7 @@ func TestMain(m *testing.M) {
}
cfg.ListenAddr = "127.0.0.1:0"
srv, err := server.New(cfg)
srv, err := server.New(cfg, "e2e-test")
if err != nil {
log.Fatalf("server: %v", err)
}
+24
View File
@@ -24,6 +24,30 @@ func TestRoot(t *testing.T) {
}
}
func TestRemoteUpstreamTimeouts(t *testing.T) {
createRemote(t, `{
"name": "timeout-test",
"package_type": "generic",
"base_url": "https://example.com",
"stale_on_error": true,
"upstream_dial_timeout": 3,
"upstream_tls_timeout": 4,
"upstream_response_header_timeout": 5
}`)
defer deleteRemote(t, "timeout-test")
remote := getJSON(t, apiURL("/api/v2/remotes/timeout-test"))
for field, want := range map[string]float64{
"upstream_dial_timeout": 3,
"upstream_tls_timeout": 4,
"upstream_response_header_timeout": 5,
} {
if got, _ := remote[field].(float64); got != want {
t.Errorf("%s: got %v, want %v", field, remote[field], want)
}
}
}
func TestRemoteCRUD(t *testing.T) {
createRemote(t, `{
"name": "test-generic",
+33
View File
@@ -24,6 +24,39 @@ func TestProxyBlocklist(t *testing.T) {
assertStatus(t, apiURL("/api/v1/remote/blocklist-test/malware.exe"), http.StatusForbidden)
}
func TestProxyHeadBlocklist(t *testing.T) {
createRemote(t, `{
"name": "head-block-test",
"package_type": "generic",
"base_url": "https://example.com",
"blocklist": ["\\.exe$"],
"stale_on_error": true
}`)
defer deleteRemote(t, "head-block-test")
req, _ := http.NewRequest(http.MethodHead, apiURL("/v2/head-block-test/malware.exe"), nil)
resp, err := http.DefaultClient.Do(req)
if err != nil {
t.Fatalf("HEAD: %v", err)
}
resp.Body.Close()
if resp.StatusCode != http.StatusForbidden {
t.Fatalf("HEAD blocklisted path: got %d, want 403", resp.StatusCode)
}
}
func TestProxyHeadUnknownRemote(t *testing.T) {
req, _ := http.NewRequest(http.MethodHead, apiURL("/v2/nonexistent/some/path"), nil)
resp, err := http.DefaultClient.Do(req)
if err != nil {
t.Fatalf("HEAD: %v", err)
}
resp.Body.Close()
if resp.StatusCode != http.StatusNotFound {
t.Fatalf("HEAD unknown remote: got %d, want 404", resp.StatusCode)
}
}
func TestProxyPatterns(t *testing.T) {
createRemote(t, `{
"name": "patterns-test",
+51 -1
View File
@@ -37,6 +37,20 @@ func (h *ProxyHandler) Routes() chi.Router {
return r
}
func (h *ProxyHandler) DockerV2Routes() chi.Router {
r := chi.NewRouter()
r.Get("/", h.handleDockerPing)
r.Head("/", h.handleDockerPing)
r.Get("/{remoteName}/*", h.handleProxy)
r.Head("/{remoteName}/*", h.handleProxyHead)
return r
}
func (h *ProxyHandler) handleDockerPing(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Docker-Distribution-Api-Version", "registry/2.0")
w.WriteHeader(http.StatusOK)
}
func (h *ProxyHandler) handleProxy(w http.ResponseWriter, r *http.Request) {
remoteName := chi.URLParam(r, "remoteName")
path := chi.URLParam(r, "*")
@@ -53,7 +67,7 @@ func (h *ProxyHandler) handleProxy(w http.ResponseWriter, r *http.Request) {
return
}
result, err := h.engine.Fetch(r.Context(), *remote, path, prov)
result, err := h.engine.Fetch(r.Context(), *remote, path, prov, r.Header)
if err != nil {
var proxyErr *proxy.ProxyError
if errors.As(err, &proxyErr) {
@@ -75,6 +89,42 @@ func (h *ProxyHandler) handleProxy(w http.ResponseWriter, r *http.Request) {
io.Copy(w, result.Reader)
}
func (h *ProxyHandler) handleProxyHead(w http.ResponseWriter, r *http.Request) {
remoteName := chi.URLParam(r, "remoteName")
path := chi.URLParam(r, "*")
remote, err := h.db.GetRemote(r.Context(), remoteName)
if err != nil {
http.Error(w, fmt.Sprintf("remote %q not found", remoteName), http.StatusNotFound)
return
}
prov, err := provider.Get(remote.PackageType)
if err != nil {
http.Error(w, fmt.Sprintf("no provider for %q", remote.PackageType), http.StatusInternalServerError)
return
}
result, err := h.engine.Head(r.Context(), *remote, path, prov)
if err != nil {
var proxyErr *proxy.ProxyError
if errors.As(err, &proxyErr) {
http.Error(w, proxyErr.Message, proxyErr.Status)
return
}
slog.Error("proxy head failed", "remote", remoteName, "path", path, "error", err)
http.Error(w, "bad gateway", http.StatusBadGateway)
return
}
w.Header().Set("Content-Type", result.ContentType)
w.Header().Set("X-Artifact-Source", result.Source)
if result.Size > 0 {
w.Header().Set("Content-Length", fmt.Sprintf("%d", result.Size))
}
w.WriteHeader(http.StatusOK)
}
func (h *ProxyHandler) handleVirtual(w http.ResponseWriter, r *http.Request) {
virtualName := chi.URLParam(r, "virtualName")
path := chi.URLParam(r, "*")
+8
View File
@@ -69,6 +69,10 @@ func (h *RemotesHandler) create(w http.ResponseWriter, r *http.Request) {
http.Error(w, "base_url is required for remote repositories", http.StatusBadRequest)
return
}
if err := remote.ValidatePatterns(); err != nil {
http.Error(w, err.Error(), http.StatusBadRequest)
return
}
if err := h.db.CreateRemote(r.Context(), &remote); err != nil {
http.Error(w, err.Error(), http.StatusInternalServerError)
return
@@ -84,6 +88,10 @@ func (h *RemotesHandler) update(w http.ResponseWriter, r *http.Request) {
return
}
remote.Name = name
if err := remote.ValidatePatterns(); err != nil {
http.Error(w, err.Error(), http.StatusBadRequest)
return
}
if err := h.db.UpdateRemote(r.Context(), &remote); err != nil {
http.Error(w, err.Error(), http.StatusInternalServerError)
return
+12
View File
@@ -70,6 +70,18 @@ func (r *Redis) GetETag(ctx context.Context, remote, path string) (string, error
return val, err
}
func (r *Redis) GetToken(ctx context.Context, key string) (string, error) {
val, err := r.client.Get(ctx, "token:"+key).Result()
if err == redis.Nil {
return "", nil
}
return val, err
}
func (r *Redis) SetToken(ctx context.Context, key, token string, ttl time.Duration) error {
return r.client.Set(ctx, "token:"+key, token, ttl).Err()
}
func (r *Redis) IncrCircuitFailure(ctx context.Context, remote string, cooldown time.Duration) (int64, error) {
key := fmt.Sprintf("circuit:%s", remote)
pipe := r.client.Pipeline()
+1 -1
View File
@@ -65,7 +65,7 @@ func Load() (*Config, error) {
}
func getenv(key, fallback string) string {
if v := os.Getenv(key); v != "" {
if v, ok := os.LookupEnv(key); ok {
return v
}
return fallback
+38 -3
View File
@@ -4,6 +4,8 @@ import (
"context"
"time"
"github.com/jackc/pgx/v5"
"git.unkin.net/unkin/artifactapi/pkg/models"
)
@@ -109,16 +111,49 @@ func (db *DB) InsertAccessLog(ctx context.Context, remoteName, path string, cach
return err
}
func (db *DB) FindOrphanedBlobs(ctx context.Context) ([]models.Blob, error) {
// AccessLogEntry is one buffered access-log record.
type AccessLogEntry struct {
RemoteName string
Path string
CacheHit bool
SizeBytes int64
UpstreamMS int
ClientIP string
}
// InsertAccessLogBatch bulk-inserts access-log rows with a single COPY.
func (db *DB) InsertAccessLogBatch(ctx context.Context, entries []AccessLogEntry) error {
if len(entries) == 0 {
return nil
}
rows := make([][]any, len(entries))
for i, e := range entries {
rows[i] = []any{e.RemoteName, e.Path, e.CacheHit, e.SizeBytes, e.UpstreamMS, e.ClientIP}
}
_, err := db.Pool.CopyFrom(ctx,
pgx.Identifier{"access_log"},
[]string{"remote_name", "path", "cache_hit", "size_bytes", "upstream_ms", "client_ip"},
pgx.CopyFromRows(rows),
)
return err
}
// FindOrphanedBlobs returns blobs no longer referenced by any artifact or
// local file, restricted to those created before now()-minAge. The age cutoff
// is a grace period that avoids a TOCTOU race with in-flight dedup uploads,
// which insert the blob row before the referencing artifact/local_files row.
func (db *DB) FindOrphanedBlobs(ctx context.Context, minAge time.Duration) ([]models.Blob, error) {
cutoff := time.Now().Add(-minAge)
rows, err := db.Pool.Query(ctx, `
SELECT b.content_hash, b.s3_key, b.size_bytes, b.content_type, b.created_at
FROM blobs b
WHERE b.content_hash NOT IN (
WHERE b.created_at < $1
AND b.content_hash NOT IN (
SELECT content_hash FROM artifacts
UNION
SELECT content_hash FROM local_files
)
`)
`, cutoff)
if err != nil {
return nil, err
}
+3
View File
@@ -124,6 +124,9 @@ func (db *DB) migrate() error {
CREATE INDEX IF NOT EXISTS idx_access_log_remote_time ON access_log(remote_name, created_at);
ALTER TABLE remotes ADD COLUMN IF NOT EXISTS repo_type TEXT DEFAULT 'remote';
ALTER TABLE remotes ADD COLUMN IF NOT EXISTS upstream_dial_timeout INTEGER DEFAULT 0;
ALTER TABLE remotes ADD COLUMN IF NOT EXISTS upstream_tls_timeout INTEGER DEFAULT 0;
ALTER TABLE remotes ADD COLUMN IF NOT EXISTS upstream_response_header_timeout INTEGER DEFAULT 0;
CREATE TABLE IF NOT EXISTS rpm_metadata (
id BIGSERIAL PRIMARY KEY,
+14 -5
View File
@@ -11,7 +11,9 @@ const remoteCols = `name, package_type, repo_type, base_url, description, userna
patterns, blocklist, mutable_patterns, immutable_patterns,
ban_tags_enabled, ban_tags,
quarantine_enabled, quarantine_days, stale_on_error,
releases_remote, managed_by, created_at, updated_at`
releases_remote, managed_by,
upstream_dial_timeout, upstream_tls_timeout, upstream_response_header_timeout,
created_at, updated_at`
func scanRemote(scanner interface{ Scan(...any) error }, r *models.Remote) error {
return scanner.Scan(
@@ -20,7 +22,9 @@ func scanRemote(scanner interface{ Scan(...any) error }, r *models.Remote) error
&r.Patterns, &r.Blocklist, &r.MutablePatterns, &r.ImmutablePatterns,
&r.BanTagsEnabled, &r.BanTags,
&r.QuarantineEnabled, &r.QuarantineDays, &r.StaleOnError,
&r.ReleasesRemote, &r.ManagedBy, &r.CreatedAt, &r.UpdatedAt,
&r.ReleasesRemote, &r.ManagedBy,
&r.UpstreamDialTimeout, &r.UpstreamTLSTimeout, &r.UpstreamResponseHeaderTimeout,
&r.CreatedAt, &r.UpdatedAt,
)
}
@@ -59,8 +63,9 @@ func (db *DB) CreateRemote(ctx context.Context, r *models.Remote) error {
patterns, blocklist, mutable_patterns, immutable_patterns,
ban_tags_enabled, ban_tags,
quarantine_enabled, quarantine_days, stale_on_error,
releases_remote, managed_by
) VALUES ($1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16,$17,$18,$19,$20,$21)
releases_remote, managed_by,
upstream_dial_timeout, upstream_tls_timeout, upstream_response_header_timeout
) VALUES ($1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16,$17,$18,$19,$20,$21,$22,$23,$24)
`,
r.Name, r.PackageType, r.RepoType, r.BaseURL, r.Description, r.Username, r.Password,
r.ImmutableTTL, r.MutableTTL, r.CheckMutable,
@@ -68,6 +73,7 @@ func (db *DB) CreateRemote(ctx context.Context, r *models.Remote) error {
r.BanTagsEnabled, r.BanTags,
r.QuarantineEnabled, r.QuarantineDays, r.StaleOnError,
r.ReleasesRemote, r.ManagedBy,
r.UpstreamDialTimeout, r.UpstreamTLSTimeout, r.UpstreamResponseHeaderTimeout,
)
return err
}
@@ -80,7 +86,9 @@ func (db *DB) UpdateRemote(ctx context.Context, r *models.Remote) error {
patterns=$11, blocklist=$12, mutable_patterns=$13, immutable_patterns=$14,
ban_tags_enabled=$15, ban_tags=$16,
quarantine_enabled=$17, quarantine_days=$18, stale_on_error=$19,
releases_remote=$20, managed_by=$21, updated_at=NOW()
releases_remote=$20, managed_by=$21,
upstream_dial_timeout=$22, upstream_tls_timeout=$23, upstream_response_header_timeout=$24,
updated_at=NOW()
WHERE name=$1
`,
r.Name, r.PackageType, r.RepoType, r.BaseURL, r.Description, r.Username, r.Password,
@@ -89,6 +97,7 @@ func (db *DB) UpdateRemote(ctx context.Context, r *models.Remote) error {
r.BanTagsEnabled, r.BanTags,
r.QuarantineEnabled, r.QuarantineDays, r.StaleOnError,
r.ReleasesRemote, r.ManagedBy,
r.UpstreamDialTimeout, r.UpstreamTLSTimeout, r.UpstreamResponseHeaderTimeout,
)
return err
}
+22 -22
View File
@@ -33,29 +33,29 @@ func (db *DB) InsertRPMMetadata(ctx context.Context, meta *provider.RPMMetadata)
}
type RPMMetadataRow struct {
RepoName string
FilePath string
ContentHash string
Name string
Epoch int
Version string
Release string
Arch string
Summary string
Description string
RPMSize int64
RepoName string
FilePath string
ContentHash string
Name string
Epoch int
Version string
Release string
Arch string
Summary string
Description string
RPMSize int64
InstalledSize int64
License string
Vendor string
Group string
BuildHost string
SourceRPM string
URL string
Packager string
Requires json.RawMessage
Provides json.RawMessage
Files json.RawMessage
Changelogs json.RawMessage
License string
Vendor string
Group string
BuildHost string
SourceRPM string
URL string
Packager string
Requires json.RawMessage
Provides json.RawMessage
Files json.RawMessage
Changelogs json.RawMessage
}
func (db *DB) ListRPMMetadataEntries(ctx context.Context, repoName string) ([]provider.RPMMetadata, error) {
+9
View File
@@ -30,6 +30,15 @@ func (db *DB) GetOverviewStats(ctx context.Context) (*models.OverviewStats, erro
return nil, err
}
err = db.Pool.QueryRow(ctx, `
SELECT COALESCE(SUM(size_bytes), 0)
FROM access_log
WHERE cache_hit = TRUE AND created_at > NOW() - INTERVAL '30 days'
`).Scan(&stats.BandwidthSaved30d)
if err != nil {
return nil, err
}
return &stats, nil
}
+6 -1
View File
@@ -9,6 +9,11 @@ import (
"git.unkin.net/unkin/artifactapi/internal/storage"
)
// blobGracePeriod is how old an orphaned blob must be before GC will delete
// it. This avoids racing in-flight dedup uploads that insert the blob row
// before the referencing artifact/local_files row exists.
const blobGracePeriod = 1 * time.Hour
type Collector struct {
db *database.DB
store *storage.S3
@@ -38,7 +43,7 @@ func (c *Collector) Run(ctx context.Context) {
func (c *Collector) sweep(ctx context.Context) {
start := time.Now()
orphaned, err := c.db.FindOrphanedBlobs(ctx)
orphaned, err := c.db.FindOrphanedBlobs(ctx, blobGracePeriod)
if err != nil {
slog.Error("gc: find orphaned blobs", "error", err)
return
+21 -1
View File
@@ -2,6 +2,7 @@ package proxy
import (
"regexp"
"sync"
"git.unkin.net/unkin/artifactapi/internal/provider"
"git.unkin.net/unkin/artifactapi/pkg/models"
@@ -60,10 +61,29 @@ func (c *Classifier) Classify(remote models.Remote, path string) Classification
return ClassImmutable
}
// patternCache memoises regex compilation. Classify runs on every proxied
// request and previously recompiled each remote's pattern lists every time;
// keying by the pattern string lets each distinct pattern compile once and
// then be reused, with no invalidation needed (the pattern text is the key).
// A pattern that fails to compile is cached as a typed nil so we don't retry.
var patternCache sync.Map // map[string]*regexp.Regexp
func compileCached(pattern string) *regexp.Regexp {
if v, ok := patternCache.Load(pattern); ok {
return v.(*regexp.Regexp)
}
re, err := regexp.Compile(pattern)
if err != nil {
re = nil
}
patternCache.Store(pattern, re)
return re
}
func compilePatterns(patterns []string) []*regexp.Regexp {
compiled := make([]*regexp.Regexp, 0, len(patterns))
for _, p := range patterns {
if re, err := regexp.Compile(p); err == nil {
if re := compileCached(p); re != nil {
compiled = append(compiled, re)
}
}
+396 -86
View File
@@ -4,10 +4,13 @@ import (
"context"
"crypto/sha256"
"encoding/hex"
"encoding/json"
"errors"
"fmt"
"io"
"log/slog"
"net/http"
"strings"
"time"
"git.unkin.net/unkin/artifactapi/internal/cache"
@@ -19,19 +22,65 @@ import (
const fetchLockTTL = 30 * time.Second
const (
accessLogBufferSize = 4096
accessLogBatchSize = 128
accessLogFlushEvery = 2 * time.Second
)
type Engine struct {
db *database.DB
cache *cache.Redis
store *storage.S3
cas *storage.CAS
db *database.DB
cache *cache.Redis
store *storage.S3
cas *storage.CAS
circuit *CircuitBreaker
accessLog chan database.AccessLogEntry
}
func NewEngine(db *database.DB, c *cache.Redis, s *storage.S3) *Engine {
return &Engine{
db: db,
cache: c,
store: s,
cas: storage.NewCAS(s),
e := &Engine{
db: db,
cache: c,
store: s,
cas: storage.NewCAS(s),
circuit: NewCircuitBreaker(c),
accessLog: make(chan database.AccessLogEntry, accessLogBufferSize),
}
go e.runAccessLogWriter()
return e
}
// runAccessLogWriter drains the access-log channel and writes rows in batches,
// replacing a goroutine-per-request insert. It runs for the process lifetime;
// access logs are best-effort telemetry, so a small tail may be lost on abrupt
// shutdown.
func (e *Engine) runAccessLogWriter() {
ticker := time.NewTicker(accessLogFlushEvery)
defer ticker.Stop()
batch := make([]database.AccessLogEntry, 0, accessLogBatchSize)
flush := func() {
if len(batch) == 0 {
return
}
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
if err := e.db.InsertAccessLogBatch(ctx, batch); err != nil {
slog.Warn("access log batch insert failed", "error", err, "count", len(batch))
}
cancel()
batch = batch[:0]
}
for {
select {
case entry := <-e.accessLog:
batch = append(batch, entry)
if len(batch) >= accessLogBatchSize {
flush()
}
case <-ticker.C:
flush()
}
}
}
@@ -42,7 +91,7 @@ type FetchResult struct {
Source string // "cache" or "remote"
}
func (e *Engine) Fetch(ctx context.Context, remote models.Remote, path string, prov provider.Provider) (*FetchResult, error) {
func (e *Engine) Fetch(ctx context.Context, remote models.Remote, path string, prov provider.Provider, clientHeaders ...http.Header) (*FetchResult, error) {
classifier := NewClassifier(prov)
class := classifier.Classify(remote, path)
@@ -61,7 +110,7 @@ func (e *Engine) Fetch(ctx context.Context, remote models.Remote, path string, p
result, err := e.serveFromStore(ctx, remote, path)
if err == nil {
result.Source = "cache"
go e.logAccess(remote.Name, path, true, result.Size, 0)
e.logAccess(remote.Name, path, true, result.Size, 0)
return result, nil
}
slog.Warn("cache hit but S3 miss, re-fetching", "remote", remote.Name, "path", path)
@@ -73,11 +122,12 @@ func (e *Engine) Fetch(ctx context.Context, remote models.Remote, path string, p
}
if !locked {
time.Sleep(500 * time.Millisecond)
result, err := e.serveFromStore(ctx, remote, path)
if err == nil {
// Another request holds the fetch lock. Poll the store until the leader
// populates it rather than immediately racing to fetch upstream too; a
// cold-cache stampede otherwise hits upstream once per waiter.
if result := e.waitForStore(ctx, remote, path); result != nil {
result.Source = "cache"
go e.logAccess(remote.Name, path, true, result.Size, 0)
e.logAccess(remote.Name, path, true, result.Size, 0)
return result, nil
}
}
@@ -96,35 +146,138 @@ func (e *Engine) Fetch(ctx context.Context, remote models.Remote, path string, p
result, err := e.serveFromStore(ctx, remote, path)
if err == nil {
result.Source = "cache"
go e.logAccess(remote.Name, path, true, result.Size, 0)
e.logAccess(remote.Name, path, true, result.Size, 0)
return result, nil
}
}
}
}
var fwdHeaders http.Header
if len(clientHeaders) > 0 && clientHeaders[0] != nil {
fwdHeaders = clientHeaders[0]
}
// Short-circuit upstream calls when the remote's breaker is open: serve
// stale from the store if we have it, otherwise fail fast rather than
// hammering a known-bad upstream.
if e.circuit.IsOpen(ctx, remote.Name) {
if stale, serr := e.serveFromStore(ctx, remote, path); serr == nil {
slog.Warn("circuit open, serving stale", "remote", remote.Name, "path", path)
stale.Source = "cache"
e.logAccess(remote.Name, path, true, stale.Size, 0)
return stale, nil
}
return nil, &ProxyError{Status: http.StatusServiceUnavailable, Message: "upstream circuit open"}
}
start := time.Now()
result, err := e.fetchFromUpstream(ctx, remote, path, prov, class, ttl)
result, err := e.fetchFromUpstream(ctx, remote, path, prov, class, ttl, fwdHeaders)
upstreamMS := int(time.Since(start).Milliseconds())
if err != nil {
if isNetworkError(err) {
e.circuit.RecordFailure(ctx, remote.Name)
}
if remote.StaleOnError && isNetworkError(err) {
_ = e.cache.SetTTL(ctx, remote.Name, path, ttl)
stale, serr := e.serveFromStore(ctx, remote, path)
if serr == nil {
slog.Warn("serving stale on upstream error", "remote", remote.Name, "path", path, "error", err)
stale.Source = "cache"
go e.logAccess(remote.Name, path, true, stale.Size, 0)
e.logAccess(remote.Name, path, true, stale.Size, 0)
return stale, nil
}
}
return nil, err
}
go e.logAccess(remote.Name, path, false, result.Size, upstreamMS)
e.circuit.RecordSuccess(ctx, remote.Name)
e.logAccess(remote.Name, path, false, result.Size, upstreamMS)
return result, nil
}
func (e *Engine) fetchFromUpstream(ctx context.Context, remote models.Remote, path string, prov provider.Provider, class Classification, ttl time.Duration) (*FetchResult, error) {
// HeadResult carries artifact metadata for a HEAD request. There is no body.
type HeadResult struct {
ContentType string
Size int64
Source string // "cache" or "remote"
}
// Head resolves artifact metadata without fetching or streaming the body.
// Cached artifacts/indexes are answered from the store metadata; on a miss it
// issues an upstream HEAD. It never downloads or caches the body.
func (e *Engine) Head(ctx context.Context, remote models.Remote, path string, prov provider.Provider) (*HeadResult, error) {
class := NewClassifier(prov).Classify(remote, path)
if class == ClassDenied {
return nil, &ProxyError{Status: http.StatusForbidden, Message: "access denied"}
}
if artifact, err := e.db.GetArtifact(ctx, remote.Name, path); err == nil && artifact != nil {
return &HeadResult{ContentType: artifact.ContentType, Size: artifact.SizeBytes, Source: "cache"}, nil
}
if info, err := e.store.Stat(ctx, storage.IndexKey(remote.Name, path)); err == nil {
return &HeadResult{ContentType: info.ContentType, Size: info.Size, Source: "cache"}, nil
}
return e.headUpstream(ctx, remote, path, prov)
}
func (e *Engine) headUpstream(ctx context.Context, remote models.Remote, path string, prov provider.Provider) (*HeadResult, error) {
url := prov.UpstreamURL(remote, path)
authHeaders, err := prov.AuthHeaders(ctx, remote)
if err != nil {
return nil, fmt.Errorf("auth headers: %w", err)
}
doHead := func(extra http.Header) (*http.Response, error) {
req, err := http.NewRequestWithContext(ctx, http.MethodHead, url, nil)
if err != nil {
return nil, fmt.Errorf("create request: %w", err)
}
for k, vv := range authHeaders {
for _, v := range vv {
req.Header.Add(k, v)
}
}
for k, vv := range extra {
for _, v := range vv {
req.Header.Set(k, v)
}
}
return http.DefaultClient.Do(req)
}
resp, err := doHead(nil)
if err != nil {
return nil, &UpstreamError{Err: err}
}
if resp.StatusCode == http.StatusUnauthorized {
resp.Body.Close()
token, _, terr := fetchBearerToken(ctx, resp.Header.Get("Www-Authenticate"), remote)
if terr == nil && token != "" {
resp, err = doHead(http.Header{"Authorization": []string{"Bearer " + token}})
if err != nil {
return nil, &UpstreamError{Err: err}
}
} else {
return nil, &ProxyError{Status: http.StatusUnauthorized, Message: "upstream returned 401"}
}
}
resp.Body.Close()
if resp.StatusCode != http.StatusOK {
return nil, &ProxyError{Status: resp.StatusCode, Message: fmt.Sprintf("upstream returned %d", resp.StatusCode)}
}
contentType := prov.ContentType(path)
if ct := resp.Header.Get("Content-Type"); ct != "" {
contentType = ct
}
return &HeadResult{ContentType: contentType, Size: resp.ContentLength, Source: "remote"}, nil
}
func (e *Engine) fetchFromUpstream(ctx context.Context, remote models.Remote, path string, prov provider.Provider, class Classification, ttl time.Duration, clientHeaders http.Header) (*FetchResult, error) {
url := prov.UpstreamURL(remote, path)
authHeaders, err := prov.AuthHeaders(ctx, remote)
@@ -141,94 +294,144 @@ func (e *Engine) fetchFromUpstream(ctx context.Context, remote models.Remote, pa
req.Header.Add(k, v)
}
}
if clientHeaders != nil {
if accept := clientHeaders.Get("Accept"); accept != "" {
req.Header.Set("Accept", accept)
}
}
resp, err := http.DefaultClient.Do(req)
resp, err := clientForRemote(remote).Do(req)
if err != nil {
return nil, &UpstreamError{Err: err}
}
if resp.StatusCode == http.StatusUnauthorized {
resp.Body.Close()
token, err := e.cachedBearerToken(ctx, resp.Header.Get("Www-Authenticate"), remote)
if err == nil && token != "" {
req2, _ := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
req2.Header.Set("Authorization", "Bearer "+token)
if clientHeaders != nil {
if accept := clientHeaders.Get("Accept"); accept != "" {
req2.Header.Set("Accept", accept)
}
}
resp, err = clientForRemote(remote).Do(req2)
if err != nil {
return nil, &UpstreamError{Err: err}
}
} else {
return nil, &ProxyError{Status: http.StatusUnauthorized, Message: "upstream returned 401"}
}
}
if resp.StatusCode != http.StatusOK {
resp.Body.Close()
return nil, &ProxyError{Status: resp.StatusCode, Message: fmt.Sprintf("upstream returned %d", resp.StatusCode)}
}
body, err := io.ReadAll(resp.Body)
resp.Body.Close()
if err != nil {
return nil, fmt.Errorf("read upstream body: %w", err)
}
rewritten, err := prov.RewriteResponse(body, remote, "")
if err != nil {
return nil, fmt.Errorf("rewrite response: %w", err)
}
if rewritten != nil {
body = rewritten
}
contentType := prov.ContentType(path)
if ct := resp.Header.Get("Content-Type"); ct != "" && contentType == "application/octet-stream" {
if ct := resp.Header.Get("Content-Type"); ct != "" {
contentType = ct
}
// Mutable indexes are small and may be rewritten, so buffer them in memory.
if class == ClassMutable {
body, err := io.ReadAll(resp.Body)
resp.Body.Close()
if err != nil {
return nil, fmt.Errorf("read upstream body: %w", err)
}
rewritten, err := prov.RewriteResponse(body, remote, "")
if err != nil {
return nil, fmt.Errorf("rewrite response: %w", err)
}
if rewritten != nil {
body = rewritten
}
s3Key := storage.IndexKey(remote.Name, path)
if err := e.store.Upload(ctx, s3Key, bytesReader(body), int64(len(body)), contentType); err != nil {
return nil, fmt.Errorf("upload index: %w", err)
}
etag := resp.Header.Get("ETag")
_ = e.cache.SetTTL(ctx, remote.Name, path, ttl)
if etag != "" {
_ = e.cache.SetETag(ctx, remote.Name, path, etag, ttl)
}
} else {
hash := sha256Hash(body)
s3Key := storage.BlobKey(hash)
exists, _ := e.store.Exists(ctx, s3Key)
if !exists {
if err := e.store.Upload(ctx, s3Key, bytesReader(body), int64(len(body)), contentType); err != nil {
return nil, fmt.Errorf("upload blob: %w", err)
}
}
contentHash := fmt.Sprintf("sha256:%s", hash)
if err := e.db.UpsertBlob(ctx, contentHash, s3Key, int64(len(body)), contentType); err != nil {
slog.Warn("upsert blob failed", "error", err)
}
if err := e.db.UpsertArtifact(ctx, remote.Name, path, contentHash, resp.Header.Get("ETag")); err != nil {
slog.Warn("upsert artifact failed", "error", err)
}
_ = e.cache.SetTTL(ctx, remote.Name, path, ttl)
if etag := resp.Header.Get("ETag"); etag != "" {
_ = e.cache.SetETag(ctx, remote.Name, path, etag, ttl)
}
return &FetchResult{
Reader: io.NopCloser(bytesReader(body)),
ContentType: contentType,
Size: int64(len(body)),
Source: "remote",
}, nil
}
// Immutable blobs are streamed through the content-addressable store
// (tempfile -> sha256 -> S3) so arbitrarily large artifacts never sit
// fully in memory. Immutable content is never rewritten in the proxy path.
casResult, err := e.cas.Store(ctx, resp.Body, contentType)
resp.Body.Close()
if err != nil {
return nil, fmt.Errorf("store blob: %w", err)
}
if err := e.db.UpsertBlob(ctx, casResult.ContentHash, casResult.S3Key, casResult.SizeBytes, contentType); err != nil {
slog.Warn("upsert blob failed", "error", err)
}
if err := e.db.UpsertArtifact(ctx, remote.Name, path, casResult.ContentHash, resp.Header.Get("ETag")); err != nil {
slog.Warn("upsert artifact failed", "error", err)
}
_ = e.cache.SetTTL(ctx, remote.Name, path, ttl)
if etag := resp.Header.Get("ETag"); etag != "" {
_ = e.cache.SetETag(ctx, remote.Name, path, etag, ttl)
}
reader, info, err := e.store.Download(ctx, casResult.S3Key)
if err != nil {
return nil, fmt.Errorf("serve stored blob: %w", err)
}
return &FetchResult{
Reader: io.NopCloser(bytesReader(body)),
ContentType: contentType,
Size: int64(len(body)),
Reader: reader,
ContentType: info.ContentType,
Size: casResult.SizeBytes,
Source: "remote",
}, nil
}
// waitForStore polls the store for an artifact populated by the request that
// holds the fetch lock, returning it once available or nil if it does not
// appear within the wait budget (after which the caller fetches upstream
// itself). It stops early if the request context is cancelled.
func (e *Engine) waitForStore(ctx context.Context, remote models.Remote, path string) *FetchResult {
const (
pollInterval = 100 * time.Millisecond
maxWait = 5 * time.Second
)
deadline := time.Now().Add(maxWait)
for {
if result, err := e.serveFromStore(ctx, remote, path); err == nil {
return result
}
if time.Now().After(deadline) {
return nil
}
select {
case <-ctx.Done():
return nil
case <-time.After(pollInterval):
}
}
}
func (e *Engine) serveFromStore(ctx context.Context, remote models.Remote, path string) (*FetchResult, error) {
artifact, err := e.db.GetArtifact(ctx, remote.Name, path)
if err == nil && artifact != nil {
reader, info, err := e.store.Download(ctx, artifact.ContentHash[len("sha256:"):])
if err == nil {
_ = e.db.TouchArtifactAccess(ctx, remote.Name, path)
return &FetchResult{
Reader: reader,
ContentType: info.ContentType,
Size: info.Size,
}, nil
}
s3Key := storage.BlobKey(artifact.ContentHash[len("sha256:"):])
reader, info, err = e.store.Download(ctx, s3Key)
reader, info, err := e.store.Download(ctx, s3Key)
if err == nil {
_ = e.db.TouchArtifactAccess(ctx, remote.Name, path)
return &FetchResult{
@@ -270,7 +473,7 @@ func (e *Engine) checkUpstream(ctx context.Context, remote models.Remote, path,
}
}
resp, err := http.DefaultClient.Do(req)
resp, err := clientForRemote(remote).Do(req)
if err != nil {
return false, &UpstreamError{Err: err}
}
@@ -291,15 +494,20 @@ func (e *Engine) ttlFor(remote models.Remote, class Classification) time.Duratio
}
}
// logAccess enqueues an access-log entry for the batch writer. It never blocks
// the request path: if the buffer is full the entry is dropped.
func (e *Engine) logAccess(remoteName, path string, cacheHit bool, size int64, upstreamMS int) {
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()
_ = e.db.InsertAccessLog(ctx, remoteName, path, cacheHit, size, upstreamMS, "")
}
func sha256Hash(data []byte) string {
h := sha256.Sum256(data)
return hex.EncodeToString(h[:])
select {
case e.accessLog <- database.AccessLogEntry{
RemoteName: remoteName,
Path: path,
CacheHit: cacheHit,
SizeBytes: size,
UpstreamMS: upstreamMS,
}:
default:
slog.Warn("access log buffer full, dropping entry", "remote", remoteName, "path", path)
}
}
func bytesReader(data []byte) io.Reader {
@@ -319,6 +527,110 @@ func (r readerAt) ReadAt(p []byte, off int64) (n int, err error) {
return
}
// bearerTokenTTLDefault/Margin bound how long a token is cached: the default
// is used when the token endpoint omits expires_in, and the margin is
// subtracted so a cached token is refreshed slightly before it actually expires.
const (
bearerTokenTTLDefault = 60 * time.Second
bearerTokenTTLMargin = 10 * time.Second
)
func sha256Hash(data []byte) string {
h := sha256.Sum256(data)
return hex.EncodeToString(h[:])
}
// cachedBearerToken returns a bearer token for the given challenge, reusing a
// Redis-cached token for the same remote+challenge while it is still valid.
func (e *Engine) cachedBearerToken(ctx context.Context, wwwAuth string, remote models.Remote) (string, error) {
key := remote.Name + ":" + sha256Hash([]byte(wwwAuth))
if tok, err := e.cache.GetToken(ctx, key); err == nil && tok != "" {
return tok, nil
}
tok, ttl, err := fetchBearerToken(ctx, wwwAuth, remote)
if err != nil {
return "", err
}
if tok != "" {
if ttl <= 0 {
ttl = bearerTokenTTLDefault
}
if ttl > bearerTokenTTLMargin {
ttl -= bearerTokenTTLMargin
}
_ = e.cache.SetToken(ctx, key, tok, ttl)
}
return tok, nil
}
func fetchBearerToken(ctx context.Context, wwwAuth string, remote models.Remote) (string, time.Duration, error) {
if !strings.HasPrefix(wwwAuth, "Bearer ") {
return "", 0, fmt.Errorf("not a Bearer challenge")
}
params := map[string]string{}
for _, part := range strings.Split(wwwAuth[7:], ",") {
part = strings.TrimSpace(part)
eq := strings.Index(part, "=")
if eq < 0 {
continue
}
key := part[:eq]
val := strings.Trim(part[eq+1:], `"`)
params[key] = val
}
realm := params["realm"]
if realm == "" {
return "", 0, fmt.Errorf("no realm in Bearer challenge")
}
tokenURL := realm
sep := "?"
if s, ok := params["service"]; ok {
tokenURL += sep + "service=" + s
sep = "&"
}
if s, ok := params["scope"]; ok {
tokenURL += sep + "scope=" + s
}
req, err := http.NewRequestWithContext(ctx, http.MethodGet, tokenURL, nil)
if err != nil {
return "", 0, err
}
if remote.Username != "" && remote.Password != "" {
req.SetBasicAuth(remote.Username, remote.Password)
}
resp, err := clientForRemote(remote).Do(req)
if err != nil {
return "", 0, err
}
defer resp.Body.Close()
if resp.StatusCode != http.StatusOK {
return "", 0, fmt.Errorf("token endpoint returned %d", resp.StatusCode)
}
var tokenResp struct {
Token string `json:"token"`
AccessToken string `json:"access_token"`
ExpiresIn int `json:"expires_in"`
}
if err := json.NewDecoder(resp.Body).Decode(&tokenResp); err != nil {
return "", 0, err
}
ttl := time.Duration(tokenResp.ExpiresIn) * time.Second
if tokenResp.Token != "" {
return tokenResp.Token, ttl, nil
}
return tokenResp.AccessToken, ttl, nil
}
type ProxyError struct {
Status int
Message string
@@ -334,8 +646,6 @@ func (e *UpstreamError) Error() string { return fmt.Sprintf("upstream error: %v"
func (e *UpstreamError) Unwrap() error { return e.Err }
func isNetworkError(err error) bool {
if _, ok := err.(*UpstreamError); ok {
return true
}
return false
var ue *UpstreamError
return errors.As(err, &ue)
}
+83
View File
@@ -0,0 +1,83 @@
package proxy
import (
"net"
"net/http"
"sync"
"time"
"git.unkin.net/unkin/artifactapi/pkg/models"
)
// Default upstream timeouts. A remote may override any of these; a zero
// override falls back to the default here. There is deliberately no overall
// Client.Timeout: the proxy streams arbitrarily large artifacts and total time
// is bounded by the request context instead. We only constrain the phases that
// must never hang — connect, TLS handshake, and time-to-first-response-header —
// so a slow or wedged upstream cannot pin a goroutine and connection.
const (
defaultDialTimeout = 10 * time.Second
defaultTLSTimeout = 10 * time.Second
defaultResponseHeaderTimeout = 30 * time.Second
)
type clientKey struct {
dial time.Duration
tls time.Duration
respHeader time.Duration
}
var (
clientCacheMu sync.Mutex
clientCache = map[clientKey]*http.Client{}
)
// upstreamClientFor returns an HTTP client configured with the given timeouts,
// reusing a cached client (and its connection pool) for identical timeout sets.
// Zero values fall back to the defaults.
func upstreamClientFor(dial, tls, respHeader time.Duration) *http.Client {
if dial <= 0 {
dial = defaultDialTimeout
}
if tls <= 0 {
tls = defaultTLSTimeout
}
if respHeader <= 0 {
respHeader = defaultResponseHeaderTimeout
}
key := clientKey{dial: dial, tls: tls, respHeader: respHeader}
clientCacheMu.Lock()
defer clientCacheMu.Unlock()
if c, ok := clientCache[key]; ok {
return c
}
c := &http.Client{
Transport: &http.Transport{
Proxy: http.ProxyFromEnvironment,
DialContext: (&net.Dialer{
Timeout: dial,
KeepAlive: 30 * time.Second,
}).DialContext,
MaxIdleConns: 100,
MaxIdleConnsPerHost: 10,
IdleConnTimeout: 90 * time.Second,
TLSHandshakeTimeout: tls,
ExpectContinueTimeout: 1 * time.Second,
ResponseHeaderTimeout: respHeader,
},
}
clientCache[key] = c
return c
}
// clientForRemote returns the upstream client for a remote, applying its
// per-remote timeout overrides (in seconds) on top of the defaults.
func clientForRemote(remote models.Remote) *http.Client {
return upstreamClientFor(
time.Duration(remote.UpstreamDialTimeout)*time.Second,
time.Duration(remote.UpstreamTLSTimeout)*time.Second,
time.Duration(remote.UpstreamResponseHeaderTimeout)*time.Second,
)
}
+5 -2
View File
@@ -35,6 +35,7 @@ import (
type Server struct {
cfg *config.Config
version string
router chi.Router
db *database.DB
cache *cache.Redis
@@ -45,7 +46,7 @@ type Server struct {
gc *gc.Collector
}
func New(cfg *config.Config) (*Server, error) {
func New(cfg *config.Config, version string) (*Server, error) {
db, err := database.New(cfg.DatabaseDSN())
if err != nil {
return nil, fmt.Errorf("database: %w", err)
@@ -68,6 +69,7 @@ func New(cfg *config.Config) (*Server, error) {
s := &Server{
cfg: cfg,
version: version,
db: db,
cache: redis,
store: s3,
@@ -96,6 +98,7 @@ func (s *Server) routes() chi.Router {
proxyHandler := v1.NewProxyHandler(s.engine, s.virtEngine, s.db, s.store, s.localHandler)
r.Mount("/api/v1", proxyHandler.Routes())
r.Mount("/v2", proxyHandler.DockerV2Routes())
remotesHandler := v2.NewRemotesHandler(s.db)
virtualsHandler := v2.NewVirtualsHandler(s.db)
@@ -137,7 +140,7 @@ func (s *Server) handleHealth(w http.ResponseWriter, r *http.Request) {
func (s *Server) handleRoot(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "application/json")
w.WriteHeader(http.StatusOK)
fmt.Fprint(w, `{"name":"artifactapi","version":"3.0.0-dev"}`)
fmt.Fprintf(w, `{"name":"artifactapi","version":"%s"}`, s.version)
}
func (s *Server) newHTTPServer() *http.Server {
+2 -2
View File
@@ -79,7 +79,7 @@ func (e *Engine) fetchMemberIndexes(ctx context.Context, virt models.Virtual, pa
results[idx] = result{err: fmt.Errorf("local index %q: %w", name, err)}
return
}
results[idx] = result{index: MemberIndex{RemoteName: name, RepoType: remote.RepoType, Body: body}}
results[idx] = result{index: MemberIndex{RemoteName: name, RepoType: remote.RepoType, BaseURL: remote.BaseURL, Body: body}}
return
}
@@ -102,7 +102,7 @@ func (e *Engine) fetchMemberIndexes(ctx context.Context, virt models.Virtual, pa
return
}
results[idx] = result{index: MemberIndex{RemoteName: name, RepoType: remote.RepoType, Body: body}}
results[idx] = result{index: MemberIndex{RemoteName: name, RepoType: remote.RepoType, BaseURL: remote.BaseURL, Body: body}}
}(i, memberName)
}
+40 -3
View File
@@ -54,15 +54,27 @@ func (m *HelmMerger) MergeIndexes(members []MemberIndex, proxyBaseURL string) ([
seen[chart][ver.Version] = true
if proxyBaseURL != "" {
routePrefix := "remote"
if member.RepoType == "local" {
routePrefix = "local"
}
baseHost := extractHost(member.BaseURL)
for i, u := range ver.URLs {
if strings.HasPrefix(u, "http://") || strings.HasPrefix(u, "https://") {
ver.URLs[i] = fmt.Sprintf("%s/api/v1/remote/%s/%s",
if baseHost != "" && extractHost(u) != baseHost {
continue
}
relPath := extractPathRelativeToBase(u, member.BaseURL)
ver.URLs[i] = fmt.Sprintf("%s/api/v1/%s/%s/%s",
strings.TrimRight(proxyBaseURL, "/"),
routePrefix,
member.RemoteName,
extractPath(u))
relPath)
} else {
ver.URLs[i] = fmt.Sprintf("%s/api/v1/remote/%s/%s",
ver.URLs[i] = fmt.Sprintf("%s/api/v1/%s/%s/%s",
strings.TrimRight(proxyBaseURL, "/"),
routePrefix,
member.RemoteName,
u)
}
@@ -78,6 +90,31 @@ func (m *HelmMerger) MergeIndexes(members []MemberIndex, proxyBaseURL string) ([
return yaml.Marshal(merged)
}
func extractHost(rawURL string) string {
idx := strings.Index(rawURL, "://")
if idx == -1 {
return ""
}
rest := rawURL[idx+3:]
slashIdx := strings.Index(rest, "/")
if slashIdx == -1 {
return rest
}
return rest[:slashIdx]
}
func extractPathRelativeToBase(rawURL, baseURL string) string {
fullPath := extractPath(rawURL)
basePath := extractPath(baseURL)
if basePath != "" {
basePath = strings.TrimRight(basePath, "/") + "/"
if strings.HasPrefix(fullPath, basePath) {
return fullPath[len(basePath):]
}
}
return fullPath
}
func extractPath(rawURL string) string {
idx := strings.Index(rawURL, "://")
if idx == -1 {
+1
View File
@@ -9,6 +9,7 @@ import (
type MemberIndex struct {
RemoteName string
RepoType models.RepoType
BaseURL string
Body []byte
}
+30
View File
@@ -2,6 +2,7 @@ package models
import (
"fmt"
"regexp"
"time"
)
@@ -46,6 +47,11 @@ type Remote struct {
MutableTTL int `json:"mutable_ttl"`
CheckMutable bool `json:"check_mutable"`
// Upstream HTTP timeouts in seconds. 0 means use the server default.
UpstreamDialTimeout int `json:"upstream_dial_timeout,omitempty"`
UpstreamTLSTimeout int `json:"upstream_tls_timeout,omitempty"`
UpstreamResponseHeaderTimeout int `json:"upstream_response_header_timeout,omitempty"`
Patterns []string `json:"patterns,omitempty"`
Blocklist []string `json:"blocklist,omitempty"`
MutablePatterns []string `json:"mutable_patterns,omitempty"`
@@ -66,6 +72,30 @@ type Remote struct {
UpdatedAt time.Time `json:"updated_at"`
}
// ValidatePatterns ensures every configured regex compiles. Storing an
// invalid pattern would otherwise be silently dropped at match time, which
// for the blocklist is a fail-open: a mistyped deny rule becomes a no-op.
func (r *Remote) ValidatePatterns() error {
groups := []struct {
field string
patterns []string
}{
{"patterns", r.Patterns},
{"blocklist", r.Blocklist},
{"mutable_patterns", r.MutablePatterns},
{"immutable_patterns", r.ImmutablePatterns},
{"ban_tags", r.BanTags},
}
for _, g := range groups {
for _, p := range g.patterns {
if _, err := regexp.Compile(p); err != nil {
return fmt.Errorf("invalid regex in %s: %q: %w", g.field, p, err)
}
}
}
return nil
}
type RemoteWithStats struct {
Remote
Stats RemoteStats `json:"stats"`
+19
View File
@@ -0,0 +1,19 @@
package models
import "testing"
func TestRemote_ValidatePatterns(t *testing.T) {
valid := &Remote{
Patterns: []string{`.*\.tar\.gz$`},
Blocklist: []string{`^secret/`},
ImmutablePatterns: []string{`\.rpm$`},
}
if err := valid.ValidatePatterns(); err != nil {
t.Fatalf("expected valid patterns, got %v", err)
}
bad := &Remote{Blocklist: []string{`[unterminated`}}
if err := bad.ValidatePatterns(); err == nil {
t.Fatal("expected error for invalid blocklist regex, got nil")
}
}
+7
View File
@@ -6,13 +6,20 @@ COPY package.json package-lock.json* ./
RUN npm ci
COPY . .
ARG BASE_PATH=/
ENV BASE_PATH=${BASE_PATH}
RUN npm run build
FROM nginx:alpine
ARG BASE_PATH=/
COPY --from=builder /app/dist /usr/share/nginx/html
COPY nginx.conf /etc/nginx/conf.d/default.conf
RUN sed -i "s|\${BASE_PATH}|${BASE_PATH}|g" /etc/nginx/conf.d/default.conf
EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]
+6 -27
View File
@@ -5,33 +5,12 @@ server {
root /usr/share/nginx/html;
index index.html;
location /api/ {
proxy_pass http://artifactapi:8000;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_buffering off;
}
location /v2/ {
proxy_pass http://artifactapi:8000;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_buffering off;
}
location /health {
proxy_pass http://artifactapi:8000;
}
location /metrics {
proxy_pass http://artifactapi:8000;
}
location / {
location ${BASE_PATH}/ {
rewrite ^${BASE_PATH}(/.*)$ $1 break;
try_files $uri $uri/ /index.html;
}
location = ${BASE_PATH} {
return 301 ${BASE_PATH}/;
}
}
+6
View File
@@ -2,6 +2,8 @@ import { Routes, Route, NavLink } from 'react-router-dom';
import { Dashboard } from './pages/Dashboard';
import { Remotes } from './pages/Remotes';
import { RemoteDetail } from './pages/RemoteDetail';
import { Locals } from './pages/Locals';
import { LocalDetail } from './pages/LocalDetail';
import { Virtuals } from './pages/Virtuals';
import { Objects } from './pages/Objects';
import { Probe } from './pages/Probe';
@@ -18,6 +20,7 @@ export function App() {
<div className="sidebar-nav">
<NavLink to="/" end>Dashboard</NavLink>
<NavLink to="/remotes">Remotes</NavLink>
<NavLink to="/locals">Locals</NavLink>
<NavLink to="/virtuals">Virtuals</NavLink>
<NavLink to="/probe">Test Remote</NavLink>
</div>
@@ -31,6 +34,9 @@ export function App() {
<Route path="/remotes" element={<Remotes />} />
<Route path="/remotes/:name" element={<RemoteDetail />} />
<Route path="/remotes/:name/objects" element={<Objects />} />
<Route path="/locals" element={<Locals />} />
<Route path="/locals/:name" element={<LocalDetail />} />
<Route path="/locals/:name/objects" element={<Objects />} />
<Route path="/virtuals" element={<Virtuals />} />
<Route path="/probe" element={<Probe />} />
</Routes>
+5 -1
View File
@@ -4,9 +4,13 @@ import { BrowserRouter } from 'react-router-dom';
import { App } from './App';
import './index.css';
declare const __BASE_PATH__: string;
const basename = __BASE_PATH__.replace(/\/+$/, '') || '/';
createRoot(document.getElementById('root')!).render(
<StrictMode>
<BrowserRouter>
<BrowserRouter basename={basename}>
<App />
</BrowserRouter>
</StrictMode>,
+5
View File
@@ -50,6 +50,11 @@ export function Dashboard() {
value={formatNumber(stats.total_blobs_deduped)}
sub="shared blobs"
/>
<StatsCard
label="Bandwidth Saved"
value={formatBytes(stats.bandwidth_saved_30d)}
sub="last 30 days"
/>
</div>
{health && (
+46
View File
@@ -0,0 +1,46 @@
import { useEffect, useState } from 'react';
import { useParams, Link } from 'react-router-dom';
import { api } from '../api/client';
import type { Remote } from '../api/types';
import { Badge } from '../components/Badge';
import './RemoteDetail.css';
export function LocalDetail() {
const { name } = useParams<{ name: string }>();
const [remote, setRemote] = useState<Remote | null>(null);
const [error, setError] = useState<string | null>(null);
useEffect(() => {
if (!name) return;
api.getRemote(name)
.then(setRemote)
.catch(e => setError(e.message));
}, [name]);
if (error) return <div className="error-banner">{error}</div>;
if (!remote) return <div className="loading">Loading...</div>;
return (
<div>
<div className="detail-header">
<Link to="/locals" className="back-link">&larr; Locals</Link>
<h1 className="page-title">{remote.name}</h1>
<div className="detail-badges">
<Badge variant="blue">{remote.package_type}</Badge>
<Badge variant="default">local</Badge>
{remote.managed_by && <Badge variant="green">managed by {remote.managed_by}</Badge>}
</div>
</div>
{remote.description && (
<p className="detail-description">{remote.description}</p>
)}
<div className="detail-actions">
<Link to={`/locals/${remote.name}/objects`} className="btn btn-primary">
Browse Files
</Link>
</div>
</div>
);
}
+93
View File
@@ -0,0 +1,93 @@
import { useEffect, useState } from 'react';
import { useNavigate } from 'react-router-dom';
import { api } from '../api/client';
import type { Remote } from '../api/types';
import { Badge } from '../components/Badge';
import { DataTable } from '../components/DataTable';
import './Remotes.css';
const typeColors: Record<string, 'blue' | 'green' | 'yellow' | 'red' | 'default'> = {
docker: 'blue',
helm: 'green',
rpm: 'yellow',
pypi: 'blue',
npm: 'red',
generic: 'default',
alpine: 'green',
puppet: 'yellow',
terraform: 'blue',
goproxy: 'green',
};
export function Locals() {
const navigate = useNavigate();
const [remotes, setRemotes] = useState<Remote[]>([]);
const [filter, setFilter] = useState('');
const [loading, setLoading] = useState(true);
useEffect(() => {
api.listRemotes()
.then(r => setRemotes((r || []).filter(x => x.repo_type === 'local')))
.finally(() => setLoading(false));
}, []);
const filtered = remotes.filter(r => {
if (filter && !r.name.toLowerCase().includes(filter.toLowerCase())) return false;
return true;
});
return (
<div>
<h1 className="page-title">Local Repositories</h1>
<div className="remotes-toolbar">
<input
className="search-input"
placeholder="Filter by name..."
value={filter}
onChange={e => setFilter(e.target.value)}
/>
<span className="result-count">{filtered.length} locals</span>
</div>
{loading ? (
<div className="loading">Loading...</div>
) : (
<DataTable
columns={[
{
key: 'name',
header: 'Name',
render: (r: Remote) => <span className="mono">{r.name}</span>,
},
{
key: 'type',
header: 'Type',
render: (r: Remote) => (
<Badge variant={typeColors[r.package_type] || 'default'}>
{r.package_type}
</Badge>
),
width: '110px',
},
{
key: 'description',
header: 'Description',
render: (r: Remote) => r.description || <span className="text-muted"></span>,
},
{
key: 'managed',
header: 'Managed',
render: (r: Remote) =>
r.managed_by ? <Badge variant="blue">{r.managed_by}</Badge> : <span className="text-muted"></span>,
width: '100px',
},
]}
data={filtered}
emptyMessage="No local repositories configured"
onRowClick={(r) => navigate(`/locals/${r.name}`)}
/>
)}
</div>
);
}
+5 -2
View File
@@ -1,5 +1,5 @@
import { useEffect, useState, useCallback, useMemo } from 'react';
import { useParams, Link } from 'react-router-dom';
import { useParams, useLocation, Link } from 'react-router-dom';
import { api } from '../api/client';
import type { Artifact } from '../api/types';
import { formatBytes, timeAgo, truncateHash } from '../components/format';
@@ -171,6 +171,9 @@ function TreeRow({ node, depth, expanded, onToggle, onEvict }: TreeRowProps) {
export function Objects() {
const { name } = useParams<{ name: string }>();
const location = useLocation();
const isLocal = location.pathname.startsWith('/locals/');
const backLink = isLocal ? `/locals/${name}` : `/remotes/${name}`;
const [artifacts, setArtifacts] = useState<Artifact[]>([]);
const [loading, setLoading] = useState(true);
const [filter, setFilter] = useState('');
@@ -233,7 +236,7 @@ export function Objects() {
return (
<div>
<div className="detail-header">
<Link to={`/remotes/${name}`} className="back-link">&larr; {name}</Link>
<Link to={backLink} className="back-link">&larr; {name}</Link>
<h1 className="page-title">Cached Objects</h1>
</div>
+3 -2
View File
@@ -32,9 +32,10 @@ export function Remotes() {
.finally(() => setLoading(false));
}, []);
const types = [...new Set(remotes.map(r => r.package_type))].sort();
const remoteOnly = remotes.filter(r => r.repo_type !== 'local');
const types = [...new Set(remoteOnly.map(r => r.package_type))].sort();
const filtered = remotes.filter(r => {
const filtered = remoteOnly.filter(r => {
if (typeFilter && r.package_type !== typeFilter) return false;
if (filter && !r.name.toLowerCase().includes(filter.toLowerCase())) return false;
return true;
+32 -10
View File
@@ -1,21 +1,38 @@
import { useEffect, useState } from 'react';
import { Link } from 'react-router-dom';
import { api } from '../api/client';
import type { Virtual } from '../api/types';
import type { Remote, Virtual } from '../api/types';
import { Badge } from '../components/Badge';
import { DataTable } from '../components/DataTable';
import './Virtuals.css';
export function Virtuals() {
const [virtuals, setVirtuals] = useState<Virtual[]>([]);
const [remoteMap, setRemoteMap] = useState<Record<string, Remote>>({});
const [loading, setLoading] = useState(true);
const [expanded, setExpanded] = useState<string | null>(null);
useEffect(() => {
api.listVirtuals()
.then(v => setVirtuals(v || []))
Promise.all([api.listVirtuals(), api.listRemotes()])
.then(([v, r]) => {
setVirtuals(v || []);
const map: Record<string, Remote> = {};
for (const remote of r || []) {
map[remote.name] = remote;
}
setRemoteMap(map);
})
.finally(() => setLoading(false));
}, []);
function memberLink(name: string) {
const remote = remoteMap[name];
if (remote?.repo_type === 'local') {
return `/locals/${name}`;
}
return `/remotes/${name}`;
}
return (
<div>
<h1 className="page-title">Virtual Repositories</h1>
@@ -40,7 +57,7 @@ export function Virtuals() {
key: 'members',
header: 'Members',
render: (v: Virtual) => (
<span className="member-count">{v.members?.length || 0} remotes</span>
<span className="member-count">{v.members?.length || 0} repos</span>
),
width: '110px',
},
@@ -69,12 +86,17 @@ export function Virtuals() {
<ul className="member-list">
{virtuals
.find(v => v.name === expanded)
?.members?.map((m, i) => (
<li key={m}>
<span className="member-priority">{i + 1}</span>
<a href={`/remotes/${m}`} className="mono">{m}</a>
</li>
))}
?.members?.map((m, i) => {
const remote = remoteMap[m];
const typeLabel = remote?.repo_type === 'local' ? 'local' : 'remote';
return (
<li key={m}>
<span className="member-priority">{i + 1}</span>
<Link to={memberLink(m)} className="mono">{m}</Link>
<Badge variant={typeLabel === 'local' ? 'yellow' : 'default'}>{typeLabel}</Badge>
</li>
);
})}
</ul>
</div>
)}
+6
View File
@@ -1,7 +1,10 @@
import { defineConfig } from 'vite'
import react from '@vitejs/plugin-react'
const basePath = process.env.BASE_PATH || '/'
export default defineConfig({
base: basePath,
plugins: [react()],
server: {
proxy: {
@@ -11,4 +14,7 @@ export default defineConfig({
'/metrics': 'http://localhost:8000',
},
},
define: {
'__BASE_PATH__': JSON.stringify(basePath),
},
})