Classify runs on every proxied request and recompiled each remote's
pattern lists every time. Cache compiled regexes keyed by pattern text so
each distinct pattern compiles once and is reused thereafter.
Refs #73
Shows total bytes served from cache (instead of upstream) over the last 30 days. Queries `SUM(size_bytes) WHERE cache_hit = TRUE` from access_log.
Reviewed-on: #65
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
When a helm repo base URL includes a path component (e.g. \`stakater.github.io/stakater-charts\`), the merger was extracting the full URL path (\`stakater-charts/reloader-2.2.8.tgz\`) and the proxy then constructed \`base_url/stakater-charts/reloader-2.2.8.tgz\` = double path = 404.
Fix: \`extractPathRelativeToBase()\` strips the shared base path prefix so only the filename portion is used as the proxy path.
Reviewed-on: #64
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
The / endpoint was hardcoded to return 3.0.0-dev. Now uses the git tag version set via ldflags at build time.
Reviewed-on: #63
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
## Problems
1. Docker daemon sends specific Accept headers to negotiate manifest format, but the proxy dropped them — registries defaulted to OCI format, causing "mediaType should be manifest.v2+json not oci.image.index" errors
2. Upstream Content-Type was only used when the provider returned "application/octet-stream" — Docker manifests got the wrong Content-Type
## Fixes
- Forward client Accept header to upstream (both initial request and Bearer token retry)
- Always prefer upstream Content-Type when present
- Fetch signature now accepts variadic clientHeaders for backwards compat
## E2E tested
- DockerHub: redis:7-alpine, alpine:3 — skopeo inspect OK
- GHCR: OCI-only images work with docker pull (GHCR 404s Docker v2 Accept, which is expected)
- Quay: prometheus/node-exporter — skopeo inspect OK
Reviewed-on: #62
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
Docker Hub (and other registries) return 401 with a `Www-Authenticate: Bearer realm=...` challenge even for public images. The proxy now:
1. Detects 401 + Bearer challenge
2. Parses realm/service/scope from the header
3. Fetches an anonymous token (or authenticated if username/password configured)
4. Retries the original request with the Bearer token
Fixes: `docker pull artifactapi.../dockerhub/library/redis:latest` returning "unauthorized: upstream returned 401"
Reviewed-on: #60
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
The v3 Go rewrite removed the /v2/ Docker Registry compatibility endpoint. Docker clients need:
- GET/HEAD /v2/ → 200 (registry ping)
- GET/HEAD /v2/{remoteName}/* → proxy to the docker remote
Usage: `docker pull artifactapi.example.com/{remoteName}/image:tag`
Reviewed-on: #57
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
## Problem
Helm charts like Intel device plugins have download URLs on `github.com` but the chart index is served from `intel.github.io`. The merger rewrites all URLs through the proxy, constructing:
```
https://artifactapi/api/v1/remote/intel-helm/intel/helm-charts/releases/download/...
```
Which proxies to `https://intel.github.io/helm-charts/intel/helm-charts/releases/download/...` — a 404.
## Fix
Compare the download URL host against the remote's base URL host. If they differ, leave the URL as-is so helm downloads directly from the source. Same-host URLs are still rewritten through the proxy.
Also adds `BaseURL` to `MemberIndex` so the merger has the context it needs, and uses the correct `/local/` vs `/remote/` route prefix.
Reviewed-on: #56
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
## Summary
- Upload RPMs to local repos, metadata parsed async via cavaliergopher/rpm
- Repodata (repomd.xml, primary/filelists/other.xml.gz) generated on-demand from DB — nothing stored in S3
- RPM provider implements LocalUploader, PostUploadHook, and LocalIndexer
- New rpm_metadata table for parsed RPM header data (name, version, deps, etc.)
- New provider interfaces: PostUploadHook, BlobReader, MetadataStore, RPMMetadataReader
## Test plan
- [x] Upload cowsay RPM from epel → async metadata parse confirmed in logs
- [x] repomd.xml generated with correct hashes → primary.xml.gz has correct metadata
- [x] `dnf install` from local repo: download + install successful
- [x] Bad file rejection (.txt → 400), overwrite rejection (409)
Reviewed-on: #53
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
## Summary
Move package-type-specific local repo logic out of centralized handlers into provider packages via optional Go interfaces.
**New interfaces in `provider` package:**
- \`LocalUploader\`: \`ValidateUpload(filePath) → (storagePath, contentType, error)\` + \`UploadResponse(...)\`
- \`LocalIndexer\`: \`ServeLocalIndex(w, r, files, repoName, path) → bool\` + \`GenerateLocalIndex(ctx, files, repoName, path) → ([]byte, error)\`
- \`FileStore\`: \`ListFilesByPrefix\` + \`ListPackages\` (implemented by database.DB)
**Providers implement these interfaces:**
- PyPI: upload validation (wheel/sdist naming), simple index serving + generation
- Terraform: upload validation (provider zip naming), mirror protocol serving
**Handlers simplified to generic dispatch:**
- \`local.go\`: type-asserts to \`LocalUploader\`, falls back to generic upload
- \`proxy.go\`: type-asserts to \`LocalIndexer\`, falls back to raw file serving
- \`engine.go\`: type-asserts to \`LocalIndexer\` for local virtual members
Adding a new local repo type (e.g. RPM) = implement the interfaces in its provider package. Zero handler changes.
## Test plan
- [x] Build + unit tests pass
- [x] E2E: PyPI local upload → simple index → uv pip install (smoke test after refactor)
Reviewed-on: #52
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
## Summary
- Virtual engine detects local members and generates indexes in-memory
- MemberIndex.RepoType drives correct URL prefix in merged output
- PyPI merger rewrites links to /api/v1/local/ or /api/v1/remote/ appropriately
- Includes local PyPI support (cherry-picked from #50)
## Test plan
- [x] Upload wheel to local PyPI → install from direct local URL
- [x] Create virtual with local + remote → install from virtual URL
- [x] Both paths produce correct absolute download URLs
Reviewed-on: #51
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
## Summary
- Upload Python wheels/sdists to local PyPI repos with filename validation
- PEP 503 simple index computed on-demand from stored files
- Package names normalized per PEP 503 (lowercase, hyphens)
- Overwrites rejected (409 Conflict)
## Test plan
- [x] Build wheel with `uv build` → upload → verify simple index HTML → `uv pip install` from local repo
- [x] Bad filename rejection (400)
- [x] Overwrite rejection (409)
- [x] Hash integrity verification on download
Reviewed-on: #50
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
Introduces repo_type (remote/local) as a separate axis from package_type
so that any package type can be hosted locally. A terraform local repo
is package_type=terraform + repo_type=local.
- Remote model gains RepoType field (defaults to "remote")
- Database schema adds repo_type column with migration for existing DBs
- V1 proxy adds /api/v1/local/{name}/* route for serving local files
- V2 upload via PUT /api/v2/remotes/{name}/files/{ns}/{type}/{file}.zip
validates filename matches terraform-provider-{type}_{ver}_{os}_{arch}.zip
and returns 409 on duplicate (no overwrites)
- index.json and {version}.json are computed on-the-fly from uploaded zips
rather than stored as separate files
- V2 create validates repo_type and requires base_url only for remotes
---------
Co-authored-by: Ben Vincent <ben@unkin.net>
Reviewed-on: #49
- Objects page renders paths as a collapsible tree instead of flat list
with expand/collapse all, aggregated size/hits per directory
- Dashboard gains top-files-by-hits and top-files-by-bandwidth tables
- Backend: new /api/v2/stats/top-files-by-hits and
/api/v2/stats/top-files-by-bandwidth endpoints
- Raised per_page max to 5000 for objects listing
---------
Co-authored-by: Ben Vincent <ben@unkin.net>
Reviewed-on: #48