Compare commits

..

1 Commits

Author SHA1 Message Date
unkinben 8320987121 feat: add virtual repository support for unified index merging
ci/woodpecker/pr/pre-commit Pipeline was successful
ci/woodpecker/pr/test Pipeline was successful
ci/woodpecker/pr/build Pipeline was successful
ci/woodpecker/tag/docker Pipeline was successful
Adds a new virtual repo type that merges indexes from multiple member remotes
of the same package type. Currently supports helm (index.yaml merge with URL
rewriting). Member fetches run in parallel; merged index is Redis-cached at
min(mutable_ttl) across members.
2026-04-29 22:59:51 +10:00
22 changed files with 316 additions and 917 deletions
+23 -47
View File
@@ -11,7 +11,6 @@ FastAPI caching proxy that downloads and stores files from remote sources in S3-
- Stale-on-upstream-error: refreshes TTL when backend is unreachable rather than evicting - Stale-on-upstream-error: refreshes TTL when backend is unreachable rather than evicting
- URL rewriting for PyPI simple index, npm metadata, and Helm `index.yaml` - URL rewriting for PyPI simple index, npm metadata, and Helm `index.yaml`
- Access control via regex patterns — unmatched paths return 403 - Access control via regex patterns — unmatched paths return 403
- Docker tag banning — block named tags (e.g. `latest`) while allowing digest pulls
## Architecture ## Architecture
@@ -71,11 +70,10 @@ src/artifactapi/
| Method | Path | Description | | Method | Path | Description |
|---|---|---| |---|---|---|
| `GET` | `/api/v1/remote/{remote}/{path}` | Fetch artifact (auto-cache on miss) | | `GET` | `/api/v1/remote/{remote}/{path}` | Fetch artifact (auto-cache on miss) |
| `PUT` | `/api/v1/remote/{remote}/{path}` | Upload to local remote |
| `HEAD` | `/api/v1/remote/{remote}/{path}` | Check existence (local remotes) |
| `DELETE` | `/api/v1/remote/{remote}/{path}` | Delete from local remote |
| `GET` | `/api/v1/virtual/{virtual}/{path}` | Fetch from virtual (merged) repository | | `GET` | `/api/v1/virtual/{virtual}/{path}` | Fetch from virtual (merged) repository |
| `GET` | `/api/v1/local/{local}/{path}` | Download from local repository |
| `PUT` | `/api/v1/local/{local}/{path}` | Upload to local repository |
| `HEAD` | `/api/v1/local/{local}/{path}` | Check existence (local) |
| `DELETE` | `/api/v1/local/{local}/{path}` | Delete from local repository |
| `GET` | `/v2/{remote}/{path}` | Docker Registry v2 proxy | | `GET` | `/v2/{remote}/{path}` | Docker Registry v2 proxy |
| `PUT` | `/cache/flush` | Flush cache entries | | `PUT` | `/cache/flush` | Flush cache entries |
| `GET` | `/health` | Health check | | `GET` | `/health` | Health check |
@@ -122,14 +120,13 @@ config_dir: conf.d # or an absolute path
remotes: {} # optional base remotes remotes: {} # optional base remotes
``` ```
### Configuration structure ### remotes.yaml Structure
Repositories are declared under three top-level keys matching their type:
```yaml ```yaml
remotes: # proxy (caching) remotes remotes:
remote-name: remote-name:
base_url: "https://example.com" base_url: "https://example.com"
type: "remote" # "remote", "local", or "virtual"
package: "generic" # generic, alpine, rpm, docker, pypi, npm, helm package: "generic" # generic, alpine, rpm, docker, pypi, npm, helm
description: "..." description: "..."
immutable_patterns: # regex — cached forever immutable_patterns: # regex — cached forever
@@ -140,20 +137,6 @@ remotes: # proxy (caching) remotes
cache: cache:
immutable_ttl: 0 # 0 = indefinitely immutable_ttl: 0 # 0 = indefinitely
mutable_ttl: 3600 mutable_ttl: 3600
virtuals: # virtual (merged-index) repositories
virtual-name:
package: "helm"
members:
- remote-a
- remote-b
locals: # local upload repositories (no base_url)
local-name:
package: "generic"
cache:
immutable_ttl: 0
mutable_ttl: 0
``` ```
## Remote Types ## Remote Types
@@ -166,6 +149,7 @@ Arbitrary HTTP file servers — GitHub releases, HashiCorp, custom servers.
remotes: remotes:
github: github:
base_url: "https://github.com" base_url: "https://github.com"
type: "remote"
package: "generic" package: "generic"
immutable_patterns: immutable_patterns:
- "gruntwork-io/terragrunt/.*terragrunt_linux_amd64.*" - "gruntwork-io/terragrunt/.*terragrunt_linux_amd64.*"
@@ -174,6 +158,7 @@ remotes:
github-archive: github-archive:
base_url: "https://github.com" base_url: "https://github.com"
type: "remote"
package: "generic" package: "generic"
immutable_patterns: immutable_patterns:
- ".*/archive/refs/tags/.*\\.tar\\.gz$" # tag archives never change - ".*/archive/refs/tags/.*\\.tar\\.gz$" # tag archives never change
@@ -193,6 +178,7 @@ Access: `GET /api/v1/remote/github/owner/repo/releases/download/v1.0/binary.tar.
remotes: remotes:
alpine: alpine:
base_url: "https://dl-cdn.alpinelinux.org" base_url: "https://dl-cdn.alpinelinux.org"
type: "remote"
package: "alpine" package: "alpine"
immutable_patterns: immutable_patterns:
- ".*/x86_64/.*\\.apk$" - ".*/x86_64/.*\\.apk$"
@@ -209,6 +195,7 @@ remotes:
remotes: remotes:
almalinux: almalinux:
base_url: "https://mirror.example.com/almalinux" base_url: "https://mirror.example.com/almalinux"
type: "remote"
package: "rpm" package: "rpm"
immutable_patterns: immutable_patterns:
- ".*/x86_64/.*\\.rpm$" - ".*/x86_64/.*\\.rpm$"
@@ -226,6 +213,7 @@ remotes:
remotes: remotes:
dockerhub: dockerhub:
base_url: "https://registry-1.docker.io" base_url: "https://registry-1.docker.io"
type: "remote"
package: "docker" package: "docker"
# username / password optional for public images # username / password optional for public images
cache: cache:
@@ -234,6 +222,7 @@ remotes:
ghcr: ghcr:
base_url: "https://ghcr.io" base_url: "https://ghcr.io"
type: "remote"
package: "docker" package: "docker"
username: "your-github-username" username: "your-github-username"
password: "ghp_your_pat" # read:packages scope password: "ghp_your_pat" # read:packages scope
@@ -244,26 +233,6 @@ remotes:
Tag manifests and `/tags/list` are built-in mutable patterns. Digest-addressed blobs are immutable. Tag manifests and `/tags/list` are built-in mutable patterns. Digest-addressed blobs are immutable.
#### Banning tags
Set `ban_tags_enabled: true` and list named tags in `ban_tags` to block specific tag references. Requests for a banned tag return `403`. Digest-addressed pulls (`sha256:…`) are never blocked, so images already in use can still be referenced by digest.
```yaml
remotes:
dockerhub:
base_url: "https://registry-1.docker.io"
package: "docker"
ban_tags_enabled: true
ban_tags:
- latest # force pinned tags in CI/CD
- edge
cache:
immutable_ttl: 0
mutable_ttl: 300
```
`ban_tags_enabled` defaults to `false`. Setting it to `true` with an empty `ban_tags` list has no effect.
For RKE2/containerd, configure `/etc/rancher/rke2/registries.yaml`: For RKE2/containerd, configure `/etc/rancher/rke2/registries.yaml`:
```yaml ```yaml
@@ -286,6 +255,7 @@ mirrors:
remotes: remotes:
pypi: pypi:
base_url: "https://files.pythonhosted.org" base_url: "https://files.pythonhosted.org"
type: "remote"
package: "pypi" package: "pypi"
check_mutable_updates: true check_mutable_updates: true
immutable_patterns: immutable_patterns:
@@ -317,6 +287,7 @@ default = true
remotes: remotes:
npm: npm:
base_url: "https://registry.npmjs.org" base_url: "https://registry.npmjs.org"
type: "remote"
package: "npm" package: "npm"
check_mutable_updates: true check_mutable_updates: true
immutable_patterns: immutable_patterns:
@@ -343,6 +314,7 @@ registry=https://artifacts.example.com/api/v1/remote/npm/
remotes: remotes:
hashicorp-helm: hashicorp-helm:
base_url: "https://helm.releases.hashicorp.com" base_url: "https://helm.releases.hashicorp.com"
type: "remote"
package: "helm" package: "helm"
check_mutable_updates: true check_mutable_updates: true
immutable_patterns: immutable_patterns:
@@ -371,6 +343,7 @@ All members must share the same `package` type as the virtual repo. Currently su
remotes: remotes:
helm-hashicorp: helm-hashicorp:
base_url: "https://helm.releases.hashicorp.com" base_url: "https://helm.releases.hashicorp.com"
type: "remote"
package: "helm" package: "helm"
immutable_patterns: immutable_patterns:
- "\\.tgz$" - "\\.tgz$"
@@ -380,6 +353,7 @@ remotes:
helm-bitnami: helm-bitnami:
base_url: "https://charts.bitnami.com/bitnami" base_url: "https://charts.bitnami.com/bitnami"
type: "remote"
package: "helm" package: "helm"
immutable_patterns: immutable_patterns:
- "\\.tgz$" - "\\.tgz$"
@@ -387,8 +361,8 @@ remotes:
immutable_ttl: 0 immutable_ttl: 0
mutable_ttl: 3600 mutable_ttl: 3600
virtuals:
helm-all: helm-all:
type: "virtual"
package: "helm" package: "helm"
members: members:
- helm-hashicorp # listed first = highest priority - helm-hashicorp # listed first = highest priority
@@ -411,7 +385,7 @@ If a member is unreachable and has no cached index, it is skipped and a warning
**Caching:** **Caching:**
The merged index is cached using `min(mutable_ttl)` across all members. Each member's raw index is cached in S3 under its own remote key; the virtual handler reuses those copies when available. On rebuild, each member's parsed index is also stored as a compact msgpack file (`index.msgpack`) alongside the raw YAML, eliminating the YAML parse cost on subsequent rebuilds. The merged index is cached using `min(mutable_ttl)` across all members. Each member's raw index is cached in S3 under its own remote key by the normal proxy rules; the virtual handler reuses those copies when available.
**Helm example:** **Helm example:**
@@ -425,8 +399,9 @@ Chart tarball URLs in the merged `index.yaml` are rewritten to point at the indi
### local ### local
```yaml ```yaml
locals: remotes:
local-generic: local-generic:
type: "local"
package: "generic" package: "generic"
description: "Local file repository" description: "Local file repository"
cache: cache:
@@ -434,7 +409,7 @@ locals:
mutable_ttl: 0 mutable_ttl: 0
``` ```
No `base_url`. Files are uploaded via `PUT /api/v1/local/{name}/{path}` and downloaded via `GET /api/v1/local/{name}/{path}`. No `base_url`. Files are uploaded via `PUT` and served via `GET`.
## Caching Model ## Caching Model
@@ -476,6 +451,7 @@ Set `quarantine_new: true` and `quarantine_days: N` on a remote to block immutab
remotes: remotes:
pypi: pypi:
base_url: "https://files.pythonhosted.org" base_url: "https://files.pythonhosted.org"
type: "remote"
package: "pypi" package: "pypi"
quarantine_new: true quarantine_new: true
quarantine_days: 3 # block packages published in the last 3 days quarantine_days: 3 # block packages published in the last 3 days
+11
View File
@@ -0,0 +1,11 @@
remotes:
alpine:
base_url: "https://dl-cdn.alpinelinux.org"
type: "remote"
package: "alpine"
description: "Alpine Linux APK package repository"
immutable_patterns:
- ".*/x86_64/.*\\.apk$"
cache:
immutable_ttl: 0
mutable_ttl: 7200
+12
View File
@@ -0,0 +1,12 @@
remotes:
github:
base_url: "https://github.com"
type: "remote"
package: "generic"
description: "GitHub releases and files"
immutable_patterns:
- "gruntwork-io/terragrunt/.*terragrunt_linux_amd64.*"
- "prometheus/node_exporter/.*/node_exporter-.*\\.linux-amd64\\.tar\\.gz$"
cache:
immutable_ttl: 0
mutable_ttl: 0
+17
View File
@@ -0,0 +1,17 @@
remotes:
pypi:
base_url: "https://files.pythonhosted.org"
type: "remote"
package: "pypi"
description: "Python Package Index"
check_mutable_updates: true
quarantine_new: true
quarantine_days: 3
immutable_patterns:
- "packages/.*\\.whl$"
- "packages/.*\\.whl\\.metadata$"
- "packages/.*\\.tar\\.gz$"
- "packages/.*\\.zip$"
cache:
immutable_ttl: 0
mutable_ttl: 600
+1
View File
@@ -1,6 +1,7 @@
remotes: remotes:
alpine: alpine:
base_url: "https://dl-cdn.alpinelinux.org" base_url: "https://dl-cdn.alpinelinux.org"
type: "remote"
package: "alpine" package: "alpine"
description: "Alpine Linux APK package repository" description: "Alpine Linux APK package repository"
immutable_patterns: immutable_patterns:
+1
View File
@@ -1,6 +1,7 @@
remotes: remotes:
github: github:
base_url: "https://github.com" base_url: "https://github.com"
type: "remote"
package: "generic" package: "generic"
description: "GitHub releases and files" description: "GitHub releases and files"
immutable_patterns: immutable_patterns:
+1
View File
@@ -1,6 +1,7 @@
remotes: remotes:
pypi: pypi:
base_url: "https://files.pythonhosted.org" base_url: "https://files.pythonhosted.org"
type: "remote"
package: "pypi" package: "pypi"
description: "Python Package Index" description: "Python Package Index"
check_mutable_updates: true check_mutable_updates: true
+34 -3
View File
@@ -35,6 +35,7 @@
remotes: remotes:
github: github:
base_url: "https://github.com" base_url: "https://github.com"
type: "remote"
package: "generic" package: "generic"
description: "GitHub releases and files" description: "GitHub releases and files"
immutable_patterns: immutable_patterns:
@@ -66,6 +67,7 @@ remotes:
github-archive: github-archive:
base_url: "https://github.com" base_url: "https://github.com"
type: "remote"
package: "generic" package: "generic"
description: "GitHub repository archive tarballs" description: "GitHub repository archive tarballs"
immutable_patterns: immutable_patterns:
@@ -85,6 +87,7 @@ remotes:
gitea-dl: gitea-dl:
base_url: "https://dl.gitea.com" base_url: "https://dl.gitea.com"
type: "remote"
package: "generic" package: "generic"
description: "Gitea download site" description: "Gitea download site"
immutable_patterns: immutable_patterns:
@@ -95,6 +98,7 @@ remotes:
hashicorp-releases: hashicorp-releases:
base_url: "https://releases.hashicorp.com" base_url: "https://releases.hashicorp.com"
type: "remote"
package: "generic" package: "generic"
description: "HashiCorp product releases" description: "HashiCorp product releases"
immutable_patterns: immutable_patterns:
@@ -115,6 +119,7 @@ remotes:
alpine: alpine:
base_url: "https://dl-cdn.alpinelinux.org" base_url: "https://dl-cdn.alpinelinux.org"
type: "remote"
package: "alpine" package: "alpine"
description: "Alpine Linux APK package repository" description: "Alpine Linux APK package repository"
immutable_patterns: immutable_patterns:
@@ -128,6 +133,7 @@ remotes:
almalinux: almalinux:
base_url: "https://gsl-syd.mm.fcix.net/almalinux" base_url: "https://gsl-syd.mm.fcix.net/almalinux"
type: "remote"
package: "rpm" package: "rpm"
description: "AlmaLinux RPM package repository" description: "AlmaLinux RPM package repository"
immutable_patterns: immutable_patterns:
@@ -144,6 +150,7 @@ remotes:
epel: epel:
base_url: "http://mirror.aarnet.edu.au/pub/epel" base_url: "http://mirror.aarnet.edu.au/pub/epel"
type: "remote"
package: "rpm" package: "rpm"
description: "EPEL (Extra Packages for Enterprise Linux)" description: "EPEL (Extra Packages for Enterprise Linux)"
immutable_patterns: immutable_patterns:
@@ -158,6 +165,7 @@ remotes:
fedora: fedora:
base_url: "https://gsl-syd.mm.fcix.net/fedora/linux" base_url: "https://gsl-syd.mm.fcix.net/fedora/linux"
type: "remote"
package: "rpm" package: "rpm"
description: "Fedora Linux RPM package repository" description: "Fedora Linux RPM package repository"
immutable_patterns: immutable_patterns:
@@ -172,6 +180,7 @@ remotes:
ghcr: ghcr:
base_url: "https://ghcr.io" base_url: "https://ghcr.io"
type: "remote"
package: "docker" package: "docker"
description: "GitHub Container Registry" description: "GitHub Container Registry"
# username: "your-github-username" # username: "your-github-username"
@@ -185,6 +194,7 @@ remotes:
dockerhub: dockerhub:
base_url: "https://registry-1.docker.io" base_url: "https://registry-1.docker.io"
type: "remote"
package: "docker" package: "docker"
description: "Docker Hub registry" description: "Docker Hub registry"
cache: cache:
@@ -193,6 +203,7 @@ remotes:
pypi: pypi:
base_url: "https://files.pythonhosted.org" base_url: "https://files.pythonhosted.org"
type: "remote"
package: "pypi" package: "pypi"
description: "Python Package Index — simple index and package files via a single remote" description: "Python Package Index — simple index and package files via a single remote"
# simple/ requests are transparently fetched from pypi.org; package files come from # simple/ requests are transparently fetched from pypi.org; package files come from
@@ -215,6 +226,7 @@ remotes:
pypi-gitea: pypi-gitea:
base_url: "https://gitea.example.com/api/packages/myorg/pypi" base_url: "https://gitea.example.com/api/packages/myorg/pypi"
type: "remote"
package: "pypi" package: "pypi"
description: "Private Gitea PyPI registry — simple index and files at the same host" description: "Private Gitea PyPI registry — simple index and files at the same host"
# username: "your-gitea-username" # username: "your-gitea-username"
@@ -232,6 +244,7 @@ remotes:
npm: npm:
base_url: "https://registry.npmjs.org" base_url: "https://registry.npmjs.org"
type: "remote"
package: "npm" package: "npm"
description: "npm registry — package metadata with tarball URL rewriting" description: "npm registry — package metadata with tarball URL rewriting"
check_mutable_updates: true check_mutable_updates: true
@@ -245,6 +258,7 @@ remotes:
hashicorp-helm: hashicorp-helm:
base_url: "https://helm.releases.hashicorp.com" base_url: "https://helm.releases.hashicorp.com"
type: "remote"
package: "helm" package: "helm"
description: "HashiCorp Helm chart repository (Vault, Consul, Nomad, etc.)" description: "HashiCorp Helm chart repository (Vault, Consul, Nomad, etc.)"
check_mutable_updates: true check_mutable_updates: true
@@ -256,6 +270,7 @@ remotes:
metallb: metallb:
base_url: "https://metallb.github.io/metallb" base_url: "https://metallb.github.io/metallb"
type: "remote"
package: "helm" package: "helm"
description: "MetalLB load balancer Helm charts" description: "MetalLB load balancer Helm charts"
check_mutable_updates: true check_mutable_updates: true
@@ -267,6 +282,7 @@ remotes:
jetstack: jetstack:
base_url: "https://charts.jetstack.io" base_url: "https://charts.jetstack.io"
type: "remote"
package: "helm" package: "helm"
description: "Jetstack Helm charts (cert-manager)" description: "Jetstack Helm charts (cert-manager)"
check_mutable_updates: true check_mutable_updates: true
@@ -278,6 +294,7 @@ remotes:
rancher-stable: rancher-stable:
base_url: "https://releases.rancher.com/server-charts/stable" base_url: "https://releases.rancher.com/server-charts/stable"
type: "remote"
package: "helm" package: "helm"
description: "Rancher stable Helm charts" description: "Rancher stable Helm charts"
check_mutable_updates: true check_mutable_updates: true
@@ -289,6 +306,7 @@ remotes:
purelb: purelb:
base_url: "https://gitlab.com/api/v4/projects/20400619/packages/helm/stable" base_url: "https://gitlab.com/api/v4/projects/20400619/packages/helm/stable"
type: "remote"
package: "helm" package: "helm"
description: "PureLB load balancer Helm charts" description: "PureLB load balancer Helm charts"
check_mutable_updates: true check_mutable_updates: true
@@ -300,6 +318,7 @@ remotes:
istio: istio:
base_url: "https://istio-release.storage.googleapis.com/charts" base_url: "https://istio-release.storage.googleapis.com/charts"
type: "remote"
package: "helm" package: "helm"
description: "Istio service mesh Helm charts" description: "Istio service mesh Helm charts"
check_mutable_updates: true check_mutable_updates: true
@@ -311,6 +330,7 @@ remotes:
cnpg: cnpg:
base_url: "https://cloudnative-pg.github.io/charts" base_url: "https://cloudnative-pg.github.io/charts"
type: "remote"
package: "helm" package: "helm"
description: "CloudNativePG operator Helm charts" description: "CloudNativePG operator Helm charts"
check_mutable_updates: true check_mutable_updates: true
@@ -322,6 +342,7 @@ remotes:
ceph-csi: ceph-csi:
base_url: "https://ceph.github.io/csi-charts" base_url: "https://ceph.github.io/csi-charts"
type: "remote"
package: "helm" package: "helm"
description: "Ceph CSI driver Helm charts" description: "Ceph CSI driver Helm charts"
check_mutable_updates: true check_mutable_updates: true
@@ -333,6 +354,7 @@ remotes:
external-dns: external-dns:
base_url: "https://kubernetes-sigs.github.io/external-dns/" base_url: "https://kubernetes-sigs.github.io/external-dns/"
type: "remote"
package: "helm" package: "helm"
description: "ExternalDNS Helm charts" description: "ExternalDNS Helm charts"
check_mutable_updates: true check_mutable_updates: true
@@ -344,6 +366,7 @@ remotes:
intel-helm: intel-helm:
base_url: "https://intel.github.io/helm-charts/" base_url: "https://intel.github.io/helm-charts/"
type: "remote"
package: "helm" package: "helm"
description: "Intel Helm charts" description: "Intel Helm charts"
check_mutable_updates: true check_mutable_updates: true
@@ -355,6 +378,7 @@ remotes:
elastic: elastic:
base_url: "https://helm.elastic.co" base_url: "https://helm.elastic.co"
type: "remote"
package: "helm" package: "helm"
description: "Elastic stack Helm charts" description: "Elastic stack Helm charts"
check_mutable_updates: true check_mutable_updates: true
@@ -366,6 +390,7 @@ remotes:
k8up-io: k8up-io:
base_url: "https://k8up-io.github.io/k8up" base_url: "https://k8up-io.github.io/k8up"
type: "remote"
package: "helm" package: "helm"
description: "K8up backup operator Helm charts" description: "K8up backup operator Helm charts"
check_mutable_updates: true check_mutable_updates: true
@@ -377,6 +402,7 @@ remotes:
victoriametrics: victoriametrics:
base_url: "https://victoriametrics.github.io/helm-charts/" base_url: "https://victoriametrics.github.io/helm-charts/"
type: "remote"
package: "helm" package: "helm"
description: "VictoriaMetrics observability Helm charts" description: "VictoriaMetrics observability Helm charts"
check_mutable_updates: true check_mutable_updates: true
@@ -388,6 +414,7 @@ remotes:
grafana: grafana:
base_url: "https://grafana.github.io/helm-charts" base_url: "https://grafana.github.io/helm-charts"
type: "remote"
package: "helm" package: "helm"
description: "Grafana observability Helm charts" description: "Grafana observability Helm charts"
check_mutable_updates: true check_mutable_updates: true
@@ -399,6 +426,7 @@ remotes:
helm-openldap: helm-openldap:
base_url: "https://jp-gouin.github.io/helm-openldap/" base_url: "https://jp-gouin.github.io/helm-openldap/"
type: "remote"
package: "helm" package: "helm"
description: "OpenLDAP Helm charts" description: "OpenLDAP Helm charts"
check_mutable_updates: true check_mutable_updates: true
@@ -410,6 +438,7 @@ remotes:
woodpecker: woodpecker:
base_url: "https://woodpecker-ci.org/" base_url: "https://woodpecker-ci.org/"
type: "remote"
package: "helm" package: "helm"
description: "Woodpecker CI Helm charts" description: "Woodpecker CI Helm charts"
check_mutable_updates: true check_mutable_updates: true
@@ -421,6 +450,7 @@ remotes:
stakater: stakater:
base_url: "https://stakater.github.io/stakater-charts" base_url: "https://stakater.github.io/stakater-charts"
type: "remote"
package: "helm" package: "helm"
description: "Stakater Helm charts" description: "Stakater Helm charts"
check_mutable_updates: true check_mutable_updates: true
@@ -432,6 +462,7 @@ remotes:
jfrog: jfrog:
base_url: "https://charts.jfrog.io/" base_url: "https://charts.jfrog.io/"
type: "remote"
package: "helm" package: "helm"
description: "JFrog Helm charts" description: "JFrog Helm charts"
check_mutable_updates: true check_mutable_updates: true
@@ -443,6 +474,7 @@ remotes:
openvox: openvox:
base_url: "https://openvoxproject.github.io/openvox-helm-chart" base_url: "https://openvoxproject.github.io/openvox-helm-chart"
type: "remote"
package: "helm" package: "helm"
description: "OpenVox Helm charts" description: "OpenVox Helm charts"
check_mutable_updates: true check_mutable_updates: true
@@ -452,9 +484,8 @@ remotes:
immutable_ttl: 0 immutable_ttl: 0
mutable_ttl: 3600 mutable_ttl: 3600
virtuals:
helm-all: helm-all:
type: "virtual"
package: "helm" package: "helm"
description: "Virtual repository merging all helm remotes — member order is priority order for duplicate chart+version" description: "Virtual repository merging all helm remotes — member order is priority order for duplicate chart+version"
members: members:
@@ -478,8 +509,8 @@ virtuals:
- jfrog - jfrog
- openvox - openvox
locals:
local-generic: local-generic:
type: "local"
package: "generic" package: "generic"
description: "Local generic file repository" description: "Local generic file repository"
cache: cache:
-1
View File
@@ -14,7 +14,6 @@ dependencies = [
"lxml>=4.9.0", "lxml>=4.9.0",
"prometheus-client>=0.19.0", "prometheus-client>=0.19.0",
"python-multipart>=0.0.6", "python-multipart>=0.0.6",
"msgpack>=1.0.0",
] ]
requires-python = ">=3.11" requires-python = ">=3.11"
readme = "README.md" readme = "README.md"
-35
View File
@@ -1,4 +1,3 @@
import asyncio
import hashlib import hashlib
import json import json
import logging import logging
@@ -34,16 +33,6 @@ async def proxy(request: Request, remote_name: str, path: str, storage, cache, c
logger.info(f"PATTERN BLOCKED: {remote_name}/{path}") logger.info(f"PATTERN BLOCKED: {remote_name}/{path}")
raise HTTPException(status_code=403, detail="Image not allowed by configuration patterns") raise HTTPException(status_code=403, detail="Image not allowed by configuration patterns")
if remote_config.get("ban_tags_enabled", False):
ban_tags = remote_config.get("ban_tags", [])
if ban_tags:
tag_match = re.search(r"/manifests/([^/]+)$", path)
if tag_match:
tag = tag_match.group(1)
if not tag.startswith("sha256:") and tag in ban_tags:
logger.info(f"TAG BANNED: {remote_name}/{path} (tag: {tag})")
raise HTTPException(status_code=403, detail=f"Tag '{tag}' is not permitted on this remote")
base_url = remote_config.get("base_url", "").rstrip("/") base_url = remote_config.get("base_url", "").rstrip("/")
remote_url = f"{base_url}/v2/{path}" remote_url = f"{base_url}/v2/{path}"
@@ -58,21 +47,8 @@ async def proxy(request: Request, remote_name: str, path: str, storage, cache, c
if not await _proxy.handle_expired_mutable(remote_name, path, remote_url, config, cache, storage): if not await _proxy.handle_expired_mutable(remote_name, path, remote_url, config, cache, storage):
cached_key = None cached_key = None
lock_acquired = False
if not cached_key:
lock_acquired = cache.acquire_fetch_lock(remote_name, path)
if not lock_acquired:
# Another pod is already fetching — poll storage briefly before issuing a duplicate upstream request
for _ in range(10):
await asyncio.sleep(0.5)
probe_key = storage.get_object_key(remote_name, path)
if storage.exists(probe_key):
cached_key = probe_key
break
if not cached_key: if not cached_key:
logger.info(f"Cache MISS: {remote_name}/{path} - fetching from remote: {remote_url}") logger.info(f"Cache MISS: {remote_name}/{path} - fetching from remote: {remote_url}")
try:
result = await _proxy.cache_single_artifact(remote_url, remote_name, path, storage, remote_config) result = await _proxy.cache_single_artifact(remote_url, remote_name, path, storage, remote_config)
if result["status"] == "error": if result["status"] == "error":
raise HTTPException(status_code=502, detail=f"Failed to fetch: {result['error']}") raise HTTPException(status_code=502, detail=f"Failed to fetch: {result['error']}")
@@ -88,9 +64,6 @@ async def proxy(request: Request, remote_name: str, path: str, storage, cache, c
if published: if published:
cache.store_artifact_published(remote_name, path, published) cache.store_artifact_published(remote_name, path, published)
_proxy._check_quarantine(remote_name, published, config) _proxy._check_quarantine(remote_name, published, config)
finally:
if lock_acquired:
cache.release_fetch_lock(remote_name, path)
elif not is_mutable: elif not is_mutable:
published = cache.get_artifact_published(remote_name, path) published = cache.get_artifact_published(remote_name, path)
if not published: if not published:
@@ -117,14 +90,6 @@ async def proxy(request: Request, remote_name: str, path: str, storage, cache, c
content_type = "application/vnd.oci.image.manifest.v1+json" content_type = "application/vnd.oci.image.manifest.v1+json"
digest = f"sha256:{hashlib.sha256(artifact_data).hexdigest()}" digest = f"sha256:{hashlib.sha256(artifact_data).hexdigest()}"
# Cross-link tag manifests to their sha256 digest key so digest-addressed pulls hit cache
if is_mutable and "/manifests/" in path:
digest_path = re.sub(r"/manifests/[^/]+$", f"/manifests/{digest}", path)
digest_key = storage.get_object_key(remote_name, digest_path)
if not storage.exists(digest_key):
storage.upload(digest_key, artifact_data)
headers = { headers = {
"Docker-Distribution-Api-Version": "registry/2.0", "Docker-Distribution-Api-Version": "registry/2.0",
"Docker-Content-Digest": digest, "Docker-Content-Digest": digest,
+16 -21
View File
@@ -1,6 +1,5 @@
import hashlib import hashlib
import logging import logging
import os
from fastapi import HTTPException, Response, UploadFile from fastapi import HTTPException, Response, UploadFile
from fastapi.responses import JSONResponse from fastapi.responses import JSONResponse
@@ -8,23 +7,12 @@ from fastapi.responses import JSONResponse
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
def download(remote_name: str, path: str, storage, database, config) -> Response:
if not config.get_local_config(remote_name):
raise HTTPException(status_code=404, detail=f"Local repository '{remote_name}' not configured")
metadata = database.get_local_file_metadata(remote_name, path)
if not metadata:
raise HTTPException(status_code=404, detail="File not found")
content = storage.download_object(metadata["s3_key"])
return Response(
content=content,
media_type=metadata.get("content_type", "application/octet-stream"),
headers={"Content-Disposition": f"attachment; filename={os.path.basename(path)}"},
)
async def upload(remote_name: str, path: str, file: UploadFile, storage, database, config) -> JSONResponse: async def upload(remote_name: str, path: str, file: UploadFile, storage, database, config) -> JSONResponse:
if not config.get_local_config(remote_name): remote_config = config.get_remote_config(remote_name)
raise HTTPException(status_code=404, detail=f"Local repository '{remote_name}' not configured") if not remote_config:
raise HTTPException(status_code=404, detail=f"Remote '{remote_name}' not configured")
if remote_config.get("type") != "local":
raise HTTPException(status_code=400, detail="Upload only supported for local repositories")
try: try:
content = await file.read() content = await file.read()
@@ -71,8 +59,12 @@ async def upload(remote_name: str, path: str, file: UploadFile, storage, databas
def check_exists(remote_name: str, path: str, database, config) -> Response: def check_exists(remote_name: str, path: str, database, config) -> Response:
if not config.get_local_config(remote_name): remote_config = config.get_remote_config(remote_name)
raise HTTPException(status_code=404, detail=f"Local repository '{remote_name}' not configured") if not remote_config:
raise HTTPException(status_code=404, detail=f"Remote '{remote_name}' not configured")
if remote_config.get("type") != "local":
raise HTTPException(status_code=405, detail="HEAD method only supported for local repositories")
try: try:
metadata = database.get_local_file_metadata(remote_name, path) metadata = database.get_local_file_metadata(remote_name, path)
@@ -95,8 +87,11 @@ def check_exists(remote_name: str, path: str, database, config) -> Response:
def delete(remote_name: str, path: str, storage, database, config) -> JSONResponse: def delete(remote_name: str, path: str, storage, database, config) -> JSONResponse:
if not config.get_local_config(remote_name): remote_config = config.get_remote_config(remote_name)
raise HTTPException(status_code=404, detail=f"Local repository '{remote_name}' not configured") if not remote_config:
raise HTTPException(status_code=404, detail=f"Remote '{remote_name}' not configured")
if remote_config.get("type") != "local":
raise HTTPException(status_code=400, detail="Delete only supported for local repositories")
try: try:
s3_key = database.delete_local_file(remote_name, path) s3_key = database.delete_local_file(remote_name, path)
+13
View File
@@ -218,6 +218,19 @@ async def handle(request: Request, remote_name: str, path: str, storage, cache,
if not remote_config: if not remote_config:
raise HTTPException(status_code=404, detail=f"Remote '{remote_name}' not configured") raise HTTPException(status_code=404, detail=f"Remote '{remote_name}' not configured")
if remote_config.get("type") == "local":
metadata = database.get_local_file_metadata(remote_name, path)
if not metadata:
raise HTTPException(status_code=404, detail="File not found")
content = storage.download_object(metadata["s3_key"])
if content is None:
raise HTTPException(status_code=500, detail="File not accessible")
return Response(
content=content,
media_type=metadata.get("content_type", "application/octet-stream"),
headers={"Content-Disposition": f"attachment; filename={os.path.basename(path)}"},
)
path_parts = path.split("/") path_parts = path.split("/")
if len(path_parts) >= 2: if len(path_parts) >= 2:
repo_path = f"{path_parts[0]}/{path_parts[1]}" repo_path = f"{path_parts[0]}/{path_parts[1]}"
+21 -111
View File
@@ -6,21 +6,15 @@ from datetime import UTC, date, datetime
from typing import Protocol, runtime_checkable from typing import Protocol, runtime_checkable
import httpx import httpx
import msgpack as _msgpack
import yaml import yaml
from fastapi import HTTPException, Request, Response from fastapi import HTTPException, Request, Response
from ..remote import helm as _helm
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
try:
_YamlLoader = yaml.CSafeLoader
_YamlDumperBase = yaml.CDumper
except AttributeError:
_YamlLoader = yaml.SafeLoader
_YamlDumperBase = yaml.Dumper
class _HelmDumper(yaml.Dumper):
class _HelmDumper(_YamlDumperBase):
"""YAML dumper that serializes datetime/date objects back to ISO 8601 strings. """YAML dumper that serializes datetime/date objects back to ISO 8601 strings.
yaml.safe_load converts timestamp-shaped YAML scalars (e.g. chart `created` yaml.safe_load converts timestamp-shaped YAML scalars (e.g. chart `created`
@@ -43,43 +37,21 @@ _HelmDumper.add_representer(datetime, _repr_datetime)
_HelmDumper.add_representer(date, _repr_date) _HelmDumper.add_representer(date, _repr_date)
def _entries_to_msgpack_safe(entries: dict) -> dict:
"""Convert datetime/date values to ISO strings for msgpack serialization."""
result = {}
for chart, versions in entries.items():
safe_versions = []
for v in versions:
safe_v = {}
for k, val in v.items():
if isinstance(val, datetime):
safe_v[k] = val.isoformat()
elif isinstance(val, date):
safe_v[k] = val.isoformat()
else:
safe_v[k] = val
safe_versions.append(safe_v)
result[chart] = safe_versions
return result
async def _get_member_index( async def _get_member_index(
member_name: str, member_name: str,
member_cfg: dict, member_cfg: dict,
path: str, path: str,
storage, storage,
cache, cache,
) -> tuple[str, dict, int, bytes | None, dict | None]: ) -> tuple[str, dict, int, bytes | None]:
"""Fetch or retrieve cached index.yaml for one member remote. """Fetch or retrieve cached index.yaml for one member remote.
Returns (member_name, member_cfg, ttl, raw_bytes, parsed_entries). Returns (member_name, member_cfg, ttl, raw_bytes).
raw_bytes is None if the member is unreachable and not in S3. raw_bytes is None if the member is unreachable and not in S3.
parsed_entries is the pre-parsed entries dict (from msgpack cache), or None.
""" """
member_ttl = member_cfg.get("cache", {}).get("mutable_ttl", 3600) member_ttl = member_cfg.get("cache", {}).get("mutable_ttl", 3600)
s3_key = storage.get_object_key(member_name, path) s3_key = storage.get_object_key(member_name, path)
msgpack_key = storage.get_object_key(member_name, "index.msgpack")
raw_data: bytes | None = None raw_data: bytes | None = None
parsed_entries: dict | None = None
if storage.exists(s3_key) and cache.is_index_valid(member_name, path): if storage.exists(s3_key) and cache.is_index_valid(member_name, path):
try: try:
@@ -87,13 +59,6 @@ async def _get_member_index(
logger.info(f"Virtual: cache hit for member '{member_name}'") logger.info(f"Virtual: cache hit for member '{member_name}'")
except Exception: except Exception:
raw_data = None raw_data = None
if raw_data is not None and storage.exists(msgpack_key):
try:
packed = storage.download_object(msgpack_key)
parsed_entries = _msgpack.unpackb(packed, raw=False)
logger.debug(f"Virtual: msgpack hit for member '{member_name}'")
except Exception:
parsed_entries = None
if raw_data is None: if raw_data is None:
base_url = member_cfg.get("base_url", "").rstrip("/") base_url = member_cfg.get("base_url", "").rstrip("/")
@@ -111,74 +76,35 @@ async def _get_member_index(
raw_data = response.content raw_data = response.content
except Exception as e: except Exception as e:
logger.warning(f"Virtual: failed to fetch index.yaml from member '{member_name}': {e}") logger.warning(f"Virtual: failed to fetch index.yaml from member '{member_name}': {e}")
return member_name, member_cfg, member_ttl, None, None return member_name, member_cfg, member_ttl, None
try: try:
storage.upload(s3_key, raw_data) storage.upload(s3_key, raw_data)
cache.mark_index_cached(member_name, path, member_ttl) cache.mark_index_cached(member_name, path, member_ttl)
except Exception as e: except Exception as e:
logger.warning(f"Virtual: failed to cache index.yaml for member '{member_name}': {e}") logger.warning(f"Virtual: failed to cache index.yaml for member '{member_name}': {e}")
if parsed_entries is None and raw_data is not None: return member_name, member_cfg, member_ttl, raw_data
try:
index = yaml.load(raw_data, Loader=_YamlLoader)
safe_entries = _entries_to_msgpack_safe(index.get("entries") or {})
storage.upload(msgpack_key, _msgpack.packb(safe_entries, use_bin_type=True))
parsed_entries = safe_entries
except Exception as e:
logger.warning(f"Virtual: failed to build msgpack cache for '{member_name}': {e}")
return member_name, member_cfg, member_ttl, raw_data, parsed_entries
def _rewrite_urls(urls: list, base_url: str, proxy_base: str, member_name: str) -> list: def _merge_helm_indexes(raw_indexes: list[bytes], member_names: list[str], member_configs: list[dict], proxy_base: str) -> bytes:
proxy_remote = f"{proxy_base}/api/v1/remote/{member_name}"
rewritten = []
for url in urls:
if url.startswith(("http://", "https://")):
if base_url and url.startswith(base_url):
url = proxy_remote + url[len(base_url) :]
else:
url = f"{proxy_remote}/{url.lstrip('/')}"
rewritten.append(url)
return rewritten
def _merge_helm_indexes(
raw_indexes: list[bytes],
parsed_entries_list: list[dict | None],
member_names: list[str],
member_configs: list[dict],
proxy_base: str,
) -> bytes:
"""Merge helm index.yaml files with per-member URL rewriting. """Merge helm index.yaml files with per-member URL rewriting.
Priority is determined by position in member_names: earlier members win Priority is determined by position in member_names: earlier members win
when the same chart name + version appears in multiple remotes. when the same chart name + version appears in multiple remotes.
Uses pre-parsed msgpack entries when available to skip YAML parsing.
""" """
merged_entries: dict[str, list] = {} merged_entries: dict[str, list] = {}
for raw_data, pre_parsed, member_name, member_cfg in zip(raw_indexes, parsed_entries_list, member_names, member_configs): for raw_data, member_name, member_cfg in zip(raw_indexes, member_names, member_configs):
base_url = member_cfg.get("base_url", "").rstrip("/") base_url = member_cfg.get("base_url", "").rstrip("/")
rewritten, _ = _helm.resolve_content(raw_data, "index.yaml", "index.yaml", base_url, proxy_base, member_name)
if pre_parsed is not None:
entries = pre_parsed
else:
try: try:
index = yaml.load(raw_data, Loader=_YamlLoader) index = yaml.safe_load(rewritten)
except Exception as e: except Exception as e:
logger.warning(f"Virtual: failed to parse index.yaml from member '{member_name}': {e}") logger.warning(f"Virtual: failed to parse index.yaml from member '{member_name}': {e}")
continue continue
entries = index.get("entries") or {}
for chart_name, versions in entries.items(): for chart_name, versions in (index.get("entries") or {}).items():
for version_entry in versions:
version_entry["urls"] = _rewrite_urls(
version_entry.get("urls") or [],
base_url,
proxy_base,
member_name,
)
if chart_name not in merged_entries: if chart_name not in merged_entries:
merged_entries[chart_name] = list(versions) merged_entries[chart_name] = list(versions)
else: else:
@@ -200,14 +126,7 @@ def _merge_helm_indexes(
@runtime_checkable @runtime_checkable
class _VirtualHandler(Protocol): class _VirtualHandler(Protocol):
def accepts_path(self, path: str) -> bool: ... def accepts_path(self, path: str) -> bool: ...
def merge( def merge(self, raw_indexes: list[bytes], member_names: list[str], member_configs: list[dict], proxy_base: str) -> bytes: ...
self,
raw_indexes: list[bytes],
parsed_entries: list[dict | None],
member_names: list[str],
member_configs: list[dict],
proxy_base: str,
) -> bytes: ...
def path_error(self) -> str: ... def path_error(self) -> str: ...
@@ -215,15 +134,8 @@ class _HelmHandler:
def accepts_path(self, path: str) -> bool: def accepts_path(self, path: str) -> bool:
return path == "index.yaml" return path == "index.yaml"
def merge( def merge(self, raw_indexes: list[bytes], member_names: list[str], member_configs: list[dict], proxy_base: str) -> bytes:
self, return _merge_helm_indexes(raw_indexes, member_names, member_configs, proxy_base)
raw_indexes: list[bytes],
parsed_entries: list[dict | None],
member_names: list[str],
member_configs: list[dict],
proxy_base: str,
) -> bytes:
return _merge_helm_indexes(raw_indexes, parsed_entries, member_names, member_configs, proxy_base)
def path_error(self) -> str: def path_error(self) -> str:
return "Virtual helm repositories only serve index.yaml; chart tarballs are served directly by member remotes" return "Virtual helm repositories only serve index.yaml; chart tarballs are served directly by member remotes"
@@ -235,9 +147,11 @@ _HANDLERS: dict[str, _VirtualHandler] = {
async def handle(request: Request, virtual_name: str, path: str, storage, cache, config) -> Response: async def handle(request: Request, virtual_name: str, path: str, storage, cache, config) -> Response:
virtual_cfg = config.get_virtual_config(virtual_name) virtual_cfg = config.get_remote_config(virtual_name)
if not virtual_cfg: if not virtual_cfg:
raise HTTPException(status_code=404, detail=f"Virtual repository '{virtual_name}' not configured") raise HTTPException(status_code=404, detail=f"Virtual repository '{virtual_name}' not configured")
if virtual_cfg.get("type") != "virtual":
raise HTTPException(status_code=400, detail=f"'{virtual_name}' is not a virtual repository")
package = virtual_cfg.get("package") package = virtual_cfg.get("package")
handler = _HANDLERS.get(package) handler = _HANDLERS.get(package)
@@ -274,19 +188,17 @@ async def handle(request: Request, virtual_name: str, path: str, storage, cache,
fetch_ms = int((time.perf_counter() - t_fetch) * 1000) fetch_ms = int((time.perf_counter() - t_fetch) * 1000)
raw_indexes: list[bytes] = [] raw_indexes: list[bytes] = []
used_parsed: list[dict | None] = []
used_members: list[str] = [] used_members: list[str] = []
used_configs: list[dict] = [] used_configs: list[dict] = []
min_ttl: int | None = None min_ttl: int | None = None
for member_name, member_cfg, member_ttl, raw_data, parsed_entries in results: for member_name, member_cfg, member_ttl, raw_data in results:
if min_ttl is None or member_ttl < min_ttl: if min_ttl is None or member_ttl < min_ttl:
min_ttl = member_ttl min_ttl = member_ttl
if raw_data is None: if raw_data is None:
logger.warning(f"Virtual '{virtual_name}': skipping unreachable member '{member_name}'") logger.warning(f"Virtual '{virtual_name}': skipping unreachable member '{member_name}'")
continue continue
raw_indexes.append(raw_data) raw_indexes.append(raw_data)
used_parsed.append(parsed_entries)
used_members.append(member_name) used_members.append(member_name)
used_configs.append(member_cfg) used_configs.append(member_cfg)
@@ -297,7 +209,7 @@ async def handle(request: Request, virtual_name: str, path: str, storage, cache,
min_ttl = 3600 min_ttl = 3600
t_merge = time.perf_counter() t_merge = time.perf_counter()
merged = await asyncio.to_thread(handler.merge, raw_indexes, used_parsed, used_members, used_configs, proxy_base) merged = handler.merge(raw_indexes, used_members, used_configs, proxy_base)
merge_ms = int((time.perf_counter() - t_merge) * 1000) merge_ms = int((time.perf_counter() - t_merge) * 1000)
try: try:
@@ -305,11 +217,9 @@ async def handle(request: Request, virtual_name: str, path: str, storage, cache,
storage.upload(virtual_key, merged) storage.upload(virtual_key, merged)
cache.mark_index_cached(virtual_name, path, min_ttl) cache.mark_index_cached(virtual_name, path, min_ttl)
store_ms = int((time.perf_counter() - t_store) * 1000) store_ms = int((time.perf_counter() - t_store) * 1000)
msgpack_hits = sum(1 for p in used_parsed if p is not None)
logger.info( logger.info(
f"Virtual MISS: {virtual_name}/{path} rebuilt from {used_members} " f"Virtual MISS: {virtual_name}/{path} rebuilt from {used_members} "
f"(fetch={fetch_ms}ms merge={merge_ms}ms store={store_ms}ms ttl={min_ttl}s " f"(fetch={fetch_ms}ms merge={merge_ms}ms store={store_ms}ms ttl={min_ttl}s)"
f"msgpack={msgpack_hits}/{len(used_members)})"
) )
except Exception as e: except Exception as e:
logger.warning(f"Virtual: failed to store merged index for '{virtual_name}': {e}") logger.warning(f"Virtual: failed to store merged index for '{virtual_name}': {e}")
-19
View File
@@ -99,25 +99,6 @@ class RedisCache:
except Exception: except Exception:
return None return None
def acquire_fetch_lock(self, remote_name: str, path: str, ttl: int = 30) -> bool:
"""Try to acquire a short-lived fetch lock. Returns True if acquired, False if held by another caller."""
if not self.available:
return True # fail open: no Redis → behave as if we always hold the lock
key = f"fetchlock:{remote_name}:{hashlib.sha256(path.encode()).hexdigest()[:16]}"
try:
return bool(self.client.set(key, 1, nx=True, ex=ttl))
except Exception:
return True
def release_fetch_lock(self, remote_name: str, path: str) -> None:
if not self.available:
return
key = f"fetchlock:{remote_name}:{hashlib.sha256(path.encode()).hexdigest()[:16]}"
try:
self.client.delete(key)
except Exception:
pass
def cleanup_expired_index(self, storage, remote_name: str, path: str) -> None: def cleanup_expired_index(self, storage, remote_name: str, path: str) -> None:
if not self.available: if not self.available:
return return
+4 -12
View File
@@ -50,8 +50,8 @@ class ConfigManager:
def _merge(base: dict, overlay: dict) -> dict: def _merge(base: dict, overlay: dict) -> dict:
result = {**base} result = {**base}
for key, value in overlay.items(): for key, value in overlay.items():
if key in ("remotes", "virtuals", "locals") and isinstance(base.get(key), dict) and isinstance(value, dict): if key == "remotes" and isinstance(base.get("remotes"), dict) and isinstance(value, dict):
result[key] = {**base.get(key, {}), **value} result["remotes"] = {**base.get("remotes", {}), **value}
else: else:
result[key] = value result[key] = value
return result return result
@@ -67,11 +67,11 @@ class ConfigManager:
self._config_dir = None self._config_dir = None
if os.path.isdir(self.config_path): if os.path.isdir(self.config_path):
return self._load_from_dir(self.config_path) or {"remotes": {}, "virtuals": {}, "locals": {}} return self._load_from_dir(self.config_path) or {"remotes": {}}
config = self._load_single_file(self.config_path) config = self._load_single_file(self.config_path)
if not config: if not config:
return {"remotes": {}, "virtuals": {}, "locals": {}} return {"remotes": {}}
config_dir = config.pop("config_dir", None) config_dir = config.pop("config_dir", None)
if config_dir: if config_dir:
@@ -119,14 +119,6 @@ class ConfigManager:
self._check_reload() self._check_reload()
return self.config.get("remotes", {}).get(remote_name) return self.config.get("remotes", {}).get(remote_name)
def get_virtual_config(self, virtual_name: str) -> dict | None:
self._check_reload()
return self.config.get("virtuals", {}).get(virtual_name)
def get_local_config(self, local_name: str) -> dict | None:
self._check_reload()
return self.config.get("locals", {}).get(local_name)
def get_immutable_patterns(self, remote_name: str, repo_path: str = "") -> list[str]: def get_immutable_patterns(self, remote_name: str, repo_path: str = "") -> list[str]:
remote_config = self.get_remote_config(remote_name) remote_config = self.get_remote_config(remote_name)
if not remote_config: if not remote_config:
+10 -21
View File
@@ -49,13 +49,7 @@ class ArtifactRequest(BaseModel):
@app.get("/") @app.get("/")
def read_root(): def read_root():
config._check_reload() config._check_reload()
return { return {"message": "Artifact Storage API", "version": app.version, "remotes": list(config.config.get("remotes", {}).keys())}
"message": "Artifact Storage API",
"version": app.version,
"remotes": list(config.config.get("remotes", {}).keys()),
"virtuals": list(config.config.get("virtuals", {}).keys()),
"locals": list(config.config.get("locals", {}).keys()),
}
@app.get("/health") @app.get("/health")
@@ -105,24 +99,19 @@ async def get_artifact(request: Request, remote_name: str, path: str):
return await proxy.handle(request, remote_name, path, storage, cache, config, database, metrics) return await proxy.handle(request, remote_name, path, storage, cache, config, database, metrics)
@app.get("/api/v1/local/{local_name}/{path:path}") @app.put("/api/v1/remote/{remote_name}/{path:path}")
def get_local_artifact(local_name: str, path: str): async def upload_file(remote_name: str, path: str, file: UploadFile = File(...)):
return local.download(local_name, path, storage, database, config) return await local.upload(remote_name, path, file, storage, database, config)
@app.put("/api/v1/local/{local_name}/{path:path}") @app.head("/api/v1/remote/{remote_name}/{path:path}")
async def upload_local_file(local_name: str, path: str, file: UploadFile = File(...)): def check_file_exists(remote_name: str, path: str):
return await local.upload(local_name, path, file, storage, database, config) return local.check_exists(remote_name, path, database, config)
@app.head("/api/v1/local/{local_name}/{path:path}") @app.delete("/api/v1/remote/{remote_name}/{path:path}")
def check_local_file_exists(local_name: str, path: str): def delete_file(remote_name: str, path: str):
return local.check_exists(local_name, path, database, config) return local.delete(remote_name, path, storage, database, config)
@app.delete("/api/v1/local/{local_name}/{path:path}")
def delete_local_file(local_name: str, path: str):
return local.delete(local_name, path, storage, database, config)
@app.post("/api/v1/artifacts/cache") @app.post("/api/v1/artifacts/cache")
+7 -13
View File
@@ -87,10 +87,9 @@ class MetricsManager:
# Get from database if available # Get from database if available
db_sizes = self.database_manager.get_storage_by_remote() db_sizes = self.database_manager.get_storage_by_remote()
if db_sizes: if db_sizes:
# Initialize all configured remotes and locals to 0 # Initialize all configured remotes to 0
remote_sizes = {} remote_sizes = {}
all_names = list(config_manager.config.get("remotes", {}).keys()) + list(config_manager.config.get("locals", {}).keys()) for remote in config_manager.config.get("remotes", {}).keys():
for remote in all_names:
remote_sizes[remote] = db_sizes.get(remote, 0) remote_sizes[remote] = db_sizes.get(remote, 0)
# Update Prometheus gauges # Update Prometheus gauges
@@ -102,10 +101,10 @@ class MetricsManager:
# Fallback to S3 scanning if database not available # Fallback to S3 scanning if database not available
try: try:
remote_sizes = {} remote_sizes = {}
all_names = list(config_manager.config.get("remotes", {}).keys()) + list(config_manager.config.get("locals", {}).keys()) remotes = config_manager.config.get("remotes", {}).keys()
# Initialize all remotes and locals to 0 # Initialize all remotes to 0
for remote in all_names: for remote in remotes:
remote_sizes[remote] = 0 remote_sizes[remote] = 0
paginator = storage.client.get_paginator("list_objects_v2") paginator = storage.client.get_paginator("list_objects_v2")
@@ -175,13 +174,8 @@ class MetricsManager:
metrics["requests"]["cache_hit_ratio"] = cache_hits / total_requests if total_requests > 0 else 0.0 metrics["requests"]["cache_hit_ratio"] = cache_hits / total_requests if total_requests > 0 else 0.0
metrics["bandwidth"]["saved_bytes"] = bandwidth_saved metrics["bandwidth"]["saved_bytes"] = bandwidth_saved
# Get per-repo metrics # Get per-remote metrics
all_repos = { for remote in config_manager.config.get("remotes", {}).keys():
**config_manager.config.get("remotes", {}),
**config_manager.config.get("virtuals", {}),
**config_manager.config.get("locals", {}),
}
for remote in all_repos.keys():
remote_cache_hits = int(self.redis_client.client.get(f"metrics:cache_hits:{remote}") or 0) remote_cache_hits = int(self.redis_client.client.get(f"metrics:cache_hits:{remote}") or 0)
remote_cache_misses = int(self.redis_client.client.get(f"metrics:cache_misses:{remote}") or 0) remote_cache_misses = int(self.redis_client.client.get(f"metrics:cache_misses:{remote}") or 0)
remote_total = remote_cache_hits + remote_cache_misses remote_total = remote_cache_hits + remote_cache_misses
+22 -16
View File
@@ -20,55 +20,61 @@ TEST_REMOTES = {
"remotes": { "remotes": {
"alpine-test": { "alpine-test": {
"base_url": "https://dl-cdn.alpinelinux.org", "base_url": "https://dl-cdn.alpinelinux.org",
"type": "remote",
"package": "alpine", "package": "alpine",
"immutable_patterns": [".*/x86_64/.*\\.apk$"], "immutable_patterns": [".*/x86_64/.*\\.apk$"],
"cache": {"immutable_ttl": 0, "mutable_ttl": 3600}, "cache": {"immutable_ttl": 0, "mutable_ttl": 3600},
}, },
"rpm-test": { "rpm-test": {
"base_url": "https://example.com/rpm", "base_url": "https://example.com/rpm",
"type": "remote",
"package": "rpm", "package": "rpm",
"immutable_patterns": [".*/x86_64/.*\\.rpm$", ".*/repodata/.*$"], "immutable_patterns": [".*/x86_64/.*\\.rpm$", ".*/repodata/.*$"],
"cache": {"immutable_ttl": 0, "mutable_ttl": 3600}, "cache": {"immutable_ttl": 0, "mutable_ttl": 3600},
}, },
"docker-test": { "docker-test": {
"base_url": "https://registry.example.com", "base_url": "https://registry.example.com",
"type": "remote",
"package": "docker", "package": "docker",
"cache": {"immutable_ttl": 0, "mutable_ttl": 300}, "cache": {"immutable_ttl": 0, "mutable_ttl": 300},
}, },
"docker-restricted": { "docker-restricted": {
"base_url": "https://registry.example.com", "base_url": "https://registry.example.com",
"type": "remote",
"package": "docker", "package": "docker",
"immutable_patterns": ["^library/nginx"], "immutable_patterns": ["^library/nginx"],
"cache": {"immutable_ttl": 0, "mutable_ttl": 300}, "cache": {"immutable_ttl": 0, "mutable_ttl": 300},
}, },
"docker-bantags-test": {
"base_url": "https://registry.example.com",
"package": "docker",
"ban_tags_enabled": True,
"ban_tags": ["latest", "edge"],
"cache": {"immutable_ttl": 0, "mutable_ttl": 300},
},
"generic-test": { "generic-test": {
"base_url": "https://releases.example.com", "base_url": "https://releases.example.com",
"type": "remote",
"package": "generic", "package": "generic",
"immutable_patterns": [".*\\.tar\\.gz$"], "immutable_patterns": [".*\\.tar\\.gz$"],
"cache": {"immutable_ttl": 0, "mutable_ttl": 0}, "cache": {"immutable_ttl": 0, "mutable_ttl": 0},
}, },
"custom-index-test": { "custom-index-test": {
"base_url": "https://example.com", "base_url": "https://example.com",
"type": "remote",
"package": "generic", "package": "generic",
"mutable_patterns": ["metadata\\.json$"], "mutable_patterns": ["metadata\\.json$"],
"cache": {"immutable_ttl": 0, "mutable_ttl": 600}, "cache": {"immutable_ttl": 0, "mutable_ttl": 600},
}, },
"check-mutable-test": { "check-mutable-test": {
"base_url": "https://example.com", "base_url": "https://example.com",
"type": "remote",
"package": "generic", "package": "generic",
"mutable_patterns": ["metadata\\.json$"], "mutable_patterns": ["metadata\\.json$"],
"check_mutable_updates": True, "check_mutable_updates": True,
"cache": {"immutable_ttl": 0, "mutable_ttl": 600}, "cache": {"immutable_ttl": 0, "mutable_ttl": 600},
}, },
"local-test": {
"type": "local",
"package": "generic",
"cache": {"immutable_ttl": 0, "mutable_ttl": 0},
},
"pypi-test": { "pypi-test": {
"base_url": "https://files.pythonhosted.org", "base_url": "https://files.pythonhosted.org",
"type": "remote",
"package": "pypi", "package": "pypi",
"immutable_patterns": [ "immutable_patterns": [
r"packages/.*\.whl$", r"packages/.*\.whl$",
@@ -79,6 +85,7 @@ TEST_REMOTES = {
}, },
"npm-test": { "npm-test": {
"base_url": "https://registry.npmjs.org", "base_url": "https://registry.npmjs.org",
"type": "remote",
"package": "npm", "package": "npm",
"immutable_patterns": [r"\.tgz$"], "immutable_patterns": [r"\.tgz$"],
"mutable_patterns": [r"^(?!.*\.tgz$).*"], "mutable_patterns": [r"^(?!.*\.tgz$).*"],
@@ -86,12 +93,14 @@ TEST_REMOTES = {
}, },
"helm-test": { "helm-test": {
"base_url": "https://helm.releases.hashicorp.com", "base_url": "https://helm.releases.hashicorp.com",
"type": "remote",
"package": "helm", "package": "helm",
"immutable_patterns": [r"\.tgz$"], "immutable_patterns": [r"\.tgz$"],
"cache": {"immutable_ttl": 0, "mutable_ttl": 3600}, "cache": {"immutable_ttl": 0, "mutable_ttl": 3600},
}, },
"quarantine-test": { "quarantine-test": {
"base_url": "https://releases.example.com", "base_url": "https://releases.example.com",
"type": "remote",
"package": "generic", "package": "generic",
"immutable_patterns": [r".*\.tar\.gz$"], "immutable_patterns": [r".*\.tar\.gz$"],
"quarantine_new": True, "quarantine_new": True,
@@ -100,6 +109,7 @@ TEST_REMOTES = {
}, },
"quarantine-disabled": { "quarantine-disabled": {
"base_url": "https://releases.example.com", "base_url": "https://releases.example.com",
"type": "remote",
"package": "generic", "package": "generic",
"immutable_patterns": [r".*\.tar\.gz$"], "immutable_patterns": [r".*\.tar\.gz$"],
"quarantine_new": False, "quarantine_new": False,
@@ -108,31 +118,27 @@ TEST_REMOTES = {
}, },
"helm-member-2": { "helm-member-2": {
"base_url": "https://charts.example.com", "base_url": "https://charts.example.com",
"type": "remote",
"package": "helm", "package": "helm",
"immutable_patterns": [r"\.tgz$"], "immutable_patterns": [r"\.tgz$"],
"cache": {"immutable_ttl": 0, "mutable_ttl": 1800}, "cache": {"immutable_ttl": 0, "mutable_ttl": 1800},
}, },
},
"locals": {
"local-test": {
"package": "generic",
"cache": {"immutable_ttl": 0, "mutable_ttl": 0},
},
},
"virtuals": {
"helm-virtual-test": { "helm-virtual-test": {
"type": "virtual",
"package": "helm", "package": "helm",
"members": ["helm-test", "helm-member-2"], "members": ["helm-test", "helm-member-2"],
}, },
"unsupported-virtual-test": { "unsupported-virtual-test": {
"type": "virtual",
"package": "rpm", "package": "rpm",
"members": ["rpm-test"], "members": ["rpm-test"],
}, },
"empty-virtual-test": { "empty-virtual-test": {
"type": "virtual",
"package": "helm", "package": "helm",
"members": [], "members": [],
}, },
}, }
} }
# --------------------------------------------------------------------------- # ---------------------------------------------------------------------------
-71
View File
@@ -327,74 +327,3 @@ class TestArtifactPublished:
def test_get_returns_none_when_unavailable(self, unavailable_cache): def test_get_returns_none_when_unavailable(self, unavailable_cache):
assert unavailable_cache.get_artifact_published("remote", "path") is None assert unavailable_cache.get_artifact_published("remote", "path") is None
# ---------------------------------------------------------------------------
# fetch lock (thundering-herd deduplication)
# ---------------------------------------------------------------------------
class TestFetchLock:
def test_acquire_returns_true_when_lock_obtained(self, cache_with_redis, mock_redis_client):
mock_redis_client.set.return_value = True
result = cache_with_redis.acquire_fetch_lock("myremote", "library/nginx/manifests/latest")
assert result is True
def test_acquire_calls_set_nx_with_ttl(self, cache_with_redis, mock_redis_client):
mock_redis_client.set.return_value = True
cache_with_redis.acquire_fetch_lock("myremote", "library/nginx/manifests/latest", ttl=15)
_, kwargs = mock_redis_client.set.call_args
assert kwargs.get("nx") is True
assert kwargs.get("ex") == 15
def test_acquire_returns_false_when_lock_already_held(self, cache_with_redis, mock_redis_client):
mock_redis_client.set.return_value = None # Redis SET NX → None when key exists
result = cache_with_redis.acquire_fetch_lock("myremote", "library/nginx/manifests/latest")
assert result is False
def test_acquire_fails_open_when_unavailable(self, unavailable_cache):
# caller must be allowed to proceed when Redis is down
assert unavailable_cache.acquire_fetch_lock("myremote", "some/path") is True
def test_acquire_fails_open_on_redis_exception(self, cache_with_redis, mock_redis_client):
mock_redis_client.set.side_effect = Exception("connection reset")
assert cache_with_redis.acquire_fetch_lock("myremote", "some/path") is True
def test_lock_key_embeds_path_hash(self, cache_with_redis, mock_redis_client):
mock_redis_client.set.return_value = True
path = "library/nginx/manifests/latest"
cache_with_redis.acquire_fetch_lock("myremote", path)
args, _ = mock_redis_client.set.call_args
expected_hash = hashlib.sha256(path.encode()).hexdigest()[:16]
assert args[0] == f"fetchlock:myremote:{expected_hash}"
def test_lock_key_hash_is_16_chars(self, cache_with_redis, mock_redis_client):
mock_redis_client.set.return_value = True
cache_with_redis.acquire_fetch_lock("myremote", "some/long/path/file.tar.gz")
args, _ = mock_redis_client.set.call_args
# key format: fetchlock:<remote>:<16-char hash>
parts = args[0].split(":")
assert len(parts) == 3
assert len(parts[2]) == 16
def test_different_paths_produce_different_lock_keys(self, cache_with_redis, mock_redis_client):
mock_redis_client.set.return_value = True
cache_with_redis.acquire_fetch_lock("myremote", "path/a/manifests/latest")
key_a = mock_redis_client.set.call_args[0][0]
mock_redis_client.set.reset_mock()
cache_with_redis.acquire_fetch_lock("myremote", "path/b/manifests/latest")
key_b = mock_redis_client.set.call_args[0][0]
assert key_a != key_b
def test_release_deletes_correct_key(self, cache_with_redis, mock_redis_client):
path = "library/nginx/manifests/latest"
cache_with_redis.release_fetch_lock("myremote", path)
expected_hash = hashlib.sha256(path.encode()).hexdigest()[:16]
mock_redis_client.delete.assert_called_once_with(f"fetchlock:myremote:{expected_hash}")
def test_release_no_op_when_unavailable(self, unavailable_cache):
unavailable_cache.release_fetch_lock("myremote", "some/path") # must not raise
def test_release_no_op_on_redis_exception(self, cache_with_redis, mock_redis_client):
mock_redis_client.delete.side_effect = Exception("timeout")
cache_with_redis.release_fetch_lock("myremote", "some/path") # must not raise
+19 -19
View File
@@ -27,24 +27,24 @@ def make_config(tmp_path):
class TestGetMutablePatterns: class TestGetMutablePatterns:
def test_alpine_returns_package_defaults(self, make_config): def test_alpine_returns_package_defaults(self, make_config):
cfg = make_config({"r": {"package": "alpine", "base_url": "https://x.com"}}) cfg = make_config({"r": {"type": "remote", "package": "alpine", "base_url": "https://x.com"}})
patterns = cfg.get_mutable_patterns("r") patterns = cfg.get_mutable_patterns("r")
assert r"APKINDEX\.tar\.gz$" in patterns assert r"APKINDEX\.tar\.gz$" in patterns
def test_rpm_returns_package_defaults(self, make_config): def test_rpm_returns_package_defaults(self, make_config):
cfg = make_config({"r": {"package": "rpm", "base_url": "https://x.com"}}) cfg = make_config({"r": {"type": "remote", "package": "rpm", "base_url": "https://x.com"}})
patterns = cfg.get_mutable_patterns("r") patterns = cfg.get_mutable_patterns("r")
assert r"repomd\.xml$" in patterns assert r"repomd\.xml$" in patterns
assert any("repodata" in p for p in patterns) assert any("repodata" in p for p in patterns)
def test_docker_returns_package_defaults(self, make_config): def test_docker_returns_package_defaults(self, make_config):
cfg = make_config({"r": {"package": "docker", "base_url": "https://x.com"}}) cfg = make_config({"r": {"type": "remote", "package": "docker", "base_url": "https://x.com"}})
patterns = cfg.get_mutable_patterns("r") patterns = cfg.get_mutable_patterns("r")
assert any("manifests" in p for p in patterns) assert any("manifests" in p for p in patterns)
assert any("tags/list" in p for p in patterns) assert any("tags/list" in p for p in patterns)
def test_generic_returns_empty_list(self, make_config): def test_generic_returns_empty_list(self, make_config):
cfg = make_config({"r": {"package": "generic", "base_url": "https://x.com"}}) cfg = make_config({"r": {"type": "remote", "package": "generic", "base_url": "https://x.com"}})
assert cfg.get_mutable_patterns("r") == [] assert cfg.get_mutable_patterns("r") == []
def test_unknown_remote_returns_empty_list(self, make_config): def test_unknown_remote_returns_empty_list(self, make_config):
@@ -52,12 +52,12 @@ class TestGetMutablePatterns:
assert cfg.get_mutable_patterns("nonexistent") == [] assert cfg.get_mutable_patterns("nonexistent") == []
def test_missing_package_field_defaults_to_generic(self, make_config): def test_missing_package_field_defaults_to_generic(self, make_config):
cfg = make_config({"r": {"base_url": "https://x.com"}}) cfg = make_config({"r": {"type": "remote", "base_url": "https://x.com"}})
assert cfg.get_mutable_patterns("r") == [] assert cfg.get_mutable_patterns("r") == []
def test_unknown_package_type_returns_empty_list(self, make_config): def test_unknown_package_type_returns_empty_list(self, make_config):
# A mis-spelled package type silently returns [] — this is a known footgun # A mis-spelled package type silently returns [] — this is a known footgun
cfg = make_config({"r": {"package": "deb", "base_url": "https://x.com"}}) cfg = make_config({"r": {"type": "remote", "package": "deb", "base_url": "https://x.com"}})
assert cfg.get_mutable_patterns("r") == [] assert cfg.get_mutable_patterns("r") == []
def test_extra_patterns_appended_after_defaults(self, make_config): def test_extra_patterns_appended_after_defaults(self, make_config):
@@ -134,7 +134,7 @@ class TestGetMutablePatterns:
assert r"custom-meta\.xml$" in patterns assert r"custom-meta\.xml$" in patterns
def test_npm_has_no_package_defaults(self, make_config): def test_npm_has_no_package_defaults(self, make_config):
cfg = make_config({"r": {"package": "npm", "base_url": "https://x.com"}}) cfg = make_config({"r": {"type": "remote", "package": "npm", "base_url": "https://x.com"}})
assert cfg.get_mutable_patterns("r") == [] assert cfg.get_mutable_patterns("r") == []
def test_npm_explicit_mutable_pattern_matches_metadata(self, make_config): def test_npm_explicit_mutable_pattern_matches_metadata(self, make_config):
@@ -155,14 +155,14 @@ class TestGetMutablePatterns:
assert any(re.search(p, "@babel/core") for p in patterns) assert any(re.search(p, "@babel/core") for p in patterns)
def test_helm_returns_index_yaml_as_mutable(self, make_config): def test_helm_returns_index_yaml_as_mutable(self, make_config):
cfg = make_config({"r": {"package": "helm", "base_url": "https://helm.example.com"}}) cfg = make_config({"r": {"type": "remote", "package": "helm", "base_url": "https://helm.example.com"}})
patterns = cfg.get_mutable_patterns("r") patterns = cfg.get_mutable_patterns("r")
assert r"index\.yaml$" in patterns assert r"index\.yaml$" in patterns
def test_helm_chart_tarballs_not_mutable_by_default(self, make_config): def test_helm_chart_tarballs_not_mutable_by_default(self, make_config):
import re import re
cfg = make_config({"r": {"package": "helm", "base_url": "https://helm.example.com"}}) cfg = make_config({"r": {"type": "remote", "package": "helm", "base_url": "https://helm.example.com"}})
patterns = cfg.get_mutable_patterns("r") patterns = cfg.get_mutable_patterns("r")
# Only index.yaml is mutable; .tgz chart tarballs are not # Only index.yaml is mutable; .tgz chart tarballs are not
assert not any(re.search(p, "vault-0.29.1.tgz") for p in patterns) assert not any(re.search(p, "vault-0.29.1.tgz") for p in patterns)
@@ -210,7 +210,7 @@ class TestGetImmutablePatterns:
assert cfg.get_immutable_patterns("nonexistent") == [] assert cfg.get_immutable_patterns("nonexistent") == []
def test_returns_empty_when_no_patterns_configured(self, make_config): def test_returns_empty_when_no_patterns_configured(self, make_config):
cfg = make_config({"r": {"package": "generic", "base_url": "https://x.com"}}) cfg = make_config({"r": {"type": "remote", "package": "generic", "base_url": "https://x.com"}})
assert cfg.get_immutable_patterns("r") == [] assert cfg.get_immutable_patterns("r") == []
def test_multiple_patterns_returned(self, make_config): def test_multiple_patterns_returned(self, make_config):
@@ -281,7 +281,7 @@ class TestGetUserMutablePatterns:
def test_excludes_package_defaults(self, make_config): def test_excludes_package_defaults(self, make_config):
# Package defaults (APKINDEX etc.) must NOT appear here # Package defaults (APKINDEX etc.) must NOT appear here
cfg = make_config({"r": {"package": "alpine", "base_url": "https://x.com"}}) cfg = make_config({"r": {"type": "remote", "package": "alpine", "base_url": "https://x.com"}})
assert cfg.get_user_mutable_patterns("r") == [] assert cfg.get_user_mutable_patterns("r") == []
def test_returns_empty_for_missing_remote(self, make_config): def test_returns_empty_for_missing_remote(self, make_config):
@@ -289,7 +289,7 @@ class TestGetUserMutablePatterns:
assert cfg.get_user_mutable_patterns("nonexistent") == [] assert cfg.get_user_mutable_patterns("nonexistent") == []
def test_returns_empty_when_key_absent(self, make_config): def test_returns_empty_when_key_absent(self, make_config):
cfg = make_config({"r": {"package": "generic", "base_url": "https://x.com"}}) cfg = make_config({"r": {"type": "remote", "package": "generic", "base_url": "https://x.com"}})
assert cfg.get_user_mutable_patterns("r") == [] assert cfg.get_user_mutable_patterns("r") == []
@@ -317,7 +317,7 @@ class TestGetCacheConfig:
assert cfg.get_cache_config("nonexistent") == {} assert cfg.get_cache_config("nonexistent") == {}
def test_returns_empty_dict_when_no_cache_key(self, make_config): def test_returns_empty_dict_when_no_cache_key(self, make_config):
cfg = make_config({"r": {"package": "generic", "base_url": "https://x.com"}}) cfg = make_config({"r": {"type": "remote", "package": "generic", "base_url": "https://x.com"}})
assert cfg.get_cache_config("r") == {} assert cfg.get_cache_config("r") == {}
@@ -329,11 +329,11 @@ class TestGetCacheConfig:
class TestConfigReload: class TestConfigReload:
def test_reloads_when_file_mtime_advances(self, tmp_path): def test_reloads_when_file_mtime_advances(self, tmp_path):
cfg_file = tmp_path / "remotes.yaml" cfg_file = tmp_path / "remotes.yaml"
cfg_file.write_text(yaml.dump({"remotes": {"repo-a": {"package": "generic", "base_url": "https://x.com"}}})) cfg_file.write_text(yaml.dump({"remotes": {"repo-a": {"type": "remote", "package": "generic", "base_url": "https://x.com"}}}))
cfg = ConfigManager(str(cfg_file)) cfg = ConfigManager(str(cfg_file))
assert "repo-a" in cfg.config["remotes"] assert "repo-a" in cfg.config["remotes"]
cfg_file.write_text(yaml.dump({"remotes": {"repo-b": {"package": "generic", "base_url": "https://y.com"}}})) cfg_file.write_text(yaml.dump({"remotes": {"repo-b": {"type": "remote", "package": "generic", "base_url": "https://y.com"}}}))
future_mtime = cfg._last_modified + 1 future_mtime = cfg._last_modified + 1
os.utime(str(cfg_file), (future_mtime, future_mtime)) os.utime(str(cfg_file), (future_mtime, future_mtime))
@@ -344,7 +344,7 @@ class TestConfigReload:
def test_no_reload_when_file_unchanged(self, tmp_path): def test_no_reload_when_file_unchanged(self, tmp_path):
cfg_file = tmp_path / "remotes.yaml" cfg_file = tmp_path / "remotes.yaml"
cfg_file.write_text(yaml.dump({"remotes": {"repo-a": {"package": "generic", "base_url": "https://x.com"}}})) cfg_file.write_text(yaml.dump({"remotes": {"repo-a": {"type": "remote", "package": "generic", "base_url": "https://x.com"}}}))
cfg = ConfigManager(str(cfg_file)) cfg = ConfigManager(str(cfg_file))
# Call check_reload without touching the file — should not reload # Call check_reload without touching the file — should not reload
@@ -360,7 +360,7 @@ class TestConfigReload:
class TestGetQuarantineConfig: class TestGetQuarantineConfig:
def test_returns_false_zero_when_not_configured(self, make_config): def test_returns_false_zero_when_not_configured(self, make_config):
cfg = make_config({"r": {"package": "generic", "base_url": "https://x.com"}}) cfg = make_config({"r": {"type": "remote", "package": "generic", "base_url": "https://x.com"}})
enabled, days = cfg.get_quarantine_config("r") enabled, days = cfg.get_quarantine_config("r")
assert enabled is False assert enabled is False
assert days == 0 assert days == 0
@@ -426,7 +426,7 @@ class TestGetQuarantineConfig:
def _remote(base_url: str = "https://x.com") -> dict: def _remote(base_url: str = "https://x.com") -> dict:
return {"package": "generic", "base_url": base_url} return {"type": "remote", "package": "generic", "base_url": base_url}
class TestConfigDirMode: class TestConfigDirMode:
@@ -445,7 +445,7 @@ class TestConfigDirMode:
def test_empty_directory_returns_empty_remotes(self, tmp_path): def test_empty_directory_returns_empty_remotes(self, tmp_path):
cfg = ConfigManager(str(tmp_path)) cfg = ConfigManager(str(tmp_path))
assert cfg.config == {"remotes": {}, "virtuals": {}, "locals": {}} assert cfg.config == {"remotes": {}}
def test_ignores_non_yaml_files(self, tmp_path): def test_ignores_non_yaml_files(self, tmp_path):
(tmp_path / "notes.txt").write_text("not yaml") (tmp_path / "notes.txt").write_text("not yaml")
+26 -216
View File
@@ -260,211 +260,6 @@ class TestDockerProxy:
mock_fetch.assert_called_once() mock_fetch.assert_called_once()
assert response.status_code == 200 assert response.status_code == 200
# --- Issue 1: sha256 digest cross-linking ---
def test_tag_manifest_is_stored_under_digest_key_on_cache_hit(self, client, patched_deps):
# When serving a cached tag manifest the handler must also write the content
# under the sha256 digest key so subsequent sha256-addressed pulls hit cache.
deps = patched_deps
manifest = json.dumps({"mediaType": "application/vnd.oci.image.manifest.v1+json", "layers": []}).encode()
# First exists call (tag manifest): hit. Second (digest key): miss → triggers upload.
deps["storage"].exists.side_effect = [True, False]
deps["storage"].download_object.return_value = manifest
deps["cache"].is_mutable_file.return_value = True
deps["cache"].is_index_valid.return_value = True
response = client.get("/v2/docker-test/library/nginx/manifests/v1.25.3")
assert response.status_code == 200
deps["storage"].upload.assert_called_once_with(deps["storage"].get_object_key.return_value, manifest)
def test_tag_manifest_digest_key_not_written_when_already_exists(self, client, patched_deps):
# When the digest key already exists in storage upload must not be called.
deps = patched_deps
manifest = json.dumps({"mediaType": "application/vnd.oci.image.manifest.v1+json", "layers": []}).encode()
# Both the tag key and the digest key already present.
deps["storage"].exists.return_value = True
deps["storage"].download_object.return_value = manifest
deps["cache"].is_mutable_file.return_value = True
deps["cache"].is_index_valid.return_value = True
client.get("/v2/docker-test/library/nginx/manifests/v1.25.3")
deps["storage"].upload.assert_not_called()
def test_sha256_manifest_request_is_not_cross_linked(self, client, patched_deps):
# sha256-addressed manifests are immutable — the cross-link logic must not apply.
deps = patched_deps
manifest = json.dumps({"mediaType": "application/vnd.oci.image.manifest.v1+json", "layers": []}).encode()
deps["storage"].exists.return_value = True
deps["storage"].download_object.return_value = manifest
deps["cache"].is_mutable_file.return_value = False # sha256 manifest is immutable
with patch("artifactapi.artifact.proxy._fetch_last_modified", new_callable=AsyncMock, return_value=None):
client.get("/v2/docker-test/library/nginx/manifests/sha256:" + "a" * 64)
deps["storage"].upload.assert_not_called()
# --- Issue 2: thundering herd distributed lock ---
def test_lock_acquired_and_released_on_upstream_fetch(self, client, patched_deps):
deps = patched_deps
manifest = json.dumps({"mediaType": "application/vnd.oci.image.manifest.v1+json", "layers": []}).encode()
deps["storage"].exists.side_effect = [False, False] # initial miss; digest key also absent
deps["storage"].download_object.return_value = manifest
deps["cache"].is_mutable_file.return_value = True
deps["cache"].acquire_fetch_lock.return_value = True
with patch(
"artifactapi.artifact.proxy.cache_single_artifact",
new_callable=AsyncMock,
return_value={"status": "cached"},
):
response = client.get("/v2/docker-test/library/nginx/manifests/latest")
deps["cache"].acquire_fetch_lock.assert_called_once()
deps["cache"].release_fetch_lock.assert_called_once()
assert response.status_code == 200
def test_lock_released_even_when_fetch_returns_error(self, client, patched_deps):
deps = patched_deps
deps["storage"].exists.return_value = False
deps["cache"].is_mutable_file.return_value = True
deps["cache"].acquire_fetch_lock.return_value = True
with patch(
"artifactapi.artifact.proxy.cache_single_artifact",
new_callable=AsyncMock,
return_value={"status": "error", "error": "upstream down"},
):
response = client.get("/v2/docker-test/library/nginx/manifests/latest")
deps["cache"].release_fetch_lock.assert_called_once()
assert response.status_code == 502
def test_thundering_herd_polls_storage_when_lock_not_acquired(self, client, patched_deps):
# When the lock is held by another pod the handler must poll storage and serve
# from cache once the competing fetch completes, without issuing its own upstream request.
deps = patched_deps
manifest = json.dumps({"mediaType": "application/vnd.oci.image.manifest.v1+json", "layers": []}).encode()
# Initial cache check: miss. First poll iteration: another pod has written it.
# Third call is for the digest cross-link check (is_mutable=True path); digest key exists.
deps["storage"].exists.side_effect = [False, True, True]
deps["storage"].download_object.return_value = manifest
deps["cache"].is_mutable_file.return_value = True
deps["cache"].is_index_valid.return_value = True
deps["cache"].acquire_fetch_lock.return_value = False # lock held by peer
with patch("artifactapi.artifact.docker.asyncio.sleep", new_callable=AsyncMock):
with patch(
"artifactapi.artifact.proxy.cache_single_artifact",
new_callable=AsyncMock,
) as mock_fetch:
response = client.get("/v2/docker-test/library/nginx/manifests/latest")
mock_fetch.assert_not_called()
assert response.status_code == 200
def test_thundering_herd_falls_through_to_fetch_if_poll_times_out(self, client, patched_deps):
# If the item never appears in storage during the poll window the handler must
# still issue its own upstream fetch as a fallback.
deps = patched_deps
manifest = json.dumps({"mediaType": "application/vnd.oci.image.manifest.v1+json", "layers": []}).encode()
# All exists calls return False — item never appears during polling.
deps["storage"].exists.return_value = False
deps["storage"].download_object.return_value = manifest
deps["cache"].is_mutable_file.return_value = True
deps["cache"].acquire_fetch_lock.return_value = False # lock held by peer
with patch("artifactapi.artifact.docker.asyncio.sleep", new_callable=AsyncMock):
with patch(
"artifactapi.artifact.proxy.cache_single_artifact",
new_callable=AsyncMock,
return_value={"status": "cached"},
) as mock_fetch:
response = client.get("/v2/docker-test/library/nginx/manifests/latest")
mock_fetch.assert_called_once()
assert response.status_code == 200
# ---------------------------------------------------------------------------
# Docker ban_tags feature
# ---------------------------------------------------------------------------
class TestDockerBanTags:
def test_banned_tag_returns_403(self, client, patched_deps):
response = client.get("/v2/docker-bantags-test/library/nginx/manifests/latest")
assert response.status_code == 403
assert "latest" in response.json()["detail"]
def test_second_banned_tag_returns_403(self, client, patched_deps):
response = client.get("/v2/docker-bantags-test/library/nginx/manifests/edge")
assert response.status_code == 403
assert "edge" in response.json()["detail"]
def test_allowed_tag_proceeds(self, client, patched_deps):
deps = patched_deps
manifest = json.dumps({"mediaType": "application/vnd.oci.image.manifest.v1+json", "layers": []}).encode()
deps["storage"].exists.return_value = True
deps["storage"].download_object.return_value = manifest
deps["cache"].is_mutable_file.return_value = True
deps["cache"].is_index_valid.return_value = True
response = client.get("/v2/docker-bantags-test/library/nginx/manifests/1.25.3")
assert response.status_code == 200
def test_digest_pull_bypasses_ban(self, client, patched_deps):
# sha256-addressed pulls must never be blocked by the tag ban list
deps = patched_deps
manifest = json.dumps({"mediaType": "application/vnd.oci.image.manifest.v1+json", "layers": []}).encode()
deps["storage"].exists.return_value = True
deps["storage"].download_object.return_value = manifest
deps["cache"].is_mutable_file.return_value = False
digest = "sha256:" + "a" * 64
with patch("artifactapi.artifact.proxy._fetch_last_modified", new_callable=AsyncMock, return_value=None):
response = client.get(f"/v2/docker-bantags-test/library/nginx/manifests/{digest}")
assert response.status_code == 200
def test_ban_tags_disabled_by_default(self, client, patched_deps):
# docker-test has no ban_tags_enabled — "latest" must pass through
deps = patched_deps
manifest = json.dumps({"mediaType": "application/vnd.oci.image.manifest.v1+json", "layers": []}).encode()
deps["storage"].exists.return_value = True
deps["storage"].download_object.return_value = manifest
deps["cache"].is_mutable_file.return_value = True
deps["cache"].is_index_valid.return_value = True
response = client.get("/v2/docker-test/library/nginx/manifests/latest")
assert response.status_code == 200
def test_ban_tags_enabled_but_empty_list_allows_all(self, client, patched_deps):
# If ban_tags_enabled is true but ban_tags is empty nothing should be blocked.
# docker-test doesn't have ban_tags_enabled, but we can verify via the
# docker-bantags-test remote with an unlisted tag.
deps = patched_deps
manifest = json.dumps({"mediaType": "application/vnd.oci.image.manifest.v1+json", "layers": []}).encode()
deps["storage"].exists.return_value = True
deps["storage"].download_object.return_value = manifest
deps["cache"].is_mutable_file.return_value = True
deps["cache"].is_index_valid.return_value = True
response = client.get("/v2/docker-bantags-test/library/nginx/manifests/stable")
assert response.status_code == 200
def test_ban_check_does_not_apply_to_blobs(self, client, patched_deps):
# Blob paths don't contain /manifests/ — the ban check must not interfere
deps = patched_deps
deps["storage"].exists.return_value = True
deps["storage"].download_object.return_value = b"\x00" * 100
deps["cache"].is_mutable_file.return_value = False
with patch("artifactapi.artifact.proxy._fetch_last_modified", new_callable=AsyncMock, return_value=None):
response = client.get("/v2/docker-bantags-test/library/nginx/blobs/sha256:" + "b" * 64)
assert response.status_code == 200
# --------------------------------------------------------------------------- # ---------------------------------------------------------------------------
# Generic artifact route /api/v1/remote/{remote}/{path} # Generic artifact route /api/v1/remote/{remote}/{path}
@@ -728,53 +523,68 @@ class TestGenericArtifactRoute:
deps["database"].get_local_file_metadata.return_value = None deps["database"].get_local_file_metadata.return_value = None
deps["database"].available = True deps["database"].available = True
response = client.get("/api/v1/local/local-test/path/to/nonexistent.bin") response = client.get("/api/v1/remote/local-test/path/to/nonexistent.bin")
assert response.status_code == 404 assert response.status_code == 404
# --------------------------------------------------------------------------- # ---------------------------------------------------------------------------
# Upload route PUT /api/v1/local/{local}/{path} # Upload route PUT /api/v1/remote/{remote}/{path}
# --------------------------------------------------------------------------- # ---------------------------------------------------------------------------
class TestUploadRoute: class TestUploadRoute:
def test_unknown_local_returns_404(self, client, patched_deps): def test_unknown_remote_returns_404(self, client, patched_deps):
response = client.put( response = client.put(
"/api/v1/local/nonexistent/path/to/file.tar.gz", "/api/v1/remote/nonexistent/path/to/file.tar.gz",
files={"file": ("file.tar.gz", b"content", "application/octet-stream")}, files={"file": ("file.tar.gz", b"content", "application/octet-stream")},
) )
assert response.status_code == 404 assert response.status_code == 404
def test_non_local_remote_returns_400(self, client, patched_deps):
response = client.put(
"/api/v1/remote/generic-test/path/to/file.tar.gz",
files={"file": ("file.tar.gz", b"content", "application/octet-stream")},
)
assert response.status_code == 400
# --------------------------------------------------------------------------- # ---------------------------------------------------------------------------
# HEAD route HEAD /api/v1/local/{local}/{path} # HEAD route HEAD /api/v1/remote/{remote}/{path}
# --------------------------------------------------------------------------- # ---------------------------------------------------------------------------
class TestHeadRoute: class TestHeadRoute:
def test_non_local_remote_returns_405(self, client, patched_deps):
response = client.head("/api/v1/remote/generic-test/path/to/file.tar.gz")
assert response.status_code == 405
def test_local_repo_file_not_found_returns_404(self, client, patched_deps): def test_local_repo_file_not_found_returns_404(self, client, patched_deps):
deps = patched_deps deps = patched_deps
deps["database"].get_local_file_metadata.return_value = None deps["database"].get_local_file_metadata.return_value = None
deps["database"].available = True deps["database"].available = True
response = client.head("/api/v1/local/local-test/path/to/nonexistent.bin") response = client.head("/api/v1/remote/local-test/path/to/nonexistent.bin")
assert response.status_code == 404 assert response.status_code == 404
def test_unknown_local_returns_404(self, client, patched_deps): def test_unknown_remote_returns_404(self, client, patched_deps):
response = client.head("/api/v1/local/nonexistent/path/to/file.bin") response = client.head("/api/v1/remote/nonexistent/path/to/file.bin")
assert response.status_code == 404 assert response.status_code == 404
# --------------------------------------------------------------------------- # ---------------------------------------------------------------------------
# DELETE route DELETE /api/v1/local/{local}/{path} # DELETE route DELETE /api/v1/remote/{remote}/{path}
# --------------------------------------------------------------------------- # ---------------------------------------------------------------------------
class TestDeleteRoute: class TestDeleteRoute:
def test_unknown_local_returns_404(self, client, patched_deps): def test_unknown_remote_returns_404(self, client, patched_deps):
response = client.delete("/api/v1/local/nonexistent/path/to/file.tar.gz") response = client.delete("/api/v1/remote/nonexistent/path/to/file.tar.gz")
assert response.status_code == 404 assert response.status_code == 404
def test_non_local_remote_returns_400(self, client, patched_deps):
response = client.delete("/api/v1/remote/generic-test/path/to/file.tar.gz")
assert response.status_code == 400
# --------------------------------------------------------------------------- # ---------------------------------------------------------------------------
# Cache flush PUT /cache/flush # Cache flush PUT /cache/flush
+59 -293
View File
@@ -8,15 +8,11 @@ import yaml
from artifactapi.artifact.virtual import ( from artifactapi.artifact.virtual import (
_HANDLERS, _HANDLERS,
_entries_to_msgpack_safe,
_get_member_index, _get_member_index,
_HelmDumper, _HelmDumper,
_HelmHandler, _HelmHandler,
_merge_helm_indexes, _merge_helm_indexes,
_rewrite_urls,
_VirtualHandler, _VirtualHandler,
_YamlDumperBase,
_YamlLoader,
) )
# --------------------------------------------------------------------------- # ---------------------------------------------------------------------------
@@ -70,47 +66,12 @@ entries:
generated: "2023-01-01T00:00:00.000Z" generated: "2023-01-01T00:00:00.000Z"
""" """
_INDEX_RELATIVE = b"""\
apiVersion: v1
entries:
rancher:
- name: rancher
version: "2.13.1"
urls:
- rancher-2.13.1.tgz
generated: "2023-01-01T00:00:00.000Z"
"""
_CFG_A = {"base_url": "https://helm.releases.hashicorp.com", "cache": {"mutable_ttl": 3600}} _CFG_A = {"base_url": "https://helm.releases.hashicorp.com", "cache": {"mutable_ttl": 3600}}
_CFG_B = {"base_url": "https://charts.example.com", "cache": {"mutable_ttl": 1800}} _CFG_B = {"base_url": "https://charts.example.com", "cache": {"mutable_ttl": 1800}}
# --------------------------------------------------------------------------- def _identity_resolve(data, *args, **kwargs):
# _YamlLoader / _YamlDumperBase — C extension selection return data, None
# ---------------------------------------------------------------------------
class TestYamlExtensionSelection:
def test_loader_is_a_class(self):
assert isinstance(_YamlLoader, type)
def test_dumper_base_is_a_class(self):
assert isinstance(_YamlDumperBase, type)
def test_helm_dumper_uses_selected_base(self):
assert issubclass(_HelmDumper, _YamlDumperBase)
def test_c_extensions_used_when_available(self):
try:
assert _YamlLoader is yaml.CSafeLoader
assert _YamlDumperBase is yaml.CDumper
except AttributeError:
assert _YamlLoader is yaml.SafeLoader
assert _YamlDumperBase is yaml.Dumper
def test_loader_can_parse_yaml(self):
result = yaml.load(b"key: value", Loader=_YamlLoader)
assert result == {"key": "value"}
# --------------------------------------------------------------------------- # ---------------------------------------------------------------------------
@@ -174,13 +135,14 @@ class TestHelmHandler:
assert isinstance(msg, str) and len(msg) > 0 assert isinstance(msg, str) and len(msg) > 0
def test_merge_returns_bytes(self): def test_merge_returns_bytes(self):
result = self.handler.merge([_INDEX_A], [None], ["member-a"], [_CFG_A], "http://proxy.example.com") with patch("artifactapi.artifact.virtual._helm.resolve_content", side_effect=_identity_resolve):
result = self.handler.merge([_INDEX_A], ["member-a"], [_CFG_A], "http://proxy.example.com")
assert isinstance(result, bytes) assert isinstance(result, bytes)
def test_merge_delegates_to_merge_helm_indexes(self): def test_merge_delegates_to_merge_helm_indexes(self):
with patch("artifactapi.artifact.virtual._merge_helm_indexes", return_value=b"merged") as mock_fn: with patch("artifactapi.artifact.virtual._merge_helm_indexes", return_value=b"merged") as mock_fn:
result = self.handler.merge([b"data"], [None], ["m"], [{}], "http://proxy") result = self.handler.merge([b"data"], ["m"], [{}], "http://proxy")
mock_fn.assert_called_once_with([b"data"], [None], ["m"], [{}], "http://proxy") mock_fn.assert_called_once_with([b"data"], ["m"], [{}], "http://proxy")
assert result == b"merged" assert result == b"merged"
@@ -198,41 +160,6 @@ class TestHandlersRegistry:
assert isinstance(_HANDLERS["helm"], _VirtualHandler) assert isinstance(_HANDLERS["helm"], _VirtualHandler)
# ---------------------------------------------------------------------------
# _rewrite_urls
# ---------------------------------------------------------------------------
class TestRewriteUrls:
def _rewrite(self, urls, base_url="https://upstream.example.com", proxy_base="http://proxy.example.com", member_name="my-remote"):
return _rewrite_urls(urls, base_url, proxy_base, member_name)
def test_absolute_url_matching_base_is_rewritten(self):
result = self._rewrite(["https://upstream.example.com/chart-1.0.0.tgz"])
assert result == ["http://proxy.example.com/api/v1/remote/my-remote/chart-1.0.0.tgz"]
def test_relative_url_is_prepended_with_proxy_remote(self):
result = self._rewrite(["chart-1.0.0.tgz"])
assert result == ["http://proxy.example.com/api/v1/remote/my-remote/chart-1.0.0.tgz"]
def test_relative_url_with_leading_slash(self):
result = self._rewrite(["/chart-1.0.0.tgz"])
assert result == ["http://proxy.example.com/api/v1/remote/my-remote/chart-1.0.0.tgz"]
def test_absolute_url_not_matching_base_is_unchanged(self):
result = self._rewrite(["https://other.example.com/chart-1.0.0.tgz"])
assert result == ["https://other.example.com/chart-1.0.0.tgz"]
def test_empty_url_list_returns_empty(self):
assert self._rewrite([]) == []
def test_multiple_urls_all_rewritten(self):
urls = ["https://upstream.example.com/a-1.0.0.tgz", "b-2.0.0.tgz"]
result = self._rewrite(urls)
assert result[0] == "http://proxy.example.com/api/v1/remote/my-remote/a-1.0.0.tgz"
assert result[1] == "http://proxy.example.com/api/v1/remote/my-remote/b-2.0.0.tgz"
# --------------------------------------------------------------------------- # ---------------------------------------------------------------------------
# _merge_helm_indexes # _merge_helm_indexes
# --------------------------------------------------------------------------- # ---------------------------------------------------------------------------
@@ -240,7 +167,8 @@ class TestRewriteUrls:
class TestMergeHelmIndexes: class TestMergeHelmIndexes:
def _merge(self, raw_indexes, member_names, member_configs, proxy_base="http://proxy.example.com"): def _merge(self, raw_indexes, member_names, member_configs, proxy_base="http://proxy.example.com"):
return _merge_helm_indexes(raw_indexes, [None] * len(raw_indexes), member_names, member_configs, proxy_base) with patch("artifactapi.artifact.virtual._helm.resolve_content", side_effect=_identity_resolve):
return _merge_helm_indexes(raw_indexes, member_names, member_configs, proxy_base)
def _parse(self, raw): def _parse(self, raw):
return yaml.safe_load(raw) return yaml.safe_load(raw)
@@ -259,18 +187,7 @@ class TestMergeHelmIndexes:
def test_first_member_wins_on_duplicate_name_and_version(self): def test_first_member_wins_on_duplicate_name_and_version(self):
index = self._parse(self._merge([_INDEX_A, _INDEX_B], ["member-a", "member-b"], [_CFG_A, _CFG_B])) index = self._parse(self._merge([_INDEX_A, _INDEX_B], ["member-a", "member-b"], [_CFG_A, _CFG_B]))
v027 = next(e for e in index["entries"]["vault"] if e["version"] == "0.27.0") v027 = next(e for e in index["entries"]["vault"] if e["version"] == "0.27.0")
assert "member-a" in v027["urls"][0] assert "helm.releases.hashicorp.com" in v027["urls"][0]
def test_absolute_urls_rewritten_to_proxy(self):
index = self._parse(self._merge([_INDEX_A], ["member-a"], [_CFG_A]))
url = index["entries"]["vault"][0]["urls"][0]
assert url == "http://proxy.example.com/api/v1/remote/member-a/vault-0.27.0.tgz"
def test_relative_urls_rewritten_to_proxy(self):
cfg = {"base_url": "https://releases.rancher.com/server-charts/stable", "cache": {"mutable_ttl": 3600}}
index = self._parse(self._merge([_INDEX_RELATIVE], ["rancher-stable"], [cfg]))
url = index["entries"]["rancher"][0]["urls"][0]
assert url == "http://proxy.example.com/api/v1/remote/rancher-stable/rancher-2.13.1.tgz"
def test_different_versions_of_same_chart_both_included(self): def test_different_versions_of_same_chart_both_included(self):
index = self._parse(self._merge([_INDEX_A, _INDEX_B], ["member-a", "member-b"], [_CFG_A, _CFG_B])) index = self._parse(self._merge([_INDEX_A, _INDEX_B], ["member-a", "member-b"], [_CFG_A, _CFG_B]))
@@ -343,7 +260,7 @@ class TestGetMemberIndex:
storage.exists.return_value = True storage.exists.return_value = True
cache.is_index_valid.return_value = True cache.is_index_valid.return_value = True
_, _, _, raw_data, _ = await _get_member_index("m", member_cfg, "index.yaml", storage, cache) _, _, _, raw_data = await _get_member_index("m", member_cfg, "index.yaml", storage, cache)
assert raw_data == b"cached bytes" assert raw_data == b"cached bytes"
@@ -366,7 +283,7 @@ class TestGetMemberIndex:
mock_cls.return_value.__aenter__.return_value = mock_client mock_cls.return_value.__aenter__.return_value = mock_client
mock_client.get.return_value = self._fake_response(b"fresh bytes") mock_client.get.return_value = self._fake_response(b"fresh bytes")
_, _, _, raw_data, _ = await _get_member_index("m", member_cfg, "index.yaml", storage, cache) _, _, _, raw_data = await _get_member_index("m", member_cfg, "index.yaml", storage, cache)
assert raw_data == b"fresh bytes" assert raw_data == b"fresh bytes"
@@ -376,7 +293,7 @@ class TestGetMemberIndex:
mock_cls.return_value.__aenter__.return_value = mock_client mock_cls.return_value.__aenter__.return_value = mock_client
mock_client.get.return_value = self._fake_response() mock_client.get.return_value = self._fake_response()
_, _, _, raw_data, _ = await _get_member_index("m", member_cfg, "index.yaml", storage, cache) _, _, _, raw_data = await _get_member_index("m", member_cfg, "index.yaml", storage, cache)
assert raw_data == b"upstream bytes" assert raw_data == b"upstream bytes"
@@ -435,7 +352,7 @@ class TestGetMemberIndex:
mock_cls.return_value.__aenter__.return_value = mock_client mock_cls.return_value.__aenter__.return_value = mock_client
mock_client.get.side_effect = Exception("connection refused") mock_client.get.side_effect = Exception("connection refused")
_, _, _, raw_data, _ = await _get_member_index("m", member_cfg, "index.yaml", storage, cache) _, _, _, raw_data = await _get_member_index("m", member_cfg, "index.yaml", storage, cache)
assert raw_data is None assert raw_data is None
@@ -447,7 +364,7 @@ class TestGetMemberIndex:
mock_cls.return_value.__aenter__.return_value = mock_client mock_cls.return_value.__aenter__.return_value = mock_client
mock_client.get.return_value = self._fake_response() mock_client.get.return_value = self._fake_response()
_, _, _, raw_data, _ = await _get_member_index("m", member_cfg, "index.yaml", storage, cache) _, _, _, raw_data = await _get_member_index("m", member_cfg, "index.yaml", storage, cache)
assert raw_data == b"upstream bytes" assert raw_data == b"upstream bytes"
@@ -458,7 +375,7 @@ class TestGetMemberIndex:
mock_cls.return_value.__aenter__.return_value = mock_client mock_cls.return_value.__aenter__.return_value = mock_client
mock_client.get.return_value = self._fake_response() mock_client.get.return_value = self._fake_response()
_, _, ttl, _, _ = await _get_member_index("m", cfg, "index.yaml", storage, cache) _, _, ttl, _ = await _get_member_index("m", cfg, "index.yaml", storage, cache)
assert ttl == 900 assert ttl == 900
@@ -469,7 +386,7 @@ class TestGetMemberIndex:
mock_cls.return_value.__aenter__.return_value = mock_client mock_cls.return_value.__aenter__.return_value = mock_client
mock_client.get.return_value = self._fake_response() mock_client.get.return_value = self._fake_response()
_, _, ttl, _, _ = await _get_member_index("m", cfg, "index.yaml", storage, cache) _, _, ttl, _ = await _get_member_index("m", cfg, "index.yaml", storage, cache)
assert ttl == 3600 assert ttl == 3600
@@ -513,10 +430,10 @@ class TestVirtualRoute:
response = client.get("/api/v1/virtual/no-such-virtual/index.yaml") response = client.get("/api/v1/virtual/no-such-virtual/index.yaml")
assert response.status_code == 404 assert response.status_code == 404
def test_non_virtual_name_returns_404(self, client, patched_virtual_deps): def test_non_virtual_type_returns_400(self, client, patched_virtual_deps):
# helm-test is in remotes, not virtuals # helm-test is type "remote", not "virtual"
response = client.get("/api/v1/virtual/helm-test/index.yaml") response = client.get("/api/v1/virtual/helm-test/index.yaml")
assert response.status_code == 404 assert response.status_code == 400
def test_unsupported_package_returns_400(self, client, patched_virtual_deps): def test_unsupported_package_returns_400(self, client, patched_virtual_deps):
# unsupported-virtual-test has package "rpm" # unsupported-virtual-test has package "rpm"
@@ -567,16 +484,22 @@ class TestVirtualRoute:
mock_get.assert_not_called() mock_get.assert_not_called()
def test_cache_miss_returns_200_with_yaml_content_type(self, client, patched_virtual_deps): def test_cache_miss_returns_200_with_yaml_content_type(self, client, patched_virtual_deps):
with patch("artifactapi.artifact.virtual._get_member_index", new_callable=AsyncMock) as mock_get: with (
mock_get.return_value = ("helm-test", _CFG_A, 3600, _INDEX_SIMPLE, None) patch("artifactapi.artifact.virtual._get_member_index", new_callable=AsyncMock) as mock_get,
patch("artifactapi.artifact.virtual._helm.resolve_content", side_effect=_identity_resolve),
):
mock_get.return_value = ("helm-test", _CFG_A, 3600, _INDEX_SIMPLE)
response = client.get("/api/v1/virtual/helm-virtual-test/index.yaml") response = client.get("/api/v1/virtual/helm-virtual-test/index.yaml")
assert response.status_code == 200 assert response.status_code == 200
assert "text/yaml" in response.headers["content-type"] assert "text/yaml" in response.headers["content-type"]
def test_cache_miss_response_contains_merged_entries(self, client, patched_virtual_deps): def test_cache_miss_response_contains_merged_entries(self, client, patched_virtual_deps):
with patch("artifactapi.artifact.virtual._get_member_index", new_callable=AsyncMock) as mock_get: with (
mock_get.return_value = ("helm-test", _CFG_A, 3600, _INDEX_SIMPLE, None) patch("artifactapi.artifact.virtual._get_member_index", new_callable=AsyncMock) as mock_get,
patch("artifactapi.artifact.virtual._helm.resolve_content", side_effect=_identity_resolve),
):
mock_get.return_value = ("helm-test", _CFG_A, 3600, _INDEX_SIMPLE)
response = client.get("/api/v1/virtual/helm-virtual-test/index.yaml") response = client.get("/api/v1/virtual/helm-virtual-test/index.yaml")
index = yaml.safe_load(response.content) index = yaml.safe_load(response.content)
@@ -584,26 +507,35 @@ class TestVirtualRoute:
def test_cache_miss_stores_result_in_s3(self, client, patched_virtual_deps): def test_cache_miss_stores_result_in_s3(self, client, patched_virtual_deps):
deps = patched_virtual_deps deps = patched_virtual_deps
with patch("artifactapi.artifact.virtual._get_member_index", new_callable=AsyncMock) as mock_get: with (
mock_get.return_value = ("helm-test", _CFG_A, 3600, _INDEX_SIMPLE, None) patch("artifactapi.artifact.virtual._get_member_index", new_callable=AsyncMock) as mock_get,
patch("artifactapi.artifact.virtual._helm.resolve_content", side_effect=_identity_resolve),
):
mock_get.return_value = ("helm-test", _CFG_A, 3600, _INDEX_SIMPLE)
client.get("/api/v1/virtual/helm-virtual-test/index.yaml") client.get("/api/v1/virtual/helm-virtual-test/index.yaml")
deps["storage"].upload.assert_called_once() deps["storage"].upload.assert_called_once()
def test_cache_miss_marks_index_cached(self, client, patched_virtual_deps): def test_cache_miss_marks_index_cached(self, client, patched_virtual_deps):
deps = patched_virtual_deps deps = patched_virtual_deps
with patch("artifactapi.artifact.virtual._get_member_index", new_callable=AsyncMock) as mock_get: with (
mock_get.return_value = ("helm-test", _CFG_A, 3600, _INDEX_SIMPLE, None) patch("artifactapi.artifact.virtual._get_member_index", new_callable=AsyncMock) as mock_get,
patch("artifactapi.artifact.virtual._helm.resolve_content", side_effect=_identity_resolve),
):
mock_get.return_value = ("helm-test", _CFG_A, 3600, _INDEX_SIMPLE)
client.get("/api/v1/virtual/helm-virtual-test/index.yaml") client.get("/api/v1/virtual/helm-virtual-test/index.yaml")
deps["cache"].mark_index_cached.assert_called_once() deps["cache"].mark_index_cached.assert_called_once()
def test_cache_miss_uses_min_ttl_across_members(self, client, patched_virtual_deps): def test_cache_miss_uses_min_ttl_across_members(self, client, patched_virtual_deps):
deps = patched_virtual_deps deps = patched_virtual_deps
with patch("artifactapi.artifact.virtual._get_member_index", new_callable=AsyncMock) as mock_get: with (
patch("artifactapi.artifact.virtual._get_member_index", new_callable=AsyncMock) as mock_get,
patch("artifactapi.artifact.virtual._helm.resolve_content", side_effect=_identity_resolve),
):
mock_get.side_effect = [ mock_get.side_effect = [
("helm-test", _CFG_A, 3600, _INDEX_SIMPLE, None), ("helm-test", _CFG_A, 3600, _INDEX_SIMPLE),
("helm-member-2", _CFG_B, 1800, _INDEX_SIMPLE, None), ("helm-member-2", _CFG_B, 1800, _INDEX_SIMPLE),
] ]
client.get("/api/v1/virtual/helm-virtual-test/index.yaml") client.get("/api/v1/virtual/helm-virtual-test/index.yaml")
@@ -612,16 +544,19 @@ class TestVirtualRoute:
def test_all_members_unreachable_returns_502(self, client, patched_virtual_deps): def test_all_members_unreachable_returns_502(self, client, patched_virtual_deps):
with patch("artifactapi.artifact.virtual._get_member_index", new_callable=AsyncMock) as mock_get: with patch("artifactapi.artifact.virtual._get_member_index", new_callable=AsyncMock) as mock_get:
mock_get.return_value = ("helm-test", _CFG_A, 3600, None, None) mock_get.return_value = ("helm-test", _CFG_A, 3600, None)
response = client.get("/api/v1/virtual/helm-virtual-test/index.yaml") response = client.get("/api/v1/virtual/helm-virtual-test/index.yaml")
assert response.status_code == 502 assert response.status_code == 502
def test_one_member_unreachable_still_returns_200(self, client, patched_virtual_deps): def test_one_member_unreachable_still_returns_200(self, client, patched_virtual_deps):
with patch("artifactapi.artifact.virtual._get_member_index", new_callable=AsyncMock) as mock_get: with (
patch("artifactapi.artifact.virtual._get_member_index", new_callable=AsyncMock) as mock_get,
patch("artifactapi.artifact.virtual._helm.resolve_content", side_effect=_identity_resolve),
):
mock_get.side_effect = [ mock_get.side_effect = [
("helm-test", _CFG_A, 3600, _INDEX_SIMPLE, None), ("helm-test", _CFG_A, 3600, _INDEX_SIMPLE),
("helm-member-2", _CFG_B, 1800, None, None), ("helm-member-2", _CFG_B, 1800, None),
] ]
response = client.get("/api/v1/virtual/helm-virtual-test/index.yaml") response = client.get("/api/v1/virtual/helm-virtual-test/index.yaml")
@@ -637,9 +572,10 @@ class TestVirtualRoute:
with ( with (
patch("artifactapi.artifact.virtual._get_member_index", new_callable=AsyncMock) as mock_get, patch("artifactapi.artifact.virtual._get_member_index", new_callable=AsyncMock) as mock_get,
patch("artifactapi.artifact.virtual._helm.resolve_content", side_effect=_identity_resolve),
patch.object(main_mod.config, "get_remote_config", side_effect=patched_get), patch.object(main_mod.config, "get_remote_config", side_effect=patched_get),
): ):
mock_get.return_value = ("helm-test", _CFG_A, 3600, _INDEX_SIMPLE, None) mock_get.return_value = ("helm-test", _CFG_A, 3600, _INDEX_SIMPLE)
response = client.get("/api/v1/virtual/helm-virtual-test/index.yaml") response = client.get("/api/v1/virtual/helm-virtual-test/index.yaml")
# only helm-test was available — should succeed # only helm-test was available — should succeed
@@ -650,181 +586,11 @@ class TestVirtualRoute:
deps = patched_virtual_deps deps = patched_virtual_deps
deps["storage"].upload.side_effect = Exception("S3 write error") deps["storage"].upload.side_effect = Exception("S3 write error")
with patch("artifactapi.artifact.virtual._get_member_index", new_callable=AsyncMock) as mock_get: with (
mock_get.return_value = ("helm-test", _CFG_A, 3600, _INDEX_SIMPLE, None) patch("artifactapi.artifact.virtual._get_member_index", new_callable=AsyncMock) as mock_get,
patch("artifactapi.artifact.virtual._helm.resolve_content", side_effect=_identity_resolve),
):
mock_get.return_value = ("helm-test", _CFG_A, 3600, _INDEX_SIMPLE)
response = client.get("/api/v1/virtual/helm-virtual-test/index.yaml") response = client.get("/api/v1/virtual/helm-virtual-test/index.yaml")
assert response.status_code == 200 assert response.status_code == 200
# ---------------------------------------------------------------------------
# _entries_to_msgpack_safe
# ---------------------------------------------------------------------------
class TestEntriesToMsgpackSafe:
def test_plain_string_values_pass_through(self):
entries = {"chart": [{"name": "chart", "version": "1.0.0", "urls": ["http://x/c.tgz"]}]}
result = _entries_to_msgpack_safe(entries)
assert result["chart"][0]["version"] == "1.0.0"
def test_datetime_converted_to_iso_string(self):
dt = datetime(2023, 6, 15, 12, 0, 0, tzinfo=UTC)
entries = {"chart": [{"name": "chart", "version": "1.0.0", "created": dt}]}
result = _entries_to_msgpack_safe(entries)
assert isinstance(result["chart"][0]["created"], str)
assert "2023-06-15" in result["chart"][0]["created"]
def test_date_converted_to_iso_string(self):
entries = {"chart": [{"name": "chart", "version": "1.0.0", "created": date(2023, 6, 15)}]}
result = _entries_to_msgpack_safe(entries)
assert result["chart"][0]["created"] == "2023-06-15"
def test_empty_entries_returns_empty_dict(self):
assert _entries_to_msgpack_safe({}) == {}
def test_multiple_versions_all_converted(self):
dt = datetime(2023, 1, 1, tzinfo=UTC)
entries = {
"chart": [
{"name": "chart", "version": "1.0.0", "created": dt},
{"name": "chart", "version": "2.0.0", "created": dt},
]
}
result = _entries_to_msgpack_safe(entries)
for v in result["chart"]:
assert isinstance(v["created"], str)
def test_result_is_msgpack_serializable(self):
import msgpack
dt = datetime(2023, 6, 15, 12, 0, 0, tzinfo=UTC)
entries = {"chart": [{"name": "chart", "version": "1.0.0", "created": dt, "urls": ["http://x/c.tgz"]}]}
safe = _entries_to_msgpack_safe(entries)
packed = msgpack.packb(safe, use_bin_type=True)
unpacked = msgpack.unpackb(packed, raw=False)
assert unpacked["chart"][0]["created"] == safe["chart"][0]["created"]
# ---------------------------------------------------------------------------
# _merge_helm_indexes — pre-parsed entries path
# ---------------------------------------------------------------------------
class TestMergeHelmIndexesWithParsed:
"""Verify that pre-parsed entries (from msgpack) produce the same output as raw YAML."""
def _parse_entries(self, raw: bytes) -> dict:
index = yaml.safe_load(raw)
return index.get("entries") or {}
def test_parsed_entries_produce_same_charts_as_raw(self):
parsed = self._parse_entries(_INDEX_A)
raw_result = yaml.safe_load(_merge_helm_indexes([_INDEX_A], [None], ["member-a"], [_CFG_A], "http://proxy.example.com"))
parsed_result = yaml.safe_load(_merge_helm_indexes([_INDEX_A], [parsed], ["member-a"], [_CFG_A], "http://proxy.example.com"))
assert set(raw_result["entries"].keys()) == set(parsed_result["entries"].keys())
def test_parsed_entries_urls_are_rewritten(self):
parsed = self._parse_entries(_INDEX_A)
result = yaml.safe_load(_merge_helm_indexes([_INDEX_A], [parsed], ["member-a"], [_CFG_A], "http://proxy.example.com"))
url = result["entries"]["vault"][0]["urls"][0]
assert "member-a" in url
assert "proxy.example.com" in url
def test_none_parsed_falls_back_to_raw_bytes(self):
result = yaml.safe_load(_merge_helm_indexes([_INDEX_A], [None], ["member-a"], [_CFG_A], "http://proxy.example.com"))
assert "vault" in result["entries"]
def test_mixed_parsed_and_raw_merge_correctly(self):
parsed_a = self._parse_entries(_INDEX_A)
result = yaml.safe_load(
_merge_helm_indexes(
[_INDEX_A, _INDEX_B],
[parsed_a, None],
["member-a", "member-b"],
[_CFG_A, _CFG_B],
"http://proxy.example.com",
)
)
assert "vault" in result["entries"]
assert "nginx" in result["entries"]
# ---------------------------------------------------------------------------
# _get_member_index — msgpack cache behaviour
# ---------------------------------------------------------------------------
class TestGetMemberIndexMsgpack:
@pytest.fixture
def storage(self):
m = MagicMock()
m.get_object_key.side_effect = lambda name, path: f"{name}/{path}"
m.exists.return_value = False
m.download_object.return_value = _INDEX_SIMPLE
return m
@pytest.fixture
def cache(self):
m = MagicMock()
m.is_index_valid.return_value = False
return m
@pytest.fixture
def member_cfg(self):
return {"base_url": "https://helm.releases.hashicorp.com", "cache": {"mutable_ttl": 3600}}
def _fake_response(self, content=_INDEX_SIMPLE):
r = MagicMock()
r.content = content
r.raise_for_status = MagicMock()
return r
async def test_cache_hit_with_msgpack_returns_parsed_entries(self, storage, cache, member_cfg):
import msgpack
entries = {"mychart": [{"name": "mychart", "version": "1.0.0", "urls": ["http://x/c.tgz"]}]}
packed = msgpack.packb(entries, use_bin_type=True)
storage.exists.side_effect = lambda key: True
cache.is_index_valid.return_value = True
storage.download_object.side_effect = lambda key: packed if key.endswith("index.msgpack") else _INDEX_SIMPLE
_, _, _, raw_data, parsed = await _get_member_index("m", member_cfg, "index.yaml", storage, cache)
assert parsed == entries
async def test_cache_miss_builds_msgpack_and_returns_parsed(self, storage, cache, member_cfg):
with patch("artifactapi.artifact.virtual.httpx.AsyncClient") as mock_cls:
mock_client = AsyncMock()
mock_cls.return_value.__aenter__.return_value = mock_client
mock_client.get.return_value = self._fake_response()
_, _, _, raw_data, parsed = await _get_member_index("m", member_cfg, "index.yaml", storage, cache)
assert raw_data == _INDEX_SIMPLE
assert isinstance(parsed, dict)
assert "mychart" in parsed
async def test_broken_msgpack_rebuilds_from_raw_yaml(self, storage, cache, member_cfg):
storage.exists.side_effect = lambda key: True
cache.is_index_valid.return_value = True
storage.download_object.side_effect = lambda key: b"not-valid-msgpack" if key.endswith("index.msgpack") else _INDEX_SIMPLE
_, _, _, raw_data, parsed = await _get_member_index("m", member_cfg, "index.yaml", storage, cache)
assert raw_data == _INDEX_SIMPLE
# Falls back to YAML parse and rebuilds msgpack — entries are returned
assert isinstance(parsed, dict)
assert "mychart" in parsed
async def test_upstream_failure_returns_none_for_both(self, storage, cache, member_cfg):
with patch("artifactapi.artifact.virtual.httpx.AsyncClient") as mock_cls:
mock_client = AsyncMock()
mock_cls.return_value.__aenter__.return_value = mock_client
mock_client.get.side_effect = Exception("timeout")
_, _, _, raw_data, parsed = await _get_member_index("m", member_cfg, "index.yaml", storage, cache)
assert raw_data is None
assert parsed is None