Compare commits

..

1 Commits

Author SHA1 Message Date
unkinben 34160032fc feat: use top-level key for repo type instead of type field
Replace the flat `remotes:` map (with `type: "remote"/"virtual"/"local"`) with
separate top-level sections — `remote:`, `virtual:`, `local:` — so the repo
type is declared structurally and the `type:` field is no longer needed.

Config loader normalises the new format to the existing internal representation
(injecting `type` into each remote dict), so all handler code is unchanged.
Adds a TestYamlTypeKeys suite covering all three type keys, mixed files, and
field preservation. Includes README migration guide for splitting a single
remotes file into per-type-and-package conf.d files.
2026-04-29 23:24:54 +10:00
22 changed files with 495 additions and 970 deletions
+115 -48
View File
@@ -11,7 +11,6 @@ FastAPI caching proxy that downloads and stores files from remote sources in S3-
- Stale-on-upstream-error: refreshes TTL when backend is unreachable rather than evicting
- URL rewriting for PyPI simple index, npm metadata, and Helm `index.yaml`
- Access control via regex patterns — unmatched paths return 403
- Docker tag banning — block named tags (e.g. `latest`) while allowing digest pulls
## Architecture
@@ -71,11 +70,10 @@ src/artifactapi/
| Method | Path | Description |
|---|---|---|
| `GET` | `/api/v1/remote/{remote}/{path}` | Fetch artifact (auto-cache on miss) |
| `PUT` | `/api/v1/remote/{remote}/{path}` | Upload to local remote |
| `HEAD` | `/api/v1/remote/{remote}/{path}` | Check existence (local remotes) |
| `DELETE` | `/api/v1/remote/{remote}/{path}` | Delete from local remote |
| `GET` | `/api/v1/virtual/{virtual}/{path}` | Fetch from virtual (merged) repository |
| `GET` | `/api/v1/local/{local}/{path}` | Download from local repository |
| `PUT` | `/api/v1/local/{local}/{path}` | Upload to local repository |
| `HEAD` | `/api/v1/local/{local}/{path}` | Check existence (local) |
| `DELETE` | `/api/v1/local/{local}/{path}` | Delete from local repository |
| `GET` | `/v2/{remote}/{path}` | Docker Registry v2 proxy |
| `PUT` | `/cache/flush` | Flush cache entries |
| `GET` | `/health` | Health check |
@@ -122,12 +120,12 @@ config_dir: conf.d # or an absolute path
remotes: {} # optional base remotes
```
### Configuration structure
### remotes.yaml Structure
Repositories are declared under three top-level keys matching their type:
The top-level key declares the repository type — no `type:` field needed:
```yaml
remotes: # proxy (caching) remotes
remote:
remote-name:
base_url: "https://example.com"
package: "generic" # generic, alpine, rpm, docker, pypi, npm, helm
@@ -141,19 +139,16 @@ remotes: # proxy (caching) remotes
immutable_ttl: 0 # 0 = indefinitely
mutable_ttl: 3600
virtuals: # virtual (merged-index) repositories
virtual:
virtual-name:
package: "helm"
members:
- remote-a
- remote-b
- remote-name-1
- remote-name-2
locals: # local upload repositories (no base_url)
local:
local-name:
package: "generic"
cache:
immutable_ttl: 0
mutable_ttl: 0
```
## Remote Types
@@ -163,7 +158,7 @@ locals: # local upload repositories (no base_url)
Arbitrary HTTP file servers — GitHub releases, HashiCorp, custom servers.
```yaml
remotes:
remote:
github:
base_url: "https://github.com"
package: "generic"
@@ -190,7 +185,7 @@ Access: `GET /api/v1/remote/github/owner/repo/releases/download/v1.0/binary.tar.
### alpine
```yaml
remotes:
remote:
alpine:
base_url: "https://dl-cdn.alpinelinux.org"
package: "alpine"
@@ -206,7 +201,7 @@ remotes:
### rpm
```yaml
remotes:
remote:
almalinux:
base_url: "https://mirror.example.com/almalinux"
package: "rpm"
@@ -223,7 +218,7 @@ remotes:
### docker
```yaml
remotes:
remote:
dockerhub:
base_url: "https://registry-1.docker.io"
package: "docker"
@@ -244,26 +239,6 @@ remotes:
Tag manifests and `/tags/list` are built-in mutable patterns. Digest-addressed blobs are immutable.
#### Banning tags
Set `ban_tags_enabled: true` and list named tags in `ban_tags` to block specific tag references. Requests for a banned tag return `403`. Digest-addressed pulls (`sha256:…`) are never blocked, so images already in use can still be referenced by digest.
```yaml
remotes:
dockerhub:
base_url: "https://registry-1.docker.io"
package: "docker"
ban_tags_enabled: true
ban_tags:
- latest # force pinned tags in CI/CD
- edge
cache:
immutable_ttl: 0
mutable_ttl: 300
```
`ban_tags_enabled` defaults to `false`. Setting it to `true` with an empty `ban_tags` list has no effect.
For RKE2/containerd, configure `/etc/rancher/rke2/registries.yaml`:
```yaml
@@ -283,7 +258,7 @@ mirrors:
### pypi
```yaml
remotes:
remote:
pypi:
base_url: "https://files.pythonhosted.org"
package: "pypi"
@@ -314,7 +289,7 @@ default = true
### npm
```yaml
remotes:
remote:
npm:
base_url: "https://registry.npmjs.org"
package: "npm"
@@ -340,7 +315,7 @@ registry=https://artifacts.example.com/api/v1/remote/npm/
### helm
```yaml
remotes:
remote:
hashicorp-helm:
base_url: "https://helm.releases.hashicorp.com"
package: "helm"
@@ -368,7 +343,7 @@ A virtual repository presents a single unified index built from multiple member
All members must share the same `package` type as the virtual repo. Currently supported package types: `helm`.
```yaml
remotes:
remote:
helm-hashicorp:
base_url: "https://helm.releases.hashicorp.com"
package: "helm"
@@ -387,7 +362,7 @@ remotes:
immutable_ttl: 0
mutable_ttl: 3600
virtuals:
virtual:
helm-all:
package: "helm"
members:
@@ -411,7 +386,7 @@ If a member is unreachable and has no cached index, it is skipped and a warning
**Caching:**
The merged index is cached using `min(mutable_ttl)` across all members. Each member's raw index is cached in S3 under its own remote key; the virtual handler reuses those copies when available. On rebuild, each member's parsed index is also stored as a compact msgpack file (`index.msgpack`) alongside the raw YAML, eliminating the YAML parse cost on subsequent rebuilds.
The merged index is cached using `min(mutable_ttl)` across all members. Each member's raw index is cached in S3 under its own remote key by the normal proxy rules; the virtual handler reuses those copies when available.
**Helm example:**
@@ -425,7 +400,7 @@ Chart tarball URLs in the merged `index.yaml` are rewritten to point at the indi
### local
```yaml
locals:
local:
local-generic:
package: "generic"
description: "Local file repository"
@@ -434,7 +409,99 @@ locals:
mutable_ttl: 0
```
No `base_url`. Files are uploaded via `PUT /api/v1/local/{name}/{path}` and downloaded via `GET /api/v1/local/{name}/{path}`.
No `base_url`. Files are uploaded via `PUT` and served via `GET`.
## Migration
### Splitting a single remotes file into per-type files
The old format used a single `remotes:` map with an explicit `type:` field on each entry. The new format uses top-level type keys (`remote:`, `virtual:`, `local:`) and supports splitting across multiple files via `config_dir`.
**Before** (`remotes.yaml`):
```yaml
remotes:
dockerhub:
base_url: "https://registry-1.docker.io"
type: "remote"
package: "docker"
cache:
immutable_ttl: 0
mutable_ttl: 300
hashicorp-helm:
base_url: "https://helm.releases.hashicorp.com"
type: "remote"
package: "helm"
immutable_patterns:
- "\\.tgz$"
cache:
immutable_ttl: 0
mutable_ttl: 3600
helm-all:
type: "virtual"
package: "helm"
members:
- hashicorp-helm
local-files:
type: "local"
package: "generic"
```
**After** — one file per type + package type, with a main config pointing at the directory:
`config.yaml`:
```yaml
config_dir: conf.d
```
`conf.d/remote-docker.yaml`:
```yaml
remote:
dockerhub:
base_url: "https://registry-1.docker.io"
package: "docker"
cache:
immutable_ttl: 0
mutable_ttl: 300
```
`conf.d/remote-helm.yaml`:
```yaml
remote:
hashicorp-helm:
base_url: "https://helm.releases.hashicorp.com"
package: "helm"
immutable_patterns:
- "\\.tgz$"
cache:
immutable_ttl: 0
mutable_ttl: 3600
```
`conf.d/virtual-helm.yaml`:
```yaml
virtual:
helm-all:
package: "helm"
members:
- hashicorp-helm
```
`conf.d/local-generic.yaml`:
```yaml
local:
local-files:
package: "generic"
```
Set `CONFIG_PATH` to the main file:
```
CONFIG_PATH=/etc/artifactapi/config.yaml
```
Files in `conf.d/` are merged alphabetically; later files win on conflicts within the same remote name.
## Caching Model
@@ -473,7 +540,7 @@ When a mutable file expires and the upstream is unreachable (connection refused,
Set `quarantine_new: true` and `quarantine_days: N` on a remote to block immutable artifacts published within the last N days. Requests return `404` until the quarantine period expires, giving time to detect malicious packages before they are consumed.
```yaml
remotes:
remote:
pypi:
base_url: "https://files.pythonhosted.org"
package: "pypi"
+11
View File
@@ -0,0 +1,11 @@
remotes:
alpine:
base_url: "https://dl-cdn.alpinelinux.org"
type: "remote"
package: "alpine"
description: "Alpine Linux APK package repository"
immutable_patterns:
- ".*/x86_64/.*\\.apk$"
cache:
immutable_ttl: 0
mutable_ttl: 7200
+12
View File
@@ -0,0 +1,12 @@
remotes:
github:
base_url: "https://github.com"
type: "remote"
package: "generic"
description: "GitHub releases and files"
immutable_patterns:
- "gruntwork-io/terragrunt/.*terragrunt_linux_amd64.*"
- "prometheus/node_exporter/.*/node_exporter-.*\\.linux-amd64\\.tar\\.gz$"
cache:
immutable_ttl: 0
mutable_ttl: 0
+17
View File
@@ -0,0 +1,17 @@
remotes:
pypi:
base_url: "https://files.pythonhosted.org"
type: "remote"
package: "pypi"
description: "Python Package Index"
check_mutable_updates: true
quarantine_new: true
quarantine_days: 3
immutable_patterns:
- "packages/.*\\.whl$"
- "packages/.*\\.whl\\.metadata$"
- "packages/.*\\.tar\\.gz$"
- "packages/.*\\.zip$"
cache:
immutable_ttl: 0
mutable_ttl: 600
+1 -1
View File
@@ -1,4 +1,4 @@
remotes:
remote:
alpine:
base_url: "https://dl-cdn.alpinelinux.org"
package: "alpine"
+1 -1
View File
@@ -1,4 +1,4 @@
remotes:
remote:
github:
base_url: "https://github.com"
package: "generic"
+1 -1
View File
@@ -1,4 +1,4 @@
remotes:
remote:
pypi:
base_url: "https://files.pythonhosted.org"
package: "pypi"
+23 -29
View File
@@ -1,5 +1,10 @@
# Example remotes configuration — copy and adapt for your environment.
#
# Top-level keys determine the repository type:
# remote: — proxy to an upstream URL, cache responses in S3
# virtual: — merge indexes from multiple member remotes into one
# local: — store files uploaded via PUT, serve via GET
#
# immutable_patterns: artifacts cached forever (e.g. release binaries, versioned tags).
# mutable_patterns: artifacts that expire after cache.mutable_ttl seconds and are
# re-fetched from upstream on next request (e.g. index files,
@@ -32,7 +37,8 @@
#database:
# url: "postgresql://artifacts:artifacts123@localhost:5432/artifacts"
#
remotes:
remote:
github:
base_url: "https://github.com"
package: "generic"
@@ -61,7 +67,7 @@ remotes:
- "stalwartlabs/stalwart/.*/stalwart-foundationdb-x86_64-unknown-linux-gnu\\.tar\\.gz$"
- "stalwartlabs/stalwart/.*/stalwart-x86_64-unknown-linux-gnu\\.tar\\.gz$"
cache:
immutable_ttl: 0 # Files cached indefinitely
immutable_ttl: 0
mutable_ttl: 0
github-archive:
@@ -80,8 +86,8 @@ remotes:
# Only applies to user-defined mutable_patterns, not package-type defaults.
check_mutable_updates: true
cache:
immutable_ttl: 0 # Tag archives cached indefinitely
mutable_ttl: 86400 # Branch archives refreshed after 1 day
immutable_ttl: 0 # Tag archives cached indefinitely
mutable_ttl: 86400 # Branch archives refreshed after 1 day
gitea-dl:
base_url: "https://dl.gitea.com"
@@ -90,7 +96,7 @@ remotes:
immutable_patterns:
- "act_runner/.*/act_runner-.*-linux-amd64$"
cache:
immutable_ttl: 0 # Files cached indefinitely
immutable_ttl: 0
mutable_ttl: 0
hashicorp-releases:
@@ -110,7 +116,7 @@ remotes:
- "nomad/.*/nomad_.*_linux_amd64\\.zip$"
- "packer/.*/packer_.*_linux_amd64\\.zip$"
cache:
immutable_ttl: 0 # Files cached indefinitely
immutable_ttl: 0
mutable_ttl: 0
alpine:
@@ -123,7 +129,7 @@ remotes:
# and is always re-fetched on expiry — conditional checks are skipped for
# built-in mutable patterns regardless of this flag.
cache:
immutable_ttl: 0 # Files cached indefinitely
immutable_ttl: 0
mutable_ttl: 7200 # Index files (APKINDEX.tar.gz) cached for 2 hours
almalinux:
@@ -134,12 +140,9 @@ remotes:
- ".*/x86_64/.*\\.rpm$"
- ".*/noarch/.*\\.rpm$"
- ".*/repodata/.*$"
- ".*\\.rpm$" # Allow all RPM files
# repomd.xml / repodata are package-type defaults — always re-fetched on
# expiry. check_mutable_updates would only apply to any custom
# mutable_patterns added here.
- ".*\\.rpm$"
cache:
immutable_ttl: 0 # Files cached indefinitely
immutable_ttl: 0
mutable_ttl: 7200 # Metadata files cached for 2 hours
epel:
@@ -153,8 +156,8 @@ remotes:
- ".*/noarch/.*\\.rpm$"
- ".*/repodata/.*$"
cache:
immutable_ttl: 0 # Files cached indefinitely
mutable_ttl: 7200 # Metadata files cached for 2 hours
immutable_ttl: 0
mutable_ttl: 7200
fedora:
base_url: "https://gsl-syd.mm.fcix.net/fedora/linux"
@@ -167,7 +170,7 @@ remotes:
- ".*/noarch/.*\\.rpm$"
- "updates/.*/Everything/x86_64/repodata/.*$"
cache:
immutable_ttl: 0 # Files cached indefinitely
immutable_ttl: 0
mutable_ttl: 300 # Metadata files cached for 5 minutes
ghcr:
@@ -176,9 +179,6 @@ remotes:
description: "GitHub Container Registry"
# username: "your-github-username"
# password: "your-github-pat" # needs read:packages scope
# Docker manifest/tag-list patterns are package-type defaults — always
# re-fetched on expiry. check_mutable_updates only applies to any custom
# mutable_patterns you add (e.g. a metadata endpoint).
cache:
immutable_ttl: 0
mutable_ttl: 300
@@ -195,12 +195,7 @@ remotes:
base_url: "https://files.pythonhosted.org"
package: "pypi"
description: "Python Package Index — simple index and package files via a single remote"
# simple/ requests are transparently fetched from pypi.org; package files come from
# files.pythonhosted.org (base_url). URLs in the simple index are rewritten to this remote.
check_mutable_updates: true
# Block packages published within the last 3 days (supply-chain attack mitigation).
# Immutable artifacts (wheel/sdist) newer than quarantine_days return 404 until
# the window passes. Disable by setting quarantine_new: false or removing both keys.
quarantine_new: true
quarantine_days: 3
immutable_patterns:
@@ -251,8 +246,8 @@ remotes:
immutable_patterns:
- "\\.tgz$"
cache:
immutable_ttl: 0 # Chart tarballs are versioned — cache forever
mutable_ttl: 3600 # index.yaml refreshed after 1 hour
immutable_ttl: 0
mutable_ttl: 3600
metallb:
base_url: "https://metallb.github.io/metallb"
@@ -452,8 +447,7 @@ remotes:
immutable_ttl: 0
mutable_ttl: 3600
virtuals:
virtual:
helm-all:
package: "helm"
description: "Virtual repository merging all helm remotes — member order is priority order for duplicate chart+version"
@@ -478,10 +472,10 @@ virtuals:
- jfrog
- openvox
locals:
local:
local-generic:
package: "generic"
description: "Local generic file repository"
cache:
immutable_ttl: 0 # Files cached indefinitely
immutable_ttl: 0
mutable_ttl: 0
-1
View File
@@ -14,7 +14,6 @@ dependencies = [
"lxml>=4.9.0",
"prometheus-client>=0.19.0",
"python-multipart>=0.0.6",
"msgpack>=1.0.0",
]
requires-python = ">=3.11"
readme = "README.md"
+15 -50
View File
@@ -1,4 +1,3 @@
import asyncio
import hashlib
import json
import logging
@@ -34,16 +33,6 @@ async def proxy(request: Request, remote_name: str, path: str, storage, cache, c
logger.info(f"PATTERN BLOCKED: {remote_name}/{path}")
raise HTTPException(status_code=403, detail="Image not allowed by configuration patterns")
if remote_config.get("ban_tags_enabled", False):
ban_tags = remote_config.get("ban_tags", [])
if ban_tags:
tag_match = re.search(r"/manifests/([^/]+)$", path)
if tag_match:
tag = tag_match.group(1)
if not tag.startswith("sha256:") and tag in ban_tags:
logger.info(f"TAG BANNED: {remote_name}/{path} (tag: {tag})")
raise HTTPException(status_code=403, detail=f"Tag '{tag}' is not permitted on this remote")
base_url = remote_config.get("base_url", "").rstrip("/")
remote_url = f"{base_url}/v2/{path}"
@@ -58,39 +47,23 @@ async def proxy(request: Request, remote_name: str, path: str, storage, cache, c
if not await _proxy.handle_expired_mutable(remote_name, path, remote_url, config, cache, storage):
cached_key = None
lock_acquired = False
if not cached_key:
lock_acquired = cache.acquire_fetch_lock(remote_name, path)
if not lock_acquired:
# Another pod is already fetching — poll storage briefly before issuing a duplicate upstream request
for _ in range(10):
await asyncio.sleep(0.5)
probe_key = storage.get_object_key(remote_name, path)
if storage.exists(probe_key):
cached_key = probe_key
break
if not cached_key:
logger.info(f"Cache MISS: {remote_name}/{path} - fetching from remote: {remote_url}")
try:
result = await _proxy.cache_single_artifact(remote_url, remote_name, path, storage, remote_config)
if result["status"] == "error":
raise HTTPException(status_code=502, detail=f"Failed to fetch: {result['error']}")
if result["status"] == "cached" and is_mutable:
cache_config = config.get_cache_config(remote_name)
mutable_ttl = cache_config.get("mutable_ttl", 3600)
cache.mark_index_cached(remote_name, path, mutable_ttl)
logger.info(f"Mutable file cached with TTL: {remote_name}/{path} (ttl: {mutable_ttl}s)")
if result.get("etag") or result.get("last_modified"):
cache.store_mutable_meta(remote_name, path, result.get("etag"), result.get("last_modified"))
if not is_mutable:
published = result.get("last_modified")
if published:
cache.store_artifact_published(remote_name, path, published)
_proxy._check_quarantine(remote_name, published, config)
finally:
if lock_acquired:
cache.release_fetch_lock(remote_name, path)
result = await _proxy.cache_single_artifact(remote_url, remote_name, path, storage, remote_config)
if result["status"] == "error":
raise HTTPException(status_code=502, detail=f"Failed to fetch: {result['error']}")
if result["status"] == "cached" and is_mutable:
cache_config = config.get_cache_config(remote_name)
mutable_ttl = cache_config.get("mutable_ttl", 3600)
cache.mark_index_cached(remote_name, path, mutable_ttl)
logger.info(f"Mutable file cached with TTL: {remote_name}/{path} (ttl: {mutable_ttl}s)")
if result.get("etag") or result.get("last_modified"):
cache.store_mutable_meta(remote_name, path, result.get("etag"), result.get("last_modified"))
if not is_mutable:
published = result.get("last_modified")
if published:
cache.store_artifact_published(remote_name, path, published)
_proxy._check_quarantine(remote_name, published, config)
elif not is_mutable:
published = cache.get_artifact_published(remote_name, path)
if not published:
@@ -117,14 +90,6 @@ async def proxy(request: Request, remote_name: str, path: str, storage, cache, c
content_type = "application/vnd.oci.image.manifest.v1+json"
digest = f"sha256:{hashlib.sha256(artifact_data).hexdigest()}"
# Cross-link tag manifests to their sha256 digest key so digest-addressed pulls hit cache
if is_mutable and "/manifests/" in path:
digest_path = re.sub(r"/manifests/[^/]+$", f"/manifests/{digest}", path)
digest_key = storage.get_object_key(remote_name, digest_path)
if not storage.exists(digest_key):
storage.upload(digest_key, artifact_data)
headers = {
"Docker-Distribution-Api-Version": "registry/2.0",
"Docker-Content-Digest": digest,
+16 -21
View File
@@ -1,6 +1,5 @@
import hashlib
import logging
import os
from fastapi import HTTPException, Response, UploadFile
from fastapi.responses import JSONResponse
@@ -8,23 +7,12 @@ from fastapi.responses import JSONResponse
logger = logging.getLogger(__name__)
def download(remote_name: str, path: str, storage, database, config) -> Response:
if not config.get_local_config(remote_name):
raise HTTPException(status_code=404, detail=f"Local repository '{remote_name}' not configured")
metadata = database.get_local_file_metadata(remote_name, path)
if not metadata:
raise HTTPException(status_code=404, detail="File not found")
content = storage.download_object(metadata["s3_key"])
return Response(
content=content,
media_type=metadata.get("content_type", "application/octet-stream"),
headers={"Content-Disposition": f"attachment; filename={os.path.basename(path)}"},
)
async def upload(remote_name: str, path: str, file: UploadFile, storage, database, config) -> JSONResponse:
if not config.get_local_config(remote_name):
raise HTTPException(status_code=404, detail=f"Local repository '{remote_name}' not configured")
remote_config = config.get_remote_config(remote_name)
if not remote_config:
raise HTTPException(status_code=404, detail=f"Remote '{remote_name}' not configured")
if remote_config.get("type") != "local":
raise HTTPException(status_code=400, detail="Upload only supported for local repositories")
try:
content = await file.read()
@@ -71,8 +59,12 @@ async def upload(remote_name: str, path: str, file: UploadFile, storage, databas
def check_exists(remote_name: str, path: str, database, config) -> Response:
if not config.get_local_config(remote_name):
raise HTTPException(status_code=404, detail=f"Local repository '{remote_name}' not configured")
remote_config = config.get_remote_config(remote_name)
if not remote_config:
raise HTTPException(status_code=404, detail=f"Remote '{remote_name}' not configured")
if remote_config.get("type") != "local":
raise HTTPException(status_code=405, detail="HEAD method only supported for local repositories")
try:
metadata = database.get_local_file_metadata(remote_name, path)
@@ -95,8 +87,11 @@ def check_exists(remote_name: str, path: str, database, config) -> Response:
def delete(remote_name: str, path: str, storage, database, config) -> JSONResponse:
if not config.get_local_config(remote_name):
raise HTTPException(status_code=404, detail=f"Local repository '{remote_name}' not configured")
remote_config = config.get_remote_config(remote_name)
if not remote_config:
raise HTTPException(status_code=404, detail=f"Remote '{remote_name}' not configured")
if remote_config.get("type") != "local":
raise HTTPException(status_code=400, detail="Delete only supported for local repositories")
try:
s3_key = database.delete_local_file(remote_name, path)
+13
View File
@@ -218,6 +218,19 @@ async def handle(request: Request, remote_name: str, path: str, storage, cache,
if not remote_config:
raise HTTPException(status_code=404, detail=f"Remote '{remote_name}' not configured")
if remote_config.get("type") == "local":
metadata = database.get_local_file_metadata(remote_name, path)
if not metadata:
raise HTTPException(status_code=404, detail="File not found")
content = storage.download_object(metadata["s3_key"])
if content is None:
raise HTTPException(status_code=500, detail="File not accessible")
return Response(
content=content,
media_type=metadata.get("content_type", "application/octet-stream"),
headers={"Content-Disposition": f"attachment; filename={os.path.basename(path)}"},
)
path_parts = path.split("/")
if len(path_parts) >= 2:
repo_path = f"{path_parts[0]}/{path_parts[1]}"
+25 -115
View File
@@ -6,21 +6,15 @@ from datetime import UTC, date, datetime
from typing import Protocol, runtime_checkable
import httpx
import msgpack as _msgpack
import yaml
from fastapi import HTTPException, Request, Response
from ..remote import helm as _helm
logger = logging.getLogger(__name__)
try:
_YamlLoader = yaml.CSafeLoader
_YamlDumperBase = yaml.CDumper
except AttributeError:
_YamlLoader = yaml.SafeLoader
_YamlDumperBase = yaml.Dumper
class _HelmDumper(_YamlDumperBase):
class _HelmDumper(yaml.Dumper):
"""YAML dumper that serializes datetime/date objects back to ISO 8601 strings.
yaml.safe_load converts timestamp-shaped YAML scalars (e.g. chart `created`
@@ -43,43 +37,21 @@ _HelmDumper.add_representer(datetime, _repr_datetime)
_HelmDumper.add_representer(date, _repr_date)
def _entries_to_msgpack_safe(entries: dict) -> dict:
"""Convert datetime/date values to ISO strings for msgpack serialization."""
result = {}
for chart, versions in entries.items():
safe_versions = []
for v in versions:
safe_v = {}
for k, val in v.items():
if isinstance(val, datetime):
safe_v[k] = val.isoformat()
elif isinstance(val, date):
safe_v[k] = val.isoformat()
else:
safe_v[k] = val
safe_versions.append(safe_v)
result[chart] = safe_versions
return result
async def _get_member_index(
member_name: str,
member_cfg: dict,
path: str,
storage,
cache,
) -> tuple[str, dict, int, bytes | None, dict | None]:
) -> tuple[str, dict, int, bytes | None]:
"""Fetch or retrieve cached index.yaml for one member remote.
Returns (member_name, member_cfg, ttl, raw_bytes, parsed_entries).
Returns (member_name, member_cfg, ttl, raw_bytes).
raw_bytes is None if the member is unreachable and not in S3.
parsed_entries is the pre-parsed entries dict (from msgpack cache), or None.
"""
member_ttl = member_cfg.get("cache", {}).get("mutable_ttl", 3600)
s3_key = storage.get_object_key(member_name, path)
msgpack_key = storage.get_object_key(member_name, "index.msgpack")
raw_data: bytes | None = None
parsed_entries: dict | None = None
if storage.exists(s3_key) and cache.is_index_valid(member_name, path):
try:
@@ -87,13 +59,6 @@ async def _get_member_index(
logger.info(f"Virtual: cache hit for member '{member_name}'")
except Exception:
raw_data = None
if raw_data is not None and storage.exists(msgpack_key):
try:
packed = storage.download_object(msgpack_key)
parsed_entries = _msgpack.unpackb(packed, raw=False)
logger.debug(f"Virtual: msgpack hit for member '{member_name}'")
except Exception:
parsed_entries = None
if raw_data is None:
base_url = member_cfg.get("base_url", "").rstrip("/")
@@ -111,74 +76,35 @@ async def _get_member_index(
raw_data = response.content
except Exception as e:
logger.warning(f"Virtual: failed to fetch index.yaml from member '{member_name}': {e}")
return member_name, member_cfg, member_ttl, None, None
return member_name, member_cfg, member_ttl, None
try:
storage.upload(s3_key, raw_data)
cache.mark_index_cached(member_name, path, member_ttl)
except Exception as e:
logger.warning(f"Virtual: failed to cache index.yaml for member '{member_name}': {e}")
if parsed_entries is None and raw_data is not None:
try:
index = yaml.load(raw_data, Loader=_YamlLoader)
safe_entries = _entries_to_msgpack_safe(index.get("entries") or {})
storage.upload(msgpack_key, _msgpack.packb(safe_entries, use_bin_type=True))
parsed_entries = safe_entries
except Exception as e:
logger.warning(f"Virtual: failed to build msgpack cache for '{member_name}': {e}")
return member_name, member_cfg, member_ttl, raw_data, parsed_entries
return member_name, member_cfg, member_ttl, raw_data
def _rewrite_urls(urls: list, base_url: str, proxy_base: str, member_name: str) -> list:
proxy_remote = f"{proxy_base}/api/v1/remote/{member_name}"
rewritten = []
for url in urls:
if url.startswith(("http://", "https://")):
if base_url and url.startswith(base_url):
url = proxy_remote + url[len(base_url) :]
else:
url = f"{proxy_remote}/{url.lstrip('/')}"
rewritten.append(url)
return rewritten
def _merge_helm_indexes(
raw_indexes: list[bytes],
parsed_entries_list: list[dict | None],
member_names: list[str],
member_configs: list[dict],
proxy_base: str,
) -> bytes:
def _merge_helm_indexes(raw_indexes: list[bytes], member_names: list[str], member_configs: list[dict], proxy_base: str) -> bytes:
"""Merge helm index.yaml files with per-member URL rewriting.
Priority is determined by position in member_names: earlier members win
when the same chart name + version appears in multiple remotes.
Uses pre-parsed msgpack entries when available to skip YAML parsing.
"""
merged_entries: dict[str, list] = {}
for raw_data, pre_parsed, member_name, member_cfg in zip(raw_indexes, parsed_entries_list, member_names, member_configs):
for raw_data, member_name, member_cfg in zip(raw_indexes, member_names, member_configs):
base_url = member_cfg.get("base_url", "").rstrip("/")
rewritten, _ = _helm.resolve_content(raw_data, "index.yaml", "index.yaml", base_url, proxy_base, member_name)
if pre_parsed is not None:
entries = pre_parsed
else:
try:
index = yaml.load(raw_data, Loader=_YamlLoader)
except Exception as e:
logger.warning(f"Virtual: failed to parse index.yaml from member '{member_name}': {e}")
continue
entries = index.get("entries") or {}
try:
index = yaml.safe_load(rewritten)
except Exception as e:
logger.warning(f"Virtual: failed to parse index.yaml from member '{member_name}': {e}")
continue
for chart_name, versions in entries.items():
for version_entry in versions:
version_entry["urls"] = _rewrite_urls(
version_entry.get("urls") or [],
base_url,
proxy_base,
member_name,
)
for chart_name, versions in (index.get("entries") or {}).items():
if chart_name not in merged_entries:
merged_entries[chart_name] = list(versions)
else:
@@ -200,14 +126,7 @@ def _merge_helm_indexes(
@runtime_checkable
class _VirtualHandler(Protocol):
def accepts_path(self, path: str) -> bool: ...
def merge(
self,
raw_indexes: list[bytes],
parsed_entries: list[dict | None],
member_names: list[str],
member_configs: list[dict],
proxy_base: str,
) -> bytes: ...
def merge(self, raw_indexes: list[bytes], member_names: list[str], member_configs: list[dict], proxy_base: str) -> bytes: ...
def path_error(self) -> str: ...
@@ -215,15 +134,8 @@ class _HelmHandler:
def accepts_path(self, path: str) -> bool:
return path == "index.yaml"
def merge(
self,
raw_indexes: list[bytes],
parsed_entries: list[dict | None],
member_names: list[str],
member_configs: list[dict],
proxy_base: str,
) -> bytes:
return _merge_helm_indexes(raw_indexes, parsed_entries, member_names, member_configs, proxy_base)
def merge(self, raw_indexes: list[bytes], member_names: list[str], member_configs: list[dict], proxy_base: str) -> bytes:
return _merge_helm_indexes(raw_indexes, member_names, member_configs, proxy_base)
def path_error(self) -> str:
return "Virtual helm repositories only serve index.yaml; chart tarballs are served directly by member remotes"
@@ -235,9 +147,11 @@ _HANDLERS: dict[str, _VirtualHandler] = {
async def handle(request: Request, virtual_name: str, path: str, storage, cache, config) -> Response:
virtual_cfg = config.get_virtual_config(virtual_name)
virtual_cfg = config.get_remote_config(virtual_name)
if not virtual_cfg:
raise HTTPException(status_code=404, detail=f"Virtual repository '{virtual_name}' not configured")
if virtual_cfg.get("type") != "virtual":
raise HTTPException(status_code=400, detail=f"'{virtual_name}' is not a virtual repository")
package = virtual_cfg.get("package")
handler = _HANDLERS.get(package)
@@ -274,19 +188,17 @@ async def handle(request: Request, virtual_name: str, path: str, storage, cache,
fetch_ms = int((time.perf_counter() - t_fetch) * 1000)
raw_indexes: list[bytes] = []
used_parsed: list[dict | None] = []
used_members: list[str] = []
used_configs: list[dict] = []
min_ttl: int | None = None
for member_name, member_cfg, member_ttl, raw_data, parsed_entries in results:
for member_name, member_cfg, member_ttl, raw_data in results:
if min_ttl is None or member_ttl < min_ttl:
min_ttl = member_ttl
if raw_data is None:
logger.warning(f"Virtual '{virtual_name}': skipping unreachable member '{member_name}'")
continue
raw_indexes.append(raw_data)
used_parsed.append(parsed_entries)
used_members.append(member_name)
used_configs.append(member_cfg)
@@ -297,7 +209,7 @@ async def handle(request: Request, virtual_name: str, path: str, storage, cache,
min_ttl = 3600
t_merge = time.perf_counter()
merged = await asyncio.to_thread(handler.merge, raw_indexes, used_parsed, used_members, used_configs, proxy_base)
merged = handler.merge(raw_indexes, used_members, used_configs, proxy_base)
merge_ms = int((time.perf_counter() - t_merge) * 1000)
try:
@@ -305,11 +217,9 @@ async def handle(request: Request, virtual_name: str, path: str, storage, cache,
storage.upload(virtual_key, merged)
cache.mark_index_cached(virtual_name, path, min_ttl)
store_ms = int((time.perf_counter() - t_store) * 1000)
msgpack_hits = sum(1 for p in used_parsed if p is not None)
logger.info(
f"Virtual MISS: {virtual_name}/{path} rebuilt from {used_members} "
f"(fetch={fetch_ms}ms merge={merge_ms}ms store={store_ms}ms ttl={min_ttl}s "
f"msgpack={msgpack_hits}/{len(used_members)})"
f"(fetch={fetch_ms}ms merge={merge_ms}ms store={store_ms}ms ttl={min_ttl}s)"
)
except Exception as e:
logger.warning(f"Virtual: failed to store merged index for '{virtual_name}': {e}")
-19
View File
@@ -99,25 +99,6 @@ class RedisCache:
except Exception:
return None
def acquire_fetch_lock(self, remote_name: str, path: str, ttl: int = 30) -> bool:
"""Try to acquire a short-lived fetch lock. Returns True if acquired, False if held by another caller."""
if not self.available:
return True # fail open: no Redis → behave as if we always hold the lock
key = f"fetchlock:{remote_name}:{hashlib.sha256(path.encode()).hexdigest()[:16]}"
try:
return bool(self.client.set(key, 1, nx=True, ex=ttl))
except Exception:
return True
def release_fetch_lock(self, remote_name: str, path: str) -> None:
if not self.available:
return
key = f"fetchlock:{remote_name}:{hashlib.sha256(path.encode()).hexdigest()[:16]}"
try:
self.client.delete(key)
except Exception:
pass
def cleanup_expired_index(self, storage, remote_name: str, path: str) -> None:
if not self.available:
return
+23 -14
View File
@@ -4,6 +4,21 @@ import os
import yaml
_TYPE_KEYS = ("remote", "virtual", "local")
def _normalize_loaded(raw: dict) -> dict:
"""Convert {remote: {...}, virtual: {...}, local: {...}} into {remotes: {name: {type: ..., ...}}}."""
remotes = {}
for type_key in _TYPE_KEYS:
for name, cfg in (raw.get(type_key) or {}).items():
remotes[name] = {"type": type_key, **cfg}
result = {k: v for k, v in raw.items() if k not in _TYPE_KEYS}
if remotes:
result["remotes"] = remotes
return result
_PACKAGE_MUTABLE_PATTERNS: dict[str, list[str]] = {
"alpine": [
r"APKINDEX\.tar\.gz$",
@@ -41,8 +56,10 @@ class ConfigManager:
try:
with open(path) as f:
if path.endswith((".yaml", ".yml")):
return yaml.safe_load(f) or {}
return json.load(f)
raw = yaml.safe_load(f) or {}
else:
raw = json.load(f)
return _normalize_loaded(raw)
except FileNotFoundError:
return {}
@@ -50,8 +67,8 @@ class ConfigManager:
def _merge(base: dict, overlay: dict) -> dict:
result = {**base}
for key, value in overlay.items():
if key in ("remotes", "virtuals", "locals") and isinstance(base.get(key), dict) and isinstance(value, dict):
result[key] = {**base.get(key, {}), **value}
if key == "remotes" and isinstance(base.get("remotes"), dict) and isinstance(value, dict):
result["remotes"] = {**base.get("remotes", {}), **value}
else:
result[key] = value
return result
@@ -67,11 +84,11 @@ class ConfigManager:
self._config_dir = None
if os.path.isdir(self.config_path):
return self._load_from_dir(self.config_path) or {"remotes": {}, "virtuals": {}, "locals": {}}
return self._load_from_dir(self.config_path) or {"remotes": {}}
config = self._load_single_file(self.config_path)
if not config:
return {"remotes": {}, "virtuals": {}, "locals": {}}
return {"remotes": {}}
config_dir = config.pop("config_dir", None)
if config_dir:
@@ -119,14 +136,6 @@ class ConfigManager:
self._check_reload()
return self.config.get("remotes", {}).get(remote_name)
def get_virtual_config(self, virtual_name: str) -> dict | None:
self._check_reload()
return self.config.get("virtuals", {}).get(virtual_name)
def get_local_config(self, local_name: str) -> dict | None:
self._check_reload()
return self.config.get("locals", {}).get(local_name)
def get_immutable_patterns(self, remote_name: str, repo_path: str = "") -> list[str]:
remote_config = self.get_remote_config(remote_name)
if not remote_config:
+10 -21
View File
@@ -49,13 +49,7 @@ class ArtifactRequest(BaseModel):
@app.get("/")
def read_root():
config._check_reload()
return {
"message": "Artifact Storage API",
"version": app.version,
"remotes": list(config.config.get("remotes", {}).keys()),
"virtuals": list(config.config.get("virtuals", {}).keys()),
"locals": list(config.config.get("locals", {}).keys()),
}
return {"message": "Artifact Storage API", "version": app.version, "remotes": list(config.config.get("remotes", {}).keys())}
@app.get("/health")
@@ -105,24 +99,19 @@ async def get_artifact(request: Request, remote_name: str, path: str):
return await proxy.handle(request, remote_name, path, storage, cache, config, database, metrics)
@app.get("/api/v1/local/{local_name}/{path:path}")
def get_local_artifact(local_name: str, path: str):
return local.download(local_name, path, storage, database, config)
@app.put("/api/v1/remote/{remote_name}/{path:path}")
async def upload_file(remote_name: str, path: str, file: UploadFile = File(...)):
return await local.upload(remote_name, path, file, storage, database, config)
@app.put("/api/v1/local/{local_name}/{path:path}")
async def upload_local_file(local_name: str, path: str, file: UploadFile = File(...)):
return await local.upload(local_name, path, file, storage, database, config)
@app.head("/api/v1/remote/{remote_name}/{path:path}")
def check_file_exists(remote_name: str, path: str):
return local.check_exists(remote_name, path, database, config)
@app.head("/api/v1/local/{local_name}/{path:path}")
def check_local_file_exists(local_name: str, path: str):
return local.check_exists(local_name, path, database, config)
@app.delete("/api/v1/local/{local_name}/{path:path}")
def delete_local_file(local_name: str, path: str):
return local.delete(local_name, path, storage, database, config)
@app.delete("/api/v1/remote/{remote_name}/{path:path}")
def delete_file(remote_name: str, path: str):
return local.delete(remote_name, path, storage, database, config)
@app.post("/api/v1/artifacts/cache")
+7 -13
View File
@@ -87,10 +87,9 @@ class MetricsManager:
# Get from database if available
db_sizes = self.database_manager.get_storage_by_remote()
if db_sizes:
# Initialize all configured remotes and locals to 0
# Initialize all configured remotes to 0
remote_sizes = {}
all_names = list(config_manager.config.get("remotes", {}).keys()) + list(config_manager.config.get("locals", {}).keys())
for remote in all_names:
for remote in config_manager.config.get("remotes", {}).keys():
remote_sizes[remote] = db_sizes.get(remote, 0)
# Update Prometheus gauges
@@ -102,10 +101,10 @@ class MetricsManager:
# Fallback to S3 scanning if database not available
try:
remote_sizes = {}
all_names = list(config_manager.config.get("remotes", {}).keys()) + list(config_manager.config.get("locals", {}).keys())
remotes = config_manager.config.get("remotes", {}).keys()
# Initialize all remotes and locals to 0
for remote in all_names:
# Initialize all remotes to 0
for remote in remotes:
remote_sizes[remote] = 0
paginator = storage.client.get_paginator("list_objects_v2")
@@ -175,13 +174,8 @@ class MetricsManager:
metrics["requests"]["cache_hit_ratio"] = cache_hits / total_requests if total_requests > 0 else 0.0
metrics["bandwidth"]["saved_bytes"] = bandwidth_saved
# Get per-repo metrics
all_repos = {
**config_manager.config.get("remotes", {}),
**config_manager.config.get("virtuals", {}),
**config_manager.config.get("locals", {}),
}
for remote in all_repos.keys():
# Get per-remote metrics
for remote in config_manager.config.get("remotes", {}).keys():
remote_cache_hits = int(self.redis_client.client.get(f"metrics:cache_hits:{remote}") or 0)
remote_cache_misses = int(self.redis_client.client.get(f"metrics:cache_misses:{remote}") or 0)
remote_total = remote_cache_hits + remote_cache_misses
+22 -16
View File
@@ -20,55 +20,61 @@ TEST_REMOTES = {
"remotes": {
"alpine-test": {
"base_url": "https://dl-cdn.alpinelinux.org",
"type": "remote",
"package": "alpine",
"immutable_patterns": [".*/x86_64/.*\\.apk$"],
"cache": {"immutable_ttl": 0, "mutable_ttl": 3600},
},
"rpm-test": {
"base_url": "https://example.com/rpm",
"type": "remote",
"package": "rpm",
"immutable_patterns": [".*/x86_64/.*\\.rpm$", ".*/repodata/.*$"],
"cache": {"immutable_ttl": 0, "mutable_ttl": 3600},
},
"docker-test": {
"base_url": "https://registry.example.com",
"type": "remote",
"package": "docker",
"cache": {"immutable_ttl": 0, "mutable_ttl": 300},
},
"docker-restricted": {
"base_url": "https://registry.example.com",
"type": "remote",
"package": "docker",
"immutable_patterns": ["^library/nginx"],
"cache": {"immutable_ttl": 0, "mutable_ttl": 300},
},
"docker-bantags-test": {
"base_url": "https://registry.example.com",
"package": "docker",
"ban_tags_enabled": True,
"ban_tags": ["latest", "edge"],
"cache": {"immutable_ttl": 0, "mutable_ttl": 300},
},
"generic-test": {
"base_url": "https://releases.example.com",
"type": "remote",
"package": "generic",
"immutable_patterns": [".*\\.tar\\.gz$"],
"cache": {"immutable_ttl": 0, "mutable_ttl": 0},
},
"custom-index-test": {
"base_url": "https://example.com",
"type": "remote",
"package": "generic",
"mutable_patterns": ["metadata\\.json$"],
"cache": {"immutable_ttl": 0, "mutable_ttl": 600},
},
"check-mutable-test": {
"base_url": "https://example.com",
"type": "remote",
"package": "generic",
"mutable_patterns": ["metadata\\.json$"],
"check_mutable_updates": True,
"cache": {"immutable_ttl": 0, "mutable_ttl": 600},
},
"local-test": {
"type": "local",
"package": "generic",
"cache": {"immutable_ttl": 0, "mutable_ttl": 0},
},
"pypi-test": {
"base_url": "https://files.pythonhosted.org",
"type": "remote",
"package": "pypi",
"immutable_patterns": [
r"packages/.*\.whl$",
@@ -79,6 +85,7 @@ TEST_REMOTES = {
},
"npm-test": {
"base_url": "https://registry.npmjs.org",
"type": "remote",
"package": "npm",
"immutable_patterns": [r"\.tgz$"],
"mutable_patterns": [r"^(?!.*\.tgz$).*"],
@@ -86,12 +93,14 @@ TEST_REMOTES = {
},
"helm-test": {
"base_url": "https://helm.releases.hashicorp.com",
"type": "remote",
"package": "helm",
"immutable_patterns": [r"\.tgz$"],
"cache": {"immutable_ttl": 0, "mutable_ttl": 3600},
},
"quarantine-test": {
"base_url": "https://releases.example.com",
"type": "remote",
"package": "generic",
"immutable_patterns": [r".*\.tar\.gz$"],
"quarantine_new": True,
@@ -100,6 +109,7 @@ TEST_REMOTES = {
},
"quarantine-disabled": {
"base_url": "https://releases.example.com",
"type": "remote",
"package": "generic",
"immutable_patterns": [r".*\.tar\.gz$"],
"quarantine_new": False,
@@ -108,31 +118,27 @@ TEST_REMOTES = {
},
"helm-member-2": {
"base_url": "https://charts.example.com",
"type": "remote",
"package": "helm",
"immutable_patterns": [r"\.tgz$"],
"cache": {"immutable_ttl": 0, "mutable_ttl": 1800},
},
},
"locals": {
"local-test": {
"package": "generic",
"cache": {"immutable_ttl": 0, "mutable_ttl": 0},
},
},
"virtuals": {
"helm-virtual-test": {
"type": "virtual",
"package": "helm",
"members": ["helm-test", "helm-member-2"],
},
"unsupported-virtual-test": {
"type": "virtual",
"package": "rpm",
"members": ["rpm-test"],
},
"empty-virtual-test": {
"type": "virtual",
"package": "helm",
"members": [],
},
},
}
}
# ---------------------------------------------------------------------------
-71
View File
@@ -327,74 +327,3 @@ class TestArtifactPublished:
def test_get_returns_none_when_unavailable(self, unavailable_cache):
assert unavailable_cache.get_artifact_published("remote", "path") is None
# ---------------------------------------------------------------------------
# fetch lock (thundering-herd deduplication)
# ---------------------------------------------------------------------------
class TestFetchLock:
def test_acquire_returns_true_when_lock_obtained(self, cache_with_redis, mock_redis_client):
mock_redis_client.set.return_value = True
result = cache_with_redis.acquire_fetch_lock("myremote", "library/nginx/manifests/latest")
assert result is True
def test_acquire_calls_set_nx_with_ttl(self, cache_with_redis, mock_redis_client):
mock_redis_client.set.return_value = True
cache_with_redis.acquire_fetch_lock("myremote", "library/nginx/manifests/latest", ttl=15)
_, kwargs = mock_redis_client.set.call_args
assert kwargs.get("nx") is True
assert kwargs.get("ex") == 15
def test_acquire_returns_false_when_lock_already_held(self, cache_with_redis, mock_redis_client):
mock_redis_client.set.return_value = None # Redis SET NX → None when key exists
result = cache_with_redis.acquire_fetch_lock("myremote", "library/nginx/manifests/latest")
assert result is False
def test_acquire_fails_open_when_unavailable(self, unavailable_cache):
# caller must be allowed to proceed when Redis is down
assert unavailable_cache.acquire_fetch_lock("myremote", "some/path") is True
def test_acquire_fails_open_on_redis_exception(self, cache_with_redis, mock_redis_client):
mock_redis_client.set.side_effect = Exception("connection reset")
assert cache_with_redis.acquire_fetch_lock("myremote", "some/path") is True
def test_lock_key_embeds_path_hash(self, cache_with_redis, mock_redis_client):
mock_redis_client.set.return_value = True
path = "library/nginx/manifests/latest"
cache_with_redis.acquire_fetch_lock("myremote", path)
args, _ = mock_redis_client.set.call_args
expected_hash = hashlib.sha256(path.encode()).hexdigest()[:16]
assert args[0] == f"fetchlock:myremote:{expected_hash}"
def test_lock_key_hash_is_16_chars(self, cache_with_redis, mock_redis_client):
mock_redis_client.set.return_value = True
cache_with_redis.acquire_fetch_lock("myremote", "some/long/path/file.tar.gz")
args, _ = mock_redis_client.set.call_args
# key format: fetchlock:<remote>:<16-char hash>
parts = args[0].split(":")
assert len(parts) == 3
assert len(parts[2]) == 16
def test_different_paths_produce_different_lock_keys(self, cache_with_redis, mock_redis_client):
mock_redis_client.set.return_value = True
cache_with_redis.acquire_fetch_lock("myremote", "path/a/manifests/latest")
key_a = mock_redis_client.set.call_args[0][0]
mock_redis_client.set.reset_mock()
cache_with_redis.acquire_fetch_lock("myremote", "path/b/manifests/latest")
key_b = mock_redis_client.set.call_args[0][0]
assert key_a != key_b
def test_release_deletes_correct_key(self, cache_with_redis, mock_redis_client):
path = "library/nginx/manifests/latest"
cache_with_redis.release_fetch_lock("myremote", path)
expected_hash = hashlib.sha256(path.encode()).hexdigest()[:16]
mock_redis_client.delete.assert_called_once_with(f"fetchlock:myremote:{expected_hash}")
def test_release_no_op_when_unavailable(self, unavailable_cache):
unavailable_cache.release_fetch_lock("myremote", "some/path") # must not raise
def test_release_no_op_on_redis_exception(self, cache_with_redis, mock_redis_client):
mock_redis_client.delete.side_effect = Exception("timeout")
cache_with_redis.release_fetch_lock("myremote", "some/path") # must not raise
+98 -40
View File
@@ -10,11 +10,11 @@ from artifactapi.config import ConfigManager
@pytest.fixture
def make_config(tmp_path):
"""Factory: write a remotes dict to a temp YAML and return a ConfigManager."""
"""Factory: write a remote dict to a temp YAML and return a ConfigManager."""
def _make(remotes_dict):
cfg_file = tmp_path / "remotes.yaml"
cfg_file.write_text(yaml.dump({"remotes": remotes_dict}))
cfg_file.write_text(yaml.dump({"remote": remotes_dict}))
return ConfigManager(str(cfg_file))
return _make
@@ -64,7 +64,6 @@ class TestGetMutablePatterns:
cfg = make_config(
{
"r": {
"type": "remote",
"package": "alpine",
"base_url": "https://x.com",
"mutable_patterns": [r"custom\.json$"],
@@ -81,7 +80,6 @@ class TestGetMutablePatterns:
cfg = make_config(
{
"r": {
"type": "remote",
"package": "alpine",
"base_url": "https://x.com",
"mutable_patterns": [],
@@ -95,7 +93,6 @@ class TestGetMutablePatterns:
cfg = make_config(
{
"r": {
"type": "remote",
"package": "alpine",
"base_url": "https://x.com",
"mutable_patterns": [existing],
@@ -109,7 +106,6 @@ class TestGetMutablePatterns:
cfg = make_config(
{
"r": {
"type": "remote",
"package": "generic",
"base_url": "https://x.com",
"mutable_patterns": [r"meta\.json$", r"index\.yaml$"],
@@ -122,7 +118,6 @@ class TestGetMutablePatterns:
cfg = make_config(
{
"r": {
"type": "remote",
"package": "rpm",
"base_url": "https://x.com",
"mutable_patterns": [r"custom-meta\.xml$"],
@@ -143,7 +138,6 @@ class TestGetMutablePatterns:
cfg = make_config(
{
"r": {
"type": "remote",
"package": "npm",
"base_url": "https://x.com",
"mutable_patterns": [r"^(?!.*\.tgz$).*"],
@@ -174,7 +168,6 @@ class TestGetMutablePatterns:
cfg = make_config(
{
"r": {
"type": "remote",
"package": "npm",
"base_url": "https://x.com",
"mutable_patterns": [r"^(?!.*\.tgz$).*"],
@@ -196,7 +189,6 @@ class TestGetImmutablePatterns:
cfg = make_config(
{
"r": {
"type": "remote",
"package": "generic",
"base_url": "https://x.com",
"immutable_patterns": [r".*\.tar\.gz$"],
@@ -218,7 +210,6 @@ class TestGetImmutablePatterns:
cfg = make_config(
{
"r": {
"type": "remote",
"package": "rpm",
"base_url": "https://x.com",
"immutable_patterns": patterns,
@@ -231,7 +222,6 @@ class TestGetImmutablePatterns:
cfg = make_config(
{
"r": {
"type": "remote",
"package": "generic",
"base_url": "https://x.com",
"immutable_patterns": [r".*\.tar\.gz$"],
@@ -247,7 +237,6 @@ class TestGetImmutablePatterns:
cfg = make_config(
{
"r": {
"type": "remote",
"package": "generic",
"base_url": "https://x.com",
"immutable_patterns": [r".*\.tar\.gz$"],
@@ -270,7 +259,6 @@ class TestGetUserMutablePatterns:
cfg = make_config(
{
"r": {
"type": "remote",
"package": "alpine",
"base_url": "https://x.com",
"mutable_patterns": [r"custom\.json$"],
@@ -303,7 +291,6 @@ class TestGetCacheConfig:
cfg = make_config(
{
"r": {
"type": "remote",
"package": "generic",
"base_url": "https://x.com",
"cache": {"immutable_ttl": 0, "mutable_ttl": 7200},
@@ -329,11 +316,11 @@ class TestGetCacheConfig:
class TestConfigReload:
def test_reloads_when_file_mtime_advances(self, tmp_path):
cfg_file = tmp_path / "remotes.yaml"
cfg_file.write_text(yaml.dump({"remotes": {"repo-a": {"package": "generic", "base_url": "https://x.com"}}}))
cfg_file.write_text(yaml.dump({"remote": {"repo-a": {"package": "generic", "base_url": "https://x.com"}}}))
cfg = ConfigManager(str(cfg_file))
assert "repo-a" in cfg.config["remotes"]
cfg_file.write_text(yaml.dump({"remotes": {"repo-b": {"package": "generic", "base_url": "https://y.com"}}}))
cfg_file.write_text(yaml.dump({"remote": {"repo-b": {"package": "generic", "base_url": "https://y.com"}}}))
future_mtime = cfg._last_modified + 1
os.utime(str(cfg_file), (future_mtime, future_mtime))
@@ -344,7 +331,7 @@ class TestConfigReload:
def test_no_reload_when_file_unchanged(self, tmp_path):
cfg_file = tmp_path / "remotes.yaml"
cfg_file.write_text(yaml.dump({"remotes": {"repo-a": {"package": "generic", "base_url": "https://x.com"}}}))
cfg_file.write_text(yaml.dump({"remote": {"repo-a": {"package": "generic", "base_url": "https://x.com"}}}))
cfg = ConfigManager(str(cfg_file))
# Call check_reload without touching the file — should not reload
@@ -375,7 +362,6 @@ class TestGetQuarantineConfig:
cfg = make_config(
{
"r": {
"type": "remote",
"package": "generic",
"base_url": "https://x.com",
"quarantine_new": True,
@@ -391,7 +377,6 @@ class TestGetQuarantineConfig:
cfg = make_config(
{
"r": {
"type": "remote",
"package": "generic",
"base_url": "https://x.com",
"quarantine_new": False,
@@ -407,7 +392,6 @@ class TestGetQuarantineConfig:
cfg = make_config(
{
"r": {
"type": "remote",
"package": "generic",
"base_url": "https://x.com",
"quarantine_new": True,
@@ -431,36 +415,36 @@ def _remote(base_url: str = "https://x.com") -> dict:
class TestConfigDirMode:
def test_loads_all_yaml_files(self, tmp_path):
(tmp_path / "a.yaml").write_text(yaml.dump({"remotes": {"repo-a": _remote()}}))
(tmp_path / "b.yaml").write_text(yaml.dump({"remotes": {"repo-b": _remote("https://y.com")}}))
(tmp_path / "a.yaml").write_text(yaml.dump({"remote": {"repo-a": _remote()}}))
(tmp_path / "b.yaml").write_text(yaml.dump({"remote": {"repo-b": _remote("https://y.com")}}))
cfg = ConfigManager(str(tmp_path))
assert "repo-a" in cfg.config["remotes"]
assert "repo-b" in cfg.config["remotes"]
def test_later_file_overrides_earlier_on_same_key(self, tmp_path):
(tmp_path / "a.yaml").write_text(yaml.dump({"remotes": {"r": _remote("https://first.com")}}))
(tmp_path / "b.yaml").write_text(yaml.dump({"remotes": {"r": _remote("https://second.com")}}))
(tmp_path / "a.yaml").write_text(yaml.dump({"remote": {"r": _remote("https://first.com")}}))
(tmp_path / "b.yaml").write_text(yaml.dump({"remote": {"r": _remote("https://second.com")}}))
cfg = ConfigManager(str(tmp_path))
assert cfg.config["remotes"]["r"]["base_url"] == "https://second.com"
def test_empty_directory_returns_empty_remotes(self, tmp_path):
cfg = ConfigManager(str(tmp_path))
assert cfg.config == {"remotes": {}, "virtuals": {}, "locals": {}}
assert cfg.config == {"remotes": {}}
def test_ignores_non_yaml_files(self, tmp_path):
(tmp_path / "notes.txt").write_text("not yaml")
(tmp_path / "a.yaml").write_text(yaml.dump({"remotes": {"repo-a": _remote()}}))
(tmp_path / "a.yaml").write_text(yaml.dump({"remote": {"repo-a": _remote()}}))
cfg = ConfigManager(str(tmp_path))
assert list(cfg.config["remotes"].keys()) == ["repo-a"]
def test_reload_picks_up_new_file(self, tmp_path):
(tmp_path / "a.yaml").write_text(yaml.dump({"remotes": {"repo-a": _remote()}}))
(tmp_path / "a.yaml").write_text(yaml.dump({"remote": {"repo-a": _remote()}}))
cfg = ConfigManager(str(tmp_path))
assert "repo-a" in cfg.config["remotes"]
assert "repo-b" not in cfg.config["remotes"]
new_file = tmp_path / "b.yaml"
new_file.write_text(yaml.dump({"remotes": {"repo-b": _remote("https://y.com")}}))
new_file.write_text(yaml.dump({"remote": {"repo-b": _remote("https://y.com")}}))
future_mtime = cfg._last_modified + 1
os.utime(str(new_file), (future_mtime, future_mtime))
@@ -479,9 +463,9 @@ class TestConfigDirKey:
def test_merges_remotes_from_config_dir(self, tmp_path):
conf_d = tmp_path / "conf.d"
conf_d.mkdir()
(conf_d / "remotes.yaml").write_text(yaml.dump({"remotes": {"repo-extra": _remote("https://extra.com")}}))
(conf_d / "remotes.yaml").write_text(yaml.dump({"remote": {"repo-extra": _remote("https://extra.com")}}))
main = tmp_path / "config.yaml"
main.write_text(yaml.dump({"config_dir": str(conf_d), "remotes": {"repo-main": _remote()}}))
main.write_text(yaml.dump({"config_dir": str(conf_d), "remote": {"repo-main": _remote()}}))
cfg = ConfigManager(str(main))
assert "repo-main" in cfg.config["remotes"]
assert "repo-extra" in cfg.config["remotes"]
@@ -489,9 +473,9 @@ class TestConfigDirKey:
def test_relative_config_dir_resolved_from_main_file(self, tmp_path):
conf_d = tmp_path / "conf.d"
conf_d.mkdir()
(conf_d / "r.yaml").write_text(yaml.dump({"remotes": {"repo-a": _remote()}}))
(conf_d / "r.yaml").write_text(yaml.dump({"remote": {"repo-a": _remote()}}))
main = tmp_path / "config.yaml"
main.write_text(yaml.dump({"config_dir": "conf.d", "remotes": {}}))
main.write_text(yaml.dump({"config_dir": "conf.d"}))
cfg = ConfigManager(str(main))
assert "repo-a" in cfg.config["remotes"]
@@ -499,16 +483,16 @@ class TestConfigDirKey:
conf_d = tmp_path / "conf.d"
conf_d.mkdir()
main = tmp_path / "config.yaml"
main.write_text(yaml.dump({"config_dir": str(conf_d), "remotes": {}}))
main.write_text(yaml.dump({"config_dir": str(conf_d), "remote": {}}))
cfg = ConfigManager(str(main))
assert "config_dir" not in cfg.config
def test_dir_remote_overrides_main_file_remote(self, tmp_path):
conf_d = tmp_path / "conf.d"
conf_d.mkdir()
(conf_d / "override.yaml").write_text(yaml.dump({"remotes": {"r": _remote("https://new.com")}}))
(conf_d / "override.yaml").write_text(yaml.dump({"remote": {"r": _remote("https://new.com")}}))
main = tmp_path / "config.yaml"
main.write_text(yaml.dump({"config_dir": str(conf_d), "remotes": {"r": _remote("https://old.com")}}))
main.write_text(yaml.dump({"config_dir": str(conf_d), "remote": {"r": _remote("https://old.com")}}))
cfg = ConfigManager(str(main))
assert cfg.config["remotes"]["r"]["base_url"] == "https://new.com"
@@ -516,7 +500,7 @@ class TestConfigDirKey:
conf_d = tmp_path / "conf.d"
conf_d.mkdir()
main = tmp_path / "config.yaml"
main.write_text(yaml.dump({"config_dir": str(conf_d), "remotes": {"repo-main": _remote()}}))
main.write_text(yaml.dump({"config_dir": str(conf_d), "remote": {"repo-main": _remote()}}))
cfg = ConfigManager(str(main))
assert list(cfg.config["remotes"].keys()) == ["repo-main"]
@@ -524,13 +508,13 @@ class TestConfigDirKey:
conf_d = tmp_path / "conf.d"
conf_d.mkdir()
dir_file = conf_d / "r.yaml"
dir_file.write_text(yaml.dump({"remotes": {"repo-v1": _remote()}}))
dir_file.write_text(yaml.dump({"remote": {"repo-v1": _remote()}}))
main = tmp_path / "config.yaml"
main.write_text(yaml.dump({"config_dir": str(conf_d), "remotes": {}}))
main.write_text(yaml.dump({"config_dir": str(conf_d), "remote": {}}))
cfg = ConfigManager(str(main))
assert "repo-v1" in cfg.config["remotes"]
dir_file.write_text(yaml.dump({"remotes": {"repo-v2": _remote("https://v2.com")}}))
dir_file.write_text(yaml.dump({"remote": {"repo-v2": _remote("https://v2.com")}}))
future_mtime = cfg._last_modified + 1
os.utime(str(dir_file), (future_mtime, future_mtime))
@@ -538,3 +522,77 @@ class TestConfigDirKey:
assert "repo-v2" in cfg.config["remotes"]
assert "repo-v1" not in cfg.config["remotes"]
# ---------------------------------------------------------------------------
# YAML format normalisation — top-level type keys
# ---------------------------------------------------------------------------
class TestYamlTypeKeys:
def test_remote_key_injects_type_remote(self, tmp_path):
f = tmp_path / "r.yaml"
f.write_text(yaml.dump({"remote": {"my-remote": {"package": "generic", "base_url": "https://x.com"}}}))
cfg = ConfigManager(str(f))
assert cfg.config["remotes"]["my-remote"]["type"] == "remote"
def test_virtual_key_injects_type_virtual(self, tmp_path):
f = tmp_path / "r.yaml"
f.write_text(yaml.dump({"virtual": {"my-virtual": {"package": "helm", "members": ["a", "b"]}}}))
cfg = ConfigManager(str(f))
assert cfg.config["remotes"]["my-virtual"]["type"] == "virtual"
assert cfg.config["remotes"]["my-virtual"]["members"] == ["a", "b"]
def test_local_key_injects_type_local(self, tmp_path):
f = tmp_path / "r.yaml"
f.write_text(yaml.dump({"local": {"my-local": {"package": "generic"}}}))
cfg = ConfigManager(str(f))
assert cfg.config["remotes"]["my-local"]["type"] == "local"
def test_mixed_file_all_three_types(self, tmp_path):
f = tmp_path / "r.yaml"
f.write_text(
yaml.dump(
{
"remote": {"r": {"package": "helm", "base_url": "https://helm.example.com"}},
"virtual": {"v": {"package": "helm", "members": ["r"]}},
"local": {"l": {"package": "generic"}},
}
)
)
cfg = ConfigManager(str(f))
assert cfg.config["remotes"]["r"]["type"] == "remote"
assert cfg.config["remotes"]["v"]["type"] == "virtual"
assert cfg.config["remotes"]["l"]["type"] == "local"
def test_type_field_not_required_in_yaml(self, tmp_path):
f = tmp_path / "r.yaml"
f.write_text(yaml.dump({"remote": {"r": {"package": "alpine", "base_url": "https://x.com"}}}))
cfg = ConfigManager(str(f))
raw = cfg.config["remotes"]["r"]
# type is injected by the loader; the original dict had no type key
assert "type" in raw
assert raw["type"] == "remote"
def test_other_fields_preserved_after_normalisation(self, tmp_path):
f = tmp_path / "r.yaml"
f.write_text(
yaml.dump(
{
"remote": {
"r": {
"package": "helm",
"base_url": "https://helm.example.com",
"immutable_patterns": [r"\.tgz$"],
"cache": {"immutable_ttl": 0, "mutable_ttl": 1800},
}
}
}
)
)
cfg = ConfigManager(str(f))
remote = cfg.config["remotes"]["r"]
assert remote["package"] == "helm"
assert remote["base_url"] == "https://helm.example.com"
assert remote["cache"] == {"immutable_ttl": 0, "mutable_ttl": 1800}
assert r"\.tgz$" in remote["immutable_patterns"]
+26 -216
View File
@@ -260,211 +260,6 @@ class TestDockerProxy:
mock_fetch.assert_called_once()
assert response.status_code == 200
# --- Issue 1: sha256 digest cross-linking ---
def test_tag_manifest_is_stored_under_digest_key_on_cache_hit(self, client, patched_deps):
# When serving a cached tag manifest the handler must also write the content
# under the sha256 digest key so subsequent sha256-addressed pulls hit cache.
deps = patched_deps
manifest = json.dumps({"mediaType": "application/vnd.oci.image.manifest.v1+json", "layers": []}).encode()
# First exists call (tag manifest): hit. Second (digest key): miss → triggers upload.
deps["storage"].exists.side_effect = [True, False]
deps["storage"].download_object.return_value = manifest
deps["cache"].is_mutable_file.return_value = True
deps["cache"].is_index_valid.return_value = True
response = client.get("/v2/docker-test/library/nginx/manifests/v1.25.3")
assert response.status_code == 200
deps["storage"].upload.assert_called_once_with(deps["storage"].get_object_key.return_value, manifest)
def test_tag_manifest_digest_key_not_written_when_already_exists(self, client, patched_deps):
# When the digest key already exists in storage upload must not be called.
deps = patched_deps
manifest = json.dumps({"mediaType": "application/vnd.oci.image.manifest.v1+json", "layers": []}).encode()
# Both the tag key and the digest key already present.
deps["storage"].exists.return_value = True
deps["storage"].download_object.return_value = manifest
deps["cache"].is_mutable_file.return_value = True
deps["cache"].is_index_valid.return_value = True
client.get("/v2/docker-test/library/nginx/manifests/v1.25.3")
deps["storage"].upload.assert_not_called()
def test_sha256_manifest_request_is_not_cross_linked(self, client, patched_deps):
# sha256-addressed manifests are immutable — the cross-link logic must not apply.
deps = patched_deps
manifest = json.dumps({"mediaType": "application/vnd.oci.image.manifest.v1+json", "layers": []}).encode()
deps["storage"].exists.return_value = True
deps["storage"].download_object.return_value = manifest
deps["cache"].is_mutable_file.return_value = False # sha256 manifest is immutable
with patch("artifactapi.artifact.proxy._fetch_last_modified", new_callable=AsyncMock, return_value=None):
client.get("/v2/docker-test/library/nginx/manifests/sha256:" + "a" * 64)
deps["storage"].upload.assert_not_called()
# --- Issue 2: thundering herd distributed lock ---
def test_lock_acquired_and_released_on_upstream_fetch(self, client, patched_deps):
deps = patched_deps
manifest = json.dumps({"mediaType": "application/vnd.oci.image.manifest.v1+json", "layers": []}).encode()
deps["storage"].exists.side_effect = [False, False] # initial miss; digest key also absent
deps["storage"].download_object.return_value = manifest
deps["cache"].is_mutable_file.return_value = True
deps["cache"].acquire_fetch_lock.return_value = True
with patch(
"artifactapi.artifact.proxy.cache_single_artifact",
new_callable=AsyncMock,
return_value={"status": "cached"},
):
response = client.get("/v2/docker-test/library/nginx/manifests/latest")
deps["cache"].acquire_fetch_lock.assert_called_once()
deps["cache"].release_fetch_lock.assert_called_once()
assert response.status_code == 200
def test_lock_released_even_when_fetch_returns_error(self, client, patched_deps):
deps = patched_deps
deps["storage"].exists.return_value = False
deps["cache"].is_mutable_file.return_value = True
deps["cache"].acquire_fetch_lock.return_value = True
with patch(
"artifactapi.artifact.proxy.cache_single_artifact",
new_callable=AsyncMock,
return_value={"status": "error", "error": "upstream down"},
):
response = client.get("/v2/docker-test/library/nginx/manifests/latest")
deps["cache"].release_fetch_lock.assert_called_once()
assert response.status_code == 502
def test_thundering_herd_polls_storage_when_lock_not_acquired(self, client, patched_deps):
# When the lock is held by another pod the handler must poll storage and serve
# from cache once the competing fetch completes, without issuing its own upstream request.
deps = patched_deps
manifest = json.dumps({"mediaType": "application/vnd.oci.image.manifest.v1+json", "layers": []}).encode()
# Initial cache check: miss. First poll iteration: another pod has written it.
# Third call is for the digest cross-link check (is_mutable=True path); digest key exists.
deps["storage"].exists.side_effect = [False, True, True]
deps["storage"].download_object.return_value = manifest
deps["cache"].is_mutable_file.return_value = True
deps["cache"].is_index_valid.return_value = True
deps["cache"].acquire_fetch_lock.return_value = False # lock held by peer
with patch("artifactapi.artifact.docker.asyncio.sleep", new_callable=AsyncMock):
with patch(
"artifactapi.artifact.proxy.cache_single_artifact",
new_callable=AsyncMock,
) as mock_fetch:
response = client.get("/v2/docker-test/library/nginx/manifests/latest")
mock_fetch.assert_not_called()
assert response.status_code == 200
def test_thundering_herd_falls_through_to_fetch_if_poll_times_out(self, client, patched_deps):
# If the item never appears in storage during the poll window the handler must
# still issue its own upstream fetch as a fallback.
deps = patched_deps
manifest = json.dumps({"mediaType": "application/vnd.oci.image.manifest.v1+json", "layers": []}).encode()
# All exists calls return False — item never appears during polling.
deps["storage"].exists.return_value = False
deps["storage"].download_object.return_value = manifest
deps["cache"].is_mutable_file.return_value = True
deps["cache"].acquire_fetch_lock.return_value = False # lock held by peer
with patch("artifactapi.artifact.docker.asyncio.sleep", new_callable=AsyncMock):
with patch(
"artifactapi.artifact.proxy.cache_single_artifact",
new_callable=AsyncMock,
return_value={"status": "cached"},
) as mock_fetch:
response = client.get("/v2/docker-test/library/nginx/manifests/latest")
mock_fetch.assert_called_once()
assert response.status_code == 200
# ---------------------------------------------------------------------------
# Docker ban_tags feature
# ---------------------------------------------------------------------------
class TestDockerBanTags:
def test_banned_tag_returns_403(self, client, patched_deps):
response = client.get("/v2/docker-bantags-test/library/nginx/manifests/latest")
assert response.status_code == 403
assert "latest" in response.json()["detail"]
def test_second_banned_tag_returns_403(self, client, patched_deps):
response = client.get("/v2/docker-bantags-test/library/nginx/manifests/edge")
assert response.status_code == 403
assert "edge" in response.json()["detail"]
def test_allowed_tag_proceeds(self, client, patched_deps):
deps = patched_deps
manifest = json.dumps({"mediaType": "application/vnd.oci.image.manifest.v1+json", "layers": []}).encode()
deps["storage"].exists.return_value = True
deps["storage"].download_object.return_value = manifest
deps["cache"].is_mutable_file.return_value = True
deps["cache"].is_index_valid.return_value = True
response = client.get("/v2/docker-bantags-test/library/nginx/manifests/1.25.3")
assert response.status_code == 200
def test_digest_pull_bypasses_ban(self, client, patched_deps):
# sha256-addressed pulls must never be blocked by the tag ban list
deps = patched_deps
manifest = json.dumps({"mediaType": "application/vnd.oci.image.manifest.v1+json", "layers": []}).encode()
deps["storage"].exists.return_value = True
deps["storage"].download_object.return_value = manifest
deps["cache"].is_mutable_file.return_value = False
digest = "sha256:" + "a" * 64
with patch("artifactapi.artifact.proxy._fetch_last_modified", new_callable=AsyncMock, return_value=None):
response = client.get(f"/v2/docker-bantags-test/library/nginx/manifests/{digest}")
assert response.status_code == 200
def test_ban_tags_disabled_by_default(self, client, patched_deps):
# docker-test has no ban_tags_enabled — "latest" must pass through
deps = patched_deps
manifest = json.dumps({"mediaType": "application/vnd.oci.image.manifest.v1+json", "layers": []}).encode()
deps["storage"].exists.return_value = True
deps["storage"].download_object.return_value = manifest
deps["cache"].is_mutable_file.return_value = True
deps["cache"].is_index_valid.return_value = True
response = client.get("/v2/docker-test/library/nginx/manifests/latest")
assert response.status_code == 200
def test_ban_tags_enabled_but_empty_list_allows_all(self, client, patched_deps):
# If ban_tags_enabled is true but ban_tags is empty nothing should be blocked.
# docker-test doesn't have ban_tags_enabled, but we can verify via the
# docker-bantags-test remote with an unlisted tag.
deps = patched_deps
manifest = json.dumps({"mediaType": "application/vnd.oci.image.manifest.v1+json", "layers": []}).encode()
deps["storage"].exists.return_value = True
deps["storage"].download_object.return_value = manifest
deps["cache"].is_mutable_file.return_value = True
deps["cache"].is_index_valid.return_value = True
response = client.get("/v2/docker-bantags-test/library/nginx/manifests/stable")
assert response.status_code == 200
def test_ban_check_does_not_apply_to_blobs(self, client, patched_deps):
# Blob paths don't contain /manifests/ — the ban check must not interfere
deps = patched_deps
deps["storage"].exists.return_value = True
deps["storage"].download_object.return_value = b"\x00" * 100
deps["cache"].is_mutable_file.return_value = False
with patch("artifactapi.artifact.proxy._fetch_last_modified", new_callable=AsyncMock, return_value=None):
response = client.get("/v2/docker-bantags-test/library/nginx/blobs/sha256:" + "b" * 64)
assert response.status_code == 200
# ---------------------------------------------------------------------------
# Generic artifact route /api/v1/remote/{remote}/{path}
@@ -728,53 +523,68 @@ class TestGenericArtifactRoute:
deps["database"].get_local_file_metadata.return_value = None
deps["database"].available = True
response = client.get("/api/v1/local/local-test/path/to/nonexistent.bin")
response = client.get("/api/v1/remote/local-test/path/to/nonexistent.bin")
assert response.status_code == 404
# ---------------------------------------------------------------------------
# Upload route PUT /api/v1/local/{local}/{path}
# Upload route PUT /api/v1/remote/{remote}/{path}
# ---------------------------------------------------------------------------
class TestUploadRoute:
def test_unknown_local_returns_404(self, client, patched_deps):
def test_unknown_remote_returns_404(self, client, patched_deps):
response = client.put(
"/api/v1/local/nonexistent/path/to/file.tar.gz",
"/api/v1/remote/nonexistent/path/to/file.tar.gz",
files={"file": ("file.tar.gz", b"content", "application/octet-stream")},
)
assert response.status_code == 404
def test_non_local_remote_returns_400(self, client, patched_deps):
response = client.put(
"/api/v1/remote/generic-test/path/to/file.tar.gz",
files={"file": ("file.tar.gz", b"content", "application/octet-stream")},
)
assert response.status_code == 400
# ---------------------------------------------------------------------------
# HEAD route HEAD /api/v1/local/{local}/{path}
# HEAD route HEAD /api/v1/remote/{remote}/{path}
# ---------------------------------------------------------------------------
class TestHeadRoute:
def test_non_local_remote_returns_405(self, client, patched_deps):
response = client.head("/api/v1/remote/generic-test/path/to/file.tar.gz")
assert response.status_code == 405
def test_local_repo_file_not_found_returns_404(self, client, patched_deps):
deps = patched_deps
deps["database"].get_local_file_metadata.return_value = None
deps["database"].available = True
response = client.head("/api/v1/local/local-test/path/to/nonexistent.bin")
response = client.head("/api/v1/remote/local-test/path/to/nonexistent.bin")
assert response.status_code == 404
def test_unknown_local_returns_404(self, client, patched_deps):
response = client.head("/api/v1/local/nonexistent/path/to/file.bin")
def test_unknown_remote_returns_404(self, client, patched_deps):
response = client.head("/api/v1/remote/nonexistent/path/to/file.bin")
assert response.status_code == 404
# ---------------------------------------------------------------------------
# DELETE route DELETE /api/v1/local/{local}/{path}
# DELETE route DELETE /api/v1/remote/{remote}/{path}
# ---------------------------------------------------------------------------
class TestDeleteRoute:
def test_unknown_local_returns_404(self, client, patched_deps):
response = client.delete("/api/v1/local/nonexistent/path/to/file.tar.gz")
def test_unknown_remote_returns_404(self, client, patched_deps):
response = client.delete("/api/v1/remote/nonexistent/path/to/file.tar.gz")
assert response.status_code == 404
def test_non_local_remote_returns_400(self, client, patched_deps):
response = client.delete("/api/v1/remote/generic-test/path/to/file.tar.gz")
assert response.status_code == 400
# ---------------------------------------------------------------------------
# Cache flush PUT /cache/flush
+59 -293
View File
@@ -8,15 +8,11 @@ import yaml
from artifactapi.artifact.virtual import (
_HANDLERS,
_entries_to_msgpack_safe,
_get_member_index,
_HelmDumper,
_HelmHandler,
_merge_helm_indexes,
_rewrite_urls,
_VirtualHandler,
_YamlDumperBase,
_YamlLoader,
)
# ---------------------------------------------------------------------------
@@ -70,47 +66,12 @@ entries:
generated: "2023-01-01T00:00:00.000Z"
"""
_INDEX_RELATIVE = b"""\
apiVersion: v1
entries:
rancher:
- name: rancher
version: "2.13.1"
urls:
- rancher-2.13.1.tgz
generated: "2023-01-01T00:00:00.000Z"
"""
_CFG_A = {"base_url": "https://helm.releases.hashicorp.com", "cache": {"mutable_ttl": 3600}}
_CFG_B = {"base_url": "https://charts.example.com", "cache": {"mutable_ttl": 1800}}
# ---------------------------------------------------------------------------
# _YamlLoader / _YamlDumperBase — C extension selection
# ---------------------------------------------------------------------------
class TestYamlExtensionSelection:
def test_loader_is_a_class(self):
assert isinstance(_YamlLoader, type)
def test_dumper_base_is_a_class(self):
assert isinstance(_YamlDumperBase, type)
def test_helm_dumper_uses_selected_base(self):
assert issubclass(_HelmDumper, _YamlDumperBase)
def test_c_extensions_used_when_available(self):
try:
assert _YamlLoader is yaml.CSafeLoader
assert _YamlDumperBase is yaml.CDumper
except AttributeError:
assert _YamlLoader is yaml.SafeLoader
assert _YamlDumperBase is yaml.Dumper
def test_loader_can_parse_yaml(self):
result = yaml.load(b"key: value", Loader=_YamlLoader)
assert result == {"key": "value"}
def _identity_resolve(data, *args, **kwargs):
return data, None
# ---------------------------------------------------------------------------
@@ -174,13 +135,14 @@ class TestHelmHandler:
assert isinstance(msg, str) and len(msg) > 0
def test_merge_returns_bytes(self):
result = self.handler.merge([_INDEX_A], [None], ["member-a"], [_CFG_A], "http://proxy.example.com")
with patch("artifactapi.artifact.virtual._helm.resolve_content", side_effect=_identity_resolve):
result = self.handler.merge([_INDEX_A], ["member-a"], [_CFG_A], "http://proxy.example.com")
assert isinstance(result, bytes)
def test_merge_delegates_to_merge_helm_indexes(self):
with patch("artifactapi.artifact.virtual._merge_helm_indexes", return_value=b"merged") as mock_fn:
result = self.handler.merge([b"data"], [None], ["m"], [{}], "http://proxy")
mock_fn.assert_called_once_with([b"data"], [None], ["m"], [{}], "http://proxy")
result = self.handler.merge([b"data"], ["m"], [{}], "http://proxy")
mock_fn.assert_called_once_with([b"data"], ["m"], [{}], "http://proxy")
assert result == b"merged"
@@ -198,41 +160,6 @@ class TestHandlersRegistry:
assert isinstance(_HANDLERS["helm"], _VirtualHandler)
# ---------------------------------------------------------------------------
# _rewrite_urls
# ---------------------------------------------------------------------------
class TestRewriteUrls:
def _rewrite(self, urls, base_url="https://upstream.example.com", proxy_base="http://proxy.example.com", member_name="my-remote"):
return _rewrite_urls(urls, base_url, proxy_base, member_name)
def test_absolute_url_matching_base_is_rewritten(self):
result = self._rewrite(["https://upstream.example.com/chart-1.0.0.tgz"])
assert result == ["http://proxy.example.com/api/v1/remote/my-remote/chart-1.0.0.tgz"]
def test_relative_url_is_prepended_with_proxy_remote(self):
result = self._rewrite(["chart-1.0.0.tgz"])
assert result == ["http://proxy.example.com/api/v1/remote/my-remote/chart-1.0.0.tgz"]
def test_relative_url_with_leading_slash(self):
result = self._rewrite(["/chart-1.0.0.tgz"])
assert result == ["http://proxy.example.com/api/v1/remote/my-remote/chart-1.0.0.tgz"]
def test_absolute_url_not_matching_base_is_unchanged(self):
result = self._rewrite(["https://other.example.com/chart-1.0.0.tgz"])
assert result == ["https://other.example.com/chart-1.0.0.tgz"]
def test_empty_url_list_returns_empty(self):
assert self._rewrite([]) == []
def test_multiple_urls_all_rewritten(self):
urls = ["https://upstream.example.com/a-1.0.0.tgz", "b-2.0.0.tgz"]
result = self._rewrite(urls)
assert result[0] == "http://proxy.example.com/api/v1/remote/my-remote/a-1.0.0.tgz"
assert result[1] == "http://proxy.example.com/api/v1/remote/my-remote/b-2.0.0.tgz"
# ---------------------------------------------------------------------------
# _merge_helm_indexes
# ---------------------------------------------------------------------------
@@ -240,7 +167,8 @@ class TestRewriteUrls:
class TestMergeHelmIndexes:
def _merge(self, raw_indexes, member_names, member_configs, proxy_base="http://proxy.example.com"):
return _merge_helm_indexes(raw_indexes, [None] * len(raw_indexes), member_names, member_configs, proxy_base)
with patch("artifactapi.artifact.virtual._helm.resolve_content", side_effect=_identity_resolve):
return _merge_helm_indexes(raw_indexes, member_names, member_configs, proxy_base)
def _parse(self, raw):
return yaml.safe_load(raw)
@@ -259,18 +187,7 @@ class TestMergeHelmIndexes:
def test_first_member_wins_on_duplicate_name_and_version(self):
index = self._parse(self._merge([_INDEX_A, _INDEX_B], ["member-a", "member-b"], [_CFG_A, _CFG_B]))
v027 = next(e for e in index["entries"]["vault"] if e["version"] == "0.27.0")
assert "member-a" in v027["urls"][0]
def test_absolute_urls_rewritten_to_proxy(self):
index = self._parse(self._merge([_INDEX_A], ["member-a"], [_CFG_A]))
url = index["entries"]["vault"][0]["urls"][0]
assert url == "http://proxy.example.com/api/v1/remote/member-a/vault-0.27.0.tgz"
def test_relative_urls_rewritten_to_proxy(self):
cfg = {"base_url": "https://releases.rancher.com/server-charts/stable", "cache": {"mutable_ttl": 3600}}
index = self._parse(self._merge([_INDEX_RELATIVE], ["rancher-stable"], [cfg]))
url = index["entries"]["rancher"][0]["urls"][0]
assert url == "http://proxy.example.com/api/v1/remote/rancher-stable/rancher-2.13.1.tgz"
assert "helm.releases.hashicorp.com" in v027["urls"][0]
def test_different_versions_of_same_chart_both_included(self):
index = self._parse(self._merge([_INDEX_A, _INDEX_B], ["member-a", "member-b"], [_CFG_A, _CFG_B]))
@@ -343,7 +260,7 @@ class TestGetMemberIndex:
storage.exists.return_value = True
cache.is_index_valid.return_value = True
_, _, _, raw_data, _ = await _get_member_index("m", member_cfg, "index.yaml", storage, cache)
_, _, _, raw_data = await _get_member_index("m", member_cfg, "index.yaml", storage, cache)
assert raw_data == b"cached bytes"
@@ -366,7 +283,7 @@ class TestGetMemberIndex:
mock_cls.return_value.__aenter__.return_value = mock_client
mock_client.get.return_value = self._fake_response(b"fresh bytes")
_, _, _, raw_data, _ = await _get_member_index("m", member_cfg, "index.yaml", storage, cache)
_, _, _, raw_data = await _get_member_index("m", member_cfg, "index.yaml", storage, cache)
assert raw_data == b"fresh bytes"
@@ -376,7 +293,7 @@ class TestGetMemberIndex:
mock_cls.return_value.__aenter__.return_value = mock_client
mock_client.get.return_value = self._fake_response()
_, _, _, raw_data, _ = await _get_member_index("m", member_cfg, "index.yaml", storage, cache)
_, _, _, raw_data = await _get_member_index("m", member_cfg, "index.yaml", storage, cache)
assert raw_data == b"upstream bytes"
@@ -435,7 +352,7 @@ class TestGetMemberIndex:
mock_cls.return_value.__aenter__.return_value = mock_client
mock_client.get.side_effect = Exception("connection refused")
_, _, _, raw_data, _ = await _get_member_index("m", member_cfg, "index.yaml", storage, cache)
_, _, _, raw_data = await _get_member_index("m", member_cfg, "index.yaml", storage, cache)
assert raw_data is None
@@ -447,7 +364,7 @@ class TestGetMemberIndex:
mock_cls.return_value.__aenter__.return_value = mock_client
mock_client.get.return_value = self._fake_response()
_, _, _, raw_data, _ = await _get_member_index("m", member_cfg, "index.yaml", storage, cache)
_, _, _, raw_data = await _get_member_index("m", member_cfg, "index.yaml", storage, cache)
assert raw_data == b"upstream bytes"
@@ -458,7 +375,7 @@ class TestGetMemberIndex:
mock_cls.return_value.__aenter__.return_value = mock_client
mock_client.get.return_value = self._fake_response()
_, _, ttl, _, _ = await _get_member_index("m", cfg, "index.yaml", storage, cache)
_, _, ttl, _ = await _get_member_index("m", cfg, "index.yaml", storage, cache)
assert ttl == 900
@@ -469,7 +386,7 @@ class TestGetMemberIndex:
mock_cls.return_value.__aenter__.return_value = mock_client
mock_client.get.return_value = self._fake_response()
_, _, ttl, _, _ = await _get_member_index("m", cfg, "index.yaml", storage, cache)
_, _, ttl, _ = await _get_member_index("m", cfg, "index.yaml", storage, cache)
assert ttl == 3600
@@ -513,10 +430,10 @@ class TestVirtualRoute:
response = client.get("/api/v1/virtual/no-such-virtual/index.yaml")
assert response.status_code == 404
def test_non_virtual_name_returns_404(self, client, patched_virtual_deps):
# helm-test is in remotes, not virtuals
def test_non_virtual_type_returns_400(self, client, patched_virtual_deps):
# helm-test is type "remote", not "virtual"
response = client.get("/api/v1/virtual/helm-test/index.yaml")
assert response.status_code == 404
assert response.status_code == 400
def test_unsupported_package_returns_400(self, client, patched_virtual_deps):
# unsupported-virtual-test has package "rpm"
@@ -567,16 +484,22 @@ class TestVirtualRoute:
mock_get.assert_not_called()
def test_cache_miss_returns_200_with_yaml_content_type(self, client, patched_virtual_deps):
with patch("artifactapi.artifact.virtual._get_member_index", new_callable=AsyncMock) as mock_get:
mock_get.return_value = ("helm-test", _CFG_A, 3600, _INDEX_SIMPLE, None)
with (
patch("artifactapi.artifact.virtual._get_member_index", new_callable=AsyncMock) as mock_get,
patch("artifactapi.artifact.virtual._helm.resolve_content", side_effect=_identity_resolve),
):
mock_get.return_value = ("helm-test", _CFG_A, 3600, _INDEX_SIMPLE)
response = client.get("/api/v1/virtual/helm-virtual-test/index.yaml")
assert response.status_code == 200
assert "text/yaml" in response.headers["content-type"]
def test_cache_miss_response_contains_merged_entries(self, client, patched_virtual_deps):
with patch("artifactapi.artifact.virtual._get_member_index", new_callable=AsyncMock) as mock_get:
mock_get.return_value = ("helm-test", _CFG_A, 3600, _INDEX_SIMPLE, None)
with (
patch("artifactapi.artifact.virtual._get_member_index", new_callable=AsyncMock) as mock_get,
patch("artifactapi.artifact.virtual._helm.resolve_content", side_effect=_identity_resolve),
):
mock_get.return_value = ("helm-test", _CFG_A, 3600, _INDEX_SIMPLE)
response = client.get("/api/v1/virtual/helm-virtual-test/index.yaml")
index = yaml.safe_load(response.content)
@@ -584,26 +507,35 @@ class TestVirtualRoute:
def test_cache_miss_stores_result_in_s3(self, client, patched_virtual_deps):
deps = patched_virtual_deps
with patch("artifactapi.artifact.virtual._get_member_index", new_callable=AsyncMock) as mock_get:
mock_get.return_value = ("helm-test", _CFG_A, 3600, _INDEX_SIMPLE, None)
with (
patch("artifactapi.artifact.virtual._get_member_index", new_callable=AsyncMock) as mock_get,
patch("artifactapi.artifact.virtual._helm.resolve_content", side_effect=_identity_resolve),
):
mock_get.return_value = ("helm-test", _CFG_A, 3600, _INDEX_SIMPLE)
client.get("/api/v1/virtual/helm-virtual-test/index.yaml")
deps["storage"].upload.assert_called_once()
def test_cache_miss_marks_index_cached(self, client, patched_virtual_deps):
deps = patched_virtual_deps
with patch("artifactapi.artifact.virtual._get_member_index", new_callable=AsyncMock) as mock_get:
mock_get.return_value = ("helm-test", _CFG_A, 3600, _INDEX_SIMPLE, None)
with (
patch("artifactapi.artifact.virtual._get_member_index", new_callable=AsyncMock) as mock_get,
patch("artifactapi.artifact.virtual._helm.resolve_content", side_effect=_identity_resolve),
):
mock_get.return_value = ("helm-test", _CFG_A, 3600, _INDEX_SIMPLE)
client.get("/api/v1/virtual/helm-virtual-test/index.yaml")
deps["cache"].mark_index_cached.assert_called_once()
def test_cache_miss_uses_min_ttl_across_members(self, client, patched_virtual_deps):
deps = patched_virtual_deps
with patch("artifactapi.artifact.virtual._get_member_index", new_callable=AsyncMock) as mock_get:
with (
patch("artifactapi.artifact.virtual._get_member_index", new_callable=AsyncMock) as mock_get,
patch("artifactapi.artifact.virtual._helm.resolve_content", side_effect=_identity_resolve),
):
mock_get.side_effect = [
("helm-test", _CFG_A, 3600, _INDEX_SIMPLE, None),
("helm-member-2", _CFG_B, 1800, _INDEX_SIMPLE, None),
("helm-test", _CFG_A, 3600, _INDEX_SIMPLE),
("helm-member-2", _CFG_B, 1800, _INDEX_SIMPLE),
]
client.get("/api/v1/virtual/helm-virtual-test/index.yaml")
@@ -612,16 +544,19 @@ class TestVirtualRoute:
def test_all_members_unreachable_returns_502(self, client, patched_virtual_deps):
with patch("artifactapi.artifact.virtual._get_member_index", new_callable=AsyncMock) as mock_get:
mock_get.return_value = ("helm-test", _CFG_A, 3600, None, None)
mock_get.return_value = ("helm-test", _CFG_A, 3600, None)
response = client.get("/api/v1/virtual/helm-virtual-test/index.yaml")
assert response.status_code == 502
def test_one_member_unreachable_still_returns_200(self, client, patched_virtual_deps):
with patch("artifactapi.artifact.virtual._get_member_index", new_callable=AsyncMock) as mock_get:
with (
patch("artifactapi.artifact.virtual._get_member_index", new_callable=AsyncMock) as mock_get,
patch("artifactapi.artifact.virtual._helm.resolve_content", side_effect=_identity_resolve),
):
mock_get.side_effect = [
("helm-test", _CFG_A, 3600, _INDEX_SIMPLE, None),
("helm-member-2", _CFG_B, 1800, None, None),
("helm-test", _CFG_A, 3600, _INDEX_SIMPLE),
("helm-member-2", _CFG_B, 1800, None),
]
response = client.get("/api/v1/virtual/helm-virtual-test/index.yaml")
@@ -637,9 +572,10 @@ class TestVirtualRoute:
with (
patch("artifactapi.artifact.virtual._get_member_index", new_callable=AsyncMock) as mock_get,
patch("artifactapi.artifact.virtual._helm.resolve_content", side_effect=_identity_resolve),
patch.object(main_mod.config, "get_remote_config", side_effect=patched_get),
):
mock_get.return_value = ("helm-test", _CFG_A, 3600, _INDEX_SIMPLE, None)
mock_get.return_value = ("helm-test", _CFG_A, 3600, _INDEX_SIMPLE)
response = client.get("/api/v1/virtual/helm-virtual-test/index.yaml")
# only helm-test was available — should succeed
@@ -650,181 +586,11 @@ class TestVirtualRoute:
deps = patched_virtual_deps
deps["storage"].upload.side_effect = Exception("S3 write error")
with patch("artifactapi.artifact.virtual._get_member_index", new_callable=AsyncMock) as mock_get:
mock_get.return_value = ("helm-test", _CFG_A, 3600, _INDEX_SIMPLE, None)
with (
patch("artifactapi.artifact.virtual._get_member_index", new_callable=AsyncMock) as mock_get,
patch("artifactapi.artifact.virtual._helm.resolve_content", side_effect=_identity_resolve),
):
mock_get.return_value = ("helm-test", _CFG_A, 3600, _INDEX_SIMPLE)
response = client.get("/api/v1/virtual/helm-virtual-test/index.yaml")
assert response.status_code == 200
# ---------------------------------------------------------------------------
# _entries_to_msgpack_safe
# ---------------------------------------------------------------------------
class TestEntriesToMsgpackSafe:
def test_plain_string_values_pass_through(self):
entries = {"chart": [{"name": "chart", "version": "1.0.0", "urls": ["http://x/c.tgz"]}]}
result = _entries_to_msgpack_safe(entries)
assert result["chart"][0]["version"] == "1.0.0"
def test_datetime_converted_to_iso_string(self):
dt = datetime(2023, 6, 15, 12, 0, 0, tzinfo=UTC)
entries = {"chart": [{"name": "chart", "version": "1.0.0", "created": dt}]}
result = _entries_to_msgpack_safe(entries)
assert isinstance(result["chart"][0]["created"], str)
assert "2023-06-15" in result["chart"][0]["created"]
def test_date_converted_to_iso_string(self):
entries = {"chart": [{"name": "chart", "version": "1.0.0", "created": date(2023, 6, 15)}]}
result = _entries_to_msgpack_safe(entries)
assert result["chart"][0]["created"] == "2023-06-15"
def test_empty_entries_returns_empty_dict(self):
assert _entries_to_msgpack_safe({}) == {}
def test_multiple_versions_all_converted(self):
dt = datetime(2023, 1, 1, tzinfo=UTC)
entries = {
"chart": [
{"name": "chart", "version": "1.0.0", "created": dt},
{"name": "chart", "version": "2.0.0", "created": dt},
]
}
result = _entries_to_msgpack_safe(entries)
for v in result["chart"]:
assert isinstance(v["created"], str)
def test_result_is_msgpack_serializable(self):
import msgpack
dt = datetime(2023, 6, 15, 12, 0, 0, tzinfo=UTC)
entries = {"chart": [{"name": "chart", "version": "1.0.0", "created": dt, "urls": ["http://x/c.tgz"]}]}
safe = _entries_to_msgpack_safe(entries)
packed = msgpack.packb(safe, use_bin_type=True)
unpacked = msgpack.unpackb(packed, raw=False)
assert unpacked["chart"][0]["created"] == safe["chart"][0]["created"]
# ---------------------------------------------------------------------------
# _merge_helm_indexes — pre-parsed entries path
# ---------------------------------------------------------------------------
class TestMergeHelmIndexesWithParsed:
"""Verify that pre-parsed entries (from msgpack) produce the same output as raw YAML."""
def _parse_entries(self, raw: bytes) -> dict:
index = yaml.safe_load(raw)
return index.get("entries") or {}
def test_parsed_entries_produce_same_charts_as_raw(self):
parsed = self._parse_entries(_INDEX_A)
raw_result = yaml.safe_load(_merge_helm_indexes([_INDEX_A], [None], ["member-a"], [_CFG_A], "http://proxy.example.com"))
parsed_result = yaml.safe_load(_merge_helm_indexes([_INDEX_A], [parsed], ["member-a"], [_CFG_A], "http://proxy.example.com"))
assert set(raw_result["entries"].keys()) == set(parsed_result["entries"].keys())
def test_parsed_entries_urls_are_rewritten(self):
parsed = self._parse_entries(_INDEX_A)
result = yaml.safe_load(_merge_helm_indexes([_INDEX_A], [parsed], ["member-a"], [_CFG_A], "http://proxy.example.com"))
url = result["entries"]["vault"][0]["urls"][0]
assert "member-a" in url
assert "proxy.example.com" in url
def test_none_parsed_falls_back_to_raw_bytes(self):
result = yaml.safe_load(_merge_helm_indexes([_INDEX_A], [None], ["member-a"], [_CFG_A], "http://proxy.example.com"))
assert "vault" in result["entries"]
def test_mixed_parsed_and_raw_merge_correctly(self):
parsed_a = self._parse_entries(_INDEX_A)
result = yaml.safe_load(
_merge_helm_indexes(
[_INDEX_A, _INDEX_B],
[parsed_a, None],
["member-a", "member-b"],
[_CFG_A, _CFG_B],
"http://proxy.example.com",
)
)
assert "vault" in result["entries"]
assert "nginx" in result["entries"]
# ---------------------------------------------------------------------------
# _get_member_index — msgpack cache behaviour
# ---------------------------------------------------------------------------
class TestGetMemberIndexMsgpack:
@pytest.fixture
def storage(self):
m = MagicMock()
m.get_object_key.side_effect = lambda name, path: f"{name}/{path}"
m.exists.return_value = False
m.download_object.return_value = _INDEX_SIMPLE
return m
@pytest.fixture
def cache(self):
m = MagicMock()
m.is_index_valid.return_value = False
return m
@pytest.fixture
def member_cfg(self):
return {"base_url": "https://helm.releases.hashicorp.com", "cache": {"mutable_ttl": 3600}}
def _fake_response(self, content=_INDEX_SIMPLE):
r = MagicMock()
r.content = content
r.raise_for_status = MagicMock()
return r
async def test_cache_hit_with_msgpack_returns_parsed_entries(self, storage, cache, member_cfg):
import msgpack
entries = {"mychart": [{"name": "mychart", "version": "1.0.0", "urls": ["http://x/c.tgz"]}]}
packed = msgpack.packb(entries, use_bin_type=True)
storage.exists.side_effect = lambda key: True
cache.is_index_valid.return_value = True
storage.download_object.side_effect = lambda key: packed if key.endswith("index.msgpack") else _INDEX_SIMPLE
_, _, _, raw_data, parsed = await _get_member_index("m", member_cfg, "index.yaml", storage, cache)
assert parsed == entries
async def test_cache_miss_builds_msgpack_and_returns_parsed(self, storage, cache, member_cfg):
with patch("artifactapi.artifact.virtual.httpx.AsyncClient") as mock_cls:
mock_client = AsyncMock()
mock_cls.return_value.__aenter__.return_value = mock_client
mock_client.get.return_value = self._fake_response()
_, _, _, raw_data, parsed = await _get_member_index("m", member_cfg, "index.yaml", storage, cache)
assert raw_data == _INDEX_SIMPLE
assert isinstance(parsed, dict)
assert "mychart" in parsed
async def test_broken_msgpack_rebuilds_from_raw_yaml(self, storage, cache, member_cfg):
storage.exists.side_effect = lambda key: True
cache.is_index_valid.return_value = True
storage.download_object.side_effect = lambda key: b"not-valid-msgpack" if key.endswith("index.msgpack") else _INDEX_SIMPLE
_, _, _, raw_data, parsed = await _get_member_index("m", member_cfg, "index.yaml", storage, cache)
assert raw_data == _INDEX_SIMPLE
# Falls back to YAML parse and rebuilds msgpack — entries are returned
assert isinstance(parsed, dict)
assert "mychart" in parsed
async def test_upstream_failure_returns_none_for_both(self, storage, cache, member_cfg):
with patch("artifactapi.artifact.virtual.httpx.AsyncClient") as mock_cls:
mock_client = AsyncMock()
mock_cls.return_value.__aenter__.return_value = mock_client
mock_client.get.side_effect = Exception("timeout")
_, _, _, raw_data, parsed = await _get_member_index("m", member_cfg, "index.yaml", storage, cache)
assert raw_data is None
assert parsed is None