- npm: remove npm_files_url/npm_files_remote; rewrite uses base_url and
remote name directly (same approach as helm)
- npm: replace hardcoded .tgz extension check with immutable_patterns match
- pypi: collapse pypi + pypi-files into a single remote (base_url points
to files.pythonhosted.org); simple/ requests are transparently fetched
from pypi.org with no extra config required
- pypi: remove pypi_files_url/pypi_files_remote from pypi and pypi-gitea
- pypi: rewrite check now uses immutable_patterns (consistent with npm)
- Update README for both pypi and npm sections
- Update tests and fixtures to reflect single-remote pypi config
- Add helm package type with index.yaml as mutable (TTL-based) and
.tgz chart tarballs as immutable
- Rewrite chart URLs in index.yaml to serve tarballs via proxy cache
- Add text/yaml content-type detection for .yaml/.yml files
- Add hashicorp-helm example remote in remotes.yaml
- Update README with Helm chart repository proxy section
- Add tests for helm mutable patterns and route behaviour
- Add `npm` package type to config with no built-in mutable defaults;
users set explicit mutable_patterns (e.g. ^(?!.*\.tgz$).*) and
immutable_patterns (e.g. \.tgz$) in remotes.yaml
- Rewrite dist.tarball URLs in metadata JSON on the fly so tarball
downloads pass through the same proxy remote instead of hitting
npmjs.org directly
- Single-remote design: npm_files_remote points back to itself since
both metadata and tarballs are served from registry.npmjs.org
- Add .tgz to _get_content_type (application/gzip)
- Add example npm remote to remotes.yaml
- Add npm proxy section to README covering remotes.yaml config,
client setup (npm/yarn/pnpm), rewriting behaviour, and
mutable vs immutable path table
- Add tests for mutable pattern matching, URL rewriting, content-type,
scoped packages, cache miss, and tarball immutability
- Add 'pypi' package type to config.py; simple/ paths are mutable by default
- Refactor content-type detection into _get_content_type() helper; add .whl
- Add _resolve_content() which rewrites files host URLs in simple index HTML
to go through the proxy (pypi_files_url / pypi_files_remote config keys),
and returns text/html content-type for simple index responses
- Add basic auth support for non-Docker remotes (username + password/token
in remote config); thread auth through _upstream_reachable and
check_upstream_changed so mutable TTL checks also authenticate
- Add 'pypi' remote (pypi.org simple index) and 'pypi-files' remote
(files.pythonhosted.org) to remotes.yaml; add 'pypi-gitea' example for
Gitea package registries where index and files share the same base URL
- Add unit tests: simple index URL rewriting, HTML content-type, .whl/.tar.gz
content-types, mutable index detection, and immutable pattern enforcement
When a mutable file's TTL expires and the upstream backend cannot be
contacted (network error or timeout), the cached copy is kept and its
TTL refreshed instead of being evicted. This keeps RPM repodata, Alpine
indexes, branch archives, and other mutable data available during
upstream outages.
Adds UpstreamUnreachable exception and _upstream_reachable() helper.
check_upstream_changed() now raises UpstreamUnreachable on network
errors (was silently returning True). handle_expired_mutable() catches
the exception on the check_mutable_updates path and calls
_upstream_reachable() on the plain-expiry path.
README updated to current immutable/mutable terminology and documents
all new caching features.
Deduplicates the expired-mutable TTL/redownload branching logic that
was copied verbatim between get_artifact and docker_v2_proxy. Adds the
missing happy-path test for a changed mutable file that is successfully
re-fetched from upstream.
When check_mutable_updates: true is set on a remote, expired user-defined
mutable files are revalidated before re-downloading:
- On expiry a conditional HEAD is sent with If-None-Match / If-Modified-Since
- 304 Not Modified: TTL is refreshed in Redis, S3 cache is untouched
- 200 / no conditional support: cache is invalidated and file re-downloaded
- Network error: safe fallback — assume changed, re-download
ETag and Last-Modified from upstream responses are stored in Redis under
mutable:meta:<remote>:<hash> (no expiry, cleaned up on re-download or
cache flush). The flag only applies to user-configured mutable_patterns;
built-in package-type defaults (APKINDEX, repomd.xml, Docker manifests)
are always re-fetched unconditionally.
cache/flush also clears mutable:meta:* keys alongside index:* keys.
Remove remotes.yaml from .gitignore and add header comments explaining
the immutable_patterns/mutable_patterns/cache keys. Marks the file
clearly as an example to copy and adapt; warns against committing
real credentials.
Replace the include_patterns/index_patterns split with a clearer
immutable_patterns/mutable_patterns model:
- immutable_patterns: artifacts cached indefinitely (no TTL)
- mutable_patterns: artifacts that expire and are re-fetched after
cache.mutable_ttl seconds (replaces cache.index_ttl)
_PACKAGE_INDEX_PATTERNS renamed to _PACKAGE_MUTABLE_PATTERNS; all
built-in package-type index patterns (APKINDEX, repomd, manifests, etc.)
default to the remote's mutable_ttl (default 1 hour).
cache.file_ttl renamed to cache.immutable_ttl for consistency.
Adds github-archive remote to remotes.yaml as a worked example showing
tag archives as immutable and branch archives as mutable (1-day TTL).
docker-compose.yml: fix VERSION=dev → 2.2.2.dev0 (valid PEP 440),
add :z SELinux label to volume mounts.
- Rebase Dockerfile onto almalinux9-base, install via uv tool install
- Remove dev artifacts (remotes.yaml, ca-bundle.pem) from image
- Mount gitignored dev files via docker-compose volumes instead
- Add .dockerignore to keep secrets out of build context
- Track docker-compose.yml in git (no secrets; dev files mounted as volumes)
Add a new "Docker Image Rewriting with RKE2" section covering:
- How the /v2/ proxy integrates with registries.yaml mirror rewrites
- Per-registry examples (docker.io, ghcr.io, registry.k8s.io, quay.io)
- include_patterns for restricting which images are cached
- TLS CA configuration for private certificate authorities
- Apply and verification commands
Expand the Configuration section with:
- Richer include_patterns examples (anchored, extension, architecture,
Docker image name patterns, repodata directories)
- New index_patterns section explaining built-in defaults per package
type and how to add custom patterns (Helm index.yaml, APT InRelease/
Packages.gz, extra RPM comps.xml)
Replace hardcoded is_index_file logic with regex patterns driven by
remotes.yaml. Package-level defaults (alpine/rpm/docker) are merged with
any extra patterns listed under index_patterns in the remote config.
Align with intended type=local|remote|virtual / package=docker|rpm|alpine|generic
model. All docker-specific logic now keyed on package field; type field
correctly reflects the repository kind (remote vs local).
Adds pattern checking to docker_v2_proxy before any upstream fetch.
Patterns match against the full path and the image name (first two
path segments), bypassing the index-file exemption that check_artifact_patterns
applies — so restrictions apply equally to manifests, blobs, and tag lists.
Returns 403 when no pattern matches, consistent with the non-docker route.
Replace hardcoded version in pyproject.toml with hatch-vcs so the
package version is read from git tags at build time. Dockerfile
accepts a VERSION build arg and passes it as HATCH_VCS_PRETEND_VERSION
for builds without a git checkout. Makefile _tag target now rebuilds
the container with the correct version automatically.
- Add /v2/ endpoint implementing OCI Distribution API for native docker pull support
- Add docker_auth.py with Bearer token challenge handling and in-memory token cache
- Classify tag-based manifests (/manifests/<tag>) as index (short TTL, mutable)
- Classify digest-pinned manifests (/manifests/sha256:...) and blobs as file cache (indefinite, immutable)
- Deduplicate blob storage by keying on sha256 digest rather than image path
- Support username/password auth per docker remote in remotes.yaml
Add support for remote-specific S3 object deletion using hierarchical key prefixes.
When a remote parameter is provided, only objects under that remote's hierarchy
(e.g., 'fedora/*') are deleted, preserving cached files for other remotes.
Key improvements:
- Use S3 list_objects_v2 Prefix parameter for targeted deletion
- Enhanced logging to indicate scope of cache clearing operations
- Backward compatible - existing behavior preserved when no remote specified
Resolves artifactapi-u46
This commit introduces two major improvements:
1. **Hierarchical S3 Key Structure**:
- Replace URL-based hashing with remote-name/hash(directory_path)/filename format
- Enables remote-specific cache operations and intuitive S3 organization
- Cache keys now independent of mirror URL changes
- Example: fedora/886d215f6d1a0108/eccodes-2.44.0-1.fc42.x86_64.rpm
2. **Automated Version Management**:
- Add bumpver for semantic version bumping
- Single source of truth in pyproject.toml
- FastAPI dynamically reads version from package metadata
- Eliminates manual version synchronization between files
Changes:
- storage.py: New get_object_key(remote_name, path) method with directory hashing
- main.py: Dynamic version import and updated cache key generation calls
- cache.py: Updated to use new hierarchical key structure
- pyproject.toml: Added bumpver config and dev dependency
Breaking change: S3 key format changed, existing cache will need regeneration
- Add PUT /cache/flush endpoint for selective cache management
- Fix critical cache key mismatch in cleanup_expired_index()
- Update index file detection to use full path instead of filename
- Bump version to 2.0.2
The key changes include adding a comprehensive cache flush API that supports selective flushing by cache type (index, files, metrics) and fixing a critical bug where cache keys were
inconsistent between storage and cleanup operations, preventing proper cache invalidation.