94 Commits

Author SHA1 Message Date
unkinben 8bc9285117 chore: track remotes.yaml as a documented example config
Remove remotes.yaml from .gitignore and add header comments explaining
the immutable_patterns/mutable_patterns/cache keys. Marks the file
clearly as an example to copy and adapt; warns against committing
real credentials.
2026-04-27 10:58:59 +10:00
unkinben ce01a94141 feat: rename include/index patterns to immutable/mutable with per-remote TTL
Replace the include_patterns/index_patterns split with a clearer
immutable_patterns/mutable_patterns model:

- immutable_patterns: artifacts cached indefinitely (no TTL)
- mutable_patterns: artifacts that expire and are re-fetched after
  cache.mutable_ttl seconds (replaces cache.index_ttl)

_PACKAGE_INDEX_PATTERNS renamed to _PACKAGE_MUTABLE_PATTERNS; all
built-in package-type index patterns (APKINDEX, repomd, manifests, etc.)
default to the remote's mutable_ttl (default 1 hour).

cache.file_ttl renamed to cache.immutable_ttl for consistency.
Adds github-archive remote to remotes.yaml as a worked example showing
tag archives as immutable and branch archives as mutable (1-day TTL).

docker-compose.yml: fix VERSION=dev → 2.2.2.dev0 (valid PEP 440),
add :z SELinux label to volume mounts.
2026-04-27 00:40:13 +10:00
unkinben 4619ae18d8 Merge pull request 'chore: remove build from tag' (#13) from benvin/docker-compose-build into master
Reviewed-on: #13
2026-04-25 22:29:48 +10:00
unkinben ac51d3a51d chore: remove build from tag
ci/woodpecker/pr/test Pipeline was successful
ci/woodpecker/pr/pre-commit Pipeline was successful
ci/woodpecker/pr/build Pipeline was successful
- stop building the image on tag events
2026-04-25 22:27:59 +10:00
unkinben 2887ce4476 Merge pull request 'build: align Dockerfile with packer build and add docker-compose dev mounts' (#12) from benvin/packer-aligned-dockerfile into master
ci/woodpecker/tag/docker Pipeline was successful
Reviewed-on: #12
v2.2.1
2026-04-25 22:23:59 +10:00
unkinben 9e52929d73 build: align Dockerfile with packer build and add docker-compose dev mounts
ci/woodpecker/pr/pre-commit Pipeline was successful
ci/woodpecker/pr/test Pipeline was successful
ci/woodpecker/pr/build Pipeline was successful
- Rebase Dockerfile onto almalinux9-base, install via uv tool install
- Remove dev artifacts (remotes.yaml, ca-bundle.pem) from image
- Mount gitignored dev files via docker-compose volumes instead
- Add .dockerignore to keep secrets out of build context
- Track docker-compose.yml in git (no secrets; dev files mounted as volumes)
2026-04-25 22:17:36 +10:00
unkinben 788d469063 Merge pull request 'benvin/configurable-index-patterns' (#11) from benvin/configurable-index-patterns into master
ci/woodpecker/tag/docker Pipeline failed
Reviewed-on: #11
v2.2.0
2026-04-25 21:04:25 +10:00
unkinben 1cbe836f1b ci: add Woodpecker pipelines for pre-commit, tests, and Docker build
ci/woodpecker/pr/pre-commit Pipeline was successful
ci/woodpecker/pr/test Pipeline was successful
2026-04-25 21:02:39 +10:00
unkinben f3394b9ca6 docs: add RKE2 image rewriting guide and expand pattern examples
Add a new "Docker Image Rewriting with RKE2" section covering:
- How the /v2/ proxy integrates with registries.yaml mirror rewrites
- Per-registry examples (docker.io, ghcr.io, registry.k8s.io, quay.io)
- include_patterns for restricting which images are cached
- TLS CA configuration for private certificate authorities
- Apply and verification commands

Expand the Configuration section with:
- Richer include_patterns examples (anchored, extension, architecture,
  Docker image name patterns, repodata directories)
- New index_patterns section explaining built-in defaults per package
  type and how to add custom patterns (Helm index.yaml, APT InRelease/
  Packages.gz, extra RPM comps.xml)
2026-04-25 20:20:42 +10:00
unkinben 8da43e610e tests: resolve all peer-review issues across test suite
Address every substantive critique from the peer review:

test_cache: replace tautological same-inputs key test with hardcoded
hash assertion; assert setex call + TTL in mark_index_cached test;
assert client is None for unavailable no-op; rename Packages.gz test
to document intentional behaviour; add alpine sig/tmp negatives; add
hyphenated and date-tag docker positive cases; add key hash-length
assertion.

test_config: replace live-constant comparisons with literal string
assertions for alpine/rpm/docker; add unknown package type test;
add dict-keyed repositories branch coverage (per-repo override and
fallback); fix cache config to full equality check; add explicit empty
index_patterns test.

test_docker_auth: fix case-insensitive test to verify realm value;
add field-order (scope before service) limitation test; add pipe-char
collision documentation test; add missing fetch_token edge cases
(no token field, HTTPStatusError, missing expires_in default 300);
replace rubber-stamp delegate test with end-to-end parse→fetch test.

test_storage: replace split prefix/suffix assertions with structural
3-part check + pinned sha256 assertion; fix Docker blob digests to
64-char hex; add secure=True URL test; add upload return value test;
add download_object 404-on-ClientError test; remove redundant subset
test.

test_routes: add metrics.record_cache_hit/miss assertions; add
mark_index_cached assertion after cache miss on index (docker + generic);
add Content-Disposition, X-Artifact-Size header checks; add rpm/xml
content-type tests; add flush test that verifies Redis keys are deleted
when cache is available; add smoke coverage for upload (PUT), HEAD, DELETE,
/metrics, and /config routes.
2026-04-25 19:58:33 +10:00
unkinben 3a13d76f7e chore: add .tox, .pytest_cache, .pre-commit-cache, .ruff_cache to .gitignore 2026-04-25 19:21:43 +10:00
unkinben 2d0e2c64e6 feat: add test suite, tox, pre-commit, and ruff formatting
- tests/: 107 unit tests across config, cache, docker_auth, storage,
  and FastAPI routes; all passing under pytest-asyncio auto mode
- tox.ini: runs pytest via uvx --with tox-uv tox (py311)
- .pre-commit-config.yaml: ruff lint + ruff-format at v0.15.12
- pyproject.toml: pytest config (asyncio_mode=auto), ruff config
  (line-length=140), tox/pre-commit added to dev extras
- Makefile: test/tox/pre-commit targets via uvx --python 3.11
- Source files reformatted by ruff-format (no logic changes)
2026-04-25 19:21:05 +10:00
unkinben 2414ddfdd3 feat: make index file patterns configurable per remote
Replace hardcoded is_index_file logic with regex patterns driven by
remotes.yaml. Package-level defaults (alpine/rpm/docker) are merged with
any extra patterns listed under index_patterns in the remote config.
2026-04-25 18:40:45 +10:00
unkinben b3d12f4962 docs: add SPEC.md with repository model and caching requirements v2.1.3 2026-04-25 18:31:27 +10:00
unkinben 92b9f9a03e refactor: use package: docker instead of type: docker
Align with intended type=local|remote|virtual / package=docker|rpm|alpine|generic
model. All docker-specific logic now keyed on package field; type field
correctly reflects the repository kind (remote vs local).
2026-04-25 18:27:31 +10:00
unkinben 7930023de8 Merge pull request 'feat: enforce include_patterns on docker /v2/ proxy route' (#10) from benvin/docker-include-patterns into master
Reviewed-on: #10
v2.1.2
2026-04-25 18:14:50 +10:00
unkinben 869a1f8c02 feat: enforce include_patterns on docker /v2/ proxy route
Adds pattern checking to docker_v2_proxy before any upstream fetch.
Patterns match against the full path and the image name (first two
path segments), bypassing the index-file exemption that check_artifact_patterns
applies — so restrictions apply equally to manifests, blobs, and tag lists.
Returns 403 when no pattern matches, consistent with the non-docker route.
2026-04-25 18:09:12 +10:00
unkinben 1b2ee0d37f Merge pull request 'benvin/docker-caching' (#9) from benvin/docker-caching into master
Reviewed-on: #9
v2.1.1
2026-04-25 17:33:18 +10:00
unkinben 33e7365a88 fix: set SETUPTOOLS_SCM_PRETEND_VERSION in Dockerfile for hatch-vcs 2026-04-25 17:31:36 +10:00
unkinben cf854a2ace chore: derive version from git tags via hatch-vcs
Replace hardcoded version in pyproject.toml with hatch-vcs so the
package version is read from git tags at build time. Dockerfile
accepts a VERSION build arg and passes it as HATCH_VCS_PRETEND_VERSION
for builds without a git checkout. Makefile _tag target now rebuilds
the container with the correct version automatically.
v2.1.0
2026-04-25 16:53:37 +10:00
unkinben 4c1f77e679 Merge pull request 'feat: add Docker registry proxy support with proper cache classification' (#8) from benvin/docker-caching into master
Reviewed-on: #8
2026-04-25 16:37:38 +10:00
unkinben 4651183ed1 feat: add Docker registry proxy support with proper cache classification
- Add /v2/ endpoint implementing OCI Distribution API for native docker pull support
- Add docker_auth.py with Bearer token challenge handling and in-memory token cache
- Classify tag-based manifests (/manifests/<tag>) as index (short TTL, mutable)
- Classify digest-pinned manifests (/manifests/sha256:...) and blobs as file cache (indefinite, immutable)
- Deduplicate blob storage by keying on sha256 digest rather than image path
- Support username/password auth per docker remote in remotes.yaml
2026-04-25 16:35:27 +10:00
unkinben 5733d52e51 Merge pull request 'feature/cache-flush-api-enhancement' (#7) from feature/cache-flush-api-enhancement into master
Reviewed-on: #7
2026-01-25 11:34:43 +11:00
unkinben bf8a176dda Bump version: 2.0.4 → 2.0.5 2026-01-25 11:09:29 +11:00
unkinben 3e8e819ecf feat: enhance cache flush API for remote-specific S3 clearing
Add support for remote-specific S3 object deletion using hierarchical key prefixes.
When a remote parameter is provided, only objects under that remote's hierarchy
(e.g., 'fedora/*') are deleted, preserving cached files for other remotes.

Key improvements:
- Use S3 list_objects_v2 Prefix parameter for targeted deletion
- Enhanced logging to indicate scope of cache clearing operations
- Backward compatible - existing behavior preserved when no remote specified

Resolves artifactapi-u46
2026-01-25 11:08:46 +11:00
unkinben de04e4d2b2 Merge pull request 'benvin/path-based-storage' (#6) from benvin/path-based-storage into master
Reviewed-on: #6
2026-01-25 00:00:53 +11:00
unkinben 16c8bd60eb Bump version: 2.0.3 → 2.0.4 v2.0.4 2026-01-24 23:59:48 +11:00
unkinben d39550c4e8 chore: migrate from bumpver to bump-my-version for better reliability 2026-01-24 23:59:05 +11:00
unkinben 424de5cc13 fix: sync package version with bumpver version (2.0.3) 2026-01-24 23:54:42 +11:00
unkinben 3f54874421 Bump version 2.0.2 → 2.0.3 2026-01-24 23:53:19 +11:00
unkinben 6e8912eed1 chore: ignore local config overrides (docker-compose.yml, ca-bundle.pem) 2026-01-24 23:53:08 +11:00
unkinben 5a0e8b4e0b feat: implement hierarchical S3 keys and automated version management
This commit introduces two major improvements:

1. **Hierarchical S3 Key Structure**:
   - Replace URL-based hashing with remote-name/hash(directory_path)/filename format
   - Enables remote-specific cache operations and intuitive S3 organization
   - Cache keys now independent of mirror URL changes
   - Example: fedora/886d215f6d1a0108/eccodes-2.44.0-1.fc42.x86_64.rpm

2. **Automated Version Management**:
   - Add bumpver for semantic version bumping
   - Single source of truth in pyproject.toml
   - FastAPI dynamically reads version from package metadata
   - Eliminates manual version synchronization between files

Changes:
- storage.py: New get_object_key(remote_name, path) method with directory hashing
- main.py: Dynamic version import and updated cache key generation calls
- cache.py: Updated to use new hierarchical key structure
- pyproject.toml: Added bumpver config and dev dependency

Breaking change: S3 key format changed, existing cache will need regeneration
2026-01-24 23:51:03 +11:00
unkinben 1a71a2c9fa Merge pull request 'feat: add cache flush API and fix cache key consistency' (#5) from benvin/cache_flush into master
Reviewed-on: #5
2026-01-13 19:02:52 +11:00
unkinben d2ecc6b1a0 feat: add cache flush API and fix cache key consistency
- Add PUT /cache/flush endpoint for selective cache management
- Fix critical cache key mismatch in cleanup_expired_index()
- Update index file detection to use full path instead of filename
- Bump version to 2.0.2

The key changes include adding a comprehensive cache flush API that supports selective flushing by cache type (index, files, metrics) and fixing a critical bug where cache keys were
inconsistent between storage and cleanup operations, preventing proper cache invalidation.
2026-01-13 19:02:31 +11:00
unkinben e4013e6a2a Merge pull request 'feat: index caching' (#4) from benvin/index_caching into master
Reviewed-on: #4
2026-01-13 18:14:39 +11:00
unkinben 9defc78e21 feat: index caching
- improve index detection for rpms
- improve logging
2026-01-13 18:13:47 +11:00
unkinben f40675f3d2 Merge pull request 'feat: add fedora index files' (#3) from benvin/fedora_indexes into master
Reviewed-on: #3
2026-01-10 17:02:58 +11:00
unkinben b54e6c3e0c feat: add fedora index files
- ensure files matching xml.zck and xml.zst are marked as index files
2026-01-10 17:01:39 +11:00
unkinben 79a8553e9c Merge pull request 'Fix S3 SSL certificate validation and boto3 checksum compatibility' (#2) from benvin/boto3_fixes into master
Reviewed-on: #2
2026-01-08 23:55:42 +11:00
unkinben b7205e09a3 Fix S3 SSL certificate validation and boto3 checksum compatibility
- Add support for custom CA bundle via REQUESTS_CA_BUNDLE/SSL_CERT_FILE environment variables
- Configure boto3 client with custom SSL verification to support Ceph RadosGW through nginx proxy
- Maintain boto3 checksum validation configuration for compatibility with third-party S3 providers
- Resolves XAmzContentSHA256Mismatch errors when connecting to RadosGW endpoints

Fixes #4400 compatibility issue with boto3 v1.36+ stricter checksum validation
2026-01-08 23:54:39 +11:00
unkinben 1fb6b89a5f Merge pull request 'Fix boto3 XAmzContentSHA256Mismatch errors with Ceph RadosGW' (#1) from fix/boto3-checksum-validation into master
Reviewed-on: #1
2026-01-08 23:07:51 +11:00
unkinben f0f3cfda50 Fix boto3 XAmzContentSHA256Mismatch errors with Ceph RadosGW
- Configure boto3 client with when_required checksum calculation/validation
- Resolves compatibility issues between boto3 v1.36+ and S3-compatible providers
- Fixes 502 errors when uploading artifacts to storage backend

Refs: https://github.com/boto/boto3/issues/4400
2026-01-08 23:04:19 +11:00
unkinben 666ad08518 Add CONFIG_PATH environment variable for configurable config file location
- Replace hardcoded remotes.yaml path with CONFIG_PATH environment variable
- Update docker-compose.yml to set CONFIG_PATH=/app/remotes.yaml
- Application now requires CONFIG_PATH to be set, enabling flexible deployment
2026-01-06 21:35:59 +11:00
unkinben 46711eec6a Initial implementation of generic artifact storage system
- FastAPI-based caching proxy for remote file servers
- YAML configuration for multiple remotes (GitHub, Gitea, HashiCorp, etc.)
- Direct URL API: /api/v1/remote/{remote}/{path} with auto-download and caching
- Pattern-based access control with regex filtering
- S3/MinIO backend storage with predictable paths
- Docker Compose setup with MinIO for local development
2026-01-06 21:13:13 +11:00