feat: add puppet forge remote type
ci/woodpecker/pr/pre-commit Pipeline was successful
ci/woodpecker/pr/test Pipeline was successful
ci/woodpecker/pr/build Pipeline was successful

Adds package: puppet for proxying the Puppet Forge API (forgeapi.puppet.com).

- remote/puppet.py: rewrites absolute forge URLs and relative /v3/files/
  paths in JSON responses to absolute proxy URLs; g10k uses
  url.ResolveReference so absolute file_uri values override the base
  entirely, meaning tarball downloads go straight to the proxy
- config.py: registers built-in mutable patterns for v3/modules/ and
  v3/releases (module metadata pages)
- artifact/proxy.py: dispatches to puppet.resolve_content for package:
  puppet remotes
- 9 new tests covering mutable detection, URL rewriting (relative and
  absolute), content-type, tarball pass-through, and pattern blocking

Client configuration (g10k):
  - config file: forge_base_url: https://artifacts.example.com/api/v1/remote/puppet-forge
  - Puppetfile: forge.baseUrl https://artifacts.example.com/api/v1/remote/puppet-forge
2026-05-17 10:50:14 +10:00
parent ff2aefeef4
commit 5912e3ae3c
8 changed files with 216 additions and 4 deletions
+46 -2
@@ -4,7 +4,7 @@ FastAPI caching proxy that downloads and stores files from remote sources in S3-
## Features
-- Remote definitions via `remotes.yaml` — generic HTTP, Alpine APK, RPM, Docker, PyPI, npm, Helm
+- Remote definitions via `remotes.yaml` — generic HTTP, Alpine APK, RPM, Docker, PyPI, npm, Helm, Puppet Forge
- Virtual repositories — merge multiple remotes of the same package type into a single unified index
- Immutable/mutable caching model with per-remote TTLs
- Conditional revalidation (`If-None-Match` / `If-Modified-Since`) on TTL expiry
@@ -62,6 +62,7 @@ src/artifactapi/
├── generic.py — generic HTTP remotes
├── helm.py — Helm index.yaml URL rewriting
├── npm.py — npm metadata URL rewriting
├── puppet.py — Puppet Forge JSON URL rewriting
├── python.py — PyPI URL construction + HTML rewriting
└── rpm.py — RPM remotes
```
@@ -130,7 +131,7 @@ Repositories are declared under three top-level keys matching their type:
remotes: # proxy (caching) remotes
  remote-name:
    base_url: "https://example.com"
-   package: "generic" # generic, alpine, rpm, docker, pypi, npm, helm
+   package: "generic" # generic, alpine, rpm, docker, pypi, npm, helm, puppet
    description: "..."
    immutable_patterns: # regex — cached forever
      - ".*\\.tar\\.gz$"
@@ -361,6 +362,48 @@ helm repo add hashicorp https://artifacts.example.com/api/v1/remote/hashicorp-he
helm repo update
```
### puppet
Proxy for [Puppet Forge](https://forge.puppet.com) (forgeapi.puppet.com). Module metadata is cached as mutable; versioned module tarballs are cached as immutable.
```yaml
remotes:
  puppet-forge:
    base_url: "https://forgeapi.puppet.com"
    package: "puppet"
    check_mutable_updates: true
    immutable_patterns:
      - "^v3/files/.*\\.tar\\.gz$"
    cache:
      immutable_ttl: 0 # module tarballs cached indefinitely
      mutable_ttl: 600 # module metadata refreshed after 10 minutes
```
`v3/modules/` and `v3/releases` are built-in mutable patterns — module metadata pages expire after `mutable_ttl` and are re-fetched on the next request.
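As a rough illustration of how these patterns partition Forge paths, the sketch below applies the built-in mutable patterns and the `immutable_patterns` entry from the example config with `re.search`. The `classify` helper and the order in which the pattern lists are checked are assumptions made for illustration; they are not the proxy's actual dispatch code.

```python
import re

# Patterns copied from this section; classify() itself is a hypothetical helper.
BUILTIN_MUTABLE = [r"^v3/modules/", r"^v3/releases"]
IMMUTABLE = [r"^v3/files/.*\.tar\.gz$"]


def classify(path: str) -> str:
    """Return 'mutable', 'immutable', or 'blocked' for a remote-relative path."""
    if any(re.search(p, path) for p in BUILTIN_MUTABLE):
        return "mutable"
    if any(re.search(p, path) for p in IMMUTABLE):
        return "immutable"
    # Assumption: a path matching neither list is rejected by the proxy.
    return "blocked"


for p in (
    "v3/modules/puppetlabs-stdlib",
    "v3/releases/puppetlabs-stdlib-9.7.0",
    "v3/files/puppetlabs-stdlib-9.7.0.tar.gz",
    "v3/users/puppetlabs",
):
    print(p, "->", classify(p))
```

The two pattern lists do not overlap for Forge paths, so the check order does not affect the result here.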
**URL rewriting**: the proxy rewrites absolute Forge API URLs and relative `file_uri` paths (`/v3/files/…`) in Forge JSON responses to absolute proxy URLs. g10k resolves download URLs with Go's `url.ResolveReference`, so a relative `file_uri` would resolve against the proxy host alone and drop the `/api/v1/remote/<name>` prefix, whereas an absolute `file_uri` overrides the forge base entirely and tarballs download straight from the proxy.
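The resolution behaviour can be reproduced with Python's `urllib.parse.urljoin`, which follows the same RFC 3986 reference-resolution rules. This is only an analogue of what g10k does in Go, assuming the two implementations agree for these inputs; the module file name is taken from the test fixtures.

```python
from urllib.parse import urljoin

# Proxy URL a client would use as its Forge base (from the g10k example).
FORGE_BASE = "https://artifacts.example.com/api/v1/remote/puppet-forge"

# file_uri as the upstream Forge returns it (relative, rooted at "/").
relative = "/v3/files/puppetlabs-stdlib-9.7.0.tar.gz"
# file_uri after the proxy has rewritten it to an absolute proxy URL.
absolute = f"{FORGE_BASE}{relative}"

# A rooted relative reference replaces the whole base path,
# so the /api/v1/remote/puppet-forge prefix is lost.
resolved_relative = urljoin(FORGE_BASE, relative)

# An absolute reference ignores the base entirely.
resolved_absolute = urljoin(FORGE_BASE, absolute)

print(resolved_relative)
print(resolved_absolute)
```

The first result points at the proxy host without the remote prefix, which is why the relative form cannot be passed through unchanged.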
**Client configuration — g10k**: set `forge_base_url` in the g10k config file:
```yaml
# g10k.yaml
cachedir: /tmp/g10k
forge_base_url: https://artifacts.example.com/api/v1/remote/puppet-forge
sources:
  control:
    remote: git@git.example.com:puppet/control.git
    basedir: /etc/puppetlabs/code/environments
```
Alternatively, set the URL per-Puppetfile with the `forge.baseUrl` directive (works with `-puppetfile` mode and does not require a config file):
```ruby
forge.baseUrl https://artifacts.example.com/api/v1/remote/puppet-forge
mod 'puppetlabs-stdlib', '9.7.0'
mod 'puppetlabs-inifile', '6.2.0'
```
### virtual
A virtual repository presents a single unified index built from multiple member remotes of the same package type. Clients configure one endpoint and get access to all member remotes transparently.
@@ -457,6 +500,7 @@ Each package type has built-in defaults that are merged with any user-defined `m
| `docker` | Tag manifests (non-digest refs), `/tags/list` |
| `pypi` | `simple/` (per-package and top-level index pages) |
| `helm` | `index\.yaml$` |
| `puppet` | `^v3/modules/`, `^v3/releases` |
| `npm` | *(none built-in — define via `mutable_patterns`)* |
| `generic` | *(none)* |
+14
@@ -452,6 +452,20 @@ remotes:
    cache:
      immutable_ttl: 0
      mutable_ttl: 3600

  puppet-forge:
    base_url: "https://forgeapi.puppet.com"
    package: "puppet"
    description: "Puppet Forge module registry"
    # Module metadata (v3/modules/, v3/releases) is mutable by default.
    # Configure r10k / librarian-puppet with this remote as the Forge URL:
    # http://your-proxy/api/v1/remote/puppet-forge
    check_mutable_updates: true
    immutable_patterns:
      - "^v3/files/.*\\.tar\\.gz$"
    cache:
      immutable_ttl: 0 # Module tarballs cached indefinitely
      mutable_ttl: 600 # Module metadata refreshed after 10 minutes

virtuals:
  helm-all:
+3
@@ -11,6 +11,7 @@ from fastapi import HTTPException, Request, Response
from ..auth import get_docker_token_for_response
from ..remote import helm as _helm
from ..remote import npm as _npm
from ..remote import puppet as _puppet
from ..remote import python as _pypi
from ..remote.base import get_content_type
@@ -84,6 +85,8 @@ def _resolve_content(
        return _npm.resolve_content(data, path, filename, remote_config.get("immutable_patterns", []), base_url, proxy_base, remote_name)
    if package == "helm":
        return _helm.resolve_content(data, path, filename, base_url, proxy_base, remote_name)
    if package == "puppet":
        return _puppet.resolve_content(data, path, filename, base_url, proxy_base, remote_name)
    return data, get_content_type(filename)
+4
@@ -26,6 +26,10 @@ _PACKAGE_MUTABLE_PATTERNS: dict[str, list[str]] = {
    "helm": [
        r"index\.yaml$",
    ],
    "puppet": [
        r"^v3/modules/",
        r"^v3/releases",
    ],
    "generic": [],
}
+2 -2
@@ -1,4 +1,4 @@
-from . import generic, helm, npm, python, rpm
+from . import generic, helm, npm, puppet, python, rpm
from .base import get_content_type

-__all__ = ["generic", "helm", "npm", "python", "rpm", "get_content_type"]
+__all__ = ["generic", "helm", "npm", "puppet", "python", "rpm", "get_content_type"]
+24
@@ -0,0 +1,24 @@
from .base import get_content_type


def resolve_content(
    data: bytes,
    path: str,
    filename: str,
    base_url: str,
    proxy_url: str,
    remote_name: str,
) -> tuple[bytes, str]:
    if not path.startswith("v3/files/"):
        proxy_remote_url = f"{proxy_url}/api/v1/remote/{remote_name}"
        # Rewrite any absolute forge API URLs
        data = data.replace(base_url.encode(), proxy_remote_url.encode())
        # Rewrite relative file_uri paths ("/v3/files/...") to absolute proxy URLs.
        # g10k resolves file_uri against only the forge host, so a relative path
        # would drop our /api/v1/remote/<name> prefix.
        data = data.replace(
            b'"/v3/files/',
            f'"{proxy_remote_url}/v3/files/'.encode(),
        )
        return data, "application/json"
    return data, get_content_type(filename)
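A minimal standalone sketch of the two replacements performed by `resolve_content`, for readers who want to try them outside the proxy. The helper name `rewrite_forge_json` is hypothetical, the proxy host is the example one from the commit message, and the sample payload is stitched together from the test fixtures rather than taken from a real Forge response.

```python
import json


def rewrite_forge_json(data: bytes, base_url: str, proxy_remote_url: str) -> bytes:
    """Apply the same two byte-level replacements as resolve_content (sketch)."""
    # 1. Absolute upstream URLs become proxy URLs.
    data = data.replace(base_url.encode(), proxy_remote_url.encode())
    # 2. Rooted relative file paths become absolute proxy URLs.
    return data.replace(
        b'"/v3/files/',
        f'"{proxy_remote_url}/v3/files/'.encode(),
    )


# Hypothetical payload combining an absolute URL and a relative file_uri.
meta = (
    b'{"uri":"https://forgeapi.puppet.com/v3/modules/puppetlabs-stdlib",'
    b'"current_release":{"file_uri":"/v3/files/puppetlabs-stdlib-9.7.0.tar.gz"}}'
)

rewritten = rewrite_forge_json(
    meta,
    base_url="https://forgeapi.puppet.com",
    proxy_remote_url="https://artifacts.example.com/api/v1/remote/puppet-forge",
)

# The result is still valid JSON; print it in a readable form.
print(json.dumps(json.loads(rewritten), indent=2))
```

Because the second replacement only matches a quote immediately followed by `/v3/files/`, URLs already rewritten by the first step are not touched again.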
+6
@@ -112,6 +112,12 @@ TEST_REMOTES = {
            "immutable_patterns": [r"\.tgz$"],
            "cache": {"immutable_ttl": 0, "mutable_ttl": 1800},
        },
        "puppet-test": {
            "base_url": "https://forgeapi.puppet.com",
            "package": "puppet",
            "immutable_patterns": [r"^v3/files/.*\.tar\.gz$"],
            "cache": {"immutable_ttl": 0, "mutable_ttl": 600},
        },
    },
    "locals": {
        "local-test": {
+117
@@ -1117,6 +1117,123 @@ class TestHelmRemote:
assert response.status_code == 403
# ---------------------------------------------------------------------------
# Puppet Forge remote /api/v1/remote/puppet-test/...
# ---------------------------------------------------------------------------
class TestPuppetRemote:
    def test_module_metadata_is_mutable(self, client, patched_deps):
        """v3/modules/ paths are detected as mutable (package-type default)."""
        deps = patched_deps
        deps["storage"].exists.return_value = True
        deps["storage"].download_object.return_value = b'{"current_release":{"file_uri":"/v3/files/puppetlabs-stdlib-9.7.0.tar.gz"}}'
        deps["cache"].is_mutable_file.return_value = True
        deps["cache"].is_index_valid.return_value = True

        response = client.get("/api/v1/remote/puppet-test/v3/modules/puppetlabs-stdlib")
        assert response.status_code == 200
        deps["cache"].mark_index_cached.assert_not_called()

    def test_releases_path_is_mutable(self, client, patched_deps):
        """v3/releases paths are detected as mutable (package-type default)."""
        deps = patched_deps
        deps["storage"].exists.return_value = True
        deps["storage"].download_object.return_value = b'{"file_uri":"/v3/files/puppetlabs-stdlib-9.7.0.tar.gz"}'
        deps["cache"].is_mutable_file.return_value = True
        deps["cache"].is_index_valid.return_value = True

        response = client.get("/api/v1/remote/puppet-test/v3/releases/puppetlabs-stdlib-9.7.0")
        assert response.status_code == 200

    def test_relative_file_uri_rewritten_to_absolute_proxy_url(self, client, patched_deps):
        """Relative /v3/files/ paths in JSON responses are rewritten to absolute proxy URLs."""
        deps = patched_deps
        meta = b'{"current_release":{"file_uri":"/v3/files/puppetlabs-stdlib-9.7.0.tar.gz","version":"9.7.0"}}'
        deps["storage"].exists.return_value = True
        deps["storage"].download_object.return_value = meta
        deps["cache"].is_mutable_file.return_value = True
        deps["cache"].is_index_valid.return_value = True

        response = client.get("/api/v1/remote/puppet-test/v3/modules/puppetlabs-stdlib")
        assert response.status_code == 200
        assert b'"/v3/files/' not in response.content
        assert b"/api/v1/remote/puppet-test/v3/files/puppetlabs-stdlib-9.7.0.tar.gz" in response.content

    def test_absolute_forge_url_rewritten_to_proxy(self, client, patched_deps):
        """Absolute forgeapi.puppet.com URLs in JSON are rewritten to the proxy URL."""
        deps = patched_deps
        meta = b'{"uri":"https://forgeapi.puppet.com/v3/modules/puppetlabs-stdlib"}'
        deps["storage"].exists.return_value = True
        deps["storage"].download_object.return_value = meta
        deps["cache"].is_mutable_file.return_value = True
        deps["cache"].is_index_valid.return_value = True

        response = client.get("/api/v1/remote/puppet-test/v3/modules/puppetlabs-stdlib")
        assert response.status_code == 200
        assert b"forgeapi.puppet.com" not in response.content
        assert b"/api/v1/remote/puppet-test" in response.content

    def test_metadata_content_type_is_json(self, client, patched_deps):
        deps = patched_deps
        deps["storage"].exists.return_value = True
        deps["storage"].download_object.return_value = b'{"current_release":{}}'
        deps["cache"].is_mutable_file.return_value = True
        deps["cache"].is_index_valid.return_value = True

        response = client.get("/api/v1/remote/puppet-test/v3/modules/puppetlabs-concat")
        assert response.status_code == 200
        assert "application/json" in response.headers["content-type"]

    def test_tarball_served_without_rewriting(self, client, patched_deps):
        """Module tarballs (v3/files/*.tar.gz) are served as binary without URL rewriting."""
        deps = patched_deps
        deps["storage"].exists.return_value = True
        deps["storage"].download_object.return_value = b"\x1f\x8b tarball bytes"
        deps["cache"].is_mutable_file.return_value = False

        response = client.get("/api/v1/remote/puppet-test/v3/files/puppetlabs-stdlib-9.7.0.tar.gz")
        assert response.status_code == 200
        assert "application/gzip" in response.headers["content-type"]
        assert response.headers["X-Artifact-Source"] == "cache"

    def test_tarball_not_blocked_by_immutable_pattern(self, client, patched_deps):
        """v3/files/*.tar.gz matches the configured immutable_patterns and is allowed."""
        deps = patched_deps
        deps["storage"].exists.return_value = True
        deps["storage"].download_object.return_value = b"\x1f\x8b tarball bytes"
        deps["cache"].is_mutable_file.return_value = False

        response = client.get("/api/v1/remote/puppet-test/v3/files/puppetlabs-inifile-6.2.0.tar.gz")
        assert response.status_code == 200

    def test_unknown_path_blocked(self, client, patched_deps):
        """Paths outside v3/modules, v3/releases, and v3/files are blocked."""
        deps = patched_deps
        deps["cache"].is_mutable_file.return_value = False

        response = client.get("/api/v1/remote/puppet-test/v3/users/puppetlabs")
        assert response.status_code == 403

    def test_metadata_cache_miss_fetches_upstream(self, client, patched_deps):
        deps = patched_deps
        meta = b'{"current_release":{"file_uri":"/v3/files/puppetlabs-stdlib-9.7.0.tar.gz"}}'
        deps["storage"].exists.return_value = False
        deps["storage"].download_object.return_value = meta
        deps["cache"].is_mutable_file.return_value = True

        with patch(
            "artifactapi.artifact.proxy.cache_single_artifact",
            new_callable=AsyncMock,
            return_value={"status": "cached"},
        ) as mock_fetch:
            response = client.get("/api/v1/remote/puppet-test/v3/modules/puppetlabs-stdlib")

        mock_fetch.assert_called_once()
        assert response.status_code == 200
        assert b'"/v3/files/' not in response.content
# ---------------------------------------------------------------------------
# Quarantine (quarantine-test remote: quarantine_new=True, quarantine_days=3)
# ---------------------------------------------------------------------------