The UI now serves under /ui (artifactapi#58). Health probes need /ui instead of /.
Reviewed-on: #202
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
Route /ui → UI service, everything else → API service.
Replaces the growing list of per-prefix rules (/api, /v2, /health) with a single catch-all to the API. No more needing to add a route rule every time the API adds a new top-level path.
Reviewed-on: #201
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
The v3 route migration (#198) split routes into /api → API and / → UI, but /v2/ (Docker Registry V2 API) and /health now hit the UI catch-all instead of the API backend.
This breaks `docker pull artifactapi.k8s.syd1.au.unkin.net/...` with context deadline exceeded.
Adds /v2 and /health prefix rules before the UI catch-all.
Reviewed-on: #200
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
update the environment secret reference to match what has been
deployed. this prevents a containerconfigerror
---------
Co-authored-by: Ben Vincent <ben@unkin.net>
Reviewed-on: #199
What changed:
- Adds new v3 API and UI deployments (separate api-deployment.yaml, ui-deployment.yaml) alongside the existing monolithic artifactapi-deployment.yaml
- Adds CNPG PostgreSQL cluster + pooler to replace the standalone postgres deployment
- Adds new api-env configmap, new Vault secrets (postgres-credentials, environment), and a second VaultAuth (default1)
- Adds new services targeting the split api and ui selectors
- Adds HPAs for both new deployments
- Updates kustomization to include all new resources
---------
Co-authored-by: Ben Vincent <ben@unkin.net>
Reviewed-on: #197
attempted to let claude deploy a new version of artifactory with
terrible results. this change is to remove that mess so I can start
again.
---------
Co-authored-by: Ben Vincent <ben@unkin.net>
Reviewed-on: #196
just-enough to test terraform deployment and begin migration. have
change to cnpg for the database and a new bucket for storage
---------
Co-authored-by: Ben Vincent <ben@unkin.net>
Reviewed-on: #192
woodpecker jobs for terraform-artifactapi use the service account of the
same name to run jobs, so that it can access specific secrets
- add terraform-artifactapi serviceaccount
---------
Co-authored-by: Ben Vincent <ben@unkin.net>
Reviewed-on: #190
## Summary
- Add ServiceAccount terraform-git in woodpecker namespace for terraform-git CI pipelines
- Add to kustomization.yaml
## Test plan
- [ ] Verify ArgoCD syncs the new service account
- [ ] Verify woodpecker CI can use the service account
Reviewed-on: #189
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
Drop from 3 replicas to 1. Remove init container, repl-certs secret,
replication port, podAntiAffinity, server-1/2 configs, and replication
stanza from server-0.toml. Mount configmap directly via subPath.
Reviewed-on: #185
## Summary
- The `\n` escape in a shell variable wasn't interpreted as a newline when passed as a `printf %s` argument
- This caused `automatic_refresh = true` to be appended to the `partner_cert` string value on the same line, breaking TOML parsing on kanidm-2
- Fixed by using separate `printf` calls per peer type, with `\n` in the format string (not a variable) where it is correctly interpreted
## Test plan
- [ ] kanidm-2 init container generates valid TOML with `automatic_refresh = true` on its own line under the kanidm-0 peer section
- [ ] kanidm-1 and kanidm-2 start successfully and auto-refresh domain UUID from kanidm-0
Reviewed-on: #182
kanidm-0 is the authoritative supplier; kanidm-1 and kanidm-2 pull
from kanidm-0 only. automatic_refresh = true on the kanidm-0 peer
entry for kanidm-1/2 so fresh nodes auto-sync domain UUID on restart.
Reviewed-on: #181
## Summary
Sets `WOODPECKER_BACKEND_K8S_PRIORITY_CLASS: power` on the Woodpecker agent so all CI pipeline pods are scheduled with the `power` PriorityClass (value 100, preemptionPolicy: Never).
This means pipeline pods can be evicted when the cluster is under pressure but won't preempt other workloads.
## Dependency
Requires the `power` PriorityClass to exist on the cluster — deploy PR #174 (priority-classes app) first.
## Test plan
- Trigger a pipeline run and confirm pods are created with `priorityClassName: power`
- `kubectl get pod -n woodpecker -o jsonpath='{.items[*].spec.priorityClassName}'`
Reviewed-on: #175
## Summary
- New `apps/base/priority-classes/` app with four `PriorityClass` objects managed via the `platform` ArgoCD project
- Adds `apps/overlays/*/priority-classes` to the platform ApplicationSet generator
- Adds `priority-classes` namespace to platform AppProject destinations (required even for cluster-scoped resources)
| Class | Value | PreemptionPolicy | Intent |
|---|---|---|---|
| `low` | 100 | Never | Background work; evictable, won't preempt others |
| `power` | 100 | Never | Compute-heavy but expendable (e.g. AI/ML workloads) |
| `medium` | 10000 | PreemptLowerPriority | Standard services |
| `high` | 100000 | PreemptLowerPriority | Critical services; preempts lower-priority pods |
`PriorityClass` is already in the platform project's `clusterResourceWhitelist` so no project policy changes were needed.
## Test plan
- ArgoCD syncs `platform-priority-classes` successfully
- `kubectl get priorityclasses low power medium high` shows all four classes
Reviewed-on: #174
Part of #155 (prerequisite for open-webui deployment PR #172).
## Summary
- Adds `^open-webui/open-webui` to the `ghcr` remote's `immutable_patterns` in `remote-docker.yaml` so version-pinned open-webui image pulls are cached indefinitely through artifactapi
## Test plan
- artifactapi serves `ghcr.io/open-webui/open-webui:<version>` with `X-Artifact-Source: cache` on second fetch
Reviewed-on: #173
Replaces Consul service registration with the native Kubernetes provider so Vault labels its own pods with active/standby/perf-standby status without requiring a Consul dependency.
## Changes
- `values.yaml`: swap `service_registration "consul"` for `service_registration "kubernetes" {}`, add `VAULT_K8S_NAMESPACE` and `VAULT_K8S_POD_NAME` env vars via downward API
- `role_k8s-service-registration.yaml`: Role + RoleBinding granting the `vault` service account `get`/`update`/`patch` on pods
- `kustomization.yaml`: include new RBAC file
Reviewed-on: #171
## Summary
- Adds `open-policy-agent/conftest/.*/conftest_.*_Linux_x86_64.tar.gz$` to the `github` remote immutable patterns in artifactapi
## Why
conftest v0.68.2 (https://github.com/open-policy-agent/conftest/releases/tag/v0.68.2) is now used for OPA policy checks in CI (see #167). Caching the release tarball in artifactapi reduces external dependency on GitHub during builds.
Reviewed-on: #168
## Summary
- Removes `clusterIP: null` from the `puppetdb` Service spec
## Why
Setting `clusterIP: null` makes ArgoCD's desired state explicit about the field being null. Kubernetes assigns a real IP on creation and the field is immutable afterward. The null vs assigned-IP mismatch causes permanent OutOfSync on the puppetdb Service. Removing the field means ArgoCD no longer claims ownership of `clusterIP`, so the API server's value is authoritative.
Reviewed-on: #166
## Summary
- Adds `group: gateway.networking.k8s.io` and `kind: Gateway` to `parentRefs`
- Adds `group: ""`, `kind: Service`, and `weight: 1` to `backendRefs`
## Why
The Gateway API controller defaults these fields when creating/updating TLSRoute objects, so the live state always has them. ArgoCD diffs desired vs live by string comparison, causing the `kanidm` TLSRoute to show permanent OutOfSync. Same root cause as #162 (HTTPRoutes).
Reviewed-on: #165
## Summary
- Changes `server.resources.limits.cpu` from `1000m` to `"1"` in consul Helm values
## Why
`1000m` (1000 milliCPU) is equivalent to `1` CPU, but Kubernetes normalizes the value to `"1"` when storing. ArgoCD diffs desired vs live by string comparison, so the mismatch causes a permanent OutOfSync on the `consul-server` StatefulSet. Same root cause as #163.
Reviewed-on: #164
## Summary
- Changes `limits.memory` from `1024Mi` to `1Gi` (same value, canonical form)
- Changes `limits.cpu` from `1` (integer) to `"1"` (string, canonical form)
## Why
Kubernetes normalizes resource quantities on write — `1024Mi` becomes `1Gi` and integer `1` becomes string `"1"`. ArgoCD diffs by string comparison, so these equivalent values cause a permanent OutOfSync on the `litellm-postgres` Cluster.
Reviewed-on: #163
## Summary
- Adds `group: gateway.networking.k8s.io` and `kind: Gateway` to all `parentRefs` entries
- Adds `group: ""`, `kind: Service`, and `weight: 1` to all `backendRefs` entries
- Affects 9 HTTPRoute files across artifactapi, cattle-system, consul, kanidm, litellm, paperclip, puppet, and vault
## Why
ArgoCD diffs the desired manifest against the live Kubernetes object. The Gateway API controller defaults these fields when creating/updating objects, so the live state always has them — causing persistent OutOfSync for every HTTPRoute. Same root cause as #153 (certificateRefs).
## Test plan
- [ ] All affected ArgoCD applications show Synced after merge
Reviewed-on: #162
## Summary
- Changes both `config-init` init container and `kanidm` container images from `ghcr.io/kanidm/server:1.10.3` to `kanidm/server:1.10.3`
## Why
`kanidm/server` is published on Docker Hub, not ghcr.io. RKE2 rewrites dockerhub pulls through the artifactapi mirror automatically.
## Test plan
- [ ] Pods roll successfully after ArgoCD sync
- [ ] Verify kanidm cluster replication still healthy
Reviewed-on: #161
## Summary
- Removes `^kanidm/` from the `ghcr` remote immutable_patterns
- Adds `^kanidm/` to the `dockerhub` remote immutable_patterns
## Why
`kanidm/server` is published on Docker Hub, not ghcr.io. Pulling via the `ghcr` cache was failing with 403 on anonymous token fetch → 502 Bad Gateway.
## Test plan
- [ ] `docker pull artifactapi.k8s.syd1.au.unkin.net/dockerhub/kanidm/server:1.10.3` succeeds after artifactapi redeploys
Reviewed-on: #160
## Summary
- Adds \`stalwartlabs/webadmin/releases/latest/download/webadmin.zip\` to \`mutable_patterns\` in the \`github\` generic remote so the stalwart webadmin UI can be fetched through artifactapi rather than directly from GitHub.
## Notes
- Uses \`mutable_patterns\` (not \`immutable\`) because \`releases/latest\` resolves to whichever release is current and changes over time.
- Access URL: \`https://artifactapi.k8s.syd1.au.unkin.net/generic/github/stalwartlabs/webadmin/releases/latest/download/webadmin.zip\`
Reviewed-on: #157
The Gateway API admission server defaults certificateRefs[].group to ""
when it is omitted. ArgoCD diffed the desired state (no group field) against
the live state (group: "") and flagged every gateway as out of sync.
Fix: explicitly set group: "" in all certificateRefs entries so the
rendered manifest matches the API server's canonical form exactly.
Affected: artifactapi, cattle-system, consul, litellm, paperclip,
puppet (puppetboard + puppetdb), vault.
Reviewed-on: #153
Vault and consul namespaces were missing from the platform AppProject
allowed destinations, causing ArgoCD sync failures with:
destination server 'https://kubernetes.default.svc' and namespace
'vault' do not match any of the allowed destinations in project 'platform'
Reviewed-on: #152
## Summary
- Deploys HashiCorp Consul 1.22.7 using Helm chart 1.9.7 with 5 server replicas
- Configuration modelled on production consul: \`datacenter=au-syd1\`, \`connect=true\`, \`raft_multiplier=10\`, HTTP on 8500, GRPC on 8502, HTTPS disabled
- 5-replica server cluster with \`bootstrapExpect=5\`
- 10Gi cephrbd-fast-delete PVC per server pod
- Gateway API: HTTPS gateway + HTTPRoute (443→consul-consul-ui:80→8500) at \`consul.k8s.syd1.au.unkin.net\`
- PodDisruptionBudget patched from \`policy/v1beta1\` to \`policy/v1\` (k8s 1.25+ compatibility)
- ArgoCD platform ApplicationSet updated to include consul overlay path
- Clients disabled (server-only deployment)
- ConnectInject disabled (can be enabled later for service mesh)
## Requires
- PR #147 (artifactapi: add hashicorp/consul to docker immutable patterns) to be merged first
## Test plan
- [ ] Sandbox tested in \`sandbox-consul\`: all 5 server pods 1/1 Running, cluster formed
- [ ] After merge: ArgoCD syncs consul namespace
- [ ] Verify \`consul.k8s.syd1.au.unkin.net\` is accessible via Gateway
Reviewed-on: #149
## Summary
- Deploys HashiCorp Vault 2.0.1 using Helm chart 0.32.0 in HA raft mode (5 replicas)
- Configuration modelled on production vault: \`disable_mlock=true\`, headless-DNS retry_join for all 5 pods
- IPC_LOCK capability added via \`server.statefulSet.securityContext.container\`
- 10Gi cephrbd-fast-delete PVC per pod via \`dataStorage\`
- Gateway API: HTTPS gateway + HTTPRoute (443→vault service port 8200) at \`vault.k8s.syd1.au.unkin.net\`
- ArgoCD platform ApplicationSet updated to include vault overlay path
- Injector disabled (no agent sidecar injection needed)
## Requires
- PR #147 (artifactapi: add hashicorp/vault to docker immutable patterns) to be merged first
## Test plan
- [ ] Sandbox tested in \`sandbox-vault\`: all 5 pods Running, raft cluster forming
- [ ] After merge: ArgoCD syncs vault namespace
- [ ] Operator runs \`vault operator init\` to initialize, then unseals all 5 nodes
- [ ] Verify \`vault.k8s.syd1.au.unkin.net\` is accessible via Gateway
Reviewed-on: #148
## Summary
- Upgrades cert-manager from v1.19.2 to v1.20.2
- Enables `enableGatewayAPI: true` via the `ControllerConfiguration` config block
## Why
cert-manager's Gateway API integration was not enabled. Without it, `cert-manager.io/*` annotations on Gateway resources are ignored and no certificates are issued. This is required for the consul and vault PRs (#148, #149) to have their TLS certs automatically provisioned from their Gateway annotations.
In v1.20.2, `ExperimentalGatewayAPISupport` is BETA and defaults to true — enabling `enableGatewayAPI` in the controller config activates the gateway-shim controller.
## Test plan
- [ ] cert-manager rolls out cleanly (v1.20.2 pods become Ready)
- [ ] After rollout, existing Gateway-annotated services (artifactapi, puppet, litellm) retain valid certs
- [ ] New Gateway resources with `cert-manager.io/cluster-issuer` annotations trigger Certificate creation
Reviewed-on: #150
## Summary
- Adds \`^hashicorp/consul\` and \`^hashicorp/vault\` to the dockerhub immutable_patterns in artifactapi's remote-docker.yaml
- Replaces the more specific \`^hashicorp/vault-secrets-operator\` pattern since \`^hashicorp/vault\` subsumes it
- Required for the benvin/vault and benvin/consul branches (vault:2.0.1 and consul:1.22.7)
## Test plan
- [ ] Verify artifactapi accepts requests for hashicorp/vault and hashicorp/consul images after merge
Reviewed-on: #147