163 Commits

Author SHA1 Message Date
unkinben bbb9acba36 feat: add woodpecker service accounts for media terraform repos (#214)
Add Kubernetes ServiceAccounts in the woodpecker namespace for terraform-sonarr, terraform-radarr, and terraform-prowlarr CI pipelines.

Reviewed-on: #214
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
2026-06-28 22:04:33 +10:00
benvin 48f32a044d fix: update TLSRoute to v1 (#213)
TLSRoutes are now in the standard channel, no longer experimental

---------

Co-authored-by: Ben Vincent <ben@unkin.net>
Reviewed-on: #213
2026-06-28 17:50:27 +10:00
unkinben 7f1444fb38 Add Authentik identity provider deployment (#211)
## Summary
- Deploy Authentik (identity.unkin.net) via Helm chart 2026.5.3
- CNPG PostgreSQL cluster (3 instances) with separate rw/ro poolers (2 instances each)
- Redis with 5Gi persistent storage
- Gateway API for HTTPS (identity.unkin.net) and LDAPS (ldap.k8s.syd1.au.unkin.net, ldap.main.unkin.net)
- TLSRoute for LDAPS passthrough, HTTPRoute for external-dns record creation
- Vault secrets for postgres credentials, authentik secret key, and S3 storage credentials
- S3 storage via RadosGW (bucket: authentik)
- 3 server replicas, 2 worker replicas
- Woodpecker ServiceAccount for terraform-authentik CI
- Platform applicationset and project updated

## Dependencies
- terraform-git #15 (merged) — repo definition
- terraform-vault #78 (merged) — auth roles and Consul ACL

## Vault secrets needed before deploy
Write to `kv/kubernetes/namespace/authentik/default/`:
- `postgres-credentials`: username + password
- `authentik-credentials`: AUTHENTIK_SECRET_KEY
- `s3-credentials`: S3 access key + secret key

Reviewed-on: #211
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
2026-06-28 17:42:49 +10:00
unkinben cfca1e5278 Add age-api deployment (#210)
## Summary
- Deploy age-api to the au-syd1 cluster
- Uses configMapGenerator for people config with jaidi, ben, and sudaporn
- Includes gateway, httproute, service, and deployment
- Image: git.unkin.net/unkin/age-api:v0.1.0

Reviewed-on: #210
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
2026-06-28 12:19:38 +10:00
unkinben feaec2c8a9 chore: bump artifactapi + ui to v3.6.5 (#208)
Adds bandwidth saved stat to dashboard.

Reviewed-on: #208
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
2026-06-27 22:27:55 +10:00
unkinben d1cc467455 chore: bump artifactapi + ui to v3.6.4 (#207)
Fixes helm chart URL path duplication for same-host repos (stakater).

Reviewed-on: #207
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
2026-06-27 08:06:26 +10:00
unkinben 0e9ac4d390 chore: bump artifactapi + ui to v3.6.3 (#206)
Includes Docker Accept header forwarding, Content-Type fix, nginx base path fix, and version endpoint fix.

Reviewed-on: #206
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
2026-06-27 07:51:13 +10:00
unkinben 722ced3256 chore: bump artifactapi + ui to v3.6.2 (#205)
Includes Docker Bearer token auth (#60) and UI BASE_PATH build_args fix (#59).

Reviewed-on: #205
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
2026-06-27 00:20:27 +10:00
unkinben 92e6f0f13b chore: bump artifactapi + ui to v3.6.1 (#204)
Rebuilds UI with BASE_PATH=/ui so assets serve under /ui/.

Reviewed-on: #204
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
2026-06-27 00:03:58 +10:00
unkinben 825c46c91b chore: bump artifactapi + ui to v3.6.0 (#203)
Bumps API and UI images from v3.5.0 to v3.6.0.

Reviewed-on: #203
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
2026-06-26 23:57:52 +10:00
unkinben 2c9c79d8f1 fix: update UI health check paths to /ui (#202)
The UI now serves under /ui (artifactapi#58). Health probes need /ui instead of /.

Reviewed-on: #202
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
2026-06-26 23:57:27 +10:00
unkinben f695657d9d refactor: simplify artifactapi routes (#201)
Route /ui → UI service, everything else → API service.

Replaces the growing list of per-prefix rules (/api, /v2, /health) with a single catch-all to the API, so a new route rule is no longer needed every time the API adds a new top-level path.
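
To illustrate the simplification, here is a minimal Python sketch of prefix routing with longest-match precedence, which is how Gateway API orders `PathPrefix` matches. The two route tables, the `ui`/`api` backend labels, and the `/metrics` path are hypothetical stand-ins, not the manifests from this repository.

```python
# Hypothetical route tables: path prefix -> backend label.
ROUTES_BEFORE = {"/api": "api", "/v2": "api", "/health": "api", "/": "ui"}
ROUTES_AFTER = {"/ui": "ui", "/": "api"}


def prefix_matches(prefix: str, path: str) -> bool:
    """Match on whole path segments, so /ui matches /ui/x but not /uix."""
    if prefix == "/":
        return True
    return path == prefix or path.startswith(prefix + "/")


def pick_backend(routes: dict[str, str], path: str) -> str:
    """Return the backend of the longest matching prefix."""
    matches = [prefix for prefix in routes if prefix_matches(prefix, path)]
    return routes[max(matches, key=len)]


for path in ("/ui/index.html", "/v2/", "/health", "/metrics"):
    print(path, pick_backend(ROUTES_BEFORE, path), pick_backend(ROUTES_AFTER, path))
```

In this sketch a new top-level path such as `/metrics` falls through to the UI under the old table but reaches the API under the new one, without any extra rule.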

Reviewed-on: #201
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
2026-06-26 23:39:24 +10:00
unkinben 5dee768170 fix: route /v2 and /health to artifactapi API service (#200)
The v3 route migration (#198) split routes into /api → API and / → UI, but /v2/ (Docker Registry V2 API) and /health now hit the UI catch-all instead of the API backend.

This breaks `docker pull artifactapi.k8s.syd1.au.unkin.net/...` with context deadline exceeded.

Adds /v2 and /health prefix rules before the UI catch-all.

Reviewed-on: #200
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
2026-06-26 23:31:47 +10:00
benvin f120f3b426 fix: rename environment2 to environment (#199)
update the environment secret reference to match what has been
deployed. this prevents a CreateContainerConfigError

---------

Co-authored-by: Ben Vincent <ben@unkin.net>
Reviewed-on: #199
2026-06-26 22:55:24 +10:00
benvin f6d60bd02d feat: artifactapi route change (#198)
complete cutover to artifactapi 3

---------

Co-authored-by: Ben Vincent <ben@unkin.net>
Reviewed-on: #198
2026-06-26 22:50:27 +10:00
benvin aac1b654bb feat: migrate to artifactapi 3+ (#197)
What changed:
- Adds new v3 API and UI deployments (separate api-deployment.yaml, ui-deployment.yaml) alongside the existing monolithic artifactapi-deployment.yaml
- Adds CNPG PostgreSQL cluster + pooler to replace the standalone postgres deployment
- Adds new api-env configmap, new Vault secrets (postgres-credentials, environment), and a second VaultAuth (default1)
- Adds new services targeting the split api and ui selectors
- Adds HPAs for both new deployments
- Updates kustomization to include all new resources

---------

Co-authored-by: Ben Vincent <ben@unkin.net>
Reviewed-on: #197
2026-06-26 22:18:07 +10:00
benvin 1c6e087116 chore: cleanup artifactory3 mess (#196)
attempted to let claude deploy a new version of artifactory with
terrible results. this change is to remove that mess so I can start
again.

---------

Co-authored-by: Ben Vincent <ben@unkin.net>
Reviewed-on: #196
2026-06-21 17:40:17 +10:00
benvin 9e6efb7c78 🤦 (#195)
Co-authored-by: Ben Vincent <ben@unkin.net>
Reviewed-on: #195
2026-06-21 17:30:47 +10:00
benvin cae42b4896 feat: manage postgres-credentials for artifactapi3 (#194)
pull credentials for postgres/cnpg from vault

---------

Co-authored-by: Ben Vincent <ben@unkin.net>
Reviewed-on: #194
2026-06-21 17:26:26 +10:00
benvin 349dc5fd01 chore: remove middleware resource (#193)
there is no crd for this, preventing the deployment of artifactapi 3

---------

Co-authored-by: Ben Vincent <ben@unkin.net>
Reviewed-on: #193
2026-06-21 09:10:49 +10:00
benvin 8cbd645332 feat: deploy artifactapi3 (#192)
just enough to test the terraform deployment and begin migration. have
changed to cnpg for the database and a new bucket for storage

---------

Co-authored-by: Ben Vincent <ben@unkin.net>
Reviewed-on: #192
2026-06-20 12:22:22 +10:00
benvin ad2cdd3b63 fix: update woodpecker kustomization (#191)
Reviewed-on: #191
2026-06-17 21:34:02 +10:00
benvin 17782d716c feat: enable terraform-artifactapi jobs (#190)
woodpecker jobs for terraform-artifactapi use the service account of the
same name to run jobs, so that they can access specific secrets

- add terraform-artifactapi serviceaccount

---------

Co-authored-by: Ben Vincent <ben@unkin.net>
Reviewed-on: #190
2026-06-17 21:23:49 +10:00
unkinben 188c39f85d feat: add terraform-git service account for woodpecker CI (#189)
## Summary
- Add ServiceAccount terraform-git in woodpecker namespace for terraform-git CI pipelines
- Add to kustomization.yaml

## Test plan
- [ ] Verify ArgoCD syncs the new service account
- [ ] Verify woodpecker CI can use the service account

Reviewed-on: #189
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
2026-06-07 20:36:55 +10:00
unkinben 0b7819bda3 chore: bump almalinux9 image tags (#188)
Bump almalinux9 image tags to 20260606

Reviewed-on: #188
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
2026-06-07 00:35:12 +10:00
benvin 3c6330ebfd benvin/gitea (#187)
Co-authored-by: Ben Vincent <ben@unkin.net>
Reviewed-on: #187
2026-06-06 19:47:16 +10:00
unkinben a3a56d0c2b chore: add almalinux-vault repos (#186)
- 9.7 is end of life, ensure that we can still query packages

Reviewed-on: #186
2026-06-02 23:13:45 +10:00
unkinben 4b1fbe1fe1 feat(kanidm): scale down to single replica, remove replication (#185)
Drop from 3 replicas to 1. Remove init container, repl-certs secret,
replication port, podAntiAffinity, server-1/2 configs, and replication
stanza from server-0.toml. Mount configmap directly via subPath.

Reviewed-on: #185
2026-06-02 22:41:28 +10:00
unkinben 666f3d055c feat: add sessionaffinity to kanidm service (#184)
- required as traefik is in passthrough mode

Reviewed-on: #184
2026-06-02 21:17:51 +10:00
unkinben 3dc8801070 fix(kanidm): fix automatic_refresh TOML generation in init container (#182)
## Summary

- The `\n` escape in a shell variable wasn't interpreted as a newline when passed as a `printf %s` argument
- This caused `automatic_refresh = true` to be appended to the `partner_cert` string value on the same line, breaking TOML parsing on kanidm-2
- Fixed by using separate `printf` calls per peer type, with `\n` in the format string (not a variable) where it is correctly interpreted
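
The behaviour described above can be modelled with a small Python sketch. `shell_printf` is a hypothetical toy stand-in for the shell's `printf` (it only handles `\n` and `%s`), and the certificate value and exact format strings are assumptions, since the init container script itself is not shown here.

```python
def shell_printf(fmt: str, *args: str) -> str:
    """Toy model of printf: the newline escape is decoded in the format
    string only; %s arguments are copied through verbatim."""
    return fmt.replace("\\n", "\n") % args


CERT = "example-cert"  # hypothetical placeholder value

# Broken: the escape travels inside a variable and is passed as a %s argument.
extra = r"\nautomatic_refresh = true"
broken = shell_printf(r'partner_cert = "%s"%s\n', CERT, extra)

# Fixed: one call per line, with the escape in the format string itself.
fixed = shell_printf(r'partner_cert = "%s"\n', CERT) + shell_printf(
    r"automatic_refresh = true\n"
)

print(broken)
print(fixed)
```

In the broken case the output is a single line with a literal backslash-n after the certificate string, which is what invalidates the TOML. In the fixed case `automatic_refresh = true` lands on its own line.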

## Test plan

- [ ] kanidm-2 init container generates valid TOML with `automatic_refresh = true` on its own line under the kanidm-0 peer section
- [ ] kanidm-1 and kanidm-2 start successfully and auto-refresh domain UUID from kanidm-0

Reviewed-on: #182
2026-05-31 00:25:21 +10:00
unkinben 60f1f3130b fix(kanidm): replicate 1/2 from 0 only with automatic_refresh (#181)
kanidm-0 is the authoritative supplier; kanidm-1 and kanidm-2 pull
from kanidm-0 only. automatic_refresh = true on the kanidm-0 peer
entry for kanidm-1/2 so fresh nodes auto-sync domain UUID on restart.

Reviewed-on: #181
2026-05-31 00:20:30 +10:00
unkinben b6f8cb0633 feat: autorestart statefulset (#180)
- ensure kanidm is restarted with vault secrets

Reviewed-on: #180
2026-05-30 23:40:07 +10:00
unkinben f11ec1056d fix(kanidm): remove invalid automatic_refresh from replication config (#179)
Reviewed-on: #179
2026-05-30 23:20:48 +10:00
unkinben ed7feaf19a Update apps/base/kanidm/vaultauth.yaml (#177)
Fix the VaultAuth object

Reviewed-on: #177
2026-05-30 23:11:38 +10:00
unkinben 4d594fbde7 feat(kanidm): vault-managed replication certs with auto-restart (#176)
- Store per-pod replication certs in Vault (kv/kubernetes/namespace/kanidm/default/repl-certs)
- VaultAuth + VaultStaticSecret sync certs to kanidm-repl-certs Secret
- busybox config-init init container injects peer certs from Secret into server.toml at startup
- Remove hardcoded partner_cert entries from per-pod server.toml templates
- Add automatic_refresh = true to all replication configs
- Add reloader.stakater.com/auto annotation to trigger rolling restart on ConfigMap/Secret changes
- Document domain UUID mismatch resolution and cert rotation in README

Reviewed-on: #176
2026-05-30 23:00:46 +10:00
unkinben ede25a3858 feat(platform): add priority-classes app with low/power/medium/high classes (#174)
## Summary
- New `apps/base/priority-classes/` app with four `PriorityClass` objects managed via the `platform` ArgoCD project
- Adds `apps/overlays/*/priority-classes` to the platform ApplicationSet generator
- Adds `priority-classes` namespace to platform AppProject destinations (required even for cluster-scoped resources)

| Class | Value | PreemptionPolicy | Intent |
|---|---|---|---|
| `low` | 100 | Never | Background work; evictable, won't preempt others |
| `power` | 100 | Never | Compute-heavy but expendable (e.g. AI/ML workloads) |
| `medium` | 10000 | PreemptLowerPriority | Standard services |
| `high` | 100000 | PreemptLowerPriority | Critical services; preempts lower-priority pods |

`PriorityClass` is already in the platform project's `clusterResourceWhitelist` so no project policy changes were needed.
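
As a rough illustration of how the table reads, the following Python sketch encodes the four classes and a simplified preemption check. The `may_preempt` helper is hypothetical and deliberately ignores everything else the scheduler considers (for example whether evicting a pod would actually make room).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PriorityClass:
    name: str
    value: int
    preemption_policy: str


# Values copied from the table above.
CLASSES = {
    "low": PriorityClass("low", 100, "Never"),
    "power": PriorityClass("power", 100, "Never"),
    "medium": PriorityClass("medium", 10000, "PreemptLowerPriority"),
    "high": PriorityClass("high", 100000, "PreemptLowerPriority"),
}


def may_preempt(pending: str, running: str) -> bool:
    """Simplified rule: the pending pod's class must allow preemption
    and must have a strictly higher value than the running pod's class."""
    p, r = CLASSES[pending], CLASSES[running]
    return p.preemption_policy == "PreemptLowerPriority" and p.value > r.value


print(may_preempt("high", "medium"))
print(may_preempt("low", "power"))
```

Under this simplified rule `low` and `power` never preempt anything, while both remain candidates for preemption by `medium` and `high`.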

## Test plan
- ArgoCD syncs `platform-priority-classes` successfully
- `kubectl get priorityclasses low power medium high` shows all four classes

Reviewed-on: #174
2026-05-26 23:41:54 +10:00
unkinben f5f713fe86 feat(artifactapi): add open-webui/open-webui to ghcr immutable patterns (#173)
Part of #155 (prerequisite for open-webui deployment PR #172).

## Summary
- Adds `^open-webui/open-webui` to the `ghcr` remote's `immutable_patterns` in `remote-docker.yaml` so version-pinned open-webui image pulls are cached indefinitely through artifactapi

## Test plan
- artifactapi serves `ghcr.io/open-webui/open-webui:<version>` with `X-Artifact-Source: cache` on second fetch

Reviewed-on: #173
2026-05-26 23:28:27 +10:00
unkinben 3990fbfe06 feat(vault): switch to Kubernetes service registration (#171)
Replaces Consul service registration with the native Kubernetes provider so Vault labels its own pods with active/standby/perf-standby status without requiring a Consul dependency.

## Changes
- `values.yaml`: swap `service_registration "consul"` for `service_registration "kubernetes" {}`, add `VAULT_K8S_NAMESPACE` and `VAULT_K8S_POD_NAME` env vars via downward API
- `role_k8s-service-registration.yaml`: Role + RoleBinding granting the `vault` service account `get`/`update`/`patch` on pods
- `kustomization.yaml`: include new RBAC file

Reviewed-on: #171
2026-05-26 00:06:56 +10:00
unkinben d358098fff chore: update replication certs (#170)
- add replication certs for kanidm-0, kanidm-1 and kanidm-2

Reviewed-on: #170
2026-05-25 23:52:06 +10:00
unkinben 201e601737 feat: update kanidm replication (#169)
- split to per-server configs
- remove init containers that attempted to automate the replication config
- add README.md

Reviewed-on: #169
2026-05-25 23:25:48 +10:00
unkinben d230d87ec9 feat(artifactapi): add conftest to GitHub generic remote cache (#168)
## Summary

- Adds `open-policy-agent/conftest/.*/conftest_.*_Linux_x86_64.tar.gz$` to the `github` remote immutable patterns in artifactapi

## Why

conftest v0.68.2 (https://github.com/open-policy-agent/conftest/releases/tag/v0.68.2) is now used for OPA policy checks in CI (see #167). Caching the release tarball in artifactapi reduces external dependency on GitHub during builds.
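
A quick way to sanity-check a pattern like this is to try it against sample paths. The sketch below assumes the pattern is evaluated with search semantics in a regex dialect compatible with Python's `re`, which is an assumption about artifactapi rather than something stated here. The sample paths are illustrative.

```python
import re

# Pattern as given in the summary above.
PATTERN = re.compile(
    r"open-policy-agent/conftest/.*/conftest_.*_Linux_x86_64.tar.gz$"
)


def is_immutable(path: str) -> bool:
    """Return True if the path matches the immutable pattern (search semantics assumed)."""
    return PATTERN.search(path) is not None


samples = [
    "open-policy-agent/conftest/releases/download/v0.68.2/conftest_0.68.2_Linux_x86_64.tar.gz",
    "open-policy-agent/conftest/releases/download/v0.68.2/conftest_0.68.2_Darwin_arm64.tar.gz",
    "open-policy-agent/conftest/releases/download/v0.68.2/conftest_0.68.2_Linux_x86_64.tar.gz.sha256",
]
for path in samples:
    print(is_immutable(path), path)
```

Note that the dots in the pattern are unescaped, so they match any character. That is harmless for this use but slightly broader than a literal match.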

Reviewed-on: #168
2026-05-25 22:44:57 +10:00
unkinben 6497dab25e fix(puppet): remove explicit clusterIP: null from puppetdb Service (#166)
## Summary

- Removes `clusterIP: null` from the `puppetdb` Service spec

## Why

Setting `clusterIP: null` makes ArgoCD's desired state explicit about the field being null. Kubernetes assigns a real IP on creation and the field is immutable afterward. The null vs assigned-IP mismatch causes permanent OutOfSync on the puppetdb Service. Removing the field means ArgoCD no longer claims ownership of `clusterIP`, so the API server's value is authoritative.

Reviewed-on: #166
2026-05-25 22:44:24 +10:00
unkinben f403c6b05d fix(kanidm): add explicit group/kind/weight to TLSRoute refs (#165)
## Summary

- Adds `group: gateway.networking.k8s.io` and `kind: Gateway` to `parentRefs`
- Adds `group: ""`, `kind: Service`, and `weight: 1` to `backendRefs`

## Why

The Gateway API controller defaults these fields when creating/updating TLSRoute objects, so the live state always has them. ArgoCD diffs desired vs live by string comparison, causing the `kanidm` TLSRoute to show permanent OutOfSync. Same root cause as #162 (HTTPRoutes).

Reviewed-on: #165
2026-05-25 22:43:52 +10:00
unkinben dd282f59fb fix(litellm): normalize postgres cluster resource values (#163)
## Summary

- Changes `limits.memory` from `1024Mi` to `1Gi` (same value, canonical form)
- Changes `limits.cpu` from `1` (integer) to `"1"` (string, canonical form)

## Why

Kubernetes normalizes resource quantities on write — `1024Mi` becomes `1Gi` and integer `1` becomes string `"1"`. ArgoCD diffs by string comparison, so these equivalent values cause a permanent OutOfSync on the `litellm-postgres` Cluster.
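
The equivalence can be checked with a small Python sketch. `parse_memory` is a hypothetical helper that handles only integer quantities with binary suffixes, not the full Kubernetes quantity grammar.

```python
# Binary suffixes and their multipliers in bytes.
BINARY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def parse_memory(quantity: str) -> int:
    """Convert a quantity such as '1024Mi' to a byte count."""
    for suffix, factor in BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # plain byte count, no suffix


desired, live = "1024Mi", "1Gi"
print(desired == live)                              # textual comparison
print(parse_memory(desired) == parse_memory(live))  # semantic comparison
```

The two strings differ as text but parse to the same number of bytes, which is why writing the canonical form in the manifest removes the diff.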

Reviewed-on: #163
2026-05-24 23:30:10 +10:00
unkinben 1890dd4bda fix(gateways): add explicit group/kind/weight to all HTTPRoute refs (#162)
## Summary

- Adds `group: gateway.networking.k8s.io` and `kind: Gateway` to all `parentRefs` entries
- Adds `group: ""`, `kind: Service`, and `weight: 1` to all `backendRefs` entries
- Affects 9 HTTPRoute files across artifactapi, cattle-system, consul, kanidm, litellm, paperclip, puppet, and vault

## Why

ArgoCD diffs the desired manifest against the live Kubernetes object. The Gateway API controller defaults these fields when creating/updating objects, so the live state always has them — causing persistent OutOfSync for every HTTPRoute. Same root cause as #153 (certificateRefs).
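
The effect of server-side defaulting can be sketched in Python. The default values below are the Gateway API defaults for `backendRefs` named in the summary. The service name and port are hypothetical, and the plain equality check is a deliberately naive stand-in, not Argo CD's actual diff engine.

```python
# Defaults applied to a backendRef when the fields are omitted.
BACKEND_REF_DEFAULTS = {"group": "", "kind": "Service", "weight": 1}


def with_defaults(ref: dict, defaults: dict) -> dict:
    """Return the ref as it would look live, with omitted fields defaulted."""
    return {**defaults, **ref}


# Hypothetical backendRef entries; name and port are illustrative only.
minimal = {"name": "example-svc", "port": 8080}
explicit = {"group": "", "kind": "Service", "name": "example-svc", "port": 8080, "weight": 1}

print(minimal == with_defaults(minimal, BACKEND_REF_DEFAULTS))    # False: drift
print(explicit == with_defaults(explicit, BACKEND_REF_DEFAULTS))  # True: in sync
```

Writing the defaulted fields explicitly makes the desired manifest identical to the live object, so there is nothing left to report as a difference.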

## Test plan

- [ ] All affected ArgoCD applications show Synced after merge

Reviewed-on: #162
2026-05-24 20:32:37 +10:00
unkinben 6815b66010 fix(kanidm): use dockerhub image instead of ghcr.io (#161)
## Summary

- Changes both `config-init` init container and `kanidm` container images from `ghcr.io/kanidm/server:1.10.3` to `kanidm/server:1.10.3`

## Why

`kanidm/server` is published on Docker Hub, not ghcr.io. RKE2 rewrites dockerhub pulls through the artifactapi mirror automatically.

## Test plan

- [ ] Pods roll successfully after ArgoCD sync
- [ ] Verify kanidm cluster replication still healthy

Reviewed-on: #161
2026-05-24 20:27:21 +10:00
unkinben 7cbec33588 fix(artifactapi): move kanidm to dockerhub remote (#160)
## Summary

- Removes `^kanidm/` from the `ghcr` remote immutable_patterns
- Adds `^kanidm/` to the `dockerhub` remote immutable_patterns

## Why

`kanidm/server` is published on Docker Hub, not ghcr.io. Pulling via the `ghcr` cache was failing with 403 on anonymous token fetch → 502 Bad Gateway.

## Test plan

- [ ] `docker pull artifactapi.k8s.syd1.au.unkin.net/dockerhub/kanidm/server:1.10.3` succeeds after artifactapi redeploys

Reviewed-on: #160
2026-05-24 20:24:33 +10:00
unkinben 3756208ccd benvin/kanidm (#159)
Reviewed-on: #159
2026-05-24 19:55:22 +10:00
unkinben 6ce92e8ead benvin/artifactapi-mail-images (#158)
Reviewed-on: #158
2026-05-24 14:44:38 +10:00
unkinben af79d86db6 feat(artifactapi): cache stalwart webadmin zip (#157)
## Summary

- Adds `stalwartlabs/webadmin/releases/latest/download/webadmin.zip` to `mutable_patterns` in the `github` generic remote so the stalwart webadmin UI can be fetched through artifactapi rather than directly from GitHub.

## Notes

- Uses `mutable_patterns` (not `immutable`) because `releases/latest` resolves to whichever release is current and changes over time.
- Access URL: `https://artifactapi.k8s.syd1.au.unkin.net/generic/github/stalwartlabs/webadmin/releases/latest/download/webadmin.zip`

Reviewed-on: #157
2026-05-24 12:55:16 +10:00