References the CRD bundle from the bind-operator repo by a stable raw URL
so the CRDs never drift from the operator, matching how other apps import
upstream manifests.
- replace the nine vendored crds/*.yaml with a single remote resource:
git.unkin.net/unkin/bind-operator raw config/crd/install.yaml at v0.1.1
- bump the operator image to v0.1.1 so the running operator and its CRDs
come from the same tag
The dynamic cluster mode was removed from the operator; RFC2136 update
capability is a per-zone property, not a cluster role. The external-dns
tier is an authoritative cluster whose zones set dynamicUpdate.
- switch binddns-externaldns BindCluster to mode authoritative
- regenerate bindcluster schema (enum: authoritative, resolver)
Adds the bind-operator and the three BindClusters that replace the
Puppet-managed BIND estate (authoritative / resolver / external-dns).
- add apps/base/bind-system: 9 CRDs, operator Deployment, RBAC (ns bind-system)
- add apps/base/binddns-auth: authoritative BindCluster + catalog zone + TSIG key
- add apps/base/binddns-resolver: recursive-resolver BindCluster with forwarders
- add apps/base/binddns-externaldns: dynamic (RFC2136) BindCluster + TSIG key
- add au-syd1 overlays for all four apps
- register the four apps in the platform ApplicationSet
- add binddns-* namespaces to the platform AppProject destinations
- add schemas/bind.unkin.net/*.json so kubeconform validates the new CRs
DNS Services are LoadBalancer via PureLB. TSIG key material is generated by
the operator into Secrets at runtime (no plain Secrets in git).
## Why
artifactapi images `v3.7.3` are built and pushed to the registry, but au-syd1 is still running `v3.6.5`. This rolls the deployment forward to pick up the recent fixes.
## Changes
- `api-deployment`: `artifactapi` `v3.6.5` → `v3.7.3`
- `ui-deployment`: `artifactapi-ui` `v3.6.5` → `v3.7.3`
Included in v3.7.x since v3.6.5:
- Local-repo files now appear in the cached-objects UI (#99).
- Evicting a local RPM prunes its repodata metadata (#100).
- The bare domain redirects to the web UI at /ui (#101).
Reviewed-on: #215
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
Add Kubernetes ServiceAccounts in the woodpecker namespace for terraform-sonarr, terraform-radarr, and terraform-prowlarr CI pipelines.
Reviewed-on: #214
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
## Summary
- New `ci/generate-schemas.sh` script that generates JSON schemas from three sources:
1. Live cluster CRDs via `kubectl get crds`
2. Offline CRD manifests (ArgoCD v3.3.2, Gateway API v1.5.1)
3. Kubernetes v1.33.7 swagger spec for native types
- Schemas follow Datree catalog convention (`<group>/<Kind>_<version>.json`)
- `validate-apps.sh` and `validate-clusters.sh` check local schemas first, falling back to remote
- Fixes TLSRoute (and other CRD) schema validation failures in kubeconform
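
A minimal sketch of the local-first lookup described above, assuming the `<group>/<Kind>_<version>.json` layout. The `schema_location` helper, the `SCHEMA_DIR` default, and the `REMOTE_BASE` URL are hypothetical names for illustration; the real `validate-apps.sh` and `validate-clusters.sh` may implement the fallback differently.

```shell
#!/bin/sh
# Hypothetical defaults; neither value is taken from the actual scripts.
SCHEMA_DIR=${SCHEMA_DIR:-schemas}
REMOTE_BASE=${REMOTE_BASE:-https://example.invalid/schemas}

# Print where the schema for <group> <Kind> <version> should be read from:
# the local file if it exists, otherwise the remote URL.
schema_location() {
  rel="$1/$2_$3.json"
  if [ -f "$SCHEMA_DIR/$rel" ]; then
    printf '%s\n' "$SCHEMA_DIR/$rel"
  else
    printf '%s\n' "$REMOTE_BASE/$rel"
  fi
}
```

Calling `schema_location <group> <Kind> <version>` prints a local path when the generated schema is present and a remote URL otherwise.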
## Sources
- ArgoCD: `artifactapi.../argoproj/argo-cd/refs/tags/v3.3.2/manifests/ha/install.yaml`
- Gateway API: `artifactapi.../kubernetes-sigs/gateway-api/releases/download/v1.5.1/standard-install.yaml`
- Kubernetes: `artifactapi.../kubernetes/kubernetes/refs/tags/v1.33.7/api/openapi-spec/swagger.json`
Reviewed-on: #212
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
## Summary
- Deploy age-api to the au-syd1 cluster
- Uses configMapGenerator for people config with jaidi, ben, and sudaporn
- Includes gateway, httproute, service, and deployment
- Image: git.unkin.net/unkin/age-api:v0.1.0
Reviewed-on: #210
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
cache frequent lookups to prevent 400 errors from github. the schemas
are available via artifactapi.
---------
Co-authored-by: Ben Vincent <ben@unkin.net>
Reviewed-on: #209
Fixes helm chart URL path duplication for same-host repos (stakater).
Reviewed-on: #207
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
Includes Docker Accept header forwarding, Content-Type fix, nginx base path fix, and version endpoint fix.
Reviewed-on: #206
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
Includes Docker Bearer token auth (#60) and UI BASE_PATH build_args fix (#59).
Reviewed-on: #205
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
Rebuilds UI with BASE_PATH=/ui so assets serve under /ui/.
Reviewed-on: #204
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
Bumps API and UI images from v3.5.0 to v3.6.0.
Reviewed-on: #203
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
The UI now serves under /ui (artifactapi#58). Health probes need /ui instead of /.
Reviewed-on: #202
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
Route /ui → UI service, everything else → API service.
Replaces the growing list of per-prefix rules (/api, /v2, /health) with a single catch-all to the API. No more needing to add a route rule every time the API adds a new top-level path.
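
As an illustration only, the resulting routing decision can be sketched as a shell function. The `route_backend` name and the `ui`/`api` labels are hypothetical; the real rules live in the HTTPRoute, and this assumes prefix matching on whole path segments.

```shell
#!/bin/sh
# Decide which backend a request path would reach under the two-rule layout:
# /ui (and anything below it) goes to the UI, everything else to the API.
route_backend() {
  case "$1" in
    /ui|/ui/*) printf '%s\n' ui ;;
    *)         printf '%s\n' api ;;
  esac
}

route_backend /ui/assets/app.js
route_backend /v2/
route_backend /health
```

New top-level API paths fall through to the catch-all, so no route change is needed when one is added.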
Reviewed-on: #201
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
The v3 route migration (#198) split routes into /api → API and / → UI, but /v2/ (Docker Registry V2 API) and /health now hit the UI catch-all instead of the API backend.
This breaks `docker pull artifactapi.k8s.syd1.au.unkin.net/...` with context deadline exceeded.
Adds /v2 and /health prefix rules before the UI catch-all.
Reviewed-on: #200
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
update the environment secret reference to match what has been
deployed. this prevents a CreateContainerConfigError
---------
Co-authored-by: Ben Vincent <ben@unkin.net>
Reviewed-on: #199
What changed:
- Adds new v3 API and UI deployments (separate api-deployment.yaml, ui-deployment.yaml) alongside the existing monolithic artifactapi-deployment.yaml
- Adds CNPG PostgreSQL cluster + pooler to replace the standalone postgres deployment
- Adds new api-env configmap, new Vault secrets (postgres-credentials, environment), and a second VaultAuth (default1)
- Adds new services targeting the split api and ui selectors
- Adds HPAs for both new deployments
- Updates kustomization to include all new resources
---------
Co-authored-by: Ben Vincent <ben@unkin.net>
Reviewed-on: #197
attempted to let claude deploy a new version of artifactory with
terrible results. this change is to remove that mess so I can start
again.
---------
Co-authored-by: Ben Vincent <ben@unkin.net>
Reviewed-on: #196
just enough to test the terraform deployment and begin migration. have
changed to cnpg for the database and a new bucket for storage
---------
Co-authored-by: Ben Vincent <ben@unkin.net>
Reviewed-on: #192
woodpecker jobs for terraform-artifactapi run under the service account of
the same name, so that they can access specific secrets
- add terraform-artifactapi serviceaccount
---------
Co-authored-by: Ben Vincent <ben@unkin.net>
Reviewed-on: #190
## Summary
- Add ServiceAccount terraform-git in woodpecker namespace for terraform-git CI pipelines
- Add to kustomization.yaml
## Test plan
- [ ] Verify ArgoCD syncs the new service account
- [ ] Verify woodpecker CI can use the service account
Reviewed-on: #189
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
Drop from 3 replicas to 1. Remove init container, repl-certs secret,
replication port, podAntiAffinity, server-1/2 configs, and replication
stanza from server-0.toml. Mount configmap directly via subPath.
Reviewed-on: #185
## Summary
- The `\n` escape in a shell variable wasn't interpreted as a newline when passed as a `printf %s` argument
- This caused `automatic_refresh = true` to be appended to the `partner_cert` string value on the same line, breaking TOML parsing on kanidm-2
- Fixed by using separate `printf` calls per peer type, with `\n` in the format string (not a variable) where it is correctly interpreted
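
The behaviour can be reproduced in isolation. This is a minimal sketch, not the init container's actual script: the `cert` value and variable names are placeholders, and only the `printf` semantics are the point.

```shell
#!/bin/sh
# Placeholder value standing in for the real partner certificate string.
cert='"abc123"'

# Broken: \n is stored in a variable and passed as a %s argument, so printf
# emits a literal backslash and "n" instead of a newline.
suffix='\nautomatic_refresh = true'
broken=$(printf 'partner_cert = %s%s' "$cert" "$suffix")

# Fixed: \n is part of the format string, where printf interprets it.
fixed=$(printf 'partner_cert = %s\nautomatic_refresh = true' "$cert")

printf '%s\n' "$broken"
printf '%s\n' "$fixed"
```

The first result stays on a single line, which is what broke TOML parsing; the second puts `automatic_refresh = true` on its own line.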
## Test plan
- [ ] kanidm-2 init container generates valid TOML with `automatic_refresh = true` on its own line under the kanidm-0 peer section
- [ ] kanidm-1 and kanidm-2 start successfully and auto-refresh domain UUID from kanidm-0
Reviewed-on: #182
kanidm-0 is the authoritative supplier; kanidm-1 and kanidm-2 pull
from kanidm-0 only. automatic_refresh = true on the kanidm-0 peer
entry for kanidm-1/2 so fresh nodes auto-sync domain UUID on restart.
Reviewed-on: #181
## Summary
Sets `WOODPECKER_BACKEND_K8S_PRIORITY_CLASS: power` on the Woodpecker agent so all CI pipeline pods are scheduled with the `power` PriorityClass (value 100, preemptionPolicy: Never).
This means pipeline pods can be evicted when the cluster is under pressure but won't preempt other workloads.
## Dependency
Requires the `power` PriorityClass to exist on the cluster — deploy PR #174 (priority-classes app) first.
## Test plan
- Trigger a pipeline run and confirm pods are created with `priorityClassName: power`
- `kubectl get pod -n woodpecker -o jsonpath='{.items[*].spec.priorityClassName}'`
Reviewed-on: #175
## Summary
- New `apps/base/priority-classes/` app with four `PriorityClass` objects managed via the `platform` ArgoCD project
- Adds `apps/overlays/*/priority-classes` to the platform ApplicationSet generator
- Adds `priority-classes` namespace to platform AppProject destinations (required even for cluster-scoped resources)
| Class | Value | PreemptionPolicy | Intent |
|---|---|---|---|
| `low` | 100 | Never | Background work; evictable, won't preempt others |
| `power` | 100 | Never | Compute-heavy but expendable (e.g. AI/ML workloads) |
| `medium` | 10000 | PreemptLowerPriority | Standard services |
| `high` | 100000 | PreemptLowerPriority | Critical services; preempts lower-priority pods |
`PriorityClass` is already in the platform project's `clusterResourceWhitelist` so no project policy changes were needed.
## Test plan
- ArgoCD syncs `platform-priority-classes` successfully
- `kubectl get priorityclasses low power medium high` shows all four classes
Reviewed-on: #174
Part of #155 (prerequisite for open-webui deployment PR #172).
## Summary
- Adds `^open-webui/open-webui` to the `ghcr` remote's `immutable_patterns` in `remote-docker.yaml` so version-pinned open-webui image pulls are cached indefinitely through artifactapi
## Test plan
- artifactapi serves `ghcr.io/open-webui/open-webui:<version>` with `X-Artifact-Source: cache` on second fetch
Reviewed-on: #173
Replaces Consul service registration with the native Kubernetes provider so Vault labels its own pods with active/standby/perf-standby status without requiring a Consul dependency.
## Changes
- `values.yaml`: swap `service_registration "consul"` for `service_registration "kubernetes" {}`, add `VAULT_K8S_NAMESPACE` and `VAULT_K8S_POD_NAME` env vars via downward API
- `role_k8s-service-registration.yaml`: Role + RoleBinding granting the `vault` service account `get`/`update`/`patch` on pods
- `kustomization.yaml`: include new RBAC file
Reviewed-on: #171
## Summary
- Adds `open-policy-agent/conftest/.*/conftest_.*_Linux_x86_64.tar.gz$` to the `github` remote immutable patterns in artifactapi
## Why
conftest v0.68.2 (https://github.com/open-policy-agent/conftest/releases/tag/v0.68.2) is now used for OPA policy checks in CI (see #167). Caching the release tarball in artifactapi reduces external dependency on GitHub during builds.
Reviewed-on: #168
## Summary
- Removes `clusterIP: null` from the `puppetdb` Service spec
## Why
Setting `clusterIP: null` makes ArgoCD's desired state explicit about the field being null. Kubernetes assigns a real IP on creation and the field is immutable afterward. The null vs assigned-IP mismatch causes permanent OutOfSync on the puppetdb Service. Removing the field means ArgoCD no longer claims ownership of `clusterIP`, so the API server's value is authoritative.
Reviewed-on: #166
## Summary
- Adds `group: gateway.networking.k8s.io` and `kind: Gateway` to `parentRefs`
- Adds `group: ""`, `kind: Service`, and `weight: 1` to `backendRefs`
## Why
The Gateway API controller defaults these fields when creating/updating TLSRoute objects, so the live state always has them. ArgoCD diffs desired vs live by string comparison, causing the `kanidm` TLSRoute to show permanent OutOfSync. Same root cause as #162 (HTTPRoutes).
Reviewed-on: #165
## Summary
- Changes `server.resources.limits.cpu` from `1000m` to `"1"` in consul Helm values
## Why
`1000m` (1000 milliCPU) is equivalent to `1` CPU, but Kubernetes normalizes the value to `"1"` when storing. ArgoCD diffs desired vs live by string comparison, so the mismatch causes a permanent OutOfSync on the `consul-server` StatefulSet. Same root cause as #163.
Reviewed-on: #164
## Summary
- Changes `limits.memory` from `1024Mi` to `1Gi` (same value, canonical form)
- Changes `limits.cpu` from `1` (integer) to `"1"` (string, canonical form)
## Why
Kubernetes normalizes resource quantities on write — `1024Mi` becomes `1Gi` and integer `1` becomes string `"1"`. ArgoCD diffs by string comparison, so these equivalent values cause a permanent OutOfSync on the `litellm-postgres` Cluster.
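
A small sketch showing that the old and new values are numerically identical even though the strings differ. The `cpu_millis` and `mem_bytes` helpers are hypothetical and only handle whole-number quantities with the suffixes used in this change.

```shell
#!/bin/sh
# Convert a CPU quantity ("1000m" or "1") to millicores.
cpu_millis() {
  case "$1" in
    *m) printf '%s\n' "${1%m}" ;;
    *)  printf '%s\n' "$(( $1 * 1000 ))" ;;
  esac
}

# Convert a memory quantity with a Mi or Gi suffix to bytes.
mem_bytes() {
  case "$1" in
    *Gi) printf '%s\n' "$(( ${1%Gi} * 1024 * 1024 * 1024 ))" ;;
    *Mi) printf '%s\n' "$(( ${1%Mi} * 1024 * 1024 ))" ;;
    *)   return 1 ;;
  esac
}

# Same quantity, different spelling.
[ "$(mem_bytes 1024Mi)" = "$(mem_bytes 1Gi)" ] && echo "memory: equal"
[ "$(cpu_millis 1000m)" = "$(cpu_millis 1)" ] && echo "cpu: equal"
```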
Reviewed-on: #163
## Summary
- Adds `group: gateway.networking.k8s.io` and `kind: Gateway` to all `parentRefs` entries
- Adds `group: ""`, `kind: Service`, and `weight: 1` to all `backendRefs` entries
- Affects 9 HTTPRoute files across artifactapi, cattle-system, consul, kanidm, litellm, paperclip, puppet, and vault
## Why
ArgoCD diffs the desired manifest against the live Kubernetes object. The Gateway API controller defaults these fields when creating/updating objects, so the live state always has them — causing persistent OutOfSync for every HTTPRoute. Same root cause as #153 (certificateRefs).
## Test plan
- [ ] All affected ArgoCD applications show Synced after merge
Reviewed-on: #162
## Summary
- Changes both `config-init` init container and `kanidm` container images from `ghcr.io/kanidm/server:1.10.3` to `kanidm/server:1.10.3`
## Why
`kanidm/server` is published on Docker Hub, not ghcr.io. RKE2 rewrites dockerhub pulls through the artifactapi mirror automatically.
## Test plan
- [ ] Pods roll successfully after ArgoCD sync
- [ ] Verify kanidm cluster replication still healthy
Reviewed-on: #161
## Summary
- Removes `^kanidm/` from the `ghcr` remote immutable_patterns
- Adds `^kanidm/` to the `dockerhub` remote immutable_patterns
## Why
`kanidm/server` is published on Docker Hub, not ghcr.io. Pulling via the `ghcr` cache was failing with 403 on anonymous token fetch → 502 Bad Gateway.
## Test plan
- [ ] `docker pull artifactapi.k8s.syd1.au.unkin.net/dockerhub/kanidm/server:1.10.3` succeeds after artifactapi redeploys
Reviewed-on: #160
## Summary
- Adds `stalwartlabs/webadmin/releases/latest/download/webadmin.zip` to `mutable_patterns` in the `github` generic remote so the stalwart webadmin UI can be fetched through artifactapi rather than directly from GitHub.
## Notes
- Uses `mutable_patterns` (not `immutable`) because `releases/latest` resolves to whichever release is current and changes over time.
- Access URL: `https://artifactapi.k8s.syd1.au.unkin.net/generic/github/stalwartlabs/webadmin/releases/latest/download/webadmin.zip`
Reviewed-on: #157
The Gateway API admission server defaults certificateRefs[].group to ""
when it is omitted. ArgoCD diffed the desired state (no group field) against
the live state (group: "") and flagged every gateway as out of sync.
Fix: explicitly set group: "" in all certificateRefs entries so the
rendered manifest matches the API server's canonical form exactly.
Affected: artifactapi, cattle-system, consul, litellm, paperclip,
puppet (puppetboard + puppetdb), vault.
Reviewed-on: #153
Vault and consul namespaces were missing from the platform AppProject
allowed destinations, causing ArgoCD sync failures with:
destination server 'https://kubernetes.default.svc' and namespace
'vault' do not match any of the allowed destinations in project 'platform'
Reviewed-on: #152
## Summary
- Deploys HashiCorp Consul 1.22.7 using Helm chart 1.9.7 with 5 server replicas
- Configuration modelled on production consul: `datacenter=au-syd1`, `connect=true`, `raft_multiplier=10`, HTTP on 8500, GRPC on 8502, HTTPS disabled
- 5-replica server cluster with `bootstrapExpect=5`
- 10Gi cephrbd-fast-delete PVC per server pod
- Gateway API: HTTPS gateway + HTTPRoute (443→consul-consul-ui:80→8500) at `consul.k8s.syd1.au.unkin.net`
- PodDisruptionBudget patched from `policy/v1beta1` to `policy/v1` (k8s 1.25+ compatibility)
- ArgoCD platform ApplicationSet updated to include consul overlay path
- Clients disabled (server-only deployment)
- ConnectInject disabled (can be enabled later for service mesh)
## Requires
- PR #147 (artifactapi: add hashicorp/consul to docker immutable patterns) to be merged first
## Test plan
- [ ] Sandbox tested in `sandbox-consul`: all 5 server pods 1/1 Running, cluster formed
- [ ] After merge: ArgoCD syncs consul namespace
- [ ] Verify `consul.k8s.syd1.au.unkin.net` is accessible via Gateway
Reviewed-on: #149
## Summary
- Deploys HashiCorp Vault 2.0.1 using Helm chart 0.32.0 in HA raft mode (5 replicas)
- Configuration modelled on production vault: `disable_mlock=true`, headless-DNS retry_join for all 5 pods
- IPC_LOCK capability added via `server.statefulSet.securityContext.container`
- 10Gi cephrbd-fast-delete PVC per pod via `dataStorage`
- Gateway API: HTTPS gateway + HTTPRoute (443→vault service port 8200) at `vault.k8s.syd1.au.unkin.net`
- ArgoCD platform ApplicationSet updated to include vault overlay path
- Injector disabled (no agent sidecar injection needed)
## Requires
- PR #147 (artifactapi: add hashicorp/vault to docker immutable patterns) to be merged first
## Test plan
- [ ] Sandbox tested in `sandbox-vault`: all 5 pods Running, raft cluster forming
- [ ] After merge: ArgoCD syncs vault namespace
- [ ] Operator runs `vault operator init` to initialize, then unseals all 5 nodes
- [ ] Verify `vault.k8s.syd1.au.unkin.net` is accessible via Gateway
Reviewed-on: #148
## Summary
- Upgrades cert-manager from v1.19.2 to v1.20.2
- Enables `enableGatewayAPI: true` via the `ControllerConfiguration` config block
## Why
cert-manager's Gateway API integration was not enabled. Without it, `cert-manager.io/*` annotations on Gateway resources are ignored and no certificates are issued. This is required for the consul and vault PRs (#148, #149) to have their TLS certs automatically provisioned from their Gateway annotations.
In v1.20.2, `ExperimentalGatewayAPISupport` is BETA and defaults to true — enabling `enableGatewayAPI` in the controller config activates the gateway-shim controller.
## Test plan
- [ ] cert-manager rolls out cleanly (v1.20.2 pods become Ready)
- [ ] After rollout, existing Gateway-annotated services (artifactapi, puppet, litellm) retain valid certs
- [ ] New Gateway resources with `cert-manager.io/cluster-issuer` annotations trigger Certificate creation
Reviewed-on: #150
## Summary
- Adds `^hashicorp/consul` and `^hashicorp/vault` to the dockerhub immutable_patterns in artifactapi's remote-docker.yaml
- Replaces the more specific `^hashicorp/vault-secrets-operator` pattern since `^hashicorp/vault` subsumes it
- Required for the benvin/vault and benvin/consul branches (vault:2.0.1 and consul:1.22.7)
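
A quick way to check the "subsumes" claim, assuming artifactapi treats each entry as a regular expression anchored only where the pattern says so (an assumption; the matcher itself is not shown here). The `matches` helper is hypothetical and uses `grep -E`.

```shell
#!/bin/sh
# Return success when the image path ($2) matches the regex pattern ($1).
matches() {
  printf '%s\n' "$2" | grep -Eq "$1"
}

# The broader pattern covers both the vault image and the operator image.
matches '^hashicorp/vault' 'hashicorp/vault' && echo "vault: matched"
matches '^hashicorp/vault' 'hashicorp/vault-secrets-operator' && echo "operator: matched"
```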
## Test plan
- [ ] Verify artifactapi accepts requests for hashicorp/vault and hashicorp/consul images after merge
Reviewed-on: #147
litellm performance has dropped and it has crashed several times; it
then scaled to its maximum, using the majority of memory in the
cluster.
- reduce the rate at which litellm autoscales
- increase the requests/limits to match usage
Reviewed-on: #144
Add port 80 HTTP listener and redirect HTTPRoute to artifactapi,
cattle-system (rancher), litellm, paperclip, and puppetboard — restoring
the redirect behaviour that existed on the previous nginx/traefik Ingress
resources.
Reviewed-on: #145
## Summary
- Replace `Ingress` (nginx) with `Gateway` + `HTTPRoute` using `traefik-internal` GatewayClass
- TLS terminated at the Gateway listener; cert-manager provisions the certificate via `vault-issuer`
- external-dns annotations moved to the Gateway
## Notes
The original Ingress had nginx-specific annotations (`proxy-body-size: 10g`, `proxy-read-timeout: 600`) which are not portable to Gateway API. These can be re-introduced via a Traefik `Middleware` CRD if needed.
## Test plan
- [ ] ArgoCD syncs the app cleanly
- [ ] cert-manager issues the `artifactapi-tls` certificate
- [ ] external-dns creates the DNS record
- [ ] `https://artifactapi.k8s.syd1.au.unkin.net` is reachable
Reviewed-on: #129
## Summary
- Replace `Ingress` (nginx) with `Gateway` + `HTTPRoute` using `traefik-internal` GatewayClass
- TLS terminated at the Gateway listener; cert-manager provisions the certificate via `vault-issuer`
- external-dns annotations moved to the Gateway
- `ingress_puppetboard.yaml` is unchanged in this PR (separate PR)
## Test plan
- [ ] ArgoCD syncs the puppet app cleanly
- [ ] cert-manager issues the `puppetdb-tls` certificate
- [ ] external-dns creates the DNS record
- [ ] `https://puppetdb.k8s.syd1.au.unkin.net` is reachable
Reviewed-on: #131
## Summary
- Replace `Ingress` (nginx) with `Gateway` + `HTTPRoute` using `traefik-internal` GatewayClass
- TLS terminated at the Gateway listener; cert-manager provisions the certificate via `vault-issuer`
- external-dns annotations moved to the Gateway
- `ingress_puppetdb.yaml` is unchanged in this PR (separate PR)
## Test plan
- [ ] ArgoCD syncs the puppet app cleanly
- [ ] cert-manager issues the `puppetboard-tls` certificate
- [ ] external-dns creates the DNS record
- [ ] `https://puppetboard.k8s.syd1.au.unkin.net` is reachable
Reviewed-on: #130
## Summary
- Replace `Ingress` (nginx) with `Gateway` + `HTTPRoute` using `traefik-internal` GatewayClass
- TLS terminated at the Gateway listener; cert-manager provisions the certificate via `vault-issuer`
- external-dns annotations moved to the Gateway
## Test plan
- [ ] ArgoCD syncs the paperclip app cleanly
- [ ] cert-manager issues the `paperclip-tls` certificate
- [ ] external-dns creates the DNS record
- [ ] `https://paperclip.k8s.syd1.au.unkin.net` is reachable
Reviewed-on: #133
## Summary
- Replace `Ingress` (nginx) with `Gateway` + `HTTPRoute` using `traefik-internal` GatewayClass
- TLS terminated at the Gateway listener; cert-manager provisions the certificate via `vault-issuer`
- external-dns annotations moved to the Gateway
## Test plan
- [ ] ArgoCD syncs the litellm app cleanly
- [ ] cert-manager issues the `litellm-tls` certificate
- [ ] external-dns creates the DNS record
- [ ] `https://litellm.k8s.syd1.au.unkin.net` is reachable
Reviewed-on: #134
## Changes
- Upgrade external-dns chart from 1.19.0 → 1.21.1 (app v0.19.0 → v0.21.0)
- Remove `gateway-tcproute` source — `TCPRoute` CRD is not installed, causing crash-loop
- Add `gateway-tlsroute` source — `TLSRoute` CRD is installed (Gateway API 1.5.1)
## Why
The pod was crash-looping every minute with `failed to list *v1alpha2.TCPRoute: the server could not find the requested resource` because the TCPRoute CRD doesn't exist in this cluster. TLSRoute was previously removed but its CRD does exist.
Reviewed-on: #143
Temporary: enable --log-level=debug to understand why the TLSRoute informer never reports HasSynced within the 1m interval. To be closed/reverted after root cause is found.
Reviewed-on: #140
## Problem
Gateway listeners with `port: 443` were rejected with `PortUnavailable: Cannot find entryPoint for Gateway: no matching entryPoint for port 443 and protocol "HTTPS"`.
Traefik matches Gateway listener ports against its internal entryPoint ports (pod-level), not the Service's `exposedPort`. The `websecure` entryPoint was configured on port `8443`, so port `443` listeners had no match.
## Fix
- `ports.websecure.port: 443` — Traefik now binds directly on 443
- `securityContext.capabilities.add: [NET_BIND_SERVICE]` — allows a non-root process to bind to privileged ports (<1024)
The Service `exposedPort` stays at `443`, so external connectivity is unchanged. All existing Gateway listeners (`port: 443`) are correct as-is.
Applies to both internal and external Traefik instances.
## Test plan
- [ ] Traefik pods restart cleanly
- [ ] `kubectl get gateway -A` shows listeners as `Programmed: True`
- [ ] `https://rancher.k8s.syd1.au.unkin.net` (already merged) is reachable
Reviewed-on: #138
## Summary
- Replace `Ingress` (nginx) with `Gateway` + `HTTPRoute` using `traefik-internal` GatewayClass
- TLS terminated at the Gateway listener; cert-manager provisions the certificate via `vault-issuer`
- external-dns annotations moved to the Gateway
## Test plan
- [ ] ArgoCD syncs the cattle-system app cleanly
- [ ] cert-manager issues the `rancher-tls` certificate
- [ ] external-dns creates the DNS record
- [ ] `https://rancher.k8s.syd1.au.unkin.net` is reachable
Reviewed-on: #132
## Problem
GatewayClasses were `Unknown` even after controllerName was fixed. The `kubernetesGateway` `labelSelector` applies to all watched resources, including GatewayClasses themselves. Since neither GatewayClass had a `traefik.io/instance` label, both Traefik instances filtered them out and never accepted them.
## Fix
- `gatewayclass-internal.yaml`: add `traefik.io/instance: internal`
- `gatewayclass-external.yaml`: add `traefik.io/instance: external`
## Test plan
- [ ] `kubectl get gatewayclass` shows both as `Accepted: True`
Reviewed-on: #137
## URGENT — Traefik pods are CrashLoopBackOff
The merged PR #135 added `--providers.kubernetesgateway.controllerName` as an `additionalArguments` entry. Traefik v3.7.0 does not support this flag and fails immediately on startup.
Old replica sets are still running (one pod each) but new pods cannot come up.
## Fix
- Remove `additionalArguments` from both `values-internal.yaml` and `values-external.yaml`
- Revert GatewayClass `controllerName` back to `traefik.io/gateway-controller` (the hardcoded Traefik default — no override mechanism exists in v3.7.0)
## After merge
GatewayClasses will remain `Unknown` until a separate solution for internal/external separation is implemented (the `labelSelector` approach needs further investigation).
Reviewed-on: #136
## Problem
Both GatewayClasses (`traefik-internal`, `traefik-external`) were stuck as `Unknown`. Neither Traefik deployment had `controllerName` set in `kubernetesGateway`, so both defaulted to `traefik.io/gateway-controller` — which matched neither GatewayClass.
## Fix
- `gatewayclass-internal.yaml`: `controllerName: traefik.io/gateway-controller-internal`
- `gatewayclass-external.yaml`: `controllerName: traefik.io/gateway-controller-external`
- `values-internal.yaml`: added `controllerName: traefik.io/gateway-controller-internal`
- `values-external.yaml`: added `controllerName: traefik.io/gateway-controller-external`
## Test plan
- [ ] ArgoCD syncs traefik-system cleanly
- [ ] `kubectl get gatewayclass` shows both as `Accepted: True`
Reviewed-on: #135
this adds a service account for running the terraform_vault workflows,
so that they can access the jwt to generate a token
Reviewed-on: #127
Remove --providers.kubernetesgateway.controllername which does not exist in
Traefik v3, update GatewayClass controllerName to the standard v3 value, and
use labelSelector on each instance's kubernetesGateway provider to differentiate
internal vs external traffic.
Reviewed-on: #125
deploy traefik for internal and external applications. port forwarding
from the external routers will only occur to the IP of the
traefik-external service.
- traefik-internal and traefik-external added
- each is a different deployment
Reviewed-on: #119
Adds immutable patterns for yannh/kubeconform and kubernetes-sigs/kustomize
to fix 403 Forbidden errors when downloading their Linux amd64 releases.
Reviewed-on: #123
- Patch argocd-repo-server to mount vault-ca-cert and set SSL_CERT_DIR
so helm subprocesses trust the internal CA when pulling charts
- Add argocd Application pointing at clusters/au-syd1/bootstrap so
ArgoCD manages its own install going forward
Reviewed-on: #112
Patches argocd-tls-certs-cm with the Vault CA chain so ArgoCD can
verify TLS when pulling Helm charts from artifactapi.k8s.syd1.au.unkin.net.
Reviewed-on: #111
have seen some contention on woodpecker jobs, because they are not being
scheduled correctly. we need to set correct limits/requests so that they
can be accurately scheduled.
- set limits/requests for all workflows
Reviewed-on: #110
Mount the vault-ca-cert secret and set NODE_EXTRA_CA_CERTS so Node.js
trusts the internal CA chain when making outbound TLS connections.
Reviewed-on: #108
The privateHostnameGuard middleware blocks requests where the Host header
is not in the allowlist. Kubelet httpGet probes use the pod IP as the
Host header, which is never in the allowlist. Setting Host: localhost
ensures probes are always permitted.
Reviewed-on: #107
Adds base manifests and au-syd1 overlay for Paperclip (AI agent
orchestration platform), following the litellm deployment pattern.
Updates aitooling ApplicationSet to include the paperclip path.
Closes #99
Reviewed-on: #100
Deploys LiteLLM proxy with CNPG PostgreSQL (3-instance HA), PgBouncer
pooler, and Redis cache. Introduces a dedicated aitooling AppProject and
ApplicationSet to keep AI tooling services separate from platform infra.
Reviewed-on: #94
Split monolithic remotes.yaml into per-type-package files under
resources/conf.d/ to align with artifactapi v2.7.1 directory loading.
Updated schema: virtuals/locals use dedicated top-level keys, type field
removed. Added helm remotes for all kustomize helmCharts repos and
OCI patterns to docker remotes. CONFIG_PATH now points to the directory.
Reviewed-on: #92
Migrate PureLB load balancer from Terragrunt to ArgoCD/Kustomize.
Deploys purelb v0.13.0 with two LBNodeAgent and two ServiceGroup CRs
(common: 198.18.200.0/24, dmz: 198.18.199.0/24).
Adds LBNodeAgent and ServiceGroup to kubeconform skip list (no CRD catalog schema).
💘 Generated with Crush
Assisted-by: Claude Sonnet 4.6 via Crush <crush@charm.land>
Reviewed-on: #84
Migrate Vault Secrets Operator from Terragrunt to ArgoCD/Kustomize.
Deploys vault-secrets-operator v1.2.0 with 3 replicas, plus ClusterRole,
ClusterRoleBindings, and vault-admin ServiceAccount.
Note: static service account tokens (kubernetes.io/service-account-token)
cannot be stored in git; create manually or via Vault after deployment.
💘 Generated with Crush
Assisted-by: Claude Sonnet 4.6 via Crush <crush@charm.land>
Reviewed-on: #81
Migrate Victoria Metrics cluster and agent from Terragrunt to ArgoCD/Kustomize.
Creates new observability AppProject and ApplicationSet.
Deploys victoria-metrics-cluster v0.33.0 (vmselect/vminsert/vmstorage with
HPA, PDB, ingress) and victoria-metrics-agent v0.30.0 (3 replicas, k8s scrape
configs) in the observability namespace.
💘 Generated with Crush
Assisted-by: Claude Sonnet 4.6 via Crush <crush@charm.land>
Reviewed-on: #82
it's not used and was never really installed correctly. going to change to
artifact-keeper which promises to have the same capabilities and is open
source.
Reviewed-on: #83
Migrate Victoria Metrics operator from Terragrunt to ArgoCD/Kustomize.
Deploys victoria-metrics-operator v0.57.1 with 2 replicas in vm-system.
💘 Generated with Crush
Assisted-by: Claude Sonnet 4.6 via Crush <crush@charm.land>
Reviewed-on: #80
Migrate ECK operator from Terragrunt to ArgoCD/Kustomize.
Deploys eck-operator v3.2.0 with 2 replicas and PodDisruptionBudget
in the elastic-system namespace.
💘 Generated with Crush
Assisted-by: Claude Sonnet 4.6 via Crush <crush@charm.land>
Reviewed-on: #79
Migrate repository sync cronjobs from Terragrunt to ArgoCD/Kustomize.
Adds four daily CronJobs (almalinux9-baseos, almalinux9-appstream, epel9,
openvox7) with associated PVCs and ConfigMaps in the reposync namespace.
💘 Generated with Crush
Assisted-by: Claude Sonnet 4.6 via Crush <crush@charm.land>
Reviewed-on: #78
The g10k-code cronjob was failing with "Permission denied" because the
container (running as uid 999, non-root) attempted to create /shared in
the container root filesystem, which is not writable. Clone to /tmp
which is always writable by unprivileged users.
Reviewed-on: #76
The RWO puppetserver-shared-config PVC caused multi-attach errors when
the cronjob pod was scheduled on a different node than the previous run,
stalling the init container indefinitely. Since the config only needs to
exist for the duration of the job, remove the init container and PVC
entirely and clone the r10k config directly into /shared within the main
container before running g10k.
Reviewed-on: #75
The container was OOMKilled on every run because the 256Mi limit was far
too low for `puppet generate types`. Remove PUPPETSERVER_JAVA_ARGS (only
relevant to the puppetserver JVM, not the puppet CLI) and raise the
memory limit to 1Gi / request 512Mi.
Reviewed-on: #74
filemapper is not available on RubyGems under that name and was causing
puppetserver-compiler to crash loop. The interfaces provider that
requires puppetx/filemapper is Debian-specific and should not be loaded
on RedHat-based puppetservers.
Reviewed-on: #72
The network module's interfaces provider requires puppetx/filemapper
which was not installed, causing catalog compilation failures with
"no such file to load -- puppetx/filemapper".
Adds filemapper to additional-ruby-gems.sh for puppetserver/compiler
pods, installs it directly in the generate-types cronjob (which has no
access to that script), and adds cronjob_generate-types.yaml to the
kustomization so the CronJob is actually deployed.
💘 Generated with Crush
Assisted-by: Claude Sonnet 4.6 via Crush <crush@charm.land>
Reviewed-on: #71
Puppetboard was connecting to PuppetDB on port 8080 (plain HTTP), causing
403 Forbidden errors on the /metrics/v2 Jolokia endpoint which requires
HTTPS with a Puppet certificate. Also replaced the invalid
PUPPETDB_SSL_SKIP_VERIFY var with the correct PUPPETDB_SSL_VERIFY,
PUPPETDB_CERT, and PUPPETDB_KEY pointing to the certs already generated
by the cert-generator init container.
💘 Generated with Crush
Assisted-by: Claude Sonnet 4.6 via Crush <crush@charm.land>
Reviewed-on: #70
PuppetDB requires a separate read-only database user for its read pool.
Without it, it refuses to use the write user for read queries and all
/pdb/query/v4 calls fail with a 500.
- Add puppetdb_read role via CNPG managed.roles with password sourced
from a new postgres-read-credentials Vault secret
- Grant CONNECT, USAGE, SELECT and default privileges to puppetdb_read
via postInitApplicationSQL (must also be run manually on existing cluster)
- Add puppet-postgres-pooler-ro Pooler (type: ro) routing to replicas
- Add puppetdb-read-database-conf ConfigMap with read-database.conf
mounted into /etc/puppetlabs/puppetdb/conf.d/ in the PuppetDB deployment
- Wire OPENVOXDB_READ_POSTGRES_* env vars from the new secret
💘 Generated with Crush
Assisted-by: Claude Sonnet 4.6 via Crush <crush@charm.land>
Reviewed-on: #69
- master config section is not used
- server contains all settings specific to a server (puppet, puppet ca)
- user is for all puppet <command> tooling, like 'puppet generate'
Reviewed-on: #66
Add support for installing additional Ruby gems via custom entrypoint script.
The script is mounted as a ConfigMap into /container-custom-entrypoint.d/
and will be executed during Puppetserver container startup.
Reviewed-on: #63
- Mount vault-ca-cert secret at /opt/vault-ca-cert.crt in both deployments
- Update cobbler-enc script to use correct CA certificate path
- Resolves OSError about missing TLS CA certificate bundle
Reviewed-on: #62
- Add puppet-shared-bins PVC (10GB) for shared binaries
- Mount /opt/bin in both compiler and master deployments
- Add init container to install uv binary and cobbler script to shared volume
- Update cobbler-enc to use absolute path and uv cache directory
- Configure puppet.conf to reference cobbler-enc from /opt/bin
Reviewed-on: #61
DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): puppetdb:8081
ERROR:pypuppetdb.api.base:Could not reach PuppetDB on puppetdb:8081 over HTTP.
- puppetdb_host assumes HTTP when not verifying ssl
Reviewed-on: #58
- Update PuppetDB connections from HTTP (8080) to HTTPS (8081)
- Add automatic certificate generation for Puppetboard using Puppet CA
- Implement initContainers for proper certificate provisioning before app start
- Add dedicated PVC for Puppetboard certificates with RWX access
- Configure SSL verification and client authentication for secure PuppetDB access
Reviewed-on: #50
- Add node-feature-discovery and inteldeviceplugins-system to platform project
- Convert intel-nfd-rules from local Helm chart to static NodeFeatureRule manifests
- Add required Helm repositories (NFD OCI registry and Intel charts)
- Create base configurations with Helm charts and overlay structures
- Update platform ApplicationSet and project permissions
Reviewed-on: #48
This resolves SSL certificate verification failures preventing puppetdb access
- Update OPENVOXDB_SERVER_URLS from https://puppetdb:8081 to http://puppetdb:8080
- External access to puppetdb will still use HTTPS via ingress
- Internal cluster communication does not require encryption
Reviewed-on: #47
- Migrate csi-cephfs from Terraform to ArgoCD
- Migrate csi-cephrbd from Terraform to ArgoCD
- Create dedicated storage project and ApplicationSet for CSI drivers
- Add csi-* pattern matching in storage ApplicationSet
- Remove CSI apps from platform project to separate concerns
Reviewed-on: #45
- Add cnpg-system base ArgoCD application with namespace
- Create cnpg-system overlay for au-syd1 with CloudNativePG Helm chart
- Update platform ApplicationSet to include cnpg-system deployment
- Configure cloudnative-pg operator v0.27.0 with HA and resource limits
- Maintain one-to-one migration from Terraform configuration
Reviewed-on: #44
- Add externaldns base ArgoCD application with namespace and Vault integration
- Create externaldns overlay for au-syd1 with Helm chart configuration
- Update platform ApplicationSet to include externaldns deployment
- Configure external-dns v1.19.0 with RFC2136 provider for DNS updates
- Maintain one-to-one migration from Terraform configuration including TSIG secrets
Reviewed-on: #43
- Add cattle-system base ArgoCD application with namespace, Vault integration, and ingress
- Create cattle-system overlay for au-syd1 with Rancher Helm chart configuration
- Update platform ApplicationSet to include cattle-system deployment
- Update platform project to include Rancher Helm repository as source
- Configure Rancher v2.13.1 with HA, TLS, audit logging, and bootstrap secret from Vault
- Maintain one-to-one migration from Terraform configuration
Reviewed-on: #39
- Add certificates base ArgoCD application with namespace and Vault CA certificate secret
- Create certificates overlay for au-syd1 with static certificate configuration
- Update platform ApplicationSet to include certificates deployment
- Configure Vault CA certificate with reflector annotations for cross-namespace replication
- Maintain one-to-one migration from Terraform configuration
Note: Skip no_plain_secrets hook as this is a public CA certificate that needs
to be replicated via reflector, not a sensitive secret
Reviewed-on: #37
- change puppet/puppetca -> LoadBalancer
- dedicate IPs for the puppet and puppetca loadbalancers
- name the puppetserver port
- remove puppet/puppetca ingress
Reviewed-on: #35
puppetdb_port has tcp:// in it, even though we pass the correct variable
in from a configmap.
```
ben@metabox ~/s/p/argocd-apps> kubectl --context admin run debug-pod --image=busybox --rm -it --restart=Never -n puppet -- env | grep -i puppetdb_port
PUPPETDB_PORT_8081_TCP_PORT=8081
PUPPETDB_PORT_8081_TCP_PROTO=tcp
PUPPETDB_PORT=tcp://10.43.101.142:8080
PUPPETDB_PORT_8080_TCP=tcp://10.43.101.142:8080
PUPPETDB_PORT_8080_TCP_ADDR=10.43.101.142
PUPPETDB_PORT_8081_TCP=tcp://10.43.101.142:8081
PUPPETDB_PORT_8080_TCP_PROTO=tcp
PUPPETDB_PORT_8081_TCP_ADDR=10.43.101.142
PUPPETDB_PORT_8080_TCP_PORT=8080
```
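The values above come from Kubernetes' Docker-links-style service discovery: for a Service named `puppetdb`, pods in the namespace receive `PUPPETDB_PORT=tcp://<cluster-ip>:<port>`, which can collide with an application setting of the same name. As a workaround sketch only (the helper name `normalize_port` is hypothetical, and this is not necessarily the fix applied in this change), a startup script could accept either form:

```bash
# Hypothetical helper: accept "8081" or "tcp://10.43.101.142:8081" and print the bare port.
normalize_port() {
  case "$1" in
    tcp://*) printf '%s\n' "${1##*:}" ;;  # strip everything up to the last colon
    *)       printf '%s\n' "$1" ;;        # already a bare port
  esac
}

normalize_port "tcp://10.43.101.142:8080"   # prints 8080
normalize_port "8081"                       # prints 8081
```

Setting `enableServiceLinks: false` in the pod spec stops Kubernetes from injecting these variables at all, which avoids the collision at the source.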
Reviewed-on: #32
the puppetca service is not pointing to the puppetmasters, which prevents
the puppet-compilers from starting and in turn prevents puppetdb/puppetboard
from starting.
- point puppetca service -> puppetserver-master
Reviewed-on: #31
updating all the names of services and their respective filenames to
better match the way puppet infra is used in my lab.
- puppet -> the compilers
- puppetca -> the master(s)
- puppetdb -> the puppetdb
- puppetboard -> puppetboard
updated references to these services in all other definitions I could find
note: need a good way to test these changes with argocd
Reviewed-on: #30
complete the implementation of puppet in kubernetes, taking many
features from the openvox helm chart and improving on them. changes from
helm are:
- using vault for storing secrets
- using g10k instead of r10k
- using a single shared g10k cronjob for all masters/compilers
- using a single shared /etc/puppetlabs/code directory (shared, cephfs)
changes:
- deploy puppet master and compiler servers with statefulset/deployment
- deploy puppetdb with postgresql backend, taking advantage of cnpg cluster and pooler
- deploy puppetboard
- all supporting configmaps, services, ingresses, and hpas
- added vaultstaticsecret for eyaml private keys
- configured secure mounting of eyaml keys at /var/lib/puppet/keys/
- updated base kustomization to include all 23 new puppet resource files
Reviewed-on: #29
g10k uses hardlinks, so it requires the cache and code to be in the same pvc.
updated the r10k repository with the cachedir in the same pvc, so now I can
remove these unused pvcs from argo.
unkin/puppet-r10k#4
Reviewed-on: #28
working towards a larger, redundant, autoscaling and simple puppet
implementation in kubernetes. this was originally based on the openvox
helm chart with several improvements (not all in this pr)
- use of cnpg instead of single bitnamilegacy postgres container
- use of g10k instead of r10k
- run one instance of g10k per namespace, instead of per-pod
- keep only one copy of the environments/branches (instead of one per pod)
- change g10k to native cronjob instead of hacky implementation
- use vault secrets
part one adds:
- cnpg puppetdb pgsql cluster
- cnpg puppetdb pgpooler
- persistent volume claims for puppet, puppetdb, the code repository, etc
Reviewed-on: #25
Follow these steps **in order**. Do not skip steps.
### 1 — Choose an issue
Present the issues above to the user as a numbered list (index, one-line title). Ask which one to work on. Wait for the answer before continuing.
### 2 — Sync master
```bash
git checkout master
git pull
```
Confirm you are on master and up to date.
### 3 — Create a branch
Name the branch `benvin/issue-<N>-<short-slug>` where `<short-slug>` is 2–4 kebab-case words from the issue title.
```bash
git checkout -b benvin/issue-<N>-<slug>
```
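As a rough illustration of the naming rule, the helper below derives a branch name from an issue number and title. The function name `branch_name` and the choice to take the first four words are assumptions made for this sketch; the rule only asks for 2–4 kebab-case words, so pick the words by hand when the leading ones are not descriptive.

```bash
# Hypothetical helper: build "benvin/issue-<N>-<slug>" from an issue number and title.
branch_name() {
  local number="$1" title="$2" slug
  slug=$(printf '%s' "$title" \
    | tr '[:upper:]' '[:lower:]' \
    | tr -cs 'a-z0-9' '-' \
    | sed -e 's/^-*//' -e 's/-*$//' \
    | cut -d- -f1-4)   # lowercase, kebab-case, trim stray dashes, keep at most four words
  printf 'benvin/issue-%s-%s\n' "$number" "$slug"
}

branch_name 42 "Add deb remote support"   # prints benvin/issue-42-add-deb-remote-support
```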
### 4 — Read the issue in full
Re-read the full issue body shown above. If any part is ambiguous, state your interpretation before coding.
**If you discover other problems while working:** do NOT solve them inline. Create a new Gitea issue with `tea issues create --title "..." --description "..."` and stay focused on the assigned issue.
### 5 — Implement the solution
Make the code changes needed to resolve the issue. Follow the conventions already in the repo:
- `main.py` route handlers each contain a single function call; logic lives in submodules.
- No comments unless the WHY is non-obvious.
- No new files unless the issue or architecture requires it.
- Security: no command injection, XSS, SQL injection, or secrets in code.
- **For performance improvements:** implement at the most generic call site possible so the fix applies to all current and future implementations, not just the one being tested.
### 6 — Update tests
Add or update tests that cover the new behaviour. Tests live in `tests/`. Check existing test structure before writing new ones — mirror the style and fixture patterns already in use.
### 7 — Update README
If the feature introduces new config keys, endpoints, or user-facing behaviour, document it in `README.md`. Keep additions concise — follow the existing section style.
### 8 — Run the full test suite
```bash
make test
```
All tests must pass. If any fail, fix them before proceeding. Do not skip or suppress failing tests.
### 9 — Live Docker test (new package type only)
**Skip this step if the issue does not add a new remote package type.**
If the issue adds a new package type (e.g. `deb`, `conda`, `cargo`, `rubygems`, or any type not already in `remotes.yaml`), do the following before committing.
#### 9a — Add a real test remote to remotes.yaml
Append a valid, publicly accessible remote of the new type to `remotes.yaml`. Use a real upstream URL and patterns that cover both an immutable file (versioned artifact) and a mutable file (index/metadata). Add a comment explaining which URLs to use for manual testing.
#### 9b — Start the stack
```bash
make docker-up
```
Wait until `curl -s http://localhost:8000/health` returns `{"status":"healthy"}`.
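The wait can be scripted rather than polled by hand. The sketch below is one way to do it; `wait_until`, `stack_is_healthy`, the 30 attempts, and the 2-second delay are illustrative choices, not something the repo's Makefile provides.

```bash
# Hypothetical helper: retry a command until it succeeds or the attempts run out.
wait_until() {
  local attempts="$1" delay="$2"
  shift 2
  local i
  for ((i = 1; i <= attempts; i++)); do
    "$@" && return 0
    sleep "$delay"
  done
  return 1
}

# Probe used by the wait: succeeds only when the health body reports "healthy".
stack_is_healthy() {
  curl -s http://localhost:8000/health | grep -q '"status":"healthy"'
}

# Usage (after `make docker-up`): wait_until 30 2 stack_is_healthy
```

Keeping the retry loop separate from the probe means the same helper can wrap any other readiness check the later sub-steps need.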
#### 9c — Test a mutable file (first fetch — cache miss)
Download the index or metadata file for the new remote. Confirm:
- HTTP 200
- `X-Artifact-Source: remote` header (or equivalent log line confirming a cache miss)
- Content looks correct (not empty, not an error page)
Confirm the tool resolves and downloads correctly through the proxy.
#### 9i — Tear down
```bash
make docker-down
```
Fix any failures found during 9b–9h before moving on.
### 9.5 — Performance issues: measure before/after and gate the PR
**Skip this step if the issue is not a performance improvement.**
For performance issues, a PR is only warranted if there is a measurable gain. Use the Docker stack to compare before and after.
#### 9.5a — Baseline measurement (before)
Start the stack with the **unmodified** code (temporarily revert your change):
```bash
make docker-up
```
Warm or clear the cache as appropriate, then measure the relevant metric — e.g. concurrent request latency during a slow operation, response time for a specific endpoint, or throughput. Record the numbers.
#### 9.5b — Apply your change and rebuild
```bash
make docker-up # rebuilds the image
```
Repeat exactly the same measurement. Record the numbers.
#### 9.5c — Decide
If the improvement is not clearly measurable, **do not open a PR**. Instead:
1. Update the issue with your findings.
2. Note any conditions under which the improvement would be observable.
3. Skip steps 11–14.
If the improvement is clear, proceed with the commit and PR. Include the before/after numbers in the PR description and the issue comment.
#### 9.5d — Tear down
```bash
make docker-down
```
### 10 — Build the wheel (smoke check)
```bash
uv build --wheel
```
Confirm the build succeeds.
### 11 — Stage and commit
Stage only the files you changed. Do not use `git add -A` or `git add .` — list files explicitly. Run:
```bash
git add <file1> <file2> ...
git commit
```
The commit message must:
- Start with a conventional-commit prefix (`feat:`, `fix:`, `refactor:`, `chore:`, etc.)
- Summarise the change in ≤ 72 characters on the first line
- Optionally include a short body explaining *why* (not *what*)
If the pre-commit hook auto-fixes files, re-stage the fixed files and commit again.
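The prefix and length rules are mechanical and can be checked before committing. The following is a minimal sketch; the function name `check_subject` and the exact list of accepted prefixes are assumptions (the rule ends with "etc.", so extend the list to match the repo's practice).

```bash
# Hypothetical check: conventional-commit prefix and a subject of at most 72 characters.
check_subject() {
  local subject="$1"
  local pattern='^(feat|fix|refactor|chore|docs|test|perf)(\([a-z0-9-]+\))?: .+'
  [[ "$subject" =~ $pattern ]] || { echo "missing conventional-commit prefix" >&2; return 1; }
  (( ${#subject} <= 72 )) || { echo "subject longer than 72 characters" >&2; return 1; }
}

check_subject "fix: clone g10k config into /tmp" && echo "subject ok"
```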
### 12 — Push the branch
```bash
git push origin <branch-name>
```
### 13 — Open a pull request
```bash
tea pulls create \
--base master \
--head <branch-name> \
--title "<same as commit subject>" \
--description $'Closes #<N>\n\n## Summary\n<bullet points>\n\n## Test plan\n<what was verified>'
```
### 14 — Comment on the issue
```bash
tea comment <N> "<resolution comment>"
```
The comment must cover:
- **How it was resolved** — what changed and why
- **Issues encountered** — any non-obvious problems hit during implementation
- **Potential future improvements** — what could be done next
### 15 — Return to master
```bash
git checkout master
```
Report the PR URL and a one-sentence summary to the user.
This is an **ArgoCD GitOps repository** that manages Kubernetes applications for the `au-syd1` cluster using a Kustomize + Helm pattern. Applications are deployed via ArgoCD ApplicationSets that watch directory patterns in this repo.
The migration pattern for this repo is: **Terragrunt/Terraform → ArgoCD** (see `migration.md` for full guide).
---
## Essential Commands
```bash
# Build and render manifests for a path (outputs to manifests/<path>/)
```
Some overlays vendor Helm charts locally under `apps/overlays/au-syd1/<app-name>/charts/<chart-name>/`. When a chart is vendored, the overlay's `kustomization.yaml` references the local path. When not vendored, it references the OCI or HTTP repo directly.
Current Kubernetes target version: **1.33.7** (used by kubeconform in CI).