12 Commits

Author SHA1 Message Date
unkinben dfbedec41b Update apps/base/kanidm/vaultauth.yaml
ci/woodpecker/pr/pre-commit Pipeline failed
ci/woodpecker/pr/kubeconform Pipeline was successful
Fix the VaultAuth object
2026-05-30 23:06:47 +10:00
unkinben 4d594fbde7 feat(kanidm): vault-managed replication certs with auto-restart (#176)
- Store per-pod replication certs in Vault (kv/kubernetes/namespace/kanidm/default/repl-certs)
- VaultAuth + VaultStaticSecret sync certs to kanidm-repl-certs Secret
- busybox config-init init container injects peer certs from Secret into server.toml at startup
- Remove hardcoded partner_cert entries from per-pod server.toml templates
- Add automatic_refresh = true to all replication configs
- Add reloader.stakater.com/auto annotation to trigger rolling restart on ConfigMap/Secret changes
- Document domain UUID mismatch resolution and cert rotation in README
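
The config-init behaviour listed above can be sketched as a standalone shell function. This is a minimal sketch, not the container's exact script: the function name `render_server_toml` and its directory parameters are hypothetical, introduced so the logic can run outside a pod. The real init container reads `/config-template` and `/repl-certs` and writes `/config/server.toml`.

```bash
# Sketch of the config-init step with paths passed in as arguments.
render_server_toml() {
    pod_name="$1"       # e.g. kanidm-1
    template_dir="$2"   # holds server-0.toml, server-1.toml, server-2.toml
    certs_dir="$3"      # one file per peer, named after the pod
    out_file="$4"       # rendered server.toml

    # The pod ordinal (text after the last "-") selects the per-pod template.
    cp "${template_dir}/server-${pod_name##*-}.toml" "${out_file}" || return 1

    for peer in kanidm-0 kanidm-1 kanidm-2; do
        # A pod never lists itself as a replication partner.
        [ "${peer}" = "${pod_name}" ] && continue
        cert_file="${certs_dir}/${peer}"
        # Skip peers whose certificate is missing or empty.
        [ -s "${cert_file}" ] || continue
        fqdn="${peer}.kanidm-headless.kanidm.svc.cluster.local"
        printf '\n[replication."repl://%s:8444"]\ntype = "mutual-pull"\npartner_cert = "%s"\n' \
            "${fqdn}" "$(cat "${cert_file}")" >> "${out_file}"
    done
    return 0
}
```

Called as `render_server_toml kanidm-1 ./resources ./certs ./server.toml`, it appends one `[replication."repl://…"]` stanza per peer that has a non-empty certificate file.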

Reviewed-on: #176
2026-05-30 23:00:46 +10:00
unkinben 1b781e0885 feat(woodpecker): set workflow pod priority class to power (#175)
## Summary
Sets `WOODPECKER_BACKEND_K8S_PRIORITY_CLASS: power` on the Woodpecker agent so all CI pipeline pods are scheduled with the `power` PriorityClass (value 100, preemptionPolicy: Never).

This means pipeline pods can be evicted when the cluster is under pressure but won't preempt other workloads.

## Dependency
Requires the `power` PriorityClass to exist on the cluster — deploy PR #174 (priority-classes app) first.

## Test plan
- Trigger a pipeline run and confirm pods are created with `priorityClassName: power`
- `kubectl get pod -n woodpecker -o jsonpath='{.items[*].spec.priorityClassName}'`

Reviewed-on: #175
2026-05-26 23:58:57 +10:00
unkinben ede25a3858 feat(platform): add priority-classes app with low/power/medium/high classes (#174)
## Summary
- New `apps/base/priority-classes/` app with four `PriorityClass` objects managed via the `platform` ArgoCD project
- Adds `apps/overlays/*/priority-classes` to the platform ApplicationSet generator
- Adds `priority-classes` namespace to platform AppProject destinations (required even for cluster-scoped resources)

| Class | Value | PreemptionPolicy | Intent |
|---|---|---|---|
| `low` | 100 | Never | Background work; evictable, won't preempt others |
| `power` | 100 | Never | Compute-heavy but expendable (e.g. AI/ML workloads) |
| `medium` | 10000 | PreemptLowerPriority | Standard services |
| `high` | 100000 | PreemptLowerPriority | Critical services; preempts lower-priority pods |

`PriorityClass` is already in the platform project's `clusterResourceWhitelist` so no project policy changes were needed.

## Test plan
- ArgoCD syncs `platform-priority-classes` successfully
- `kubectl get priorityclasses low power medium high` shows all four classes

Reviewed-on: #174
2026-05-26 23:41:54 +10:00
unkinben f5f713fe86 feat(artifactapi): add open-webui/open-webui to ghcr immutable patterns (#173)
Part of #155 (prerequisite for open-webui deployment PR #172).

## Summary
- Adds `^open-webui/open-webui` to the `ghcr` remote's `immutable_patterns` in `remote-docker.yaml` so version-pinned open-webui image pulls are cached indefinitely through artifactapi

## Test plan
- artifactapi serves `ghcr.io/open-webui/open-webui:<version>` with `X-Artifact-Source: cache` on second fetch

Reviewed-on: #173
2026-05-26 23:28:27 +10:00
unkinben 3990fbfe06 feat(vault): switch to Kubernetes service registration (#171)
Replaces Consul service registration with the native Kubernetes provider so Vault labels its own pods with active/standby/perf-standby status without requiring a Consul dependency.

## Changes
- `values.yaml`: swap `service_registration "consul"` for `service_registration "kubernetes" {}`, add `VAULT_K8S_NAMESPACE` and `VAULT_K8S_POD_NAME` env vars via downward API
- `role_k8s-service-registration.yaml`: Role + RoleBinding granting the `vault` service account `get`/`update`/`patch` on pods
- `kustomization.yaml`: include new RBAC file

Reviewed-on: #171
2026-05-26 00:06:56 +10:00
unkinben d358098fff chore: update replication certs (#170)
- add replication certs for kanidm-0, kanidm-1 and kanidm-2

Reviewed-on: #170
2026-05-25 23:52:06 +10:00
unkinben 201e601737 feat: update kanidm replication (#169)
- split to per-server configs
- remove init containers that attempted to automate the replication config
- add README.md

Reviewed-on: #169
2026-05-25 23:25:48 +10:00
unkinben d230d87ec9 feat(artifactapi): add conftest to GitHub generic remote cache (#168)
## Summary

- Adds `open-policy-agent/conftest/.*/conftest_.*_Linux_x86_64.tar.gz$` to the `github` remote immutable patterns in artifactapi

## Why

conftest v0.68.2 (https://github.com/open-policy-agent/conftest/releases/tag/v0.68.2) is now used for OPA policy checks in CI (see #167). Caching the release tarball in artifactapi reduces external dependency on GitHub during builds.
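
A pattern like the one above can be sanity-checked against a sample path before merging. This sketch assumes artifactapi treats each `immutable_patterns` entry as an unanchored regular expression matched against the repository-relative path. That behaviour and the helper name `matches_immutable_pattern` are assumptions, not taken from artifactapi itself. The sample path follows GitHub's release-asset URL layout.

```bash
# Returns 0 when the path matches at least one of the given patterns.
matches_immutable_pattern() {
    path="$1"
    shift
    for pattern in "$@"; do
        # grep -E evaluates the pattern as an extended regular expression.
        if printf '%s\n' "${path}" | grep -Eq -- "${pattern}"; then
            return 0
        fi
    done
    return 1
}

# Example: the conftest v0.68.2 Linux x86_64 tarball.
if matches_immutable_pattern \
    "open-policy-agent/conftest/releases/download/v0.68.2/conftest_0.68.2_Linux_x86_64.tar.gz" \
    'open-policy-agent/conftest/.*/conftest_.*_Linux_x86_64.tar.gz$'; then
    echo "immutable"
fi
```

The dots in `.tar.gz` are unescaped, so they match any character. The pattern is therefore slightly looser than it reads, which is harmless for this asset name.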

Reviewed-on: #168
2026-05-25 22:44:57 +10:00
unkinben 6497dab25e fix(puppet): remove explicit clusterIP: null from puppetdb Service (#166)
## Summary

- Removes `clusterIP: null` from the `puppetdb` Service spec

## Why

Setting `clusterIP: null` makes ArgoCD's desired state explicit about the field being null. Kubernetes assigns a real IP on creation and the field is immutable afterward. The null vs assigned-IP mismatch causes permanent OutOfSync on the puppetdb Service. Removing the field means ArgoCD no longer claims ownership of `clusterIP`, so the API server's value is authoritative.
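
To look for other manifests with the same problem, something like the following could be run from the repository root. It is a rough sketch, assuming GNU grep and manifests with a `.yaml` extension; the function name `find_null_cluster_ip` is hypothetical. Headless Services (`clusterIP: None`) carry a valid, stable value and are deliberately not matched.

```bash
# Prints file:line:text for every explicit null clusterIP under the given directory.
# Exits non-zero when nothing is found (standard grep behaviour).
find_null_cluster_ip() {
    grep -rnE --include='*.yaml' \
        '^[[:space:]]*clusterIP:[[:space:]]*(null|~)[[:space:]]*$' "$1"
}

# Example: scan the apps tree if it exists in the current directory.
if [ -d apps ]; then
    find_null_cluster_ip apps || echo "no explicit null clusterIP found"
fi
```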

Reviewed-on: #166
2026-05-25 22:44:24 +10:00
unkinben f403c6b05d fix(kanidm): add explicit group/kind/weight to TLSRoute refs (#165)
## Summary

- Adds `group: gateway.networking.k8s.io` and `kind: Gateway` to `parentRefs`
- Adds `group: ""`, `kind: Service`, and `weight: 1` to `backendRefs`

## Why

The Gateway API controller defaults these fields when creating/updating TLSRoute objects, so the live state always has them. ArgoCD diffs desired vs live by string comparison, causing the `kanidm` TLSRoute to show permanent OutOfSync. Same root cause as #162 (HTTPRoutes).

Reviewed-on: #165
2026-05-25 22:43:52 +10:00
unkinben ac8b8212bd fix(consul): normalize cpu limit to canonical string form (#164)
## Summary

- Changes `server.resources.limits.cpu` from `1000m` to `"1"` in consul Helm values

## Why

`1000m` (1000 milliCPU) is equivalent to `1` CPU, but Kubernetes normalizes the value to `"1"` when storing. ArgoCD diffs desired vs live by string comparison, so the mismatch causes a permanent OutOfSync on the `consul-server` StatefulSet. Same root cause as #163.
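
The normalization described above can be illustrated with a small helper that rewrites whole-CPU milli values to the form the API server stores. This is only a sketch of that one rule: the name `canonical_cpu` is hypothetical, and it does not attempt the full Kubernetes quantity grammar (decimals, exponents, other suffixes are passed through unchanged).

```bash
# Prints the canonical form for whole-CPU milli values (1000m -> 1, 2000m -> 2).
# Anything else is printed unchanged.
canonical_cpu() {
    cpu="$1"
    case "${cpu}" in
        *m)
            milli="${cpu%m}"
            case "${milli}" in
                # Not a plain positive integer (or has a leading zero): leave as-is.
                ''|0?*|*[!0-9]*) printf '%s\n' "${cpu}"; return 0 ;;
            esac
            if [ "${milli}" -gt 0 ] && [ $((milli % 1000)) -eq 0 ]; then
                printf '%s\n' "$((milli / 1000))"
                return 0
            fi
            ;;
    esac
    printf '%s\n' "${cpu}"
}

canonical_cpu 1000m
canonical_cpu 250m
```

In a manifest the canonical value is written as a quoted string (`cpu: "1"`), since the API server stores quantities as strings.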

Reviewed-on: #164
2026-05-25 22:43:35 +10:00
35 changed files with 283 additions and 387 deletions
-7
@@ -40,10 +40,3 @@ repos:
entry: ci/validate-no-secrets.sh
language: system
pass_filenames: false
- id: conftest_policies
name: OPA policy checks (conftest)
entry: conftest test --policy policy/
language: system
types: [yaml]
exclude: ".*/charts/.*|.*/templates/.*|\\.woodpecker/.*"
pass_filenames: true
@@ -6,6 +6,7 @@ remotes:
immutable_patterns:
- "^cloudnative-pg/cloudnative-pg"
- "^emberstack/helm-charts"
- "^open-webui/open-webui"
- "^openvoxproject/"
- "^stakater/reloader"
- "^stalwartlabs/stalwart"
@@ -36,6 +36,7 @@ remotes:
- "neovim/neovim/.*/nvim-linux-x86_64.tar.gz$"
- "nzbgetcom/nzbget/.*/nzbget-.*.x86_64.rpm$"
- "onedr0p/exportarr/.*/exportarr_.*_linux_amd64.tar.gz$"
- "open-policy-agent/conftest/.*/conftest_.*_Linux_x86_64.tar.gz$"
- "openbao/openbao-plugins/.*/openbao-plugin-secrets-consul_linux_amd64_.*.tar.gz$"
- "openbao/openbao-plugins/.*/openbao-plugin-secrets-nomad_linux_amd64_.*.tar.gz$"
- "prometheus-community/bind_exporter/.*/bind_exporter-.*.linux-amd64.tar.gz$"
+51
@@ -0,0 +1,51 @@
# kanidm
Three-replica kanidm identity server with Vault-managed replication certificates.
## Architecture
- Per-pod `server-N.toml` in `resources/` — each has its own replication origin hardcoded
- `config-init` busybox init container copies the right config and injects peer certs from the
vault-synced `kanidm-repl-certs` Secret at pod startup
- `reloader.stakater.com/auto: "true"` triggers a rolling restart when the ConfigMap or Secret changes
- Vault path: `kv/kubernetes/namespace/kanidm/default/repl-certs`
- Keys: `kanidm-0`, `kanidm-1`, `kanidm-2` — each holds that pod's replication certificate
## Initial setup
After the first pod starts, generate the admin credentials:
```bash
kubectl exec -n kanidm kanidm-0 -- /sbin/kanidmd recover-account -c /config/server.toml admin
kubectl exec -n kanidm kanidm-0 -- /sbin/kanidmd recover-account -c /config/server.toml idm_admin
```
## Replication certificate rotation
When certs need to be renewed, update vault and reloader will roll the StatefulSet:
```bash
# Get new cert from a pod
kubectl exec -it -n kanidm kanidm-N -- /sbin/kanidmd renew-replication-certificate -c /config/server.toml
# Write updated cert to vault (reloader triggers restart automatically)
vault kv patch kv/kubernetes/namespace/kanidm/default/repl-certs "kanidm-N=<cert>"
```
## Resolving domain UUID mismatch
If pods initialized independently (each with a different domain UUID), replication will fail with
`Consumer Domain UUID does not match`. Fix by resetting kanidm-1 and kanidm-2 to sync from
kanidm-0 (the authoritative node):
```bash
# Scale down to avoid split-brain during reset
kubectl scale statefulset -n kanidm kanidm --replicas=1
# Delete the stale PVCs for the replica pods
kubectl delete pvc -n kanidm data-kanidm-1 data-kanidm-2
# Scale back up — replicas start with empty DBs and automatic_refresh=true
# will trigger a full sync from kanidm-0 once TLS peer certs are verified
kubectl scale statefulset -n kanidm kanidm --replicas=3
```
-40
@@ -1,40 +0,0 @@
---
apiVersion: v1
kind: ConfigMap
metadata:
name: kanidm-config
namespace: kanidm
labels:
app.kubernetes.io/name: kanidm
app.kubernetes.io/instance: kanidm
data:
server.toml: |
version = "2"
domain = "auth.unkin.net"
origin = "https://auth.unkin.net"
bindaddress = "[::]:8443"
db_path = "/data/kanidm.db"
db_arc_size = 2048
tls_chain = "/data/tls/tls.crt"
tls_key = "/data/tls/tls.key"
log_level = "info"
[online_backup]
path = "/data/backups/"
schedule = "0 22 * * *"
versions = 7
[replication]
origin = "__REPL_ORIGIN__"
bindaddress = "[::]:8444"
---
apiVersion: v1
kind: ConfigMap
metadata:
name: kanidm-repl-certs
namespace: kanidm
labels:
app.kubernetes.io/name: kanidm
app.kubernetes.io/instance: kanidm
data: {}
+15 -2
@@ -5,12 +5,25 @@ kind: Kustomization
resources:
- namespace.yaml
- serviceaccount.yaml
- rbac.yaml
- vaultauth.yaml
- vaultstaticsecret.yaml
- certificate.yaml
- configmap.yaml
- service.yaml
- statefulset.yaml
- poddisruptionbudget.yaml
- gateway.yaml
- httproute.yaml
- tlsroute.yaml
configMapGenerator:
- name: kanidm-config
namespace: kanidm
options:
disableNameSuffixHash: true
labels:
app.kubernetes.io/name: kanidm
app.kubernetes.io/instance: kanidm
files:
- server-0.toml=resources/server-0.toml
- server-1.toml=resources/server-1.toml
- server-2.toml=resources/server-2.toml
+20
@@ -0,0 +1,20 @@
version = "2"
domain = "auth.unkin.net"
origin = "https://auth.unkin.net"
bindaddress = "[::]:8443"
db_path = "/data/kanidm.db"
db_arc_size = 2048
tls_chain = "/data/tls/tls.crt"
tls_key = "/data/tls/tls.key"
log_level = "info"
[online_backup]
path = "/data/backups/"
schedule = "0 22 * * *"
versions = 7
[replication]
origin = "repl://kanidm-0.kanidm-headless.kanidm.svc.cluster.local:8444"
bindaddress = "[::]:8444"
automatic_refresh = true
+20
@@ -0,0 +1,20 @@
version = "2"
domain = "auth.unkin.net"
origin = "https://auth.unkin.net"
bindaddress = "[::]:8443"
db_path = "/data/kanidm.db"
db_arc_size = 2048
tls_chain = "/data/tls/tls.crt"
tls_key = "/data/tls/tls.key"
log_level = "info"
[online_backup]
path = "/data/backups/"
schedule = "0 22 * * *"
versions = 7
[replication]
origin = "repl://kanidm-1.kanidm-headless.kanidm.svc.cluster.local:8444"
bindaddress = "[::]:8444"
automatic_refresh = true
+20
@@ -0,0 +1,20 @@
version = "2"
domain = "auth.unkin.net"
origin = "https://auth.unkin.net"
bindaddress = "[::]:8443"
db_path = "/data/kanidm.db"
db_arc_size = 2048
tls_chain = "/data/tls/tls.crt"
tls_key = "/data/tls/tls.key"
log_level = "info"
[online_backup]
path = "/data/backups/"
schedule = "0 22 * * *"
versions = 7
[replication]
origin = "repl://kanidm-2.kanidm-headless.kanidm.svc.cluster.local:8444"
bindaddress = "[::]:8444"
automatic_refresh = true
+12 -40
@@ -4,6 +4,8 @@ kind: StatefulSet
metadata:
name: kanidm
namespace: kanidm
annotations:
reloader.stakater.com/auto: "true"
labels:
app.kubernetes.io/name: kanidm
app.kubernetes.io/instance: kanidm
@@ -36,23 +38,19 @@ spec:
 fsGroup: 1000
 initContainers:
 - name: config-init
-image: kanidm/server:1.10.3
+image: busybox:1.36
 command: ["/bin/sh", "-c"]
 args:
 - |
 set -e
-REPL_ORIGIN="repl://${POD_NAME}.kanidm-headless.kanidm.svc.cluster.local:8444"
-sed "s|__REPL_ORIGIN__|${REPL_ORIGIN}|g" /config-template/server.toml > /config/server.toml
+cp "/config-template/server-${POD_NAME##*-}.toml" /config/server.toml
 for peer in kanidm-0 kanidm-1 kanidm-2; do
-if [ "${peer}" = "${POD_NAME}" ]; then
-continue
-fi
+[ "${peer}" = "${POD_NAME}" ] && continue
 cert_file="/repl-certs/${peer}"
-if [ -s "${cert_file}" ]; then
-fqdn="${peer}.kanidm-headless.kanidm.svc.cluster.local"
-printf '\n[replication."repl://%s:8444"]\ntype = "mutual-pull"\npartner_cert = "%s"\n' \
-"${fqdn}" "$(cat ${cert_file})" >> /config/server.toml
-fi
+[ -s "${cert_file}" ] || continue
+fqdn="${peer}.kanidm-headless.kanidm.svc.cluster.local"
+printf '\n[replication."repl://%s:8444"]\ntype = "mutual-pull"\npartner_cert = "%s"\n' \
+"${fqdn}" "$(cat ${cert_file})" >> /config/server.toml
 done
 env:
 - name: POD_NAME
@@ -62,6 +60,7 @@ spec:
volumeMounts:
- name: config-template
mountPath: /config-template
readOnly: true
- name: config
mountPath: /config
- name: repl-certs
@@ -70,33 +69,6 @@ spec:
securityContext:
allowPrivilegeEscalation: false
readOnlyRootFilesystem: true
- name: repl-cert-publisher
image: bitnami/kubectl:1.33
restartPolicy: Always
command: ["/bin/sh", "-c"]
args:
- |
until kubectl exec "${POD_NAME}" -c kanidm -- /sbin/kanidmd renew-replication-certificate 2>/dev/null | grep -q '^# certificate:'; do
sleep 30
done
while true; do
cert=$(kubectl exec "${POD_NAME}" -c kanidm -- /sbin/kanidmd renew-replication-certificate 2>/dev/null \
| grep '^# certificate:' | sed 's/^# certificate: "\(.*\)"$/\1/')
if [ -n "${cert}" ]; then
kubectl patch configmap kanidm-repl-certs \
--type=merge \
-p "{\"data\":{\"${POD_NAME}\":\"${cert}\"}}"
fi
sleep 3600
done
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
securityContext:
allowPrivilegeEscalation: false
readOnlyRootFilesystem: false
containers:
- name: kanidm
image: kanidm/server:1.10.3
@@ -145,8 +117,8 @@ spec:
 - name: config
 emptyDir: {}
 - name: repl-certs
-configMap:
-name: kanidm-repl-certs
+secret:
+secretName: kanidm-repl-certs
 - name: tls
 secret:
 secretName: kanidm-tls
+7 -2
@@ -13,9 +13,14 @@ spec:
 - auth.unkin.net
 - au.auth.unkin.net
 parentRefs:
-- name: kanidm
+- group: gateway.networking.k8s.io
+kind: Gateway
+name: kanidm
 sectionName: https-passthrough
 rules:
 - backendRefs:
-- name: kanidm
+- group: ""
+kind: Service
+name: kanidm
 port: 8443
+weight: 1
+18
@@ -0,0 +1,18 @@
---
apiVersion: secrets.hashicorp.com/v1beta1
kind: VaultAuth
metadata:
name: default
namespace: kanidm
spec:
allowedNamespaces:
- kanidm
kubernetes:
audiences:
- vault
role: default
serviceAccount: default
tokenExpirationSeconds: 600
method: kubernetes
mount: k8s/au/syd1
vaultConnectionRef: vso-system/default
+20
@@ -0,0 +1,20 @@
---
apiVersion: secrets.hashicorp.com/v1beta1
kind: VaultStaticSecret
metadata:
name: repl-certs
namespace: kanidm
labels:
app.kubernetes.io/name: kanidm
app.kubernetes.io/instance: kanidm
spec:
vaultAuthRef: default
mount: kv
type: kv-v2
path: kubernetes/namespace/kanidm/default/repl-certs
refreshAfter: 5m
destination:
name: kanidm-repl-certs
create: true
overwrite: true
hmacSecretData: true
@@ -0,0 +1,6 @@
---
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- priorityclasses.yaml
@@ -0,0 +1,36 @@
---
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
name: low
value: 100
preemptionPolicy: Never
globalDefault: false
description: "Low-importance workloads. Can be evicted under pressure but will not preempt other pods."
---
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
name: power
value: 100
preemptionPolicy: Never
globalDefault: false
description: "Compute-heavy workloads with low scheduling importance. Evictable under pressure."
---
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
name: medium
value: 10000
preemptionPolicy: PreemptLowerPriority
globalDefault: false
description: "Standard workloads. Will preempt low-priority pods if the cluster is under pressure."
---
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
name: high
value: 100000
preemptionPolicy: PreemptLowerPriority
globalDefault: false
description: "High-importance services. Will preempt medium- and low-priority pods if necessary."
+1 -1
@@ -150,7 +150,7 @@ spec:
memory: 350Mi
cpu: 100m
limits:
-memory: 1Gi
+memory: 1024Mi
cpu: 500m
securityContext:
runAsNonRoot: true
+1 -1
@@ -35,7 +35,7 @@ spec:
imagePullPolicy: IfNotPresent
resources:
limits:
-cpu: "1"
+cpu: 1
memory: 1536Mi
requests:
cpu: 250m
@@ -31,11 +31,11 @@ spec:
imagePullPolicy: IfNotPresent
resources:
limits:
-cpu: "2"
-memory: 3Gi
+cpu: 2
+memory: 3072Mi
 requests:
 cpu: 500m
-memory: 1Gi
+memory: 1024Mi
ports:
- containerPort: 8140
name: puppetserver
@@ -35,11 +35,11 @@ spec:
imagePullPolicy: IfNotPresent
resources:
limits:
-cpu: "2"
+cpu: 2
 memory: 3500Mi
 requests:
 cpu: 250m
-memory: 1Gi
+memory: 1024Mi
ports:
- containerPort: 8140
name: puppetserver
-1
@@ -9,7 +9,6 @@ metadata:
name: puppetdb
namespace: puppet
spec:
clusterIP: null
ports:
- name: pdb-http
port: 8080
@@ -53,7 +53,7 @@ spec:
cpu: 500m
memory: 1Gi
limits:
-cpu: "2"
+cpu: 2000m
memory: 4Gi
volumeMounts:
- name: repodata
@@ -56,7 +56,7 @@ spec:
cpu: 500m
memory: 1Gi
limits:
-cpu: "2"
+cpu: 2000m
memory: 4Gi
volumeMounts:
- name: repodata
@@ -53,7 +53,7 @@ spec:
cpu: 500m
memory: 1Gi
limits:
-cpu: "2"
+cpu: 2000m
memory: 4Gi
volumeMounts:
- name: repodata
@@ -52,7 +52,7 @@ spec:
cpu: 500m
memory: 1Gi
limits:
-cpu: "2"
+cpu: 2000m
memory: 4Gi
volumeMounts:
- name: repodata
+1
@@ -6,3 +6,4 @@ resources:
- namespace.yaml
- gateway.yaml
- httproute.yaml
- role_k8s-service-registration.yaml
@@ -0,0 +1,24 @@
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: vault-k8s-service-registration
namespace: vault
rules:
- apiGroups: [""]
resources: ["pods"]
verbs: ["get", "update", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: vault-k8s-service-registration
namespace: vault
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: vault-k8s-service-registration
subjects:
- kind: ServiceAccount
name: vault
namespace: vault
+1 -1
@@ -37,7 +37,7 @@ server:
cpu: 100m
limits:
memory: 2Gi
-cpu: 1000m
+cpu: "1"
client:
enabled: false
@@ -0,0 +1,6 @@
---
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- ../../../base/priority-classes
+9 -3
@@ -40,9 +40,7 @@ server:
}
}
-service_registration "consul" {
-address = "consul-server.consul.svc.cluster.local:8500"
-}
+service_registration "kubernetes" {}
dataStorage:
enabled: true
@@ -50,6 +48,14 @@ server:
storageClass: cephrbd-fast-delete
accessMode: ReadWriteOnce
extraEnv:
- name: VAULT_K8S_NAMESPACE
value: vault
- name: VAULT_K8S_POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
statefulSet:
securityContext:
container:
@@ -2,6 +2,7 @@ agent:
replicaCount: 3
env:
WOODPECKER_MAX_WORKFLOWS: "8"
WOODPECKER_BACKEND_K8S_PRIORITY_CLASS: power
WOODPECKER_BACKEND_K8S_STORAGE_CLASS: cephrbd-fast-delete
WOODPECKER_BACKEND_K8S_VOLUME_SIZE: 10G
WOODPECKER_BACKEND_K8S_STORAGE_RWX: false
+1
@@ -22,6 +22,7 @@ spec:
- path: apps/overlays/*/jfrog
- path: apps/overlays/*/kanidm
- path: apps/overlays/*/node-feature-discovery
- path: apps/overlays/*/priority-classes
- path: apps/overlays/*/puppet
- path: apps/overlays/*/purelb
- path: apps/overlays/*/reflector-system
+2
@@ -31,6 +31,8 @@ spec:
server: https://kubernetes.default.svc
- namespace: 'node-feature-discovery'
server: https://kubernetes.default.svc
- namespace: 'priority-classes'
server: https://kubernetes.default.svc
- namespace: 'purelb'
server: https://kubernetes.default.svc
- namespace: 'puppet'
-94
@@ -1,94 +0,0 @@
package main
# Gateway API resources require several fields to be set explicitly even though
# the Gateway API controller defaults them. ArgoCD diffs desired vs live by
# string comparison, so any field the controller defaults that is absent from
# the git manifest causes a permanent OutOfSync.
#
# Affected resources:
# HTTPRoute / TLSRoute — parentRefs and backendRefs (see PR #162, #165)
# Gateway — listeners[*].tls.certificateRefs (see PR #153)
_route_kinds := {"HTTPRoute", "TLSRoute"}
# ---- parentRefs: group must be "gateway.networking.k8s.io" ----
deny contains msg if {
_route_kinds[input.kind]
ref := input.spec.parentRefs[i]
object.get(ref, "group", null) != "gateway.networking.k8s.io"
msg := sprintf(
"%s %s/%s parentRefs[%d]: add 'group: gateway.networking.k8s.io' — controller defaults this field, causing ArgoCD OutOfSync when omitted",
[input.kind, input.metadata.namespace, input.metadata.name, i],
)
}
# ---- parentRefs: kind must be "Gateway" ----
deny contains msg if {
_route_kinds[input.kind]
ref := input.spec.parentRefs[i]
object.get(ref, "kind", null) != "Gateway"
msg := sprintf(
"%s %s/%s parentRefs[%d]: add 'kind: Gateway' — controller defaults this field, causing ArgoCD OutOfSync when omitted",
[input.kind, input.metadata.namespace, input.metadata.name, i],
)
}
# ---- backendRefs: group must be present (may be empty string "") ----
deny contains msg if {
_route_kinds[input.kind]
rule := input.spec.rules[ri]
ref := rule.backendRefs[bi]
not _has_key(ref, "group")
msg := sprintf(
"%s %s/%s rules[%d].backendRefs[%d]: add 'group: \"\"' — controller defaults this field, causing ArgoCD OutOfSync when omitted",
[input.kind, input.metadata.namespace, input.metadata.name, ri, bi],
)
}
# ---- backendRefs: kind must be "Service" ----
deny contains msg if {
_route_kinds[input.kind]
rule := input.spec.rules[ri]
ref := rule.backendRefs[bi]
object.get(ref, "kind", null) != "Service"
msg := sprintf(
"%s %s/%s rules[%d].backendRefs[%d]: add 'kind: Service' — controller defaults this field, causing ArgoCD OutOfSync when omitted",
[input.kind, input.metadata.namespace, input.metadata.name, ri, bi],
)
}
# ---- backendRefs: weight must be present ----
deny contains msg if {
_route_kinds[input.kind]
rule := input.spec.rules[ri]
ref := rule.backendRefs[bi]
not _has_key(ref, "weight")
msg := sprintf(
"%s %s/%s rules[%d].backendRefs[%d]: add 'weight: 1' — controller defaults this field, causing ArgoCD OutOfSync when omitted",
[input.kind, input.metadata.namespace, input.metadata.name, ri, bi],
)
}
# ---- Gateway certificateRefs: group must be present (may be empty string "") ----
deny contains msg if {
input.kind == "Gateway"
listener := input.spec.listeners[li]
ref := listener.tls.certificateRefs[ci]
not _has_key(ref, "group")
msg := sprintf(
"Gateway %s/%s listeners[%d].tls.certificateRefs[%d]: add 'group: \"\"' — admission webhook defaults this field, causing ArgoCD OutOfSync when omitted",
[input.metadata.namespace, input.metadata.name, li, ci],
)
}
# ---- Helper: key presence check (works for null, "", and any defined value) ----
_has_key(obj, key) if {
_ = obj[key]
}
-13
@@ -1,13 +0,0 @@
package main
# Deny all Kubernetes Ingress resources.
# This cluster uses Gateway API (HTTPRoute + Gateway) for ingress routing.
# Ingress is the legacy API and must not be added.
deny contains msg if {
input.kind == "Ingress"
msg := sprintf(
"%s/%s: Ingress resources are forbidden — use Gateway API HTTPRoute instead",
[input.metadata.namespace, input.metadata.name],
)
}
-173
@@ -1,173 +0,0 @@
package main
# Kubernetes normalizes resource quantity values on write. ArgoCD diffs by
# string comparison, so a non-canonical value in git will always differ from
# the live object, causing permanent OutOfSync.
#
# Rules enforced here:
# CPU integers — k8s converts integer 1 to string "1" (see PR #163)
# CPU milliCPU — k8s converts 1000m → "1", 2000m → "2", etc. (PR #164)
# Memory Mi→Gi — k8s converts 1024Mi → 1Gi, 2048Mi → 2Gi, etc. (PR #163)
# clusterIP null — k8s assigns a real IP; null in git always differs
# from the live assigned value (see PR #166)
# ---- Container helpers ----
# Extracts containers from Deployment/StatefulSet/DaemonSet/Job/CronJob/Pod
_containers contains c if { c := input.spec.template.spec.containers[_] }
_containers contains c if { c := input.spec.template.spec.initContainers[_] }
_containers contains c if { c := input.spec.containers[_] }
_containers contains c if { c := input.spec.initContainers[_] }
_containers contains c if { c := input.spec.jobTemplate.spec.template.spec.containers[_] }
# ---- CPU: must not be an integer ----
# YAML `cpu: 1` (unquoted) parses as JSON integer; k8s stores as string "1".
deny contains msg if {
c := _containers[_]
cpu := c.resources.limits.cpu
is_number(cpu)
msg := sprintf(
"%s container %q: cpu limit %v is an unquoted integer — use \"%v\" (string) to prevent ArgoCD OutOfSync",
[input.kind, c.name, cpu, cpu],
)
}
deny contains msg if {
c := _containers[_]
cpu := c.resources.requests.cpu
is_number(cpu)
msg := sprintf(
"%s container %q: cpu request %v is an unquoted integer — use \"%v\" (string) to prevent ArgoCD OutOfSync",
[input.kind, c.name, cpu, cpu],
)
}
deny contains msg if {
input.kind == "Cluster"
cpu := input.spec.resources.limits.cpu
is_number(cpu)
msg := sprintf(
"Cluster %s/%s: cpu limit %v is an unquoted integer — use \"%v\" (string) to prevent ArgoCD OutOfSync",
[input.metadata.namespace, input.metadata.name, cpu, cpu],
)
}
deny contains msg if {
input.kind == "Cluster"
cpu := input.spec.resources.requests.cpu
is_number(cpu)
msg := sprintf(
"Cluster %s/%s: cpu request %v is an unquoted integer — use \"%v\" (string) to prevent ArgoCD OutOfSync",
[input.metadata.namespace, input.metadata.name, cpu, cpu],
)
}
# ---- CPU: milliCPU divisible by 1000 normalizes to a whole number ----
# k8s converts 1000m → "1", 2000m → "2". Use the canonical whole-number form.
_milli_whole(cpu) := n if {
is_string(cpu)
endswith(cpu, "m")
val := to_number(substring(cpu, 0, count(cpu) - 1))
val % 1000 == 0
val > 0
n := val / 1000
}
deny contains msg if {
c := _containers[_]
n := _milli_whole(c.resources.limits.cpu)
msg := sprintf(
"%s container %q: cpu limit %q normalizes to \"%v\" — use canonical form to prevent ArgoCD OutOfSync",
[input.kind, c.name, c.resources.limits.cpu, n],
)
}
deny contains msg if {
c := _containers[_]
n := _milli_whole(c.resources.requests.cpu)
msg := sprintf(
"%s container %q: cpu request %q normalizes to \"%v\" — use canonical form to prevent ArgoCD OutOfSync",
[input.kind, c.name, c.resources.requests.cpu, n],
)
}
deny contains msg if {
input.kind == "Cluster"
n := _milli_whole(input.spec.resources.limits.cpu)
msg := sprintf(
"Cluster %s/%s: cpu limit %q normalizes to \"%v\" — use canonical form to prevent ArgoCD OutOfSync",
[input.metadata.namespace, input.metadata.name, input.spec.resources.limits.cpu, n],
)
}
deny contains msg if {
input.kind == "Cluster"
n := _milli_whole(input.spec.resources.requests.cpu)
msg := sprintf(
"Cluster %s/%s: cpu request %q normalizes to \"%v\" — use canonical form to prevent ArgoCD OutOfSync",
[input.metadata.namespace, input.metadata.name, input.spec.resources.requests.cpu, n],
)
}
# ---- Memory: Mi values divisible by 1024 normalize to Gi ----
# k8s converts 1024Mi → 1Gi, 2048Mi → 2Gi, etc.
_mi_canonical(mem) := canonical if {
endswith(mem, "Mi")
val := to_number(substring(mem, 0, count(mem) - 2))
val % 1024 == 0
val > 0
canonical := sprintf("%vGi", [val / 1024])
}
deny contains msg if {
c := _containers[_]
canonical := _mi_canonical(c.resources.limits.memory)
msg := sprintf(
"%s container %q: memory limit %q normalizes to %q — use canonical form to prevent ArgoCD OutOfSync",
[input.kind, c.name, c.resources.limits.memory, canonical],
)
}
deny contains msg if {
c := _containers[_]
canonical := _mi_canonical(c.resources.requests.memory)
msg := sprintf(
"%s container %q: memory request %q normalizes to %q — use canonical form to prevent ArgoCD OutOfSync",
[input.kind, c.name, c.resources.requests.memory, canonical],
)
}
deny contains msg if {
input.kind == "Cluster"
canonical := _mi_canonical(input.spec.resources.limits.memory)
msg := sprintf(
"Cluster %s/%s: memory limit %q normalizes to %q use canonical form to prevent ArgoCD OutOfSync",
[input.metadata.namespace, input.metadata.name, input.spec.resources.limits.memory, canonical],
)
}
deny contains msg if {
input.kind == "Cluster"
canonical := _mi_canonical(input.spec.resources.requests.memory)
msg := sprintf(
"Cluster %s/%s: memory request %q normalizes to %q — use canonical form to prevent ArgoCD OutOfSync",
[input.metadata.namespace, input.metadata.name, input.spec.resources.requests.memory, canonical],
)
}
# ---- Service: clusterIP must not be null ----
# k8s assigns a real IP on creation; clusterIP is immutable afterward.
# Setting null in git means desired=null vs live=10.x.x.x → permanent OutOfSync.
# Remove the field and let k8s own it.
deny contains msg if {
input.kind == "Service"
input.spec.clusterIP == null
msg := sprintf(
"Service %s/%s has 'clusterIP: null' remove this field; Kubernetes assigns the IP on creation and it causes ArgoCD OutOfSync",
[input.metadata.namespace, input.metadata.name],
)
}