10 Commits

Author SHA1 Message Date
unkinben 3d85105afd feat(open-webui): HA deployment with CNPG, PDB, and session persistence
ci/woodpecker/pr/pre-commit Pipeline was successful
ci/woodpecker/pr/kubeconform Pipeline failed
- Switch from SQLite/PVC to CNPG PostgreSQL (3 instances, low-resource)
  with a transaction-mode PgBouncer pooler (2 instances)
- Raise open-webui replicas to 3 with priorityClassName: power
- Add PodDisruptionBudget (minAvailable: 1)
- Add Gateway API sessionPersistence (cookie) on the HTTPS HTTPRoute
  so WebSocket connections stick to the same backend pod
- Add postgres-credentials VaultStaticSecret; DATABASE_URL must be
  added to kv/kubernetes/namespace/open-webui/default/open-webui-credentials
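
The `DATABASE_URL` value itself is not part of this commit. As a rough sketch of what it might look like, the snippet below assumes the pooler is reachable through a Service named after the Pooler resource (`open-webui-postgres-pooler`) on port 5432, and that the application accepts a `postgresql://` URL. The helper `build_database_url` and the password are hypothetical placeholders; the database and owner names come from the CNPG `initdb` settings in this change.

```python
from urllib.parse import quote


def build_database_url(user: str, password: str, host: str,
                       database: str, port: int = 5432) -> str:
    """Assemble a PostgreSQL connection URL, percent-encoding the credentials."""
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(database, safe='')}"
    )


# Assumption: host is the pooler Service in the open-webui namespace.
url = build_database_url(
    user="open-webui",
    password="example-p@ss/word",  # placeholder, never a real secret
    host="open-webui-postgres-pooler.open-webui.svc",
    database="open-webui",
)
print(url)
```

Percent-encoding matters here because characters such as `@` or `/` in a password would otherwise be parsed as URL delimiters.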
2026-05-26 23:37:10 +10:00
unkinben 85a8cfe47d fix(open-webui): use traefik-internal gateway for chat hostname
ci/woodpecker/pr/pre-commit Pipeline was successful
ci/woodpecker/pr/kubeconform Pipeline was successful
2026-05-26 23:26:01 +10:00
unkinben 16dabbbf8d fix(open-webui): use litellm external hostname as OPENAI_API_BASE_URL
ci/woodpecker/pr/pre-commit Pipeline was canceled
ci/woodpecker/pr/kubeconform Pipeline was canceled
2026-05-26 23:25:26 +10:00
unkinben 1bcb88d3dd feat(open-webui): deploy Open WebUI with litellm backend
ci/woodpecker/pr/pre-commit Pipeline was successful
ci/woodpecker/pr/kubeconform Pipeline was successful
Deploys Open WebUI (chat.k8s.syd1.au.unkin.net) into the open-webui
namespace via the aitooling ArgoCD project. Uses SQLite with a 10Gi
cephrbd PVC for persistence, routes model requests to the existing
litellm deployment, and exposes the UI through the traefik-external
gateway. Credentials (OPENAI_API_KEY, WEBUI_SECRET_KEY) are injected
via VaultStaticSecret from kv/kubernetes/namespace/open-webui/default.

Closes #155
2026-05-26 00:11:25 +10:00
unkinben d358098fff chore: update replication certs (#170)
- add replication certs for kanidm-0, kanidm-1 and kanidm-2

Reviewed-on: #170
2026-05-25 23:52:06 +10:00
unkinben 201e601737 feat: update kanidm replication (#169)
- split to per-server configs
- remove init containers that attempted to automate the replication config
- add README.md
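
With per-server configs, each pod only has to pick the file that matches its StatefulSet ordinal. The init container in this change does that with the shell expansion `${POD_NAME##*-}`; the sketch below shows the same selection in Python for illustration. `config_file_for` is a hypothetical helper, not something in the repository.

```python
def config_file_for(pod_name: str) -> str:
    """Map a StatefulSet pod name to its per-server config file.

    Mirrors the shell expansion ${POD_NAME##*-}: keep only the text
    after the last hyphen, which is the pod's ordinal.
    """
    ordinal = pod_name.rsplit("-", 1)[-1]
    return f"server-{ordinal}.toml"


for pod in ("kanidm-0", "kanidm-1", "kanidm-2"):
    print(pod, "->", config_file_for(pod))
```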

Reviewed-on: #169
2026-05-25 23:25:48 +10:00
unkinben d230d87ec9 feat(artifactapi): add conftest to GitHub generic remote cache (#168)
## Summary

- Adds `open-policy-agent/conftest/.*/conftest_.*_Linux_x86_64.tar.gz$` to the `github` remote immutable patterns in artifactapi

## Why

conftest v0.68.2 (https://github.com/open-policy-agent/conftest/releases/tag/v0.68.2) is now used for OPA policy checks in CI (see #167). Caching the release tarball in artifactapi reduces external dependency on GitHub during builds.
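
To sanity-check the pattern, the sketch below runs it against a few release asset paths. It assumes artifactapi treats each entry as a regular expression searched against the repository-relative path; that matching behaviour and the `is_immutable` helper are assumptions made for illustration.

```python
import re

# Pattern copied from the artifactapi change above.
PATTERN = r"open-policy-agent/conftest/.*/conftest_.*_Linux_x86_64.tar.gz$"


def is_immutable(path: str) -> bool:
    """Return True if the path matches the conftest cache pattern."""
    return re.search(PATTERN, path) is not None


base = "open-policy-agent/conftest/releases/download/v0.68.2/"
print(is_immutable(base + "conftest_0.68.2_Linux_x86_64.tar.gz"))
print(is_immutable(base + "conftest_0.68.2_Darwin_arm64.tar.gz"))
print(is_immutable(base + "checksums.txt"))
```

Only the first path matches. Note that the dots in `.tar.gz` are unescaped, so they match any character; that is looser than strictly necessary but consistent with the neighbouring entries.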

Reviewed-on: #168
2026-05-25 22:44:57 +10:00
unkinben 6497dab25e fix(puppet): remove explicit clusterIP: null from puppetdb Service (#166)
## Summary

- Removes `clusterIP: null` from the `puppetdb` Service spec

## Why

Setting `clusterIP: null` makes ArgoCD's desired state explicit about the field being null. Kubernetes assigns a real IP on creation and the field is immutable afterward. The null vs assigned-IP mismatch causes permanent OutOfSync on the puppetdb Service. Removing the field means ArgoCD no longer claims ownership of `clusterIP`, so the API server's value is authoritative.
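
The effect can be illustrated with a deliberately simplified model: only fields that the desired manifest declares are compared against the live object. This is not ArgoCD's actual diff algorithm, and the IP address below is made up.

```python
def out_of_sync_fields(desired: dict, live: dict) -> list[str]:
    """Return declared keys whose desired value differs from the live value.

    Keys absent from `desired` are not compared at all.
    """
    return sorted(k for k, v in desired.items() if live.get(k) != v)


live_spec = {"clusterIP": "10.0.0.10", "type": "ClusterIP"}  # hypothetical IP
with_null = {"clusterIP": None, "type": "ClusterIP"}
without_field = {"type": "ClusterIP"}

print(out_of_sync_fields(with_null, live_spec))      # ['clusterIP']
print(out_of_sync_fields(without_field, live_spec))  # []
```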

Reviewed-on: #166
2026-05-25 22:44:24 +10:00
unkinben f403c6b05d fix(kanidm): add explicit group/kind/weight to TLSRoute refs (#165)
## Summary

- Adds `group: gateway.networking.k8s.io` and `kind: Gateway` to `parentRefs`
- Adds `group: ""`, `kind: Service`, and `weight: 1` to `backendRefs`

## Why

The Gateway API controller defaults these fields when creating/updating TLSRoute objects, so the live state always has them. ArgoCD diffs desired vs live by string comparison, causing the `kanidm` TLSRoute to show permanent OutOfSync. Same root cause as #162 (HTTPRoutes).
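
A small sketch of why the explicit form is stable: the function below fills in the same defaults listed in the summary, so a manifest that already carries them is unchanged by defaulting. `apply_route_defaults` is a hypothetical stand-in for the server-side defaulting, written only to illustrate the mismatch.

```python
import copy

PARENT_REF_DEFAULTS = {"group": "gateway.networking.k8s.io", "kind": "Gateway"}
BACKEND_REF_DEFAULTS = {"group": "", "kind": "Service", "weight": 1}


def apply_route_defaults(route: dict) -> dict:
    """Return a copy of the route with defaulted ref fields filled in."""
    live = copy.deepcopy(route)
    spec = live.get("spec", {})
    for ref in spec.get("parentRefs", []):
        for key, value in PARENT_REF_DEFAULTS.items():
            ref.setdefault(key, value)
    for rule in spec.get("rules", []):
        for ref in rule.get("backendRefs", []):
            for key, value in BACKEND_REF_DEFAULTS.items():
                ref.setdefault(key, value)
    return live


short = {
    "spec": {
        "parentRefs": [{"name": "kanidm", "sectionName": "https-passthrough"}],
        "rules": [{"backendRefs": [{"name": "kanidm", "port": 8443}]}],
    }
}
explicit = apply_route_defaults(short)

print(short == apply_route_defaults(short))        # False: git differs from live
print(explicit == apply_route_defaults(explicit))  # True: git matches live
```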

Reviewed-on: #165
2026-05-25 22:43:52 +10:00
unkinben ac8b8212bd fix(consul): normalize cpu limit to canonical string form (#164)
## Summary

- Changes `server.resources.limits.cpu` from `1000m` to `"1"` in consul Helm values

## Why

`1000m` (1000 milliCPU) is equivalent to `1` CPU, but Kubernetes normalizes the value to `"1"` when storing. ArgoCD diffs desired vs live by string comparison, so the mismatch causes a permanent OutOfSync on the `consul-server` StatefulSet. Same root cause as #163.
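
The normalization described here can be sketched as follows. The function only covers the two cases named in this PR series (whole-number milliCPU values and unquoted integers); it is not a full implementation of Kubernetes quantity parsing, and `canonical_cpu` is a hypothetical helper.

```python
def canonical_cpu(value) -> str:
    """Return the canonical string form for a simple CPU quantity.

    - "1000m", "2000m", ... collapse to "1", "2", ...
    - integers such as 2 become the string "2"
    - anything else (for example "500m") is returned unchanged
    """
    text = str(value)
    if text.endswith("m"):
        millis = int(text[:-1])
        if millis > 0 and millis % 1000 == 0:
            return str(millis // 1000)
    return text


for raw in ("1000m", "2000m", "500m", 2, "1"):
    print(repr(raw), "->", repr(canonical_cpu(raw)))
```

A manifest value is safe from this particular OutOfSync cause when `canonical_cpu(value) == value`, which holds for `"1"` and `"500m"` but not for `"1000m"` or the integer `2`.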

Reviewed-on: #164
2026-05-25 22:43:35 +10:00
37 changed files with 533 additions and 393 deletions
-7
@@ -40,10 +40,3 @@ repos:
entry: ci/validate-no-secrets.sh
language: system
pass_filenames: false
- id: conftest_policies
name: OPA policy checks (conftest)
entry: conftest test --policy policy/
language: system
types: [yaml]
exclude: ".*/charts/.*|.*/templates/.*|\\.woodpecker/.*"
pass_filenames: true
@@ -36,6 +36,7 @@ remotes:
- "neovim/neovim/.*/nvim-linux-x86_64.tar.gz$"
- "nzbgetcom/nzbget/.*/nzbget-.*.x86_64.rpm$"
- "onedr0p/exportarr/.*/exportarr_.*_linux_amd64.tar.gz$"
- "open-policy-agent/conftest/.*/conftest_.*_Linux_x86_64.tar.gz$"
- "openbao/openbao-plugins/.*/openbao-plugin-secrets-consul_linux_amd64_.*.tar.gz$"
- "openbao/openbao-plugins/.*/openbao-plugin-secrets-nomad_linux_amd64_.*.tar.gz$"
- "prometheus-community/bind_exporter/.*/bind_exporter-.*.linux-amd64.tar.gz$"
+32
@@ -0,0 +1,32 @@
# kanidm
Single-replica kanidm identity server deployment.
## Initial setup
After the pod starts for the first time, generate the admin and idm_admin credentials:
```bash
kubectl exec -n kanidm kanidm-0 -- /sbin/kanidmd recover-account admin
kubectl exec -n kanidm kanidm-0 -- /sbin/kanidmd recover-account idm_admin
```
## Adding replication
If replication is needed in the future:
1. Scale the StatefulSet to 3 replicas and add `podAntiAffinity` to spread across nodes.
2. Add a `[replication]` section to `configmap.yaml` per pod (origin is pod-specific:
`repl://kanidm-N.kanidm-headless.kanidm.svc.cluster.local:8444`).
3. Add the replication port (8444) back to the StatefulSet container ports and headless service.
4. Restore `rbac.yaml` for the cert-publisher sidecar, or exchange certificates manually:
```bash
# On each pod, get its replication certificate
kubectl exec -n kanidm kanidm-0 -- /sbin/kanidmd renew-replication-certificate
# Add each peer's certificate to the other pods' configs under:
# [replication."repl://<peer-fqdn>:8444"]
# type = "mutual-pull"
# partner_cert = "<cert>"
```
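
As an illustration of steps 2 and 4, the following sketch renders the `[replication]` stanzas for one pod from a map of peer certificates. It assumes three pods and the headless-service naming used above; `replication_section` and the certificate strings are hypothetical placeholders.

```python
PODS = ["kanidm-0", "kanidm-1", "kanidm-2"]


def repl_url(pod: str) -> str:
    """Replication URL for a pod behind the headless service."""
    return f"repl://{pod}.kanidm-headless.kanidm.svc.cluster.local:8444"


def replication_section(pod: str, partner_certs: dict[str, str]) -> str:
    """Render the replication config for `pod`, one stanza per peer."""
    lines = [
        "[replication]",
        f'origin = "{repl_url(pod)}"',
        'bindaddress = "[::]:8444"',
    ]
    for peer in PODS:
        if peer == pod:
            continue  # a pod never lists itself as a partner
        lines += [
            f'[replication."{repl_url(peer)}"]',
            'type = "mutual-pull"',
            f'partner_cert = "{partner_certs[peer]}"',
        ]
    return "\n".join(lines)


certs = {pod: f"<cert-for-{pod}>" for pod in PODS}  # placeholders only
print(replication_section("kanidm-0", certs))
```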
-40
@@ -1,40 +0,0 @@
---
apiVersion: v1
kind: ConfigMap
metadata:
name: kanidm-config
namespace: kanidm
labels:
app.kubernetes.io/name: kanidm
app.kubernetes.io/instance: kanidm
data:
server.toml: |
version = "2"
domain = "auth.unkin.net"
origin = "https://auth.unkin.net"
bindaddress = "[::]:8443"
db_path = "/data/kanidm.db"
db_arc_size = 2048
tls_chain = "/data/tls/tls.crt"
tls_key = "/data/tls/tls.key"
log_level = "info"
[online_backup]
path = "/data/backups/"
schedule = "0 22 * * *"
versions = 7
[replication]
origin = "__REPL_ORIGIN__"
bindaddress = "[::]:8444"
---
apiVersion: v1
kind: ConfigMap
metadata:
name: kanidm-repl-certs
namespace: kanidm
labels:
app.kubernetes.io/name: kanidm
app.kubernetes.io/instance: kanidm
data: {}
+13 -2
@@ -5,12 +5,23 @@ kind: Kustomization
resources:
- namespace.yaml
- serviceaccount.yaml
- rbac.yaml
- certificate.yaml
- configmap.yaml
- service.yaml
- statefulset.yaml
- poddisruptionbudget.yaml
- gateway.yaml
- httproute.yaml
- tlsroute.yaml
configMapGenerator:
- name: kanidm-config
namespace: kanidm
options:
disableNameSuffixHash: true
labels:
app.kubernetes.io/name: kanidm
app.kubernetes.io/instance: kanidm
files:
- server-0.toml=resources/server-0.toml
- server-1.toml=resources/server-1.toml
- server-2.toml=resources/server-2.toml
+27
@@ -0,0 +1,27 @@
version = "2"
domain = "auth.unkin.net"
origin = "https://auth.unkin.net"
bindaddress = "[::]:8443"
db_path = "/data/kanidm.db"
db_arc_size = 2048
tls_chain = "/data/tls/tls.crt"
tls_key = "/data/tls/tls.key"
log_level = "info"
[online_backup]
path = "/data/backups/"
schedule = "0 22 * * *"
versions = 7
[replication]
origin = "repl://kanidm-0.kanidm-headless.kanidm.svc.cluster.local:8444"
bindaddress = "[::]:8444"
[replication."repl://kanidm-1.kanidm-headless.kanidm.svc.cluster.local:8444"]
type = "mutual-pull"
partner_cert = "MIIB-TCCAZ-gAwIBAgIRASqOpORz60wiv7wF_7oBOxQwCgYIKoZIzj0EAwIwTDEtMCsGA1UEAwwkMmE4ZWE0ZTQtNzNlYi00YzIyLWJmYmMtMDVmZmJhMDEzYjE0MRswGQYDVQQKDBJLYW5pZG0gUmVwbGljYXRpb24wHhcNMjYwNTI1MTMyODM5WhcNMzAwNTI1MTMyODM5WjBMMS0wKwYDVQQDDCQyYThlYTRlNC03M2ViLTRjMjItYmZiYy0wNWZmYmEwMTNiMTQxGzAZBgNVBAoMEkthbmlkbSBSZXBsaWNhdGlvbjBZMBMGByqGSM49AgEGCCqGSM49AwEHA0IABFQP3zpFRt7TCOhzrUpOJBojn-sC2LmqZUub8P2ymVdIQbmoAyh4Q8Me0hNWJFyuFDnnqO06dt5I2iv0910-X6KjYjBgMCAGA1UdJQEB_wQWMBQGCCsGAQUFBwMCBggrBgEFBQcDATA8BgNVHREENTAzgjFrYW5pZG0tMS5rYW5pZG0taGVhZGxlc3Mua2FuaWRtLnN2Yy5jbHVzdGVyLmxvY2FsMAoGCCqGSM49BAMCA0gAMEUCIGjl58U6apcDjMEPIca8Wwg_JMfuMvV-uVcJI49Gl_9GAiEA2tFdb9rnFeBI7mwysScf5UsmY3ZziMD3UVm1vWN2IKs"
[replication."repl://kanidm-2.kanidm-headless.kanidm.svc.cluster.local:8444"]
type = "mutual-pull"
partner_cert = "MIIB-TCCAZ-gAwIBAgIRAeFGUAJbCkJ2vzf_Vv4qjeUwCgYIKoZIzj0EAwIwTDEtMCsGA1UEAwwkZTE0NjUwMDItNWIwYS00Mjc2LWJmMzctZmY1NmZlMmE4ZGU1MRswGQYDVQQKDBJLYW5pZG0gUmVwbGljYXRpb24wHhcNMjYwNTI1MTMyOTEwWhcNMzAwNTI1MTMyOTEwWjBMMS0wKwYDVQQDDCRlMTQ2NTAwMi01YjBhLTQyNzYtYmYzNy1mZjU2ZmUyYThkZTUxGzAZBgNVBAoMEkthbmlkbSBSZXBsaWNhdGlvbjBZMBMGByqGSM49AgEGCCqGSM49AwEHA0IABCrncHSbDNSV3_aOSZ14plbVfrvSXQQL9MOqvrDKlf_Q6WbcA8OrTUjs3Jt0Q2beWjC3Z5-5c9fGu8M_k2iVWf-jYjBgMCAGA1UdJQEB_wQWMBQGCCsGAQUFBwMCBggrBgEFBQcDATA8BgNVHREENTAzgjFrYW5pZG0tMi5rYW5pZG0taGVhZGxlc3Mua2FuaWRtLnN2Yy5jbHVzdGVyLmxvY2FsMAoGCCqGSM49BAMCA0gAMEUCIQDHY5Yl-bhDTuJaYnHSMSiSAEWPrDcRVzvfmOJukuJ1QQIgSwgyeSG3K0MY87DI1RDYAdZlpP1YOK3Yatj7-YSXPC0"
+27
@@ -0,0 +1,27 @@
version = "2"
domain = "auth.unkin.net"
origin = "https://auth.unkin.net"
bindaddress = "[::]:8443"
db_path = "/data/kanidm.db"
db_arc_size = 2048
tls_chain = "/data/tls/tls.crt"
tls_key = "/data/tls/tls.key"
log_level = "info"
[online_backup]
path = "/data/backups/"
schedule = "0 22 * * *"
versions = 7
[replication]
origin = "repl://kanidm-1.kanidm-headless.kanidm.svc.cluster.local:8444"
bindaddress = "[::]:8444"
[replication."repl://kanidm-0.kanidm-headless.kanidm.svc.cluster.local:8444"]
type = "mutual-pull"
partner_cert = "MIIB-jCCAZ-gAwIBAgIRAVKuoPDpF0IBnvFjCwdK41EwCgYIKoZIzj0EAwIwTDEtMCsGA1UEAwwkNTJhZWEwZjAtZTkxNy00MjAxLTllZjEtNjMwYjA3NGFlMzUxMRswGQYDVQQKDBJLYW5pZG0gUmVwbGljYXRpb24wHhcNMjYwNTI1MTMzNzQ5WhcNMzAwNTI1MTMzNzQ5WjBMMS0wKwYDVQQDDCQ1MmFlYTBmMC1lOTE3LTQyMDEtOWVmMS02MzBiMDc0YWUzNTExGzAZBgNVBAoMEkthbmlkbSBSZXBsaWNhdGlvbjBZMBMGByqGSM49AgEGCCqGSM49AwEHA0IABGejqjk0Eet-RILHI236wHYqISdnPlebqnkuUTh4W2mCzkmqKibyjxGIUOs8LBrUeTR2DxVR1VV6H2rYQk2wdROjYjBgMCAGA1UdJQEB_wQWMBQGCCsGAQUFBwMCBggrBgEFBQcDATA8BgNVHREENTAzgjFrYW5pZG0tMC5rYW5pZG0taGVhZGxlc3Mua2FuaWRtLnN2Yy5jbHVzdGVyLmxvY2FsMAoGCCqGSM49BAMCA0kAMEYCIQCSkFj2A-KVWv2tKJLFzb18J5eWWKtsvWewZTn-FVnRnQIhAKJbt84IoZ9oXxgfp0VOLyVZiAgUgwMFS6JOfno3D-Nw"
[replication."repl://kanidm-2.kanidm-headless.kanidm.svc.cluster.local:8444"]
type = "mutual-pull"
partner_cert = "MIIB-TCCAZ-gAwIBAgIRAeFGUAJbCkJ2vzf_Vv4qjeUwCgYIKoZIzj0EAwIwTDEtMCsGA1UEAwwkZTE0NjUwMDItNWIwYS00Mjc2LWJmMzctZmY1NmZlMmE4ZGU1MRswGQYDVQQKDBJLYW5pZG0gUmVwbGljYXRpb24wHhcNMjYwNTI1MTMyOTEwWhcNMzAwNTI1MTMyOTEwWjBMMS0wKwYDVQQDDCRlMTQ2NTAwMi01YjBhLTQyNzYtYmYzNy1mZjU2ZmUyYThkZTUxGzAZBgNVBAoMEkthbmlkbSBSZXBsaWNhdGlvbjBZMBMGByqGSM49AgEGCCqGSM49AwEHA0IABCrncHSbDNSV3_aOSZ14plbVfrvSXQQL9MOqvrDKlf_Q6WbcA8OrTUjs3Jt0Q2beWjC3Z5-5c9fGu8M_k2iVWf-jYjBgMCAGA1UdJQEB_wQWMBQGCCsGAQUFBwMCBggrBgEFBQcDATA8BgNVHREENTAzgjFrYW5pZG0tMi5rYW5pZG0taGVhZGxlc3Mua2FuaWRtLnN2Yy5jbHVzdGVyLmxvY2FsMAoGCCqGSM49BAMCA0gAMEUCIQDHY5Yl-bhDTuJaYnHSMSiSAEWPrDcRVzvfmOJukuJ1QQIgSwgyeSG3K0MY87DI1RDYAdZlpP1YOK3Yatj7-YSXPC0"
+27
@@ -0,0 +1,27 @@
version = "2"
domain = "auth.unkin.net"
origin = "https://auth.unkin.net"
bindaddress = "[::]:8443"
db_path = "/data/kanidm.db"
db_arc_size = 2048
tls_chain = "/data/tls/tls.crt"
tls_key = "/data/tls/tls.key"
log_level = "info"
[online_backup]
path = "/data/backups/"
schedule = "0 22 * * *"
versions = 7
[replication]
origin = "repl://kanidm-2.kanidm-headless.kanidm.svc.cluster.local:8444"
bindaddress = "[::]:8444"
[replication."repl://kanidm-0.kanidm-headless.kanidm.svc.cluster.local:8444"]
type = "mutual-pull"
partner_cert = "MIIB-jCCAZ-gAwIBAgIRAVKuoPDpF0IBnvFjCwdK41EwCgYIKoZIzj0EAwIwTDEtMCsGA1UEAwwkNTJhZWEwZjAtZTkxNy00MjAxLTllZjEtNjMwYjA3NGFlMzUxMRswGQYDVQQKDBJLYW5pZG0gUmVwbGljYXRpb24wHhcNMjYwNTI1MTMzNzQ5WhcNMzAwNTI1MTMzNzQ5WjBMMS0wKwYDVQQDDCQ1MmFlYTBmMC1lOTE3LTQyMDEtOWVmMS02MzBiMDc0YWUzNTExGzAZBgNVBAoMEkthbmlkbSBSZXBsaWNhdGlvbjBZMBMGByqGSM49AgEGCCqGSM49AwEHA0IABGejqjk0Eet-RILHI236wHYqISdnPlebqnkuUTh4W2mCzkmqKibyjxGIUOs8LBrUeTR2DxVR1VV6H2rYQk2wdROjYjBgMCAGA1UdJQEB_wQWMBQGCCsGAQUFBwMCBggrBgEFBQcDATA8BgNVHREENTAzgjFrYW5pZG0tMC5rYW5pZG0taGVhZGxlc3Mua2FuaWRtLnN2Yy5jbHVzdGVyLmxvY2FsMAoGCCqGSM49BAMCA0kAMEYCIQCSkFj2A-KVWv2tKJLFzb18J5eWWKtsvWewZTn-FVnRnQIhAKJbt84IoZ9oXxgfp0VOLyVZiAgUgwMFS6JOfno3D-Nw"
[replication."repl://kanidm-1.kanidm-headless.kanidm.svc.cluster.local:8444"]
type = "mutual-pull"
partner_cert = "MIIB-TCCAZ-gAwIBAgIRASqOpORz60wiv7wF_7oBOxQwCgYIKoZIzj0EAwIwTDEtMCsGA1UEAwwkMmE4ZWE0ZTQtNzNlYi00YzIyLWJmYmMtMDVmZmJhMDEzYjE0MRswGQYDVQQKDBJLYW5pZG0gUmVwbGljYXRpb24wHhcNMjYwNTI1MTMyODM5WhcNMzAwNTI1MTMyODM5WjBMMS0wKwYDVQQDDCQyYThlYTRlNC03M2ViLTRjMjItYmZiYy0wNWZmYmEwMTNiMTQxGzAZBgNVBAoMEkthbmlkbSBSZXBsaWNhdGlvbjBZMBMGByqGSM49AgEGCCqGSM49AwEHA0IABFQP3zpFRt7TCOhzrUpOJBojn-sC2LmqZUub8P2ymVdIQbmoAyh4Q8Me0hNWJFyuFDnnqO06dt5I2iv0910-X6KjYjBgMCAGA1UdJQEB_wQWMBQGCCsGAQUFBwMCBggrBgEFBQcDATA8BgNVHREENTAzgjFrYW5pZG0tMS5rYW5pZG0taGVhZGxlc3Mua2FuaWRtLnN2Yy5jbHVzdGVyLmxvY2FsMAoGCCqGSM49BAMCA0gAMEUCIGjl58U6apcDjMEPIca8Wwg_JMfuMvV-uVcJI49Gl_9GAiEA2tFdb9rnFeBI7mwysScf5UsmY3ZziMD3UVm1vWN2IKs"
+3 -49
@@ -36,24 +36,10 @@ spec:
fsGroup: 1000
initContainers:
- name: config-init
image: kanidm/server:1.10.3
image: busybox:1.36
command: ["/bin/sh", "-c"]
args:
- |
set -e
REPL_ORIGIN="repl://${POD_NAME}.kanidm-headless.kanidm.svc.cluster.local:8444"
sed "s|__REPL_ORIGIN__|${REPL_ORIGIN}|g" /config-template/server.toml > /config/server.toml
for peer in kanidm-0 kanidm-1 kanidm-2; do
if [ "${peer}" = "${POD_NAME}" ]; then
continue
fi
cert_file="/repl-certs/${peer}"
if [ -s "${cert_file}" ]; then
fqdn="${peer}.kanidm-headless.kanidm.svc.cluster.local"
printf '\n[replication."repl://%s:8444"]\ntype = "mutual-pull"\npartner_cert = "%s"\n' \
"${fqdn}" "$(cat ${cert_file})" >> /config/server.toml
fi
done
- cp "/config-template/server-${POD_NAME##*-}.toml" /config/server.toml
env:
- name: POD_NAME
valueFrom:
@@ -62,41 +48,12 @@ spec:
volumeMounts:
- name: config-template
mountPath: /config-template
readOnly: true
- name: config
mountPath: /config
- name: repl-certs
mountPath: /repl-certs
readOnly: true
securityContext:
allowPrivilegeEscalation: false
readOnlyRootFilesystem: true
- name: repl-cert-publisher
image: bitnami/kubectl:1.33
restartPolicy: Always
command: ["/bin/sh", "-c"]
args:
- |
until kubectl exec "${POD_NAME}" -c kanidm -- /sbin/kanidmd renew-replication-certificate 2>/dev/null | grep -q '^# certificate:'; do
sleep 30
done
while true; do
cert=$(kubectl exec "${POD_NAME}" -c kanidm -- /sbin/kanidmd renew-replication-certificate 2>/dev/null \
| grep '^# certificate:' | sed 's/^# certificate: "\(.*\)"$/\1/')
if [ -n "${cert}" ]; then
kubectl patch configmap kanidm-repl-certs \
--type=merge \
-p "{\"data\":{\"${POD_NAME}\":\"${cert}\"}}"
fi
sleep 3600
done
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
securityContext:
allowPrivilegeEscalation: false
readOnlyRootFilesystem: false
containers:
- name: kanidm
image: kanidm/server:1.10.3
@@ -144,9 +101,6 @@ spec:
name: kanidm-config
- name: config
emptyDir: {}
- name: repl-certs
configMap:
name: kanidm-repl-certs
- name: tls
secret:
secretName: kanidm-tls
+7 -2
@@ -13,9 +13,14 @@ spec:
- auth.unkin.net
- au.auth.unkin.net
parentRefs:
- name: kanidm
- group: gateway.networking.k8s.io
kind: Gateway
name: kanidm
sectionName: https-passthrough
rules:
- backendRefs:
- name: kanidm
- group: ""
kind: Service
name: kanidm
port: 8443
weight: 1
+91
@@ -0,0 +1,91 @@
---
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
name: open-webui-postgres
namespace: open-webui
spec:
affinity:
podAntiAffinityType: preferred
bootstrap:
initdb:
database: open-webui
encoding: UTF8
localeCType: C
localeCollate: C
owner: open-webui
secret:
name: postgres-credentials
enablePDB: true
enableSuperuserAccess: false
failoverDelay: 0
imageName: ghcr.io/cloudnative-pg/postgresql:17-minimal-trixie
instances: 3
logLevel: info
maxSyncReplicas: 0
minSyncReplicas: 0
monitoring:
customQueriesConfigMap:
- key: queries
name: cnpg-default-monitoring
disableDefaultQueries: false
enablePodMonitor: false
postgresql:
parameters:
archive_mode: "on"
archive_timeout: 5min
dynamic_shared_memory_type: posix
effective_cache_size: 128MB
full_page_writes: "on"
log_destination: csvlog
log_directory: /controller/log
log_filename: postgres
log_rotation_age: "0"
log_rotation_size: "0"
log_truncate_on_rotation: "false"
logging_collector: "on"
max_connections: "100"
max_parallel_workers: "4"
max_replication_slots: "16"
max_worker_processes: "4"
shared_buffers: 64MB
shared_memory_type: mmap
ssl_max_protocol_version: TLSv1.3
ssl_min_protocol_version: TLSv1.3
wal_keep_size: 128MB
wal_level: logical
wal_log_hints: "on"
wal_receiver_timeout: 5s
wal_sender_timeout: 5s
syncReplicaElectionConstraint:
enabled: false
primaryUpdateMethod: restart
primaryUpdateStrategy: unsupervised
probes:
liveness:
isolationCheck:
connectionTimeout: 1000
enabled: true
requestTimeout: 1000
replicationSlots:
highAvailability:
enabled: true
slotPrefix: _cnpg_
synchronizeReplicas:
enabled: true
updateInterval: 30
resources:
limits:
cpu: 500m
memory: 512Mi
requests:
cpu: 100m
memory: 256Mi
smartShutdownTimeout: 180
startDelay: 3600
stopDelay: 1800
storage:
resizeInUseVolumes: true
size: 5Gi
storageClass: cephrbd-fast-delete
switchoverDelay: 3600
+33
@@ -0,0 +1,33 @@
---
apiVersion: postgresql.cnpg.io/v1
kind: Pooler
metadata:
name: open-webui-postgres-pooler
namespace: open-webui
spec:
cluster:
name: open-webui-postgres
instances: 2
pgbouncer:
parameters:
default_pool_size: "50"
max_client_conn: "200"
paused: false
poolMode: transaction
template:
metadata:
labels:
app: pooler
spec:
affinity:
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchExpressions:
- key: app
operator: In
values:
- pooler
topologyKey: kubernetes.io/hostname
containers: []
type: rw
+61
@@ -0,0 +1,61 @@
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: open-webui
namespace: open-webui
spec:
replicas: 3
selector:
matchLabels:
app: open-webui
template:
metadata:
annotations:
reloader.stakater.com/auto: "true"
labels:
app: open-webui
spec:
priorityClassName: power
containers:
- name: open-webui
image: ghcr.io/open-webui/open-webui:main
imagePullPolicy: Always
ports:
- containerPort: 8080
name: http
protocol: TCP
env:
- name: OPENAI_API_BASE_URL
value: https://litellm.k8s.syd1.au.unkin.net
- name: WEBUI_URL
value: https://chat.k8s.syd1.au.unkin.net
envFrom:
- secretRef:
name: open-webui-credentials
livenessProbe:
httpGet:
path: /health
port: 8080
failureThreshold: 3
initialDelaySeconds: 30
periodSeconds: 30
successThreshold: 1
timeoutSeconds: 5
readinessProbe:
httpGet:
path: /health
port: 8080
failureThreshold: 3
initialDelaySeconds: 10
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 5
resources:
limits:
cpu: "2"
memory: 4Gi
requests:
cpu: 250m
memory: 512Mi
restartPolicy: Always
+37
@@ -0,0 +1,37 @@
---
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
labels:
traefik.io/instance: internal
annotations:
cert-manager.io/cluster-issuer: vault-issuer
cert-manager.io/common-name: chat.k8s.syd1.au.unkin.net
cert-manager.io/private-key-size: "4096"
external-dns.alpha.kubernetes.io/hostname: chat.k8s.syd1.au.unkin.net
external-dns.alpha.kubernetes.io/target: 198.18.200.4
name: open-webui
namespace: open-webui
spec:
gatewayClassName: traefik-internal
listeners:
- allowedRoutes:
namespaces:
from: Same
hostname: chat.k8s.syd1.au.unkin.net
name: http
port: 80
protocol: HTTP
- allowedRoutes:
namespaces:
from: Same
hostname: chat.k8s.syd1.au.unkin.net
name: https
port: 443
protocol: HTTPS
tls:
certificateRefs:
- group: ""
kind: Secret
name: open-webui-tls
mode: Terminate
+53
@@ -0,0 +1,53 @@
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
name: open-webui-http-redirect
namespace: open-webui
spec:
hostnames:
- chat.k8s.syd1.au.unkin.net
parentRefs:
- group: gateway.networking.k8s.io
kind: Gateway
name: open-webui
sectionName: http
rules:
- filters:
- type: RequestRedirect
requestRedirect:
scheme: https
statusCode: 301
matches:
- path:
type: PathPrefix
value: /
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
name: open-webui
namespace: open-webui
spec:
hostnames:
- chat.k8s.syd1.au.unkin.net
parentRefs:
- group: gateway.networking.k8s.io
kind: Gateway
name: open-webui
sectionName: https
rules:
- backendRefs:
- group: ""
kind: Service
name: open-webui
port: 8080
weight: 1
matches:
- path:
type: PathPrefix
value: /
sessionPersistence:
type: Cookie
cookieName: open-webui-backend
absoluteTimeout: 24h0m0s
+15
@@ -0,0 +1,15 @@
---
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- namespace.yaml
- cnpg_cluster.yaml
- cnpg_pooler.yaml
- deployment.yaml
- pdb.yaml
- service.yaml
- gateway.yaml
- httproute.yaml
- vaultauth.yaml
- vaultstaticsecret.yaml
+5
@@ -0,0 +1,5 @@
---
apiVersion: v1
kind: Namespace
metadata:
name: open-webui
+11
@@ -0,0 +1,11 @@
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
name: open-webui
namespace: open-webui
spec:
minAvailable: 1
selector:
matchLabels:
app: open-webui
+17
@@ -0,0 +1,17 @@
---
apiVersion: v1
kind: Service
metadata:
name: open-webui
namespace: open-webui
spec:
internalTrafficPolicy: Cluster
ports:
- name: http
port: 8080
protocol: TCP
targetPort: http
selector:
app: open-webui
sessionAffinity: None
type: ClusterIP
+18
@@ -0,0 +1,18 @@
---
apiVersion: secrets.hashicorp.com/v1beta1
kind: VaultAuth
metadata:
name: default
namespace: open-webui
spec:
allowedNamespaces:
- open-webui
kubernetes:
audiences:
- vault
role: default
serviceAccount: default
tokenExpirationSeconds: 600
method: kubernetes
mount: k8s/au/syd1
vaultConnectionRef: vso-system/default
@@ -0,0 +1,34 @@
---
apiVersion: secrets.hashicorp.com/v1beta1
kind: VaultStaticSecret
metadata:
name: postgres-credentials
namespace: open-webui
spec:
destination:
create: true
name: postgres-credentials
overwrite: true
hmacSecretData: true
mount: kv
path: kubernetes/namespace/open-webui/default/postgres-credentials
refreshAfter: 5m
type: kv-v2
vaultAuthRef: default
---
apiVersion: secrets.hashicorp.com/v1beta1
kind: VaultStaticSecret
metadata:
name: open-webui-credentials
namespace: open-webui
spec:
destination:
create: true
name: open-webui-credentials
overwrite: true
hmacSecretData: true
mount: kv
path: kubernetes/namespace/open-webui/default/open-webui-credentials
refreshAfter: 5m
type: kv-v2
vaultAuthRef: default
+1 -1
@@ -150,7 +150,7 @@ spec:
memory: 350Mi
cpu: 100m
limits:
memory: 1Gi
memory: 1024Mi
cpu: 500m
securityContext:
runAsNonRoot: true
+1 -1
@@ -35,7 +35,7 @@ spec:
imagePullPolicy: IfNotPresent
resources:
limits:
cpu: "1"
cpu: 1
memory: 1536Mi
requests:
cpu: 250m
@@ -31,11 +31,11 @@ spec:
imagePullPolicy: IfNotPresent
resources:
limits:
cpu: "2"
memory: 3Gi
cpu: 2
memory: 3072Mi
requests:
cpu: 500m
memory: 1Gi
memory: 1024Mi
ports:
- containerPort: 8140
name: puppetserver
@@ -35,11 +35,11 @@ spec:
imagePullPolicy: IfNotPresent
resources:
limits:
cpu: "2"
cpu: 2
memory: 3500Mi
requests:
cpu: 250m
memory: 1Gi
memory: 1024Mi
ports:
- containerPort: 8140
name: puppetserver
-1
@@ -9,7 +9,6 @@ metadata:
name: puppetdb
namespace: puppet
spec:
clusterIP: null
ports:
- name: pdb-http
port: 8080
@@ -53,7 +53,7 @@ spec:
cpu: 500m
memory: 1Gi
limits:
cpu: "2"
cpu: 2000m
memory: 4Gi
volumeMounts:
- name: repodata
@@ -56,7 +56,7 @@ spec:
cpu: 500m
memory: 1Gi
limits:
cpu: "2"
cpu: 2000m
memory: 4Gi
volumeMounts:
- name: repodata
@@ -53,7 +53,7 @@ spec:
cpu: 500m
memory: 1Gi
limits:
cpu: "2"
cpu: 2000m
memory: 4Gi
volumeMounts:
- name: repodata
@@ -52,7 +52,7 @@ spec:
cpu: 500m
memory: 1Gi
limits:
cpu: "2"
cpu: 2000m
memory: 4Gi
volumeMounts:
- name: repodata
+1 -1
@@ -37,7 +37,7 @@ server:
cpu: 100m
limits:
memory: 2Gi
cpu: 1000m
cpu: "1"
client:
enabled: false
@@ -0,0 +1,6 @@
---
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- ../../../base/open-webui
+1
@@ -11,6 +11,7 @@ spec:
revision: HEAD
directories:
- path: apps/overlays/*/litellm
- path: apps/overlays/*/open-webui
- path: apps/overlays/*/paperclip
template:
metadata:
+2
@@ -11,6 +11,8 @@ spec:
destinations:
- namespace: 'litellm'
server: https://kubernetes.default.svc
- namespace: 'open-webui'
server: https://kubernetes.default.svc
- namespace: 'paperclip'
server: https://kubernetes.default.svc
clusterResourceWhitelist:
-94
@@ -1,94 +0,0 @@
package main
# Gateway API resources require several fields to be set explicitly even though
# the Gateway API controller defaults them. ArgoCD diffs desired vs live by
# string comparison, so any field the controller defaults that is absent from
# the git manifest causes a permanent OutOfSync.
#
# Affected resources:
# HTTPRoute / TLSRoute — parentRefs and backendRefs (see PR #162, #165)
# Gateway — listeners[*].tls.certificateRefs (see PR #153)
_route_kinds := {"HTTPRoute", "TLSRoute"}
# ---- parentRefs: group must be "gateway.networking.k8s.io" ----
deny contains msg if {
_route_kinds[input.kind]
ref := input.spec.parentRefs[i]
object.get(ref, "group", null) != "gateway.networking.k8s.io"
msg := sprintf(
"%s %s/%s parentRefs[%d]: add 'group: gateway.networking.k8s.io' — controller defaults this field, causing ArgoCD OutOfSync when omitted",
[input.kind, input.metadata.namespace, input.metadata.name, i],
)
}
# ---- parentRefs: kind must be "Gateway" ----
deny contains msg if {
_route_kinds[input.kind]
ref := input.spec.parentRefs[i]
object.get(ref, "kind", null) != "Gateway"
msg := sprintf(
"%s %s/%s parentRefs[%d]: add 'kind: Gateway' — controller defaults this field, causing ArgoCD OutOfSync when omitted",
[input.kind, input.metadata.namespace, input.metadata.name, i],
)
}
# ---- backendRefs: group must be present (may be empty string "") ----
deny contains msg if {
_route_kinds[input.kind]
rule := input.spec.rules[ri]
ref := rule.backendRefs[bi]
not _has_key(ref, "group")
msg := sprintf(
"%s %s/%s rules[%d].backendRefs[%d]: add 'group: \"\"' — controller defaults this field, causing ArgoCD OutOfSync when omitted",
[input.kind, input.metadata.namespace, input.metadata.name, ri, bi],
)
}
# ---- backendRefs: kind must be "Service" ----
deny contains msg if {
_route_kinds[input.kind]
rule := input.spec.rules[ri]
ref := rule.backendRefs[bi]
object.get(ref, "kind", null) != "Service"
msg := sprintf(
"%s %s/%s rules[%d].backendRefs[%d]: add 'kind: Service' — controller defaults this field, causing ArgoCD OutOfSync when omitted",
[input.kind, input.metadata.namespace, input.metadata.name, ri, bi],
)
}
# ---- backendRefs: weight must be present ----
deny contains msg if {
_route_kinds[input.kind]
rule := input.spec.rules[ri]
ref := rule.backendRefs[bi]
not _has_key(ref, "weight")
msg := sprintf(
"%s %s/%s rules[%d].backendRefs[%d]: add 'weight: 1' — controller defaults this field, causing ArgoCD OutOfSync when omitted",
[input.kind, input.metadata.namespace, input.metadata.name, ri, bi],
)
}
# ---- Gateway certificateRefs: group must be present (may be empty string "") ----
deny contains msg if {
input.kind == "Gateway"
listener := input.spec.listeners[li]
ref := listener.tls.certificateRefs[ci]
not _has_key(ref, "group")
msg := sprintf(
"Gateway %s/%s listeners[%d].tls.certificateRefs[%d]: add 'group: \"\"' — admission webhook defaults this field, causing ArgoCD OutOfSync when omitted",
[input.metadata.namespace, input.metadata.name, li, ci],
)
}
# ---- Helper: key presence check (works for null, "", and any defined value) ----
_has_key(obj, key) if {
_ = obj[key]
}
-13
@@ -1,13 +0,0 @@
package main
# Deny all Kubernetes Ingress resources.
# This cluster uses Gateway API (HTTPRoute + Gateway) for ingress routing.
# Ingress is the legacy API and must not be added.
deny contains msg if {
input.kind == "Ingress"
msg := sprintf(
"%s/%s: Ingress resources are forbidden — use Gateway API HTTPRoute instead",
[input.metadata.namespace, input.metadata.name],
)
}
-173
@@ -1,173 +0,0 @@
package main
# Kubernetes normalizes resource quantity values on write. ArgoCD diffs by
# string comparison, so a non-canonical value in git will always differ from
# the live object, causing permanent OutOfSync.
#
# Rules enforced here:
# CPU integers — k8s converts integer 1 to string "1" (see PR #163)
# CPU milliCPU — k8s converts 1000m → "1", 2000m → "2", etc. (PR #164)
# Memory Mi→Gi — k8s converts 1024Mi → 1Gi, 2048Mi → 2Gi, etc. (PR #163)
# clusterIP null — k8s assigns a real IP; null in git always differs
# from the live assigned value (see PR #166)
# ---- Container helpers ----
# Extracts containers from Deployment/StatefulSet/DaemonSet/Job/CronJob/Pod
_containers contains c if { c := input.spec.template.spec.containers[_] }
_containers contains c if { c := input.spec.template.spec.initContainers[_] }
_containers contains c if { c := input.spec.containers[_] }
_containers contains c if { c := input.spec.initContainers[_] }
_containers contains c if { c := input.spec.jobTemplate.spec.template.spec.containers[_] }
# ---- CPU: must not be an integer ----
# YAML `cpu: 1` (unquoted) parses as JSON integer; k8s stores as string "1".
deny contains msg if {
c := _containers[_]
cpu := c.resources.limits.cpu
is_number(cpu)
msg := sprintf(
"%s container %q: cpu limit %v is an unquoted integer — use \"%v\" (string) to prevent ArgoCD OutOfSync",
[input.kind, c.name, cpu, cpu],
)
}
deny contains msg if {
c := _containers[_]
cpu := c.resources.requests.cpu
is_number(cpu)
msg := sprintf(
"%s container %q: cpu request %v is an unquoted integer — use \"%v\" (string) to prevent ArgoCD OutOfSync",
[input.kind, c.name, cpu, cpu],
)
}
deny contains msg if {
input.kind == "Cluster"
cpu := input.spec.resources.limits.cpu
is_number(cpu)
msg := sprintf(
"Cluster %s/%s: cpu limit %v is an unquoted integer — use \"%v\" (string) to prevent ArgoCD OutOfSync",
[input.metadata.namespace, input.metadata.name, cpu, cpu],
)
}
deny contains msg if {
input.kind == "Cluster"
cpu := input.spec.resources.requests.cpu
is_number(cpu)
msg := sprintf(
"Cluster %s/%s: cpu request %v is an unquoted integer — use \"%v\" (string) to prevent ArgoCD OutOfSync",
[input.metadata.namespace, input.metadata.name, cpu, cpu],
)
}
# ---- CPU: milliCPU divisible by 1000 normalizes to a whole number ----
# k8s converts 1000m → "1", 2000m → "2". Use the canonical whole-number form.
_milli_whole(cpu) := n if {
is_string(cpu)
endswith(cpu, "m")
val := to_number(substring(cpu, 0, count(cpu) - 1))
val % 1000 == 0
val > 0
n := val / 1000
}
deny contains msg if {
c := _containers[_]
n := _milli_whole(c.resources.limits.cpu)
msg := sprintf(
"%s container %q: cpu limit %q normalizes to \"%v\" — use canonical form to prevent ArgoCD OutOfSync",
[input.kind, c.name, c.resources.limits.cpu, n],
)
}
deny contains msg if {
c := _containers[_]
n := _milli_whole(c.resources.requests.cpu)
msg := sprintf(
"%s container %q: cpu request %q normalizes to \"%v\" — use canonical form to prevent ArgoCD OutOfSync",
[input.kind, c.name, c.resources.requests.cpu, n],
)
}
deny contains msg if {
input.kind == "Cluster"
n := _milli_whole(input.spec.resources.limits.cpu)
msg := sprintf(
"Cluster %s/%s: cpu limit %q normalizes to \"%v\" — use canonical form to prevent ArgoCD OutOfSync",
[input.metadata.namespace, input.metadata.name, input.spec.resources.limits.cpu, n],
)
}
deny contains msg if {
input.kind == "Cluster"
n := _milli_whole(input.spec.resources.requests.cpu)
msg := sprintf(
"Cluster %s/%s: cpu request %q normalizes to \"%v\" — use canonical form to prevent ArgoCD OutOfSync",
[input.metadata.namespace, input.metadata.name, input.spec.resources.requests.cpu, n],
)
}
# ---- Memory: Mi values divisible by 1024 normalize to Gi ----
# k8s converts 1024Mi → 1Gi, 2048Mi → 2Gi, etc.
_mi_canonical(mem) := canonical if {
endswith(mem, "Mi")
val := to_number(substring(mem, 0, count(mem) - 2))
val % 1024 == 0
val > 0
canonical := sprintf("%vGi", [val / 1024])
}
deny contains msg if {
c := _containers[_]
canonical := _mi_canonical(c.resources.limits.memory)
msg := sprintf(
"%s container %q: memory limit %q normalizes to %q — use canonical form to prevent ArgoCD OutOfSync",
[input.kind, c.name, c.resources.limits.memory, canonical],
)
}
deny contains msg if {
c := _containers[_]
canonical := _mi_canonical(c.resources.requests.memory)
msg := sprintf(
"%s container %q: memory request %q normalizes to %q — use canonical form to prevent ArgoCD OutOfSync",
[input.kind, c.name, c.resources.requests.memory, canonical],
)
}
deny contains msg if {
input.kind == "Cluster"
canonical := _mi_canonical(input.spec.resources.limits.memory)
msg := sprintf(
"Cluster %s/%s: memory limit %q normalizes to %q use canonical form to prevent ArgoCD OutOfSync",
[input.metadata.namespace, input.metadata.name, input.spec.resources.limits.memory, canonical],
)
}
deny contains msg if {
input.kind == "Cluster"
canonical := _mi_canonical(input.spec.resources.requests.memory)
msg := sprintf(
"Cluster %s/%s: memory request %q normalizes to %q — use canonical form to prevent ArgoCD OutOfSync",
[input.metadata.namespace, input.metadata.name, input.spec.resources.requests.memory, canonical],
)
}
# ---- Service: clusterIP must not be null ----
# k8s assigns a real IP on creation; clusterIP is immutable afterward.
# Setting null in git means desired=null vs live=10.x.x.x → permanent OutOfSync.
# Remove the field and let k8s own it.
deny contains msg if {
input.kind == "Service"
input.spec.clusterIP == null
msg := sprintf(
"Service %s/%s has 'clusterIP: null' remove this field; Kubernetes assigns the IP on creation and it causes ArgoCD OutOfSync",
[input.metadata.namespace, input.metadata.name],
)
}