17 Commits

Author SHA1 Message Date
unkinben d4b66bb651 fix: use chart logLevel value instead of duplicate extraArg
ci/woodpecker/pr/pre-commit Pipeline was successful
ci/woodpecker/pr/kubeconform Pipeline was successful
2026-05-23 01:08:49 +10:00
unkinben 1944dbbfcd temp: enable debug logging on externaldns to diagnose TLSRoute sync timeout (#140)
Temporary: enable --log-level=debug to understand why the TLSRoute informer never reports HasSynced within the 1m interval. To be closed/reverted after root cause is found.
Reviewed-on: #140
2026-05-23 01:07:45 +10:00
unkinben 0940cc20f8 fix(traefik): listen on port 443 directly for Gateway API compatibility (#138)
## Problem

Gateway listeners with `port: 443` were rejected with `PortUnavailable: Cannot find entryPoint for Gateway: no matching entryPoint for port 443 and protocol "HTTPS"`.

Traefik matches Gateway listener ports against its internal entryPoint ports (pod-level), not the Service's `exposedPort`. The `websecure` entryPoint was configured on port `8443`, so port `443` listeners had no match.
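
The mismatch can be sketched in a few lines. This is an illustration only, not Traefik's actual code: `find_entry_point` is a hypothetical helper, and only the `websecure` entryPoint on `8443` and the listener port `443` are taken from the description above.

```python
from typing import Dict, Optional


def find_entry_point(entry_points: Dict[str, int], listener_port: int) -> Optional[str]:
    """Return the name of the entryPoint bound to listener_port, or None."""
    for name, port in entry_points.items():
        if port == listener_port:
            return name
    return None


# Before the fix: websecure binds 8443 inside the pod.
before = {"websecure": 8443}
# After the fix: websecure binds 443 directly.
after = {"websecure": 443}

print(find_entry_point(before, 443))  # None, so the listener is rejected
print(find_entry_point(after, 443))   # websecure
```

The Service `exposedPort` never enters the lookup, which is why changing it alone would not have helped.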

## Fix

- `ports.websecure.port: 443` — Traefik now binds directly on 443
- `securityContext.capabilities.add: [NET_BIND_SERVICE]` — allows a non-root process to bind to privileged ports (<1024)

The Service `exposedPort` stays at `443`, so external connectivity is unchanged. All existing Gateway listeners (`port: 443`) are correct as-is.

Applies to both internal and external Traefik instances.

## Test plan

- [ ] Traefik pods restart cleanly
- [ ] `kubectl get gateway -A` shows listeners as `Programmed: True`
- [ ] `https://rancher.k8s.syd1.au.unkin.net` (already merged) is reachable

Reviewed-on: #138
2026-05-23 00:44:13 +10:00
unkinben 20ce2b1b92 feat(cattle-system): migrate rancher Ingress to Gateway API (#132)
## Summary

- Replace `Ingress` (nginx) with `Gateway` + `HTTPRoute` using `traefik-internal` GatewayClass
- TLS terminated at the Gateway listener; cert-manager provisions the certificate via `vault-issuer`
- external-dns annotations moved to the Gateway

## Test plan

- [ ] ArgoCD syncs the cattle-system app cleanly
- [ ] cert-manager issues the `rancher-tls` certificate
- [ ] external-dns creates the DNS record
- [ ] `https://rancher.k8s.syd1.au.unkin.net` is reachable

Reviewed-on: #132
2026-05-23 00:24:57 +10:00
unkinben 64dc5a0242 fix(traefik): add instance labels to GatewayClasses (#137)
## Problem

GatewayClasses were `Unknown` even after controllerName was fixed. The `kubernetesGateway` `labelSelector` applies to all watched resources, including GatewayClasses themselves. Since neither GatewayClass had a `traefik.io/instance` label, both Traefik instances filtered them out and never accepted them.
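
A minimal sketch of the filtering effect, assuming a plain `key=value` equality selector. `matches_selector` is a hypothetical helper for illustration, not part of Traefik or the Kubernetes client libraries.

```python
from typing import Dict


def matches_selector(labels: Dict[str, str], selector: str) -> bool:
    """Evaluate a simple equality selector of the form 'key=value'."""
    key, _, value = selector.partition("=")
    return labels.get(key) == value


selector = "traefik.io/instance=internal"

unlabelled_class = {}                                  # GatewayClass before the fix
labelled_class = {"traefik.io/instance": "internal"}   # GatewayClass after the fix
other_instance = {"traefik.io/instance": "external"}   # belongs to the other Traefik

print(matches_selector(unlabelled_class, selector))  # False: filtered out, never accepted
print(matches_selector(labelled_class, selector))    # True
print(matches_selector(other_instance, selector))    # False
```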

## Fix

- `gatewayclass-internal.yaml`: add `traefik.io/instance: internal`
- `gatewayclass-external.yaml`: add `traefik.io/instance: external`

## Test plan

- [ ] `kubectl get gatewayclass` shows both as `Accepted: True`

Reviewed-on: #137
2026-05-23 00:23:18 +10:00
unkinben 57c14d32c0 fix(traefik): remove invalid controllerName flag causing CrashLoopBackOff (#136)
## URGENT — Traefik pods are CrashLoopBackOff

The merged PR #135 added `--providers.kubernetesgateway.controllerName` as an `additionalArguments` entry. Traefik v3.7.0 does not support this flag and fails immediately on startup.

Old replica sets are still running (one pod each) but new pods cannot come up.

## Fix

- Remove `additionalArguments` from both `values-internal.yaml` and `values-external.yaml`
- Revert GatewayClass `controllerName` back to `traefik.io/gateway-controller` (the hardcoded Traefik default — no override mechanism exists in v3.7.0)

## After merge

GatewayClasses will remain `Unknown` until a separate solution for internal/external separation is implemented (the `labelSelector` approach needs further investigation).

Reviewed-on: #136
2026-05-22 23:58:56 +10:00
unkinben 2df359c4a9 fix(traefik): set controllerName on GatewayClasses and Traefik providers (#135)
## Problem

Both GatewayClasses (`traefik-internal`, `traefik-external`) were stuck as `Unknown`. Neither Traefik deployment had `controllerName` set in `kubernetesGateway`, so both defaulted to `traefik.io/gateway-controller` — which matched neither GatewayClass.

## Fix

- `gatewayclass-internal.yaml`: `controllerName: traefik.io/gateway-controller-internal`
- `gatewayclass-external.yaml`: `controllerName: traefik.io/gateway-controller-external`
- `values-internal.yaml`: added `controllerName: traefik.io/gateway-controller-internal`
- `values-external.yaml`: added `controllerName: traefik.io/gateway-controller-external`

## Test plan

- [ ] ArgoCD syncs traefik-system cleanly
- [ ] `kubectl get gatewayclass` shows both as `Accepted: True`

Reviewed-on: #135
2026-05-22 23:44:06 +10:00
unkinben f53a2dc4f8 fix: terraform_vault must be RFC1123 compliant (#128)
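The commit carries no description. As a hedged illustration of the rule named in its title, the sketch below checks a name against the RFC 1123 label form that Kubernetes documents (lowercase alphanumerics and `-`, starting and ending with an alphanumeric, at most 63 characters). `is_rfc1123_label` is a hypothetical helper, and the assumption is that the underscore in `terraform_vault` was the problem.

```python
import re

# RFC 1123 label: lowercase alphanumerics and '-', alphanumeric at both ends.
RFC1123_LABEL = re.compile(r"[a-z0-9]([-a-z0-9]*[a-z0-9])?")


def is_rfc1123_label(name: str) -> bool:
    """Return True if name is a valid RFC 1123 label of at most 63 characters."""
    return len(name) <= 63 and RFC1123_LABEL.fullmatch(name) is not None


print(is_rfc1123_label("terraform_vault"))  # False: '_' is not allowed
print(is_rfc1123_label("terraform-vault"))  # True
```
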
Reviewed-on: #128
2026-05-21 23:19:20 +10:00
unkinben c5dd3cc5cb feat: add terraform_vault role (#127)
Adds a service account for running the terraform_vault workflows, so that
they can use its JWT to generate a token.

Reviewed-on: #127
2026-05-21 23:13:48 +10:00
unkinben 462b2b3f4f feat(externaldns): add Gateway API sources for httproute, tlsroute, grpcroute, tcproute, udproute (#126)
Reviewed-on: #126
2026-05-18 00:11:33 +10:00
unkinben 73c9b3f603 fix(traefik): replace invalid controllername flag with labelSelector for v3 (#125)
Remove --providers.kubernetesgateway.controllername which does not exist in
Traefik v3, update GatewayClass controllerName to the standard v3 value, and
use labelSelector on each instance's kubernetesGateway provider to differentiate
internal vs external traffic.

Reviewed-on: #125
2026-05-18 00:03:12 +10:00
unkinben 9a01a9ef19 fix: enable gateway/ingress class on platform project (#124)
- add the missing classes (GatewayClass, IngressClass) to the platform project; they are required to deploy traefik-system

Reviewed-on: #124
2026-05-17 23:56:12 +10:00
unkinben 53553ddcfd feat: deploy internal/external traefik routers (#119)
Deploy Traefik for internal and external applications. Port forwarding
from the external routers will only target the IP of the
traefik-external service.

- traefik-internal and traefik-external added
- each is a separate deployment

Reviewed-on: #119
2026-05-17 23:44:50 +10:00
unkinben 5d3ff3a0f4 feat(artifactapi): allow kubeconform and kustomize from GitHub (#123)
Adds immutable patterns for yannh/kubeconform and kubernetes-sigs/kustomize
to fix 403 Forbidden errors when downloading their Linux amd64 releases.
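
A rough sketch of how the two new patterns behave as regular expressions. How artifactapi applies them is an assumption here (`re.search` against the request path), `matches_immutable_pattern` is a hypothetical helper, and the release paths are made up for illustration.

```python
import re

# The two patterns added by this commit.
IMMUTABLE_PATTERNS = [
    r"yannh/kubeconform/.*/kubeconform-linux-amd64.tar.gz$",
    r"kubernetes-sigs/kustomize/.*/kustomize_.*_linux_amd64.tar.gz$",
]


def matches_immutable_pattern(path: str) -> bool:
    # Assumption: a path qualifies when any pattern is found anywhere in it.
    return any(re.search(pattern, path) for pattern in IMMUTABLE_PATTERNS)


# Hypothetical release paths.
linux = "yannh/kubeconform/releases/download/v0.6.7/kubeconform-linux-amd64.tar.gz"
darwin = "yannh/kubeconform/releases/download/v0.6.7/kubeconform-darwin-arm64.tar.gz"

print(matches_immutable_pattern(linux))   # True
print(matches_immutable_pattern(darwin))  # False
```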

Reviewed-on: #123
2026-05-17 12:19:27 +10:00
unkinben c3002dc3c1 feat(artifactapi): allow kubecolor releases from GitHub (#122)
Reviewed-on: #122
2026-05-11 23:39:48 +10:00
unkinben 27db33536a feat(artifactapi): allow almalinux, debian, and fedora from Docker Hub (#121)
Reviewed-on: #121
2026-05-10 22:56:39 +10:00
unkinben 8a7068a1c4 feat(artifactapi): add argo-helm as a remote and virtual helm member (#120)
Reviewed-on: #120
2026-05-10 22:53:43 +10:00
24 changed files with 318 additions and 176 deletions
@@ -1,18 +0,0 @@
---
apiVersion: secrets.hashicorp.com/v1beta1
kind: VaultAuth
metadata:
name: default
namespace: argocd-image-updater
spec:
allowedNamespaces:
- argocd-image-updater
kubernetes:
audiences:
- vault
role: argocd-image-updater
serviceAccount: argocd-image-updater
tokenExpirationSeconds: 600
method: kubernetes
mount: k8s/au/syd1
vaultConnectionRef: vso-system/default
@@ -1,40 +0,0 @@
---
# Credentials for polling the git.unkin.net container registry.
# Vault KV path: kv/service/argocd-image-updater/registry-creds
# Required key: creds — value format: "<username>:<token>"
apiVersion: secrets.hashicorp.com/v1beta1
kind: VaultStaticSecret
metadata:
name: registry-creds
namespace: argocd-image-updater
spec:
destination:
create: true
name: registry-creds
overwrite: true
hmacSecretData: true
mount: kv
path: service/argocd-image-updater/registry-creds
refreshAfter: 5m
type: kv-v2
vaultAuthRef: default
---
# ArgoCD API token for image updater to discover and update Applications.
# Vault KV path: kv/service/argocd-image-updater/argocd-token
# Required key: token — generate via: argocd account generate-token --account image-updater
apiVersion: secrets.hashicorp.com/v1beta1
kind: VaultStaticSecret
metadata:
name: argocd-token
namespace: argocd-image-updater
spec:
destination:
create: true
name: argocd-token
overwrite: true
hmacSecretData: true
mount: kv
path: service/argocd-image-updater/argocd-token
refreshAfter: 5m
type: kv-v2
vaultAuthRef: default
@@ -19,7 +19,10 @@ remotes:
     package: "docker"
     description: "Docker Hub registry"
     immutable_patterns:
+      - "^library/almalinux"
       - "^library/busybox"
+      - "^library/debian"
+      - "^library/fedora"
       - "^library/nginx"
       - "^library/postgres"
       - "^library/redis"
@@ -26,7 +26,9 @@ remotes:
       - "helmfile/helmfile/.*/helmfile_.*_linux_amd64.tar.gz$"
       - "helmfile/vals/.*/vals_.*_linux_amd64.tar.gz$"
       - "jesseduffield/lazydocker/.*/lazydocker_.*_Linux_x86_64.tar.gz$"
+      - "kubecolor/kubecolor/.*/kubecolor_.*_linux_amd64.tar.gz$"
       - "kubernetes-sigs/gateway-api/.*/standard-install.yaml$"
+      - "kubernetes-sigs/kustomize/.*/kustomize_.*_linux_amd64.tar.gz$"
       - "lxc/incus/.*.tar.gz$"
       - "mikefarah/yq/.*/yq_linux_amd64$"
       - "neovim/neovim-releases/.*/nvim-linux-x86_64.tar.gz$"
@@ -54,6 +56,7 @@ remotes:
       - "VictoriaMetrics/VictoriaMetrics/.*/vlutils-linux-amd64-.*.tar.gz$"
       - "VictoriaMetrics/VictoriaMetrics/.*/vmutils-linux-amd64-.*.tar.gz$"
       - "xorpaul/g10k/.*/g10k-.*-linux-amd64.zip$"
+      - "yannh/kubeconform/.*/kubeconform-linux-amd64.tar.gz$"
     cache:
       immutable_ttl: 0
       mutable_ttl: 7200
+29
@@ -0,0 +1,29 @@
---
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  labels:
    traefik.io/instance: internal
  annotations:
    cert-manager.io/cluster-issuer: vault-issuer
    cert-manager.io/common-name: rancher.k8s.syd1.au.unkin.net
    cert-manager.io/private-key-size: "4096"
    external-dns.alpha.kubernetes.io/hostname: rancher.k8s.syd1.au.unkin.net
    external-dns.alpha.kubernetes.io/target: "198.18.200.4"
  name: rancher
  namespace: cattle-system
spec:
  gatewayClassName: traefik-internal
  listeners:
    - allowedRoutes:
        namespaces:
          from: Same
      hostname: rancher.k8s.syd1.au.unkin.net
      name: https
      port: 443
      protocol: HTTPS
      tls:
        certificateRefs:
          - kind: Secret
            name: rancher-tls
        mode: Terminate
+20
@@ -0,0 +1,20 @@
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: rancher
  namespace: cattle-system
spec:
  hostnames:
    - rancher.k8s.syd1.au.unkin.net
  parentRefs:
    - name: rancher
      sectionName: https
  rules:
    - backendRefs:
        - name: rancher
          port: 80
      matches:
        - path:
            type: PathPrefix
            value: /
-29
@@ -1,29 +0,0 @@
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: rancher
namespace: cattle-system
annotations:
cert-manager.io/cluster-issuer: vault-issuer
cert-manager.io/common-name: rancher.k8s.syd1.au.unkin.net
cert-manager.io/private-key-size: "4096"
external-dns.alpha.kubernetes.io/hostname: rancher.k8s.syd1.au.unkin.net
external-dns.alpha.kubernetes.io/target: "198.18.200.0"
spec:
ingressClassName: nginx
tls:
- hosts:
- rancher.k8s.syd1.au.unkin.net
secretName: rancher-tls
rules:
- host: rancher.k8s.syd1.au.unkin.net
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: rancher
port:
number: 80
+2 -1
@@ -6,4 +6,5 @@ resources:
   - namespace.yaml
   - vaultauth.yaml
   - vaultstaticsecret.yaml
-  - ingress.yaml
+  - gateway.yaml
+  - httproute.yaml
@@ -0,0 +1,9 @@
---
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  labels:
    traefik.io/instance: external
  name: traefik-external
spec:
  controllerName: traefik.io/gateway-controller
@@ -0,0 +1,9 @@
---
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  labels:
    traefik.io/instance: internal
  name: traefik-internal
spec:
  controllerName: traefik.io/gateway-controller
@@ -4,5 +4,5 @@ kind: Kustomization
 resources:
   - namespace.yaml
-  - vaultauth.yaml
-  - vaultstaticsecret.yaml
+  - gatewayclass-internal.yaml
+  - gatewayclass-external.yaml
@@ -2,4 +2,4 @@
 apiVersion: v1
 kind: Namespace
 metadata:
-  name: argocd-image-updater
+  name: traefik-system
+1
@@ -6,5 +6,6 @@ resources:
   - namespace.yaml
   - cnpg_cluster.yaml
   - cnpg_pooler.yaml
+  - serviceaccount_terraform_vault.yaml
   - vaultauth.yaml
   - vaultstaticsecret.yaml
@@ -0,0 +1,6 @@
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: terraform-vault
  namespace: woodpecker
@@ -1,14 +0,0 @@
---
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- ../../../base/argocd-image-updater
helmCharts:
- name: argocd-image-updater
repo: https://artifactapi.k8s.syd1.au.unkin.net/api/v1/virtual/helm
version: "0.10.3"
releaseName: argocd-image-updater
namespace: argocd-image-updater
valuesFile: values.yaml
@@ -1,33 +0,0 @@
config:
argocd:
grpcWeb: false
serverAddress: argocd-server.argocd
insecure: true
plaintext: false
registries:
- name: git.unkin.net
api_url: https://git.unkin.net
prefix: git.unkin.net
credentials: secret:argocd-image-updater/registry-creds#creds
insecure: false
authScripts:
enabled: false
extraEnv:
- name: ARGOCD_TOKEN
valueFrom:
secretKeyRef:
name: argocd-token
key: token
gitCommitUser: "ArgoCD Image Updater"
gitCommitEmail: "argocd-image-updater@unkin.net"
rbac:
enabled: true
serviceAccount:
create: true
name: argocd-image-updater
@@ -24,6 +24,11 @@ policy: "sync"
 sources:
   - service
   - ingress
+  - gateway-httproute
+  - gateway-tlsroute
+  - gateway-grpcroute
+  - gateway-tcproute
+  - gateway-udproute
 # Environment variables for TSIG secret and algorithm from Vault
 env:
@@ -49,3 +54,5 @@ extraArgs:
   - --rfc2136-tsig-axfr
   - --rfc2136-tsig-secret=$(EXTERNAL_DNS_RFC2136_TSIG_SECRET)
   - --ingress-class=nginx
+logLevel: debug
@@ -0,0 +1,24 @@
---
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../../base/traefik-system
helmCharts:
  - name: traefik
    repo: https://artifactapi.k8s.syd1.au.unkin.net/api/v1/virtual/helm
    version: "40.0.0"
    releaseName: traefik-internal
    namespace: traefik-system
    valuesFile: values-internal.yaml
    apiVersions:
      - policy/v1/PodDisruptionBudget
  - name: traefik
    repo: https://artifactapi.k8s.syd1.au.unkin.net/api/v1/virtual/helm
    version: "40.0.0"
    releaseName: traefik-external
    namespace: traefik-system
    valuesFile: values-external.yaml
    apiVersions:
      - policy/v1/PodDisruptionBudget
@@ -0,0 +1,98 @@
image:
tag: v3.7.0
podDisruptionBudget:
enabled: true
maxUnavailable: 1
gateway:
enabled: false
gatewayClass:
enabled: false
updateStrategy:
type: RollingUpdate
rollingUpdate:
maxUnavailable: 1
providers:
kubernetesCRD:
enabled: false
kubernetesIngress:
enabled: false
kubernetesGateway:
enabled: true
experimentalChannel: false
namespaces: []
nativeLBByDefault: false
labelSelector: "traefik.io/instance=external"
logs:
access:
enabled: true
global:
checkNewVersion: true
sendAnonymousUsage: false
notAppendXForwardedFor: false
service:
enabled: true
single: true
annotations:
purelb.io/service-group: "dmz"
purelb.io/addresses: 198.18.199.0
annotationsTCP: {}
annotationsUDP: {}
labels: {}
spec:
type: LoadBalancer
loadBalancerIP: "198.18.199.0"
additionalServices: {}
autoscaling:
enabled: true
minReplicas: 2
maxReplicas: 5
metrics: []
behavior: {}
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: "{{ template \"traefik.fullname\" . }}"
persistence:
enabled: false
affinity:
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchLabels:
app.kubernetes.io/name: '{{ template "traefik.name" . }}'
app.kubernetes.io/instance: '{{ .Release.Name }}-{{ include "traefik.namespace" . }}'
topologyKey: kubernetes.io/hostname
podSecurityContext:
runAsGroup: 65532
runAsNonRoot: true
runAsUser: 65532
seccompProfile:
type: RuntimeDefault
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop: [ALL]
add: [NET_BIND_SERVICE]
readOnlyRootFilesystem: true
ports:
web:
port: 80
websecure:
port: 443
enabled: true
@@ -0,0 +1,98 @@
image:
tag: v3.7.0
podDisruptionBudget:
enabled: true
maxUnavailable: 1
gateway:
enabled: false
gatewayClass:
enabled: false
updateStrategy:
type: RollingUpdate
rollingUpdate:
maxUnavailable: 1
providers:
kubernetesCRD:
enabled: false
kubernetesIngress:
enabled: false
kubernetesGateway:
enabled: true
experimentalChannel: false
namespaces: []
nativeLBByDefault: false
labelSelector: "traefik.io/instance=internal"
logs:
access:
enabled: true
global:
checkNewVersion: true
sendAnonymousUsage: false
notAppendXForwardedFor: false
service:
enabled: true
single: true
annotations:
purelb.io/service-group: "common"
purelb.io/addresses: 198.18.200.4
annotationsTCP: {}
annotationsUDP: {}
labels: {}
spec:
type: LoadBalancer
loadBalancerIP: "198.18.200.4"
additionalServices: {}
autoscaling:
enabled: true
minReplicas: 2
maxReplicas: 5
metrics: []
behavior: {}
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: "{{ template \"traefik.fullname\" . }}"
persistence:
enabled: false
affinity:
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchLabels:
app.kubernetes.io/name: '{{ template "traefik.name" . }}'
app.kubernetes.io/instance: '{{ .Release.Name }}-{{ include "traefik.namespace" . }}'
topologyKey: kubernetes.io/hostname
podSecurityContext:
runAsGroup: 65532
runAsNonRoot: true
runAsUser: 65532
seccompProfile:
type: RuntimeDefault
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop: [ALL]
add: [NET_BIND_SERVICE]
readOnlyRootFilesystem: true
ports:
web:
port: 80
websecure:
port: 443
enabled: true
-36
@@ -1,36 +0,0 @@
---
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
name: image-updater-apps
namespace: argocd
spec:
generators:
- git:
repoURL: https://git.unkin.net/unkin/argocd-apps
revision: HEAD
directories:
- path: apps/overlays/*/artifactapi
template:
metadata:
name: 'platform-{{path[3]}}'
annotations:
argocd-image-updater.argoproj.io/image-list: "artifactapi=git.unkin.net/unkin/artifactapi"
argocd-image-updater.argoproj.io/artifactapi.update-strategy: semver
argocd-image-updater.argoproj.io/write-back-method: git
argocd-image-updater.argoproj.io/git-branch: main
spec:
project: platform
source:
repoURL: https://git.unkin.net/unkin/argocd-apps
targetRevision: HEAD
path: '{{path}}'
destination:
server: https://kubernetes.default.svc
namespace: '{{path[3]}}'
syncPolicy:
automated:
prune: true
selfHeal: true
syncOptions:
- ServerSideApply=true
@@ -4,7 +4,6 @@ kind: Kustomization
 resources:
   - aitooling.yaml
-  - imageupdater.yaml
   - observability.yaml
   - platform.yaml
   - storage.yaml
+2 -1
@@ -10,7 +10,7 @@ spec:
         repoURL: https://git.unkin.net/unkin/argocd-apps
         revision: HEAD
         directories:
-          - path: apps/overlays/*/argocd-image-updater
+          - path: apps/overlays/*/artifactapi
           - path: apps/overlays/*/cattle-system
           - path: apps/overlays/*/cert-manager
           - path: apps/overlays/*/certificates
@@ -25,6 +25,7 @@ spec:
           - path: apps/overlays/*/reflector-system
           - path: apps/overlays/*/reloader-system
           - path: apps/overlays/*/reposync
+          - path: apps/overlays/*/traefik-system
           - path: apps/overlays/*/vm-system
           - path: apps/overlays/*/vso-system
           - path: apps/overlays/*/woodpecker
+4
@@ -60,6 +60,10 @@ spec:
       kind: Certificate
     - group: 'cert-manager.io'
       kind: Issuer
+    - group: 'gateway.networking.k8s.io'
+      kind: GatewayClass
+    - group: 'networking.k8s.io'
+      kind: IngressClass
   namespaceResourceWhitelist:
     - group: '*'
       kind: '*'