Commit Graph

8 Commits

Author SHA1 Message Date
unkinben 225d11590d fix(litellm): normalize postgres cluster resource values
ci/woodpecker/pr/pre-commit Pipeline was successful
ci/woodpecker/pr/kubeconform Pipeline was successful
ArgoCD compares these values as strings, so 1024Mi != 1Gi (the same number
of bytes) and the integer 1 != "1". Use the canonical forms Kubernetes
stores after admission.
2026-05-24 20:39:53 +10:00
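
The mismatch described in 225d11590d can be illustrated with a small Python sketch. It is not ArgoCD's or Kubernetes' actual comparison logic; `parse_quantity` and `same_quantity` are hypothetical helpers that handle only a handful of common suffixes, and they check equivalence without reproducing the canonical form the API server writes back.

```python
from decimal import Decimal

# A subset of Kubernetes quantity suffixes (assumption: enough for this example).
_SUFFIXES = {
    "Ki": Decimal(2) ** 10,
    "Mi": Decimal(2) ** 20,
    "Gi": Decimal(2) ** 30,
    "Ti": Decimal(2) ** 40,
    "m": Decimal("0.001"),
    "k": Decimal(10) ** 3,
    "M": Decimal(10) ** 6,
    "G": Decimal(10) ** 9,
    "T": Decimal(10) ** 12,
}


def parse_quantity(value):
    """Convert a quantity such as "1024Mi", "500m", 1 or "1" to a Decimal."""
    text = str(value).strip()
    # Check longer suffixes first so "Mi" is matched before any one-letter suffix.
    for suffix in sorted(_SUFFIXES, key=len, reverse=True):
        if text.endswith(suffix):
            return Decimal(text[: -len(suffix)]) * _SUFFIXES[suffix]
    return Decimal(text)


def same_quantity(a, b):
    """True when two quantities denote the same amount, whatever their spelling."""
    return parse_quantity(a) == parse_quantity(b)


print("1024Mi" == "1Gi")                # False: a plain string diff sees a change
print(same_quantity("1024Mi", "1Gi"))   # True: both are 2**30 bytes
print(same_quantity(1, "1"))            # True: integer and string spell the same value
```

A check like this could flag manifests whose values are equivalent but spelled differently from the stored form, which is the situation the commit resolves by writing the canonical spelling directly.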
unkinben cbc2c1cb9f fix(gateways): add explicit group: "" to all certificateRefs entries (#153)
The Gateway API admission server defaults certificateRefs[].group to ""
when it is omitted. ArgoCD diffed the desired state (no group field) against
the live state (group: "") and flagged every gateway as out of sync.

Fix: explicitly set group: "" in all certificateRefs entries so the
rendered manifest matches the API server's canonical form exactly.

Affected: artifactapi, cattle-system, consul, litellm, paperclip,
puppet (puppetboard + puppetdb), vault.

Reviewed-on: #153
2026-05-23 23:47:24 +10:00
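
The fix in cbc2c1cb9f amounts to applying the server-side default ahead of time. The sketch below shows the idea on a manifest held as a Python dict. `with_explicit_group` is a hypothetical helper, and the Gateway name, listener name, and Secret name are assumptions for illustration (only `traefik-internal` and `litellm-tls` appear in this log).

```python
import copy


def with_explicit_group(gateway):
    """Return a copy of a Gateway manifest with group: "" on every certificateRef
    that omits it, mirroring the default the commit message describes."""
    result = copy.deepcopy(gateway)
    for listener in result.get("spec", {}).get("listeners", []):
        tls = listener.get("tls") or {}
        for ref in tls.get("certificateRefs", []):
            ref.setdefault("group", "")
    return result


# Desired state as it might have been written before the fix (names are assumed).
desired = {
    "apiVersion": "gateway.networking.k8s.io/v1",
    "kind": "Gateway",
    "metadata": {"name": "litellm"},
    "spec": {
        "gatewayClassName": "traefik-internal",
        "listeners": [
            {
                "name": "https",
                "port": 443,
                "protocol": "HTTPS",
                "tls": {
                    "mode": "Terminate",
                    "certificateRefs": [{"kind": "Secret", "name": "litellm-tls"}],
                },
            }
        ],
    },
}

# Stand-in for the live object after the default has been applied.
live = with_explicit_group(desired)

print(desired == live)                     # False: the omitted field shows up as a diff
print(with_explicit_group(desired) == live)  # True: explicit group matches the live form
```

Because the helper uses `setdefault`, it leaves any certificateRef that already names a group untouched, so running it on an already-fixed manifest changes nothing.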
unkinben e05f9bfd83 feat: increase litellm resources (#144)
litellm performance has dropped and it has crashed several times. It then
scaled to its maximum level, using the majority of the memory in the
cluster.

- reduce the rate at which litellm autoscales
- increase the requests/limits to match usage

Reviewed-on: #144
2026-05-23 17:59:43 +10:00
unkinben 445d8b6e7e feat: add HTTP→HTTPS redirect to Gateway API services (#145)
Add port 80 HTTP listener and redirect HTTPRoute to artifactapi,
cattle-system (rancher), litellm, paperclip, and puppetboard — restoring
the redirect behaviour that existed on the previous nginx/traefik Ingress
resources.

Reviewed-on: #145
2026-05-23 17:34:07 +10:00
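
For readers unfamiliar with Gateway API redirects, the redirect HTTPRoute added in 445d8b6e7e could look roughly like the structure built below. This is a sketch, not the manifest from the repository: `redirect_route` is a hypothetical helper, and the route name, Gateway name, `http` listener name, and the 301 status code are all assumptions. The hostname is the one listed in the test plan of #134.

```python
import json


def redirect_route(name, gateway, hostname, listener="http", status_code=301):
    """Build an HTTPRoute that attaches to a Gateway's port 80 listener and
    redirects every request to the https scheme."""
    return {
        "apiVersion": "gateway.networking.k8s.io/v1",
        "kind": "HTTPRoute",
        "metadata": {"name": f"{name}-redirect"},
        "spec": {
            # sectionName limits the route to the plain HTTP listener only.
            "parentRefs": [{"name": gateway, "sectionName": listener}],
            "hostnames": [hostname],
            "rules": [
                {
                    "filters": [
                        {
                            "type": "RequestRedirect",
                            "requestRedirect": {
                                "scheme": "https",
                                "statusCode": status_code,
                            },
                        }
                    ]
                }
            ],
        },
    }


route = redirect_route("litellm", "litellm", "litellm.k8s.syd1.au.unkin.net")
print(json.dumps(route, indent=2))
```

Binding the route to the HTTP listener through `sectionName` keeps the redirect rule away from the HTTPS listener, which would otherwise redirect to itself.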
unkinben 4f5c3f7ea0 feat(litellm): migrate Ingress to Gateway API (#134)
## Summary

- Replace `Ingress` (nginx) with `Gateway` + `HTTPRoute` using `traefik-internal` GatewayClass
- TLS terminated at the Gateway listener; cert-manager provisions the certificate via `vault-issuer`
- external-dns annotations moved to the Gateway

## Test plan

- [ ] ArgoCD syncs the litellm app cleanly
- [ ] cert-manager issues the `litellm-tls` certificate
- [ ] external-dns creates the DNS record
- [ ] `https://litellm.k8s.syd1.au.unkin.net` is reachable

Reviewed-on: #134
2026-05-23 01:29:54 +10:00
unkinben 6138afb98b feat: add litellm-env configmap with STORE_MODEL_IN_DB=True (#97)
Reviewed-on: #97
2026-05-01 22:17:53 +10:00
unkinben 949ddb76e4 chore: litellm OOMing (#95)
- update memory and CPU resources

Reviewed-on: #95
2026-05-01 21:54:00 +10:00
unkinben 5372914803 feat: add litellm to new aitooling ArgoCD project (#94)
Deploys LiteLLM proxy with CNPG PostgreSQL (3-instance HA), PgBouncer
pooler, and Redis cache. Introduces a dedicated aitooling AppProject and
ApplicationSet to keep AI tooling services separate from platform infra.

Reviewed-on: #94
2026-05-01 21:40:26 +10:00