Commit Graph

37 Commits

Author SHA1 Message Date
unkinben ce8ebc71ce Consolidate BIND DNS into one bind-internal namespace (#225)
**HOLD until v0.1.3 is tagged/built** (operator #4 merged + tagged) — this PR bumps the operator to v0.1.3, whose CRD adds the `clusterRef` field these keys use.

## Why
Put all BIND DNS services in one `bind-internal` namespace and name the StatefulSets clearly.

## Changes
- 3 clusters consolidated into `bind-internal`, StatefulSets renamed **bind-authoritative** / **bind-resolvers** / **bind-externaldns**; LBs kept on 198.18.200.6/.7/.8; external-dns hostnames renamed to match
- `clusterRef` added to `transfer-key` (→ bind-authoritative) and `externaldns-key` (→ bind-externaldns) so keys are scoped per cluster
- removed the old `ns-auth`/`ns-resolver`/`ns-externaldns` apps; ApplicationSet + AppProject now list `bind-internal`
- bumped `bind-system` operator to **v0.1.3** (CRD link + image)
- operator stays in `bind-system`

## Deploy impact
ArgoCD prunes the old ns-* namespaces (StatefulSets/PVCs — data is only seed SOA+NS, no migrated records yet) and creates the renamed clusters in bind-internal.

## Validated
`kustomize build` → 28 docs (3 BindCluster, 20 BindZone, 2 catalog, 2 keys, ns); kubeconform clean.

Reviewed-on: #225
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
2026-07-04 00:35:43 +10:00
unkinben dbb5ad4f86 Rename bind DNS namespaces to ns-* (#223)
Renames the three BIND DNS app namespaces `binddns-{auth,resolver,externaldns}` -> `ns-{auth,resolver,externaldns}`.

## Why
Shorter, clearer namespace names for the DNS tiers.

## Changes
- `argocd/applicationsets/platform.yaml`: overlay path registrations renamed (the ApplicationSet derives each app's namespace from its overlay dir name)
- `argocd/projects/platform.yaml`: destination namespaces renamed

## Coupled with
The per-tier PRs (#220/#221/#222) rename the overlay dirs + namespaces + external-dns hostnames to match. No app deploys to a renamed namespace until both this and the tier PR are merged (harmless before then — the ApplicationSet only instantiates apps for existing dirs).

Reviewed-on: #223
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
2026-07-03 21:16:24 +10:00
unkinben 4b8f9313c8 Deploy bind-operator (operator + CRDs) (#219)
First of a 4-PR split of the bind rollout (was #216). Deploys just the operator control plane so it can be verified before any DNS clusters exist.

## Why
Roll out incrementally: operator + CRDs first, then each BIND tier as its own PR.

## Changes
- `apps/base/bind-system`: operator Deployment (`git.unkin.net/unkin/bind-operator:v0.1.1`), RBAC, namespace; CRDs pulled from the operator repo by raw URL (`config/crd/install.yaml` @ v0.1.1)
- au-syd1 `bind-system` overlay
- register all four bind apps in `argocd/applicationsets/platform.yaml` (DNS overlays instantiate only when their dirs land in the follow-up PRs)
- add `binddns-*` namespaces to `argocd/projects/platform.yaml`
- add `schemas/bind.unkin.net/*.json` for kubeconform

## Deploy impact
Operator pod + CRDs only. No DNS services yet — the operator is idle until BindClusters exist.

## Follow-ups (merge after this)
binddns-auth, binddns-resolver, binddns-externaldns — one PR each.

Reviewed-on: #219
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
2026-07-03 20:04:57 +10:00
unkinben 7f1444fb38 Add Authentik identity provider deployment (#211)
## Summary
- Deploy Authentik (identity.unkin.net) via Helm chart 2026.5.3
- CNPG PostgreSQL cluster (3 instances) with separate rw/ro poolers (2 instances each)
- Redis with 5Gi persistent storage
- Gateway API for HTTPS (identity.unkin.net) and LDAPS (ldap.k8s.syd1.au.unkin.net, ldap.main.unkin.net)
- TLSRoute for LDAPS passthrough, HTTPRoute for external-dns record creation
- Vault secrets for postgres credentials, authentik secret key, and S3 storage credentials
- S3 storage via RadosGW (bucket: authentik)
- 3 server replicas, 2 worker replicas
- Woodpecker ServiceAccount for terraform-authentik CI
- Platform applicationset and project updated

## Dependencies
- terraform-git #15 (merged) — repo definition
- terraform-vault #78 (merged) — auth roles and Consul ACL

## Vault secrets needed before deploy
Write to `kv/kubernetes/namespace/authentik/default/`:
- `postgres-credentials`: username + password
- `authentik-credentials`: AUTHENTIK_SECRET_KEY
- `s3-credentials`: S3 access key + secret key

Reviewed-on: #211
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
2026-06-28 17:42:49 +10:00
unkinben cfca1e5278 Add age-api deployment (#210)
## Summary
- Deploy age-api to the au-syd1 cluster
- Uses configMapGenerator for people config with jaidi, ben, and sudaporn
- Includes gateway, httproute, service, and deployment
- Image: git.unkin.net/unkin/age-api:v0.1.0

Reviewed-on: #210
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
2026-06-28 12:19:38 +10:00
unkinben ede25a3858 feat(platform): add priority-classes app with low/power/medium/high classes (#174)
## Summary
- New `apps/base/priority-classes/` app with four `PriorityClass` objects managed via the `platform` ArgoCD project
- Adds `apps/overlays/*/priority-classes` to the platform ApplicationSet generator
- Adds `priority-classes` namespace to platform AppProject destinations (required even for cluster-scoped resources)

| Class | Value | PreemptionPolicy | Intent |
|---|---|---|---|
| `low` | 100 | Never | Background work; evictable, won't preempt others |
| `power` | 100 | Never | Compute-heavy but expendable (e.g. AI/ML workloads) |
| `medium` | 10000 | PreemptLowerPriority | Standard services |
| `high` | 100000 | PreemptLowerPriority | Critical services; preempts lower-priority pods |

`PriorityClass` is already in the platform project's `clusterResourceWhitelist` so no project policy changes were needed.

## Test plan
- ArgoCD syncs `platform-priority-classes` successfully
- `kubectl get priorityclasses low power medium high` shows all four classes

Reviewed-on: #174
2026-05-26 23:41:54 +10:00
unkinben 3756208ccd benvin/kanidm (#159)
Reviewed-on: #159
2026-05-24 19:55:22 +10:00
unkinben c6f9893804 fix(argocd): add vault and consul to platform project destinations (#152)
Vault and consul namespaces were missing from the platform AppProject
allowed destinations, causing ArgoCD sync failures with:
  destination server 'https://kubernetes.default.svc' and namespace
  'vault' do not match any of the allowed destinations in project 'platform'

Reviewed-on: #152
2026-05-23 23:27:24 +10:00
unkinben 9a01a9ef19 fix: enable gateway/ingress class on platform project (#124)
- add missing classes to platform required to deploy traefik system

Reviewed-on: #124
2026-05-17 23:56:12 +10:00
unkinben 5e03215f4d chore: migrate reloader/reflector to virtual/helm (#115)
Reviewed-on: #115
2026-05-05 21:42:23 +10:00
unkinben 18c519f979 chore: remove hashicorp helm repo (#113)
- no longer required, this is in virtual/helm repo in artifactapi

Reviewed-on: #113
2026-05-03 23:51:44 +10:00
unkinben bcea7df925 chore: swap vso to virtual helm repo (#109)
- testing if there will be any changes after merging, before merging all of them

Reviewed-on: #109
2026-05-03 16:49:53 +10:00
unkinben 8e7bc289f6 chore: enable access to paperclip namespace (#101)
Reviewed-on: #101
2026-05-02 21:30:59 +10:00
unkinben 5372914803 feat: add litellm to new aitooling ArgoCD project (#94)
Deploys LiteLLM proxy with CNPG PostgreSQL (3-instance HA), PgBouncer
pooler, and Redis cache. Introduces a dedicated aitooling AppProject and
ApplicationSet to keep AI tooling services separate from platform infra.

Reviewed-on: #94
2026-05-01 21:40:26 +10:00
unkinben 7d555cd31a feat: migrate purelb to ArgoCD (#84)
Migrate PureLB load balancer from Terragrunt to ArgoCD/Kustomize.
Deploys purelb v0.13.0 with two LBNodeAgent and two ServiceGroup CRs
(common: 198.18.200.0/24, dmz: 198.18.199.0/24).
Adds LBNodeAgent and ServiceGroup to kubeconform skip list (no CRD catalog schema).

💘 Generated with Crush

Assisted-by: Claude Sonnet 4.6 via Crush <crush@charm.land>

Reviewed-on: #84
2026-04-07 19:52:17 +10:00
unkinben f0bdc0231a feat: migrate vso-system to ArgoCD (#81)
Migrate Vault Secrets Operator from Terragrunt to ArgoCD/Kustomize.
Deploys vault-secrets-operator v1.2.0 with 3 replicas, plus ClusterRole,
ClusterRoleBindings, and vault-admin ServiceAccount.

Note: static service account tokens (kubernetes.io/service-account-token)
cannot be stored in git; create manually or via Vault after deployment.

💘 Generated with Crush

Assisted-by: Claude Sonnet 4.6 via Crush <crush@charm.land>

Reviewed-on: #81
2026-04-07 19:33:50 +10:00
unkinben b100f3034e feat: migrate observability to ArgoCD (#82)
Migrate Victoria Metrics cluster and agent from Terragrunt to ArgoCD/Kustomize.
Creates new observability AppProject and ApplicationSet.
Deploys victoria-metrics-cluster v0.33.0 (vmselect/vminsert/vmstorage with
HPA, PDB, ingress) and victoria-metrics-agent v0.30.0 (3 replicas, k8s scrape
configs) in the observability namespace.

💘 Generated with Crush

Assisted-by: Claude Sonnet 4.6 via Crush <crush@charm.land>

Reviewed-on: #82
2026-04-07 19:15:45 +10:00
unkinben 181bc152e7 feat: migrate vm-system to ArgoCD (#80)
Migrate Victoria Metrics operator from Terragrunt to ArgoCD/Kustomize.
Deploys victoria-metrics-operator v0.57.1 with 2 replicas in vm-system.

💘 Generated with Crush

Assisted-by: Claude Sonnet 4.6 via Crush <crush@charm.land>

Reviewed-on: #80
2026-03-27 17:04:15 +11:00
unkinben 5bcbd7e1ba feat: migrate elastic-system to ArgoCD (#79)
Migrate ECK operator from Terragrunt to ArgoCD/Kustomize.
Deploys eck-operator v3.2.0 with 2 replicas and PodDisruptionBudget
in the elastic-system namespace.

💘 Generated with Crush

Assisted-by: Claude Sonnet 4.6 via Crush <crush@charm.land>

Reviewed-on: #79
2026-03-27 17:00:05 +11:00
unkinben 02195e6235 feat: migrate reposync to ArgoCD (#78)
Migrate repository sync cronjobs from Terragrunt to ArgoCD/Kustomize.
Adds four daily CronJobs (almalinux9-baseos, almalinux9-appstream, epel9,
openvox7) with associated PVCs and ConfigMaps in the reposync namespace.

💘 Generated with Crush

Assisted-by: Claude Sonnet 4.6 via Crush <crush@charm.land>

Reviewed-on: #78
2026-03-27 16:26:35 +11:00
unkinben 301f8dcc1a fix: add NodeFeatureRule and Intel device plugin permissions to platform project (#49)
- Add nfd.k8s-sigs.io/NodeFeatureRule for node-feature-discovery
- Add deviceplugin.intel.com/* for Intel device plugins (GpuDevicePlugin, etc.)
- Add cert-manager.io resources (Certificate, Issuer) for Intel device plugins

Reviewed-on: #49
2026-03-19 02:20:32 +11:00
unkinben dfbb315522 feat: migrate node-feature-discovery and inteldeviceplugins-system to platform project (#48)
- Add node-feature-discovery and inteldeviceplugins-system to platform project
- Convert intel-nfd-rules from local Helm chart to static NodeFeatureRule manifests
- Add required Helm repositories (NFD OCI registry and Intel charts)
- Create base configurations with Helm charts and overlay structures
- Update platform ApplicationSet and project permissions

Reviewed-on: #48
2026-03-19 02:14:45 +11:00
unkinben c157774033 fix: enable ServerSideApply for ArgoCD ApplicationSets (#46)
- resolve CRD annotation size limit errors by enabling server-side apply
- add storage ApplicationSet and project to kustomization files

Reviewed-on: #46
2026-03-19 01:37:56 +11:00
unkinben 90f793464b feat: migrate CSI drivers to dedicated storage project (#45)
- Migrate csi-cephfs from Terraform to ArgoCD
- Migrate csi-cephrbd from Terraform to ArgoCD
- Create dedicated storage project and ApplicationSet for CSI drivers
- Add csi-* pattern matching in storage ApplicationSet
- Remove CSI apps from platform project to separate concerns

Reviewed-on: #45
2026-03-19 01:29:31 +11:00
unkinben 06a8f98b5c feat: migrate cnpg-system from Terraform to ArgoCD (#44)
- Add cnpg-system base ArgoCD application with namespace
- Create cnpg-system overlay for au-syd1 with CloudNativePG Helm chart
- Update platform ApplicationSet to include cnpg-system deployment
- Configure cloudnative-pg operator v0.27.0 with HA and resource limits
- Maintain one-to-one migration from Terraform configuration

Reviewed-on: #44
2026-03-19 01:25:50 +11:00
unkinben 0bf6e80d6f feat: migrate externaldns from Terraform to ArgoCD (#43)
- Add externaldns base ArgoCD application with namespace and Vault integration
- Create externaldns overlay for au-syd1 with Helm chart configuration
- Update platform ApplicationSet to include externaldns deployment
- Configure external-dns v1.19.0 with RFC2136 provider for DNS updates
- Maintain one-to-one migration from Terraform configuration including TSIG secrets

Reviewed-on: #43
2026-03-19 01:22:39 +11:00
unkinben ed300fabed feat: migrate cert-manager from Terraform to ArgoCD (#42)
- Add cert-manager base ArgoCD application with namespace, RBAC resources
- Create cert-manager overlay for au-syd1 with Helm chart configuration
- Update platform ApplicationSet to include cert-manager deployment
- Configure cert-manager v1.19.2 with jetstack Helm repository
- Maintain one-to-one migration from Terraform configuration

Reviewed-on: #42
2026-03-19 01:18:19 +11:00
unkinben 656aedfc53 fix: enable unscoped permissions (#41)
- add access to create priorityclass resourcees in platform applicationset

Reviewed-on: #41
2026-03-19 01:03:54 +11:00
unkinben ea71ebb55b feat: migrate cattle-system (Rancher) from Terraform to ArgoCD (#39)
- Add cattle-system base ArgoCD application with namespace, Vault integration, and ingress
- Create cattle-system overlay for au-syd1 with Rancher Helm chart configuration
- Update platform ApplicationSet to include cattle-system deployment
- Update platform project to include Rancher Helm repository as source
- Configure Rancher v2.13.1 with HA, TLS, audit logging, and bootstrap secret from Vault
- Maintain one-to-one migration from Terraform configuration

Reviewed-on: #39
2026-03-19 00:56:39 +11:00
unkinben 8207935d36 fix: cannot write to certificates namespace (#38)
- enable the platform application to write to certificates namespace

Reviewed-on: #38
2026-03-19 00:20:39 +11:00
unkinben 14e3946d4b feat: initial puppet deployment (#25)
working towards a larger, redundant, autoscaling and simple puppet
implementation in kubernetes. this was originally based on the openvox
helm chart with several improvements (not all in this pr)

- use of cnpg instead of single bitnamilegacy postgres container
- use for g10k instead of r10k
- run one instance of g10k per namespace, instead of per-pod
- store only keep one copy of the environments/branches (instead of per-pod)
- change g10k to native cronjob instead of hacky implementation
- use vault secrets

part one adds:

- cnpg puppetdb pgsql cluster
- cnpg puppetdb pgpooler
- persistent volume claims for puppet, puppetdb, the code repository, etc

Reviewed-on: #25
2026-03-09 01:10:30 +11:00
unkinben 05a88459a5 chore: migrate artifactapi to kustomize (#18)
- migrate terraform deployment to kustomize

Reviewed-on: #18
2026-03-06 21:35:47 +11:00
unkinben dbd8914013 feat: migrate woodpecker to argocd (#13)
- move woodpecker helm chart deployment to argocd
- move cnpg resources
- move vault resources

Reviewed-on: #13
2026-03-03 22:24:17 +11:00
unkinben be9d485bfe feat: testing jfrog-container-registry (#11)
- trialing jfrog container registry

Reviewed-on: #11
2026-03-02 23:07:47 +11:00
unkinben 0daa026f01 feat: add pre-commit workflow (#10)
- enforce pre-commit is run for all pull-requests

Reviewed-on: #10
2026-03-02 00:19:04 +11:00
unkinben ebb47348fe fix: resolve issues with helm deployments (#8)
- remove helm-patch files that are unused
- change platform namespaces allowed to *-system
- change chart name

Reviewed-on: #8
2026-03-01 18:55:47 +11:00
unkinben 971835f845 feat: initial commit
- add structure to clusters, apps and argocd objects
- add bootstrapping features
2026-03-01 14:31:16 +11:00