Commit Graph

80 Commits

Author SHA1 Message Date
unkinben 5bcbd7e1ba feat: migrate elastic-system to ArgoCD (#79)
Migrate ECK operator from Terragrunt to ArgoCD/Kustomize.
Deploys eck-operator v3.2.0 with 2 replicas and PodDisruptionBudget
in the elastic-system namespace.

💘 Generated with Crush

Assisted-by: Claude Sonnet 4.6 via Crush <crush@charm.land>

Reviewed-on: #79
2026-03-27 17:00:05 +11:00
unkinben 02195e6235 feat: migrate reposync to ArgoCD (#78)
Migrate repository sync cronjobs from Terragrunt to ArgoCD/Kustomize.
Adds four daily CronJobs (almalinux9-baseos, almalinux9-appstream, epel9,
openvox7) with associated PVCs and ConfigMaps in the reposync namespace.

💘 Generated with Crush

Assisted-by: Claude Sonnet 4.6 via Crush <crush@charm.land>

Reviewed-on: #78
2026-03-27 16:26:35 +11:00
unkinben 95c9302aa8 feat: enable downloading tea (#77)
- enable downloading the tea prebuilt binaries

Reviewed-on: #77
2026-03-26 14:02:15 +11:00
unkinben e269220228 fix: clone r10k config to /tmp/r10k-config instead of /shared (#76)
The g10k-code cronjob was failing with "Permission denied" because the
container (running as uid 999, non-root) attempted to create /shared in
the container root filesystem, which is not writable. Clone to /tmp
which is always writable by unprivileged users.

Reviewed-on: #76
2026-03-24 19:25:06 +11:00
unkinben 1388875685 fix: remove shared-config PVC from g10k cronjob, clone r10k config directly (#75)
The RWO puppetserver-shared-config PVC caused multi-attach errors when
the cronjob pod was scheduled on a different node than the previous run,
stalling the init container indefinitely. Since the config only needs to
exist for the duration of the job, remove the init container and PVC
entirely and clone the r10k config directly into /shared within the main
container before running g10k.

Reviewed-on: #75
2026-03-24 18:54:58 +11:00
unkinben 49224d4a1b fix: increase generate-types memory limit and remove invalid JVM env var (#74)
The container was OOMKilled on every run because the 256Mi limit was far
too low for `puppet generate types`. Remove PUPPETSERVER_JAVA_ARGS (only
relevant to the puppetserver JVM, not the puppet CLI) and raise the
memory limit to 1Gi / request 512Mi.

Reviewed-on: #74
2026-03-24 18:51:46 +11:00
unkinben 28dc8dc238 feat: update gems for puppet (#73)
- add deep_merge, ipaddr, and hiera-eyaml gems
- pin intel-device-plugins to 0.35.0

Reviewed-on: #73
2026-03-24 18:33:03 +11:00
unkinben 33420e1286 revert: remove filemapper gem install (#72)
filemapper is not available on RubyGems under that name and was causing
puppetserver-compiler to crash loop. The interfaces provider that
requires puppetx/filemapper is Debian-specific and should not be loaded
on RedHat-based puppetservers.

Reviewed-on: #72
2026-03-24 18:22:23 +11:00
unkinben 0fc1268c51 fix: install filemapper gem and deploy generate-types cronjob (#71)
The network module's interfaces provider requires puppetx/filemapper
which was not installed, causing catalog compilation failures with
"no such file to load -- puppetx/filemapper".

Adds filemapper to additional-ruby-gems.sh for puppetserver/compiler
pods, installs it directly in the generate-types cronjob (which has no
access to that script), and adds cronjob_generate-types.yaml to the
kustomization so the CronJob is actually deployed.

💘 Generated with Crush

Assisted-by: Claude Sonnet 4.6 via Crush <crush@charm.land>

Reviewed-on: #71
2026-03-22 00:03:33 +11:00
unkinben c0d95b71a7 fix: connect puppetboard to puppetdb over SSL on port 8081 (#70)
Puppetboard was connecting to PuppetDB on port 8080 (plain HTTP), causing
403 Forbidden errors on the /metrics/v2 Jolokia endpoint which requires
HTTPS with a Puppet certificate. Also replaced the invalid
PUPPETDB_SSL_SKIP_VERIFY var with the correct PUPPETDB_SSL_VERIFY,
PUPPETDB_CERT, and PUPPETDB_KEY pointing to the certs already generated
by the cert-generator init container.

💘 Generated with Crush

Assisted-by: Claude Sonnet 4.6 via Crush <crush@charm.land>

Reviewed-on: #70
2026-03-22 00:01:54 +11:00
unkinben 2a96d9e948 feat: add PuppetDB read-only database user and pooler (#69)
PuppetDB requires a separate read-only database user for its read pool.
Without it, it refuses to use the write user for read queries and all
/pdb/query/v4 calls fail with a 500.

- Add puppetdb_read role via CNPG managed.roles with password sourced
  from a new postgres-read-credentials Vault secret
- Grant CONNECT, USAGE, SELECT and default privileges to puppetdb_read
  via postInitApplicationSQL (must also be run manually on existing cluster)
- Add puppet-postgres-pooler-ro Pooler (type: ro) routing to replicas
- Add puppetdb-read-database-conf ConfigMap with read-database.conf
  mounted into /etc/puppetlabs/puppetdb/conf.d/ in the PuppetDB deployment
- Wire OPENVOXDB_READ_POSTGRES_* env vars from the new secret

💘 Generated with Crush

Assisted-by: Claude Sonnet 4.6 via Crush <crush@charm.land>

Reviewed-on: #69
2026-03-21 23:31:01 +11:00
unkinben b49e8d3647 chore: change back to puppetdb:8081 (#68)
- puppetdb requires access via 8081 from puppetservers
- puppetservers do not trust the certificate via ingress

Reviewed-on: #68
2026-03-21 22:50:46 +11:00
unkinben 5f227939bc feat: add CronJob to generate Puppet types for all environments (#67)
- add kubernetes CronJob that runs every 5 minutes to automaticall generate Puppet types for all environments in the code directory.

Reviewed-on: #67
2026-03-21 17:39:03 +11:00
unkinben ffc861daa7 fix: update puppet.conf with main/server/user (#66)
- master config section is not used
- server containes all setting specifically for a server (puppet, puppet ca)
- user is for all puppet <command> tooling, like 'puppet generate'

Reviewed-on: #66
2026-03-21 17:16:15 +11:00
unkinben 47bd341371 chore: tidy initContainers (#65)
- make initcontainers easier to read/follow

Reviewed-on: #65
2026-03-21 17:16:07 +11:00
unkinben ee9ec23f6f chore: use docker not container (#64)
was referencing the main branch of upstream container, not the one I am
actually using. s/container/docker/

Reviewed-on: #64
2026-03-21 16:47:02 +11:00
unkinben 3f355bbfd3 feat: add custom entrypoint script for additional Ruby gems (#63)
Add support for installing additional Ruby gems via custom entrypoint script.
The script is mounted as a ConfigMap into /container-custom-entrypoint.d/
and will be executed during Puppetserver container startup.

Reviewed-on: #63
2026-03-21 16:01:46 +11:00
unkinben 00cbb6a817 fix: update ENC script CA certificate path (#62)
- Mount vault-ca-cert secret at /opt/vault-ca-cert.crt in both deployments
- Update cobbler-enc script to use correct CA certificate path
- Resolves OSError about missing TLS CA certificate bundle

Reviewed-on: #62
2026-03-20 23:05:35 +11:00
unkinben f474c5c530 feat: add shared bins volume for uv and cobbler-enc (#61)
- Add puppet-shared-bins PVC (10GB) for shared binaries
- Mount /opt/bin in both compiler and master deployments
- Add init container to install uv binary and cobbler script to shared volume
- Update cobbler-enc to use absolute path and uv cache directory
- Configure puppet.conf to reference cobbler-enc from /opt/bin

Reviewed-on: #61
2026-03-20 22:49:31 +11:00
unkinben c1ea6e1e81 fix: update puppet.conf to point to enc (#60)
enc script is in /etc/puppetlabs/puppet to ensure its copied during the init container phase

Reviewed-on: #60
2026-03-20 21:34:40 +11:00
unkinben 3553e9f6dd refactor: simplify DNS alt names for puppetserver compiler (#59)
Remove individual compiler pod DNS names and use generic puppetserver-compiler name instead.

Reviewed-on: #59
2026-03-20 21:27:04 +11:00
unkinben 6decc45e65 fix: use http port for puppetdb (#58)
DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): puppetdb:8081
ERROR:pypuppetdb.api.base:Could not reach PuppetDB on puppetdb:8081 over HTTP.

- puppetdb_host assumes HTTP when not verifying ssl

Reviewed-on: #58
2026-03-20 21:26:52 +11:00
unkinben c2d23aaeae refactor: convert puppetserver compilers to deployment with configmap integration (#57)
- Convert StatefulSet to Deployment for better scaling flexibility
- Add initContainer to copy configmaps to shared RWX volume (10GB)
- Integrate puppetserver-compiler-config configmap for environment variables
- Configure configMapGenerator with stable names (disableNameSuffixHash)
- Update HPA to target Deployment instead of StatefulSet
- Simplify puppetboard SSL config to skip verification for internal connections

Reviewed-on: #57
2026-03-20 20:47:36 +11:00
unkinben f25117ab7f testing via ingress for puppetdb (#56)
Reviewed-on: #56
2026-03-20 00:00:41 +11:00
unkinben 47b894c450 enable debugging for puppetboard (#55)
Reviewed-on: #55
2026-03-19 23:56:49 +11:00
unkinben 059992f6a3 fix: external access to puppetdb (#53) (#54)
- use vault cert for puppetdb ingress

Reviewed-on: #53

Reviewed-on: #54
2026-03-19 23:32:27 +11:00
unkinben 6ffb0898a4 fix: external access to puppetdb (#53)
- use vault cert for puppetdb ingress

Reviewed-on: #53
2026-03-19 23:26:02 +11:00
unkinben 30d56030b5 fix: increase number of cnpg_pooler_connections (#52)
in previous puppet installs, the puppetdb api service opens MANY
connections. we need to increase the number to greater than 300.

Reviewed-on: #52
2026-03-19 18:37:03 +11:00
unkinben 504d4ae7c9 fix: enable PuppetDB HTTPS support with automatic SSL certificate generation (#51)
This enables secure HTTPS communication to PuppetDB, required for other puppet related services

- make use of USE_OPENVOXSERVER flag

Reviewed-on: #51
2026-03-19 17:06:49 +11:00
unkinben 24d09744e3 git commit -m "fix: configure PuppetDB HTTPS connections and add Puppetboard SSL support (#50)
- Update PuppetDB connections from HTTP (8080) to HTTPS (8081)
- Add automatic certificate generation for Puppetboard using Puppet CA
- Implement initContainers for proper certificate provisioning before app start
- Add dedicated PVC for Puppetboard certificates with RWX access
- Configure SSL verification and client authentication for secure PuppetDB access

Reviewed-on: #50
2026-03-19 16:34:41 +11:00
unkinben 301f8dcc1a fix: add NodeFeatureRule and Intel device plugin permissions to platform project (#49)
- Add nfd.k8s-sigs.io/NodeFeatureRule for node-feature-discovery
- Add deviceplugin.intel.com/* for Intel device plugins (GpuDevicePlugin, etc.)
- Add cert-manager.io resources (Certificate, Issuer) for Intel device plugins

Reviewed-on: #49
2026-03-19 02:20:32 +11:00
unkinben dfbb315522 feat: migrate node-feature-discovery and inteldeviceplugins-system to platform project (#48)
- Add node-feature-discovery and inteldeviceplugins-system to platform project
- Convert intel-nfd-rules from local Helm chart to static NodeFeatureRule manifests
- Add required Helm repositories (NFD OCI registry and Intel charts)
- Create base configurations with Helm charts and overlay structures
- Update platform ApplicationSet and project permissions

Reviewed-on: #48
2026-03-19 02:14:45 +11:00
unkinben d641f630e9 fix: change puppet compilers to use HTTP for internal puppetdb connections (#47)
This resolves SSL certificate verification failures preventing puppetdb access

- Update OPENVOXDB_SERVER_URLS from https://puppetdb:8081 to http://puppetdb:8080
- External access to puppetdb will still use HTTPS via ingress
- Internal cluster communication does not require encryption

Reviewed-on: #47
2026-03-19 01:51:11 +11:00
unkinben c157774033 fix: enable ServerSideApply for ArgoCD ApplicationSets (#46)
- resolve CRD annotation size limit errors by enabling server-side apply
- add storage ApplicationSet and project to kustomization files

Reviewed-on: #46
2026-03-19 01:37:56 +11:00
unkinben 90f793464b feat: migrate CSI drivers to dedicated storage project (#45)
- Migrate csi-cephfs from Terraform to ArgoCD
- Migrate csi-cephrbd from Terraform to ArgoCD
- Create dedicated storage project and ApplicationSet for CSI drivers
- Add csi-* pattern matching in storage ApplicationSet
- Remove CSI apps from platform project to separate concerns

Reviewed-on: #45
2026-03-19 01:29:31 +11:00
unkinben 06a8f98b5c feat: migrate cnpg-system from Terraform to ArgoCD (#44)
- Add cnpg-system base ArgoCD application with namespace
- Create cnpg-system overlay for au-syd1 with CloudNativePG Helm chart
- Update platform ApplicationSet to include cnpg-system deployment
- Configure cloudnative-pg operator v0.27.0 with HA and resource limits
- Maintain one-to-one migration from Terraform configuration

Reviewed-on: #44
2026-03-19 01:25:50 +11:00
unkinben 0bf6e80d6f feat: migrate externaldns from Terraform to ArgoCD (#43)
- Add externaldns base ArgoCD application with namespace and Vault integration
- Create externaldns overlay for au-syd1 with Helm chart configuration
- Update platform ApplicationSet to include externaldns deployment
- Configure external-dns v1.19.0 with RFC2136 provider for DNS updates
- Maintain one-to-one migration from Terraform configuration including TSIG secrets

Reviewed-on: #43
2026-03-19 01:22:39 +11:00
unkinben ed300fabed feat: migrate cert-manager from Terraform to ArgoCD (#42)
- Add cert-manager base ArgoCD application with namespace, RBAC resources
- Create cert-manager overlay for au-syd1 with Helm chart configuration
- Update platform ApplicationSet to include cert-manager deployment
- Configure cert-manager v1.19.2 with jetstack Helm repository
- Maintain one-to-one migration from Terraform configuration

Reviewed-on: #42
2026-03-19 01:18:19 +11:00
unkinben 656aedfc53 fix: enable unscoped permissions (#41)
- add access to create priorityclass resourcees in platform applicationset

Reviewed-on: #41
2026-03-19 01:03:54 +11:00
unkinben ea71ebb55b feat: migrate cattle-system (Rancher) from Terraform to ArgoCD (#39)
- Add cattle-system base ArgoCD application with namespace, Vault integration, and ingress
- Create cattle-system overlay for au-syd1 with Rancher Helm chart configuration
- Update platform ApplicationSet to include cattle-system deployment
- Update platform project to include Rancher Helm repository as source
- Configure Rancher v2.13.1 with HA, TLS, audit logging, and bootstrap secret from Vault
- Maintain one-to-one migration from Terraform configuration

Reviewed-on: #39
2026-03-19 00:56:39 +11:00
unkinben 5255c78927 chore: bump kubetest container (#40)
unkin/packer-images#43

Error: Error: chart requires kubeVersion: < 1.35.0-0 which is incompatible with Kubernetes v1.35.0

Reviewed-on: #40
2026-03-19 00:55:30 +11:00
unkinben 8207935d36 fix: cannot write to certificates namespace (#38)
- enable the platform application to write to certificates namespace

Reviewed-on: #38
2026-03-19 00:20:39 +11:00
unkinben 3f282fbdc2 feat: migrate certificates from Terraform to ArgoCD (#37)
- Add certificates base ArgoCD application with namespace and Vault CA certificate secret
- Create certificates overlay for au-syd1 with static certificate configuration
- Update platform ApplicationSet to include certificates deployment
- Configure Vault CA certificate with reflector annotations for cross-namespace replication
- Maintain one-to-one migration from Terraform configuration

Note: Skip no_plain_secrets hook as this is a public CA certificate that needs
to be replicated via reflector, not a sensitive secret

Reviewed-on: #37
2026-03-19 00:16:33 +11:00
unkinben 3961fe4e68 fix: annotations, not labels (#36)
<picard face palm gif>

- purelb requires annotations not labels

Reviewed-on: #36
2026-03-18 15:17:58 +11:00
unkinben e86cd7a6ae feat: ensure puppet is available externally (#35)
- change puppet/puppetca -> LoadBalancer
- dedicate ip's for puppet and puppetca loadbalancers
- name the puppetserver port
- remove puppet/puppetca ingress

Reviewed-on: #35
2026-03-18 15:07:25 +11:00
unkinben 88fe895409 fix: puppetboard port issues (#34)
service / ingres / deployment mismatch, attempt 2

Reviewed-on: #34
2026-03-18 14:31:43 +11:00
unkinben 687a7f1ffd fix: svc/puppetboard forwarding to wrong port (#33)
puppetboard uses `PUPPETBOARD_PORT` to specify the port, otherwise it
listens on tcp/80

```
ENV PUPPETBOARD_PORT 80
ENV PUPPETBOARD_HOST 0.0.0.0
ENV PUPPETBOARD_STATUS_ENDPOINT /status
ENV PUPPETBOARD_SETTINGS docker_settings.py
EXPOSE 80
```

- change svc/puppetboard to use tcp/80

Reviewed-on: #33
2026-03-18 14:25:00 +11:00
unkinben 64fb4da04c fix: puppetboard tcp is not a valid port (#32)
puppetdb_port has tcp:// in it, even though we pass the correct variable
in from a configmap.

```
ben@metabox ~/s/p/argocd-apps> kubectl --context admin run debug-pod --image=busybox --rm -it --restart=Never -n puppet -- env | grep -i puppetdb_port
PUPPETDB_PORT_8081_TCP_PORT=8081
PUPPETDB_PORT_8081_TCP_PROTO=tcp
PUPPETDB_PORT=tcp://10.43.101.142:8080
PUPPETDB_PORT_8080_TCP=tcp://10.43.101.142:8080
PUPPETDB_PORT_8080_TCP_ADDR=10.43.101.142
PUPPETDB_PORT_8081_TCP=tcp://10.43.101.142:8081
PUPPETDB_PORT_8080_TCP_PROTO=tcp
PUPPETDB_PORT_8081_TCP_ADDR=10.43.101.142
PUPPETDB_PORT_8080_TCP_PORT=8080
```

Reviewed-on: #32
2026-03-18 12:51:54 +11:00
unkinben 35f00858ae fix: puppet-compiler cant find ca (#31)
the puppetca is not pointing to the puppetmasters which prevents the
puppet-compilers from starting, preventing puppetdb/puppetboard from
starting.

- point puppetca service -> puppetserver-master

Reviewed-on: #31
2026-03-18 12:39:38 +11:00
unkinben 276d8c1d78 fix: update service names and references (#30)
updating all the names of services and their respective filenames to
better match the way puppet infra is used in my lab.

- puppet -> the compilers
- puppetca -> the master(s)
- puppetdb -> the puppetdb
- puppetboard -> puppetboard

updated references to these services in all other definitions I could find

note: need a good way to test these changes with argocd

Reviewed-on: #30
2026-03-18 12:19:57 +11:00