375 Commits

Author SHA1 Message Date
unkinben aeae26711f Convert RKE2 registries to template, disable default endpoints (#474)
## Summary
- Replace static `registries.yaml` with EPP template driven by `rke2::registries` hash
- Add `disable-default-registry-endpoint: true` to all mirrors — RKE2 will only use artifactapi and never fall back to upstream registries
- Registry configuration now fully managed via hiera data (`roles/infra/k8s.yaml`)

Reviewed-on: #474
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
2026-06-29 22:30:48 +10:00
benvin 7b53be7f8c chore: enable rke2 registries (#473)
- re-enable registries for rke2 machines

---------

Co-authored-by: Ben Vincent <ben@unkin.net>
Reviewed-on: #473
2026-06-27 22:27:33 +10:00
benvin 57c844b7e8 feat: upgrade grafana from default to 13.0.2 (#470)
Pin grafana package version to 13.0.2 via a new version parameter on
profiles::metrics::grafana, wired through to the puppet-grafana class.

---------

Co-authored-by: Ben Vincent <ben@unkin.net>
Reviewed-on: #470
2026-06-06 23:46:16 +10:00
benvin 757de20682 feat: upgrade gitea from 1.22.0 to 1.26.2 (#469)
- update release to install to 1.26.2
- change base_url to artifactapi
- update releases/checksums

---------

Co-authored-by: Ben Vincent <ben@unkin.net>
Reviewed-on: #469
2026-06-06 20:23:25 +10:00
unkinben b754d947d5 feat: add auth.unkin.net proxying to Kubernetes Traefik ingress (#467)
Add static haproxy2 backends for syd1 Kubernetes Traefik ingress
(external 198.18.199.0, internal 198.18.200.4) and route
auth.unkin.net to the internal backend with Let's Encrypt cert.

Reviewed-on: #467
2026-06-02 22:50:10 +10:00
unkinben ba35c8907c chore: increase inotify limits on rke2 nodes to fix fsnotify watcher errors (#466)
Reviewed-on: #466
2026-05-26 23:50:25 +10:00
unkinben 4b9b28ddb7 chore: disable rp_filter on k8s nodes (#461)
- k8s control/compute are multihomed, must disable rp_filter

Reviewed-on: #461
2026-04-11 21:51:42 +10:00
unkinben 0451894b48 feat: add ceph service management profiles and facts (#459)
## Summary

- Adds `Unkin::Ceph::Utils` facter module detecting ceph service instances via `systemctl list-units`, exposing `is_ceph_mon`, `is_ceph_mgr`, `is_ceph_mds`, `is_ceph_osd` booleans and a `ceph_services` hash of unit names
- Adds `profiles::ceph::mon`, `mgr`, `mds`, `osd` — each with `Boolean $ensure_running` that iterates discovered service instances and manages them as running and enabled
- Works across incus nodes (mon/mgr/mds/osd) and k8s compute/control nodes (osd only); verified on prodnxsr0001 which correctly reports `is_ceph_osd: true` and `ceph_services: {osd: [ceph-osd@5]}`

## Test plan

- [x] Noop deploy against prodnxsr0001.main.unkin.net passed cleanly
- [x] `ceph_services` fact returns correct service map
- [x] `is_ceph_osd` returns `True`, `is_ceph_mon` returns `False` as expected
- [x] Test on an incus/ceph node with mon/mgr/mds services

Reviewed-on: #459
2026-04-07 19:02:17 +10:00
unkinben 3714691240 chore: enable access to dns (#460)
rebuilding router, taking the chance to not mess up ip ranges. I did
have 198.18.21.0/24 and 198.18.21.160/27 and 198.18.21.192/27 all on
differnt interfaces.

- update IP's that can reach bind view for main.unkin.net
- keep both for intermediate period

Reviewed-on: #460
2026-04-06 22:46:40 +10:00
unkinben dbe04a91e3 chore: change to ceph-public loopback (#458)
- use ceph public loopback port 9443 for dashboard

Reviewed-on: #458
2026-04-05 22:35:39 +10:00
unkinben 0c0d4a3f61 chore: update r10k repo path (#454)
- change to use letsencrypt ssl path for simpler tls trust management

Reviewed-on: #454
2026-03-17 17:36:58 +11:00
unkinben bc769aa1df feat: add ldap groups for kubernetes/vault (#449)
need to separate the permissions inside vault into different groups, one
per-permission.

- add group for each kubernetes role in vault

Reviewed-on: #449
2026-02-14 19:22:26 +11:00
unkinben 4e652ccbe6 chore: add alt-names to consul (#448)
- ensure consul datacenter is added to altnames

Reviewed-on: #448
2026-02-09 01:03:20 +11:00
unkinben 8c24c6582f feat: manage vault version (#446)
- add params for version and package name
- add param to cleanup openbao
- add version lock (if not latest)

Reviewed-on: #446
2026-02-08 22:26:22 +11:00
unkinben 6bfc63ca31 feat: enable plugins for vault/openbao (#447)
- install openbao-plugins
- add plugin_directory

Reviewed-on: #447
2026-02-08 19:19:33 +11:00
unkinben c4d28d52bc chore: remove helm deploys from puppet (#444)
- migrate helm deployments to terraform

Reviewed-on: #444
2026-01-30 20:52:51 +11:00
unkinben 6219855fb1 chore: add additional user (#443)
- as per request

Reviewed-on: #443
2026-01-26 20:21:10 +11:00
unkinben 7215a6f534 chore: terraform state too large for body (#442)
- update consul/nginx max body size to 512MB

Reviewed-on: #442
2026-01-18 17:15:08 +11:00
unkinben 88efdbcdd3 chore: reduce synced repos (#441)
- remove repos now available via artifactapi

Reviewed-on: #441
2026-01-17 17:12:44 +11:00
unkinben dbe1398218 chore: centralise all yum repo configuration (#436)
- add 30+ repository definitions to AlmaLinux/all_releases.yaml with `ensure: absent` defaults
- update all role-specific hieradata files to use `ensure: present` pattern
- remove duplicated repository URL/GPG key configurations from individual roles
- maintains existing functionality while improving maintainability"

Reviewed-on: #436
2026-01-15 21:35:13 +11:00
unkinben 383bbb0507 fix: ensure join-api is functioning (#434)
- consul was directing new rke2 control nodes to a dead join api
- add additional check to verify its responding (not just up)

Reviewed-on: #434
2026-01-11 13:51:36 +11:00
unkinben 6f51bffeaa core: bump radowgw client_max_body_size (#433)
Reviewed-on: #433
2026-01-07 23:27:09 +11:00
unkinben 57870658b5 feat: act runner updates (#432)
saving artifacts are breaking in some actions as the runner will switch
between different git hosts. using haproxy will ensure the same backend
is always hit via stick-tables and cookies

- ensure runners use haproxy to reach git

we now package act_runner now, lets use the rpm

- change installation method to rpm instead of curl + untar
- add capability to versionlock act_runner
- fix paths to act_runner
- remove manually installed act_runner

Reviewed-on: #432
2026-01-03 21:51:47 +11:00
unkinben f8caa71f34 fix: increase artifact upload size for git (#431)
- rpmbuilder artifacts can be very large
- increase 1Gb limit to 5GB

Reviewed-on: #431
2025-12-30 22:52:43 +11:00
unkinben a2c56c9e46 chore: add almalinux 9.7 repositories (#430)
- ensure almalinux 9.7 is synced

Reviewed-on: #430
2025-12-30 22:48:54 +11:00
unkinben 0aec795aec feat: manage externaldns bind (#428)
- add module to manage externaldns bind for k8s
- add infra::dns::externaldns role
- add 198.18.19.20 as anycast for k8s external-dns service

Reviewed-on: #428
2025-11-22 23:25:55 +11:00
unkinben 9854403b02 feat: add syslog listener for vlinsert (#427)
- enable syslog capture via vlinsert
- add syslog.service.consul service

Reviewed-on: #427
2025-11-20 23:47:10 +11:00
unkinben 6400c89853 feat: add vmcluster static targets (#426)
- add ability to list static targets for vmagent to scrape
- add vyos router to be scraped

Reviewed-on: #426
2025-11-20 20:19:53 +11:00
unkinben 9eff241003 feat: add SMTP submission listener and enhance stalwart configuration (#425)
- add SMTP submission listener on port 587 with TLS requirement
- configure HAProxy frontend/backend for submission with send-proxy-v2 support
- add send-proxy-v2 support to all listeners
- add dynamic HAProxy node discovery for proxy trusted networks
- use service hostname instead of node FQDN for autoconfig/autodiscover
- remove redundant IMAP/IMAPS/SMTP alt-names from TLS certificates
- update VRRP CNAME configuration to use mail.main.unkin.net

Reviewed-on: #425
2025-11-09 18:48:06 +11:00
unkinben 35614060bd chore: replace stalwart S3 keys (#424)
- update stalwart S3 AK/SK after migrating to new zonegroup

Reviewed-on: #424
2025-11-08 22:56:24 +11:00
unkinben 2c9fb3d86a chore: add stalwart required tls alt names (#422)
- add alt-names for service addresses stalwart is expected to reply too

Reviewed-on: #422
2025-11-08 21:28:41 +11:00
unkinben 559c453906 chore: change transport for main.unkin.net (#421)
- ensure main.unkin.net mail is delivered to stalwart load-balancer addr

Reviewed-on: #421
2025-11-08 21:10:11 +11:00
unkinben 5b0365c096 feat: manage haproxy for stalwart (#420)
- add frontends for imap, imaps and smtp
- add backends for webadmin, imap, imaps and smtp

Reviewed-on: #420
2025-11-08 21:07:43 +11:00
unkinben 9dd74013ea feat: create stalwart module (#418)
- add stalwart module
- add psql database on the shared patroni instance
- add ceph-rgw credentials to eyaml
- ensure psql pass and s3 access key are converted to sensitive

Reviewed-on: #418
2025-11-08 19:09:30 +11:00
unkinben 92a48b4113 feat: ensure latest openbao package (#417)
- stop version locking openbao, use latest

Reviewed-on: #417
2025-11-06 20:01:37 +11:00
unkinben 78adef0eee refactor: recreate profiles::postfix::gateway with parameterization and templates (#416)
- refactor profiles::postfix::gateway as parameterized class
- move base postfix parameters, transports, and virtuals to hiera for flexibility
- convert SMTP restrictions to arrays for better readability using join()
- add postscreen enable/disable boolean with conditional master.cf configuration
- add per-domain TLS policy maps (smtp_tls_policy_maps)
- convert alias_maps to array parameter for flexibility
- convert all postfix map files to ERB templates with parameter hashes
- add map parameters: sender_canonical_maps, sender_access_maps, relay_recipients_maps,
  relay_domains_maps, recipient_canonical_maps, recipient_access_maps, postscreen_access_maps, helo_access_maps
- move default map data to hiera while keeping parameters as empty hashes by default

This approach balances flexibility with data-driven configuration, allowing
easy customization through parameters while keeping transport/virtual maps
and default map data in hiera for role-specific overrides.

Reviewed-on: #416
2025-11-01 17:26:00 +11:00
unkinben a2a8edb731 feat: implement comprehensive postfix gateway with eFa5 configuration (#414)
- add voxpupuli-postfix module to Puppetfile
- create profiles::postfix::gateway class with config based on efa5
- add master.cf entries for postscreen, smtpd, dnsblog, and tlsproxy services
- create postfix hash files: aliases, access controls, canonical maps
- configure TLS with system PKI certificates and strong cipher suites
- add transport and virtual alias mappings for mail routing

Reviewed-on: #414
2025-11-01 00:43:58 +11:00
unkinben e95a59b88a feat: migrate puppetserver -> openvox-server (#412)
- enable openvox repo
- ensure puppetdb-termini and puppetserver are purged
- set openvox-server as the package to install
- set termini package to openvoxdb-termini

Reviewed-on: #412
2025-10-18 23:49:51 +11:00
unkinben 98b866fce7 feat: migrate puppet-agent to openvox (#408)
- change from puppet-agent to openvox-agent
- upgrade version from 7.34 to 7.36
- ensure workflow of: Yumrepo -> dnf-makecache -> Package

Reviewed-on: #408
2025-10-18 19:11:38 +11:00
unkinben e724326d43 feat: allow access to runner certs (#407)
- allow access to runner certs, used for mtls auth to incus

Reviewed-on: #407
2025-10-17 22:46:45 +11:00
unkinben d8b354558d feat: add incus auto-client certificate trust (#406)
- add fact to export vault public cert from agents
- add fact to export list of trusted incus client certs
- add method for incus clients to export their client cert to be trusted

Reviewed-on: #406
2025-10-17 22:46:26 +11:00
unkinben efbbb6bcb1 feat: moderate the k8s install (#403)
- only install a base config
- wait for 3 masters before deploying helm charts
- remove cluster-domain
- manage nginx ingres via rke2 helmconfig

Reviewed-on: #403
2025-10-12 17:50:24 +11:00
unkinben 16e654fdd7 feat: use openbao (#404)
- change vault role to use openbao

Reviewed-on: #404
2025-10-11 20:55:21 +11:00
unkinben b224cfb516 fix: cattle-system namespace (#399)
- cattle-system namespace is created earlier than helm
- leave namespaces.yaml to manage cattle-system namespace (required
  before installing helm/rancher)

Reviewed-on: #399
2025-09-21 00:21:41 +10:00
unkinben 571a9b25a7 fix: resolve rke2-server errors (#397)
- kubectl yaml files must not use underscores
- replace unicode hyphen with ascii hyphen

Reviewed-on: #397
2025-09-20 18:40:18 +10:00
unkinben 762f415d2d feat: k8s helm rework (#396)
- remove helm-generated-yaml, replace with helm execs
- template/parameterise ceph csi

Reviewed-on: #396
2025-09-20 17:40:41 +10:00
unkinben 4e77fb7ee7 feat: manage rancher, purelb, cert-manager (#395)
This change will install rancher, purelb and cert-manager, then
configure a dmz and common ip pool to be used by loadbalancers. The
nginx ingres controller is configured to use 198.18.200.0 (common) and
announce the ip from all nodes so that it becomes an anycast ip in ospf.

- manage the install of rancher, purelb and cert-manager
- add rancher ingress routes
- add nginx externalip/loadBalancer

Reviewed-on: #395
2025-09-14 20:59:39 +10:00
unkinben 6e4bc9fbc7 feat: adding rke2 (#394)
- manage rke2 repos
- add rke2 module (init, params, install, config, service)
- split roles::infra::k8s::node -> control/compute roles
- moved common k8s config into k8s.yaml
- add bootstrap_node, manage server and token fields in rke2 config
- manage install of helm
- manage node attributes (from puppet facts)
- manage frr exclusions for service/cluster network

Reviewed-on: #394
2025-09-14 13:27:49 +10:00
unkinben 98a433d366 feat: mirror rke2 repo for rhel9 (#392)
- create rhel9 mirrors for rke2 1.33 and common

Reviewed-on: #392
2025-09-13 19:49:52 +10:00
unkinben 0665873dc8 feat: update ospf source for learned routes (#388)
- enable changing the source address for learned ospf routes
- this enables the loopback0 interface to be used as a default src address
- ensure k8s nodes use loopback0 as default src
- ensure incus nodes use loopback0 as default src

Reviewed-on: #388
2025-09-07 16:09:21 +10:00