619 Commits

Author SHA1 Message Date
unkinben aeae26711f Convert RKE2 registries to template, disable default endpoints (#474)
## Summary
- Replace static `registries.yaml` with EPP template driven by `rke2::registries` hash
- Add `disable-default-registry-endpoint: true` to all mirrors — RKE2 will only use artifactapi and never fall back to upstream registries
- Registry configuration now fully managed via hiera data (`roles/infra/k8s.yaml`)

Reviewed-on: #474
Co-authored-by: Ben Vincent <ben@unkin.net>
Co-committed-by: Ben Vincent <ben@unkin.net>
2026-06-29 22:30:48 +10:00
benvin 7b53be7f8c chore: enable rke2 registries (#473)
- re-enable registries for rke2 machines

---------

Co-authored-by: Ben Vincent <ben@unkin.net>
Reviewed-on: #473
2026-06-27 22:27:33 +10:00
benvin 57c844b7e8 feat: upgrade grafana from default to 13.0.2 (#470)
Pin grafana package version to 13.0.2 via a new version parameter on
profiles::metrics::grafana, wired through to the puppet-grafana class.

---------

Co-authored-by: Ben Vincent <ben@unkin.net>
Reviewed-on: #470
2026-06-06 23:46:16 +10:00
benvin 757de20682 feat: upgrade gitea from 1.22.0 to 1.26.2 (#469)
- update release to install to 1.26.2
- change base_url to artifactapi
- update releases/checksums

---------

Co-authored-by: Ben Vincent <ben@unkin.net>
Reviewed-on: #469
2026-06-06 20:23:25 +10:00
unkinben 6ef1b20abd feat: add switch to change to almalinux-vault (#468)
- move old almalinux versions to query the almalinux-vault
- default to the almalinux remote

Reviewed-on: #468
2026-06-06 17:35:04 +10:00
unkinben b754d947d5 feat: add auth.unkin.net proxying to Kubernetes Traefik ingress (#467)
Add static haproxy2 backends for syd1 Kubernetes Traefik ingress
(external 198.18.199.0, internal 198.18.200.4) and route
auth.unkin.net to the internal backend with Let's Encrypt cert.

Reviewed-on: #467
2026-06-02 22:50:10 +10:00
unkinben ba35c8907c chore: increase inotify limits on rke2 nodes to fix fsnotify watcher errors (#466)
Reviewed-on: #466
2026-05-26 23:50:25 +10:00
unkinben 4b9b28ddb7 chore: disable rp_filter on k8s nodes (#461)
- k8s control/compute are multihomed, must disable rp_filter

Reviewed-on: #461
2026-04-11 21:51:42 +10:00
unkinben 0451894b48 feat: add ceph service management profiles and facts (#459)
## Summary

- Adds `Unkin::Ceph::Utils` facter module detecting ceph service instances via `systemctl list-units`, exposing `is_ceph_mon`, `is_ceph_mgr`, `is_ceph_mds`, `is_ceph_osd` booleans and a `ceph_services` hash of unit names
- Adds `profiles::ceph::mon`, `mgr`, `mds`, `osd` — each with `Boolean $ensure_running` that iterates discovered service instances and manages them as running and enabled
- Works across incus nodes (mon/mgr/mds/osd) and k8s compute/control nodes (osd only); verified on prodnxsr0001 which correctly reports `is_ceph_osd: true` and `ceph_services: {osd: [ceph-osd@5]}`
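
The detection described in this summary can be sketched roughly as follows. This is an illustrative Python approximation, not the `Unkin::Ceph::Utils` facter module itself: the function name `ceph_facts`, the regular expression, and the assumption that the input comes from `systemctl list-units --plain --no-legend` are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical pattern: matches templated units such as "ceph-osd@5.service".
UNIT_RE = re.compile(r"^(ceph-(mon|mgr|mds|osd)@.+)\.service$")

CEPH_KINDS = ("mon", "mgr", "mds", "osd")


def ceph_facts(list_units_output: str) -> dict:
    """Build is_ceph_* booleans and a ceph_services map from unit listings.

    Assumes one unit per line with the unit name in the first column,
    as produced by `systemctl list-units --plain --no-legend`.
    """
    services = defaultdict(list)
    for line in list_units_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        match = UNIT_RE.match(fields[0])
        if match:
            # group(2) is the daemon kind, group(1) the unit name without ".service"
            services[match.group(2)].append(match.group(1))

    facts = {f"is_ceph_{kind}": kind in services for kind in CEPH_KINDS}
    facts["ceph_services"] = dict(services)
    return facts


# Made-up sample input resembling an OSD-only node.
SAMPLE = """\
ceph-osd@5.service loaded active running Ceph object storage daemon osd.5
sshd.service       loaded active running OpenSSH server daemon
"""

facts = ceph_facts(SAMPLE)
```

With the sample input, `facts["ceph_services"]` holds a single `osd` entry, mirroring the shape reported for prodnxsr0001 above.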

## Test plan

- [x] Noop deploy against prodnxsr0001.main.unkin.net passed cleanly
- [x] `ceph_services` fact returns correct service map
- [x] `is_ceph_osd` returns `True`, `is_ceph_mon` returns `False` as expected
- [x] Test on an incus/ceph node with mon/mgr/mds services

Reviewed-on: #459
2026-04-07 19:02:17 +10:00
unkinben 3714691240 chore: enable access to dns (#460)
rebuilding router, taking the chance to not mess up ip ranges. I did
have 198.18.21.0/24 and 198.18.21.160/27 and 198.18.21.192/27 all on
different interfaces.

- update IPs that can reach bind view for main.unkin.net
- keep both for intermediate period

Reviewed-on: #460
2026-04-06 22:46:40 +10:00
unkinben dbe04a91e3 chore: change to ceph-public loopback (#458)
- use ceph public loopback port 9443 for dashboard

Reviewed-on: #458
2026-04-05 22:35:39 +10:00
unkinben 0c0d4a3f61 chore: update r10k repo path (#454)
- change to use letsencrypt ssl path for simpler tls trust management

Reviewed-on: #454
2026-03-17 17:36:58 +11:00
unkinben bc769aa1df feat: add ldap groups for kubernetes/vault (#449)
need to separate the permissions inside vault into different groups, one
per-permission.

- add group for each kubernetes role in vault

Reviewed-on: #449
2026-02-14 19:22:26 +11:00
unkinben 4e652ccbe6 chore: add alt-names to consul (#448)
- ensure consul datacenter is added to altnames

Reviewed-on: #448
2026-02-09 01:03:20 +11:00
unkinben 8c24c6582f feat: manage vault version (#446)
- add params for version and package name
- add param to cleanup openbao
- add version lock (if not latest)

Reviewed-on: #446
2026-02-08 22:26:22 +11:00
unkinben 6bfc63ca31 feat: enable plugins for vault/openbao (#447)
- install openbao-plugins
- add plugin_directory

Reviewed-on: #447
2026-02-08 19:19:33 +11:00
unkinben c4d28d52bc chore: remove helm deploys from puppet (#444)
- migrate helm deployments to terraform

Reviewed-on: #444
2026-01-30 20:52:51 +11:00
unkinben 6219855fb1 chore: add additional user (#443)
- as per request

Reviewed-on: #443
2026-01-26 20:21:10 +11:00
unkinben 7215a6f534 chore: terraform state too large for body (#442)
- update consul/nginx max body size to 512MB

Reviewed-on: #442
2026-01-18 17:15:08 +11:00
unkinben 88efdbcdd3 chore: reduce synced repos (#441)
- remove repos now available via artifactapi

Reviewed-on: #441
2026-01-17 17:12:44 +11:00
unkinben 1077bdcbc1 chore: update ceph gpgkey (#438)
- stop checking ceph gpgkey (fixme)
- use artifactapi for retrieving large rke image bundle

Reviewed-on: #438
2026-01-16 23:51:11 +11:00
unkinben 4e928585f5 fix: ceph repos remove dash (#437)
Reviewed-on: #437
2026-01-15 21:52:17 +11:00
unkinben dbe1398218 chore: centralise all yum repo configuration (#436)
- add 30+ repository definitions to AlmaLinux/all_releases.yaml with `ensure: absent` defaults
- update all role-specific hieradata files to use `ensure: present` pattern
- remove duplicated repository URL/GPG key configurations from individual roles
- maintains existing functionality while improving maintainability
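
The `ensure: absent` default with a per-role `ensure: present` override can be illustrated with a small layered merge. This is a Python sketch only: it assumes the role data is deep-merged over the shared defaults, and the repository names, URLs, and the `deep_merge` helper are hypothetical rather than taken from the repository.

```python
import copy


def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict where override values win, merging nested dicts."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical stand-in for the shared defaults: every repo is defined once,
# with its URL and GPG settings, but disabled by default.
all_releases = {
    "epel": {
        "ensure": "absent",
        "baseurl": "https://repo.example.invalid/epel",
        "gpgcheck": 1,
    },
    "ceph": {
        "ensure": "absent",
        "baseurl": "https://repo.example.invalid/ceph",
        "gpgcheck": 1,
    },
}

# Hypothetical role-level data: only flips the switch for the repo it needs.
role_overrides = {"epel": {"ensure": "present"}}

effective = deep_merge(all_releases, role_overrides)
```

Here the role only states which repository it wants; the URL and GPG settings stay in one place, which is the deduplication the bullets above describe.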

Reviewed-on: #436
2026-01-15 21:35:13 +11:00
unkinben 383bbb0507 fix: ensure join-api is functioning (#434)
- consul was directing new rke2 control nodes to a dead join api
- add additional check to verify it's responding (not just up)

Reviewed-on: #434
2026-01-11 13:51:36 +11:00
unkinben 6f51bffeaa chore: bump radosgw client_max_body_size (#433)
Reviewed-on: #433
2026-01-07 23:27:09 +11:00
unkinben 57870658b5 feat: act runner updates (#432)
saving artifacts is breaking in some actions as the runner will switch
between different git hosts. using haproxy will ensure the same backend
is always hit via stick-tables and cookies

- ensure runners use haproxy to reach git

we now package act_runner, so let's use the rpm

- change installation method to rpm instead of curl + untar
- add capability to versionlock act_runner
- fix paths to act_runner
- remove manually installed act_runner

Reviewed-on: #432
2026-01-03 21:51:47 +11:00
unkinben f8caa71f34 fix: increase artifact upload size for git (#431)
- rpmbuilder artifacts can be very large
- increase 1GB limit to 5GB

Reviewed-on: #431
2025-12-30 22:52:43 +11:00
unkinben a2c56c9e46 chore: add almalinux 9.7 repositories (#430)
- ensure almalinux 9.7 is synced

Reviewed-on: #430
2025-12-30 22:48:54 +11:00
unkinben 40d8e924ee feat: enable managing root password (#429)
- update root password in common.eyaml
- add missing param to the accounts::root manifest
- remove if block as undef sshkeys has same effect

Reviewed-on: #429
2025-12-28 20:12:12 +11:00
unkinben 0aec795aec feat: manage externaldns bind (#428)
- add module to manage externaldns bind for k8s
- add infra::dns::externaldns role
- add 198.18.19.20 as anycast for k8s external-dns service

Reviewed-on: #428
2025-11-22 23:25:55 +11:00
unkinben 9854403b02 feat: add syslog listener for vlinsert (#427)
- enable syslog capture via vlinsert
- add syslog.service.consul service

Reviewed-on: #427
2025-11-20 23:47:10 +11:00
unkinben 6400c89853 feat: add vmcluster static targets (#426)
- add ability to list static targets for vmagent to scrape
- add vyos router to be scraped

Reviewed-on: #426
2025-11-20 20:19:53 +11:00
unkinben 9eff241003 feat: add SMTP submission listener and enhance stalwart configuration (#425)
- add SMTP submission listener on port 587 with TLS requirement
- configure HAProxy frontend/backend for submission with send-proxy-v2 support
- add send-proxy-v2 support to all listeners
- add dynamic HAProxy node discovery for proxy trusted networks
- use service hostname instead of node FQDN for autoconfig/autodiscover
- remove redundant IMAP/IMAPS/SMTP alt-names from TLS certificates
- update VRRP CNAME configuration to use mail.main.unkin.net

Reviewed-on: #425
2025-11-09 18:48:06 +11:00
unkinben 35614060bd chore: replace stalwart S3 keys (#424)
- update stalwart S3 AK/SK after migrating to new zonegroup

Reviewed-on: #424
2025-11-08 22:56:24 +11:00
unkinben 1b0fd10fd7 fix: remove . from end of vrrp_cnames (#423)
- autoconfig/autodiscovery should not end with a dot

Reviewed-on: #423
2025-11-08 21:38:10 +11:00
unkinben 2c9fb3d86a chore: add stalwart required tls alt names (#422)
- add alt-names for service addresses stalwart is expected to reply to

Reviewed-on: #422
2025-11-08 21:28:41 +11:00
unkinben 559c453906 chore: change transport for main.unkin.net (#421)
- ensure main.unkin.net mail is delivered to stalwart load-balancer addr

Reviewed-on: #421
2025-11-08 21:10:11 +11:00
unkinben 5b0365c096 feat: manage haproxy for stalwart (#420)
- add frontends for imap, imaps and smtp
- add backends for webadmin, imap, imaps and smtp

Reviewed-on: #420
2025-11-08 21:07:43 +11:00
unkinben 1e7dfb9d9d feat: manage additional ceph sections (#419)
- ensure mons configuration is managed in code
- ensure radosgw configuration is managed in code

Reviewed-on: #419
2025-11-08 19:19:44 +11:00
unkinben 9dd74013ea feat: create stalwart module (#418)
- add stalwart module
- add psql database on the shared patroni instance
- add ceph-rgw credentials to eyaml
- ensure psql pass and s3 access key are converted to sensitive

Reviewed-on: #418
2025-11-08 19:09:30 +11:00
unkinben 92a48b4113 feat: ensure latest openbao package (#417)
- stop version locking openbao, use latest

Reviewed-on: #417
2025-11-06 20:01:37 +11:00
unkinben 78adef0eee refactor: recreate profiles::postfix::gateway with parameterization and templates (#416)
- refactor profiles::postfix::gateway as parameterized class
- move base postfix parameters, transports, and virtuals to hiera for flexibility
- convert SMTP restrictions to arrays for better readability using join()
- add postscreen enable/disable boolean with conditional master.cf configuration
- add per-domain TLS policy maps (smtp_tls_policy_maps)
- convert alias_maps to array parameter for flexibility
- convert all postfix map files to ERB templates with parameter hashes
- add map parameters: sender_canonical_maps, sender_access_maps, relay_recipients_maps,
  relay_domains_maps, recipient_canonical_maps, recipient_access_maps, postscreen_access_maps, helo_access_maps
- move default map data to hiera while keeping parameters as empty hashes by default

This approach balances flexibility with data-driven configuration, allowing
easy customization through parameters while keeping transport/virtual maps
and default map data in hiera for role-specific overrides.

Reviewed-on: #416
2025-11-01 17:26:00 +11:00
unkinben a2a8edb731 feat: implement comprehensive postfix gateway with eFa5 configuration (#414)
- add voxpupuli-postfix module to Puppetfile
- create profiles::postfix::gateway class with config based on efa5
- add master.cf entries for postscreen, smtpd, dnsblog, and tlsproxy services
- create postfix hash files: aliases, access controls, canonical maps
- configure TLS with system PKI certificates and strong cipher suites
- add transport and virtual alias mappings for mail routing

Reviewed-on: #414
2025-11-01 00:43:58 +11:00
unkinben e95a59b88a feat: migrate puppetserver -> openvox-server (#412)
- enable openvox repo
- ensure puppetdb-termini and puppetserver are purged
- set openvox-server as the package to install
- set termini package to openvoxdb-termini

Reviewed-on: #412
2025-10-18 23:49:51 +11:00
unkinben 98b866fce7 feat: migrate puppet-agent to openvox (#408)
- change from puppet-agent to openvox-agent
- upgrade version from 7.34 to 7.36
- ensure workflow of: Yumrepo -> dnf-makecache -> Package

Reviewed-on: #408
2025-10-18 19:11:38 +11:00
unkinben e724326d43 feat: allow access to runner certs (#407)
- allow access to runner certs, used for mtls auth to incus

Reviewed-on: #407
2025-10-17 22:46:45 +11:00
unkinben d8b354558d feat: add incus auto-client certificate trust (#406)
- add fact to export vault public cert from agents
- add fact to export list of trusted incus client certs
- add method for incus clients to export their client cert to be trusted

Reviewed-on: #406
2025-10-17 22:46:26 +11:00
unkinben efbbb6bcb1 feat: moderate the k8s install (#403)
- only install a base config
- wait for 3 masters before deploying helm charts
- remove cluster-domain
- manage nginx ingress via rke2 helmconfig

Reviewed-on: #403
2025-10-12 17:50:24 +11:00
unkinben 16e654fdd7 feat: use openbao (#404)
- change vault role to use openbao

Reviewed-on: #404
2025-10-11 20:55:21 +11:00
unkinben b224cfb516 fix: cattle-system namespace (#399)
- cattle-system namespace is created earlier than helm
- leave namespaces.yaml to manage cattle-system namespace (required
  before installing helm/rancher)

Reviewed-on: #399
2025-09-21 00:21:41 +10:00