Populate the repo with the Vault/OpenBao dynamic secrets engine that mints LiteLLM virtual keys scoped by model, spending limit, and lease TTL.

- Secrets backend: config, roles, creds paths and a revocable litellm_key type
- LiteLLM API client (generate/update/delete/info) with master-key auth
- Unit tests (mock LiteLLM) and a docker-compose e2e against both Vault and OpenBao proving the same binary works on each
- Makefile, woodpecker CI (build/test/pre-commit), pre-commit config
vault-plugin-secrets-litellm
A dynamic secrets engine for LiteLLM that runs on both HashiCorp Vault and OpenBao. It generates virtual API keys on a LiteLLM proxy that are:
- scoped to specific models (`models`)
- capped by a spending limit (`max_budget`)
- bound to a lease TTL: revoking the lease revokes the key in LiteLLM
Vault and OpenBao
OpenBao is a fork of Vault and keeps its plugin protocol compatible, so the
same plugin binary registers and runs on either engine unchanged — the CLI
commands below are identical apart from vault vs bao. The end-to-end test
suite exercises the full lifecycle against both to prove it.
How it works
The backend authenticates to the LiteLLM proxy with the master key and manages
virtual keys through the proxy's key-management API (/key/generate,
/key/update, /key/delete, /key/info). Each generated key is wrapped in a
Vault lease, so Vault owns the key's lifecycle: renew extends it, revoke deletes
it.
┌─────────┐  read creds/<role>   ┌────────────────────────┐  /key/generate   ┌──────────┐
│ client  │ ───────────────────► │ vault + litellm plugin │ ───────────────► │ litellm  │
└─────────┘ ◄─── virtual key ──  └────────────────────────┘ ◄── sk-... ───── └──────────┘
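The generate step in the flow above can be sketched in Go as follows. This is a minimal illustration, not the plugin's actual client: the type and function names (`generateRequest`, `generateKey`, `newMockLiteLLM`) are hypothetical, the request fields (`models`, `max_budget`, `duration`) and the `Bearer` master-key header are assumptions about the LiteLLM `/key/generate` endpoint, and an in-process mock stands in for the proxy so the example runs on its own.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// generateRequest holds the subset of /key/generate fields assumed in this sketch.
type generateRequest struct {
	Models    []string `json:"models,omitempty"`
	MaxBudget float64  `json:"max_budget,omitempty"`
	Duration  string   `json:"duration,omitempty"`
}

// generateResponse keeps only the field this sketch needs from the reply.
type generateResponse struct {
	Key string `json:"key"`
}

// generateKey posts to <baseURL>/key/generate, authenticating with the master key.
func generateKey(baseURL, masterKey string, req generateRequest) (string, error) {
	body, err := json.Marshal(req)
	if err != nil {
		return "", err
	}
	httpReq, err := http.NewRequest(http.MethodPost, baseURL+"/key/generate", bytes.NewReader(body))
	if err != nil {
		return "", err
	}
	httpReq.Header.Set("Authorization", "Bearer "+masterKey)
	httpReq.Header.Set("Content-Type", "application/json")

	resp, err := http.DefaultClient.Do(httpReq)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("litellm: unexpected status %d", resp.StatusCode)
	}
	var out generateResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return "", err
	}
	return out.Key, nil
}

// newMockLiteLLM is a stand-in proxy: it accepts /key/generate only when the
// expected master key is presented and always returns the same fake key.
func newMockLiteLLM(masterKey string) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path != "/key/generate" || r.Header.Get("Authorization") != "Bearer "+masterKey {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		json.NewEncoder(w).Encode(generateResponse{Key: "sk-mock"})
	}))
}

func main() {
	srv := newMockLiteLLM("sk-master")
	defer srv.Close()

	key, err := generateKey(srv.URL, "sk-master", generateRequest{
		Models:    []string{"gpt-3.5-turbo", "gpt-4"},
		MaxBudget: 50,
		Duration:  "1h",
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(key)
}
```

Update, delete, and info calls would presumably follow the same shape against `/key/update`, `/key/delete`, and `/key/info`, with the master key in the same header.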
Usage
# 1. Enable the engine
vault secrets enable -path=litellm vault-plugin-secrets-litellm
# 2. Configure the connection to the LiteLLM proxy
vault write litellm/config \
    base_url=http://litellm:4000 \
    master_key=sk-master-...
# 3. Define a role: which models, how much budget, what TTL
vault write litellm/roles/team-a \
    models="gpt-3.5-turbo,gpt-4" \
    max_budget=50 \
    ttl=1h \
    max_ttl=24h
# 4. Generate a scoped, budgeted, time-limited virtual key
vault read litellm/creds/team-a
# Key               Value
# ---               -----
# lease_id          litellm/creds/team-a/AbC...
# lease_duration    1h
# key               sk-...
# max_budget        50
# models            [gpt-3.5-turbo gpt-4]
# 5. Revoking the lease revokes the key in LiteLLM
vault lease revoke litellm/creds/team-a/AbC...
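Steps 4 and 5 can also be driven from an application over the server's HTTP API instead of the CLI. The sketch below is an illustration under stated assumptions: it uses the standard Vault routes `GET /v1/<mount>/creds/<role>` and `PUT /v1/sys/leases/revoke` with an `X-Vault-Token` header (OpenBao is assumed to accept the same), the helper names (`readCreds`, `revokeLease`, `newMockVault`) are hypothetical, and a mock server replaces a real Vault so the example is self-contained.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// credsResponse captures the parts of a dynamic-secret read used below.
type credsResponse struct {
	LeaseID       string `json:"lease_id"`
	LeaseDuration int    `json:"lease_duration"` // seconds
	Data          struct {
		Key string `json:"key"`
	} `json:"data"`
}

// readCreds asks the engine mounted at mount for a new virtual key for role.
func readCreds(addr, token, mount, role string) (*credsResponse, error) {
	req, err := http.NewRequest(http.MethodGet, addr+"/v1/"+mount+"/creds/"+role, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("X-Vault-Token", token)
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("read creds: unexpected status %d", resp.StatusCode)
	}
	var out credsResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return nil, err
	}
	return &out, nil
}

// revokeLease revokes a lease by ID; the plugin then deletes the key in LiteLLM.
func revokeLease(addr, token, leaseID string) error {
	body, err := json.Marshal(map[string]string{"lease_id": leaseID})
	if err != nil {
		return err
	}
	req, err := http.NewRequest(http.MethodPut, addr+"/v1/sys/leases/revoke", bytes.NewReader(body))
	if err != nil {
		return err
	}
	req.Header.Set("X-Vault-Token", token)
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusNoContent {
		return fmt.Errorf("revoke: unexpected status %d", resp.StatusCode)
	}
	return nil
}

// newMockVault imitates the two routes used above; it does not check the token.
func newMockVault() *httptest.Server {
	mux := http.NewServeMux()
	mux.HandleFunc("/v1/litellm/creds/team-a", func(w http.ResponseWriter, r *http.Request) {
		json.NewEncoder(w).Encode(map[string]interface{}{
			"lease_id":       "litellm/creds/team-a/mock",
			"lease_duration": 3600,
			"data":           map[string]interface{}{"key": "sk-mock"},
		})
	})
	mux.HandleFunc("/v1/sys/leases/revoke", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusNoContent)
	})
	return httptest.NewServer(mux)
}

func main() {
	srv := newMockVault()
	defer srv.Close()

	creds, err := readCreds(srv.URL, "test-token", "litellm", "team-a")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(creds.Data.Key, creds.LeaseID, creds.LeaseDuration)
	if err := revokeLease(srv.URL, "test-token", creds.LeaseID); err != nil {
		fmt.Println("error:", err)
	}
}
```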
Paths
| Path | Ops | Description |
|---|---|---|
| `config` | read/write/delete | LiteLLM connection (`base_url`, `master_key`) |
| `roles/<name>` | read/write/delete | Constraints for generated keys |
| `roles/` | list | List configured roles |
| `creds/<name>` | read | Generate a virtual key for the role |
Role fields
| Field | Type | Description |
|---|---|---|
| `models` | list | Allowed models; empty means unrestricted |
| `max_budget` | float | Spending limit per key; 0 means unlimited |
| `key_alias_prefix` | string | Prefix for the generated key alias (default `vault`) |
| `metadata` | kv pairs | Metadata attached to each generated key |
| `ttl` | duration | Default lease TTL |
| `max_ttl` | duration | Maximum lease TTL |
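One way to picture how the "empty means unrestricted" and "0 means unlimited" rules could translate into a LiteLLM request is sketched below. The `role` and `generateRequest` types and the `encode` helper are hypothetical, not the plugin's own; the sketch assumes that omitting `models` or `max_budget` from the request leaves the key unrestricted or unbudgeted, and that LiteLLM's `duration` field accepts a seconds string such as `3600s`.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// role is a hypothetical in-memory form of a few role fields.
type role struct {
	Models    []string
	MaxBudget float64
	TTL       time.Duration
}

// generateRequest uses omitempty so that an empty model list and a zero
// budget are left out of the JSON body instead of being sent as [] or 0.
type generateRequest struct {
	Models    []string `json:"models,omitempty"`
	MaxBudget float64  `json:"max_budget,omitempty"`
	Duration  string   `json:"duration,omitempty"`
}

// toRequest converts the role into a request body. The TTL is rendered as
// whole seconds because Go's default "1h0m0s" form is not assumed to parse.
func (r role) toRequest() generateRequest {
	req := generateRequest{Models: r.Models, MaxBudget: r.MaxBudget}
	if r.TTL > 0 {
		req.Duration = fmt.Sprintf("%ds", int64(r.TTL.Seconds()))
	}
	return req
}

// encode returns the JSON body for the role, or "" if marshaling fails.
func encode(r role) string {
	b, err := json.Marshal(r.toRequest())
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	restricted := role{Models: []string{"gpt-3.5-turbo", "gpt-4"}, MaxBudget: 50, TTL: time.Hour}
	open := role{TTL: time.Hour}

	fmt.Println(encode(restricted)) // {"models":["gpt-3.5-turbo","gpt-4"],"max_budget":50,"duration":"3600s"}
	fmt.Println(encode(open))       // {"duration":"3600s"}
}
```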
Development
make build # build the plugin into ./dist
make test # unit tests (race-enabled)
make lint # go vet
make fmt # gofmt
make e2e # full end-to-end test in Docker against Vault AND OpenBao
make e2e-vault # e2e against Vault only
make e2e-openbao # e2e against OpenBao only
End-to-end tests
make e2e builds the plugin, spins up a LiteLLM proxy with its Postgres key
store, plus both a Vault and an OpenBao dev server with the plugin mounted, then
exercises the whole lifecycle against each engine: configure → create role →
generate key → call an allowed model → verify a disallowed model is rejected →
revoke → verify the key stops working.
Requires Docker. Bind mounts use the :z flag so the stack works under SELinux
(Fedora/RHEL). Select engines with ENGINES="vault openbao".
Releasing
Versioning is tag-driven:
make patch # v0.1.0 -> v0.1.1
make minor # v0.1.1 -> v0.2.0
make major # v0.2.0 -> v1.0.0
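The bump rules implied by these three targets can be written out as a small Go function. This is only an illustration of the arithmetic; the `bump` helper is hypothetical and says nothing about how the Makefile actually computes or pushes tags.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// bump increments one part of a vMAJOR.MINOR.PATCH tag and resets the
// lower-order parts to zero.
func bump(tag, part string) (string, error) {
	fields := strings.Split(strings.TrimPrefix(tag, "v"), ".")
	if len(fields) != 3 {
		return "", fmt.Errorf("not a vMAJOR.MINOR.PATCH tag: %q", tag)
	}
	var n [3]int
	for i, f := range fields {
		v, err := strconv.Atoi(f)
		if err != nil {
			return "", fmt.Errorf("bad component %q in %q", f, tag)
		}
		n[i] = v
	}
	switch part {
	case "major":
		n[0]++
		n[1], n[2] = 0, 0
	case "minor":
		n[1]++
		n[2] = 0
	case "patch":
		n[2]++
	default:
		return "", fmt.Errorf("unknown part %q", part)
	}
	return fmt.Sprintf("v%d.%d.%d", n[0], n[1], n[2]), nil
}

func main() {
	steps := [][2]string{{"v0.1.0", "patch"}, {"v0.1.1", "minor"}, {"v0.2.0", "major"}}
	for _, s := range steps {
		next, err := bump(s[0], s[1])
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(s[0], "->", next)
	}
}
```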