feat: add litellm to new aitooling ArgoCD project (#94)
Deploys LiteLLM proxy with CNPG PostgreSQL (3-instance HA), PgBouncer pooler, and Redis cache. Introduces a dedicated aitooling AppProject and ApplicationSet to keep AI tooling services separate from platform infra. Reviewed-on: #94
This commit was merged in pull request #94.
This commit is contained in:
@@ -0,0 +1,41 @@
|
||||
---
|
||||
apiVersion: autoscaling/v2
|
||||
kind: HorizontalPodAutoscaler
|
||||
metadata:
|
||||
name: litellm-hpa
|
||||
namespace: litellm
|
||||
spec:
|
||||
scaleTargetRef:
|
||||
apiVersion: apps/v1
|
||||
kind: Deployment
|
||||
name: litellm
|
||||
minReplicas: 2
|
||||
maxReplicas: 10
|
||||
metrics:
|
||||
- type: Resource
|
||||
resource:
|
||||
name: cpu
|
||||
target:
|
||||
type: Utilization
|
||||
averageUtilization: 60
|
||||
behavior:
|
||||
scaleUp:
|
||||
stabilizationWindowSeconds: 0
|
||||
selectPolicy: Max
|
||||
policies:
|
||||
- type: Percent
|
||||
value: 100
|
||||
periodSeconds: 30
|
||||
- type: Pods
|
||||
value: 4
|
||||
periodSeconds: 30
|
||||
scaleDown:
|
||||
stabilizationWindowSeconds: 300
|
||||
selectPolicy: Min
|
||||
policies:
|
||||
- type: Percent
|
||||
value: 10
|
||||
periodSeconds: 60
|
||||
- type: Pods
|
||||
value: 2
|
||||
periodSeconds: 60
|
||||
Reference in New Issue
Block a user