feat: increase litellm resources (#144)

litellm's performance dropped and it crashed in multiple cases, after
which it scaled to the maximum level and used the majority of the
memory in the cluster.

- reduce the rate at which litellm autoscales
- increase the requests/limits to match usage
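
As a rough illustration of why raising the CPU target slows scale-up, the sketch below applies the replica formula from the Kubernetes HorizontalPodAutoscaler documentation, desired = ceil(current * currentUtilization / targetUtilization), clamped to the min/max replica bounds. The function name and the sample utilization figures are hypothetical, and the sketch ignores the HPA tolerance band, stabilization windows, and the scaleUp/scaleDown policies.

```python
import math


def desired_replicas(current, cpu_utilization, target, min_replicas, max_replicas):
    """Simplified HPA replica calculation (hypothetical helper).

    Ignores the tolerance band, stabilization windows and behavior policies.
    """
    raw = math.ceil(current * cpu_utilization / target)
    # Clamp to the bounds configured on the autoscaler.
    return max(min_replicas, min(max_replicas, raw))


# Assumed load: 2 replicas averaging 70% CPU.
old = desired_replicas(2, 70, target=60, min_replicas=2, max_replicas=10)
new = desired_replicas(2, 70, target=80, min_replicas=2, max_replicas=4)
print(old, new)  # 3 2
```

Under this assumed load the old settings would add a replica while the new ones would not, and under heavier load the new settings stop at 4 replicas instead of 10.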

Reviewed-on: #144
This commit was merged in pull request #144.
2026-05-23 17:59:43 +10:00
parent 445d8b6e7e
commit e05f9bfd83
3 changed files with 9 additions and 9 deletions
+4 -4
@@ -10,14 +10,14 @@ spec:
kind: Deployment
name: litellm
minReplicas: 2
-maxReplicas: 10
+maxReplicas: 4
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
-averageUtilization: 60
+averageUtilization: 80
behavior:
scaleUp:
stabilizationWindowSeconds: 0
@@ -25,7 +25,7 @@ spec:
policies:
- type: Percent
value: 100
-periodSeconds: 30
+periodSeconds: 60
- type: Pods
value: 4
periodSeconds: 30
@@ -34,7 +34,7 @@ spec:
selectPolicy: Min
policies:
- type: Percent
-value: 10
+value: 30
periodSeconds: 60
- type: Pods
value: 2