feat: increase litellm resources (#144)
litellm performance had dropped and it had crashed in multiple cases; it then scaled to the maximum replica count, using the majority of the memory in the cluster.

- reduce the rate at which litellm autoscales
- increase the requests/limits to match usage

Reviewed-on: #144
This commit was merged in pull request #144.
@@ -10,14 +10,14 @@ spec:
     kind: Deployment
     name: litellm
   minReplicas: 2
-  maxReplicas: 10
+  maxReplicas: 4
   metrics:
     - type: Resource
       resource:
         name: cpu
         target:
           type: Utilization
-          averageUtilization: 60
+          averageUtilization: 80
   behavior:
     scaleUp:
       stabilizationWindowSeconds: 0
@@ -25,7 +25,7 @@ spec:
       policies:
         - type: Percent
           value: 100
-          periodSeconds: 30
+          periodSeconds: 60
         - type: Pods
           value: 4
           periodSeconds: 30
@@ -34,7 +34,7 @@ spec:
       selectPolicy: Min
       policies:
         - type: Percent
-          value: 10
+          value: 30
           periodSeconds: 60
         - type: Pods
           value: 2
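
To make the effect of the new target and cap concrete, the sketch below approximates the replica count a HorizontalPodAutoscaler recommends for a CPU utilization target, using the documented formula desired = ceil(current * currentUtilization / targetUtilization), clamped to minReplicas/maxReplicas. The function name and the load figures are hypothetical, chosen only for illustration. The sketch assumes the default 10% tolerance and ignores the behavior policies and stabilization windows, which further limit how fast the count may change.

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas, max_replicas, tolerance=0.1):
    """Approximate the HPA replica recommendation for a utilization target.

    Utilization values are percentages of the pods' requested CPU,
    averaged across pods. Hypothetical helper, not part of this commit.
    """
    ratio = current_utilization / target_utilization
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current replica count.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # The recommendation is always clamped to the configured bounds.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # Hypothetical load: 2 replicas averaging 150% of requested CPU.
    before = desired_replicas(2, 150, 60, min_replicas=2, max_replicas=10)
    after = desired_replicas(2, 150, 80, min_replicas=2, max_replicas=4)
    print(before, after)
```

Under these assumed numbers the old settings recommend more replicas than the new ones for the same load, and the new maxReplicas bounds how much memory the deployment can claim in total. Because utilization is measured relative to the CPU request, raising the requests (as the commit message describes) also lowers the reported utilization for the same absolute usage.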