feat: increase litellm resources

litellm performance had degraded and it crashed several times. It then scaled to the maximum replica count and consumed most of the memory in the cluster.

- reduce the rate at which litellm autoscales
- increase the requests/limits to match usage
@@ -76,11 +76,11 @@ spec:
     updateInterval: 30
   resources:
     limits:
-      cpu: 500m
-      memory: 512Mi
+      cpu: 1
+      memory: 1024Mi
     requests:
       cpu: 250m
-      memory: 256Mi
+      memory: 512Mi
   smartShutdownTimeout: 180
   startDelay: 3600
   stopDelay: 1800
@@ -56,10 +56,10 @@ spec:
         resources:
           limits:
             cpu: "2"
-            memory: 6Gi
+            memory: 8Gi
           requests:
             cpu: 250m
-            memory: 2Gi
+            memory: 6Gi
         volumeMounts:
         - mountPath: /app/config.yaml
           name: config
@@ -10,14 +10,14 @@ spec:
     kind: Deployment
     name: litellm
   minReplicas: 2
-  maxReplicas: 10
+  maxReplicas: 4
   metrics:
   - type: Resource
     resource:
       name: cpu
       target:
         type: Utilization
-        averageUtilization: 60
+        averageUtilization: 80
   behavior:
     scaleUp:
       stabilizationWindowSeconds: 0
@@ -25,7 +25,7 @@ spec:
       policies:
       - type: Percent
         value: 100
-        periodSeconds: 30
+        periodSeconds: 60
       - type: Pods
         value: 4
         periodSeconds: 30
@@ -34,7 +34,7 @@ spec:
       selectPolicy: Min
       policies:
       - type: Percent
-        value: 10
+        value: 30
         periodSeconds: 60
       - type: Pods
         value: 2
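
Note on the HPA change: the effect of moving the CPU target from 60 to 80 and maxReplicas from 10 to 4 can be estimated with the replica formula from the Kubernetes HPA documentation, desired = ceil(current * currentUtilization / targetUtilization), clamped to the min/max bounds. The sketch below is only an illustration under stated assumptions: the function name and the example utilization figures are hypothetical, the 0.1 tolerance is the Kubernetes default, and the scaleUp/scaleDown policies and stabilization windows from the diff are deliberately not modelled.

```python
import math


def desired_replicas(current, utilization, target, min_replicas, max_replicas,
                     tolerance=0.1):
    """Approximate the HPA replica calculation (hypothetical helper).

    utilization and target are percentages of the pods' CPU *requests*.
    Rate-limiting policies and stabilization windows are ignored.
    """
    ratio = utilization / target
    # Inside the tolerance band the HPA leaves the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        desired = current
    else:
        desired = math.ceil(current * ratio)
    # Clamp to the configured bounds.
    return max(min_replicas, min(max_replicas, desired))


# Illustrative numbers only: 2 replicas averaging 150% of requested CPU.
old = desired_replicas(2, 150, target=60, min_replicas=2, max_replicas=10)
new = desired_replicas(2, 150, target=80, min_replicas=2, max_replicas=4)
print(old, new)
```

Because utilization is measured against the CPU request rather than the limit, changes to requests shift the point at which scaling starts just as much as the target percentage does.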