argocd-apps/apps/base/litellm/hpa.yaml
unkinben a5e6f12003
ci/woodpecker/pr/pre-commit Pipeline was successful
ci/woodpecker/pr/kubeconform Pipeline was successful
feat: increase litellm resources
litellm performance has dropped and it has crashed in multiple cases. It
then scaled to the maximum level, using the majority of the memory in
the cluster.

- reduce the rate at which litellm autoscales
- increase the requests/limits to match usage
2026-05-23 16:39:33 +10:00


---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: litellm-hpa
  namespace: litellm
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: litellm
  minReplicas: 2
  maxReplicas: 4
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 0
      selectPolicy: Max
      policies:
        - type: Percent
          value: 100
          periodSeconds: 60
        - type: Pods
          value: 4
          periodSeconds: 30
    scaleDown:
      stabilizationWindowSeconds: 300
      selectPolicy: Min
      policies:
        - type: Percent
          value: 30
          periodSeconds: 60
        - type: Pods
          value: 2
          periodSeconds: 60
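# Worked example of how the behavior policies above combine. This is a
# reader's sketch, assuming the documented autoscaling/v2 semantics; it is
# not part of the original manifest and changes nothing when applied.
#
# scaleUp (selectPolicy: Max), starting from minReplicas = 2:
#   Percent 100 per 60s -> may add 100% of 2, allowing up to 4 replicas
#   Pods    4   per 30s -> may add 4 pods, allowing up to 6 replicas
#   Max takes the larger allowance (6), which maxReplicas then caps at 4.
#   With stabilizationWindowSeconds: 0, a single scaling decision can
#   therefore take the Deployment from 2 to 4 replicas.
#
# scaleDown (selectPolicy: Min), starting from maxReplicas = 4:
#   Percent 30 per 60s -> 30% of 4 is 1.2 pods; the Kubernetes docs
#                         describe this pod count as rounded up
#   Pods    2  per 60s -> may remove up to 2 pods
#   Min takes the smaller allowance, so at most 2 pods are removed per
#   60s, and minReplicas keeps the count at 2 or above.
#   stabilizationWindowSeconds: 300 means the controller acts on the
#   highest recommendation from the last 5 minutes, so a scale-down only
#   starts once the lower replica count has been recommended throughout
#   that window.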