Rate Limit
Configure tenant-wide token and request rate limits in FinOps Config.
Rate limits set at the top level of FinOps Config apply globally and can complement the organization- and key-level limits configured elsewhere.
Access
Open FinOps Config → Rate Limit.
Create a rate limit
- Click + Create.
- Configure token limits:
  - Token Max Limit: maximum tokens per window
  - Token Reset Duration: window length (e.g. 1h, 1d)
- Configure request limits:
  - Request Max Limit: maximum requests per window
  - Request Reset Duration: window length
- Optionally scope to a Virtual Key, Provider, or Governed Organization.
- Click Submit.
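To illustrate how a max limit and a reset duration interact, here is a minimal Python sketch of a fixed-window counter. It is not the product's implementation: the names `parse_duration` and `FixedWindowLimit` are hypothetical, the duration format is assumed from the `1h` and `1d` examples above, and fixed (rather than sliding) windows are an assumption.

```python
import re
from dataclasses import dataclass

# Assumed unit suffixes, inferred from the "1h" and "1d" examples.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_duration(text: str) -> int:
    """Convert a duration string such as '1h' or '1d' to seconds."""
    match = re.fullmatch(r"(\d+)([smhd])", text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]


@dataclass
class FixedWindowLimit:
    """Hypothetical model of one limit: a budget that resets every window."""

    max_limit: int          # e.g. Token Max Limit or Request Max Limit
    reset_duration: str     # e.g. "1h"
    used: int = 0
    window_start: float = 0.0

    def try_consume(self, amount: int, now: float) -> bool:
        """Record usage if it fits in the current window; otherwise refuse."""
        window = parse_duration(self.reset_duration)
        if now - self.window_start >= window:
            # The window has elapsed, so start a fresh one with zero usage.
            self.window_start = now
            self.used = 0
        if self.used + amount > self.max_limit:
            return False
        self.used += amount
        return True


# A token limit of 1000 per hour, with timestamps given in seconds.
token_limit = FixedWindowLimit(max_limit=1000, reset_duration="1h")
first = token_limit.try_consume(600, now=0.0)       # fits in the budget
second = token_limit.try_consume(600, now=10.0)     # would exceed 1000
third = token_limit.try_consume(600, now=3600.0)    # window has reset
```

Under these assumptions, the second call is refused because it would push usage past the limit within the same hour, and the third succeeds once the window has elapsed. A request limit would be modeled the same way, consuming 1 per request.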
Monitoring limits
The Simulator shows real-time usage bars under Limits on each governance hierarchy card. When a limit is exceeded, the request is rejected with a rate-limit error before it reaches the upstream provider.
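The ordering described above (check the limit first, call the provider only if it passes) can be sketched as follows. This is an illustrative Python sketch only: `RateLimitExceeded`, `guarded_call`, and `usage_fraction` are hypothetical names, and the actual error format and usage-bar calculation are not specified here.

```python
from typing import Callable


class RateLimitExceeded(Exception):
    """Hypothetical error type standing in for a rate-limit error."""


def guarded_call(used: int, max_limit: int, upstream: Callable[[], str]) -> str:
    """Call the upstream provider only if the limit has not been reached."""
    if used >= max_limit:
        # Reject here, so an over-limit request never reaches the provider.
        raise RateLimitExceeded(f"limit of {max_limit} reached")
    return upstream()


def usage_fraction(used: int, max_limit: int) -> float:
    """Share of the window's budget consumed, capped at 1.0 for display."""
    if max_limit <= 0:
        return 1.0
    return min(used / max_limit, 1.0)
```

In this sketch, `usage_fraction` is the kind of value a usage bar could display, and `guarded_call` shows why an exceeded limit produces an error without any upstream call being made.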
Related
- Organization Rate Limits — per-department throttling
- User Keys — key-scoped limits