Rate Limiting
Sliding window rate limiting with per-plan limits.
Overview
CueAPI rate limits API requests using a sliding window algorithm backed by Redis. Limits are set per plan and enforced per API key.
Per-plan limits
| Plan | Requests per minute |
|---|---|
| Free | 60 |
| Pro | 200 |
| Scale | 500 |
How it works
CueAPI uses a Redis sorted set for each API key:
- Each request adds a timestamp to the sorted set
- Entries older than 60 seconds are removed
- If the count exceeds the limit, the request is rejected with 429
- Rejected requests do not count against the window (preventing feedback loops)
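The steps above can be sketched in a few lines of Python. This is a minimal illustration, not CueAPI's implementation: an in-memory deque stands in for the Redis sorted set, and the class name `SlidingWindowLimiter`, its parameters, and the exact order of the prune, check, and record steps are assumptions.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """In-memory stand-in for the per-key Redis sorted set (illustrative only)."""

    def __init__(self, limit, window_seconds=60.0):
        self.limit = limit
        self.window = window_seconds
        self._hits = {}  # key -> deque of request timestamps, oldest first

    def allow(self, key, now=None):
        """Return True if the request is accepted, False if it should get a 429."""
        now = time.monotonic() if now is None else now
        hits = self._hits.setdefault(key, deque())

        # Remove entries older than the window.
        while hits and hits[0] < now - self.window:
            hits.popleft()

        if len(hits) >= self.limit:
            # Rejected: no timestamp is recorded, so retries do not
            # keep the window full.
            return False

        hits.append(now)
        return True
```

Because a rejected request records nothing, a client that keeps retrying while limited does not push its own recovery further out; the window drains as the accepted timestamps age past 60 seconds.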
Response headers
Every response includes rate limit headers:
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 58
When rate limited:
HTTP/1.1 429 Too Many Requests
Retry-After: 12
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 0
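On the client side, a 429 response can be handled by waiting for the number of seconds given in Retry-After. The helper below is a hedged sketch: the function name `retry_delay`, the `default_delay` fallback, and the plain-dict representation of the response headers are assumptions, not part of any CueAPI client.

```python
def retry_delay(status_code, headers, default_delay=1.0):
    """Return how many seconds to wait before retrying (0.0 if not rate limited)."""
    if status_code != 429:
        return 0.0

    # Header names are case-insensitive, so normalize before looking them up.
    normalized = {name.lower(): value for name, value in headers.items()}
    try:
        return max(0.0, float(normalized["retry-after"]))
    except (KeyError, ValueError):
        # Header missing or not a plain number of seconds: use the fallback.
        return default_delay
```

For the rate-limited response shown above, `retry_delay(429, {"Retry-After": "12"})` yields a 12-second wait.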
Exempt endpoints
These endpoints are not rate limited:
- GET /health
- GET /status
- POST /v1/billing/webhook (Stripe webhooks)
- GET /auth/device (verification page)
- GET /v1/auth/verify (magic link)
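An exemption check of this kind is often a set lookup on method and path, performed before the rate limiter runs. The sketch below assumes exact path matching and uses the hypothetical names `EXEMPT_ENDPOINTS` and `is_exempt`; how CueAPI matches routes internally is not specified here.

```python
# Exempt (method, path) pairs, taken from the list above.
EXEMPT_ENDPOINTS = {
    ("GET", "/health"),
    ("GET", "/status"),
    ("POST", "/v1/billing/webhook"),
    ("GET", "/auth/device"),
    ("GET", "/v1/auth/verify"),
}


def is_exempt(method, path):
    """Return True if the request should bypass rate limiting."""
    return (method.upper(), path) in EXEMPT_ENDPOINTS
```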
Unauthenticated requests
Requests without an API key are rate limited by IP address at 60 requests per minute.
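Putting the per-plan limits and the IP fallback together, the choice of which counter and which limit applies to a request might look like the sketch below. The function name `resolve_bucket`, the lowercase plan names, and the `key:` / `ip:` prefixes are illustrative assumptions; only the numeric limits come from this page.

```python
# Requests per minute, from the per-plan table.
PLAN_LIMITS = {"free": 60, "pro": 200, "scale": 500}
UNAUTHENTICATED_LIMIT = 60


def resolve_bucket(api_key, plan, client_ip):
    """Return (bucket_id, requests_per_minute) for a request."""
    if api_key:
        # Authenticated: one window per API key, sized by the key's plan.
        return f"key:{api_key}", PLAN_LIMITS[plan]
    # No API key: fall back to one window per client IP address.
    return f"ip:{client_ip}", UNAUTHENTICATED_LIMIT
```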
Graceful degradation
If Redis is unavailable, rate limiting is skipped entirely and requests proceed without rate limit checks. This fail-open behavior prevents a Redis outage from blocking all API traffic.
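This fail-open behavior can be sketched as a thin wrapper around the limiter check. The names `allow_request` and `check` are hypothetical, and Python's built-in `ConnectionError` stands in for whatever error the Redis client raises when the server is unreachable; a real integration would catch that client's own exception type.

```python
def allow_request(check, key):
    """Run the rate limit check, but let the request through if the store is down.

    `check` is any callable that returns True (allow) or False (reject)
    and raises ConnectionError when the backing store cannot be reached.
    """
    try:
        return check(key)
    except ConnectionError:
        # Fail open: skip rate limiting rather than reject traffic.
        return True
```

The trade-off is deliberate: during an outage the API is briefly unprotected by rate limits, but it stays available.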
Usage warnings
When you approach your monthly execution limit, CueAPI adds a warning header:
X-CueAPI-Usage-Warning: approaching_limit
This appears at 80% of your plan's monthly execution limit.
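A client can watch for this header and surface it before the limit is reached. The sketch below assumes response headers are available as a plain dict; the function names are hypothetical, and treating "at 80%" as "at or above 80%" is an assumption about the server-side rule.

```python
def usage_warning(headers):
    """Return the warning value (e.g. 'approaching_limit'), or None if absent."""
    # Header names are case-insensitive, so normalize before looking them up.
    normalized = {name.lower(): value for name, value in headers.items()}
    return normalized.get("x-cueapi-usage-warning")


def is_approaching_limit(executions_used, monthly_limit):
    """Mirror of the 80% rule, using integer math to avoid rounding issues."""
    return executions_used * 100 >= monthly_limit * 80
```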