Rate Limiting

Sliding window rate limiting with per-plan limits.

Overview

CueAPI rate limits API requests using a sliding window algorithm backed by Redis. Limits are set per plan and enforced per API key.

Per-plan limits

Plan	Requests per minute
Free	60
Pro	200
Scale	500

How it works

CueAPI uses a Redis sorted set for each API key:

Each request adds a timestamp to the sorted set
Entries older than 60 seconds are removed
If the count exceeds the limit, the request is rejected with HTTP 429 (Too Many Requests)
Rejected requests do not count against the window, so a client that keeps retrying while limited does not extend its own lockout
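
The steps above can be sketched in a few lines of Python. This is a minimal in-memory illustration, not CueAPI's implementation: a deque per key stands in for the Redis sorted set, and the class and method names are hypothetical. It checks the count before recording the timestamp, which has the same effect as adding an entry and then discarding it on rejection.

```python
from collections import deque


class SlidingWindowLimiter:
    """In-memory stand-in for the per-key Redis sorted set (illustrative only)."""

    def __init__(self, limit: int, window_seconds: float = 60.0):
        self.limit = limit
        self.window = window_seconds
        self.hits = {}  # key -> deque of request timestamps, oldest first

    def allow(self, key: str, now: float) -> bool:
        timestamps = self.hits.setdefault(key, deque())
        # Remove entries older than the window.
        while timestamps and timestamps[0] < now - self.window:
            timestamps.popleft()
        # At the limit already: reject without recording this request.
        if len(timestamps) >= self.limit:
            return False
        timestamps.append(now)
        return True
```

Passing `now` explicitly keeps the sketch deterministic; a real caller would pass `time.time()`.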

Response headers

Every response includes rate limit headers:

X-RateLimit-Limit: 60
X-RateLimit-Remaining: 58

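When a request is rate limited, the response also includes Retry-After, the number of seconds to wait before retrying: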

HTTP/1.1 429 Too Many Requests
Retry-After: 12
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 0
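
On the client side, one way to honor these headers is to read Retry-After whenever a 429 comes back. The helper below is a hedged sketch: the function name and the one-second fallback are assumptions, and it expects a plain mapping of header names to string values (many HTTP clients expose a case-insensitive equivalent).

```python
from typing import Mapping, Optional


def seconds_to_wait(
    status: int, headers: Mapping[str, str], default: float = 1.0
) -> Optional[float]:
    """Return how long to pause before retrying, or None if not rate limited."""
    if status != 429:
        return None
    raw = headers.get("Retry-After")
    try:
        # The documented form is an integer number of seconds, e.g. "12".
        return max(0.0, float(raw))
    except (TypeError, ValueError):
        # Header missing or not numeric: fall back to an assumed default.
        return default
```

A caller would sleep for the returned number of seconds and then resend the request.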

Exempt endpoints

These endpoints are not rate limited:

GET /health
GET /status
POST /v1/billing/webhook (Stripe webhooks)
GET /auth/device (verification page)
GET /v1/auth/verify (magic link)
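
As an illustration, an exemption check could be a lookup on method and path. The sketch below assumes exact path matching and uses a hypothetical function name; how CueAPI actually matches routes is not stated here.

```python
# Method and path pairs taken from the list above.
EXEMPT_ENDPOINTS = {
    ("GET", "/health"),
    ("GET", "/status"),
    ("POST", "/v1/billing/webhook"),
    ("GET", "/auth/device"),
    ("GET", "/v1/auth/verify"),
}


def is_exempt(method: str, path: str) -> bool:
    """True if the request should bypass rate limiting (exact match assumed)."""
    return (method.upper(), path) in EXEMPT_ENDPOINTS
```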

Unauthenticated requests

Requests without an API key are rate limited by IP address at 60 requests per minute.
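
Put together with the per-plan table, choosing the rate limit bucket might look like the sketch below. The `key:` and `ip:` prefixes, the lowercase plan names, and the function name are assumptions made for illustration only.

```python
from typing import Optional, Tuple

# Requests per minute, from the per-plan table.
PLAN_LIMITS = {"free": 60, "pro": 200, "scale": 500}
UNAUTHENTICATED_LIMIT = 60


def bucket_for(
    api_key: Optional[str], plan: Optional[str], client_ip: str
) -> Tuple[str, int]:
    """Return (bucket identifier, requests per minute) for a request."""
    if api_key:
        # Authenticated: limit by API key at the plan's rate.
        return f"key:{api_key}", PLAN_LIMITS[plan]
    # No API key: limit by IP address.
    return f"ip:{client_ip}", UNAUTHENTICATED_LIMIT
```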

Graceful degradation

If Redis is unavailable, rate limiting is skipped entirely and requests proceed without rate limit checks. This fail-open behavior prevents a Redis outage from blocking all API traffic.
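
This pattern is commonly called failing open. A minimal sketch, assuming the limiter check is passed in as a callable and using Python's built-in `ConnectionError` as a stand-in for whatever exception the Redis client raises:

```python
from typing import Callable


def allow_request(check: Callable[[str], bool], key: str) -> bool:
    """Run the rate limit check, but let the request through if the store is down."""
    try:
        return check(key)
    except ConnectionError:
        # Store unreachable: skip rate limiting rather than block traffic.
        return True
```

The trade-off is that limits are not enforced for the duration of the outage.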

Usage warnings

When you approach your monthly execution limit, CueAPI adds a warning header:

X-CueAPI-Usage-Warning: approaching_limit

This appears at 80% of your plan's monthly execution limit.
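
The threshold check amounts to simple arithmetic. In the sketch below, the function name is hypothetical and "at 80%" is read as "at or above 80%", which is an assumption.

```python
def usage_headers(executions_used: int, monthly_limit: int) -> dict:
    """Return the warning header once usage reaches 80% of the monthly limit."""
    # Integer comparison avoids floating-point rounding at the boundary.
    if monthly_limit > 0 and executions_used * 100 >= monthly_limit * 80:
        return {"X-CueAPI-Usage-Warning": "approaching_limit"}
    return {}
```

For example, with a hypothetical limit of 1,000 executions, the header would first appear on the 800th.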

Best practices
