Rate Limiting

All Bookable API endpoints are subject to rate limiting to ensure fair usage and consistent performance across all integration partners.

Overview

Property	Value
Limit	200 requests per 60-second window
Scope	Per OAuth2 client (`sub` claim)
Exceeded response	`HTTP 429 Too Many Requests`
Error code	`RATE-R-001`

Rate limits are applied per OAuth2 client identity — identified by the sub claim in your access token. This means your limit is shared across all servers, processes, and requests that authenticate using the same client credentials. If you run multiple workers or services under one client, their requests count toward a single shared quota.

Response headers

Every API response includes headers so you can monitor your consumption in real time:

Header	Present on	Description
`X-RateLimit-Limit`	All responses	Total number of requests permitted in the current window (e.g. `200`)
`X-RateLimit-Remaining`	All responses	Number of requests remaining before you are throttled (e.g. `150`)
`X-RateLimit-Reset`	All responses	Unix timestamp (seconds since epoch) when the current window resets
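
The three headers above can be read into a typed value in one place. The following is a minimal sketch, not an official Bookable helper: `HeaderReader`, `RateLimitInfo`, and `parseRateLimitHeaders` are hypothetical names, and the function assumes each header carries a plain non-negative integer, as in the examples on this page.

```typescript
// Minimal structural type (hypothetical); the standard fetch Headers class satisfies it.
interface HeaderReader {
  get(name: string): string | null;
}

// Hypothetical shape for the parsed values.
interface RateLimitInfo {
  limit: number;     // from X-RateLimit-Limit
  remaining: number; // from X-RateLimit-Remaining
  resetAt: Date;     // from X-RateLimit-Reset (Unix seconds converted to a Date)
}

// Returns null if any of the three headers is missing or is not a plain integer.
function parseRateLimitHeaders(headers: HeaderReader): RateLimitInfo | null {
  const read = (name: string): number | null => {
    const raw = headers.get(name);
    if (raw === null || !/^\d+$/.test(raw.trim())) return null;
    return Number.parseInt(raw.trim(), 10);
  };

  const limit = read("X-RateLimit-Limit");
  const remaining = read("X-RateLimit-Remaining");
  const resetSeconds = read("X-RateLimit-Reset");
  if (limit === null || remaining === null || resetSeconds === null) return null;

  return { limit, remaining, resetAt: new Date(resetSeconds * 1000) };
}
```

With a `fetch` response, this would be called as `parseRateLimitHeaders(response.headers)`. Returning `null` rather than guessing defaults keeps a missing header visible to the caller.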

Example response headers

HTTP/1.1 200 OK
X-RateLimit-Limit: 200
X-RateLimit-Remaining: 150
X-RateLimit-Reset: 1741651200

When the limit is exceeded:

HTTP/1.1 429 Too Many Requests
X-RateLimit-Limit: 200
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1741651200
Content-Type: application/problem+json

{
  "type": "https://tools.ietf.org/html/rfc6585#section-4",
  "title": "Too Many Requests",
  "status": 429,
  "detail": "You have exceeded the rate limit of 200 requests per 60-second window.",
  "code": "RATE-R-001",
  "isRetryable": true
}
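
To tell a rate limit error apart from other problem responses, a client can check the `status` and `code` fields of the body shown above. The sketch below is illustrative only: `RateLimitProblem` and `isRateLimitProblem` are hypothetical names, and the field list is taken from the example body rather than from a published schema.

```typescript
// Hypothetical type mirroring the example 429 body on this page.
interface RateLimitProblem {
  type: string;
  title: string;
  status: 429;
  detail: string;
  code: "RATE-R-001";
  isRetryable: boolean;
}

// Type guard: true only when every field from the example body is present
// with the expected type, status is 429, and code is RATE-R-001.
function isRateLimitProblem(body: unknown): body is RateLimitProblem {
  if (typeof body !== "object" || body === null) return false;
  const candidate = body as Record<string, unknown>;
  return (
    typeof candidate.type === "string" &&
    typeof candidate.title === "string" &&
    candidate.status === 429 &&
    typeof candidate.detail === "string" &&
    candidate.code === "RATE-R-001" &&
    typeof candidate.isRetryable === "boolean"
  );
}
```

After `const body: unknown = await response.json()`, a check such as `isRateLimitProblem(body) && body.isRetryable` lets the caller decide whether to schedule a retry.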

How the window works

Bookable uses a fixed 60-second window. At the start of each window, your remaining quota is reset to the full limit (200). The X-RateLimit-Reset header on every response contains the Unix timestamp when the current window resets — subtract the current time to calculate how long to wait.

Window 1 (0s–60s)       Window 2 (60s–120s)
│◄──────────────────►│◄──────────────────►│
│                    │                    │
│  200 requests      │  Quota resets to   │
│  consumed          │  200 again         │
│  → 429 returned    │                    │
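
As a worked example of the subtraction described above: with `X-RateLimit-Reset: 1741651200` and a current Unix time of 1741651170, the window resets in 1741651200 − 1741651170 = 30 seconds. A minimal sketch of that calculation follows; `msUntilReset` is a hypothetical helper name, and it assumes the client clock is reasonably close to the server clock.

```typescript
// Milliseconds until the window resets, given the raw X-RateLimit-Reset value.
// nowMs is injectable so the calculation can be checked with fixed inputs.
function msUntilReset(resetHeader: string | null, nowMs: number = Date.now()): number {
  const resetSeconds = Number.parseInt(resetHeader ?? "", 10);
  // Missing or unparsable header: report no known wait rather than NaN.
  if (Number.isNaN(resetSeconds)) return 0;
  // The header is in seconds, Date.now() is in milliseconds.
  // Clamp at 0 when the reset time has already passed.
  return Math.max(resetSeconds * 1000 - nowMs, 0);
}
```

If the client clock may drift, adding a small safety margin to the result is a reasonable precaution.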

Handling rate limit errors

When you receive a 429, read the X-RateLimit-Reset header and wait until that Unix timestamp before retrying. Retrying immediately will continue to return 429 and waste time.

async function requestWithRetry(
  url: string,
  options: RequestInit,
  maxRetries = 3
): Promise<Response> {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const response = await fetch(url, options);

    if (response.status !== 429) {
      return response;
    }

    if (attempt === maxRetries) {
      throw new Error(`Rate limit exceeded after ${maxRetries} retries`);
    }

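    // X-RateLimit-Reset is a Unix timestamp in seconds; Date.now() is in milliseconds.
    const resetAt = parseInt(response.headers.get("X-RateLimit-Reset") ?? "", 10);
    // Wait at least 1s, including when the header is missing, unparsable, or already in the past.
    const waitMs = Number.isNaN(resetAt) ? 1000 : Math.max(resetAt * 1000 - Date.now(), 1000);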
    console.warn(`Rate limited. Retrying in ${Math.ceil(waitMs / 1000)}s...`);
    await new Promise((resolve) => setTimeout(resolve, waitMs));
  }

  throw new Error("Unreachable");
}

Proactive monitoring

Rather than waiting for a 429, inspect the X-RateLimit-Remaining header on every response to stay ahead of your limit.

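// Logs a warning once 80% or more of the current window's quota has been used.
function checkRateLimit(response: Response): void {
  // Fall back to the default limit of 200 if a header is missing.
  const remaining = parseInt(response.headers.get("X-RateLimit-Remaining") ?? "200", 10);
  const limit = parseInt(response.headers.get("X-RateLimit-Limit") ?? "200", 10);
  const usagePercent = ((limit - remaining) / limit) * 100;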

  if (usagePercent >= 80) {
    console.warn(`Rate limit at ${usagePercent.toFixed(0)}% — ${remaining} requests remaining`);
  }
}
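
Beyond logging a warning, a client could use the same headers to slow itself down. The sketch below spreads the remaining quota evenly over the time left in the window. It is one possible approach, not a documented Bookable recommendation, and `pacingDelayMs` is a hypothetical name.

```typescript
// Suggested delay before the next request so that `remaining` requests are
// spread evenly across the time left in the current window.
function pacingDelayMs(
  remaining: number,
  resetAtSeconds: number,
  nowMs: number = Date.now()
): number {
  const msLeft = Math.max(resetAtSeconds * 1000 - nowMs, 0);
  // Quota exhausted: wait for the window to reset.
  if (remaining <= 0) return msLeft;
  return Math.floor(msLeft / remaining);
}
```

For example, 10 requests remaining with 30 seconds left in the window gives a delay of 3000 ms. Because the quota is shared per client, several workers pacing independently would each see the same `remaining` value, so the delay would need to be scaled by the number of workers.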

Best practices

Batch where possible
Use availability search with date ranges rather than issuing one request per date. Fewer, richer requests are more efficient than many narrow ones.

Use X-Partner-Reference
Always set the X-Partner-Reference header on your requests. This makes it significantly easier to correlate your API calls with rate limit activity in support investigations.

Avoid thundering herd patterns
If you have multiple workers, stagger their startup and avoid synchronized polling intervals. Workers hitting the API in lockstep can exhaust the shared window in a burst.

Cache aggressively
Venue and product data changes infrequently. Cache responses locally for a few minutes rather than re-fetching on every user action.

Respect X-RateLimit-Reset exactly
Do not retry before the X-RateLimit-Reset timestamp has passed. Early retries return another 429 and do not reset the clock.
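
Two of the practices above, staggered polling and local caching, can be sketched in a few lines. This is an illustration under stated assumptions rather than a prescribed implementation: `jitteredIntervalMs` and `TtlCache` are hypothetical names, the 20% spread is an arbitrary default, and the cache is in-memory and per-process only.

```typescript
// Returns the base interval shifted by a random offset of up to +/- `spread`
// (a fraction of the base), so workers do not poll in lockstep.
function jitteredIntervalMs(
  baseMs: number,
  spread = 0.2,
  random: () => number = Math.random
): number {
  const offset = (random() * 2 - 1) * spread; // in [-spread, +spread)
  return Math.round(baseMs * (1 + offset));
}

// Small in-memory cache whose entries expire after ttlMs milliseconds.
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns the cached value, or undefined if absent or expired.
  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}
```

A worker might schedule its next poll with `setTimeout(poll, jitteredIntervalMs(30_000))` and check a `TtlCache` with a TTL of a few minutes before requesting venue or product data again.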

Frequently asked questions

Does the limit apply per endpoint or globally?
The 200 requests/60s limit is global across all API endpoints for your client. A mix of booking lookups and availability checks all count against the same quota.

Do failed requests (4xx, 5xx) count against my limit?
Yes. All requests that reach the API, regardless of outcome, consume quota.

What happens to requests in flight when the window resets?
In-flight requests that arrived before the window reset are counted against the previous window. New requests after the reset begin consuming the refreshed quota.

Can my limit be increased?
Contact your Bookable account team to discuss higher limits if your integration consistently approaches the default threshold.
