Rate limits
The Stripe API uses a number of safeguards against bursts of incoming traffic to help maximize its stability. Users who send many requests in quick succession might see error responses that show up as status code 429
. We have several limiters in the API, including:
A rate limiter that limits the number of requests received by the API within any given second.
For most APIs, Stripe allows up to 100 read operations per second and 100 write operations per second in live mode, and 25 operations per second for each in test mode.
For the Files API, Stripe allows up to 20 read operations per second and 20 write operations per second in both live mode and test mode. Live mode and test mode limits are separate.
For the Search API, Stripe allows up to 20 read operations per second which applies for all search endpoints in both live mode and test mode. Live mode and test mode limits are separate.
A concurrency limiter that limits the number of requests that are active at any given time. Problems with this limiter are less common compared to the request rate limiter, but it’s more likely to result in resource-intensive, long-lived requests.
Treat these limits as maximums and don’t generate unnecessary load. See Handling limiting gracefully for advice on handling 429s. If you suddenly see a rising number of rate limited requests, please contact support.
We may reduce limits to prevent abuse, or increase limits to enable high-traffic applications. To request an increased rate limit, please contact support. If you’re requesting a large increase, contact us 6 weeks in advance of when you’ll need the increased rate limit.
Common causes and mitigations
Rate limiting can occur under a variety of conditions, but it’s most common in these scenarios:
- Running a large volume of closely-spaced requests can lead to rate limiting. Often this is part of an analytical or migration operation. When engaging in these activities, you should try to control the request rate on the client side (see Handling limiting gracefully).
- Issuing many long-lived requests can trigger limiting. Requests vary in the amount of Stripe’s server resources they use, and more resource-intensive requests tend to take longer and run the risk of causing new requests to be shed by the concurrency limiter. Resource requirements vary widely, but list requests and requests that include expansions generally use more resources and take longer to run. We suggest profiling the duration of Stripe API requests and watching for timeouts to try and spot those that are unexpectedly slow.
- A sudden increase in charge volume like a flash sale might result in rate limiting. We try to set our rates high enough that the vast majority of users are never rate limited for legitimate payment traffic, but if you suspect that an upcoming event might push you over the limits listed above, please contact support.
Handling limiting gracefully
A basic technique for integrations to gracefully handle limiting is to watch for 429
status codes and build in a retry mechanism. The retry mechanism should follow an exponential backoff schedule to reduce request volume when necessary. We’d also recommend building some randomness into the backoff schedule to avoid a thundering herd effect.
You can only optimize individual requests to a limited degree, so an even more sophisticated approach would be to control traffic to Stripe at a global level, and throttle it back if you detect substantial rate limiting. A common technique for controlling rate is to implement something like a token bucket rate limiting algorithm on the client-side. Ready-made and mature implementations for token bucket are available in almost any programming language.
Object lock timeouts
Integrations may encounter errors with HTTP status 429
, code lock_timeout
, and this message:
This object cannot be accessed right now because another API request or Stripe process currently accessing it. If you see this error intermittently, retry the request. If you see this error frequently and are making multiple concurrent requests to a single object, make your requests serially or at a lower rate.
The Stripe API locks objects on some operations so that concurrent workloads don’t interfere and produce an inconsistent result. The error above is caused by a request trying to acquire a lock that’s already held elsewhere, and timing out after it couldn’t be acquired in time.
Lock timeouts have a different cause than rate limiting, but their mitigations are similar. As with rate limiting errors, we recommend retrying on an exponential backoff schedule (see Handling limiting gracefully). But unlike rate limiting errors, the automatic retry mechanisms built into Stripe’s client libraries retry 429
s caused by lock timeouts:
Lock contention is caused by concurrent access on related objects. Integrations can vastly reduce this by making sure that mutations on the same object are queued up and run sequentially instead. Concurrent operations against the API are still okay, but try to make sure simultaneous operations operate only on unique objects. It’s also possible to see lock contention caused by a conflict with an internal Stripe background process—this should be rare, but because it’s beyond user control, we recommend that all integrations are able to retry requests.
Load testing
It’s common for users to prepare for a major sales event by load testing their systems, with the Stripe API running in test mode as part of it. We generally discourage this practice because API limits are lower in test mode, so the load test is likely to hit limits that it wouldn’t hit in production. Test mode is also not a perfect stand-in for live API calls, and that can be somewhat misleading. For example, creating a charge in live mode sends a request to a payment gateway and that request is mocked in test mode, resulting in significantly different latency profiles.
As an alternative, we recommend building integrations so that they have a configurable system for mocking out requests to the Stripe API, which can be enabled for load tests. For realistic results, they should simulate latency by sleeping for a time that you determine by sampling the durations of real live mode Stripe API calls, as seen from the perspective of the integration.