Current thinking is to add more batch inference examples so that clients can avoid this issue. We can still look for ways to back off automatically in the future.
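As a rough illustration of the client-side approach, here is a minimal sketch that caps the number of in-flight requests with an `asyncio.Semaphore`. The names `run_bounded`, `generate`, and `max_in_flight` are hypothetical and not part of the client; `generate` stands in for whatever coroutine sends a single request.

```python
import asyncio


async def run_bounded(prompts, generate, max_in_flight=8):
    """Run generate(prompt) for every prompt, with at most max_in_flight at once.

    `generate` is a hypothetical stand-in for the client's async request call.
    """
    semaphore = asyncio.Semaphore(max_in_flight)

    async def one(prompt):
        # Wait for a free slot before sending, so the deployment queue
        # never sees more than max_in_flight requests from this client.
        async with semaphore:
            return await generate(prompt)

    # gather preserves the order of the inputs in its results.
    return await asyncio.gather(*(one(p) for p in prompts))
```

A suitable `max_in_flight` depends on the deployment's queue capacity, which the caller would have to know or tune.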
Feature request
Have the (async) client automatically back off from sending requests when the deployment is overloaded.
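One possible shape for this behavior is retrying with exponential backoff and jitter. The sketch below is an assumption about how it could look, not the client's actual implementation: `OverloadedError`, `call_with_backoff`, and all parameters are hypothetical names, and `send` stands in for the coroutine that issues one request.

```python
import asyncio
import random


class OverloadedError(Exception):
    """Hypothetical stand-in for the error raised when the deployment is overloaded."""


async def call_with_backoff(send, *, max_retries=5, base_delay=0.5,
                            max_delay=30.0, sleep=asyncio.sleep):
    """Await send(), retrying with exponential backoff while it raises OverloadedError."""
    for attempt in range(max_retries + 1):
        try:
            return await send()
        except OverloadedError:
            if attempt == max_retries:
                raise  # retries exhausted: surface the error to the caller
            # Double the delay ceiling on each attempt, capped at max_delay.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter: a random wait keeps many clients from retrying in lockstep.
            await sleep(random.uniform(0, ceiling))
```

The `sleep` parameter exists only so the wait can be replaced in tests. Other errors propagate immediately, since only overload is assumed to be worth retrying.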
Motivation
When the async client exceeds the deployment queue capacity / rate limits, it currently fails with