# Tokens-per-minute rate limiting

<img src="https://mintcdn.com/agno-v2-agui/B_u8TpSpMOBFKmtg/images/tpm_issues.png?fit=max&auto=format&n=B_u8TpSpMOBFKmtg&q=85&s=f4903a438c1a689f79ea00ffbd95ae3a" alt="Tokens-per-minute rate limit issue" width="698" height="179" data-path="images/tpm_issues.png" />

If a proprietary model provider (such as OpenAI) rate limits your requests, set `exponential_backoff=True` on the agent and adjust `delay_between_retries`. The delay is given in seconds and defaults to 1.

For example:

```python
from agno.agent import Agent
from agno.models.openai import OpenAIChat

agent = Agent(
    model=OpenAIChat(id="gpt-5-mini"),
    description="You are an enthusiastic news reporter with a flair for storytelling!",
    markdown=True,
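    # Back off exponentially between retries when the provider rate limits us
    exponential_backoff=True,
    # Delay between retries, in seconds (defaults to 1)
    delay_between_retries=2,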
)
agent.print_response("Tell me about a breaking news story from New York.", stream=True)
```
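
To make the effect of these two settings more concrete, here is a minimal sketch of how retry delays could be scheduled with and without exponential backoff. It is an illustration under stated assumptions, not Agno's actual implementation: `RateLimitError`, `backoff_delays`, and `call_with_retries` are hypothetical names, and the doubling schedule (`base_delay * 2 ** attempt`) is a common convention assumed here.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical stand-in for a provider's rate-limit error."""


def backoff_delays(base_delay: float, retries: int, exponential: bool) -> List[float]:
    """Return the wait (in seconds) before each retry."""
    if exponential:
        # Assumed schedule: double the wait after every failed attempt.
        return [base_delay * (2 ** attempt) for attempt in range(retries)]
    # Without backoff, every retry waits the same fixed delay.
    return [base_delay] * retries


def call_with_retries(
    fn: Callable[[], T],
    retries: int,
    base_delay: float,
    exponential: bool = True,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, sleeping and retrying whenever it raises RateLimitError."""
    for delay in backoff_delays(base_delay, retries, exponential):
        try:
            return fn()
        except RateLimitError:
            sleep(delay)
    # Final attempt: any error now propagates to the caller.
    return fn()


if __name__ == "__main__":
    print(backoff_delays(2, 4, exponential=True))   # [2, 4, 8, 16]
    print(backoff_delays(2, 4, exponential=False))  # [2, 2, 2, 2]
```

With a tokens-per-minute limit, growing delays give the provider's quota window more time to reset before the next attempt, whereas a fixed short delay may keep hitting the same limit.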

See our [models documentation](/models/overview) for specific information about rate limiting.

OpenAI applies tier-based rate limits. See the [usage tiers documentation](https://platform.openai.com/docs/guides/rate-limits/usage-tiers) for more information.
