Perplexity AI offers a specialized API platform focused on search-grounded LLM responses (the Sonar models). Like OpenAI, it uses a tiered system where your limits increase as you spend more.
1. Step-by-Step: Generate Perplexity API Key
The API is managed through the Perplexity Console, which is separate from the standard chat interface.
Add Payment Method: Unlike the chat "Pro" subscription, the API requires prepaid credits.
Navigate to the Billing tab.
Add a credit card and purchase a minimum of $5 in credits.
Generate Key: Go to the API Keys section.
Click + Generate.
Copy the key immediately. It starts with pplx-. Perplexity will not show it again for security reasons.
Set Up Auto-Top-Up (Optional): To prevent your scripts from breaking when credits run out, enable "Automatic Top-up" to refresh your balance when it falls below $2.
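Once the key is stored in an environment variable, a minimal request sketch looks like this (standard library only; the endpoint and payload shape assume Perplexity's OpenAI-compatible chat completions API, and PPLX_API_KEY is a variable name chosen for this example):

```python
import os
import json
import urllib.request

def build_sonar_request(prompt, model="sonar"):
    """Build (but do not send) a chat-completions request for the Perplexity API.

    Endpoint and payload follow the OpenAI-compatible format Perplexity
    documents; double-check both against the console docs for your account.
    """
    api_key = os.environ["PPLX_API_KEY"]  # never hardcode the pplx- key
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        "https://api.perplexity.ai/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
```

Send the returned request with `urllib.request.urlopen(req)` (or swap in `requests`/an SDK) once your credits are in place.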
2. API Usage Tiers
Your "Tier" is determined by your total cumulative spend on the platform.
| Tier | Total Credits Purchased | Status | Monthly Spend Limit |
|------|-------------------------|--------|---------------------|
| Tier 0 | $0 | New/Trial | $5 (One-time) |
| Tier 1 | $50+ | Light Usage | $100 |
| Tier 2 | $250+ | Regular Usage | $500 |
| Tier 3 | $500+ | Heavy Usage | $1,000 |
| Tier 4 | $1,000+ | Production | $5,000 |
| Tier 5 | $5,000+ | Enterprise | Custom ($200k+) |
3. Model-Wise API Usage Limits
Perplexity uses Requests Per Minute (RPM) as the primary throttle. Limits for Tier 1 are shown below; these scale significantly as you reach Tier 3 and above.
| Model Category | Specific Models | Tier 1 RPM | Tier 3 RPM |
|----------------|-----------------|------------|------------|
| Sonar (Standard) | sonar, sonar-pro | 150 | 1,000 |
| Reasoning | sonar-reasoning, sonar-reasoning-pro | 150 | 1,000 |
| Deep Research | sonar-deep-research | 10 | 40 |
| Async Search | POST /v1/async/sonar | 10 | 40 |
| Search (Raw) | POST /search | 50 QPS* | 50 QPS |
*QPS = Queries Per Second. The Search API is throttled per second rather than per minute, allowing bursts of up to 50 requests in a single second.
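To stay under an RPM cap client-side, a simple approach is to space requests by a minimum interval. A minimal sketch (the RpmThrottle class is illustrative, not part of any SDK):

```python
import time

class RpmThrottle:
    """Space out calls so they never exceed a requests-per-minute cap.

    Example: RpmThrottle(150) for sonar at Tier 1 enforces one request
    every 0.4 seconds at most.
    """
    def __init__(self, rpm):
        self.min_interval = 60.0 / rpm  # seconds between requests
        self._last = 0.0

    def wait(self):
        """Sleep just long enough to honor the cap, then record the call."""
        now = time.monotonic()
        delay = self._last + self.min_interval - now
        if delay > 0:
            time.sleep(delay)
        self._last = time.monotonic()
```

Call `throttle.wait()` immediately before each API request; this avoids 429s entirely rather than recovering from them.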
Important Implementation Notes
Deep Research Limits: The sonar-deep-research model has much lower rate limits because it performs multiple autonomous steps (searching, browsing, and reasoning) per single request.
Error Handling: If you receive a 429 error, Perplexity recommends reading the x-ratelimit-remaining headers in the API response to calculate exactly when your "bucket" will refill.
Context Windows: While models like sonar-pro support large contexts, sending very large prompts frequently may trigger token-based throttling even if you stay under the RPM limit.
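The header-based approach from the error-handling note can be sketched as a small helper. The header names (x-ratelimit-remaining, x-ratelimit-reset) follow the common convention; verify them against the actual response headers your account returns, since some APIs report the reset as a timestamp rather than a delay:

```python
def seconds_until_refill(headers):
    """Estimate how long to wait based on rate-limit response headers.

    Assumes x-ratelimit-reset is the number of seconds until the
    bucket refills; adjust if your responses use a different scheme.
    """
    remaining = int(headers.get("x-ratelimit-remaining", 0))
    if remaining > 0:
        return 0.0  # budget left, no need to wait
    return float(headers.get("x-ratelimit-reset", 1.0))
```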
1. Log In to the Platform
Visit the OpenAI developer platform at platform.openai.com. Log in if you already have an OpenAI account, or Sign up to create a new one.
Note: Your ChatGPT Plus subscription does not cover API usage. The API operates on a separate "pay-as-you-go" credit system.
2. Initial Setup (For New Users)
If this is your first time on the developer platform, you may be prompted to:
Create an Organization: Give it a name (e.g., "Personal" or "Project-X").
Set your Role: Select whether you are a developer, researcher, or hobbyist.
3. Navigate to API Keys
On the left-hand sidebar menu, look for the "Settings" icon (gear icon) or the "Dashboard" link.
Select API keys from the submenu.
Alternatively, you can click on your profile icon in the top-right corner and select "View API keys."
4. Create Your Secret Key
Click the + Create new secret key button.
Name your key: Give it a descriptive name (e.g., "Website-Bot" or "Testing") so you can track usage.
Set Permissions: You can choose "Full Access" or "Restricted" if you want to limit the key to specific models or actions.
Click Create secret key.
5. Secure Your Key
Copy the key immediately. For security reasons, OpenAI will only show this key once.
Save it in a password manager or an environment variable file (.env). If you lose it, you cannot retrieve it; you will have to delete it and create a new one.
💡 Important Pro-Tips
Billing is Required: Your API key will likely return an error (like 429: Insufficient Quota) until you add a payment method. Go to Settings > Billing and add at least $5 in credits to activate the key.
Usage Limits: Set a "Hard Limit" in the billing settings to ensure you don't accidentally spend more than you intended if your code loops or your key is leaked.
Never Hardcode: Avoid pasting your API key directly into your code files. Use environment variables to keep your credentials safe from being accidentally uploaded to sites like GitHub.
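A minimal sketch of the environment-variable pattern (the helper name is illustrative; populate the variable from your .env file or shell profile):

```python
import os

def load_api_key(var_name="OPENAI_API_KEY"):
    """Read the API key from the environment instead of hardcoding it.

    OPENAI_API_KEY is the conventional variable name; export it in your
    shell or load it from a .env file (e.g. with python-dotenv).
    """
    key = os.environ.get(var_name)
    if not key:
        raise RuntimeError(f"{var_name} is not set; export it before running.")
    return key
```

Because the key never appears in source code, it cannot leak through a accidental `git push`.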
OpenAI's API usage is structured into Tiers. Your limits (RPM, Requests Per Minute; and TPM, Tokens Per Minute) increase automatically as you spend more and build a history of successful payments.
The following table reflects the standard rate limits for the latest 2026 models (like GPT-5.4) and legacy favorites like GPT-4o.
OpenAI API Usage Limits by Tier (2026)
| Usage Tier | Qualification (Cumulative Spend) | Sample Monthly Credit Limit | Rate Limit Example (GPT-5.4 / GPT-4o) |
|------------|----------------------------------|-----------------------------|----------------------------------------|
| Free | No payment history | $0 - $100 | Minimal: Approx. 3 RPM / 40,000 TPM |
| Tier 1 | $5+ paid | $100 | Standard: 500 RPM / 30,000 TPM |
| Tier 2 | $50+ paid & 7 days since 1st payment | $500 | High: 5,000 RPM / 450,000 TPM |
| Tier 3 | $100+ paid & 7 days since 1st payment | $1,000 | Professional: 5,000 RPM / 600,000 TPM |
| Tier 4 | $250+ paid & 14 days since 1st payment | $5,000 | Business: 10,000 RPM / 800,000 TPM |
| Tier 5 | $1,000+ paid & 30 days since 1st payment | $200,000+ | Enterprise: 10,000+ RPM / 2,000,000+ TPM |
Key Terms to Remember
RPM (Requests Per Minute): The number of times you can call the API in 60 seconds.
TPM (Tokens Per Minute): The total volume of text (input + output) processed per minute.
RPD (Requests Per Day): Some legacy or specialized models also have daily caps.
Batch Queue Limit: If you use the Batch API (asynchronous processing for 50% discount), this determines how many tokens you can have "in progress" at once.
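RPM and TPM interact: whichever limit binds first is your effective throughput cap. A quick way to estimate it (illustrative helper; the example figures are the legacy gpt-4o Tier 1 limits of 300 RPM / 300,000 TPM with an assumed 2,000-token average request):

```python
def max_requests_per_minute(tpm_limit, avg_tokens_per_request, rpm_limit):
    """Effective request throughput: the tighter of the TPM budget and the RPM cap."""
    by_tokens = tpm_limit // avg_tokens_per_request
    return min(by_tokens, rpm_limit)

# 300,000 TPM / 2,000 tokens per request = 150, below the 300 RPM cap,
# so the token budget is the binding constraint here.
print(max_requests_per_minute(300_000, 2_000, 300))
```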
How to Check Your Specific Limits
OpenAI often adjusts these based on real-time server load. To see the exact limits for your account, go to Settings > Limits in the developer console.
You will see a breakdown per model (e.g., gpt-5.4, gpt-4o-mini, dall-e-3).
Pro-Tip: If you are building a tool for high-frequency use, always implement Exponential Backoff in your code. This ensures your script waits and retries automatically if you hit a "Rate Limit Reached" error.
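The exponential-backoff pattern mentioned above can be sketched like this (RuntimeError stands in for your SDK's rate-limit exception, e.g. openai.RateLimitError; substitute the real class in production):

```python
import random
import time

def with_backoff(call, max_retries=5, base_delay=1.0):
    """Retry a callable on rate-limit errors, doubling the wait each attempt."""
    for attempt in range(max_retries):
        try:
            return call()
        except RuntimeError:  # stand-in for the SDK's rate-limit exception
            if attempt == max_retries - 1:
                raise  # give up after the final attempt
            # exponential delay plus jitter, so parallel clients don't
            # all retry at the same instant
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))
```

Jitter matters: without it, many workers that hit the limit together will retry together and hit it again.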
Usage limits vary significantly by model. OpenAI categorizes its models into "flagship" (high reasoning), "mini" (efficiency), and "nano" (high volume) tiers.
As of April 2026, here are the model-specific rate limits for Tier 1 (the starting paid tier). Limits for higher tiers (Tier 2-5) scale upward, often doubling or tripling at each level.
Tier 1 Model-Specific Rate Limits (Standard)
| Model Category | Key Models | Requests Per Minute (RPM) | Tokens Per Minute (TPM) |
|----------------|------------|---------------------------|--------------------------|
| Flagship Reasoning | o1, o3 | 500 | 3,000,000 |
| Flagship General | gpt-5.4, gpt-5 | 10,000 | 1,000,000 |
| Workhorse (Long Context) | gpt-4.1 (1M context) | 1,000 | 1,000,000 |
| Efficiency (Mini) | gpt-5.4-mini, o4-mini | 1,000 | 1,000,000 |
| High Volume (Nano) | gpt-5.4-nano, gpt-4.1-nano | 5,000 | 5,000,000 |
| Legacy Flagship | gpt-4o, gpt-4-turbo | 300 | 300,000 |
| Realtime / Audio | gpt-4o-realtime-preview | 36 | 6,000 |
Important Usage Nuances
Shared Limits: Many models share a common pool of tokens. For example, if you use 500,000 tokens on gpt-5.4, those tokens may be deducted from your available gpt-5.4-mini limit if they are in the same "shared limit" group.
The "Usage Cap": Even if your RPM/TPM is high, Tier 1 has a total Monthly Spend Limit of $100. Once you spend $100 in a month, the API will stop working until the next billing cycle unless you qualify for Tier 2 ($50+ cumulative spend and 7 days of history).
Context Window vs. Rate Limit: While gpt-4.1 has a massive 1 million token context window, your rate limit might only allow 1 million tokens per minute. This means you could effectively only send one maximum-length request every 60 seconds.
DataZone vs. Global:
GlobalStandard: Higher limits; requests can be processed anywhere.
DataZoneStandard: Slightly lower limits; ensures data stays within a specific geographic region (useful for compliance).
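The context-window vs. rate-limit point above reduces to simple arithmetic; using the gpt-4.1 Tier 1 figures from the table (1,000,000-token context window, 1,000,000 TPM):

```python
# Figures from the Tier 1 table above for gpt-4.1
CONTEXT_WINDOW = 1_000_000  # tokens per request, maximum
TPM_LIMIT = 1_000_000       # tokens per minute, Tier 1

def max_full_context_requests_per_minute():
    """How many maximum-length prompts fit in one minute of token budget."""
    return TPM_LIMIT // CONTEXT_WINDOW

print(max_full_context_requests_per_minute())  # 1
```

In other words, one maximum-length request consumes the entire minute's token budget, regardless of the much higher RPM allowance.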
How to Check Your Live Limits
Since OpenAI often increases these limits as they add server capacity, the most accurate way to check your current standing is: