Perplexity AI offers a specialized API platform focusing on search-grounded LLM responses (Sonar models).
1. Step-by-Step: Generate Perplexity API Key
The API is managed through the Perplexity Console, which is separate from the standard chat interface.
Access the Console: Go to docs.perplexity.ai or directly to perplexity.ai/settings/api.
Add Payment Method: Unlike the chat "Pro" subscription, the API requires prepaid credits.
Navigate to the Billing tab.
Add a credit card and purchase a minimum of $5 in credits.
Generate Key: Go to the API Keys section.
Click + Generate.
Copy the key immediately.
It starts with pplx-. Perplexity will not show it again for security reasons.
Set Up Auto-Top-Up (Optional): To prevent your scripts from breaking when credits run out, enable "Automatic Top-up" to refresh your balance when it falls below $2.
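Once the key is generated, requests go to Perplexity's OpenAI-compatible chat completions endpoint with the key as a Bearer token. A minimal stdlib sketch (the endpoint URL and the `PPLX_API_KEY` environment variable name are common conventions; confirm both against the official docs before relying on them):

```python
import json
import os
import urllib.request

API_URL = "https://api.perplexity.ai/chat/completions"

def build_request(question: str, model: str = "sonar") -> urllib.request.Request:
    """Assemble a chat-completions request carrying the pplx- key."""
    payload = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": question}],
    }).encode()
    return urllib.request.Request(
        API_URL,
        data=payload,
        headers={
            "Authorization": f"Bearer {os.environ['PPLX_API_KEY']}",
            "Content-Type": "application/json",
        },
    )

# To actually send it (requires a funded account and network access):
# req = build_request("How many moons does Mars have?")
# with urllib.request.urlopen(req, timeout=30) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

Keeping the key in an environment variable rather than in source code means a leaked repository does not leak your credits.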
2. API Usage Tiers
Your "Tier" is determined by your total cumulative spend on the platform.
| Tier | Total Credits Purchased | Status | Monthly Spend Limit |
| --- | --- | --- | --- |
| Tier 0 | $0 | New/Trial | $5 (one-time) |
| Tier 1 | $50+ | Light Usage | $100 |
| Tier 2 | $250+ | Regular Usage | $500 |
| Tier 3 | $500+ | Heavy Usage | $1,000 |
| Tier 4 | $1,000+ | Production | $5,000 |
| Tier 5 | $5,000+ | Enterprise | Custom ($200k+) |
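Because the tier follows directly from cumulative spend, you can compute it locally. A small sketch using the thresholds from the table above:

```python
# (threshold in USD of cumulative credit purchases, tier number),
# ordered highest first so the first match wins.
TIER_THRESHOLDS = [
    (5000, 5),
    (1000, 4),
    (500, 3),
    (250, 2),
    (50, 1),
    (0, 0),
]

def usage_tier(total_purchased: float) -> int:
    """Map total credits purchased (USD) to the tier numbers above."""
    for threshold, tier in TIER_THRESHOLDS:
        if total_purchased >= threshold:
            return tier
    return 0
```

For example, a lifetime spend of $300 lands in Tier 2, since it clears the $250 threshold but not the $500 one.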
3. Model-Wise API Usage Limits
Perplexity uses Requests Per Minute (RPM) as the primary throttle.
| Model Category | Specific Models | Tier 1 RPM | Tier 3 RPM |
| --- | --- | --- | --- |
| Sonar (Standard) | sonar, sonar-pro | 150 | 1,000 |
| Reasoning | sonar-reasoning, sonar-reasoning-pro | 150 | 1,000 |
| Deep Research | sonar-deep-research | 10 | 40 |
| Async Search | POST /v1/async/sonar | 10 | 40 |
| Search (Raw) | POST /search | 50 QPS* | 50 QPS |
*QPS = Queries Per Second.
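Since RPM is the primary throttle, the simplest way to stay under a limit client-side is to space requests at least 60 / RPM seconds apart. A minimal pacing sketch (illustrative only; the server's bucket may allow short bursts that this even spacing forgoes):

```python
import time

class RpmThrottle:
    """Enforce a minimum gap of 60 / rpm seconds between calls."""

    def __init__(self, rpm: int):
        self.interval = 60.0 / rpm
        self._last = 0.0  # monotonic timestamp of the previous call

    def wait(self) -> float:
        """Sleep until the next call is allowed; return seconds slept."""
        now = time.monotonic()
        delay = max(0.0, self._last + self.interval - now)
        if delay:
            time.sleep(delay)
        self._last = time.monotonic()
        return delay

# Example: pace sonar-deep-research at its Tier 1 limit of 10 RPM.
# throttle = RpmThrottle(10)
# for prompt in prompts:
#     throttle.wait()          # blocks up to 6 seconds between requests
#     send_request(prompt)     # hypothetical request function
```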
Important Implementation Notes
Deep Research Limits: The sonar-deep-research model has much lower rate limits because it performs multiple autonomous steps (searching, browsing, and reasoning) per request.
Error Handling: If you receive a 429 error, Perplexity recommends using the x-ratelimit-remaining headers in the API response to calculate exactly when your "bucket" will refill.
Context Windows: While models like sonar-pro support large contexts, sending very large prompts frequently may trigger token-based throttling even if you stay under the RPM limit.
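A 429 handler can combine the two strategies: honor the rate-limit headers when the response includes them, and fall back to exponential backoff when it does not. A sketch (the `x-ratelimit-reset-requests` header name is an assumption for illustration; check the actual header names your responses carry):

```python
def retry_delay(status: int, headers: dict, attempt: int) -> float:
    """Seconds to wait before retrying a failed request.

    status  -- HTTP status code of the response
    headers -- response headers, lowercase keys
    attempt -- 1-based retry counter
    """
    if status != 429:
        return 0.0  # not rate-limited; no wait needed
    # Assumed header giving seconds until the request bucket refills.
    reset = headers.get("x-ratelimit-reset-requests")
    if reset is not None:
        return float(reset)
    # Fallback: exponential backoff (2 s, 4 s, 8 s, ...) capped at 60 s.
    return min(60.0, 2.0 ** attempt)
```

Pairing this with the RPM table above, a long-running job rarely needs the fallback: if your pacing matches your tier's limit, 429s should be the exception rather than the rule.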