| Cap | Output tokens (visible + thinking) | API-equivalent cost |
|---|---|---|
| 5-hour session | 2.66M | $66 |
| Weekly | 19.7M | $492 |
| Monthly (4.33 wk) | 85.2M | $2,130 |
Max 20x Plan Actual Limits
Tips
- Monthly API value is roughly $2,130 (10.6x the $200 subscription).
- The 5-hour rolling cap is 2.66M output tokens (visible + thinking), worth about $66 at API rates.
- To run 24/7, throttle to 85% speed (1,954 tok/min) or you'll exhaust the weekly budget 25 hours early.
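The throttle figures in the last tip can be reproduced with a few lines of arithmetic. This is a minimal sketch, assuming the 1,954 tok/min figure is the rounded weekly cap spread evenly over a full week, and that "85% speed" means 85% of whatever pace you would otherwise run at; the constant names are illustrative.

```python
# Rounded weekly cap from the measurements in this post (output tokens,
# visible + thinking). Constant names are illustrative.
WEEKLY_CAP_TOKENS = 19_700_000
HOURS_PER_WEEK = 7 * 24                 # 168
MINUTES_PER_WEEK = HOURS_PER_WEEK * 60  # 10,080

# Pace that spends the weekly cap evenly over the whole week.
sustainable_tok_per_min = WEEKLY_CAP_TOKENS / MINUTES_PER_WEEK

# Assumption: the sustainable pace is 85% of the unthrottled pace.
# Running unthrottled then spends the weekly budget in 85% of the week.
throttle = 0.85
hours_early = (1 - throttle) * HOURS_PER_WEEK

print(f"Sustainable pace: {sustainable_tok_per_min:,.0f} tok/min")
print(f"Unthrottled, the budget runs out about {hours_early:.0f} hours early")
```

The pace is averaged over the week, so short bursts above it are fine as long as the 5-hour cap is not hit first.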
Measured Caps
Evidence
We were running parallel evaluations and noted how quickly we hit limits. We collected telemetry data throughout, which made the post-hoc math straightforward.
Run 1: Parallel batch jobs
We ran 20 Claude Opus workers with ultrathink (32k thinking token budget) in parallel. After 41 minutes, we hit the rate limit.
| Token Type | Count | Source |
|---|---|---|
| Uncached input | 20,102 | input_tokens |
| Cache write | 4,020,328 | cache_creation_tokens |
| Cache read | 54,477,156 | cache_read_tokens |
| Visible output | 1,584,049 | output_tokens |
| Total cost | $120.61 | |
The cost gap reveals hidden thinking tokens
Pricing the four token counts that telemetry reports gives $92.07, but telemetry reported a total cost of $120.61. The $28.54 gap is thinking tokens, which are billed at the output rate but do not appear in the reported token counts.
Uncached: 20,102 × $5.00/M = $0.10
Cache write: 4,020,328 × $6.25/M = $25.13
Cache read: 54,477,156 × $0.50/M = $27.24
Visible out: 1,584,049 × $25.00/M = $39.60
TOTAL = $92.07
Telemetry reported: $120.61
Gap: $28.54 → ~1.14M thinking tokens
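The same reconstruction can be scripted. The sketch below uses the Run 1 counts and the Opus 4.5 rates from this post; the dictionary keys and the `cost_usd` helper are illustrative names, not telemetry fields. It assumes the whole gap is thinking tokens billed at the output rate.

```python
# Opus 4.5 rates in $/MTok, as listed in this post.
RATES_PER_MTOK = {
    "uncached_input": 5.00,
    "cache_write": 6.25,
    "cache_read": 0.50,
    "visible_output": 25.00,
}

# Run 1 token counts from telemetry.
RUN1_TOKENS = {
    "uncached_input": 20_102,
    "cache_write": 4_020_328,
    "cache_read": 54_477_156,
    "visible_output": 1_584_049,
}

REPORTED_COST_USD = 120.61


def cost_usd(tokens, rates):
    """Price each token type at its per-million rate and sum."""
    return sum(count * rates[kind] / 1_000_000 for kind, count in tokens.items())


reported_token_cost = cost_usd(RUN1_TOKENS, RATES_PER_MTOK)
gap_usd = REPORTED_COST_USD - reported_token_cost

# Assumption: the entire gap is thinking tokens at the output rate.
thinking_tokens = gap_usd / RATES_PER_MTOK["visible_output"] * 1_000_000

print(f"Cost of reported tokens: ${reported_token_cost:.2f}")
print(f"Gap: ${gap_usd:.2f}")
print(f"Implied thinking tokens: {thinking_tokens / 1e6:.2f}M")
```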
Run 2: Retry after cooldown
We ran 10 workers for 19 minutes. The /usage command showed we'd consumed 74% of the 5-hour cap and 10% of the weekly quota. From that we can back-calculate the actual limits:
Output tokens used: 1,967,024
5-hr cap = 1,967,024 / 0.74 = 2.66M tokens
Weekly = 1,967,024 / 0.10 = 19.7M tokens
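A small helper makes the back-calculation reusable, and also shows how sensitive it is to the displayed percentage. This is a sketch: `implied_cap` and `cap_range` are hypothetical names, and the range assumes /usage rounds to the nearest whole percent, which the measurements here do not confirm.

```python
def implied_cap(tokens_used, fraction_used):
    """Estimate a cap from tokens consumed and the fraction of the cap reported as used."""
    if not 0 < fraction_used <= 1:
        raise ValueError("fraction_used must be in (0, 1]")
    return tokens_used / fraction_used


def cap_range(tokens_used, percent_shown):
    """Bounds on the cap, assuming the shown percent is rounded to the nearest integer."""
    low = implied_cap(tokens_used, (percent_shown + 0.5) / 100)
    high = implied_cap(tokens_used, (percent_shown - 0.5) / 100)
    return low, high


RUN2_OUTPUT_TOKENS = 1_967_024

five_hour_cap = implied_cap(RUN2_OUTPUT_TOKENS, 0.74)
weekly_cap = implied_cap(RUN2_OUTPUT_TOKENS, 0.10)
weekly_low, weekly_high = cap_range(RUN2_OUTPUT_TOKENS, 10)

print(f"5-hour cap: {five_hour_cap / 1e6:.2f}M tokens")
print(f"Weekly cap: {weekly_cap / 1e6:.1f}M tokens")
print(f"Weekly cap range: {weekly_low / 1e6:.1f}M to {weekly_high / 1e6:.1f}M tokens")
```

Under that rounding assumption the weekly estimate carries more uncertainty than the 5-hour one, because a one-point rounding error matters far more at 10% than at 74%.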
Token Types
Input tokens
- Uncached input — Tokens after the last cache breakpoint (fully processed each request)
- Cache write — Tokens being written to cache for reuse (costs 1.25x input rate)
- Cache read — Tokens retrieved from cache instead of reprocessed (costs 0.1x input rate)
How Caching Works
Claude Code caches the system prompt and conversation history. On subsequent turns, cached tokens are read at $0.50/MTok instead of $5.00/MTok, a 90% discount. A cache entry expires after 5 minutes without use. Cache reads don't count toward rate limits.
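To see what the cache discount is worth, the sketch below prices Run 1's cache reads both ways and compares the relative cost of sending the same prefix several times. It uses the multipliers listed above; the function names are illustrative, and the comparison assumes every repeat arrives before the cache entry expires.

```python
INPUT_RATE_PER_MTOK = 5.00   # uncached input, $/MTok
CACHE_WRITE_MULT = 1.25      # cache write costs 1.25x the input rate
CACHE_READ_MULT = 0.10       # cache read costs 0.1x the input rate


def cost_usd(tokens, rate_per_mtok):
    """Dollar cost of a token count at a per-million-token rate."""
    return tokens / 1_000_000 * rate_per_mtok


# Run 1 cache reads, priced as cache reads and as uncached input.
run1_cache_read_tokens = 54_477_156
as_cache_reads = cost_usd(run1_cache_read_tokens, INPUT_RATE_PER_MTOK * CACHE_READ_MULT)
as_uncached = cost_usd(run1_cache_read_tokens, INPUT_RATE_PER_MTOK)


def relative_cost_cached(sends):
    """Cost of sending one prefix `sends` times with caching, in units of
    one uncached send: one write, then reads on every later send."""
    return CACHE_WRITE_MULT + CACHE_READ_MULT * (sends - 1)


print(f"Run 1 cache reads: ${as_cache_reads:.2f} cached vs ${as_uncached:.2f} uncached")
for sends in (1, 2, 10):
    print(f"{sends} sends: {relative_cost_cached(sends):.2f}x cached vs {sends:.2f}x uncached")
```

A prefix that is sent only once costs more with caching (1.25x vs 1x), but caching already wins on the second send (1.35x vs 2x).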
Output tokens
- Visible output — Text you see (code, explanations, tool calls)
- Thinking — Internal reasoning in extended thinking mode (hidden from you)
Opus 4.5 Pricing
| Token Type | Rate |
|---|---|
| Input (uncached) | $5.00/MTok |
| Cache write (5-min) | $6.25/MTok |
| Cache read | $0.50/MTok |
| Output | $25.00/MTok |