📦 PTU Sizing
Normalized TPM (workload demand)
—
—
Min PTUs Required
—
—
Recommended PTUs (+headroom)
—
—
Monthly capacity vs demand (per your PTU purchase, 24/7 sustained)
PTU Monthly Capacity (normalized)
—
—
Workload Demand (uncached + 6×output)
—
—
Capacity Used
—
—
—
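The sizing figures above can be sketched as follows. This is a minimal illustration, not the calculator's actual code: the 6× output weight comes from the "uncached + 6×output" label, while `tpm_per_ptu`, the 20% default headroom, and all sample numbers are assumptions chosen for the example.

```python
import math

OUTPUT_WEIGHT = 6  # taken from the "uncached + 6×output" label


def normalized_tpm(uncached_input_tpm: float, output_tpm: float) -> float:
    """Workload demand in normalized tokens per minute."""
    return uncached_input_tpm + OUTPUT_WEIGHT * output_tpm


def min_ptus(demand_tpm: float, tpm_per_ptu: float) -> int:
    """Smallest whole number of PTUs whose capacity covers the demand."""
    return math.ceil(demand_tpm / tpm_per_ptu)


def recommended_ptus(demand_tpm: float, tpm_per_ptu: float,
                     headroom: float = 0.2) -> int:
    """Min PTUs after inflating demand by a headroom fraction (assumed 20%)."""
    return math.ceil(demand_tpm * (1 + headroom) / tpm_per_ptu)


# Hypothetical workload: 40,000 uncached input TPM and 5,000 output TPM,
# on a model assumed to deliver 2,500 normalized TPM per PTU.
demand = normalized_tpm(uncached_input_tpm=40_000, output_tpm=5_000)
print(demand, min_ptus(demand, 2_500), recommended_ptus(demand, 2_500))
```

Real deployments may also impose minimum purchase sizes or purchase increments, which this sketch does not model.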
💡 Capacity for Your Purchase
PTUs Purchased
—
Peak Input TPM
—
N × TPM/PTU
Utilization @ Target
—
Burst Headroom
—
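As a rough sketch of how these purchase figures relate: peak capacity follows the "N × TPM/PTU" label, utilization is assumed to be target demand divided by that capacity, and burst headroom is assumed to be the unused remainder. The function name, field names, and sample values are hypothetical.

```python
def purchase_capacity(ptus: int, tpm_per_ptu: float, target_tpm: float) -> dict:
    """Capacity figures for a given PTU purchase (definitions are assumptions)."""
    peak_tpm = ptus * tpm_per_ptu                 # "N × TPM/PTU"
    utilization = target_tpm / peak_tpm           # 1.0 means fully used
    burst_headroom_tpm = max(0.0, peak_tpm - target_tpm)
    return {
        "peak_tpm": peak_tpm,
        "utilization": utilization,
        "burst_headroom_tpm": burst_headroom_tpm,
    }


# Hypothetical purchase: 30 PTUs at 2,500 TPM each, target of 60,000 TPM.
result = purchase_capacity(ptus=30, tpm_per_ptu=2_500, target_tpm=60_000)
print(result)
```

A utilization above 1.0 means the target exceeds purchased capacity, and the excess would spill over to PayGo.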
💰 Monthly Cost Comparison @ Target TPM
🔝 THE COMPARISON THAT MATTERS: compare these two monthly totals, not the PTU base alone.
🎯 If you buy PTU — Effective Total/mo
—
—
💳 If you stay on PayGo — Total/mo
—
—
📈 PTU saves you
—
PayGo − PTU Effective
PTU Effective Total breakdown — PTU base is fixed; cached input + spillover are pass-through PayGo charges that PTU cannot eliminate.
PTU Base (fixed monthly)
—
—
+ Cached (FREE under PTU)
—
—
+ Spillover PayGo (over capacity)
—
—
🛍️ Buy PTU to skip spillover
—
—
—
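The comparison above reduces to a small amount of arithmetic, sketched here under the breakdown shown: PTU Effective Total is the fixed base plus any cached-input and spillover charges, and the saving is PayGo minus that total. The dollar figures are invented for illustration, and `cached_cost` may be zero if cached input is free under PTU.

```python
def ptu_effective_total(ptu_base: float, cached_cost: float,
                        spillover_cost: float) -> float:
    """Fixed PTU base plus the pass-through charges listed in the breakdown."""
    return ptu_base + cached_cost + spillover_cost


def ptu_savings(paygo_total: float, ptu_effective: float) -> float:
    """Positive when PTU is cheaper than PayGo, negative when it costs more."""
    return paygo_total - ptu_effective


# Hypothetical monthly figures in dollars.
effective = ptu_effective_total(ptu_base=6_000, cached_cost=150, spillover_cost=400)
print(effective, ptu_savings(paygo_total=8_000, ptu_effective=effective))
```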
🎯 Break-even Thresholds (when does PTU win?)
With your current PTU purchase, the table below shows the threshold for each variable (holding the others constant) where PTU + Spillover = Pure PayGo. Cross the threshold in the right direction → PTU starts saving money.
| Variable | Current | Break-even threshold | To make PTU win | Note |
|---|---|---|---|---|
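To illustrate one such threshold, the sketch below solves for the break-even TPM under a deliberately simplified model: a single blended PayGo price per million tokens, 43,200 minutes per month, and spillover billed at that same price. These are assumptions, not the calculator's exact model, and the function name is hypothetical. Under this model, demand above capacity costs the same on both sides, so a break-even point exists only at or below capacity.

```python
from typing import Optional

MINUTES_PER_MONTH = 43_200  # 30 days sustained 24/7


def break_even_tpm(ptu_base: float, price_per_million: float,
                   capacity_tpm: float) -> Optional[float]:
    """TPM at which PayGo cost equals the fixed PTU base.

    Returns None when that TPM exceeds purchased capacity, because beyond
    capacity both options grow at the same PayGo rate and PTU never catches up.
    """
    tpm = ptu_base * 1_000_000 / (MINUTES_PER_MONTH * price_per_million)
    return tpm if tpm <= capacity_tpm else None


# Hypothetical: $5,400/mo PTU base, $2.50 per million tokens, 75,000 TPM capacity.
print(break_even_tpm(ptu_base=5_400, price_per_million=2.5, capacity_tpm=75_000))
```

Sustained demand above the returned TPM would make PTU the cheaper option in this simplified model.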
🧮 Monthly Token Budget & PayGo Cost Breakdown
Assumes target Input TPM is sustained 24/7 for 30 days (= 43,200 min/mo).
Formula: tokens/mo = TPM × 43,200 · PayGo $/mo = (tokens/mo / 1,000,000) × price/M
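The formula can be checked with a short sketch. The bucket names mirror the three token buckets on this page, while the TPM values and per-million prices are placeholders invented for the example.

```python
MINUTES_PER_MONTH = 30 * 24 * 60  # 43,200 minutes in a 30-day month


def tokens_per_month(tpm: float) -> float:
    """tokens/mo = TPM × 43,200"""
    return tpm * MINUTES_PER_MONTH


def paygo_cost(tpm: float, price_per_million: float) -> float:
    """PayGo $/mo = (tokens/mo / 1,000,000) × price/M"""
    return tokens_per_month(tpm) / 1_000_000 * price_per_million


# Hypothetical buckets: (TPM, price in $ per million tokens).
buckets = {
    "input_uncached": (10_000, 2.50),
    "input_cached": (4_000, 1.25),
    "output": (2_000, 10.00),
}

total = 0.0
for name, (tpm, price) in buckets.items():
    cost = paygo_cost(tpm, price)
    total += cost
    print(f"{name}: {tokens_per_month(tpm):,.0f} tokens/mo -> ${cost:,.2f}")
print(f"Total PayGo: ${total:,.2f}")
```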
Input (uncached) tokens/mo
—
—
Input (cached) tokens/mo
—
—
Output tokens/mo
—
—
| Bucket | TPM | Tokens / month | Price ($/M) | PayGo $/mo |
|---|---|---|---|---|
| Total PayGo | — | — | — | — |
—
📊 Cost Sensitivity Across TPM Levels
If you ran at this Input TPM continuously, how does PTU + Spillover compare with pure PayGo?
| Input TPM | Util % | PTU base | Spillover PayGo | Effective Total | Pure PayGo | Savings/mo |
|---|---|---|---|---|---|---|
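Rows for a table like this could be generated as sketched below. The model is a simplification and an assumption: one blended PayGo price, spillover billed only on TPM above capacity, and cached-input charges ignored. Function and field names are hypothetical.

```python
MINUTES_PER_MONTH = 43_200  # 30 days sustained 24/7


def monthly_cost(tpm: float, price_per_million: float) -> float:
    """PayGo cost of sustaining `tpm` for a month at a blended price."""
    return tpm * MINUTES_PER_MONTH * price_per_million / 1_000_000


def sensitivity_rows(tpm_levels, capacity_tpm, ptu_base, price_per_million):
    """One dict per TPM level comparing PTU + spillover with pure PayGo."""
    rows = []
    for tpm in tpm_levels:
        spillover = monthly_cost(max(0, tpm - capacity_tpm), price_per_million)
        effective = ptu_base + spillover
        paygo = monthly_cost(tpm, price_per_million)
        rows.append({
            "input_tpm": tpm,
            "util_pct": 100 * tpm / capacity_tpm,
            "ptu_base": ptu_base,
            "spillover": spillover,
            "effective_total": effective,
            "pure_paygo": paygo,
            "savings": paygo - effective,
        })
    return rows


# Hypothetical: 50,000 TPM capacity, $4,000/mo base, $2.50 per million tokens.
rows = sensitivity_rows([25_000, 50_000, 75_000], 50_000, 4_000, 2.5)
for row in rows:
    print(row)
```

In this simplified model the saving stops growing once demand passes capacity, since every additional token is billed at the PayGo rate on both sides.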
📈 Cumulative Cost Over Time @ Target TPM