01 · MODEL PICKER

Three questions.

1 · What are you building?

2 · What matters most?

3 · Expected volume?

Answer the three questions to get a recommendation — with the reasoning, the exact model ID, and a runner-up.

02 · COST CALCULATOR

What will it cost?

Live math across every current Claude model. Prices are per million tokens, from Anthropic's published list pricing.

Input tokens / request

Output tokens / request

Requests / month

Prompt cache hit rate0%

Cached input reads bill at ~10% of the input price. Assumes a warmed cache; the one-time 1.25× write premium isn't included.

Batch API (−50% everything, async ≤24h)

Model	$ / request	$ / month	vs cheapest

03 · THE LINEUP

Every current Claude model.

Click any model ID to copy it. Data cached 2026-06-24; verify against GET /v1/models for live values.

Model	Model ID (click to copy)	Context	Max out	Input $/MTok	Output $/MTok	Best for

Sonnet 5 intro pricing ($2 / $10 per MTok) applies through 2026-08-31, then $3 / $15. Batch API is 50% off all token usage on every model. Cache reads ≈ 10% of input price.

04 · INTEGRATION CHEATSHEET

Ship it without the 400s.

⚠ These return a 400 on the Claude 5 era models

budget_tokens — removed on Fable 5 / Opus 4.8 / 4.7 / Sonnet 5. Use thinking: {type: "adaptive"} + output_config.effort.
temperature / top_p / top_k — removed on the same models. Steer with prompting.
assistant prefill — 400 on all 4.6+ models. Use structured outputs (output_config.format).
thinking: {type: "disabled"} — 400 on Fable 5 only. Omit the parameter; thinking is always on.

Fable 5 specifics

Check stop_reason == "refusal" before reading content — safety classifiers can decline with HTTP 200.
Opt into server-side fallbacks by default: beta server-side-fallback-2026-06-01 + fallbacks: [{"model": "claude-opus-4-8"}].
Requires 30-day data retention — ZDR orgs get 400 on every request.
Same tokenizer as Opus 4.8; plan for minutes-long turns at high effort — stream everything.

Cost levers, cheapest first

Batch API — 50% off everything, async within 24h.
Prompt caching — reads at ~10% of input price. Keep stable content first; a timestamp in your system prompt silently kills it.
Right-size the model — Haiku 4.5 is 10× cheaper than Opus 4.8 on input.
effort: "medium" — often the cost/quality sweet spot on 4.6+ models.

05 · FOR YOUR TOOLING

models.json

The whole registry — IDs, pricing, context windows, discounts — as machine-readable JSON. Point your scripts, dashboards, and agents at it.

curl -s https://5.maib.io/models.json | jq '.models[] | {id, input_per_mtok, output_per_mtok}'

New model drops. This page updates.

Leave an email and get a one-line note when the lineup or pricing changes. Nothing else, ever.

5.maib.io — mAIb Technologies · Pricing data cached 2026-06-24, verify against the Anthropic Models API

The Five (vision) models.json llms.txt

You're overpayingfor Claude.