How to read model pricing
Prices are quoted per million tokens, separately for input, cached input and output. Output is almost always the most expensive, so output-heavy workloads cost more than the input price alone suggests. A model's sticker price means little until you apply it to your token mix.
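To make the token-mix point concrete, here is a minimal sketch of the arithmetic. The two models and all of their prices are hypothetical, chosen only for illustration, and the function name `workload_cost` is an assumption. It also assumes cached input tokens are billed at the cached rate instead of the regular input rate.

```python
def workload_cost(prices, input_tokens, cached_tokens, output_tokens):
    """Return the cost in USD; prices are quoted in USD per million tokens."""
    total = (
        input_tokens * prices["input"]            # uncached input
        + cached_tokens * prices["cached_input"]  # input served from cache
        + output_tokens * prices["output"]
    )
    return total / 1_000_000


# Hypothetical prices, not taken from any real price list.
model_a = {"input": 1.00, "cached_input": 0.25, "output": 8.00}
model_b = {"input": 2.00, "cached_input": 0.50, "output": 4.00}

# Output-heavy mix: 200k input tokens, 800k output tokens.
heavy_a = workload_cost(model_a, 200_000, 0, 800_000)
heavy_b = workload_cost(model_b, 200_000, 0, 800_000)

# Input-heavy mix: 900k input tokens, 100k output tokens.
light_a = workload_cost(model_a, 900_000, 0, 100_000)
light_b = workload_cost(model_b, 900_000, 0, 100_000)

print(f"output-heavy: A=${heavy_a:.2f}  B=${heavy_b:.2f}")
print(f"input-heavy:  A=${light_a:.2f}  B=${light_b:.2f}")
```

With these made-up numbers, model A has the lower input price but is the more expensive choice for the output-heavy mix, while the ranking flips for the input-heavy mix.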
Full pricing table
Embedding model pricing
Choosing on more than price
Context window, caching support, batch discounts, vision and audio support, latency and quality all matter. The right model is the cheapest one that passes your evaluation, not the cheapest model overall. Run your numbers in the API cost calculator to rank models by real cost.
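The selection rule can be sketched in a few lines: filter by your evaluation first, then pick the cheapest of what remains. The `Candidate` type, the model names, the costs and the pass/fail results below are all hypothetical placeholders for your own data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    name: str
    monthly_cost: float  # USD, computed for your own token mix
    passes_eval: bool    # outcome of your own evaluation


def cheapest_passing(candidates: list[Candidate]) -> Optional[Candidate]:
    """Return the lowest-cost candidate that passes, or None if none pass."""
    passing = [c for c in candidates if c.passes_eval]
    return min(passing, key=lambda c: c.monthly_cost) if passing else None


# Hypothetical candidates, for illustration only.
candidates = [
    Candidate("small", 12.0, False),   # cheapest overall, but fails the eval
    Candidate("medium", 40.0, True),
    Candidate("large", 150.0, True),
]

choice = cheapest_passing(candidates)
print(choice.name if choice else "no model passes")
```

Here the cheapest model overall is excluded because it fails the evaluation, so the rule selects the next cheapest one that passes.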
Prices are maintained manually and are estimates. See data sources for links and last-checked dates.