AI Cost Save

Model Cost Pages

High-intent pricing pages for users already comparing OpenAI, Claude, DeepSeek, and more.

Live model pricing (rates in USD per 1K tokens)

Claude Haiku 4.5

Anthropic Claude models focused on long-context reasoning and stable enterprise usage.

Input: 0.0008 | Output: 0.004

Claude Opus 4.5

Anthropic Claude models focused on long-context reasoning and stable enterprise usage.

Input: 0.005 | Output: 0.025

Claude Opus 4.6

Anthropic Claude models focused on long-context reasoning and stable enterprise usage.

Input: 0.005 | Output: 0.025

Claude Sonnet 4.5

Anthropic Claude models focused on long-context reasoning and stable enterprise usage.

Input: 0.003 | Output: 0.015

Claude Sonnet 4.6

Anthropic Claude models focused on long-context reasoning and stable enterprise usage.

Input: 0.003 | Output: 0.015

DeepSeek-chat

DeepSeek models known for cost-efficient reasoning and coding-focused performance.

Input: 0.00014 | Output: 0.00028

DeepSeek-reasoner

DeepSeek models known for cost-efficient reasoning and coding-focused performance.

Input: 0.002 | Output: 0.004

Doubao-lite

General-purpose model suitable for text generation and reasoning in common API workflows.

Input: 0.00004 | Output: 0.00008

Doubao-pro

General-purpose model suitable for text generation and reasoning in common API workflows.

Input: 0.0001 | Output: 0.0003

Gemini 2.5 Flash

Google Gemini models for text, multimodal workloads, and high-throughput inference.

Input: 0.0003 | Output: 0.0025

Gemini 2.5 Flash Lite

Google Gemini models for text, multimodal workloads, and high-throughput inference.

Input: 0.0001 | Output: 0.0004

Gemini 3.1 Flash Image Preview

Google Gemini models for text, multimodal workloads, and high-throughput inference.

Input: 0.0003 | Output: 0.0025

Gemini 3.1 Flash Lite Preview

Google Gemini models for text, multimodal workloads, and high-throughput inference.

Input: 0.00025 | Output: 0.0015

Gemini 3.1 Pro Preview

Google Gemini models for text, multimodal workloads, and high-throughput inference.

Input: 0.002 | Output: 0.01

Gemini 3.1 Pro Preview Custom Tools

Google Gemini models for text, multimodal workloads, and high-throughput inference.

Input: 0.002 | Output: 0.012

Gemini 3 Flash Preview

Google Gemini models for text, multimodal workloads, and high-throughput inference.

Input: 0.0005 | Output: 0.003

GPT-4.1

OpenAI general-purpose text and multimodal models for chat, tools, and content generation.

Input: 0.002 | Output: 0.008

GPT-4.1 mini

OpenAI general-purpose text and multimodal models for chat, tools, and content generation.

Input: 0.0004 | Output: 0.0016

GPT-4.1 nano

OpenAI general-purpose text and multimodal models for chat, tools, and content generation.

Input: 0.0001 | Output: 0.0004

GPT-4o

OpenAI general-purpose text and multimodal models for chat, tools, and content generation.

Input: 0.0025 | Output: 0.01

GPT-4o mini

OpenAI general-purpose text and multimodal models for chat, tools, and content generation.

Input: 0.00015 | Output: 0.0006

GPT-5.2

OpenAI general-purpose text and multimodal models for chat, tools, and content generation.

Input: 0.00175 | Output: 0.014

GPT-5.2-Codex

OpenAI general-purpose text and multimodal models for chat, tools, and content generation.

Input: 0.00175 | Output: 0.014

GPT-5.3 Chat

OpenAI general-purpose text and multimodal models for chat, tools, and content generation.

Input: 0.00175 | Output: 0.014

GPT-5.3-Codex

OpenAI general-purpose text and multimodal models for chat, tools, and content generation.

Input: 0.00175 | Output: 0.014

GPT-5.4

OpenAI general-purpose text and multimodal models for chat, tools, and content generation.

Input: 0.0025 | Output: 0.015

GPT-5.4 Pro

OpenAI general-purpose text and multimodal models for chat, tools, and content generation.

Input: 0.03 | Output: 0.18

GPT-5 Nano

OpenAI general-purpose text and multimodal models for chat, tools, and content generation.

Input: 0.00005 | Output: 0.0004

Kimi-k2-0711-preview

Moonshot Kimi models designed for long-context processing and Chinese-language Q&A.

Input: 0.00014 | Output: 0.00222

Kimi-k2-0905-preview

Moonshot Kimi models designed for long-context processing and Chinese-language Q&A.

Input: 0.00014 | Output: 0.00222

Kimi-k2.5

Moonshot Kimi models designed for long-context processing and Chinese-language Q&A.

Input: 0.0001 | Output: 0.00292

Kimi-k2-thinking

Moonshot Kimi models designed for long-context processing and Chinese-language Q&A.

Input: 0.00014 | Output: 0.00222

Kimi-k2-thinking-turbo

Moonshot Kimi models designed for long-context processing and Chinese-language Q&A.

Input: 0.00014 | Output: 0.00806

kimi-k2-turbo-preview

Moonshot Kimi models designed for long-context processing and Chinese-language Q&A.

Input: 0.00014 | Output: 0.00806

Qwen3.5-Flash

Alibaba Cloud Qwen models optimized for general chat and Chinese language scenarios.

Input: 0.00004 | Output: 0.00008

Qwen3.5-Plus

Alibaba Cloud Qwen models optimized for general chat and Chinese language scenarios.

Input: 0.00012 | Output: 0.00024

Qwen3-max

Alibaba Cloud Qwen models optimized for general chat and Chinese language scenarios.

Input: 0.0006 | Output: 0.0018
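
The rates above can be turned into a per-request cost estimate. A minimal sketch, assuming the listed figures are USD per 1K tokens and using two rates copied from the table (the function name is illustrative, not part of any SDK):

```python
# Per-1K-token rates (USD), copied from the pricing table above.
RATES = {
    "gpt-4o": {"input": 0.0025, "output": 0.01},
    "deepseek-chat": {"input": 0.00014, "output": 0.00028},
}

def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """USD cost of one API call: input and output tokens are billed
    separately, each at (tokens / 1000) * per-1K rate."""
    r = RATES[model]
    return (input_tokens / 1000) * r["input"] + (output_tokens / 1000) * r["output"]

# 2,000 prompt tokens + 500 completion tokens on GPT-4o:
# 2.0 * 0.0025 + 0.5 * 0.01 = 0.005 + 0.005 = 0.01 USD per call
print(round(call_cost("gpt-4o", 2000, 500), 6))
```

Note that input and output must be tracked separately: output tokens are typically several times more expensive per token than input tokens.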

Guides and comparisons

Claude: cost per token (what to track)

Estimate Claude costs by separating input and output tokens and accounting for call volume.

DeepSeek API pricing (cost factors)

How to estimate DeepSeek costs and compare value as you optimize prompts and retries.

GPT-4 vs Claude: cost and how to choose

A simple framework for choosing between GPT-4 and Claude based on total billed tokens.

Kimi pricing (what to track)

Understand Kimi pricing through your workflow's input/output tokens and call volume to keep spending under control.

OpenAI: cost per token (a practical guide)

Understand OpenAI costs and estimate the per-token cost of input and output.

Qwen API cost (per token & practical estimation)

Estimate Qwen costs from input/output tokens and real call volume, then optimize where the waste hides.
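
The estimation framework the guides describe (input/output rates times average tokens per call, times call volume) can be sketched as a monthly-spend comparison. The rates are taken from the table above; the workload numbers are hypothetical:

```python
# Project monthly spend for a model from per-1K-token rates (USD),
# average tokens per call, and monthly call volume.
def monthly_cost(rate_in, rate_out, avg_in_tokens, avg_out_tokens, calls_per_month):
    per_call = (avg_in_tokens / 1000) * rate_in + (avg_out_tokens / 1000) * rate_out
    return per_call * calls_per_month

# Hypothetical workload: 1,200 input + 400 output tokens per call, 50k calls/month.
workload = dict(avg_in_tokens=1200, avg_out_tokens=400, calls_per_month=50_000)

gpt4o  = monthly_cost(0.0025, 0.01,  **workload)   # GPT-4o rates from the table
sonnet = monthly_cost(0.003,  0.015, **workload)   # Claude Sonnet 4.5 rates

print(f"GPT-4o: ${gpt4o:,.2f}/mo, Sonnet: ${sonnet:,.2f}/mo")
```

Running the same workload through two rate pairs makes the trade-off concrete before any optimization of prompts, retries, or output length.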