AI Cost Save

Model Cost Pages

High-intent pricing pages for users already comparing OpenAI, Claude, DeepSeek, and more.

Live model pricing

All prices are in USD per 1,000 tokens (input | output).

Claude Haiku 4.5

Anthropic Claude models focused on long-context reasoning and stable enterprise usage.

Input: 0.0008 | Output: 0.004

Claude Opus 4.5

Anthropic Claude models focused on long-context reasoning and stable enterprise usage.

Input: 0.005 | Output: 0.025

Claude Opus 4.6

Anthropic Claude models focused on long-context reasoning and stable enterprise usage.

Input: 0.005 | Output: 0.025

Claude Sonnet 4.5

Anthropic Claude models focused on long-context reasoning and stable enterprise usage.

Input: 0.003 | Output: 0.015

Claude Sonnet 4.6

Anthropic Claude models focused on long-context reasoning and stable enterprise usage.

Input: 0.003 | Output: 0.015

DeepSeek-chat

DeepSeek models known for cost-efficient reasoning and coding-focused performance.

Input: 0.00014 | Output: 0.00028

DeepSeek-reasoner

DeepSeek models known for cost-efficient reasoning and coding-focused performance.

Input: 0.002 | Output: 0.004

Doubao-lite

General-purpose model suitable for text generation and reasoning in common API workflows.

Input: 0.00004 | Output: 0.00008

Doubao-pro

General-purpose model suitable for text generation and reasoning in common API workflows.

Input: 0.0001 | Output: 0.0003

Gemini 2.5 Flash

Google Gemini models for text, multimodal workloads, and high-throughput inference.

Input: 0.0003 | Output: 0.0025

Gemini 2.5 Flash Lite

Google Gemini models for text, multimodal workloads, and high-throughput inference.

Input: 0.0001 | Output: 0.0004

Gemini 3.1 Flash Image Preview

Google Gemini models for text, multimodal workloads, and high-throughput inference.

Input: 0.0003 | Output: 0.0025

Gemini 3.1 Flash Lite Preview

Google Gemini models for text, multimodal workloads, and high-throughput inference.

Input: 0.00025 | Output: 0.0015

Gemini 3.1 Pro Preview

Google Gemini models for text, multimodal workloads, and high-throughput inference.

Input: 0.002 | Output: 0.01

Gemini 3.1 Pro Preview Custom Tools

Google Gemini models for text, multimodal workloads, and high-throughput inference.

Input: 0.002 | Output: 0.012

Gemini 3 Flash Preview

Google Gemini models for text, multimodal workloads, and high-throughput inference.

Input: 0.0005 | Output: 0.003

GPT-4.1

OpenAI general-purpose text and multimodal models for chat, tools, and content generation.

Input: 0.002 | Output: 0.008

GPT-4.1 mini

OpenAI general-purpose text and multimodal models for chat, tools, and content generation.

Input: 0.0004 | Output: 0.0016

GPT-4.1 nano

OpenAI general-purpose text and multimodal models for chat, tools, and content generation.

Input: 0.0001 | Output: 0.0004

GPT-4o

OpenAI general-purpose text and multimodal models for chat, tools, and content generation.

Input: 0.0025 | Output: 0.01

GPT-4o mini

OpenAI general-purpose text and multimodal models for chat, tools, and content generation.

Input: 0.00015 | Output: 0.0006

GPT-5.2

OpenAI general-purpose text and multimodal models for chat, tools, and content generation.

Input: 0.00175 | Output: 0.014

GPT-5.2-Codex

OpenAI general-purpose text and multimodal models for chat, tools, and content generation.

Input: 0.00175 | Output: 0.014

GPT-5.3 Chat

OpenAI general-purpose text and multimodal models for chat, tools, and content generation.

Input: 0.00175 | Output: 0.014

GPT-5.3-Codex

OpenAI general-purpose text and multimodal models for chat, tools, and content generation.

Input: 0.00175 | Output: 0.014

GPT-5.4

OpenAI general-purpose text and multimodal models for chat, tools, and content generation.

Input: 0.0025 | Output: 0.015

GPT-5.4 Pro

OpenAI general-purpose text and multimodal models for chat, tools, and content generation.

Input: 0.03 | Output: 0.18

GPT-5 Nano

OpenAI general-purpose text and multimodal models for chat, tools, and content generation.

Input: 0.00005 | Output: 0.0004

Kimi-k2-0711-preview

Moonshot Kimi models designed for long-context processing and Chinese-language Q&A.

Input: 0.00014 | Output: 0.00222

Kimi-k2-0905-preview

Moonshot Kimi models designed for long-context processing and Chinese-language Q&A.

Input: 0.00014 | Output: 0.00222

Kimi-k2.5

Moonshot Kimi models designed for long-context processing and Chinese-language Q&A.

Input: 0.0001 | Output: 0.00292

Kimi-k2-thinking

Moonshot Kimi models designed for long-context processing and Chinese-language Q&A.

Input: 0.00014 | Output: 0.00222

Kimi-k2-thinking-turbo

Moonshot Kimi models designed for long-context processing and Chinese-language Q&A.

Input: 0.00014 | Output: 0.00806

Kimi-k2-turbo-preview

Moonshot Kimi models designed for long-context processing and Chinese-language Q&A.

Input: 0.00014 | Output: 0.00806

Qwen3.5-Flash

Alibaba Cloud Qwen models optimized for general chat and Chinese language scenarios.

Input: 0.00004 | Output: 0.00008

Qwen3.5-Plus

Alibaba Cloud Qwen models optimized for general chat and Chinese language scenarios.

Input: 0.00012 | Output: 0.00024

Qwen3-Max

Alibaba Cloud Qwen models optimized for general chat and Chinese language scenarios.

Input: 0.0006 | Output: 0.0018
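As a worked example of how the figures above translate into real spend, here is a minimal Python sketch. It assumes the listed prices are USD per 1,000 tokens; the model names, token counts, and call volume are illustrative, not measurements.

```python
# Hypothetical cost estimate using two entries from the list above,
# assuming prices are USD per 1,000 tokens.
PRICES = {
    "gpt-4o-mini": {"input": 0.00015, "output": 0.0006},
    "deepseek-chat": {"input": 0.00014, "output": 0.00028},
}

def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of a single API call in USD."""
    p = PRICES[model]
    return (input_tokens / 1000) * p["input"] + (output_tokens / 1000) * p["output"]

# Example workload: 10,000 calls/month, ~1,200 input and ~400 output tokens each.
monthly = 10_000 * call_cost("gpt-4o-mini", 1200, 400)
print(f"${monthly:.2f}/month")  # prints $4.20/month
```

The same function applied to DeepSeek-chat prices shows how quickly cheaper per-token rates compound at volume.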

Guides and comparisons

Claude Cost Per Token (What to Track)

A clear way to estimate Claude costs, including input vs output tokens and workflow call volume.

DeepSeek API Pricing (Cost Drivers)

How to estimate DeepSeek costs and compare value across tasks, especially when you optimize prompts and reduce retries.

GPT-4 vs Claude Cost: How to Decide

A decision framework for choosing between GPT-4 and Claude based on real workflow token usage.

Kimi API Pricing (What to Track)

Understand Kimi pricing with input/output tokens and workflow call volume so you can control spend.

OpenAI Cost Per Token (Practical Guide)

Understand OpenAI cost drivers and how to estimate cost per token for input and output.

Qwen API Cost (Cost Per Token & Practical Estimate)

Estimate Qwen API cost using input/output tokens and real workflow call volume, then optimize where waste hides.
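The guides above share one core move: compare models on your own workload, not on headline rates. A small sketch of that comparison, using two prices from the tables above (assumed USD per 1,000 tokens) and illustrative average token counts:

```python
# Sketch: compare the cost of 1,000 calls across two models for the
# same workload. Prices are taken from the lists above and assumed to
# be USD per 1,000 tokens; avg_in/avg_out are hypothetical averages.
def cost_per_1k_calls(input_price: float, output_price: float,
                      avg_in: int = 1500, avg_out: int = 500) -> float:
    per_call = (avg_in / 1000) * input_price + (avg_out / 1000) * output_price
    return 1000 * per_call

claude_sonnet = cost_per_1k_calls(0.003, 0.015)  # Claude Sonnet 4.5
gpt_41 = cost_per_1k_calls(0.002, 0.008)         # GPT-4.1
print(f"Claude Sonnet 4.5: ${claude_sonnet:.2f} per 1,000 calls")
print(f"GPT-4.1:           ${gpt_41:.2f} per 1,000 calls")
```

Because output tokens cost several times more than input tokens on most models, trimming verbose outputs usually moves the bill more than trimming prompts.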