AI Cost Save

Model Cost Pages

High-intent pricing pages for users already comparing OpenAI, Claude, DeepSeek, and more.

Live model pricing (USD per 1K tokens)

Claude Haiku 4.5

Anthropic Claude models focused on long-context reasoning and stable enterprise usage.

Input: 0.0008 | Output: 0.004

Claude Opus 4.5

Anthropic Claude models focused on long-context reasoning and stable enterprise usage.

Input: 0.005 | Output: 0.025

Claude Opus 4.6

Anthropic Claude models focused on long-context reasoning and stable enterprise usage.

Input: 0.005 | Output: 0.025

Claude Sonnet 4.5

Anthropic Claude models focused on long-context reasoning and stable enterprise usage.

Input: 0.003 | Output: 0.015

Claude Sonnet 4.6

Anthropic Claude models focused on long-context reasoning and stable enterprise usage.

Input: 0.003 | Output: 0.015

Deepseek-chat

DeepSeek models known for cost-efficient reasoning and coding-focused performance.

Input: 0.00014 | Output: 0.00028

Deepseek-reasoner

DeepSeek models known for cost-efficient reasoning and coding-focused performance.

Input: 0.002 | Output: 0.004

Doubao-lite

General-purpose model suitable for text generation and reasoning in common API workflows.

Input: 0.00004 | Output: 0.00008

Doubao-pro

General-purpose model suitable for text generation and reasoning in common API workflows.

Input: 0.0001 | Output: 0.0003

Gemini 2.5 Flash

Google Gemini models for text, multimodal workloads, and high-throughput inference.

Input: 0.0003 | Output: 0.0025

Gemini 2.5 Flash Lite

Google Gemini models for text, multimodal workloads, and high-throughput inference.

Input: 0.0001 | Output: 0.0004

Gemini 3.1 Flash Image Preview

Google Gemini models for text, multimodal workloads, and high-throughput inference.

Input: 0.0003 | Output: 0.0025

Gemini 3.1 Flash Lite Preview

Google Gemini models for text, multimodal workloads, and high-throughput inference.

Input: 0.00025 | Output: 0.0015

Gemini 3.1 Pro Preview

Google Gemini models for text, multimodal workloads, and high-throughput inference.

Input: 0.002 | Output: 0.01

Gemini 3.1 Pro Preview Custom Tools

Google Gemini models for text, multimodal workloads, and high-throughput inference.

Input: 0.002 | Output: 0.012

Gemini 3 Flash Preview

Google Gemini models for text, multimodal workloads, and high-throughput inference.

Input: 0.0005 | Output: 0.003

GPT-4.1

OpenAI general-purpose text and multimodal models for chat, tools, and content generation.

Input: 0.002 | Output: 0.008

GPT-4.1 mini

OpenAI general-purpose text and multimodal models for chat, tools, and content generation.

Input: 0.0004 | Output: 0.0016

GPT-4.1 nano

OpenAI general-purpose text and multimodal models for chat, tools, and content generation.

Input: 0.0001 | Output: 0.0004

GPT-4o

OpenAI general-purpose text and multimodal models for chat, tools, and content generation.

Input: 0.0025 | Output: 0.01

GPT-4o mini

OpenAI general-purpose text and multimodal models for chat, tools, and content generation.

Input: 0.00015 | Output: 0.0006

GPT-5.2

OpenAI general-purpose text and multimodal models for chat, tools, and content generation.

Input: 0.00175 | Output: 0.014

GPT-5.2-Codex

OpenAI general-purpose text and multimodal models for chat, tools, and content generation.

Input: 0.00175 | Output: 0.014

GPT-5.3 Chat

OpenAI general-purpose text and multimodal models for chat, tools, and content generation.

Input: 0.00175 | Output: 0.014

GPT-5.3-Codex

OpenAI general-purpose text and multimodal models for chat, tools, and content generation.

Input: 0.00175 | Output: 0.014

GPT-5.4

OpenAI general-purpose text and multimodal models for chat, tools, and content generation.

Input: 0.0025 | Output: 0.015

GPT-5.4 Pro

OpenAI general-purpose text and multimodal models for chat, tools, and content generation.

Input: 0.03 | Output: 0.18

GPT-5 Nano

OpenAI general-purpose text and multimodal models for chat, tools, and content generation.

Input: 0.00005 | Output: 0.0004

Kimi-k2-0711-preview

Moonshot Kimi models designed for long-context processing and Chinese-language Q&A.

Input: 0.00014 | Output: 0.00222

Kimi-k2-0905-preview

Moonshot Kimi models designed for long-context processing and Chinese-language Q&A.

Input: 0.00014 | Output: 0.00222

Kimi-k2.5

Moonshot Kimi models designed for long-context processing and Chinese-language Q&A.

Input: 0.0001 | Output: 0.00292

Kimi-k2-thinking

Moonshot Kimi models designed for long-context processing and Chinese-language Q&A.

Input: 0.00014 | Output: 0.00222

Kimi-k2-thinking-turbo

Moonshot Kimi models designed for long-context processing and Chinese-language Q&A.

Input: 0.00014 | Output: 0.00806

kimi-k2-turbo-preview

Moonshot Kimi models designed for long-context processing and Chinese-language Q&A.

Input: 0.00014 | Output: 0.00806

Qwen3.5-Flash

Alibaba Cloud Qwen models optimized for general chat and Chinese language scenarios.

Input: 0.00004 | Output: 0.00008

Qwen3.5-Plus

Alibaba Cloud Qwen models optimized for general chat and Chinese language scenarios.

Input: 0.00012 | Output: 0.00024

Qwen3-max

Alibaba Cloud Qwen models optimized for general chat and Chinese language scenarios.

Input: 0.0006 | Output: 0.0018
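The rates above translate directly into a per-request estimate. A minimal sketch, assuming the listed Input/Output figures are USD per 1K tokens (the GPT-4o and GPT-4.1 rates match that interpretation); the token counts used in the example call are illustrative:

```python
# Estimate the cost of one API call from per-1K-token rates.
# Assumption: the "Input"/"Output" figures above are USD per 1K tokens.
PRICES = {
    "gpt-4o-mini": {"input": 0.00015, "output": 0.0006},
    "deepseek-chat": {"input": 0.00014, "output": 0.00028},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for a single API call."""
    p = PRICES[model]
    return (input_tokens / 1000) * p["input"] + (output_tokens / 1000) * p["output"]

# A call with a 2,000-token prompt and a 500-token reply on gpt-4o-mini:
cost = request_cost("gpt-4o-mini", 2000, 500)
print(f"${cost:.6f}")  # 2 * 0.00015 + 0.5 * 0.0006 = $0.000600
```

The same function works for any model on the page once its two rates are added to the table.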

Guides and comparisons

Claude Costs (What You Should Watch)

Estimate Claude costs from input/output tokens plus your workflow's call volume.

DeepSeek API Pricing (Cost Drivers)

How to estimate DeepSeek costs, and why optimizing prompts and reducing retries saves money.

GPT-4 vs Claude Costs: Choosing the Better Value

Decide between GPT-4 and Claude based on your workflow's total billed tokens, not unit price alone.

Kimi API Pricing (What to Watch)

Understand Kimi pricing and control spend by combining input/output tokens with workflow call volume.

OpenAI Costs (Understood per Token)

Understand OpenAI cost in terms of input/output tokens and turn it into an estimable cost framework.

Qwen API Costs (Per-Token Estimation and a Practical Framework)

Estimate Qwen API costs from input/output tokens and real workflow call volume, and surface hidden waste.
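The comparison approach the guides describe (total billed tokens for your workflow, not unit price) can be sketched as follows. The call volume and per-call token counts are hypothetical assumptions for illustration; the rates are taken from the GPT-4o and Claude Sonnet 4.5 entries above, assumed to be USD per 1K tokens:

```python
# Compare monthly spend for one workflow across two models.
# Assumption: rates are USD per 1K tokens; the workflow volumes are hypothetical.

def monthly_cost(calls_per_month: int,
                 in_tokens_per_call: int, out_tokens_per_call: int,
                 price_in_per_1k: float, price_out_per_1k: float) -> float:
    """Total USD per month for one workflow on one model."""
    total_in = calls_per_month * in_tokens_per_call
    total_out = calls_per_month * out_tokens_per_call
    return (total_in / 1000) * price_in_per_1k + (total_out / 1000) * price_out_per_1k

# Hypothetical workflow: 100k calls/month, 1,200 input + 300 output tokens each.
gpt4o  = monthly_cost(100_000, 1200, 300, 0.0025, 0.01)   # GPT-4o rates above
sonnet = monthly_cost(100_000, 1200, 300, 0.003, 0.015)   # Claude Sonnet 4.5 rates above
print(gpt4o, sonnet)  # → 600.0 810.0
```

Because output tokens are priced several times higher than input tokens on most models, a workflow that generates long completions can rank two models differently than their input rates alone would suggest.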