LLM Price Comparison

You need an API key to chat with PDFs in PapersGPT when using an online LLM

We support almost all mainstream LLMs (large language models). Users need to provide or buy an LLM API key themselves. The information below is a reference for deciding which LLM suits you best and which is more cost-effective.

What is a token

Before comparing prices, we first need to understand what a token is. In Artificial Intelligence and Natural Language Processing, a Token is the basic unit of text after segmentation. The number of English words in a Token is not fixed. Common short words such as "the" and "and" are each a single Token, while a longer word such as "hesitation" may be a single Token or may be split into several, depending on the tokenizer. As a rough estimate, an English Token corresponds to 3–5 characters on average.
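The rough estimate above can be turned into a quick calculation. The following is a minimal sketch, not a real tokenizer: it assumes about 4 characters per Token (the midpoint of the 3–5 range), and the function name estimate_tokens is hypothetical. Actual counts depend on each vendor's tokenizer.

```python
import math

# Assumption: about 4 characters per token, a rough heuristic for English text.
# Real tokenizers differ by vendor and model, so treat the result as an estimate.
CHARS_PER_TOKEN = 4.0


def estimate_tokens(text: str, chars_per_token: float = CHARS_PER_TOKEN) -> int:
    """Return a rough token estimate based only on character count."""
    if not text:
        return 0
    return math.ceil(len(text) / chars_per_token)


if __name__ == "__main__":
    sample = "Before comparing prices, we first need to understand what a token is."
    print(f"{len(sample)} characters, roughly {estimate_tokens(sample)} tokens")
```

An estimate like this is only good enough for budgeting. For billing-accurate numbers, use the token counts reported by the vendor's API.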

Comparison of popular models from different vendors

We select one model from each of the major model vendors. The selection weighs intelligence, cost, and speed, and picks four products that are in roughly the same class. We hope this comparison serves as a reference when you choose a model.

Price/Performance Ranking

Here we compare four popular AI models to give you a reference.

Gemini 2.0 Flash

Price

Input $0.10/million Tokens, output $0.40/million Tokens (note: its lite version, Gemini 2.0 Flash-Lite, is priced as low as $0.075/million Tokens for input and $0.30/million Tokens for output).

Strengths

Industry-leading multimodal processing and a context window of up to 1 million Tokens at a low price. Suitable for multimedia content generation, data analysis, and similar scenarios.

Conclusion

The best overall price/performance ratio, especially for budget-sensitive users who need multimodal capabilities.

DeepSeek R1

Price

Input $0.55/million Tokens (latest data), output $2.19/million Tokens.

Strengths

Open source model with low training cost and high inference efficiency. Suitable for lightweight tasks (e.g. code generation, mathematical reasoning).

Conclusion

Cost-effective for Chinese scenarios, but the output cost is still higher than Gemini.

ChatGPT 4o

Price

$5/million Tokens for input, $20/million Tokens for output (note: its lite version, GPT-4o mini, is priced as low as $0.60/million Tokens for input and $2.40/million Tokens for output).

Strengths

Top language understanding and reasoning capabilities, supports multi-domain applications.

Conclusion

Best price/performance ratio among traditional high-end models, but significantly higher output costs than Gemini and DeepSeek.

Claude 3.7 Sonnet

Price

$3/million Tokens input, $15/million Tokens output.

Strengths

Stable performance in specialized areas (e.g., programming, complex reasoning), with a context window of 200K Tokens.

Conclusion

Suitable for professional scenarios, but the price is less competitive. It is priced at the same level as its predecessor, Claude 3.5 Sonnet.

Which is more cost-effective

Considering price, performance, and scenario fit, Gemini 2.0 Flash currently offers the best price/performance ratio of the four, thanks to its multimodal capabilities and very low cost. If you want to reduce costs further, look at its Lite version or the open source ecosystem around DeepSeek R1.

Selection advice

Multi-modal and low-cost requirements: Prefer Gemini 2.0 Flash (or its Lite version).
Chinese scenarios and lightweight tasks: DeepSeek R1 is a cost-effective open source choice.
Professional domain depth needs: Claude 3.7 Sonnet is more reliable in programming and mathematical reasoning.
Traditional high-end model transition: ChatGPT 4o mini can replace GPT-3.5 Turbo with more than 60% cost reduction.

Price Comparison Table

Model	Input Price (USD/Million Tokens)	Output Price (USD/Million Tokens)	Value for Money Highlights
Gemini 2.0 Flash	0.10	0.40	Multimodal + large context window, lowest price of the four
DeepSeek R1	0.55	2.19	Open source, low cost, optimized for Chinese
ChatGPT 4o	5.00	20.00	Top performance; the lite version improves price/performance
Claude 3.7 Sonnet	3.00	15.00	Stable professional reasoning, expanded context window
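To see what these per-million prices mean for a single request, the cost is (input Tokens × input price + output Tokens × output price) / 1,000,000. The sketch below applies that arithmetic to the four rows of the table above. The function name request_cost and the example workload (a 20,000-Token PDF excerpt with a 1,000-Token answer) are illustrative assumptions, not measurements.

```python
# Prices in USD per million tokens, copied from the comparison table above.
# Each entry is (input price, output price).
PRICES = {
    "Gemini 2.0 Flash": (0.10, 0.40),
    "DeepSeek R1": (0.55, 2.19),
    "ChatGPT 4o": (5.00, 20.00),
    "Claude 3.7 Sonnet": (3.00, 15.00),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the cost in USD of one request for the given model."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 20,000 input tokens and 1,000 output tokens.
    input_tokens, output_tokens = 20_000, 1_000
    ranked = sorted(PRICES, key=lambda m: request_cost(m, input_tokens, output_tokens))
    for name in ranked:
        print(f"{name}: ${request_cost(name, input_tokens, output_tokens):.4f}")
```

For this workload Gemini 2.0 Flash comes out cheapest at about $0.0024 per request and ChatGPT 4o most expensive at about $0.12. Because input Tokens dominate when chatting with long PDFs, the input price usually matters more than the output price for this use case.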

Prices for more models

Here is a price table for more models.

Model	Input Price (USD/Million Tokens)	Output Price (USD/Million Tokens)	Comments
Gemini 2.0 Flash Lite	0.075	0.30	Gemini's smallest and most cost effective model, built for at scale usage
Gemini 2.0 Flash	0.1	0.4	Gemini's balanced multimodal model with great performance across all tasks
Gemini 2.5 Flash Lite	0.10	0.40	Gemini's smaller and more cost effective model, built for at scale usage
Gemini 2.5 Flash	0.30	2.50	Gemini's first hybrid reasoning model which supports a 1M token context window and has thinking budgets
Gemini 2.5 Pro	1.25 or 2.50	10.00 or 15.00	Gemini's state-of-the-art multipurpose model, which excels at coding and complex reasoning tasks; the higher prices apply to prompts longer than 200K tokens
DeepSeek V3 0324	0.27	1.10	Open source, lower cost, optimized for Chinese
DeepSeek R1 0528	0.55	2.19	Open source, low cost, optimized for Chinese
ChatGPT 4o mini	0.60	2.40	Lite version; build low-latency, multimodal experiences including speech-to-speech.
ChatGPT 4o	5	20	Build low-latency, multimodal experiences including speech-to-speech.
ChatGPT 5 nano	0.05	0.4	OpenAI's fastest, cheapest version of GPT-5—great for summarization and classification tasks
ChatGPT 5 mini	0.25	2.0	OpenAI's faster, cheaper version of GPT-5 for well-defined tasks
ChatGPT 5	1.25	10	OpenAI's best model for coding and agentic tasks across industries.
Claude 3.5 Haiku	0.8	4	Claude's most cost-effective model
Claude 3.7 Sonnet	3	15	Stable professional reasoning, expanded context window
Claude 4 Sonnet	3 or 6	15 or 22.5	Optimal balance of intelligence, cost, and speed; the higher prices apply to prompts longer than 200K tokens
Claude 4.1 Opus	15	75	Claude's most intelligent model for complex tasks
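Rows that list two prices (such as "1.25 or 2.50") are tiered: the rate depends on the size of the request. The following sketch shows one way to model that, assuming the higher input and output rates both apply once the prompt exceeds a threshold. The class name TieredPrice and the 200,000-token threshold are assumptions for illustration; check the vendor pricing pages for the exact rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TieredPrice:
    """Two-tier price in USD per million tokens (hypothetical helper)."""

    input_low: float
    input_high: float
    output_low: float
    output_high: float
    # Assumption: the higher tier starts above this prompt size.
    threshold_tokens: int = 200_000

    def cost(self, input_tokens: int, output_tokens: int) -> float:
        """Return the cost in USD of one request under the tier rule above."""
        long_prompt = input_tokens > self.threshold_tokens
        input_price = self.input_high if long_prompt else self.input_low
        output_price = self.output_high if long_prompt else self.output_low
        return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Prices taken from the Gemini 2.5 Pro row of the table above.
GEMINI_25_PRO = TieredPrice(1.25, 2.50, 10.00, 15.00)

if __name__ == "__main__":
    print(f"100K-token prompt: ${GEMINI_25_PRO.cost(100_000, 1_000):.3f}")
    print(f"300K-token prompt: ${GEMINI_25_PRO.cost(300_000, 1_000):.3f}")
```

Under these assumptions, tripling the prompt from 100K to 300K tokens raises the cost by more than a factor of three, because the whole request moves to the higher tier.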

Please note that model prices may change at any time. You can find the current prices here:

https://ai.google.dev/gemini-api/docs/pricing

https://openai.com/api/pricing/

https://api-docs.deepseek.com/quick_start/pricing

https://www.anthropic.com/pricing#api