We support almost all mainstream large language models (LLMs). Users need to provide or purchase their own LLM API key. To help you decide which LLM suits you best and which is most cost-effective, this page offers some reference information.
Before comparing prices, we first need to understand what a token is. In artificial intelligence and natural language processing, a token is the basic unit of text after segmentation. A token does not correspond to a fixed number of English words: common short words such as "the" and "and" are each a single token, while a longer word such as "hesitation" may also be a single token or may be split into several, depending on the model's tokenizer. As a rough estimate, one English token corresponds to 3–5 letters on average.
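The rough estimate above can be turned into a quick calculation. This is only a sketch: the function name `estimate_tokens` is hypothetical, and the default of 4 characters per token is an assumption taken from the middle of the 3–5 letter range. Real tokenizers differ by model, so treat the result as a ballpark figure, not an exact count.

```python
import math

# Assumption: about 4 characters per token, the midpoint of the 3-5 letter range.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str, chars_per_token: float = CHARS_PER_TOKEN) -> int:
    """Return a rough token estimate for English text (not an exact count)."""
    if not text:
        return 0
    # Round up so that even a very short text counts as at least one token.
    return math.ceil(len(text) / chars_per_token)


if __name__ == "__main__":
    sample = "Which LLM is the most suitable for you?"
    print(f"{len(sample)} characters, roughly {estimate_tokens(sample)} tokens")
```

For billing purposes, the token counts reported by the vendor's API are the authoritative numbers.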
We selected one model from each of the major model vendors. The selection weighs intelligence, cost, and speed, and picks four products that are in roughly the same class. We hope this comparison serves as a reference when you choose a model.
The four models compared are as follows.
Gemini 2.0 Flash
Input: $0.10/million tokens; output: $0.40/million tokens. (Note: its lite version, Gemini 2.0 Flash-Lite, is priced as low as $0.075/million input tokens.)
Industry-leading multimodal processing and a context window of up to 1 million tokens at a low price; suitable for multimedia content generation, data analysis, and similar scenarios.
The best overall price/performance ratio, especially for budget-sensitive users who need multimodal capabilities.
DeepSeek R1
Input: $0.55/million tokens (latest data); output: $2.19/million tokens.
Open-source model with low training cost and high inference efficiency; suitable for lightweight tasks (e.g., code generation, mathematical reasoning).
Cost-effective for Chinese-language scenarios, but its output cost is still higher than Gemini's.
ChatGPT 4o
Input: $5/million tokens; output: $20/million tokens. (Note: its lite version, GPT-4o mini, costs as little as $0.60 for input and $2.40 for output.)
Top-tier language understanding and reasoning; supports applications across many domains.
Best price/performance ratio among traditional high-end models, but its output cost is significantly higher than that of Gemini and DeepSeek.
Claude 3.7 Sonnet
Input: $3/million tokens; output: $15/million tokens.
Stable performance in specialized areas (e.g., programming, complex reasoning), with a context window of 200K tokens.
Suitable for professional scenarios, but the price is less competitive. It is priced at the same level as its predecessor, Claude 3.5 Sonnet.
Considering price, performance, and use-case fit, Gemini 2.0 Flash currently offers the best price/performance ratio of the four, thanks to its multimodal capabilities and very low cost. To reduce costs further, consider its Lite version or the open-source ecosystem around DeepSeek R1.
Model | Input Price (USD/Million Tokens) | Output Price (USD/Million Tokens) | Value-for-Money Highlights |
---|---|---|---|
Gemini 2.0 Flash | 0.1 | 0.4 | Multimodal with a large context window; lowest price of the four |
DeepSeek R1 | 0.55 | 2.19 | Open source and low cost; strong Chinese-language optimization |
ChatGPT 4o | 5 | 20 | Top performance; the lite version (GPT-4o mini) offers better value |
Claude 3.7 Sonnet | 3 | 15 | Stable professional reasoning; expanded context window |
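To make the comparison concrete, the cost of a workload can be worked out from the per-million-token prices. The following is a minimal sketch: the function names and the example workload (2 million input tokens, 500,000 output tokens) are hypothetical, and the prices are copied from the table above, so they may be out of date.

```python
# Prices in USD per million tokens as (input, output), copied from the table above.
PRICES = {
    "Gemini 2.0 Flash": (0.1, 0.4),
    "DeepSeek R1": (0.55, 2.19),
    "ChatGPT 4o": (5.0, 20.0),
    "Claude 3.7 Sonnet": (3.0, 15.0),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the cost in USD for the given token counts."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def rank_by_cost(input_tokens: int, output_tokens: int) -> list[tuple[str, float]]:
    """Return (model, cost) pairs sorted from cheapest to most expensive."""
    costs = {m: request_cost(m, input_tokens, output_tokens) for m in PRICES}
    return sorted(costs.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Hypothetical workload: 2M input tokens and 500K output tokens.
    for model, cost in rank_by_cost(2_000_000, 500_000):
        print(f"{model}: ${cost:.3f}")
```

Because output tokens cost several times more than input tokens for every model listed, workloads that generate long responses shift the ranking more than workloads that mostly read long prompts.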
Here is a price table for more models.
Model | Input Price (USD/Million Tokens) | Output Price (USD/Million Tokens) | Comments |
---|---|---|---|
Gemini 2.0 Flash Lite | 0.075 | 0.30 | Gemini's smallest and most cost-effective model, built for at-scale usage |
Gemini 2.0 Flash | 0.1 | 0.4 | Gemini's balanced multimodal model with great performance across all tasks |
Gemini 2.5 Flash Lite | 0.10 | 0.40 | Gemini's smaller and more cost-effective model, built for at-scale usage |
Gemini 2.5 Flash | 0.30 | 2.50 | Gemini's first hybrid reasoning model, with a 1M-token context window and thinking budgets |
Gemini 2.5 Pro | 1.25 or 2.50 | 10.00 or 15.00 | Gemini's state-of-the-art multipurpose model, which excels at coding and complex reasoning tasks; the higher price applies to prompts over 200K tokens |
DeepSeek V3 0324 | 0.27 | 1.10 | Open source and even lower cost; strong Chinese-language optimization |
DeepSeek R1 0528 | 0.55 | 2.19 | Open source and low cost; strong Chinese-language optimization |
ChatGPT 4o mini | 0.60 | 2.4 | Lite version; build low-latency, multimodal experiences including speech-to-speech |
ChatGPT 4o | 5 | 20 | Build low-latency, multimodal experiences including speech-to-speech. |
ChatGPT 5 nano | 0.05 | 0.4 | OpenAI's fastest, cheapest version of GPT-5; great for summarization and classification tasks |
ChatGPT 5 mini | 0.25 | 2.0 | OpenAI's faster, cheaper version of GPT-5 for well-defined tasks |
ChatGPT 5 | 1.25 | 10 | OpenAI's best model for coding and agentic tasks across industries. |
Claude 3.5 Haiku | 0.8 | 4 | Claude's most cost-effective model |
Claude 3.7 Sonnet | 3 | 15 | Stable professional reasoning; expanded context window |
Claude 4 Sonnet | 3 or 6 | 15 or 22.5 | Optimal balance of intelligence, cost, and speed; the higher price applies to prompts over 200K tokens |
Claude 4.1 Opus | 15 | 75 | Claude's most intelligent model for complex tasks |
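Some rows in the table list two prices (for example "1.25 or 2.50"). One way to handle such rows in a cost estimate is sketched below. It assumes the higher price applies to the whole request once the prompt exceeds a length threshold; the 200,000-token threshold, the constant names, and the function name `tiered_cost` are assumptions for illustration, so check the vendor's pricing page for the actual rule.

```python
# Assumption: the higher price tier applies when the prompt exceeds this many tokens.
LONG_PROMPT_THRESHOLD = 200_000

# USD per million tokens: ((input, output) standard, (input, output) long prompt).
# Values are copied from the two-price rows of the table above.
TIERED_PRICES = {
    "Gemini 2.5 Pro": ((1.25, 10.00), (2.50, 15.00)),
    "Claude 4 Sonnet": ((3.0, 15.0), (6.0, 22.5)),
}


def tiered_cost(
    model: str,
    input_tokens: int,
    output_tokens: int,
    threshold: int = LONG_PROMPT_THRESHOLD,
) -> float:
    """Return the cost in USD, picking the price tier from the prompt length."""
    standard, long_prompt = TIERED_PRICES[model]
    input_price, output_price = long_prompt if input_tokens > threshold else standard
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # Hypothetical requests: one short prompt and one long prompt.
    print(tiered_cost("Gemini 2.5 Pro", 100_000, 10_000))
    print(tiered_cost("Gemini 2.5 Pro", 400_000, 10_000))
```

Under this assumption, keeping prompts below the threshold avoids the higher rate on both input and output tokens.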
Please note that model prices may change at any time. You can check the current prices here:
https://ai.google.dev/gemini-api/docs/pricing
https://openai.com/api/pricing/