Why Choose Claude, ChatGPT, or Gemini for Your Project?
When choosing a generative AI model, it’s important to select the one that best fits your project’s unique requirements.

Duncan Curtis, SVP of AI product and technology at Sama, compares distinct advantages and considerations for generative AI models like Claude, ChatGPT, and Gemini. In this comprehensive article, he examines factors like performance, budget, and safety to help buyers choose the best model for their project.
As generative AI (GenAI) models continue to proliferate and become available for use, we are past the days when ChatGPT was the be-all and end-all. Companies like Anthropic and Google are making their own model families available, and these launches have been accompanied by claims that Anthropic’s Claude family of models performs better than ChatGPT.
A Checklist To Choose The Right AI
When choosing a model family to work within your business, it can be difficult to sift through all of these claims and determine which model family (and even which specific model within that family) will best suit your needs. Depending on your data type, desired outcomes, and budget, one model may stand out.
To choose the right model to get started with, follow this checklist:
- What modality or modalities will your model need? In other words, do you need just text and image capabilities or capacity for video and sound?
- How big are your input and output data?
- How complex are the tasks you are trying to perform?
- How important is performance versus budget?
- How critical is AI assistant safety to your use case?
- Does your company have an existing arrangement with Microsoft Azure or GCP?
Next, let’s introduce the model families we will examine through these lenses. First is OpenAI’s GPT-4, originally launched in 2023, and its lower-budget alternative, GPT-3.5. We’ll also look at Google’s Gemini series, particularly the Gemini 1.5 series introduced in February 2024, which adds video and sound capabilities. The most recent family is Claude, launched by Anthropic and available in three primary forms (Haiku, Sonnet, and Opus), which differ in performance and cost. In addition, we’ll look briefly at Llama 2, Meta’s open-source model.
One critical specification of any model is its context window, or token limit. A larger limit allows more context in inputs and supports more complex inputs and outputs. With a large enough limit, users can provide a series of novels as context and then query against them in a prompt.
OpenAI’s rough estimate is that one token is about four characters, and we’ll work off that in comparing token limits. To better understand what that looks like, we’ll also convert character counts to approximate word counts. For context, a general adult novel runs from about 50,000 words on the short end to 100,000 words.
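As a rough sketch of these conversions (the characters-per-word figure is an assumption for typical English prose, not something from the vendors’ documentation):

```python
# Rough conversions between tokens, characters, and words, using
# OpenAI's ~4-characters-per-token rule of thumb and an assumed
# average of ~6.5 characters per English word (including spaces).

CHARS_PER_TOKEN = 4    # OpenAI's rough estimate
CHARS_PER_WORD = 6.5   # assumption for typical English prose

def context_window_size(tokens: int) -> dict:
    """Estimate how much text fits in a context window of `tokens` tokens."""
    characters = tokens * CHARS_PER_TOKEN
    words = int(characters / CHARS_PER_WORD)
    return {"tokens": tokens, "characters": characters, "words": words}

# Claude 3 Opus's 200,000-token window:
print(context_window_size(200_000))
# {'tokens': 200000, 'characters': 800000, 'words': 123076}
```

Running this for 200,000 tokens gives roughly 800,000 characters, or a bit over 120,000 words: one to two novels’ worth of text in a single context window.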
Claude 3 Opus’s general production limit is 200,000 tokens (800,000 characters, or over 120,000 words), and it has been tested with inputs of over 1 million tokens. GPT-4 has a much smaller limit of 8,192 tokens, though the Turbo models increase this to 128,000 tokens. GPT-3.5 Turbo models start at 16,385 tokens.
Finally, Gemini 1.5 Pro offers up to 1 million tokens regardless of the use case. But while token limits are important, so is budget, which is often measured in cost per million tokens.
1. Claude’s pricing for its APIs is as follows:
- Opus (highest-performance model): $15 per million tokens input; $75 per million tokens output
- Sonnet (mid-range model): $3 per million tokens input; $15 per million tokens output
- Haiku (budget model): $0.25 per million tokens input; $1.25 per million tokens output
2. GPT pricing, meanwhile, is as follows:
- All GPT-4 Turbo models (as of this writing): $10 per million tokens input; $30 per million tokens output
- GPT-4: $30 per million tokens input; $60 per million tokens output
- GPT-3.5 Turbo: $0.50 per million tokens input; $1.50 per million tokens output
3. Gemini Pro’s pricing is as follows:
Gemini Pro is free for up to 60 queries per minute. A paid API based on usage will soon be made available, but note that it prices per 1,000 characters, not per million tokens, so its rates need converting before they can be compared with the others. Here are the listed rates as of this writing:
- $0.000125 per 1,000 characters input (roughly $0.50 per million tokens at ~4 characters per token)
- $0.0025 per image input
- $0.000375 per 1,000 characters output (roughly $1.50 per million tokens)
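Because the vendors quote prices in different units (dollars per million tokens versus dollars per 1,000 characters), a short script can normalize everything to a per-million-token rate. This is a sketch, not vendor code: it assumes the ~4-characters-per-token estimate above, and the `per_million_tokens_from_chars` and `request_cost` helpers are illustrative names, with rates taken from the lists in this article:

```python
# Normalize API pricing to dollars per million tokens so models
# priced in different units can be compared directly.
# Assumes OpenAI's rough estimate of ~4 characters per token.

CHARS_PER_TOKEN = 4

def per_million_tokens_from_chars(price_per_1k_chars: float) -> float:
    """Convert a per-1,000-characters price to a per-million-tokens price."""
    price_per_char = price_per_1k_chars / 1_000
    return price_per_char * CHARS_PER_TOKEN * 1_000_000

# Gemini Pro's listed per-character rates, converted:
gemini_input = per_million_tokens_from_chars(0.000125)   # ~$0.50 per M tokens
gemini_output = per_million_tokens_from_chars(0.000375)  # ~$1.50 per M tokens

def request_cost(input_tokens: int, output_tokens: int,
                 in_rate: float, out_rate: float) -> float:
    """Cost of one request, given per-million-token input/output rates."""
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# Example: 10,000 tokens in, 2,000 tokens out on Claude 3 Sonnet
# ($3 input / $15 output per million tokens):
print(request_cost(10_000, 2_000, 3, 15))  # 0.06
```

A single 10,000-token prompt with a 2,000-token answer costs about six cents on Sonnet; the same arithmetic against Opus ($15/$75) comes to thirty cents, which is why matching the model tier to the task matters at scale.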
Understanding AI Model Comparison With Use Cases
Finally, there are three more components you may want to consider based on your specific use case: the ability to fine-tune, safety, and cloud providers. If you need to fine-tune a model and really understand what it is doing, GPT-4 or Llama 2 will be your best bet.
However, customizing a highly specific model is a complicated process that should be weighed against the simpler approach of providing necessary information to a large-context model. In terms of safety, Anthropic’s evolving AI Constitution and Responsible Scaling Policy make it the leader with Claude models. Finally, cloud provider availability may also restrict your choices. The GPT family is best suited to Microsoft Azure, and Claude and Gemini are both available on Google’s GCP.
Now, to illustrate these differences, here are sample use cases:
- An assistant to ask questions about potential fantasy football games, with extremely long texts or videos that need to be input or output: Gemini 1.5 Pro is best suited for this complex, long task due to its 1 million-token context window and capacity for video input.
- A dating-focused model to help you break the ice, meaning you’re working on a budget: Claude 3 Haiku is simple and effective, and the token limit is not a concern.
- Creating a downstream AI assistant with specific knowledge, where the model will need to be fine-tuned, such as an agricultural assistant with knowledge of local plants and pests: GPT-4 can be fine-tuned for the task.
- A model that reads doctors’ notes to look for inconsistencies with recommended protocols and treatments: This task doesn’t require the ability to handle long texts, but it does require complexity and high performance rather than budget. The Claude 3 family is likely the best bet.
- If, on the other hand, you need a budget-friendly model that can still handle complex tasks, such as a writing assistant, GPT-4 will be a better fit.
- For models where safety is critical, such as those designed for the physically disabled or other populations with specific needs, Claude again stands out.
- Finally, if you need the deepest understanding of what your model is doing, Llama 2, whose open weights allow custom modification, should be your first choice.
Unsurprisingly, no two models are the same, even though they all fall under the category of generative AI. When deciding on a model, you must consider your specific needs. If possible, experiment with multiple models on your specific use case to better understand which may be the right fit for you. After all, making the wrong choice could hurt either your budget or the performance you need from the model.