Anthropic launches new Claude models for reasoning, AI agents

Artificial Intelligence AI Assistant Apps - ChatGPT, Google Gemini, Anthropic Claude

Anthropic unveiled its next generation of Claude models called Claude Opus 4 and Claude Sonnet 4 for coding, advanced reasoning, and AI agents.

Anthropic — which is backed by Amazon (NASDAQ:AMZN) and Alphabet’s (NASDAQ:GOOGL) (NASDAQ:GOOG) Google — said Claude Opus 4 is the world’s best coding model, with sustained performance on complex, long-running tasks and agent workflows.

The company added that Claude Sonnet 4 is a significant upgrade to Claude Sonnet 3.7, delivering superior coding and reasoning while responding more precisely to instructions.

Besides the models, the company also announced other updates. Extended thinking with tool use (beta): both models can use tools, such as web search, during extended thinking, allowing Claude to alternate between reasoning and tool use to improve responses.

Both models can use tools in parallel, follow instructions more precisely, and, when given access to local files by developers, show improved memory capabilities, according to the company.

Anthropic added that Claude Code is now generally available. In addition, the company is releasing four new capabilities on the Anthropic API that enable developers to build more powerful AI agents: the code execution tool, MCP connector, Files API, and the ability to cache prompts for up to one hour.

Anthropic noted that pricing remains consistent with previous Opus and Sonnet models: Opus 4 at $15/$75 per million tokens (input/output) and Sonnet 4 at $3/$15.

Both models are available on the Anthropic API, Amazon Bedrock, and Google Cloud’s Vertex AI, as per the company.

Anthropic said Claude Opus 4 is its most powerful model yet and the best coding model in the world, leading on SWE-bench (72.5%) and Terminal-bench (43.2%). SWE-bench is a benchmark for performance on real software engineering task.

The company noted that it has significantly reduced behavior where the models use shortcuts or loopholes to complete tasks. Both new models are 65% less likely to engage in this behavior than Sonnet 3.7 on agentic tasks that are mainly susceptible to shortcuts and loopholes.

Claude Opus 4 also outperforms all previous models on memory capabilities, as per Anthropic.

Anthropic competes with several heavyweights in the AI space, including Google’s Gemini and Microsoft (MSFT)-backed OpenAI’s GPT-4.

Earlier this week, Alphabet’s CEO Sundar Pichai unveiled several AI updates for Gemini at its annual developer conference Google I/O and noted that Gemini 2.5 Pro beats out every competitor in the metrics measured by LMArena. Last month, OpenAI launched GPT-4.5, its largest and most conversational model to date, as it was trained across multiple data centers. Meta Platforms (META) has its latest Llama 4 family of open-source large language models, or LLMs.

Chinese companies are not far behind, with Baidu showcasing its AI models Ernie 4.5 Turbo and new reasoning model called Ernie X1 Turbo last month. Alibaba also has AI models, with the latest being Qwen 3, which is an enhanced AI model featuring hybrid reasoning capabilities. China’s DeepSeek (DEEPSEEK), which rattled the U.S. tech sector with its low cost AI models, had introduced its updated V3 model in March.

Anthropic launches new Claude models for reasoning, AI agents

More on Alphabet and Amazon

Leave a Reply Cancel reply