AI Spotlight

Anthropic Announces Upgraded Claude 3.5 Sonnet and New Claude 3.5 Haiku

Anthropic announces an upgraded Claude 3.5 Sonnet and introduces Claude 3.5 Haiku, alongside a public beta of experimental computer-use capabilities.

Hiraku

Oct 22, 2024 • 2 min read

Image from anthropic.com.

Anthropic has announced an upgraded version of Claude 3.5 Sonnet alongside the introduction of a new Claude 3.5 Haiku model, which will be released later this month. These developments reflect Anthropic’s ongoing focus on improving coding performance, AI-driven tool use, and practical applications through AI models. Additionally, the company launched a public beta of computer-use capabilities, allowing developers to test AI-powered workflows that simulate human computer interaction.

Upgraded Claude 3.5 Sonnet

The upgraded Claude 3.5 Sonnet builds on its predecessor with stronger performance across key benchmarks, particularly in software engineering and tool use. For coding tasks, it improved on the SWE-bench Verified score from 33.4% to 49.0%, outperforming specialized coding models like OpenAI’s o1-preview. In TAU-bench evaluations—which assess tool use in complex workflows—it achieved significant gains, reaching 69.2% in retail and 46.0% in airline-related tasks.

These enhancements are already being explored by companies like GitLab and Replit, who reported stronger performance in coding assistance and multi-step automation. GitLab found the model’s reasoning improved by up to 10% without added latency, making it well-suited for DevSecOps and planning tasks. The Browser Company also noted Claude’s superiority in automating complex web workflows.

Claude 3.5 Haiku

Anthropic also announced Claude 3.5 Haiku, the next evolution of their fastest model. While matching the speed and affordability of the previous Claude 3 Haiku, it outperforms the larger Claude 3 Opus model on multiple benchmarks. This makes it a practical option for user-facing applications, sub-agent tasks, and personalized product development. Claude 3.5 Haiku scored 40.6% on SWE-bench Verified, showing that it excels in coding-related tasks and maintains low-latency response times.

Haiku will initially be launched as a text-only model, with image input capabilities planned for future updates. This ensures developers can start integrating it into specialized workflows and automation systems as soon as it becomes available later this month.

Public Beta of Computer-Use Capability

Anthropic has also introduced computer-use functionality as a public beta, available today via its API, Amazon Bedrock, and Google Cloud's Vertex AI. This experimental capability allows Claude to simulate human interactions with computer interfaces, such as clicking, typing, and navigating software tools. The feature aims to support automation of repetitive processes and software testing, and companies like Replit are already integrating it into their platforms to evaluate apps during development.

Although the feature remains in its early stages, with limitations in areas like scrolling and zooming, Anthropic expects it to improve through developer feedback. The company has taken a proactive approach to safety, deploying classifiers to monitor potential misuse and prevent risks like fraud or misinformation.

Conclusion

With the Claude 3.5 Sonnet upgrade, the upcoming Haiku release, and new computer-use functionality, Anthropic continues to refine its AI offerings toward practical, real-world applications. These developments emphasize reliability and performance in both coding and automation tasks, ensuring that developers have access to tools that balance speed, cost-efficiency, and safety.

Upgraded Claude 3.5 Sonnet

Claude 3.5 Haiku

Public Beta of Computer-Use Capability

Conclusion

Get more straight to your inbox!