I-Anthropic yethula uClaude Sonnet 4.5 ngemiphumela emisha yokufaka amakhodi nemiphumela ye-Agentic State-of-the-the-the-the-the-art

Anthropic issued Claude Sonnet 4.5 And sets a new Software engineering of the last software and real computer usage. Update also changes the product of the concrete Center Print price remains unchanged from Sonnet 4 ($ 3 input / $ 15 output in one million tokens).
What's new?
- The certified SWE-Bench is record. Anthropic reports 77.2% The accuracy of Swen-Databased Database with the certified Swech using the SCAFFOLD TWO TWO Limited Tools (Bash + Editing the File), Over 10 Runs, None of the Test-Test Refreshes, 200 Budget “to think”. 1m-COONCUS setting 78.2%and higher maximum edits associated with sample and rejects such as rejections suggesting this 82.0%.
- Use the sota. Despite of- OSWORD-GuaranteedSonnet 4.5 leads to 61.4%Top from Sonnet 4's 42.2%, indicates control of the powerful tool and the ui-cheating of the UI of browser / Desktop.
- A long-term independence. The party saw > 30 hours Immediate focus on multiple codes – jumping more than previous restrictions and is directly relevant to agent's reliability.
- Reasoning / Stats. The release notes receive “great benefits” beyond general reasons and statistics; specifically the benchypes of the bench (eg security status is ASL-3 for the strongest protection against the instant injection.

What are agents?
Sonnet 4.5 aims to the Brittle Parts of Real Agents: Expanded planning, memory, and reliable trust tool. Anthropic's Claude Agent SDK Disclosing their production patterns (memory of memory of long-term jobs, allowing, linking sub-alent) rather than just at the end of the LLM. That means groups can reproduce the same used by Claude Code (Now with Checkpoints, renewed terminal, and the combination of vs code) to keep multiple multi-hour multi-hour jobs.
In imported activities Imitates “computer use,” 19-point jumping in Osworld – Verified; Fill in the ability of the navigation model, complete the sprerspesherts, and complete the complete web flows on anthropic demo browser demo. In businesses who examine Agentic Rpap-style work, Osworld high schools often link at low integes to intervene during the killing period.
Where you can drive it?
- Anthropic API and apps. Model ID
claude-sonnet-4-5; Sonnet equality 4. The creation of files and the murder of the code is now available directly to Claude applications for paid Tiers. - AWS Bedrock. Available with a beedrock with the combination guidelines to agentcore; AWS highlights long agents, memory / dynamic, and performance management (recognition, session separation).
- Google Cloud Vertex AI. Kind In Vertex AI for the support of the most boundaries by Adk / Agent, the completion of the provision, analysis tasks token 1m, and quick Capital.
- Gimbub Copilot. Community preview of Copilot Chat (VS code, web, mobile) and Copilot CLI; Organizations may approve of policy, and the BYO key is supported in the VS code.
Summary
By writing down 77.2% SWE-Bench score under obvious problems, a 61.4% OSWORD-Guaranteed To lead computer usage, and effective renewal (test zones, SDK, Copilot / Bedrock / Vertex availability), Claude Sonnet 4.5 It is for Long performance, tools-hard for agent's activity rather than a short memory. Independent repetition will determine how long the applicant is “the best to obtain the codes”, but targets, are independent, the swipe, and computer control) is aligned with real pain points.

Michal Sutter is a Master of Science for Science in Data Science from the University of Padova. On the basis of a solid mathematical, machine-study, and data engineering, Excerels in transforming complex information from effective access.
🔥[Recommended Read] NVIDIA AI Open-Spaces Vipe (Video Video Engine): A Powerful and Powerful Tool to Enter the 3D Reference for 3D for Spatial Ai



