Hermes Agent Ships Tool Search for MCP: Anthropic Evals Show 49% to 74% Accuracy Gain in Opus 4

Nous Research's open-source Hermes Agent now ships with a Tool Search feature. It directly addresses a growing bottleneck in AI agent systems: too many MCP tools filling up the context window. In this introductory article, we'll cover what Tool Search does, how it works, and when to use it.
The Problem: MCP Tools Are Eating Your Context Window
When you connect multiple MCP (Model Context Protocol) servers to an AI agent, the JSON schema of every tool is sent to the model on every turn. This happens even if the model only needs one or two tools for a particular task.
Real-world deployments feel this immediately. A Hermes deployment with five MCP servers and 34 tools shows an average prompt size of 45,000 tokens per turn. About 22,000 of those tokens (roughly 50%) are tool-schema overhead alone.
Anthropic's own engineering data shows tool definitions can consume 134,000 tokens before optimization. Tool Attention measures the “MCP Tool Tax” at 15,000–60,000 tokens per turn in typical multi-server use.
This creates two different problems:
- Cost: Cache misses at the start of a session can cost $0.07–$0.10 per turn.
- Loss of accuracy: Decision paralysis sets in when the model sees hundreds of irrelevant tool options at once.
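The overhead share quoted above is simple arithmetic, and it can be useful to estimate it for your own tool list. The following is a minimal sketch, not part of Hermes: the function names are hypothetical, and the 4-characters-per-token ratio is a rough heuristic standing in for a real tokenizer.

```python
import json

# Assumption: ~4 characters per token is a rough heuristic, not a real tokenizer.
CHARS_PER_TOKEN = 4


def estimate_schema_tokens(tool_schemas):
    """Approximate the tokens consumed by the serialized tool schemas."""
    serialized = json.dumps(tool_schemas, separators=(",", ":"))
    return len(serialized) // CHARS_PER_TOKEN


def overhead_share(schema_tokens, total_tokens):
    """Fraction of a turn's prompt that is spent on tool schemas."""
    if total_tokens <= 0:
        raise ValueError("total_tokens must be positive")
    return schema_tokens / total_tokens


# Figures quoted in the article: ~22,000 schema tokens in a ~45,000-token turn.
share = overhead_share(22_000, 45_000)
print(f"{share:.1%}")  # 48.9%
```

For precise numbers, swap the heuristic for the tokenizer of the model you actually run.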

Tool Search is Hermes Agent's progressive disclosure layer for MCP and non-core plugin tools. Instead of loading every tool schema upfront, the model loads only what it needs, on demand, per turn.
When Tool Search is active, MCP and plugin tools are replaced in the list of visible tools by three bridge tools:
- tool_search(query, limit?): search the deferred-tool catalog
- tool_describe(name): load the full schema for one tool
- tool_call(name, arguments): invoke a deferred tool
A typical interaction looks like this:
Model: tool_search("create a github issue")
→ { matches: [{ name: "mcp_github_create_issue", ... }] }
Model: tool_describe("mcp_github_create_issue")
→ { parameters: { type: "object", properties: { ... } } }
Model: tool_call("mcp_github_create_issue", { title: "...", body: "..." })
→ { ok: true, issue_number: 42 }
The model searches for what it needs, loads the schema, and calls the tool. All hooks, guardrails, and approval prompts run against the original tool's name, not against the bridge.
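To make the search, describe, call sequence concrete, here is a minimal in-memory sketch of such a bridge. It is an illustration under stated assumptions, not Hermes's implementation: the `DeferredToolBridge` class, its `register` method, and the naive keyword-overlap ranking are all hypothetical, and the handler is a stub that returns a fixed result.

```python
from typing import Any, Callable, Dict


class DeferredToolBridge:
    """Hypothetical in-memory stand-in for a deferred-tool catalog."""

    def __init__(self) -> None:
        self._tools: Dict[str, Dict[str, Any]] = {}

    def register(self, name: str, description: str,
                 parameters: Dict[str, Any], handler: Callable[..., Any]) -> None:
        self._tools[name] = {
            "description": description,
            "parameters": parameters,
            "handler": handler,
        }

    def tool_search(self, query: str, limit: int = 5) -> Dict[str, Any]:
        # Naive keyword overlap, used only to keep the sketch short.
        words = set(query.lower().split())
        scored = []
        for name, tool in self._tools.items():
            haystack = f"{name.replace('_', ' ')} {tool['description']}".lower()
            overlap = len(words & set(haystack.split()))
            if overlap:
                scored.append((overlap, name))
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        return {"matches": [
            {"name": n, "description": self._tools[n]["description"]}
            for _, n in scored[:limit]
        ]}

    def tool_describe(self, name: str) -> Dict[str, Any]:
        # Only now does the full schema enter the model's context.
        return {"parameters": self._tools[name]["parameters"]}

    def tool_call(self, name: str, arguments: Dict[str, Any]) -> Any:
        # Hooks and approvals would be keyed on `name`, the original tool name.
        return self._tools[name]["handler"](**arguments)


bridge = DeferredToolBridge()
bridge.register(
    "mcp_github_create_issue",
    "Create a GitHub issue",
    {"type": "object",
     "properties": {"title": {"type": "string"}, "body": {"type": "string"}}},
    lambda title, body: {"ok": True, "issue_number": 42},  # stub handler
)

matches = bridge.tool_search("create a github issue")["matches"]
schema = bridge.tool_describe(matches[0]["name"])
result = bridge.tool_call(matches[0]["name"], {"title": "Bug", "body": "Details"})
print(result)
```

The point of the design is that the model's visible tool list stays at three entries no matter how many tools are registered behind the bridge.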
Accuracy Numbers
This is not just a token-saving feature. Tool Search also improves the model's accuracy on MCP evaluations.
According to Anthropic's internal MCP testing:
- Claude Opus 4: accuracy improved from 49% → 74% with Tool Search enabled
- Claude Opus 4.5: accuracy improved from 79.5% → 88.1% with Tool Search enabled
Large tool catalogs create “decision paralysis”: the model struggles to choose among many irrelevant options. Removing those options from the context window reduces false positives. Anthropic's data also shows an 85% reduction in tool-definition token usage while maintaining access to the full tool library.
Retrieval Method: BM25 + Fallback
Under the hood, Hermes uses BM25, a classic information retrieval algorithm, to match the model's query against a catalog of tool names, descriptions, and parameter names.
If BM25 returns no hits with a positive score, the system falls back to plain substring matching on the tool name. This protects against degenerate zero-IDF scenarios, such as searching for "github" in a catalog where every tool name contains “github.”
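The zero-IDF case is easier to see in code. Below is a minimal sketch of BM25 ranking with a substring fallback. Hermes's actual tokenizer, BM25 variant, and parameters are not stated in this article, so everything here is an assumption: the `tokenize` and `bm25_search` helpers are hypothetical, `k1=1.5` and `b=0.75` are common textbook defaults, and the IDF is the classic form clamped at zero, which means a term found in at least half of the catalog contributes nothing.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on anything that is not a letter or digit."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_search(query, catalog, limit=5, k1=1.5, b=0.75):
    """Rank tools by BM25; fall back to substring match on the tool name.

    catalog: list of dicts with 'name', 'description', and 'params' (list of str).
    """
    if not catalog:
        return []
    docs = [tokenize(" ".join([t["name"], t["description"], *t["params"]]))
            for t in catalog]
    n_docs = len(docs)
    avg_len = (sum(len(d) for d in docs) / n_docs) or 1.0
    doc_freq = Counter(term for d in docs for term in set(d))

    scored = []
    for tool, doc in zip(catalog, docs):
        tf = Counter(doc)
        score = 0.0
        for term in tokenize(query):
            n = doc_freq.get(term, 0)
            if n == 0:
                continue
            # Classic BM25 IDF, clamped so very common terms score zero.
            idf = max(0.0, math.log((n_docs - n + 0.5) / (n + 0.5)))
            denom = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / denom
        if score > 0:
            scored.append((score, tool["name"]))

    if scored:
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        return [name for _, name in scored[:limit]]

    # Fallback: no positive BM25 score, so match the raw query against names.
    needle = query.lower().strip()
    return [t["name"] for t in catalog
            if needle and needle in t["name"].lower()][:limit]


catalog = [
    {"name": "mcp_github_create_issue", "description": "Create a new issue",
     "params": ["title", "body"]},
    {"name": "mcp_github_list_prs", "description": "List open pull requests",
     "params": ["state"]},
    {"name": "mcp_github_merge_pr", "description": "Merge a pull request",
     "params": ["number"]},
]

print(bm25_search("create issue", catalog))  # BM25 path
print(bm25_search("github", catalog))        # zero-IDF, substring fallback
```

In the second query every document contains "github", so its IDF is clamped to zero and no tool gets a positive score. Without the fallback, the search would return nothing even though every tool name matches.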
The catalog is stateless across turns. It is rebuilt from the current tool list on every assembly. This prevents drift bugs where a stored catalog falls out of sync with live tool registrations.
By default, Tool Search runs in auto mode. It only activates if the deferrable tool schemas would use at least 10% of the active model's context window.
Below that threshold, tool-array assembly is a clean pass-through. You pay no overhead.
This decision is re-evaluated each time the tool list is assembled:
- A session with few MCP tools and a long-context model may never activate Tool Search.
- A session with multiple MCP servers attached (typically 15+ tools) activates it.
- Removing servers in the middle of a session correctly reverts to direct tool exposure on the next assembly.
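The activation rule described above reduces to a single comparison. The sketch below is an assumption-laden illustration, not Hermes code: `should_activate_tool_search` is a hypothetical name, a positive token count is used as a proxy for "at least one tool that can be deferred", and the context window sizes in the examples are made-up round numbers.

```python
def should_activate_tool_search(deferrable_schema_tokens, context_window_tokens,
                                mode="auto", threshold_pct=10):
    """Decide whether to swap deferred tools for the three bridge tools."""
    if mode == "off":
        return False
    if mode == "on":
        # Proxy: any schema tokens means at least one tool can be deferred.
        return deferrable_schema_tokens > 0
    if mode != "auto":
        raise ValueError(f"unknown mode: {mode!r}")
    if context_window_tokens <= 0:
        raise ValueError("context_window_tokens must be positive")
    # Integer math avoids float rounding right at the threshold.
    return deferrable_schema_tokens * 100 >= threshold_pct * context_window_tokens


# 22,000 schema tokens against a hypothetical 200,000-token window: 11%.
print(should_activate_tool_search(22_000, 200_000))    # True
# The same tools against a hypothetical 1,000,000-token window: 2.2%.
print(should_activate_tool_search(22_000, 1_000_000))  # False
```

Because the function takes the current token counts as inputs and keeps no state, calling it on every assembly naturally handles servers being added or removed mid-session.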
Configuration reference
Add this to your hermes.yaml to control behavior:
tools:
  tool_search:
    enabled: auto            # auto (default), on, or off
    threshold_pct: 10        # % of context at which auto mode kicks in
    search_default_limit: 5
    max_search_limit: 20
| Key | Default | Description |
|---|---|---|
| enabled | auto | auto activates above the threshold; on is always active if there is at least one deferrable tool; off disables it completely |
| threshold_pct | 10 | The percentage of the context length at which auto kicks in. Range: 0–100 |
| search_default_limit | 5 | Hits returned when the model calls tool_search without a limit |
| max_search_limit | 20 | Upper bound the model can request with limit. Range: 1–50 |
You can also use a simple boolean shorthand:
tools:
  tool_search: true   # equivalent to {enabled: auto}
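One way to read the shorthand and the limits together is as a normalization step applied to the parsed config. This is a sketch under assumptions, not Hermes's loader: `normalize_tool_search` and `resolve_limit` are hypothetical helpers, treating `false` as `off` is a guess (the article only documents `true`), and clamping an oversized `limit` rather than rejecting it is also a guess.

```python
DEFAULTS = {
    "enabled": "auto",
    "threshold_pct": 10,
    "search_default_limit": 5,
    "max_search_limit": 20,
}


def normalize_tool_search(value):
    """Expand the tool_search config value into a full settings dict."""
    if value is None or value is True:
        return dict(DEFAULTS)  # documented: true means {enabled: auto}
    if value is False:
        return {**DEFAULTS, "enabled": "off"}  # assumption, not documented
    if not isinstance(value, dict):
        raise TypeError("tool_search must be a boolean or a mapping")

    cfg = {**DEFAULTS, **value}
    if cfg["enabled"] not in ("auto", "on", "off"):
        raise ValueError("enabled must be auto, on, or off")
    if not 0 <= cfg["threshold_pct"] <= 100:
        raise ValueError("threshold_pct must be in 0-100")
    if not 1 <= cfg["max_search_limit"] <= 50:
        raise ValueError("max_search_limit must be in 1-50")
    return cfg


def resolve_limit(cfg, requested=None):
    """Pick the result count for one tool_search call."""
    limit = cfg["search_default_limit"] if requested is None else requested
    # Assumption: out-of-range requests are clamped rather than rejected.
    return max(1, min(limit, cfg["max_search_limit"]))


cfg = normalize_tool_search(True)
print(cfg["enabled"])           # auto
print(resolve_limit(cfg))       # 5
print(resolve_limit(cfg, 100))  # 20
```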
Key Takeaways
- Tool Search defers MCP tool schemas until the model actually needs them, using a tool_search / tool_describe / tool_call bridge.
- Anthropic testing shows accuracy gains from 49% → 74% on Claude Opus 4 with large tool catalogs.
- BM25 retrieval over tool name + description + parameter names powers the search, with a substring fallback for zero-IDF edge cases.
- The auto (default) mode is self-configuring: it only activates if tool schemas reach 10% of the context window.
- Core Hermes tools are never deferred; only MCP and non-core plugin tools are eligible.
Check out the Hermes Agent Tool Search documentation and Anthropic's Advanced Tool Use post.



