
MCP vs direct API calls: 17× token overhead measured
A benchmark shows MCP agents consume 4× to 32× more tokens than direct API calls, including a 17× overhead in a SerpApi search test. At Claude Sonnet pricing, 90,000 overhead tokens add roughly $0.27 to each request.
A benchmark published on June 23, 2026 measured token usage for two tasks, each run once through MCP and once through a direct API call: a SerpApi search and a repo-language check [Dev.to]. The MCP-based agent consumed 6,047 tokens per search call, while a CLI script that called the API directly used 351 tokens, a 17× overhead. The same pattern appeared in the language-check test: the direct API call required 1,365 tokens, whereas the MCP agent burned 44,026 tokens (about 32×), driven by 43 unrelated tool definitions injected into every message [Dev.to].
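The ratios follow directly from the reported token counts. A minimal sketch of the arithmetic, using only the figures quoted above (the dictionary keys and the function name are illustrative, not taken from the benchmark):

```python
# Token counts as reported in the benchmark cited above.
MEASUREMENTS = {
    "serpapi_search": {"mcp": 6_047, "direct": 351},
    "repo_language_check": {"mcp": 44_026, "direct": 1_365},
}


def overhead_ratio(mcp_tokens: int, direct_tokens: int) -> float:
    """Return how many times more tokens the MCP run used than the direct call."""
    return mcp_tokens / direct_tokens


for task, counts in MEASUREMENTS.items():
    ratio = overhead_ratio(counts["mcp"], counts["direct"])
    extra = counts["mcp"] - counts["direct"]
    print(f"{task}: {ratio:.1f}x overhead, {extra:,} extra tokens")
# serpapi_search: 17.2x overhead, 5,696 extra tokens
# repo_language_check: 32.3x overhead, 42,661 extra tokens
```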
The excess comes from schema injection: every registered tool is serialized into the system prompt on each turn, whether or not the tool is used [Dev.to]. On a server with dozens of tools, the overhead can climb to 32× the baseline token count. At Claude Sonnet pricing (the figures imply about $3 per million input tokens), 90,000 overhead tokens translate to roughly $0.27 per request, or $270 per day for a pipeline that runs 1,000 times a day.
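The cost figures can be reproduced in a few lines. This is a sketch, not a pricing reference: the $3 per million input tokens rate is an assumption inferred from the article's own numbers (90,000 tokens costing $0.27), and the function name is made up for illustration.

```python
# Assumed rate, inferred from the article's arithmetic; check current pricing.
PRICE_PER_MILLION_INPUT_TOKENS_USD = 3.00


def overhead_cost_usd(
    overhead_tokens: int,
    price_per_million: float = PRICE_PER_MILLION_INPUT_TOKENS_USD,
) -> float:
    """Dollar cost of input tokens that carry no useful work."""
    return overhead_tokens * price_per_million / 1_000_000


per_request = overhead_cost_usd(90_000)
per_day = per_request * 1_000  # pipeline that runs 1,000 times a day

print(f"per request: ${per_request:.2f}")  # per request: $0.27
print(f"per day:     ${per_day:.2f}")      # per day:     $270.00
```

Swapping in a different rate or run count shows how quickly the overhead scales with volume.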
The overhead shows up in both cost and throughput. A batch pipeline that processes 1,000 items daily would spend $270 a day on unused schema alone, eroding profit margins. High-volume, single-purpose agents also lose throughput: a 25-minute MCP run versus a 50-second direct-API run is a realistic scenario when token overhead multiplies across steps. MCP suits conversational, multi-team environments where governance, audit logging, and rapid prototyping outweigh token waste. In deterministic pipelines, however, the token penalty outweighs these benefits, pushing teams toward lean direct-API integrations.
The data suggests that a hybrid model, with MCP for orchestration and direct API calls for hot-path tool execution, delivers the best of both worlds. Ignoring the token math leads to hidden expenses that quickly balloon [Dev.to].
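One way to picture the hybrid model is a small dispatcher that sends high-volume tools straight to a direct API call and leaves everything else on MCP. The sketch below is purely illustrative: `search_direct`, `call_via_mcp`, `HOT_PATH`, and `dispatch` are hypothetical names with stub bodies, not part of any MCP SDK or of the benchmark.

```python
from typing import Callable, Dict


def search_direct(query: str) -> str:
    """Hypothetical stand-in for a lean direct API call (no tool schemas sent)."""
    return f"direct:{query}"


def call_via_mcp(tool: str, **kwargs: str) -> str:
    """Hypothetical stand-in for invoking a tool through an MCP client."""
    return f"mcp:{tool}"


# Tools called often enough that schema overhead matters go on the hot path.
HOT_PATH: Dict[str, Callable[..., str]] = {"search": search_direct}


def dispatch(tool: str, **kwargs: str) -> str:
    """Route hot-path tools to direct calls; fall back to MCP for the rest."""
    handler = HOT_PATH.get(tool)
    if handler is not None:
        return handler(**kwargs)
    return call_via_mcp(tool, **kwargs)


print(dispatch("search", query="mcp token overhead"))  # direct:mcp token overhead
print(dispatch("audit_log", event="run_started"))      # mcp:audit_log
```

Which tools belong on the hot path is a judgment call; call volume and per-call schema overhead are the obvious inputs.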


