The Model Context Protocol has taken over the AI tooling ecosystem. It's in Claude, it's in Cursor, it's in dozens of IDEs and agent frameworks. MCP is winning. And I'm increasingly worried about a cost almost nobody talks about: how much context window it eats.
Every MCP server you connect exposes tool definitions. Each tool has a name, description, parameter schema, and often examples. A typical MCP server with 10-15 tools adds 2,000-5,000 tokens to your context window just from tool definitions. Connect three or four MCP servers and you've burned 10,000-20,000 tokens before the conversation even starts.
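To make that arithmetic concrete, here's a rough sketch. The tool definition below is hypothetical (not from any real MCP server), and the chars-per-token ratio is a crude heuristic, but the order of magnitude matches what you see in practice:

```python
import json

# Hypothetical MCP-style tool definition. Real servers expose a dozen or
# more of these, each serialized into the model's context.
tool_def = {
    "name": "create_invoice",
    "description": "Create a new invoice for a customer. Requires a customer "
                   "ID and at least one line item. Amounts are in the minor "
                   "unit of the given currency. Returns the created invoice.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "customer_id": {"type": "string", "description": "ID of the customer"},
            "line_items": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "description": {"type": "string"},
                        "quantity": {"type": "integer"},
                        "unit_price": {"type": "number"},
                    },
                },
            },
            "currency": {"type": "string", "description": "ISO 4217 code, e.g. USD"},
        },
        "required": ["customer_id", "line_items"],
    },
}

def rough_tokens(obj) -> int:
    # Crude heuristic: roughly 4 characters per token for English/JSON text.
    return len(json.dumps(obj)) // 4

per_tool = rough_tokens(tool_def)
print(f"~{per_tool} tokens per tool, ~{per_tool * 15} for a 15-tool server")
```

Multiply by three or four connected servers and the totals above fall out directly.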
That's a significant chunk of your context window devoted to telling the model about capabilities it might not even use in this conversation. If you have a 128K context window, 20K tokens of tool definitions is manageable. If you're on a smaller model with 32K context, it's crippling.
Apideck CLI takes a different approach. Instead of exposing full tool schemas in the context, it provides a lightweight command-line interface that the agent can invoke. The tool definitions are minimal - essentially just the command name and a one-line description. The full documentation lives outside the context window, accessible on demand.
The context savings are dramatic. Where a full MCP integration might use 5,000 tokens for a set of API tools, Apideck's approach uses a few hundred. That's an order of magnitude difference.
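You can see where the savings come from by putting the two representations side by side. Both entries below are invented for illustration, not taken from Apideck, but the ratio is representative:

```python
import json

# Full MCP-style schema vs. a one-line CLI-style entry for the same
# hypothetical tool. Both are made up for illustration.
full_schema = {
    "name": "contacts_search",
    "description": "Search CRM contacts by name, email, or company. Supports "
                   "pagination and sorting. Returns up to 100 matches per "
                   "page, ordered by the requested sort field.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "Free-text search query"},
            "page": {"type": "integer", "description": "Page number, 1-based"},
            "sort": {"type": "string", "enum": ["name", "email", "created_at"]},
        },
        "required": ["query"],
    },
}
one_liner = "contacts search   Search CRM contacts (run with --help for details)"

def rough_tokens(text: str) -> int:
    return len(text) // 4  # ~4 chars/token heuristic

full_cost = rough_tokens(json.dumps(full_schema))
min_cost = rough_tokens(one_liner)
print(f"full: ~{full_cost} tokens, minimal: ~{min_cost} tokens")
```

The one-liner deliberately pushes the detail behind `--help`, which the agent only pays for when it actually invokes the tool.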
The tradeoff is obvious: less information in context means the model has less to work with when deciding how to use a tool. It might make more mistakes. It might need to check the documentation more often. But in practice, for well-designed CLIs with intuitive command names, the model figures it out with minimal context.
I think this tradeoff calculation is going to become increasingly important as people connect more tools to their AI assistants. The MCP ecosystem is growing fast. Every SaaS product wants an MCP server. Every database wants an MCP server. Every API wants an MCP server. If you connect all the MCP servers you'd theoretically want, you'd blow your context window on tool definitions alone.
This is a real scaling problem. The current approach of "dump all tool schemas into context" works when you have five tools. It doesn't work when you have fifty. And the direction of the ecosystem is toward fifty.
There are a few possible solutions:
Dynamic tool loading. Only load tool definitions relevant to the current task. This is what some agent frameworks already do, but it requires the framework to predict which tools the agent will need, which is itself an AI problem.
Compressed tool representations. Instead of full JSON schemas with descriptions and examples, use minimal representations that the model can still understand. This is essentially what Apideck does.
Tool documentation retrieval. Keep tool definitions out of context entirely and let the agent retrieve documentation when it needs it. This adds latency but saves context.
Better context windows. The brute-force solution. If context windows keep growing, the tool definition overhead becomes proportionally smaller. But this assumes context window growth outpaces tool proliferation, which I'm not confident about.
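The first three approaches share a common shape: keep one-line summaries in context and fetch detail lazily. Here's a minimal sketch of that idea. The `ToolRegistry` class and its methods are hypothetical, not any real framework's API:

```python
# Sketch of lazy tool disclosure (hypothetical API, not a real framework).
# The context only ever holds one-line summaries; full schemas are
# retrieved on demand when the model decides it needs them.

class ToolRegistry:
    def __init__(self):
        self._tools: dict[str, dict] = {}

    def register(self, name: str, summary: str, schema: dict) -> None:
        self._tools[name] = {"summary": summary, "schema": schema}

    def list_tools(self) -> str:
        # What goes into the system prompt: one line per tool.
        return "\n".join(f"{n}: {t['summary']}" for n, t in self._tools.items())

    def describe(self, name: str) -> dict:
        # Called only when the agent needs a tool's full parameter schema.
        return self._tools[name]["schema"]

registry = ToolRegistry()
registry.register(
    "send_email",
    "Send an email via the connected provider",
    {
        "type": "object",
        "properties": {
            "to": {"type": "string"},
            "subject": {"type": "string"},
            "body": {"type": "string"},
        },
        "required": ["to", "body"],
    },
)
print(registry.list_tools())
```

The hard part, as noted above, isn't the mechanism; it's deciding when the summary alone is enough and when to spend a round trip on `describe`.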
Apideck isn't perfect. The CLI-based approach has its own overhead - spawning processes, parsing output, handling errors through stdout/stderr. And it requires the target API to have a well-designed CLI, which not all do.
But I think it's pointing at a real problem that the MCP community needs to address. Context is a finite resource. Every token spent on tool definitions is a token not available for actual work. As the tool ecosystem grows, this budget pressure will intensify.
The MCP specification should probably include provisions for tiered tool disclosure - minimal descriptions by default, full schemas on request. That would let agents operate efficiently with many tools while still having access to detailed information when they need it.
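To make the proposal concrete, here's what such an exchange might look like. None of this exists in the current MCP spec; the `detail` parameter and `tools/describe` method are hypothetical extensions, sketched here as JSON-RPC-shaped messages:

```python
# Hypothetical message shapes for tiered tool disclosure. The "detail"
# parameter and "tools/describe" method are NOT part of the MCP spec;
# they illustrate a "minimal by default, full on request" exchange.

list_request = {
    "jsonrpc": "2.0", "id": 1,
    "method": "tools/list",
    "params": {"detail": "minimal"},  # hypothetical parameter
}
list_response = {
    "jsonrpc": "2.0", "id": 1,
    "result": {"tools": [
        {"name": "create_invoice", "summary": "Create an invoice"},
        {"name": "list_invoices", "summary": "List invoices with filters"},
    ]},
}
describe_request = {
    "jsonrpc": "2.0", "id": 2,
    "method": "tools/describe",  # hypothetical method
    "params": {"name": "create_invoice"},
}
# The describe response would carry the full inputSchema that tools/list
# omitted, paid for only when the agent actually needs it.
```

An agent operating this way holds two short lines per tool in context instead of a full schema each, and upgrades to detail one tool at a time.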
For now, if you're building agents that connect to many tools and you're hitting context limits, Apideck's approach is worth studying. Context efficiency is going to be a competitive advantage in agent design.