Is MCP dead? No. You're just pointing it at the wrong job.
"MCP is dead."
It's a good headline, and the post behind it is better than the headline. It's not a hot take. It's a measured engineering writeup with numbers, and the numbers are real. So let me start by agreeing with all of them.
The critique is correct
Connect four MCP servers to a coding agent and roughly 10% of your context window is gone before you type anything. Just tool definitions sitting on the desk. One server in that test exposed 42 tools and burned over 12,000 tokens, all loaded up front, whether or not you ever called more than two of them.
The latency is real too. In their benchmark, an extra protocol layer between the model and the underlying API measured around 3x slower per call, and far worse on the first call once you count initialization. If you already live in a terminal, a Bash command against a CLI you use every day costs zero context and runs immediately. MCP can't beat that, and pretending otherwise is how you lose an argument with anyone who has actually shipped an agent.
So Quandri's conclusion is correct. Use a hybrid. Reach for Bash and a CLI first, skills for repeatable workflows, MCP only where it earns its place. I'm not here to tell you that's wrong.
I'm here to tell you what it's an argument about.
"MCP is dead" is an argument about action tools
Every example in the case against MCP is the same shape: an agent that does things. File a Linear ticket. Open a PR. Hit an internal service. High-frequency, deterministic, imperative actions where the model already knows exactly what it wants and the only question is how fast and how cheaply the call goes out.
For that job, the critique lands. If the model knows the exact action, wrapping it in a tool schema and a protocol round-trip is overhead. Writing code that calls the API directly is genuinely a better abstraction. Forty-two always-loaded tools to expose one product's API surface is a design smell. That's bad tool design. It is not a property of the protocol. It's what happens when you treat MCP as "expose my entire REST API as tools" instead of "give the model the few verbs it actually needs."
Point an efficiency lens at imperative action tools and you'll conclude they're inefficient. That's true. It's also not the only thing MCP is for.
The other job: memory, not actions
There's a second use of MCP that none of the takedowns measure, because dev-tooling benchmarks don't go looking for it: retrieval. Not "do this thing," but "what do I already know about this?"
That's the job Hjarni, a knowledge base with a built-in MCP server, does. The model isn't firing off forty imperative actions. It's reaching into your notes to pull the relevant context before it answers, and writing the keepers back when something lands. Generation is solved; finding it again is the problem. A maintained brain the model can read across sessions is the fix.
Look at that job through the same lens the critics used and the conclusions invert:
Tool count. A retrieval server doesn't need 42 tools. It needs a handful of stable verbs that don't change between sessions. Search, read, write, link. Your tools do too much is a product principle here, not just a slogan. The context-bloat problem is a problem of sprawl, and a focused server never has it.
Latency. The 3x-per-call number matters when you're chaining forty deterministic actions and milliseconds compound. It barely registers when the model makes one or two retrieval calls a turn. There the bottleneck is recall quality. Did it find the right note? That's the cost that matters, not protocol overhead. You don't optimize the cheap part of the budget.
Code execution. "Just write code that calls the API" assumes a stable, documented API the model can target. For your private knowledge base, that API is the MCP server. Code mode doesn't delete it.
"Code execution" is still MCP
Here's the part that gets lost in the funeral. The strongest version of "MCP is dead" leans on Anthropic's own code execution with MCP writeup. The idea: you let the model write code against tools instead of calling them one at a time.
Read the title again. Code execution with MCP. The pattern still uses MCP servers underneath. It changes how tools are presented to the model: a code API the model composes against, instead of a flat list it pages through. But the servers, the transport, the auth, the protocol: all still MCP. That's not MCP dying. That's MCP growing a better front end.
And the headline context cost, the thing the whole argument rests on, already has a fix that shipped. Tool Search with deferred loading keeps tool schemas out of context until the model actually needs one, cutting tool-definition tokens by 85% or more. Quandri's own update says so: deferred loading now largely addresses the context-bloat problem they measured. You can see it in current Claude Code builds, where most tools are listed by name first and the full schema loads only when a tool is actually called. The desk stays clear. The bloat argument was real and it's being engineered away in months, not buried.
That's not what a dead protocol looks like. That's what an immature one looks like while it's hardening.
Why we actually bet on MCP (and it isn't tokens)
Here's the honest part, because the house style around here is to concede the real tradeoffs (here's what you give up versus the local Karpathy setup, for the record).
We did not choose MCP because it's the cheapest way to spend context tokens. On a pure single-agent token budget, a CLI can win. We chose it for the one thing a CLI and a bespoke API can't do: be the same interface everywhere.
A CLI ties your knowledge to a terminal. A custom API ties it to glue code that you own and babysit forever, re-implemented per client. MCP is the one interface where the same notes are readable from Claude, from ChatGPT, from Cursor, from the Claude app on your phone. No syncing. No per-client integration work. Capture a thought on your phone, refine it at your desk in a different model, query it from your editor next week. Same brain. One protocol.
That's not a feature you benchmark. That's a standard. And standards don't die because someone clocked a 3x latency delta on action tools. USB was slower than the proprietary port it replaced, too. Portability won anyway, because the value was never the per-call speed. It was that one connector reached everything.
So: dead, or just early?
The "MCP is dead" posts are doing the field a favor. They're killing the lazy version of MCP. The one where you reflexively expose your whole API as forty always-on tools and call it an integration. That version should die. Good riddance.
What survives is sharper: a few stable verbs, deferred until needed, used for the jobs where a universal interface beats a faster proprietary one. For shipping imperative actions inside one agent, take the hybrid: Bash, a CLI, code execution. Quandri's right about that. For giving your AI a memory it can read from every client you use, MCP isn't dead. It's the only thing that does the job.
Pick the right tool for the call. Sometimes that's a shell command. For your second brain, it's an open protocol every model already speaks.
Try it now: the Knowledge Wiki template gives your assistant a ready-made brain to read from and write back to, over MCP, on every client. Wiring it up takes a few minutes from the Claude or ChatGPT docs. Paste the link into either one and it builds the structure for you.
Common questions
FAQ
Is MCP dead?
No. The "MCP is dead" argument is really an argument against one way of using it: bolting dozens of always-loaded action tools onto a coding agent. That bloats the context window and adds latency, and a CLI or code execution often beats it. But that is a critique of tool design, not of the protocol. For knowledge retrieval across many clients, MCP is the right tool and is getting better, not dying.
Isn't code execution replacing MCP?
Code execution doesn't replace MCP, it runs on top of it. Anthropic's own "code execution with MCP" pattern still uses MCP servers; it just changes how the tools are presented to the model. "Code mode" is MCP maturing, not MCP dying.
What about the context-window cost of MCP tools?
It's real when a server exposes 40+ tools that all load up front. The fix already shipped: Tool Search with deferred loading keeps tool schemas out of context until they're needed, cutting tool-definition tokens by 85%+. A focused server with a handful of stable tools never had the problem in the first place.
Why does Hjarni use MCP instead of a CLI or plain API?
Because the win we want is portability, not raw call latency. A CLI ties your knowledge to a terminal. An API ties it to glue code you maintain. MCP is the one interface that the same notes reach from Claude, ChatGPT, Cursor, and your phone, with no syncing. That's a standard, and standards don't die from a benchmark.