The Real Reason Behind Meta's Manus and OpenAI's OpenClaw Acquisitions

Intelligence without infrastructure is just a demo. Why the $2B Manus deal and OpenClaw acquisition signal that orchestration—not model intelligence—is the real competitive moat in 2026.

8 min read
AI · infrastructure · orchestration · agents · Meta · OpenAI · Manus · OpenClaw

Four years ago, nobody would let AI fill out a form. Now we're asking why we fill out forms at all. The AI space went from mild skepticism to billion-dollar acquisitions: in just the last couple of months, Meta dropped $2 billion on Manus AI, and OpenClaw joined OpenAI.

They didn't buy models, and they didn't buy UIs. They bought the orchestration layer underneath: the infrastructure that makes these models run for hours.

Last year the bar was simple: make it work once. Ship an agent that completes a task and call it done.

This year is different. Agents can run while we sleep. Tasks with more than 500 steps that don't need babysitting. Systems that survive context resets and recover from failures without human intervention.

So the question is no longer whether your agent can do X. It probably can. The question is whether it can do X reliably for hours, not minutes. That's the engineering problem, and that's why infrastructure suddenly matters more than intelligence.

Everyone keeps saying AI will replace traditional interfaces. Replace them with what? Text and voice, sure. That's the front door. But what happens behind it?

When you tell an agent to run something for three hours straight, something has to orchestrate that. Something has to manage state when the context fills up. Something has to recover when things go wrong at 3 a.m.

The interface is conversational, but the orchestration question is still wide open.

Models are good enough now. ChatGPT, Claude, and Gemini are all genuinely capable, and even the open-source models are closing in. They're not perfect yet, but they eventually will be. The real question is whether we have the systems ready for them to run continuously.

Model providers are racing toward better reasoning, bigger context windows, and better models in general. But when we talk about jobs getting replaced, about AI doing real work, how many companies are building the orchestration layer for it? Or are most of the companies we see right now just wrapping UIs around API calls?

Intelligence without infrastructure is just a demo. The models will get smarter regardless. Durability is the real problem to solve.

A brilliant model that cannot persist state across sessions is practically useless. One that cannot route subtasks to specialized models wastes time and energy. One that dies halfway through a reasoning chain because it cannot recover from errors is a liability.

So this isn't a model problem, it's a system architecture problem:

  • Intelligent routing: Which model handles which task, when to switch, and how to failover.
  • State management: How does session N pick up where session N-1 left off?
  • Multi-model coordination: Reasoning models for planning, fast models for simple queries, and specialized models for coding and scientific work.
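To make the routing piece concrete, here's a minimal sketch in Python. The model names, the route table, and the call_model() stub are all hypothetical; a real harness would call actual provider SDKs behind this boundary.

```python
# Hypothetical sketch of intelligent routing with failover.
# Model names and call_model() are illustrative, not a real API.
import time

ROUTES = {
    "plan": ["reasoning-large", "reasoning-small"],  # planning tasks
    "chat": ["fast-small"],                          # simple queries
    "code": ["code-specialist", "reasoning-large"],  # coding tasks
}

def call_model(model: str, prompt: str) -> str:
    """Stand-in for a real provider call; replace with your client."""
    if model == "reasoning-large" and "FAIL" in prompt:
        raise TimeoutError(f"{model} timed out")
    return f"[{model}] response to: {prompt}"

def route(task_type: str, prompt: str, retries_per_model: int = 2) -> str:
    """Try each model in route order; retry, then fail over to the next."""
    last_error = None
    for model in ROUTES[task_type]:
        for attempt in range(retries_per_model):
            try:
                return call_model(model, prompt)
            except Exception as e:
                last_error = e
                time.sleep(0.01 * attempt)  # simple backoff between retries
    raise RuntimeError(f"all models failed for {task_type!r}") from last_error

print(route("plan", "FAIL: design a 500-step pipeline"))
```

The ordering in the route table is the policy: preferred model first, cheaper or more reliable fallbacks after it, so a timeout on the big reasoning model degrades gracefully instead of killing the run.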

The smarter these models get, the more the harness matters. Raw intelligence without execution infrastructure is just potential energy.

A harness is not a framework, and it's not a wrapper. It's the full runtime environment that turns an unreliable model into a reliable system. Which reminds me of this tweet by @rakyll:

"People cannot tell the difference between models and models+harness."

Because the distinction between models and agents is becoming blurry given what's underneath is mostly opaque to the client. I expect 2026 to be more focused on building the right harnesses.

— Jaana Dogan ヤナ ドガン

One piece has already solidified: secure sandboxes, so agents can execute code without destroying the host system. But pieces like tiered memory architecture are still missing, because even 200K tokens, let alone one million, is not infinite. Fifty tool calls deep, each with three retrieved documents, and we hit the context limit just like that.

Context windows sound massive until we're running multi-hour tasks with hundreds of steps, and suddenly we're dealing with context death, degraded summarization, bad compaction, and lost data.
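Here's one way tiered memory could look: a rolling summary tier plus the most recent turns, compacted to fit a token budget. The word-count tokenizer and the summarize() stub are stand-ins for a real tokenizer and a cheap summarization model; this is a sketch of the shape, not an implementation.

```python
# Sketch of a two-tier memory: a rolling summary plus recent turns,
# kept under a token budget. summarize() is a stand-in for a model call.

def count_tokens(text: str) -> int:
    # crude proxy: ~1 token per word; a real harness uses a tokenizer
    return len(text.split())

def summarize(turns: list[str]) -> str:
    # stand-in: a real harness would call a cheap model here
    return "SUMMARY(" + "; ".join(t[:20] for t in turns) + ")"

def compact(summary: str, turns: list[str], budget: int) -> tuple[str, list[str]]:
    """Fold the oldest turns into the summary until the working set fits."""
    def total() -> int:
        return count_tokens(summary) + sum(count_tokens(t) for t in turns)
    while turns and total() > budget:
        folded, turns = turns[:2], turns[2:]     # peel off the two oldest turns
        summary = summarize([summary] + folded)  # fold them into the summary tier
    return summary, turns

# 50 tool-call results blow past a 200-token budget; compact to fit
turns = [f"tool call {i} returned a large document " * 5 for i in range(50)]
summary, turns = compact("", turns, budget=200)
```

The point is the structure: recent turns stay verbatim, older ones degrade into a summary tier instead of silently falling off the end of the context window.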

And we're not here to burn money on the wrong model. Opus 4.6, Codex 5.3, and the other big models are, whether we admit it or not, really costly. So state persistence across sessions, handoff when the next agent comes in, and routing to the right model all matter enormously.
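As a sketch of what state persistence across sessions might mean in practice: checkpoint after every step, write atomically, and resume from the last completed step when the next session starts. The JSON file format here is illustrative, not a real harness's schema.

```python
# Sketch of durable state across sessions: checkpoint after every step,
# resume from the last completed step. The file format is illustrative.
import json
import os

CHECKPOINT = "agent_state.json"

def load_state() -> dict:
    """Resume from the checkpoint if one exists, else start fresh."""
    if os.path.exists(CHECKPOINT):
        with open(CHECKPOINT) as f:
            return json.load(f)
    return {"next_step": 0, "results": []}

def save_state(state: dict) -> None:
    """Write to a temp file, then atomically swap, so a crash mid-write
    cannot corrupt the checkpoint."""
    tmp = CHECKPOINT + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, CHECKPOINT)

def run(steps, state: dict) -> dict:
    for i in range(state["next_step"], len(steps)):
        state["results"].append(steps[i]())  # do the work for this step
        state["next_step"] = i + 1
        save_state(state)                    # durable after every step
    return state

steps = [lambda: "plan", lambda: "execute", lambda: "verify"]
final = run(steps, load_state())
```

If the process dies between steps, the next session calls load_state() and picks up at next_step instead of redoing (and re-paying for) work that already finished.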

Models will commoditize fast. Open source is catching up to the frontier. I know a lot of them are benchmaxxing, but I can imagine open-source models being as good as today's Opus 4.6 within six months.

So the point is this: if quality converges like that and the difference at the top is marginal, the harness should not care which model it's plugged into. If the best model today is open source, use that. If tomorrow it's a cloud model, swap that in. The harness is a model-agnostic framework.
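Model-agnostic can be as simple as coding the harness against one interface and plugging providers in behind it. The provider classes below are hypothetical stand-ins; the shape is what matters.

```python
# Sketch of a model-agnostic harness boundary: the harness depends on one
# Protocol, and providers (names are hypothetical) plug in behind it.
from typing import Protocol

class ModelProvider(Protocol):
    def complete(self, prompt: str) -> str: ...

class OpenSourceModel:
    def complete(self, prompt: str) -> str:
        return f"open-source answer to: {prompt}"

class CloudModel:
    def complete(self, prompt: str) -> str:
        return f"cloud answer to: {prompt}"

class Harness:
    def __init__(self, provider: ModelProvider):
        self.provider = provider  # swap providers without touching harness logic

    def run_task(self, prompt: str) -> str:
        # routing, retries, and state management would live here,
        # all independent of which model is plugged in
        return self.provider.complete(prompt)

h = Harness(OpenSourceModel())
a1 = h.run_task("summarize the logs")
h.provider = CloudModel()  # swap the model in one line
a2 = h.run_task("summarize the logs")
```

Everything the harness accumulates, such as routing policies, recovery patterns, and state management, sits on its side of that interface, which is exactly why it survives model swaps.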

Building that infrastructure takes time, real time. The knowledge we gain regarding what breaks in production, what recovery patterns work, and how edge cases show up at scale does not disappear when a new model ships. It stays within the system.

Model improvements will lift all boats. Every competitor anyone has will get the same capability when better models ship. But what about harness improvements? Those stay within the team. The team that figures out durable state management first, the team that solves context death elegantly, and the team that builds recovery mechanisms are building something that model releases can't touch.

Will a sufficiently smart model make all of this obsolete? I personally don't think so, because even perfect reasoning needs execution infrastructure. You could have AGI tomorrow and it would still need sandboxing, memory management, routing, and recovery. The bottleneck is no longer intelligence; it's the operational infrastructure that lets intelligence run continuously.

Better models make harnesses more important, not less, because they unlock longer horizon tasks that need even more robust orchestration.

And the reason I think the big labs won't build the real harness is that they're optimizing for general-purpose utility, not for businesses that need to orchestrate calls across seven different product APIs.

I think we should stop optimizing for model performance and start building orchestration infrastructure. The competitive advantage in 2026 isn't access to the best model; everyone has that. It's having the infrastructure that lets the model actually finish complex tasks and run for hours at a stretch.

That means investing in sandboxing, memory architecture, routing, logging, and recovery mechanisms. And that means treating your harness as the product.
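A recovery mechanism can start as small as a retry wrapper with exponential backoff and logging around every tool call. The flaky_tool() below is a contrived stand-in for any transient failure, such as a network error at 3 a.m.

```python
# Sketch of a recovery wrapper: retry transient failures with exponential
# backoff and log every attempt. flaky_tool() is a contrived stand-in.
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("harness")

def with_recovery(fn, max_attempts: int = 4, base_delay: float = 0.01):
    """Call fn, retrying on any exception with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception as e:
            log.warning("attempt %d failed: %s", attempt, e)
            if attempt == max_attempts:
                raise  # out of retries: surface the failure
            time.sleep(base_delay * 2 ** (attempt - 1))

calls = {"n": 0}
def flaky_tool():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient network error")
    return "done"

result = with_recovery(flaky_tool)  # succeeds on the third attempt
```

The logging is not decoration: the "what breaks in production" knowledge the harness accumulates comes precisely from records like these.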

Can my system run for hours without human intervention? Can it survive context resets? Can it recover from failures? If not, we're building a demo, not a product, because I'm assuming that within ten years there will be highly available personal intelligence doing an insane amount of work from a couple of lines of English.

So which model is best? Wrong question. That's asking which engine to buy when you need to build a car. The engine matters and the transmission matters, but the architecture that puts them together matters more. The harness matters more than the model choice.

The models we use today are not the models we'll use in six months, but the orchestration layers we build will compound. They survive model swaps, they capture institutional knowledge about what breaks and what works, and that becomes the moat.

The companies winning in 2026 won't be the ones with the smartest models. They'll be the ones who figure out how to make the agents finish what they start.

Intelligence is a rising tide, and infrastructure is the harbor we build. Meta's $2 billion deal for Manus and OpenAI's move for OpenClaw show that the race is now about who builds the best environment for agents to live in. In my view, OpenAI is backing OpenClaw because they want to set the open-source standard for how agents actually use our hardware, essentially building the "Operating System" for the agent era.

The best harness will win.


Originally published as an article on X.