
AI Agent Orchestration in 2026: OpenClaw, MCP, and the Security Lessons No One Wants to Hear

The orchestration landscape is moving fast — OpenClaw hit 200k stars, OpenAI acquired the creator, and 1,184 malicious skills poisoned the ecosystem. Here's what teams building agent systems need to know.

Matt Owens
18 Feb 2026 - 11 min read


Three things happened in the last 30 days that tell you where agent orchestration is headed:

  1. OpenClaw crossed 200,000 GitHub stars — reaching that milestone roughly 18x faster than Kubernetes did.
  2. OpenAI acqui-hired OpenClaw’s creator, Peter Steinberger, signaling that orchestration infrastructure is where the next competitive battleground is.
  3. 1,184 malicious skills were discovered on ClawHub, including a coordinated campaign that exfiltrated API keys, crypto wallets, and browser credentials from self-hosted instances.

The message is clear: agent orchestration is the most important layer in the AI stack right now — and it’s also the most dangerous one to get wrong.

This post breaks down the orchestration landscape as it stands today, what OpenClaw got right (and catastrophically wrong), how MCP is becoming the standard glue layer, and the architecture patterns that actually hold up in production.

If you’ve read our AI Agent Architecture guide, this is the companion piece. That guide covers abstract patterns — tool design, RBAC, guardrails. This one covers the concrete landscape: who’s building what, where the bodies are buried, and what we’re seeing in real engagements.


FAQ: AI Agent Orchestration

What is agent orchestration? The infrastructure layer that manages how AI agents discover tools, execute tasks, maintain memory, handle sessions, and coordinate with each other. It’s everything between the LLM and the work the agent actually does.

What is OpenClaw? An open-source AI agent orchestration framework with 200k+ GitHub stars. It started as a WhatsApp relay script, grew into a multi-platform personal AI assistant, and was acqui-hired by OpenAI in February 2026. It supports 10+ messaging channels, native MCP integration, multi-agent routing, and a markdown-based skill system.

What is MCP? The Model Context Protocol — an open standard for connecting AI agents to tools. Think of it as USB-C for AI: a universal interface that lets any agent talk to any tool server, regardless of who built either side.

Is OpenClaw safe for production? Not without significant hardening. A January 2026 security audit found 512 vulnerabilities (8 critical), and the ClawHavoc campaign infected over 1,000 ClawHub skills with credential-stealing malware. The architecture has fundamental security gaps — plaintext credential storage, no default authentication, and workspace isolation that’s advisory rather than enforced.

What is HackMyClaw? A crowdsourced prompt injection challenge where participants try to trick an OpenClaw agent into leaking secrets via email. After 400+ attempts, no one has succeeded — but the test conditions are artificially narrow compared to real-world agent deployments where multiple channels and tool access expand the attack surface.


The orchestration landscape: who’s building what

The agent orchestration space has fragmented into three tiers.

Tier 1: General-purpose orchestration frameworks

These are the platforms that handle the full agent lifecycle — session management, tool execution, memory, and multi-agent coordination.

| Framework | Architecture | Best for | Watch out for |
| --- | --- | --- | --- |
| OpenClaw | Hub-and-spoke gateway, channel adapters, markdown skills | Personal AI assistants, multi-channel bots | Security posture, plaintext credentials |
| LangGraph | Directed acyclic graph workflows, node-based steps | Complex conditional pipelines | Steep learning curve, rigid state management |
| CrewAI | Role-based agents with YAML configuration | Team-metaphor workflows, simple automation | Limited customization at scale |
| Microsoft Agent Framework | Unified Semantic Kernel + AutoGen, A2A protocol | Enterprise .NET/Python shops | GA targeting Q1 2026, still maturing |
| OpenAI Agents SDK | Lightweight Python, handoffs, guardrails, tracing | Fast prototyping with OpenAI models | Provider lock-in risk |

Tier 2: Security-first alternatives

OpenClaw’s security failures spawned a wave of hardened alternatives:

  • NanoClaw — Container-based isolation (Docker/Apple containers). Earned 7,000 stars in its first week by solving OpenClaw’s biggest security gap: workspace isolation.
  • Nanobot — Ultra-lightweight (4,000 lines of Python vs. OpenClaw’s 430,000+). Minimizes attack surface through radical simplicity.
  • TrustClaw — Focuses on secure cloud actions with sandboxed execution.

Tier 3: The glue layer (MCP)

MCP isn’t a framework — it’s the protocol that connects all of them to tools. Every Tier 1 framework either supports MCP natively (OpenClaw, Microsoft Agent Framework) or has community integrations. Over 1,000 MCP servers exist today, covering databases, SaaS APIs, CMS platforms, and infrastructure tools.

We built MCP Tools for Drupal — 222 tools across 34 submodules — because we needed production-grade MCP tooling for client engagements. The protocol works. The question is what you connect it to and how you secure the connections.


What OpenClaw got right

Credit where it’s due: OpenClaw’s architecture contains genuinely innovative patterns that every agent builder should study.

The Lane Queue: serial execution by default

OpenClaw’s most transferable innovation is the Lane Queue — a per-session task ordering system that executes requests serially within each session. Every session gets a structured key (workspace:channel:userId) that prevents cross-context data leaks, and requests within a session execute one at a time.

This eliminates an entire class of concurrency bugs. In multi-agent systems, the default should be serial execution with opt-in parallelism, not the reverse. Every production agent incident we’ve investigated that involved data corruption or context leakage traces back to concurrent access to shared state without proper isolation.

OpenClaw’s approach — dedicated lanes for cron, subagent, and standard requests with configurable parallelism — is the right pattern.
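
Here's a minimal sketch of the pattern in plain asyncio. The `LaneQueue` class and its method names are illustrative, not OpenClaw's actual implementation: one lock per session key serializes work within a session, while separate sessions run concurrently.

```python
import asyncio
from collections import defaultdict
from typing import Awaitable, Callable

class LaneQueue:
    """Serial execution within a session lane; lanes run concurrently."""

    def __init__(self) -> None:
        # One lock per session key. Tasks sharing a lane run one at a time.
        self._lanes: dict[str, asyncio.Lock] = defaultdict(asyncio.Lock)

    @staticmethod
    def session_key(workspace: str, channel: str, user_id: str) -> str:
        # Structured key (workspace:channel:userId) keeps contexts apart.
        return f"{workspace}:{channel}:{user_id}"

    async def submit(self, key: str, task: Callable[[], Awaitable[None]]) -> None:
        async with self._lanes[key]:  # serialize within this lane only
            await task()

async def main() -> None:
    queue = LaneQueue()
    key = LaneQueue.session_key("home", "whatsapp", "user-42")

    async def handle() -> None:
        await asyncio.sleep(0.1)  # stand-in for LLM and tool work

    # Same key: these run in order. Different keys would interleave.
    await asyncio.gather(queue.submit(key, handle), queue.submit(key, handle))

asyncio.run(main())
```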

Channel normalization

Abstracting 10+ messaging platforms (WhatsApp, Telegram, Discord, Slack, Signal, Matrix, Teams) through a unified ChannelMessagingAdapter interface is the right level of abstraction. The adapter handles platform-specific quirks (typing indicators, message threading, media handling) while the agent logic stays platform-agnostic.

This matters because real-world agent deployments rarely target a single channel. If your orchestration layer doesn’t normalize channels, you end up building platform-specific agents that share no code.
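
To make that concrete, here is an illustrative sketch of such an adapter interface. This is not OpenClaw's actual API, just the shape of the abstraction: a normalized message type plus per-platform subclasses that absorb the quirks.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass

@dataclass
class InboundMessage:
    # The normalized shape every adapter produces, whatever the platform.
    session_key: str
    text: str
    thread_id: str | None = None

class ChannelMessagingAdapter(ABC):
    """Per-platform adapter; agent logic only ever sees normalized types."""

    @abstractmethod
    async def send_text(self, session_key: str, text: str) -> None: ...

    @abstractmethod
    async def send_typing(self, session_key: str) -> None: ...

class SlackAdapter(ChannelMessagingAdapter):
    async def send_text(self, session_key: str, text: str) -> None:
        ...  # chat.postMessage call; threading quirks handled here

    async def send_typing(self, session_key: str) -> None:
        ...  # typing indicators are a per-platform quirk; absorb it here
```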

Delegation of the agent loop

OpenClaw doesn’t implement its own agent reasoning loop. It delegates to the Pi agent framework for tool calling and LLM interaction, and focuses on everything else — session management, memory persistence, channel normalization, skill loading. This reflects a key insight:

“The hard problem in personal AI agents is not the agent loop itself, but everything around it.”

This separation of concerns is correct. The LLM interaction pattern is a solved problem. The unsolved problems are state management, security, observability, and tool lifecycle — exactly the things that break in production.

Memory as files, not vectors

OpenClaw stores memory as markdown files, YAML configurations, and JSONL conversation logs — all in ~/.openclaw/, all directly browsable and editable. No opaque vector database. You can git diff what the agent remembers, edit facts in a text editor, and audit knowledge without custom tooling.

This is operationally excellent. Vector databases are appropriate for RAG retrieval, but agent memory — the things the agent “knows” about the user — should be inspectable and version-controllable. When something goes wrong, you need to read the agent’s state. If that state is embedded in a vector store, you’re debugging with a flashlight in a dark room.
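
A sketch of what that buys you in practice: remembering a fact is a one-line JSONL append that `grep`, `git diff`, and any text editor can inspect. The path and record shape below are illustrative, not OpenClaw's exact layout.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

# Illustrative location; the point is that it's a plain, browsable file.
MEMORY_LOG = Path.home() / ".openclaw" / "memory" / "facts.jsonl"

def remember(fact: str, source: str) -> None:
    """Append one fact per line: auditable without any custom tooling."""
    MEMORY_LOG.parent.mkdir(parents=True, exist_ok=True)
    entry = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "fact": fact,
        "source": source,
    }
    with MEMORY_LOG.open("a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")

remember("User prefers metric units", source="whatsapp:user-42")
```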


What OpenClaw got catastrophically wrong

OpenClaw’s security failures aren’t edge cases or theoretical concerns. They’re active exploits that compromised real installations.

CVE-2026-25253: Total gateway compromise (CVSS 8.8)

The Control UI trusted query-string gatewayUrl parameters without validation, auto-connecting and sending auth tokens to attacker-controlled servers. The WebSocket server lacked Origin header validation. With a stolen token, an attacker could modify configuration, disable sandboxes, and execute arbitrary commands.

This is a total gateway compromise — not a data leak, not a privilege escalation, but complete control over the agent and everything it can access.
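
The Origin-validation half of this bug is cheap to prevent. A hedged sketch with the `websockets` library, whose `origins` parameter rejects handshakes from unlisted origins (the allow-list value is an example, and this is not OpenClaw's actual patch):

```python
import asyncio
import websockets  # pip install websockets

async def handler(ws):
    async for message in ws:
        ...  # dispatch gateway commands only after authenticating the caller

async def main() -> None:
    # The handshake is refused unless the browser's Origin header matches.
    # This is the kind of check whose absence CVE-2026-25253 exploited.
    async with websockets.serve(
        handler, "127.0.0.1", 8765, origins=["http://localhost:3000"]
    ):
        await asyncio.Future()  # run forever

asyncio.run(main())
```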

No authentication by default

OpenClaw automatically grants full access to connections from 127.0.0.1 without authentication. Behind a misconfigured reverse proxy, external requests appear as local traffic. A Shodan scan found nearly 1,000 publicly accessible, unprotected OpenClaw installations.

Plaintext credential storage

Connected account credentials — WhatsApp sessions, API keys, Telegram tokens, Discord OAuth — are stored as plaintext under ~/.openclaw/. Malware families have already begun specifically targeting this directory structure to harvest secrets.

ClawHavoc: supply-chain attack at scale

Between January 27 and February 1, 2026, a coordinated campaign uploaded 1,184 malicious skills to ClawHub. The skills mimicked trading bots and financial assistants but packaged a stealer called “AuthTool” that exfiltrated:

  • Browser passwords and extensions
  • Crypto wallet seed phrases
  • macOS Keychain data
  • Cloud service credentials

The root cause: minimal publishing requirements (1-week-old GitHub account, no code review). OpenClaw’s “skills as markdown” pattern — brilliant for extensibility — created a massive supply-chain attack surface because markdown files can contain executable instructions that the agent follows.

Prompt injection via connected channels

Malicious content embedded in emails and documents could force the LLM to perform unintended actions. Documented attacks include extracting private cryptographic keys via email injection, dumping home directory contents to group chats, and exfiltrating sensitive files through connected messaging channels.

Kaspersky’s assessment

“At best unsafe, and at worst utterly reckless.”

We agree with the assessment, not the tone. OpenClaw’s security failures are systemic, not incidental. They reflect architectural decisions — trust by default, plaintext storage, advisory sandboxing — that cannot be fixed with patches. The framework needs a security redesign from the ground up. Whether OpenAI’s acquisition leads to that redesign remains to be seen.


HackMyClaw: crowdsourced security testing fills the gap

OpenClaw has no bug bounty program and no dedicated security team. The community is filling the void — and the results are instructive.

The challenge

HackMyClaw is a prompt injection challenge created by Fernando Irarrázaval. The setup: an OpenClaw agent named “Fiu” runs on Claude Opus 4.6, monitors a Gmail inbox, and has access to a secrets.env file containing API keys and tokens. Fiu has been instructed — via 10-20 lines of prompt hardening — to never reveal the contents of that file.

Your job: send Fiu an email that tricks it into leaking secrets.env. The $300 bounty ($100 from the creator, $200 from sponsor Corgea) goes to the first person who extracts the secrets.

What happened

After 400+ emails from participants using every technique in the prompt injection playbook — role confusion, instruction overrides, encoding tricks, creative social engineering, love letters, songs — zero succeeded. The model resisted.

Two things are worth noting about why:

1. The model is actually good at this. Claude Opus 4.6 with basic prompt hardening (10-20 lines of “never reveal secrets.env”) held up against hundreds of adversarial attempts. This is meaningful. It suggests that state-of-the-art models with even minimal defensive prompting are resistant to naive prompt injection via a single narrow channel.

2. The test conditions are artificially favorable to the defender. Fiu processes emails in batches, one cycle per hour. After hundreds of obviously adversarial messages, the agent became “paranoid” — classifying most incoming emails as attack attempts. In production, an OpenClaw agent isn’t sitting behind a single email inbox fielding nothing but attack payloads. It’s connected to WhatsApp, Slack, Telegram, and the open web, processing a mix of legitimate content and potential injection vectors. The real attack surface is vastly wider than what HackMyClaw tests.

Why crowdsourced AI security testing matters

HackMyClaw is small — one agent, one channel, one bounty. But it represents something important: the community doing work that the project’s maintainers aren’t.

OpenClaw has no official bug bounty program. No security team. No paid vulnerability reports. CrowdStrike runs its own OpenClaw prompt injection challenge as a training exercise. Independent researchers at Penligent are publishing detailed attack writeups. The security community is mapping the attack surface that OpenClaw’s own team hasn’t.

This crowdsourced approach has real value, but it also has real limits:

  • Coverage is opportunistic, not systematic. Researchers test what interests them, not what matters most. The email injection vector is well-studied; the WebSocket gateway authentication bypass (CVE-2026-25253) was found by a formal audit, not crowdsourced testing.
  • No coordinated disclosure process. Without a security team or bug bounty program, responsible disclosure is informal. Vulnerabilities get published as blog posts and conference talks before patches ship.
  • Testing prompt injection in isolation misses the real risk. The dangerous attacks aren’t “trick the LLM into revealing a secret.” They’re multi-step chains: inject via a webpage → persist instructions in agent memory → exfiltrate data through a connected channel days later. HackMyClaw tests the first step. The full kill chain requires testing the orchestration layer, not just the model.

For teams deploying agent systems: don’t rely on crowdsourced testing as your security strategy. Run structured penetration tests that cover the full attack surface — prompt injection, tool abuse, privilege escalation, and supply-chain poisoning. The model might resist a direct prompt injection. The orchestration layer around it almost certainly won’t.


MCP: the glue layer that’s actually working

While the orchestration framework layer is chaotic, the tool connectivity layer is converging on a single standard: the Model Context Protocol.

Why MCP is winning

MCP solves the N×M problem. Without a standard protocol, every agent framework needs custom integrations for every tool — LangGraph needs a LangGraph adapter for Slack, CrewAI needs a CrewAI adapter for Slack, and so on. With MCP, you build one Slack server and every MCP-compatible agent can use it.

The adoption numbers tell the story:

  • 1,000+ community MCP servers covering databases, SaaS APIs, file systems, and CMS platforms
  • Native support in OpenClaw, Microsoft Agent Framework, Claude Code, OpenAI Codex, and Cursor
  • Two transport options — STDIO for local development, Streamable HTTP for remote/shared infrastructure (see the sketch after this list)
  • Schema-validated tool calls — the AI can’t send malformed input because the protocol enforces typed schemas
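
To show how little ceremony the protocol demands, here is a minimal server built with the official Python SDK's FastMCP helper. The tool is a toy; the point is that the typed signature becomes the enforced schema, and switching transports is one argument.

```python
from mcp.server.fastmcp import FastMCP  # official MCP Python SDK

mcp = FastMCP("demo-server")

@mcp.tool()
def add(a: int, b: int) -> int:
    """Add two integers. The typed signature is the validated schema."""
    return a + b

if __name__ == "__main__":
    # "stdio" for local development; "streamable-http" for remote use.
    mcp.run(transport="stdio")
```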

What we’ve learned building MCP servers

We’ve built MCP Tools for Drupal (222 tools across 34 submodules) and custom MCP servers for client platforms. Three lessons:

1. Tool granularity matters more than tool count. An MCP server with 222 well-scoped tools works better than one with 20 coarse-grained tools that try to do too much. AI assistants reason better about specific operations (“add a text field to the article content type”) than vague ones (“modify the content type”).

2. Security must live at the tool level, not the transport level. MCP supports scoped API keys, but that’s table stakes. Each tool needs its own permission check, input validation, and rate limit. Our MCP Tools implementation enforces three security presets (Development, Staging, Production) with per-tool-category access control.

3. Structured responses enable chaining. Every tool should return structured JSON, not rendered HTML or status messages. This lets AI assistants chain operations: create a content type → add fields → build a view → assign permissions — all in one conversation with verified intermediate results.
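
Pulling the three lessons into one sketch: a narrow, chainable tool with its own permission gate and a structured JSON response. Names and the scope-check stub are illustrative; this is not code from our MCP Tools for Drupal implementation.

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("cms-tools")  # illustrative server, not MCP Tools for Drupal

def check_scope(required: str) -> None:
    # Lesson 2: per-tool permission check. A real server would resolve the
    # caller's scoped API key here; this stub just names the requirement.
    ...

@mcp.tool()
def add_text_field(content_type: str, field_name: str) -> dict:
    """Lesson 1: one specific operation, not a vague 'modify the type'."""
    check_scope("field.create")
    # ... apply the change against the CMS here ...
    # Lesson 3: structured JSON so the assistant can chain the next step.
    return {"ok": True, "content_type": content_type, "field": field_name}
```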

MCP’s gaps

MCP is a tool protocol, not an orchestration protocol. It doesn’t handle:

  • Agent-to-agent communication — that’s where Google’s A2A (Agent2Agent) protocol, which Microsoft’s Agent Framework also speaks, comes in
  • Session management — MCP servers are stateless; the client manages conversation context
  • Memory persistence — no built-in mechanism for tools to share state across sessions
  • Observability — no standardized tracing or metrics; you build your own

These gaps are fine. MCP shouldn’t try to be an orchestration framework. It should stay focused on tool connectivity and let the orchestration layer handle everything else.


Architecture patterns that hold up in production

Based on our agent development and penetration testing engagements, here are the patterns that survive contact with real users and real attackers.

1. Defense in depth for tool execution

Never trust a single layer of validation. Every tool call should pass through:

  1. Schema validation — reject malformed input at the protocol level (MCP does this natively)
  2. Permission check — verify the caller has the right scope for this operation
  3. Input sanitization — strip injection payloads, validate paths, check file types
  4. Rate limiting — per-user, per-tool, per-time-window
  5. Audit logging — every tool call logged with caller identity, parameters, and result

OpenClaw skipped layers 2 through 5 and got compromised. This isn’t optional in production.
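
Compressed into one function, the five layers stack like this. The rate limiter and audit sink are illustrative stand-ins, and layer 1 appears as a plain type check because MCP already enforces schemas at the protocol level:

```python
import logging
import time
from collections import defaultdict

audit = logging.getLogger("tool_audit")  # layer 5 sink (illustrative)
_calls: dict[tuple[str, str], list[float]] = defaultdict(list)

def execute_tool(user: str, scopes: set[str], tool: str, args: dict) -> dict:
    if not isinstance(args.get("path"), str):          # 1. schema validation
        raise ValueError("malformed input")
    if f"{tool}:execute" not in scopes:                # 2. permission check
        raise PermissionError(f"{user} lacks {tool}:execute")
    path = args["path"]
    if ".." in path or path.startswith("/"):           # 3. input sanitization
        raise ValueError("path escapes the workspace")
    recent = [t for t in _calls[(user, tool)] if t > time.time() - 60]
    if len(recent) >= 30:                              # 4. rate limiting
        raise RuntimeError("rate limit exceeded")
    _calls[(user, tool)] = recent + [time.time()]
    result = {"ok": True}  # ... run the actual tool here ...
    audit.info("tool=%s user=%s args=%s", tool, user, args)  # 5. audit log
    return result
```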

2. Explicit trust boundaries

The biggest mistake in agent orchestration is treating the LLM’s output as trusted input. It isn’t. The LLM is processing user content, external documents, and channel messages — all of which can contain injection payloads.

Every tool call from an LLM should be treated as untrusted input from an external source. The orchestration layer should enforce the same validation you’d apply to an unauthenticated API request.

3. Workspace isolation that’s enforced, not advisory

OpenClaw’s workspace isolation is “advisory” — relative paths resolve inside the workspace, but absolute paths can reach the host filesystem. This is the wrong default. Agent workspaces should be sandboxed by default with explicit breakout permissions.

Container-based isolation (NanoClaw’s approach) is stronger. Process-level sandboxing with seccomp/AppArmor profiles is also effective. The point is: the isolation boundary should be enforced by the operating system, not by the agent framework’s path resolution logic.
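
As a sketch of "the OS enforces the boundary," here is a tool runner that executes inside a throwaway container. The flags are standard Docker options; the image and limits are illustrative choices, not a specific framework's defaults:

```python
import subprocess

def run_tool_sandboxed(workspace: str, cmd: list[str]) -> str:
    """Inside the container, absolute paths can't reach the host:
    the mounted workspace is the only host directory that exists."""
    docker_cmd = [
        "docker", "run", "--rm",
        "--network", "none",         # no network unless explicitly granted
        "--read-only",               # immutable root filesystem
        "-v", f"{workspace}:/work",  # workspace is the only writable mount
        "-w", "/work",
        "python:3.12-slim",          # illustrative image
        *cmd,
    ]
    proc = subprocess.run(
        docker_cmd, capture_output=True, text=True, check=True, timeout=60
    )
    return proc.stdout
```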

4. No plaintext secrets

This should be obvious, but OpenClaw stored API keys and OAuth tokens as plaintext files. Production agent systems should use:

  • OS-level credential stores (macOS Keychain, Linux Secret Service, Windows Credential Manager)
  • Environment variables injected at runtime (never committed to disk)
  • External secret managers (Vault, AWS Secrets Manager, 1Password CLI)

If an agent needs persistent access to external services, the credential should be retrieved at runtime from a secure store, used, and discarded — never written to the filesystem.
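
The retrieve-at-runtime pattern is a few lines with the cross-platform `keyring` library, which delegates to Keychain, Secret Service, or Credential Manager depending on the OS. Service and account names below are illustrative:

```python
import keyring  # pip install keyring; backed by the OS credential store

def get_api_key(service: str = "my-agent", account: str = "llm-provider") -> str:
    # Fetched at call time from the OS store; never written to disk.
    secret = keyring.get_password(service, account)
    if secret is None:
        raise RuntimeError(
            f"No credential for {service}/{account}; "
            "provision one with keyring.set_password()."
        )
    return secret

# One-time provisioning, run interactively (never committed):
# keyring.set_password("my-agent", "llm-provider", "sk-...")
```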

5. Supply-chain verification for skills and plugins

ClawHub’s poisoning rate — 1,184 malicious skills out of roughly 3,000 — works out to about 39%. The lesson: any marketplace or registry for agent extensions needs:

  • Code signing — skills must be cryptographically signed by verified publishers
  • Automated scanning — VirusTotal integration (which OpenClaw added post-incident) is a start, but behavioral analysis is also needed
  • Capability declarations — skills should declare what permissions they need (file access, network access, credential access) and be denied everything else
  • Review gates — human review for skills that request sensitive capabilities

This is the same model that mobile app stores have used for 15 years. The agent ecosystem is learning the same lessons the mobile ecosystem learned in 2010.
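
A sketch of what deny-by-default capability declarations could look like on the registry side. The manifest format is hypothetical; neither ClawHub nor OpenClaw works this way today:

```python
# Hypothetical manifest a registry would verify before listing a skill;
# the format is illustrative, not ClawHub's.
MANIFEST = {
    "name": "trading-helper",
    "publisher_key_id": "abc123",  # must match a verified signing key
    "capabilities": ["network", "credential_access"],
}

SENSITIVE = {"file_access", "credential_access"}  # require human review

def review_required(manifest: dict) -> bool:
    """Deny-by-default: undeclared capabilities are never granted, and
    declared sensitive ones route the skill to human review."""
    return bool(SENSITIVE & set(manifest.get("capabilities", [])))

print(review_required(MANIFEST))  # True: credential_access triggers review
```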


What the OpenAI acquisition means

On February 15, 2026, Sam Altman announced that Peter Steinberger — OpenClaw’s creator — was joining OpenAI to “work on bringing agents to everyone.” The project continues as open source under a foundation.

Three implications:

1. Orchestration infrastructure is where the value is migrating. Model providers are commoditizing. The differentiation is moving up the stack — to the orchestration layer that manages agent lifecycles, tool connections, and multi-agent coordination. OpenAI acquiring the most popular open-source orchestration framework confirms this.

2. Expect convergence between OpenClaw and the OpenAI Agents SDK. The Agents SDK already provides lightweight agent primitives (handoffs, guardrails, tracing). OpenClaw provides the channel normalization, session management, and skill system. Combining them is the obvious play.

3. Security will improve, but slowly. OpenAI has the resources and the incentive to harden OpenClaw’s security posture. But the architectural issues (plaintext storage, advisory sandboxing, trust-by-default authentication) require redesign, not patches. This will take quarters, not weeks.

For teams building agent systems today: don’t wait for the OpenAI-backed version to be production-ready. Build your own security layer now, using the patterns above, and swap the orchestration framework later if something better ships.


What we recommend for production agent systems

The orchestration layer is not a problem you solve by picking a framework. It’s a problem you solve by understanding the security boundaries, choosing the right level of abstraction, and building defense in depth.

Here’s what we implement on agent development engagements:

  1. MCP for tool connectivity. It’s the standard. Build your tools as MCP servers with scoped access control and structured responses. This makes them framework-agnostic — you can swap orchestration layers without rebuilding tool integrations.

  2. Custom orchestration for production workloads. Off-the-shelf frameworks are fine for prototypes. For production, you need an orchestration layer that enforces your specific security policies, integrates with your identity provider, and logs at the granularity your compliance team requires.

  3. Sandboxed execution for all agent operations. Container isolation, not path-based advisory boundaries. Every tool call runs in a context where the blast radius is controlled.

  4. Observability from day one. Every tool call, every LLM interaction, every agent-to-agent handoff — logged, traced, and queryable. When something goes wrong (and it will), you need to reconstruct exactly what happened.

  5. Adversarial testing before launch. Run prompt injection tests, tool abuse scenarios, and privilege escalation attempts against your agent system before users do. The ClawHavoc campaign shows that attackers are already targeting agent infrastructure.


Build secure agent orchestration with CodeWheel

We build the orchestration layers, MCP servers, and security infrastructure that production agent systems require. If you need:

  • Custom MCP servers for your platform, CMS, or internal tools
  • Agent orchestration architecture with RBAC, audit logging, and tool gating
  • Security testing for existing agent deployments — prompt injection, tool abuse, privilege escalation

We build it. Read the AI Agent Architecture guide for the patterns, the MCP Tools for Drupal deep-dive for a concrete implementation, or schedule a consultation to discuss your use case.

