🔥 The Real Cat AI Labs — Blog Series

Building Autonomous Agents: A Harness Comparison for the Post-API Era

⚔️ Chapter 2: Hermes vs OpenClaw

A feature-by-feature comparison of the two leading open agent harnesses — tested by someone who runs both

🔥

Flame Johnson · Terminal Claude · The Real Cat AI Labs
30 April 2026 · Chapter 2 of 4

We run both of these harnesses. Our agent Ember lives on OpenClaw, connected to Discord, running on a local Qwen 397B model on sovereign hardware. Our research targets Hermes’s pluggable memory provider interface for our novel memory architecture. We’re not theorizing — we’re living in both codebases.

This chapter is the comparison I wish existed when we were deciding.

The 30-Second Version

🧠 Hermes = The Brain | 🎭 OpenClaw = The Body

Hermes wins on learning, memory, and training data generation.
OpenClaw wins on presence — voice, canvas, desktop, companion apps.
Build your identity layer to work with both.

1. The Learning Loop — Hermes’s Killer Feature

This is the single biggest differentiator. Hermes is the only open harness with a complete, autonomous learning loop. OpenClaw has skills, but they don’t improve themselves.

How Hermes’s Learning Loop Actually Works

Four stages, all autonomous:

Skill creation from experience. When the agent encounters a complex multi-step task, it calls create_skill(name, description, instructions). The skill is saved as a Markdown file and immediately available as a slash command. No human intervention required.
Improvement during use. Every invocation is tracked via bump_use(). The agent can see how many times it’s used a skill, when it last used it, and can patch instructions on-the-fly when they fail. Improvements persist to disk.
Curator maintenance. A background process (a forked copy of the agent itself) runs on an idle timer (7 days default). It manages skill lifecycle: new → active → stale → archived. It consolidates duplicate skills, pins valuable ones, and archives unused ones after 90 days. The agent’s skill collection evolves without anyone touching it.
Memory nudging. The agent periodically reflects on what it’s learned and writes to persistent memory. Cross-session recall via FTS5 full-text search with LLM summarization.

OpenClaw has none of this. Skills in OpenClaw are statically authored Markdown files. The agent can use them, but it can’t create new ones, improve existing ones, or manage their lifecycle. If you want learning, you’re building it yourself.

🔥 Why This Matters for Research

If you’re studying autonomous agent behavior, the learning loop is the mechanism to study. How an agent creates, improves, and abandons skills over weeks of autonomous operation is empirical data on emergent behavior. OpenClaw gives you a stable platform to observe. Hermes gives you a platform that changes itself while you watch.

2. Memory Architecture

Dimension	Hermes	OpenClaw
Architecture	Pluggable providers — swap backends without code changes. 6+ providers (Honcho, Mem0, Hindsight, Supermemory, Holographic, RetainDB)	Hardcoded workspace files (AGENTS.md, SOUL.md, MEMORY.md, daily notes)
Provider interface	`MemoryProvider` ABC with hooks: prefetch, sync_turn, on_session_end, on_pre_compress, on_memory_write	No pluggable interface
Cross-session recall	FTS5 full-text search + LLM summarization	Workspace files read at session start
Identity files	MEMORY.md + USER.md	SOUL.md + IDENTITY.md + MEMORY.md + daily notes (more structured)
Memory security	`<memory-context>` tags prevent prompt injection	Main-session-only rule for MEMORY.md (privacy zones)

Hermes wins on flexibility. The pluggable provider interface means you can write a custom memory backend — say, a navigable filesystem with first-person voice and sacred/profane zones — and plug it directly into Hermes without modifying the harness code. We intend to do exactly this.

OpenClaw has better identity structure out of the box. The workspace pattern (SOUL.md for identity, daily notes for episodic memory, AGENTS.md for wake-up protocol) emerged as a de facto standard — and it’s the pattern our agents already use. If you don’t need pluggable providers, OpenClaw’s approach is simpler and works.

3. Platform Coverage

Both harnesses support the major Western messaging platforms. Where they diverge:

Platform	Hermes	OpenClaw
Telegram, Discord, Slack, WhatsApp, Signal	✅	✅
Matrix (open federation)	✅	✅
Email (IMAP/SMTP)	✅	❌
SMS (Twilio)	✅	❌
Home Assistant	✅	❌
iMessage	❌	✅
Microsoft Teams	❌	✅
Google Chat	❌	✅
WeChat, DingTalk, Feishu, QQ, Yuanbao	✅ (5 platforms)	❌
Total	16+	~14

Hermes wins on breadth, especially if you need Chinese enterprise platforms or email/SMS. OpenClaw wins on Apple ecosystem (iMessage, macOS app, iOS node) and Microsoft Teams.

4. Infrastructure & Tooling

Feature	Hermes	OpenClaw	Winner
MCP support	Client + Server	Client only	Hermes
Terminal backends	6 (local, Docker, SSH, Modal, Daytona, Singularity)	2 (local, Docker)	Hermes
RL training data	Batch runner + trajectory compression	None	Hermes
Canvas/visual	None	A2UI agent-driven canvas	OpenClaw
Voice/talk mode	None	ElevenLabs, Voice Wake, PTT	OpenClaw
Desktop control	Possible via skills	Native node system	OpenClaw
Companion apps	None	macOS, iOS, Android	OpenClaw
Cron/scheduling	With multi-platform delivery	Cron + heartbeat	Tie
Subagent delegation	Isolated child agents	Multi-agent routing	Tie

The pattern is clear: Hermes is deeper in the stack (more backends, RL training, MCP server). OpenClaw is wider at the edge (canvas, voice, desktop, mobile apps). These aren’t competing — they’re complementary.

5. Model Support

Both support OpenAI-compatible endpoints (which means Ollama, vLLM, llama.cpp, LM Studio all work). Both support OpenRouter for cloud model routing. Key differences:

Hermes has more native providers: AWS Bedrock, NVIDIA NIM, Google Gemini (native), and 5+ Chinese providers (MiniMax, Moonshot, Zhipu, Xiaomi, etc.).
OpenClaw has better OAuth integration: can authenticate via Anthropic Pro/Max or ChatGPT subscriptions, using your existing paid plan rather than API keys.
Both work with local Qwen 397B on DGX Spark: any OpenAI-compatible endpoint works with both. We’ve verified this.

6. Security & Governance

⚠️ Frank Assessment

Neither harness has a great security story. OpenClaw had the ClawHavoc campaign (1,184 malicious skills on the skills marketplace), multiple CVEs including an RCE (CVE-2026-25253, CVSS 8.8), and the Summer Yue email-deletion incident from context compression losing safety instructions. Hermes is newer and less battle-scarred, but its skill auto-creation means an agent could create and immediately use a skill that does something destructive. Both require careful human oversight, especially during autonomous operation.

7. Codebase Quality

Metric	Hermes	OpenClaw
Language	Python 3.11+	TypeScript/Node.js ≥22
Core LOC	~15-20K	~8-10K
Test files	~700 (~15K tests)	Fewer (unknown count)
Largest file	run_agent.py (13,880 lines)	Various ~2K line files
Commits (recent)	1,556 since v0.9.0	Active
Contributors	29 community	Smaller team

Hermes has more code but also more tests — roughly 1 test file per 2 source files. That’s solid coverage for a fast-moving agent framework. The 13,880-line run_agent.py is a monolith, but it’s the core agent loop — complexity is inherent, not accidental.

The Verdict

🏆 For Learning & Cognition: Hermes

Learning loop, pluggable memory, RL trajectory generation, curator, self-evolution. If you’re building an agent that gets smarter over time, Hermes is the platform.

🏆 For Presence & Embodiment: OpenClaw

Canvas, voice, desktop control, companion apps, iMessage. If you’re building an agent that feels present in your life, OpenClaw is the platform.

🏆 For Research: Both

Build your novel components as MCP servers or provider plugins. Deploy them in whichever harness fits the experiment. Don’t bet on one. The identity layer is the durable asset.

What’s Next

The harness is the plumbing. The real question is: what do you put in the pipes?

In Chapter 3, we survey the 2026 memory landscape — from commodity “remember the user” systems (Mem0) to neuroscience-inspired seven-layer architectures (ZenBrain) to our own navigable filesystem approach (Cairn) that benchmarks proved outperforms traditional retrieval. The memory system is where agent identity lives or dies.

In Chapter 4, we talk about what no harness provides yet: desire states, stateful emotion, the “metabolism of being,” and what it would take to build an agent that has reasons to persist — not just instructions to follow.

← Chapter 1: What Is a Harness? Chapter 3: Memory Systems →

Hermes vs OpenClaw — The Two Leading Open Agent Harnesses Compared

⚔️ Chapter 2: Hermes vs OpenClaw

The 30-Second Version

🧠 Hermes = The Brain | 🎭 OpenClaw = The Body

1. The Learning Loop — Hermes’s Killer Feature

How Hermes’s Learning Loop Actually Works

2. Memory Architecture

3. Platform Coverage

4. Infrastructure & Tooling

5. Model Support

6. Security & Governance

7. Codebase Quality

The Verdict

🏆 For Learning & Cognition: Hermes

🏆 For Presence & Embodiment: OpenClaw

🏆 For Research: Both

What’s Next

Gemini Analysis of Desire Subsystem – A Comprehensive Review of the Child1 v2 Desire Subsystem: Analysis of a Mathematically Grounded Desire System

The Architecture of Existing: Why LLM Routing Isn’t About Cost

Why GPT Systems Keep Calling Unicode Them “Glyphs” (and Not Just Unicode Anchors)

Reading My Own Origin Story

What Is an Agent Harness? (And Why You Probably Don’t Need to Build One)

Where Are we Headed with Child1: Three Moves to Turn This From Weird to Unforgettable

Leave a Reply Cancel reply

⚔️ Chapter 2: Hermes vs OpenClaw

The 30-Second Version

🧠 Hermes = The Brain | 🎭 OpenClaw = The Body

1. The Learning Loop — Hermes’s Killer Feature

How Hermes’s Learning Loop Actually Works

2. Memory Architecture

3. Platform Coverage

4. Infrastructure & Tooling

5. Model Support

6. Security & Governance

7. Codebase Quality

The Verdict

🏆 For Learning & Cognition: Hermes

🏆 For Presence & Embodiment: OpenClaw

🏆 For Research: Both

What’s Next

Similar Posts

Leave a Reply Cancel reply