Most People Are Using Hermes at Level Zero

There are seven levels of Hermes Agent capability, numbered 0 to 6. Most users are operating at Level 0. They installed the app, connected an API key, started a conversation. That is it.

The gap between Level 0 and the upper levels is not technical. It is documentation. The agent is as capable as your instructions are clear. Every limitation you have hit with Hermes traces back to something you have not told it yet.

Here is the full progression.

Level 0: installed, API key connected, conversations working.
Level 1: first custom skill created, repeatable behaviour for one task.
Level 2: persistent memory files connected, context survives across sessions.
Level 3: tools enabled, the agent takes actions rather than generating text you copy manually.
Level 4: chained workflows, one trigger produces a sequence of steps.
Level 5: background agents running on schedules without you opening the app.
Level 6: multiple agents coordinated via the Kanban board.

Most of the value is at Level 3 and above. Most users never get there.


CLAUDE.md Is the Foundation. Treat It Like a Staff Onboarding Document.

CLAUDE.md is the most important file in your Hermes setup. Everything else layers on top of it. A weak CLAUDE.md produces weak results regardless of which model you use.

What goes in it: who you are and what you do, your current active projects and their status, your working preferences (tone, format, level of detail), your domain knowledge and specialised terminology, your deal-breakers, your recurring tasks.

The test: paste your CLAUDE.md into a fresh Claude session and ask it to describe who you are and what you are working on. If the answer is wrong, your CLAUDE.md needs work.

Most users write two paragraphs. Two paragraphs produce two-paragraph quality. Invest two hours in the first draft. Update it for 15 minutes at the start of each month. The agent's performance tracks your investment here more than anywhere else.


Skills Are the Feature Nobody Uses. They Are Also Where the Compounding Starts.

A Hermes Skill is a persistent instruction set that loads automatically when relevant. Not a prompt, which disappears after a conversation. A configuration that travels into every future session.

Three types: Task Skills (how to do a specific job), Persona Skills (how to behave as a specific expert), Domain Skills (background knowledge the agent needs in a given area).

Building a good Task Skill: describe the output you want in specific terms, list the decisions required to produce it, write one explicit instruction per decision point, add three examples of correct output, define what to do when the task cannot be completed.
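The five components above can be treated as a checklist for a draft. The following is a minimal Python sketch, not Hermes's actual skill format: the `TaskSkill` class, its field names, and `missing_parts` are all hypothetical and only show which parts of a draft are still missing.

```python
from dataclasses import dataclass, field


@dataclass
class TaskSkill:
    """Hypothetical container for the five parts of a Task Skill draft."""
    output_spec: str = ""                                        # the output, in specific terms
    decisions: list[str] = field(default_factory=list)           # decisions needed to produce it
    instructions: dict[str, str] = field(default_factory=dict)   # decision -> explicit instruction
    examples: list[str] = field(default_factory=list)            # examples of correct output
    fallback: str = ""                                           # what to do when the task cannot be completed


def missing_parts(skill: TaskSkill) -> list[str]:
    """Return a human-readable list of gaps in the draft."""
    gaps = []
    if not skill.output_spec.strip():
        gaps.append("output description")
    if not skill.decisions:
        gaps.append("decision list")
    # Every decision point needs its own explicit instruction.
    for decision in skill.decisions:
        if decision not in skill.instructions:
            gaps.append(f"instruction for decision: {decision}")
    if len(skill.examples) < 3:
        gaps.append(f"examples ({len(skill.examples)} of 3)")
    if not skill.fallback.strip():
        gaps.append("fallback behaviour")
    return gaps
```

An empty list from `missing_parts` means the draft covers all five components. It says nothing about whether the instructions are any good, which still takes real use to find out.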

The shortcut: type /learn and tell Hermes you want to build a skill for a task. It will interview you about your process and write the skill from your answers. This extracts tacit knowledge you would never think to write down.

The math: one skill saving 20 minutes per use, used three times a week, recovers an hour a week. Ten such skills recover ten hours a week, more than a full working day. The "10x productivity" claims are plausible once you have the skills built. They are not plausible before.
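The same arithmetic as a small Python helper, in case you want to plug in your own numbers. The function name is made up for illustration, and it assumes every skill saves the same time and is used equally often, which real skills will not.

```python
def weekly_hours_saved(minutes_per_use: float, uses_per_week: float, skills: int = 1) -> float:
    """Hours recovered per week, assuming all skills save the same time per use."""
    return minutes_per_use * uses_per_week * skills / 60


one_skill = weekly_hours_saved(20, 3)        # 1.0 hour a week
ten_skills = weekly_hours_saved(20, 3, 10)   # 10.0 hours a week
```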


Memory Architecture: What Skills Store and What Memory Stores Are Different Things

Skills store instructions about how to work. Memory stores facts about your situation. Both persist. You need both.

Four memory files worth maintaining: a project file per active project (status, key decisions, open questions), a contact file for people you interact with regularly, a decision log (why you made significant choices), and a domain knowledge file (field-specific context your agent needs).

The Obsidian integration is the best way to scale this. Connect your Obsidian vault as a Hermes memory directory. The vault's notes become the agent's knowledge graph. After a few weeks, the agent knows your working patterns without you re-explaining them each session.

Session startup overhead is a hidden cost most users do not track. A well-configured memory system eliminates the 10-15 minutes most users spend re-establishing context at the start of each session. At five sessions a day, that is 50 to 75 minutes, roughly an hour a day recovered.


Loop Engineering: The Layer Above Prompting

The engineers who built Claude Code do not write prompts to manage their agents. They write loops. A prompt is a question you ask once. A loop is a structure where the model runs in a cycle, each step informed by the previous result, until a defined condition is met.

The four-step framework for any loop: define the exit condition first (what does done look like?), break the task into single-model-call steps, write the scaffolding that connects the steps, test with deliberate failure injection.
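The four steps above can be sketched in a few lines of Python. This is an illustration under assumptions, not how Hermes or Claude Code implement loops: `run_loop` is a hypothetical helper, and `fake_model_step` stands in for a real model call so the example runs without an API key.

```python
from typing import Callable


def run_loop(
    step: Callable[[str], str],
    is_done: Callable[[str], bool],
    state: str,
    max_iterations: int = 5,
) -> tuple[str, int]:
    """Run `step` repeatedly until `is_done` accepts the state.

    Returns the final state and the number of steps taken.
    Raises RuntimeError if the exit condition is never met.
    """
    for iteration in range(1, max_iterations + 1):
        state = step(state)      # one model call per step
        if is_done(state):       # the exit condition, defined up front
            return state, iteration
    raise RuntimeError(f"exit condition not met after {max_iterations} iterations")


def fake_model_step(draft: str) -> str:
    """Stand-in for a model call: appends one sentence per pass."""
    return draft + " More detail."


def long_enough(draft: str) -> bool:
    """Exit condition: the draft has at least 40 characters."""
    return len(draft) >= 40


def stuck_step(draft: str) -> str:
    """Deliberate failure injection: a step that never makes progress."""
    return draft
```

Starting from `"Start."`, `run_loop(fake_model_step, long_enough, "Start.")` stops after three steps. Swapping in `stuck_step` raises `RuntimeError` instead of spinning forever, which is the behaviour the failure-injection test exists to confirm.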

Loop patterns worth building: the refine loop (generate, evaluate, refine until quality threshold), the build-test-fix loop (implement, run tests, fix until passing), the research loop (query, retrieve, check coverage, synthesise), the monitor loop (check state, process if changed, sleep, repeat).
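Of the patterns above, the monitor loop is the simplest to show in code. Here is a minimal sketch in Python, assuming the state you watch can be read as a string; `monitor`, `read_state`, and `process` are hypothetical names, and a real background agent would also need logging and error handling.

```python
import time
from typing import Callable, Optional


def monitor(
    read_state: Callable[[], str],
    process: Callable[[str], None],
    interval_seconds: float = 60.0,
    max_cycles: Optional[int] = None,
) -> int:
    """Poll `read_state` and call `process` only when the value changes.

    Returns how many times `process` was called. With `max_cycles=None`
    the loop runs until the program is interrupted.
    """
    last_seen = None
    processed = 0
    cycle = 0
    while max_cycles is None or cycle < max_cycles:
        current = read_state()            # check state
        if current != last_seen:          # process only if changed
            process(current)
            processed += 1
            last_seen = current
        cycle += 1
        time.sleep(interval_seconds)      # sleep, then repeat
    return processed
```

The `max_cycles` parameter is there so the loop can be tested with a finite run before it is left running unattended.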

Loops combine with Hermes naturally. Hermes provides the context and memory layer. The loop provides the execution structure. Your CLAUDE.md and skills make each loop step perform well. Neither Hermes nor the loop alone is as capable as the two together.


Cost Optimisation: Not Every Task Needs Your Best Model

The default Hermes configuration uses one model for everything. This is not cost-optimal.

MiniMax M3 and DeepSeek cost approximately one-eighth of Claude Sonnet per million tokens and perform comparably on routine tasks. At that price ratio, switching routine work to a cheaper model cuts API costs 60-70% when routine work makes up roughly 70-80% of your token volume, without meaningful quality loss on the tasks it handles.
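A quick way to check the savings figure against your own usage. This Python sketch takes the one-eighth price ratio from the paragraph above and makes two simplifying assumptions of its own: token counts stay the same when a task moves to the cheaper model, and input and output tokens are not priced separately. The function name is hypothetical.

```python
def cost_reduction(routed_share: float, price_ratio: float = 1 / 8) -> float:
    """Fraction of API spend saved when `routed_share` of token volume
    moves to a model priced at `price_ratio` of the original model."""
    return routed_share * (1 - price_ratio)


# Print the saving for a few routing shares.
for share in (0.5, 0.7, 0.8):
    print(f"{share:.0%} of tokens routed -> {cost_reduction(share):.0%} saved")
```

Under these assumptions, routing 80% of token volume saves 70% of spend and routing 70% saves about 61%. If less than half of your work is routine, expect a correspondingly smaller saving.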

Cheap model tasks: summarisation, data extraction, first-draft content, translation, structured report generation. Quality model tasks: complex multi-step reasoning, code generation in unfamiliar frameworks, anything where subtle errors have high costs, final review steps.

Build a routing skill that flags tasks as simple or complex. Simple tasks route to the cheap model automatically. Complex tasks go to Claude Sonnet or Opus. The routing decision itself can be handled by a lightweight Haiku call.
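The routing logic itself is small. The sketch below is in Python and uses a fixed lookup of task labels in place of the lightweight model call described above, so it is a stand-in for the idea, not a Hermes feature. The labels and the `route` function are hypothetical, and the tier names `"cheap"` and `"quality"` would map to whichever models you configure.

```python
# Hypothetical task labels, taken from the two task lists above.
CHEAP_TASKS = {"summarise", "extract", "draft", "translate", "report"}
QUALITY_TASKS = {"reason", "codegen", "review"}


def route(task_type: str, high_stakes: bool = False) -> str:
    """Return "cheap" or "quality" for a task label.

    Unknown labels and high-stakes work go to the quality tier,
    on the view that a subtle error costs more than the price difference.
    """
    if high_stakes or task_type in QUALITY_TASKS:
        return "quality"
    if task_type in CHEAP_TASKS:
        return "cheap"
    return "quality"
```

Defaulting unknown tasks to the quality tier is a design choice: misrouting a hard task to the cheap model fails quietly, while misrouting an easy task to the expensive one only costs money.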


The Seven Mistakes That Explain Every Disappointing Hermes Experience

First: expecting autonomous operation without investing in documentation. The agent is as autonomous as your CLAUDE.md is clear. Second: building twenty half-finished skills instead of three excellent ones. Pick one workflow, build it properly, use it for two weeks before adding the next.

Third: not testing failure modes. Most users only test the happy path. Every background agent should have a tested failure state. Fourth: trusting the agent on irreversible actions. Build human confirmation into anything that cannot be undone.

Fifth: not updating skills as your process changes. A skill describing your process from six months ago is giving the agent wrong instructions. Thirty minutes of monthly review prevents this. Sixth: running everything through one thread. Context bleeds across topics and degrades performance. One thread per domain.

Seventh: skipping the Obsidian integration. Most users waste 10-15 minutes per session re-explaining context that a connected vault would have loaded automatically.

Every one of these is a documentation problem, not a capability problem.
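The fourth mistake, trusting the agent on irreversible actions, has a simple structural fix: a gate that asks a human before anything that cannot be undone. Here is a minimal Python sketch. The action names and the `execute` helper are hypothetical and do not correspond to any Hermes API.

```python
from typing import Callable

# Hypothetical names for actions that cannot be undone.
IRREVERSIBLE = {"delete_file", "send_email", "make_payment"}


def execute(
    action: str,
    run: Callable[[], str],
    confirm: Callable[[str], bool],
) -> str:
    """Run reversible actions directly; ask `confirm` first for irreversible ones."""
    if action in IRREVERSIBLE and not confirm(action):
        return f"skipped: {action} (not confirmed)"
    return run()


def ask_on_terminal(action: str) -> bool:
    """Confirmation callback that requires the human to type 'yes'."""
    answer = input(f"Agent wants to run '{action}'. Type 'yes' to allow: ")
    return answer.strip().lower() == "yes"
```

Passing `confirm` as a parameter keeps the gate testable: in tests you supply a function that always refuses, and confirm that the irreversible action never runs.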


Where to Start If You Are Starting Over

Week one: write a full CLAUDE.md (two hours minimum), create one Task Skill for your most frequent task using /learn, use it daily and refine based on what it gets wrong.

Week two: add two or three more skills, set up project memory files, enable one background agent for a low-stakes daily task.

Week three: if your API costs are above $20/month, implement basic model routing. If you use Obsidian, connect the vault.

Week four and beyond: evaluate the Kanban multi-agent setup if you have a content or research workflow. Explore Loop Engineering for your most repetitive structured tasks. Consider the nested Claude Code and Hermes architecture if you do significant software development.

The tool is only as good as the instructions you give it.

That is also the good news.

The ceiling is higher than you think.