Integrated Agent Memory for All Harnesses Using Hooks

In the AI space right now, the biggest debate is not about when the next best model drops, but about who will build the right harness around it. The harness is the scaffolding around the model: the agent loop, tool definitions, context management, memory, prompts, and workflows that turn a raw LLM into a useful product. The model is the engine; the harness is everything that makes it drive. Examples of harnesses are Cursor, Claude Desktop, and others.
There's an ongoing debate in the AI coding tools space: does committing to a specific harness mean vendor lock-in? Memory is the sharpest example of this. If your agent's memory lives inside a closed harness or behind a proprietary API, you don't really own it, and switching costs add up quickly. But it doesn't have to be that way.
The idea for this blog post is simple: keep the memory layer outside the harness, and let any harness connect to it.
In this post, I will show you how to create a single, shared memory layer running across three different coding agents – Anthropic's Claude Code, OpenAI's Codex, and Cursor – using hooks as the integration mechanism and Neo4j as the persistent store.
The code for the integration hook is available on GitHub.
MCP tools only get you so far with memory
MCP (Model Context Protocol) servers are the standard answer for giving agents access to external systems. And they work. You can expose a Neo4j database as an MCP tool and let the agent query it whenever it decides to.
But MCP tools are pull-based. The model must decide to call the tool, and must know when and why to do so. That means:
- An agent needs to "remember to remember": it has to actively decide to store something so it can be recalled later.
- There is no guarantee of consistency: one session may log everything, the next may log nothing.
- You rely on the model's judgment about what's important enough to remember, in real time, while it's busy doing something else.
What you really want is automated logging that captures every session event regardless of what the model is doing, without consuming any of its context or attention.
This is exactly what hooks give you.

Enter the hooks
Hooks are shell commands that fire automatically on lifecycle events: when a session starts, when a user submits a prompt, before and after every tool use, and when a session stops. The agent does not decide to call them; they fire deterministically.
The key insight is that hooks are remarkably similar across vendors. Claude Code, Codex, Cursor, and others all support the same lifecycle events:
- SessionStart – when the agent session begins
- UserPromptSubmit (or beforeSubmitPrompt in Cursor) – when the user sends a message
- PreToolUse / PostToolUse – before and after each tool call
- Stop – when the session ends
The hook receives a JSON payload on stdin with the session ID, the event name, tool information, and the user prompt. And the hook can print JSON to stdout to inject additional context back into the conversation. Same contract, three harnesses.
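As a sketch of that contract, a minimal logging hook might look like the following. The key names (session_id, hook_event_name, tool_name, prompt) follow Claude Code's payloads and are assumptions here; Codex and Cursor use slightly different keys, which is exactly the kind of difference a per-client wrapper would smooth over.

```python
import json
import sys


def handle_event(payload: dict) -> dict:
    """Normalize a hook payload into a flat event record for persistence.

    The field names are illustrative (Claude Code-style); each harness
    names the session id and event slightly differently.
    """
    return {
        "session_id": payload.get("session_id", "unknown"),
        "event": payload.get("hook_event_name", "unknown"),
        "tool": payload.get("tool_name"),
        "prompt": payload.get("prompt"),
    }


if __name__ == "__main__" and not sys.stdin.isatty():
    raw = sys.stdin.read()
    if raw.strip():
        record = handle_event(json.loads(raw))
        # The real hook would write `record` to Neo4j here; printing an
        # empty JSON object tells the harness to continue unmodified.
        print(json.dumps({}))
```

The harness invokes this script on every matching lifecycle event, so logging happens whether or not the model is paying attention.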
There are other hooks (notification events, subagent stop, pre-compaction), but we won't use them here.
Shared memory layer
Now we need some place to persist the memory. A quick disclaimer: I work at Neo4j, so we'll use it in this example.

The model is straightforward. Each agent session is a node, connected to a linked list of event nodes, one for each hook invocation. Events are typed by the lifecycle event that triggered them: SessionStart, UserPromptSubmit, PreToolUse, PostToolUse, Stop. Each session is stored as an ordered timeline of everything that happened in it.
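As a sketch, the logging hook could append an event with a single Cypher statement like the one below. The labels (:Session, :Event), relationship types (HAS_EVENT, NEXT), and property names are my assumptions for illustration, not necessarily the repo's actual schema.

```python
# Hypothetical Cypher for appending one event to a session's timeline.
# The current tail of the linked list is the event with no outgoing
# NEXT relationship; the new event becomes the new tail.
APPEND_EVENT = """
MERGE (s:Session {id: $session_id})
CREATE (e:Event {type: $event_type, payload: $payload, ts: datetime()})
CREATE (s)-[:HAS_EVENT]->(e)
WITH s, e
OPTIONAL MATCH (s)-[:HAS_EVENT]->(prev:Event)
WHERE prev <> e AND NOT (prev)-[:NEXT]->()
FOREACH (p IN [x IN [prev] WHERE x IS NOT NULL] | CREATE (p)-[:NEXT]->(e))
"""

# The hook would run this with the official neo4j Python driver, e.g.:
#   driver.execute_query(APPEND_EVENT, session_id=..., event_type=..., payload=...)
```

Keeping the write to a single statement matters because the hook runs on every event and should add as little latency as possible.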
All five event types are written to the store, giving you a complete audit trail for each session in every harness. Two of them are also injection points. SessionStart fires before the agent reads its system prompt, so anything the hook emits is prepended to the system prompt. That's how persistent, agent-level memory gets in. UserPromptSubmit fires just before the user's message is sent, and anything emitted there is attached to the user's prompt. That's a turn-level context hook, useful for pulling memories that match what the user just typed.
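Concretely, an injection hook prints a JSON response on stdout. The shape below follows Claude Code's hook output schema (`hookSpecificOutput.additionalContext`); the other harnesses use similar but not identical schemas, so treat this as a sketch:

```python
import json


def injection_response(event_name: str, memories: list[str]) -> str:
    """Build the stdout JSON that injects memories back into the context.

    On SessionStart the harness prepends the context to the system prompt;
    on UserPromptSubmit it is attached to the user's message. The schema
    here is Claude Code's; other clients differ slightly.
    """
    return json.dumps({
        "hookSpecificOutput": {
            "hookEventName": event_name,
            "additionalContext": "\n\n".join(memories),
        }
    })
```

The same function serves both injection points; only the event name and the set of memories change per call.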
So, what happens when we start a new session in any of these harnesses with the hooks active, for example Cursor?

Then we check the results in Neo4j Browser.

One important limitation: hooks run outside the harness's model session. You cannot reuse the LLM the agent is talking to. If you want LLM-powered behavior inside a hook, you have to make your own model call, which adds latency to every event the agent fires. That's why the hooks here do only two things: log events and inject precomputed memories. They stay fast and deterministic.
The dreaming phase
The actual memory work happens in a separate, offline dreaming phase: extracting facts from the sessions, summarizing what happened, updating the graph. This is just a batch job that runs every few hours, reads the events collected since the last run, and writes back to the memory store. You could instead trigger a memory update asynchronously every time a session stops, but that felt like overkill; a timer is simple and works well for this demo.
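Sketched in Python, one dreaming pass might look like the following, with the LLM abstracted as a plain callable so the orchestration logic is visible. The function name, prompt wording, and memory format are my assumptions, not the repo's actual code.

```python
def run_dream_cycle(events, memories, llm):
    """One offline pass: distill raw events into consolidated notes.

    events   -- list of event dicts collected since the last watermark
    memories -- dict mapping semantic paths (e.g. 'profile/role.md') to text
    llm      -- any callable (e.g. a wrapper around the Anthropic API)
                that takes a prompt and returns a {path: new_text} dict
    """
    prompt = (
        "Existing memories:\n"
        + "\n".join(f"## {path}\n{text}" for path, text in memories.items())
        + "\n\nNew events:\n"
        + "\n".join(str(e) for e in events)
        + "\n\nRewrite the memory files. Consolidate rather than append; "
        "if new events contradict an old note, rewrite the note."
    )
    memories.update(llm(prompt))  # rewritten notes replace old ones in place
    return memories
```

In the real project the `llm` callable would be a Claude API call, and `memories` would be read from and written back to Neo4j along with the watermark.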
The dreaming job pulls all session events since the last watermark, hands them to Claude along with the current memory store, and asks it to write a small set of consolidated notes. The notes mimic a markdown wiki, the same pattern Karpathy and others have sketched for personal LLM memory and the same shape Anthropic's skills already use: each memory is a file at a semantic path like profile/role.md, tools/bash/common-flags.md, or project/neo4j-skills.md, with YAML frontmatter at the top and prose below. Claude is told to consolidate rather than append, so each path is living text, not a log; if new events contradict an old note, the note is rewritten. The result is a tree of small, self-contained markdown files that a future session can read cold, a shape borrowed from skills, maintained by the dreaming job instead of by hand.
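A hypothetical memory file in that shape (the path, frontmatter fields, and contents are all invented for illustration) might look like:

```markdown
---
path: tools/bash/common-flags.md
updated_by: dreaming job
---

Prefers `rg` over `grep` for code search, and usually runs tests with
`pytest -x -q`. Long-running commands should be backgrounded.
```

Because each file is small and self-contained, the SessionStart hook can inject a handful of them without flooding the context window.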
When we run it on our example, we get the following memories created.

And now if I open a different harness, this time Claude Code, with hooks enabled, I get the following response.

Accessing memory
The final piece of the puzzle is letting the agent access the memory layer. As mentioned, there are two ways to feed information to an agent: hooks and MCP tools.

Hooks run deterministically at the start of each session to populate the system prompt. This is where profile information and instructions on how to use memory properly should go. You can also add more context when the user-prompt event fires, but that is append-only; you cannot manipulate other parts of the prompt.
MCP tools, on the other hand, give the LLM direct access to the memory layer on demand. Instead of silently receiving context up front, the agent can search relevant memories, store new information, and update or delete existing entries. It's basically basic CRUD on top of the markdown memory files stored in Neo4j.
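As a sketch of that CRUD surface, here are the four operations over an in-memory dict standing in for the Neo4j store. The function names and the naive substring search are my assumptions; in the real system these would run Cypher queries and be registered as MCP tools.

```python
_store: dict[str, str] = {}  # semantic path -> memory file text


def store_memory(path: str, text: str) -> None:
    """Create or overwrite a memory file at a semantic path."""
    _store[path] = text


def search_memories(query: str) -> dict[str, str]:
    """Naive substring match; a real layer might use fulltext or vector search."""
    q = query.lower()
    return {p: t for p, t in _store.items() if q in p.lower() or q in t.lower()}


def update_memory(path: str, text: str) -> None:
    """Rewrite an existing memory; fails loudly if the path does not exist."""
    if path not in _store:
        raise KeyError(f"no memory at {path}")
    _store[path] = text


def delete_memory(path: str) -> None:
    """Remove a memory file; deleting a missing path is a no-op."""
    _store.pop(path, None)
```

Exposed as MCP tools, these would let the agent pull extra memories mid-conversation rather than relying only on what the hooks injected up front.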
In the end, I think you'll probably want both. In this project we only have hooks, no MCP tools, but you could simply plug in the official Neo4j MCP server to let the agent explore the graph.
Making it work
Somewhat amusingly, the way I set up the hooks was to point each harness's agent at the hook scripts and ask it to install them itself, but I'm sure there are better ways too.

Summary
If you don't own your memory, you don't own your agent. Every harness today builds its own walled garden of context, preferences, and session history. Switch tools and you start from scratch. It doesn't have to be that way.
Hooks break that pattern. They let you write integrations that plug into any harness from the outside, and the interface is remarkably uniform. Claude Code, Codex, and Cursor all fire the same lifecycle events: session start, prompt submit, tool use, session end. The hook gets JSON on stdin, prints JSON to stdout to inject context, and that's the whole contract. Because hooks run reliably on every event, they don't consume the model's attention or rely on the agent deciding what to save. The same two Python scripts handle all three clients; thin shell wrappers passing a --client flag are the only per-harness glue.
The architecture has three stages:
- Hooks (online) – append every event to Neo4j as a linked list per session. No model calls, no latency cost, just inserts.
- Dreaming phase (offline) – a batch job reads the accumulated events, asks Claude to distill them into persistent memories, and writes them back. Memories are organized by topic and consolidated instead of appended, so they stay current instead of growing forever.
- Injection (online) – at the next session start in any harness, profile memories are loaded into context. On each user prompt, relevant memories are retrieved and injected automatically.
The result is a memory layer that sits beneath all three harnesses, works without any of them knowing about the others, and is entirely yours. You can switch from Cursor to Claude Code to Codex mid-project and pick up where you left off. Your agent's understanding of who you are, what you do, and how you like to work follows you, not the tool.
The code is available here.
PS: All images were created by the author.



