What Should Your AI Actually Remember?
Every AI memory product eventually runs into the same temptation: save everything.
Every prompt. Every tool call. Every wrong turn. Then embed it all and call it memory.
A recent post on 12 Grams of Carbon tested that idea against a large monorepo and killed it. The author looked at what old agent session transcripts were actually worth. His conclusion: "keep track of artifacts, not scratch."
He is right. And the real lesson cuts deeper than transcripts. Raw transcripts fail because they remember too much. Automatic summaries fail because they remember the wrong things. Useful AI memory sits in the middle: small, deliberate, and reviewable.
This matters whether you call it ChatGPT memory, Claude memory, or agent context. The problem is the same: what should the AI carry forward? A lot of AI memory products are built on the losing side of this argument.
Transcripts are scratch work
Think about what a session transcript contains. Wrong turns. Abandoned approaches. Tool output that mattered for thirty seconds. The agent reading fifteen files to change one line.
By the end of a good session, everything valuable has already left the transcript. It became a commit. A pull request description. A doc. A decision.
The transcript is the scaffolding. The artifact is the building. Searching your scaffolding pile later mostly returns scaffolding.
There is a second problem. Transcripts go stale the moment the work continues. Yesterday's session describes yesterday's code. An agent that retrieves it will act, with full confidence, on a world that no longer exists.
Automatic memory keeps the wrong things
The same post makes a second observation that memory vendors should sit with. The author's team runs a weekly bot that proposes updates to their agent context files. Humans review every proposal. They reject about 80 percent.
Read that again. Even when an AI is explicitly asked to distill what matters, four out of five suggestions are noise. Left unsupervised, agents memorize random crap.
So curation needs a human in the loop. There is no architecture that removes the editor. There is only architecture that makes editing cheap or expensive.
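To make "cheap editing" concrete, here is a minimal sketch of a review queue in Python. It is an illustration, not the workflow from the post: the names `Proposal`, `ReviewQueue`, and `rejection_rate` are hypothetical. The only idea it encodes is that a suggestion never reaches memory until a human accepts it.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    ACCEPTED = "accepted"
    REJECTED = "rejected"


@dataclass
class Proposal:
    """A suggested memory update. It is not part of memory until accepted."""
    author: str  # hypothetical label for who suggested it, e.g. "weekly-bot"
    text: str
    status: Status = Status.PENDING


@dataclass
class ReviewQueue:
    proposals: list[Proposal] = field(default_factory=list)
    memory: list[str] = field(default_factory=list)  # only accepted notes land here

    def propose(self, author: str, text: str) -> Proposal:
        """Record a suggestion without touching memory."""
        proposal = Proposal(author, text)
        self.proposals.append(proposal)
        return proposal

    def review(self, proposal: Proposal, accept: bool) -> None:
        """Apply the human decision. Each proposal can be reviewed once."""
        if proposal.status is not Status.PENDING:
            raise ValueError("proposal already reviewed")
        if accept:
            proposal.status = Status.ACCEPTED
            self.memory.append(proposal.text)
        else:
            proposal.status = Status.REJECTED

    def rejection_rate(self) -> float:
        """Share of reviewed proposals that were rejected (0.0 if none reviewed)."""
        reviewed = [p for p in self.proposals if p.status is not Status.PENDING]
        if not reviewed:
            return 0.0
        rejected = sum(p.status is Status.REJECTED for p in reviewed)
        return rejected / len(reviewed)
```

Tracking the rejection rate is a design choice worth keeping: if it stays high, the editor is doing real work, and the proposal step is what needs tuning.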
What good memory looks like
Good memory is boring. That is the point.
It is a small set of deliberate notes that a human has seen.
- Decisions, with the reasoning.
- Project state that is actually current.
- Instructions for how you want the AI to work.
- Reference material worth keeping.
This is what Claude and ChatGPT should read at the start of a conversation. Not a vector soup of everything you ever said.
The write side matters as much as the read side. Let the AI suggest updates, but make review cheap enough that you actually do it. When an AI adds to your knowledge base, you should be able to see what it wrote, when, and why. And undo it. That 80 percent rejection rate does not disappear because the note lives in a nicer app.
That is why every note in Hjarni carries its history and provenance. You see which edits came from you, which came from Claude, which came from ChatGPT. One click reverts a bad one. The AI proposes. You stay editor in chief.
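What "history and provenance" could look like as data is sketched below in Python. This is an assumption for illustration, not Hjarni's actual data model or API: `Revision`, `Note`, `edit`, and `revert_last` are hypothetical names. Each edit records who wrote it, when, and why, and a revert is stored as one more revision so nothing is ever erased.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Revision:
    body: str
    author: str   # hypothetical labels, e.g. "you", "claude", "chatgpt"
    reason: str   # why the edit was made
    at: datetime  # when the edit was made (UTC)


@dataclass
class Note:
    title: str
    history: list[Revision] = field(default_factory=list)

    @property
    def body(self) -> str:
        """Current text is the body of the latest revision."""
        return self.history[-1].body if self.history else ""

    def edit(self, body: str, author: str, reason: str) -> None:
        """Append a revision. Earlier revisions are never modified."""
        self.history.append(
            Revision(body, author, reason, datetime.now(timezone.utc))
        )

    def revert_last(self, author: str = "you") -> None:
        """Undo the latest edit by re-applying the previous body as a new revision."""
        if len(self.history) < 2:
            raise ValueError("nothing to revert")
        previous = self.history[-2]
        undone = self.history[-1]
        self.edit(previous.body, author, f"revert edit by {undone.author}")


# Usage: an AI edit is visible in the history and can be undone in one call.
note = Note("Deploy process")
note.edit("Deploy on Fridays", "you", "initial note")
note.edit("Deploy any day", "claude", "inferred from chat")
note.revert_last()
```

Storing the revert as a new revision, instead of deleting the bad edit, keeps the record of what the AI wrote and that a human overruled it.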
Keep the artifacts
The transcript-indexing approach optimizes for minimal effort. Zero work, everything captured. But it captures the wrong layer.
The knowledge base approach optimizes for signal. A little work, only the distilled layer. That layer is small enough to review and current enough to trust.
Your AI does not need to remember every word it ever said. It needs the handful of notes that describe your world. Write those down. Let the scaffolding go.
Hjarni is an AI-native knowledge base with a built-in MCP server. Claude and ChatGPT read and write your notes directly, and every note keeps its full history. Start free with 25 notes.
Give your AI a memory. Free.
Connect Claude or ChatGPT to notes they can actually read and write.