THE BIGGER-WINDOW QUESTION

A bigger context window is not a memory.

You might be thinking: context windows are huge now, so why not just paste everything in? It is a fair question, and it is the first thing most teams try. A million tokens holds roughly 1,500 pages. For a dollar or so you can keep months of conversation in the prompt. If that works, you do not need a memory layer at all.

Here is the catch. A bigger window is recall capacity, not memory. It changes how much you can carry into a single call. It does not change the three things that actually break, and none of the three gets better as the window grows.

The first is cost and latency. Memory is about not re-sending the whole history on every turn. When you stuff the window, you pay to re-read everything the agent has ever seen, on each step, and that cost grows with the length of the conversation. A memory layer sends the right two thousand tokens instead of all two hundred thousand. The window makes the ceiling higher. It does not make the bill smaller.
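To put rough numbers on this, the sketch below compares cumulative input tokens for the two approaches. Every constant in it is an illustrative assumption (the price per million tokens, the tokens added per turn, the 2,000-token memory budget), not a measured figure or any provider's actual pricing.

```python
# Illustrative arithmetic only: every constant here is an assumption.
PRICE_PER_MILLION_INPUT_TOKENS = 1.00  # hypothetical USD price
TOKENS_ADDED_PER_TURN = 500            # hypothetical growth of the history per turn
MEMORY_BUDGET_TOKENS = 2_000           # hypothetical size of a retrieved-memory prompt


def full_history_input_tokens(turns: int) -> int:
    """Total input tokens when every turn re-sends the entire history so far."""
    # Turn k re-reads k * TOKENS_ADDED_PER_TURN tokens, so the total is quadratic in turns.
    return sum(k * TOKENS_ADDED_PER_TURN for k in range(1, turns + 1))


def memory_layer_input_tokens(turns: int) -> int:
    """Total input tokens when every turn sends at most a fixed-size slice of memory."""
    return sum(min(k * TOKENS_ADDED_PER_TURN, MEMORY_BUDGET_TOKENS)
               for k in range(1, turns + 1))


def cost_usd(tokens: int) -> float:
    """Convert a token count to dollars at the assumed price."""
    return tokens / 1_000_000 * PRICE_PER_MILLION_INPUT_TOKENS


if __name__ == "__main__":
    for turns in (10, 100, 400):
        full = full_history_input_tokens(turns)
        lean = memory_layer_input_tokens(turns)
        print(f"{turns:>4} turns: full history {full:>12,} tokens (${cost_usd(full):.2f}), "
              f"memory layer {lean:>9,} tokens (${cost_usd(lean):.2f})")
```

Under these assumptions the 400th turn alone re-sends 200,000 tokens. The cumulative total for the full-history approach grows quadratically with the number of turns, while the memory-layer total grows linearly.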

The second is that models do not read the window evenly. Recency bias, position effects, and the well-documented "lost in the middle" behaviour mean a fact sitting on page 700 is not reliably used just because it is technically in context. Capacity is not attention. A longer prompt can even make recall worse, because the signal you care about is now buried under everything else you pasted in.

The third is the one the skeptics quietly concede in their own writing. The window resets between sessions. A context window is per-conversation by definition, so cross-session continuity, the thing that makes an agent feel like it knows the user, cannot come from a window of any size.
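As a minimal sketch of that distinction, assume a hypothetical `Session` class whose context list plays the role of the window, and a plain JSON file standing in for a memory store. Neither is Synap's API; the names are made up for illustration.

```python
import json
import tempfile
from pathlib import Path


class Session:
    """Hypothetical stand-in for one conversation: its context lives only in this object."""

    def __init__(self) -> None:
        self.context: list[str] = []  # plays the role of the context window

    def say(self, message: str) -> None:
        self.context.append(message)


def save_fact(store: Path, user_id: str, fact: str) -> None:
    """Persist a fact outside any session (a JSON file stands in for a real store)."""
    data = json.loads(store.read_text()) if store.exists() else {}
    data.setdefault(user_id, []).append(fact)
    store.write_text(json.dumps(data))


def load_facts(store: Path, user_id: str) -> list[str]:
    """Read back whatever was persisted for this user, or an empty list."""
    data = json.loads(store.read_text()) if store.exists() else {}
    return data.get(user_id, [])


if __name__ == "__main__":
    store = Path(tempfile.mkdtemp()) / "memory.json"

    first = Session()
    first.say("I prefer invoices in EUR.")
    save_fact(store, "user-1", "Prefers invoices in EUR")
    del first  # the session ends and its context goes with it

    second = Session()
    print(second.context)               # nothing carried over from the first session
    print(load_facts(store, "user-1"))  # the persisted fact is still available
```

However large `Session.context` is allowed to grow, a new session starts empty. Only what was written to the store survives.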

So the honest version is this. Bigger windows kill the naive memory hacks, the frameworks that just dump everything into the prompt and hope. They make a disciplined memory layer more valuable, not less, because the scarce resource is no longer storage. It is deciding what deserves to be in the prompt in the first place. That decision (what to keep, what to surface, what to let go of) is the actual work.
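One way to picture that decision is a budgeted selection step: score candidate memories, drop the ones known to be outdated, and pack the best of the rest into a fixed token budget. The sketch below is an illustration under stated assumptions, not how Synap or any particular framework works. The `Memory` fields, the relevance scores, and the word-count token estimate are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Memory:
    text: str
    relevance: float     # hypothetical score in [0, 1], e.g. from a retriever
    stale: bool = False  # hypothetical flag for facts known to be outdated


def estimate_tokens(text: str) -> int:
    """Crude assumption: one token per whitespace-separated word."""
    return len(text.split())


def select_for_prompt(memories: list[Memory], budget_tokens: int) -> list[Memory]:
    """Greedily keep the most relevant non-stale memories that fit the budget."""
    candidates = sorted(
        (m for m in memories if not m.stale),
        key=lambda m: m.relevance,
        reverse=True,
    )
    chosen: list[Memory] = []
    used = 0
    for memory in candidates:
        cost = estimate_tokens(memory.text)
        if used + cost <= budget_tokens:
            chosen.append(memory)
            used += cost
    return chosen


if __name__ == "__main__":
    memories = [
        Memory("User prefers invoices in EUR", 0.9),
        Memory("User lives in Berlin", 0.4, stale=True),
        Memory("User moved to Lisbon last week", 0.8),
        Memory("User once asked about the weather in a long unrelated chat", 0.1),
    ]
    for memory in select_for_prompt(memories, budget_tokens=12):
        print(memory.text)
```

With a 12-token budget, this keeps the two high-relevance facts, skips the stale one, and leaves out the low-relevance memory because it no longer fits. Greedy packing is only one possible policy; the point is that something has to make the choice.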

That is what Synap takes care of. We retrieve the right context before the agent asks for it, we resolve entities so the same user is the same user across channels, we keep track of what is still true and what changed last week, and we deliberately forget what has gone stale. We score 92% on LongMemEval and 93.2% on LoCoMo, with P50 retrieval under 15ms. The agent gets the right memory fast enough that the bigger window becomes a place to do reasoning, not a place to store everything and hope the model finds it.

A bigger window is a bigger desk. It is not a memory.

The proof, in numbers

92% LongMemEval accuracy
93.2% LoCoMo accuracy
<15ms P50 recall latency
