Maximem Synap's Agent Memory Now Available for LiveKit Agents

LiveKit Agents is the Python framework for building real-time voice agents. WebRTC rooms. STT and TTS pipelines. Function tools that run inside the call. If you are building voice agents that need low-latency audio and structured tool use, LiveKit is where you start.

Today, Maximem Synap extends LiveKit Agents with persistent per-caller memory. Native AgentSession userdata carries scratch state for the duration of a call. Synap makes that state survive across calls, resolve entities across rooms, and retrieve relevant context the moment a caller reconnects.

Where LiveKit Agents Excels

LiveKit Agents gives voice developers a framework tuned for real-time audio. WebRTC transport for sub-200ms round trips. Pluggable STT, LLM, and TTS services. function_tool decorators that expose typed tools to the model. Turn detection and interruption handling. If your agent needs to handle phone calls, voice chat, or in-app voice interactions, LiveKit handles the audio plumbing well.

The problem emerges when the same caller rings back next week. The AgentSession clears when the room closes. userdata evaporates. The caller has to re-explain who they are, what they called about, and what they were promised. Native session state does not help if the agent forgets the caller's name.

How LiveKit Agents Memory Works Today

LiveKit Agents does not ship a first-party memory abstraction. State lives in a few different layers, all of them scoped to a single session or process. For a deeper look at how each pattern works, see the LiveKit Agents documentation.

AgentSession.userdata is a typed dataclass passed at session construction. Function tools read and write fields through RunContext[Userdata_T].userdata. The state grows during the call. When the room closes, the session ends. The dataclass is garbage collected. The next call starts fresh.

JobProcess.userdata and user_arguments survive for the life of the worker process handling the job. Useful for sharing state between sessions on the same worker, like a shared SQLite connection. They do not survive worker restarts. They are not scoped to a caller. They are process state, not caller memory.

session.history (ChatContext) holds the full conversation transcript. Helpers like chat_ctx.truncate(max_items=N) and chat_ctx._summarize(llm_v, keep_last_turns=2) keep the window manageable. Summarization is lossy. There is no cross-call persistence, no entity resolution, and no semantic retrieval.

All three options solve in-call state management. Cross-call persistence with per-caller scoping, entity resolution, semantic retrieval, and automatic compaction fall outside their scope. These are memory-layer problems, not framework-level concerns.

What Synap Adds

Synap is agentic context management. It does not replace AgentSession or ChatContext. It extends LiveKit with persistent memory the voice agent can preload before the call and record during the call.

We ship context preloading and turn recording:

Context preloading runs before session.start(). Synap fetches memories scoped to the caller and injects them into the ChatContext as a system message. The model walks into the call already knowing the caller's history, prior requests, and outstanding commitments.

Turn recording hooks on_user_turn_completed. After each user message, Synap ingests the turn into persistent memory. The agent does not have to decide to save. It happens automatically as part of the turn lifecycle.

The integration is a native package. Preload before start, record after each turn. Your voice agent starts remembering without a rewrite.

We built this because voice agents kept resetting in production. The problem was not bad audio. It was the caller having to repeat themselves every call. Production testing hit 92% LongMemEval, 93.2% on LoCoMo. Typical recall returns in under 100ms.

For why context management is infrastructure and not a feature, read What Is Agentic Context Management?. For build-versus-buy numbers, see The Real Cost of DIY Agent Memory.

Technical Deep Dive

LongMemEval Benchmark

The voice-agent version of LongMemEval goes harder than text. A caller dials in for a billing question on Tuesday, hangs up, dials again on Friday about a refund. Same caller, different room, different worker. The benchmark scores whether the second call's LLM pulls the prior billing context without the caller re-explaining. Synap scores 92% LongMemEval, 93.2% on LoCoMo across this scenario. Vector-only recall tops out around 60-70% on the same test, because "John" on Tuesday and "+1-555-0142" on Friday look like unrelated strings to cosine similarity. The gap is what entity resolution buys you at the caller level.

Entity Resolution Mechanism

For voice agents the identity question is messier than text. A caller might be known by their phone number, their SIP header, their LiveKit room metadata, or the OAuth subject from an in-app voice widget. They might also be known by nothing until they speak and voice biometrics kicks in. Synap's resolution engine indexes 15 reference patterns and links them in real time as new identifiers surface. So when a caller says "this is John, the same guy who called about invoice 4421," the engine attaches that utterance to the canonical entity for +1-555-0142. Subsequent turns in the call have full history.

Accuracy-Preserving Compaction

LiveKit pipelines drop turns into ChatContext continuously. Audio transcripts are wordy. After a 12-minute call, your prompt is stuffed with "uh-huh," "okay," "let me think for a second," and "can you repeat that?" — none of which the LLM needs to answer the next question. Synap's compaction classifier runs against the same transcript stream and keeps the durable signal: caller identity, commitments, account state, action items. Everything else is summarized or dropped. In our production testing this means roughly 60 to 70% fewer tokens reach the LLM after 8 to 10 minutes of call time, with no measurable loss in response quality.

Graph Traversal in Accurate Mode

Voice queries lean on relationship reasoning. A caller asks "what was that issue I had last month about my modem?" — the answer lives in a graph: caller → ticket 8842 → device serial → prior issue. Pure vector search over the transcript collection might land on the right page by accident. Graph traversal lands on it because it follows the actual relational path. Synap's accurate mode runs this traversal server-side and reranks hits by recency, confidence, and caller-entity relevance. The latency cost is real — 50 to 150ms extra in our tests — but the precision gain is worth it for the 10-20% of queries that need it. Fast mode handles the rest.

Multi-Tenant Scoping

Voice adds a tenant dimension text agents do not always face. A hosted voice platform serving a healthcare customer, a fintech, and a retail chain cannot let tenant A's caller patterns inform tenant B's. Synap enforces this at the storage layer through customer_id, which is bound to the room's metadata at session creation. The LiveKit hooks pick up the binding from the room's attributes, and the preloader/recorder instances are constructed fresh per room. There is no application-level check you can forget. The scoping is the database query, not a guard clause in your code.

What Synap Adds to LiveKit Agents

Persistence

LiveKit Native. Session-scoped userdata. Cleared when the room closes. With Synap. Per-caller memory survives across calls and worker restarts.

Entity Resolution

LiveKit Native. Raw identifiers. No linking across rooms. With Synap. "John" and "+1-555-0142" resolve to one canonical caller across every call.

Compaction

LiveKit Native. chat_ctx._summarize is lossy. No accuracy guarantees. With Synap. Automatic and configurable. Accuracy-preserving compaction that does not drop critical facts.

Retrieval Latency

LiveKit Native. No cross-call retrieval. With Synap. Typical recall via context preloading returns in under 100ms.

Long-Term Recall

LiveKit Native. Not benchmarked for cross-call recall. With Synap. 92% on LongMemEval.

Failure Handling

LiveKit Native. Unhandled errors abort the session. With Synap. Read failures return empty results and a logged error. Write failures raise SynapIntegrationError so you know persistence missed. Your call keeps running.

User Scoping

LiveKit Native. Session-scoped only. With Synap. user_id and optional customer_id set per call. Fresh preloader per room for multi-tenant isolation.

What Production Teams Gain

Callers stop repeating themselves. A caller dials, gets help, hangs up. Two days later, they dial again and the agent already knows their account, their last issue, and the resolution status. The caller experience is the part that actually changes — not the model, not the framework, the conversation. Repeat callers cost 40% less to serve in our customer data because the agent skips the discovery loop.

Latency fits inside a turn budget. LiveKit runs at sub-200ms round trips for the audio path. A 100ms preloading call fits inside the natural turn-taking gap before the LLM needs to respond. Callers do not hear silence. They do not notice the memory layer exists. Voice is unforgiving about pause length; this is the latency envelope that matters.

Compaction is what makes long calls viable. A 20-minute call without compaction ships every filler word to the LLM on every subsequent turn. With Synap, the transcript is condensed continuously, so the model sees signal, not filler. Without compaction, you either truncate (and lose context) or you cap the call length (and lose customers). This is the difference between a 5-minute and a 20-minute production voice agent.

Asynchronous recording keeps the audio path clean. Turn recording runs as a background frame hook. The LLM does not wait on a write to Synap. The audio pipeline does not block. If the write fails, the call still completes. Production voice agents cannot afford a 50ms stall because of a memory write. The architecture here is built around that constraint.

Entity resolution handles the messy real-world identity problem. Callers identify themselves in many ways: phone number, account number, last four of SSN, security question, voice. The same person might call from a different number, or a family member might call on their behalf. Synap links these into a canonical entity so the agent does not have to. This is the kind of problem that does not show up in tutorials and shows up in week one of production.

Failure modes that do not crash calls. A read failure returns empty context — the LLM responds with what it has, which is still better than dropping the call. A write failure raises SynapIntegrationError so observability sees the gap, but the call proceeds. The graceful-degradation contract is what makes this safe to wire into a real telephony system at 2 AM.

How to Get Started

Three steps. No rearchitecture.

Step 1: Install

pip install maximem-synap-livekit-agents livekit-agents

Step 2: Wire preloading and turn recording

import os
  from livekit.agents import AgentSession, Agent, JobContext
  from maximem_synap_livekit_agents import MaximemSynapSDK, create_synap_hooks
sdk = MaximemSynapSDK(api_key=os.getenv("SYNAP_API_KEY"))
await sdk.initialize()
hooks = create_synap_hooks(
sdk=sdk,
user_id="caller_123",
customer_id="acme_corp"  # optional, for multi-tenant
)
async def entrypoint(ctx: JobContext):
session = AgentSession()
  # Preload prior context before the LLM runs
  await hooks.preload_context(session)

  # Record every user turn automatically
  @session.on("user_turn_completed")
  async def _on_user_turn(chat_ctx, new_message):
      await hooks.record_turn(chat_ctx, new_message)

  await session.start(room=ctx.room, agent=Agent())</code></pre><p>Step 3: Deploy</p><p>For multi-tenant production, generate a fresh hooks instance per room using the caller's identity. This ensures tenant isolation at the session level. Synap handles persistence, compaction, and retrieval. Your voice agent handles the audio pipeline.</p><p>Full config, scoping rules, and error handling: see the integration docs (<a target="_blank" rel="noopener noreferrer nofollow" class="text-blue-400 underline hover:text-blue-300" href="https://docs.maximem.ai/integrations/livekit-agents">https://docs.maximem.ai/integrations/livekit-agents</a>)</p><hr><p>Memory Is Infrastructure</p><p>LiveKit Agents gave voice developers a framework for real-time audio. The session state it ships handles in-call data well. Giving that state persistence, entity resolution, and intelligent retrieval is a different layer of the stack.</p><p>The teams that ship production voice agents discover this around month three. They either build memory infrastructure themselves, or they plug in a system built for the problem.</p><p>Memory is infrastructure, not a feature.</p><p>Start building LiveKit voice agents that remember across calls (<a target="_blank" rel="noopener noreferrer nofollow" class="text-blue-400 underline hover:text-blue-300" href="https://synap.maximem.ai">https://synap.maximem.ai</a>)</p><p>Synap pricing is usage-based. You pay for memory operations: storage, retrieval, compaction. No per-seat or per-framework surcharge. The $49/month starter plan includes a base allocation; usage beyond that is metered by operation. Every new account gets $25 in free credits to test before committing. See the full pricing page (<a target="_blank" rel="noopener noreferrer nofollow" class="text-blue-400 underline hover:text-blue-300" href="https://synap.maximem.ai/pricing">https://synap.maximem.ai/pricing</a>).</p><hr><p>Related Posts</p><ul><li><p><a target="_blank" rel="noopener noreferrer nofollow" class="text-blue-400 underline hover:text-blue-300" href="https://www.maximem.ai/blog/autogen-memory-synap-integration">Maximem Synap's Agent Memory Now Available for AutoGen</a></p><p></p></li><li><p><a target="_blank" rel="noopener noreferrer nofollow" class="text-blue-400 underline hover:text-blue-300" href="https://www.maximem.ai/blog/crewai-memory-synap-integration">Maximem Synap's Agent Memory Now Available for CrewAI</a></p></li><li><p><a target="_blank" rel="noopener noreferrer nofollow" class="text-blue-400 underline hover:text-blue-300" href="https://www.maximem.ai/blog/google-adk-memory-synap-integration">Maximem Synap's Agent Memory Now Available for Google ADK</a></p><p><a target="_blank" rel="noopener noreferrer nofollow" class="text-blue-400 underline hover:text-blue-300" href="https://www.maximem.ai/blog/google-adk-memory-synap-integration"><br></a></p></li></ul><p></p>

Maximem Synap & LiveKit Agents Integration

Maximem Synap's Agent Memory Now Available for LiveKit Agents

Where LiveKit Agents Excels

How LiveKit Agents Memory Works Today

What Synap Adds

Technical Deep Dive

What Synap Adds to LiveKit Agents

What Production Teams Gain

How to Get Started

Related posts

Maximem Synap's Agent Memory Now Available for OpenAI Agents SDK

Maximem Synap's Agent Memory Now Available for Pipecat

Maximem Synap's Agent Memory Connected To Semantic Kernel