As conversations grow longer, the accumulated message history can exceed a model’s context window. Memory compaction lets you define strategies that automatically trim or condense that history before each agent call, keeping token usage under control without changing how you call the agent.Documentation Index
Fetch the complete documentation index at: https://docs.timbal.ai/llms.txt
Use this file to discover all available pages before exploring further.
How It Works
Compaction is triggered automatically based on the previous run’s token utilization. Before each call, the agent checks how much of the context window was used in the prior run. If utilization exceedsmemory_compaction_ratio (default 0.75, i.e. 75%), the configured compactors are applied to the resolved memory before it reaches the LLM.
memory_compaction_ratio=0.0 to always compact, or 1.0 to effectively disable auto-triggering.
Built-in Strategies
Import strategies fromtimbal.core.memory_compaction.
keep_last_n_turns(n)
Keeps only the last n user/assistant turn pairs. Structure-aware: never leaves orphaned tool calls or results.
keep_last_n_messages(n)
Keeps only the last n messages regardless of role. Also structure-aware.
compact_tool_results(...)
Reduces the size of tool call history. Useful when tools return large payloads that are no longer needed verbatim.
replacement controls what happens to compacted tool results:
None(default) — drop tool results and their correspondingtool_useentries entirely. Assistant messages that become empty are also dropped.str— replace each result with a template string. Supported placeholders:{tool_name},{call_id},{result_length}.callable(tool_name, call_id, result_text) -> str— call a function per result and use the return value as the replacement.
summarize(...)
Summarizes old messages into a single context message using an LLM call. Uses incremental summarization: on subsequent runs, only the new overflow messages are sent to the summarizer, which updates the existing summary rather than regenerating it from scratch.
System messages are always preserved and never included in summarization.
Composing Strategies
Pass a list of compactors to apply them in order. Each compactor receives the output of the previous one.Observability
When compaction fires, the agent span records acompaction key in its metadata:
utilization— the context window fraction from the previous run (nullwhenmemory_compaction_ratio=0.0).steps— one entry per compactor with message counts before and after.