Skip to content

Current Context Policy

DuckAgent projects session history before every main model request. This keeps the prompt cache-friendly: the stable system prompt stays fixed, while dynamic context and recoverable history are appended or projected after that stable prefix.

The runtime entry is:

src/agent.rs -> projected_main_model_messages()
src/context_projection.rs -> project_main_history()
src/context_projection.rs -> ContextProjectionPolicy::guarded_mid(...)

This maps to the benchmark strategy named recoverable decay guarded mid.

PartRuntime behavior
Current Agent LoopStays raw while the projected prompt is below the active-loop pressure threshold.
Pressure threshold80% of prompt window for models below 200K; 85% for 200K+ prompt windows.
Current-loop projectionWhen pressure appears, tool results are compacted into [TOOL RESULT SUMMARY] blocks.
Completed-loop projectionCompleted tool loops are projected into user messages plus one [COMPLETED AGENT LOOP SUMMARY] before model send.
Exact evidence budget18K tokens for the active loop, 2K tokens for completed loops.
Preview budget220 tokens for generic tool previews.
Recovery handlesFile path, requested offset/limit, next offset, process id, cursor, mode, and query are preserved when available.

read_file gets the richest treatment. If the file content fits inside the remaining exact-evidence budget, the projected prompt keeps exact lines. If it does not fit, the prompt keeps a compact preview plus recovery instructions:

content_recovery: call read_file with the same path and a narrower offset/limit for exact lines.

Process output keeps process identifiers, status, cursor, truncation state, a preview, and a recovery rule:

output_recovery: call process_read with process_id plus cursor/mode/search query for exact log chunks.

Other tools keep a short preview. The goal is not to hide evidence. The goal is to keep the model prompt small and cache-friendly while making exact recovery cheap and explicit.

EvidenceMatch
Source constructorContextProjectionPolicy::guarded_mid
Source threshold constantsSMALL_CONTEXT_ACTIVE_PRESSURE = 0.80, LARGE_CONTEXT_ACTIVE_PRESSURE = 0.85, LARGE_CONTEXT_THRESHOLD = 200_000
Source budgetsACTIVE_EXACT_EVIDENCE_BUDGET_TOKENS = 18_000, COMPLETED_EXACT_EVIDENCE_BUDGET_TOKENS = 2_000
Benchmark source policyduckagent_recoverable_decay_guarded_mid
Generated report labelduckagent_recoverable_decay_guarded_mid

The generated report label is the user-facing benchmark name. The thresholds and budgets align with the runtime implementation.

The benchmark is an offline simulator. It compares policy shapes and model-pricing sensitivity, but it is not a live bill and it does not run a provider tokenizer or a real summarizer for every request. Use it to compare directionally meaningful tradeoffs:

  • prompt-cache behavior;
  • projected prompt size;
  • recovery tokens and extra recovery tools;
  • tool-output compression savings;
  • model limit pressure across pricing profiles.