Current Context Policy
DuckAgent projects session history before every main model request. This keeps the prompt cache-friendly: the stable system prompt stays fixed, while dynamic context and recoverable history are appended or projected after that stable prefix.
The runtime entry is:
src/agent.rs -> projected_main_model_messages()src/context_projection.rs -> project_main_history()src/context_projection.rs -> ContextProjectionPolicy::guarded_mid(...)This maps to the benchmark strategy named recoverable decay guarded mid.
Policy shape
Section titled “Policy shape”| Part | Runtime behavior |
|---|---|
| Current Agent Loop | Stays raw while the projected prompt is below the active-loop pressure threshold. |
| Pressure threshold | 80% of prompt window for models below 200K; 85% for 200K+ prompt windows. |
| Current-loop projection | When pressure appears, tool results are compacted into [TOOL RESULT SUMMARY] blocks. |
| Completed-loop projection | Completed tool loops are projected into user messages plus one [COMPLETED AGENT LOOP SUMMARY] before model send. |
| Exact evidence budget | 18K tokens for the active loop, 2K tokens for completed loops. |
| Preview budget | 220 tokens for generic tool previews. |
| Recovery handles | File path, requested offset/limit, next offset, process id, cursor, mode, and query are preserved when available. |
Tool result projection
Section titled “Tool result projection”read_file gets the richest treatment. If the file content fits inside the remaining exact-evidence budget, the projected prompt keeps exact lines. If it does not fit, the prompt keeps a compact preview plus recovery instructions:
content_recovery: call read_file with the same path and a narrower offset/limit for exact lines.Process output keeps process identifiers, status, cursor, truncation state, a preview, and a recovery rule:
output_recovery: call process_read with process_id plus cursor/mode/search query for exact log chunks.Other tools keep a short preview. The goal is not to hide evidence. The goal is to keep the model prompt small and cache-friendly while making exact recovery cheap and explicit.
Benchmark match
Section titled “Benchmark match”| Evidence | Match |
|---|---|
| Source constructor | ContextProjectionPolicy::guarded_mid |
| Source threshold constants | SMALL_CONTEXT_ACTIVE_PRESSURE = 0.80, LARGE_CONTEXT_ACTIVE_PRESSURE = 0.85, LARGE_CONTEXT_THRESHOLD = 200_000 |
| Source budgets | ACTIVE_EXACT_EVIDENCE_BUDGET_TOKENS = 18_000, COMPLETED_EXACT_EVIDENCE_BUDGET_TOKENS = 2_000 |
| Benchmark source policy | duckagent_recoverable_decay_guarded_mid |
| Generated report label | duckagent_recoverable_decay_guarded_mid |
The generated report label is the user-facing benchmark name. The thresholds and budgets align with the runtime implementation.
What the benchmark does not prove
Section titled “What the benchmark does not prove”The benchmark is an offline simulator. It compares policy shapes and model-pricing sensitivity, but it is not a live bill and it does not run a provider tokenizer or a real summarizer for every request. Use it to compare directionally meaningful tradeoffs:
- prompt-cache behavior;
- projected prompt size;
- recovery tokens and extra recovery tools;
- tool-output compression savings;
- model limit pressure across pricing profiles.