The Iterative Agent Loop
The loop is implemented primarily in process_run_stream_message crates/palyra-daemon/src/application/run_stream/orchestration.rs#3-15. It manages a state machine that transitions through resolving sessions, planning usage, building tool catalogs, and executing provider turns.
Loop Components
- AgentRunLoopState: A pure, non-I/O struct that tracks the wall-clock budget and the growing message history crates/palyra-daemon/src/application/run_stream/agent_loop.rs#3-8. It enforces a default wall-clock budget of 15 minutes (DEFAULT_AGENT_LOOP_WALL_CLOCK_BUDGET_MS) crates/palyra-daemon/src/application/run_stream/agent_loop.rs#25-25.
- RunProgressController: A guardrail mechanism that detects repeated tool failures, denials, or “read loops” where the agent makes no progress crates/palyra-daemon/src/application/run_stream/agent_loop.rs#150-154.
- Final Answer Contract: A verification step that ensures the terminal answer produced by the model satisfies the run’s objectives before completion crates/palyra-daemon/src/application/run_stream/agent_loop.rs#45-46.
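The budget and guardrail components can be sketched as below. This is a minimal illustration, not the daemon's code: LoopState and ProgressGuard are hypothetical stand-ins for AgentRunLoopState and RunProgressController, the failure threshold is an assumed parameter, and only the 15-minute default comes from the text above. Elapsed time is passed in by the caller so the state itself performs no I/O.

```rust
use std::time::Duration;

/// Assumed to mirror DEFAULT_AGENT_LOOP_WALL_CLOCK_BUDGET_MS (15 minutes).
pub const DEFAULT_BUDGET_MS: u64 = 15 * 60 * 1000;

/// Hypothetical stand-in for AgentRunLoopState: pure data, no clock access.
#[derive(Debug)]
pub struct LoopState {
    budget: Duration,
    messages: Vec<String>,
}

impl LoopState {
    pub fn new(budget_ms: u64) -> Self {
        Self { budget: Duration::from_millis(budget_ms), messages: Vec::new() }
    }

    /// Append one message to the growing history.
    pub fn push_message(&mut self, message: impl Into<String>) {
        self.messages.push(message.into());
    }

    pub fn message_count(&self) -> usize {
        self.messages.len()
    }

    /// Time left in the budget, or None once it is fully spent.
    pub fn remaining(&self, elapsed: Duration) -> Option<Duration> {
        self.budget.checked_sub(elapsed).filter(|left| !left.is_zero())
    }
}

/// Hypothetical stand-in for RunProgressController: it only counts
/// consecutive tool failures (the real controller also watches denials
/// and read loops, per the description above).
#[derive(Debug)]
pub struct ProgressGuard {
    max_consecutive_failures: u32,
    consecutive_failures: u32,
}

impl ProgressGuard {
    pub fn new(max_consecutive_failures: u32) -> Self {
        Self { max_consecutive_failures, consecutive_failures: 0 }
    }

    /// Record one tool outcome; returns true while the run may continue.
    pub fn record(&mut self, tool_succeeded: bool) -> bool {
        if tool_succeeded {
            self.consecutive_failures = 0;
        } else {
            self.consecutive_failures += 1;
        }
        self.consecutive_failures < self.max_consecutive_failures
    }
}
```

Keeping the clock outside the struct is what makes a state like this easy to unit test: a caller can feed any elapsed duration and check the result without sleeping.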
Run Execution Flow
The following diagram bridges the logical agent loop to the specific code entities in palyra-daemon.
[Diagram: Agent Loop Orchestration Flow]
Sources: crates/palyra-daemon/src/application/run_stream/orchestration.rs#3-15, crates/palyra-daemon/src/application/run_stream/agent_loop.rs#3-15, crates/palyra-daemon/src/application/run_stream/tool_flow.rs#1-10
The Tape: Append-Only Run Journal
The Tape is the source of truth for a run’s history. Every observable event (status updates, model tokens, tool proposals, approval requests, and tool results) is appended to the OrchestratorTape crates/palyra-daemon/src/application/run_stream/tape.rs#3-9.
Tape Invariants
- Wire First: The wire event is sent to the client before the tape append crates/palyra-daemon/src/application/run_stream/tape.rs#6-8.
- Sequential Integrity: The tape_seq (sequence number) only advances after a successful journal append crates/palyra-daemon/src/application/run_stream/tape.rs#8-9.
- Redaction: All text is passed through redact_run_stream_text to strip sensitive keys, URLs, and auth credentials before storage crates/palyra-daemon/src/application/run_stream/tape.rs#140-150.
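The three invariants can be illustrated with a small in-memory model. Everything here is an assumption made for the sketch: the Journal trait, MemoryJournal, Tape, and the redact helper are hypothetical names, and the redaction rule (masking a few key=value pairs) is far simpler than whatever redact_run_stream_text actually does. The sketch also redacts before the wire send, whereas the text above only states that redaction happens before storage.

```rust
/// Hypothetical journal backend; the real storage layer is not shown here.
pub trait Journal {
    fn append(&mut self, seq: u64, text: &str) -> Result<(), String>;
}

/// In-memory journal that can be told to fail once, for demonstration.
#[derive(Default)]
pub struct MemoryJournal {
    pub entries: Vec<(u64, String)>,
    pub fail_next: bool,
}

impl Journal for MemoryJournal {
    fn append(&mut self, seq: u64, text: &str) -> Result<(), String> {
        if self.fail_next {
            self.fail_next = false;
            return Err("journal unavailable".to_string());
        }
        self.entries.push((seq, text.to_string()));
        Ok(())
    }
}

/// Simplified stand-in for redact_run_stream_text: masks the value of any
/// space-separated key=value word whose key is in a small sensitive list.
pub fn redact(text: &str) -> String {
    text.split(' ')
        .map(|word| match word.split_once('=') {
            Some((key, _)) if matches!(key, "token" | "password" | "api_key") => {
                format!("{key}=[REDACTED]")
            }
            _ => word.to_string(),
        })
        .collect::<Vec<_>>()
        .join(" ")
}

pub struct Tape<J: Journal> {
    journal: J,
    wire: Vec<String>, // stands in for the client stream
    tape_seq: u64,
}

impl<J: Journal> Tape<J> {
    pub fn new(journal: J) -> Self {
        Self { journal, wire: Vec::new(), tape_seq: 0 }
    }

    /// Emit one event and return the sequence number it was stored under.
    pub fn emit(&mut self, text: &str) -> Result<u64, String> {
        let clean = redact(text);
        // Wire first: the client sees the event before the journal append.
        self.wire.push(clean.clone());
        let seq = self.tape_seq;
        self.journal.append(seq, &clean)?;
        // Sequential integrity: advance only after a successful append.
        self.tape_seq += 1;
        Ok(seq)
    }

    pub fn tape_seq(&self) -> u64 {
        self.tape_seq
    }

    pub fn wire(&self) -> &[String] {
        &self.wire
    }

    pub fn journal(&self) -> &J {
        &self.journal
    }

    pub fn journal_mut(&mut self) -> &mut J {
        &mut self.journal
    }
}
```

In this model a failed append leaves tape_seq unchanged, so the next successful event reuses the same number and the journal never contains gaps.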
Tool Proposal and Execution Flow
Tools move through a strict pipeline to ensure security and operator control.
- Proposal: The agent proposes a tool call via the model response.
- Gate (Security & Approval): evaluate_tool_proposal_security checks the call against Cedar policies crates/palyra-daemon/src/application/run_stream/tool_flow.rs#59-63. If sensitive, it triggers a PendingToolApproval crates/palyra-daemon/src/application/approvals/mod.rs#41-43.
- Approval: Interactive prompts are sent to the operator with a timeout (TOOL_APPROVAL_RESPONSE_TIMEOUT) crates/palyra-daemon/src/gateway.rs#72-72.
- Execution: If allowed, the tool is dispatched to the runtime via execute_tool_with_runtime_dispatch crates/palyra-daemon/src/application/run_stream/tool_flow.rs#67-67.
- Attestation: Every outcome is signed with a ToolAttestation containing a SHA-256 hash of the inputs, outputs, and execution metadata crates/palyra-daemon/src/tool_protocol.rs#83-97.
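The control flow of the first four stages can be sketched as a pair of enums and one function. This is an illustration under stated assumptions, not the daemon's implementation: GateDecision, ToolOutcome, ToolProposal, evaluate, and run_pipeline are hypothetical names, the hard-coded match stands in for Cedar policy evaluation, the tool names are invented, and treating an approval timeout as its own non-executing outcome is a guess. The attestation stage is left out because SHA-256 is not in the Rust standard library.

```rust
/// Result of the security gate (hypothetical names).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum GateDecision {
    Allow,
    RequireApproval,
    Deny,
}

/// Final outcome of one proposal in this sketch.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ToolOutcome {
    Executed(String),
    Denied,
    ApprovalTimedOut,
}

pub struct ToolProposal {
    pub tool: String,
    pub input: String,
}

/// Stand-in for policy evaluation: a fixed table instead of Cedar policies.
/// The tool names are invented for the example.
pub fn evaluate(proposal: &ToolProposal) -> GateDecision {
    match proposal.tool.as_str() {
        "read_file" => GateDecision::Allow,
        "shell" => GateDecision::RequireApproval,
        _ => GateDecision::Deny,
    }
}

/// Drive one proposal through gate, approval, and execution.
/// `approval` models the operator response: Some(true) approves,
/// Some(false) rejects, None means the approval timeout elapsed.
pub fn run_pipeline(
    proposal: &ToolProposal,
    approval: Option<bool>,
    execute: impl Fn(&ToolProposal) -> String,
) -> ToolOutcome {
    match evaluate(proposal) {
        GateDecision::Deny => ToolOutcome::Denied,
        GateDecision::Allow => ToolOutcome::Executed(execute(proposal)),
        GateDecision::RequireApproval => match approval {
            Some(true) => ToolOutcome::Executed(execute(proposal)),
            Some(false) => ToolOutcome::Denied,
            None => ToolOutcome::ApprovalTimedOut,
        },
    }
}
```

Because execution is reachable only through the Allow arm or an explicit approval, a proposal in this model cannot run without passing the gate first.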
Parallel Execution
Palyra supports bounded parallel execution (default max 4) for tools classified as ReadOnlySafe, PathScoped, or IdempotentNetwork crates/palyra-daemon/src/application/run_stream/tool_flow.rs#93-150.
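One way to picture bounded parallelism is to group consecutive parallel-safe calls into batches of at most four and run each batch on scoped threads. The sketch below uses only the standard library and makes several assumptions: the Mutating class, plan_batches, and run_all are hypothetical, and batching by adjacency is just one possible scheduling strategy, not necessarily the one in tool_flow.rs.

```rust
use std::ops::Range;
use std::thread;

/// Default bound stated in the text above.
pub const MAX_PARALLEL: usize = 4;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ToolClass {
    ReadOnlySafe,
    PathScoped,
    IdempotentNetwork,
    /// Hypothetical class for anything that must run on its own.
    Mutating,
}

impl ToolClass {
    pub fn parallel_safe(self) -> bool {
        matches!(
            self,
            ToolClass::ReadOnlySafe | ToolClass::PathScoped | ToolClass::IdempotentNetwork
        )
    }
}

/// Split calls into index ranges: runs of parallel-safe calls are cut into
/// batches of at most MAX_PARALLEL, every other call forms its own batch.
pub fn plan_batches(classes: &[ToolClass]) -> Vec<Range<usize>> {
    let mut batches = Vec::new();
    let mut start = 0;
    while start < classes.len() {
        let mut end = start + 1;
        if classes[start].parallel_safe() {
            while end < classes.len()
                && end - start < MAX_PARALLEL
                && classes[end].parallel_safe()
            {
                end += 1;
            }
        }
        batches.push(start..end);
        start = end;
    }
    batches
}

/// Run every call, one batch at a time; results keep the input order.
pub fn run_all<F>(calls: &[(ToolClass, String)], run: F) -> Vec<String>
where
    F: Fn(&str) -> String + Sync,
{
    let classes: Vec<ToolClass> = calls.iter().map(|(class, _)| *class).collect();
    let mut results = Vec::with_capacity(calls.len());
    for batch in plan_batches(&classes) {
        let run_ref = &run;
        // Scoped threads may borrow `calls` and `run`; all are joined
        // before the scope returns, so at most one batch is in flight.
        let outputs: Vec<String> = thread::scope(|scope| {
            let handles: Vec<_> = calls[batch]
                .iter()
                .map(|(_, input)| scope.spawn(move || run_ref(input.as_str())))
                .collect();
            handles
                .into_iter()
                .map(|handle| handle.join().expect("tool thread panicked"))
                .collect()
        });
        results.extend(outputs);
    }
    results
}
```

Joining handles in spawn order keeps the output aligned with the input, which matters if results are later appended to an ordered journal.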
[Diagram: Tool Execution Pipeline]
Sources: crates/palyra-daemon/src/application/run_stream/tool_flow.rs#1-10, crates/palyra-daemon/src/tool_protocol.rs#1-15, crates/palyra-daemon/src/application/run_stream/tool_flow.rs#93-93
Context Assembly and Budgeting
Before each model turn, Palyra assembles the context via prepare_model_provider_input crates/palyra-daemon/src/application/provider_input.rs#3-7.
Context Enrichment Layers
| Layer | Description | Source |
|---|---|---|
| Session Compaction | Condenses old history into summaries to save tokens. | session_compaction.rs |
| Memory Recall | Hybrid retrieval (vector + FTS) of relevant facts. | recall.rs |
| Trust Labels | Wraps recalled memory in fences to prevent prompt injection. | memory.rs |
| Pruning | Ephemeral removal of low-signal tokens for the current turn. | session_pruning.rs |
Session Compaction
To manage long-running sessions, the ContextCompressor identifies “protected” events (recent messages or pins) and condenses the rest into a SessionCompactionPlan crates/palyra-daemon/src/application/session_compaction.rs#3-17. This involves mining the history for durable facts and decisions, which are then persisted to the workspace crates/palyra-daemon/src/application/session_compaction.rs#134-155.
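The split between protected and condensable events can be sketched as follows. The Event and CompactionPlan types, the plan_compaction function, and the keep_recent parameter are all hypothetical, and the summary here is a plain concatenation used as a placeholder for the fact mining described above.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Event {
    pub id: u64,
    pub text: String,
    pub pinned: bool,
}

/// Hypothetical, simplified counterpart of SessionCompactionPlan.
#[derive(Debug, PartialEq, Eq)]
pub struct CompactionPlan {
    /// Events kept verbatim: pinned ones and the most recent ones.
    pub protected: Vec<u64>,
    /// Events replaced by the summary.
    pub condensed: Vec<u64>,
    /// Placeholder summary; a real one would hold mined facts and decisions.
    pub summary: String,
}

/// Protect pinned events and the last `keep_recent` events, condense the rest.
pub fn plan_compaction(events: &[Event], keep_recent: usize) -> CompactionPlan {
    // Index of the first event that counts as "recent".
    let recent_start = events.len().saturating_sub(keep_recent);
    let mut protected = Vec::new();
    let mut condensed = Vec::new();
    let mut parts = Vec::new();
    for (index, event) in events.iter().enumerate() {
        if event.pinned || index >= recent_start {
            protected.push(event.id);
        } else {
            condensed.push(event.id);
            parts.push(event.text.as_str());
        }
    }
    let summary = parts.join("; ");
    CompactionPlan { protected, condensed, summary }
}
```

With five events, one pin on the second event, and keep_recent set to 2, this yields three protected events and two condensed ones.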
Sources: crates/palyra-daemon/src/application/provider_input.rs#1-13, crates/palyra-daemon/src/application/session_compaction.rs#1-17