palyrad). It manages the lifecycle of agent runs, transforms raw user input into model-ready prompts through context augmentation, handles the bidirectional streaming of events, and manages post-run maintenance such as session compaction and learning.
Run Stream Pipeline
Agent execution is modeled as a “Run Stream,” a stateful pipeline that coordinates between the user (via gRPC or HTTP), the model provider, and the system’s internal tools.
Execution Flow
- Initialization: A `RunStreamRequest` is received. The system resolves the `OrchestratorSessionRecord` crates/palyra-daemon/src/transport/http/handlers/console/chat.rs#40-69.
- Input Assembly: The `prepare_model_provider_input` function gathers attachments and context references, then performs RAG (Retrieval-Augmented Generation) to build the final prompt crates/palyra-daemon/src/application/provider_input.rs#53-64.
- Provider Execution: The orchestrated loop calls `execute_model_provider` crates/palyra-daemon/src/application/run_stream/orchestration.rs#147-183.
- Event Processing: Model outputs (text chunks, tool calls) are processed by `process_run_stream_provider_events` crates/palyra-daemon/src/application/run_stream/orchestration.rs#13-15.
- Finalization: The run transitions to `Done`, heartbeats are cleared, and post-run reflection is scheduled crates/palyra-daemon/src/application/run_stream/orchestration.rs#107-144.
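The event-processing step can be pictured as a fold over the provider's output stream. The following is a minimal sketch under stated assumptions, not the daemon's code: `ProviderEvent`, `TurnOutput`, and `process_events` are hypothetical names invented for illustration, and the real signature of `process_run_stream_provider_events` is not shown in this document.

```rust
/// Hypothetical event type; the real provider event type is not shown here.
#[derive(Debug, Clone, PartialEq)]
pub enum ProviderEvent {
    /// A fragment of streamed model text.
    TextChunk(String),
    /// A request from the model to invoke a tool.
    ToolCall { name: String, arguments: String },
    /// The provider has finished the turn.
    Finished,
}

/// Accumulated result of one provider turn (illustrative only).
#[derive(Debug, Default, PartialEq)]
pub struct TurnOutput {
    pub text: String,
    pub tool_calls: Vec<(String, String)>,
    pub finished: bool,
}

/// Folds a sequence of events into a single output.
/// Processing stops at the first `Finished` event; later events are ignored.
pub fn process_events<I>(events: I) -> TurnOutput
where
    I: IntoIterator<Item = ProviderEvent>,
{
    let mut out = TurnOutput::default();
    for event in events {
        match event {
            ProviderEvent::TextChunk(chunk) => out.text.push_str(&chunk),
            ProviderEvent::ToolCall { name, arguments } => {
                out.tool_calls.push((name, arguments))
            }
            ProviderEvent::Finished => {
                out.finished = true;
                break;
            }
        }
    }
    out
}
```

In this sketch, text chunks are concatenated in arrival order and tool calls are collected for later dispatch; a real pipeline would presumably also forward each event to the client as it arrives.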
Run State Machine
The `RunStateMachine` tracks the lifecycle of a single execution unit using the `RunLifecycleState` enum crates/palyra-daemon/src/application/run_stream/orchestration.rs#29-29.
| State | Description |
|---|---|
| `Accepted` | Run created and queued. |
| `InProgress` | Model provider is actively generating or tools are executing. |
| `Done` | Run completed successfully. |
| `Cancelled` | Execution stopped by user or system timeout. |
| `Failed` | Terminal error encountered during execution. |
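The states in the table can be sketched as a small Rust state machine. The variant names come from the table; the transition rules, the method names, and the struct layout below are assumptions made for illustration and may differ from the actual `RunStateMachine`.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RunLifecycleState {
    Accepted,
    InProgress,
    Done,
    Cancelled,
    Failed,
}

impl RunLifecycleState {
    /// `Done`, `Cancelled`, and `Failed` are treated as terminal here.
    pub fn is_terminal(self) -> bool {
        matches!(self, Self::Done | Self::Cancelled | Self::Failed)
    }

    /// Assumed transition rules: a queued run may start, be cancelled, or fail;
    /// a running run may finish, be cancelled, or fail; terminal states are final.
    pub fn can_transition_to(self, next: Self) -> bool {
        match (self, next) {
            (Self::Accepted, Self::InProgress | Self::Cancelled | Self::Failed) => true,
            (Self::InProgress, Self::Done | Self::Cancelled | Self::Failed) => true,
            _ => false,
        }
    }
}

/// Illustrative wrapper that rejects invalid transitions.
#[derive(Debug)]
pub struct RunStateMachine {
    state: RunLifecycleState,
}

impl RunStateMachine {
    /// Every run starts in `Accepted`.
    pub fn new() -> Self {
        Self { state: RunLifecycleState::Accepted }
    }

    pub fn state(&self) -> RunLifecycleState {
        self.state
    }

    /// Moves to `next` if the assumed rules allow it, otherwise returns an error.
    pub fn transition(&mut self, next: RunLifecycleState) -> Result<(), String> {
        if self.state.can_transition_to(next) {
            self.state = next;
            Ok(())
        } else {
            Err(format!("invalid transition: {:?} -> {:?}", self.state, next))
        }
    }
}
```

Encoding the allowed transitions in one `match` keeps the rules auditable in a single place and makes it impossible for a terminal run to re-enter `InProgress` by accident.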
Provider Input Assembly & Context
Before a request is sent to an LLM, the Application Layer assembles a `PreparedModelProviderInput` crates/palyra-daemon/src/application/provider_input.rs#42-45. This process integrates multiple context sources:
- Memory Auto-Injection: Performs a vector search via `search_memory` and injects high-scoring items into the prompt crates/palyra-daemon/src/application/provider_input.rs#159-203.
- Session Compaction Blocks: Injects condensed summaries of previous turns to stay within token limits crates/palyra-daemon/src/application/provider_input.rs#15-18.
- Vision Inputs: Processes `MessageAttachment` records into `ProviderImageInput` for multimodal models crates/palyra-daemon/src/application/provider_input.rs#108-156.
- Context References: Resolves specific entities (files, previous runs) referenced in the user message crates/palyra-daemon/src/application/provider_input.rs#10-12.
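As a rough illustration of memory auto-injection, the sketch below filters search hits by score, keeps the best few, and prepends them to the user prompt. Everything here is hypothetical: `MemoryHit`, `inject_memory`, the `min_score` and `max_items` parameters, and the "Relevant memory:" header are invented for this example, and the actual output format of `search_memory` is not documented here.

```rust
/// Hypothetical shape of one vector-search result.
#[derive(Debug, Clone, PartialEq)]
pub struct MemoryHit {
    pub text: String,
    pub score: f32,
}

/// Builds a prompt that places high-scoring memory items before the user text.
/// Hits below `min_score` are dropped; at most `max_items` are injected.
pub fn inject_memory(
    user_prompt: &str,
    hits: &[MemoryHit],
    min_score: f32,
    max_items: usize,
) -> String {
    let mut selected: Vec<&MemoryHit> =
        hits.iter().filter(|h| h.score >= min_score).collect();
    // Highest score first; `total_cmp` gives a total order over f32 values.
    selected.sort_by(|a, b| b.score.total_cmp(&a.score));
    selected.truncate(max_items);

    // With nothing relevant to inject, the prompt is passed through unchanged.
    if selected.is_empty() {
        return user_prompt.to_string();
    }

    let mut prompt = String::from("Relevant memory:\n");
    for hit in selected {
        prompt.push_str("- ");
        prompt.push_str(&hit.text);
        prompt.push('\n');
    }
    prompt.push('\n');
    prompt.push_str(user_prompt);
    prompt
}
```

Capping the number of injected items is one simple way to keep retrieval from crowding out the rest of the context window; the real implementation may budget by tokens instead.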
Session Compaction
To maintain long-running conversations without exceeding LLM context windows, Palyra employs a `session_window_v1` strategy crates/palyra-daemon/src/application/session_compaction.rs#31-31.
Compaction Logic
- Eligibility: Triggered when token usage exceeds `AUTO_SESSION_COMPACTION_MIN_INPUT_TOKENS` (default 480) crates/palyra-daemon/src/application/provider_input.rs#37-37.
- Condensation: The system selects older `OrchestratorTapeRecord` entries and generates a `summary_text` crates/palyra-daemon/src/application/session_compaction.rs#76-94.
- Checkpointing: Creates an `OrchestratorCheckpointRecord` to allow “time-travel” or restoration of the pre-compacted state crates/palyra-daemon/src/application/session_compaction.rs#142-147.
- Durable Writes: Significant facts or procedures identified during compaction are written to the Workspace via `apply_workspace_managed_block` crates/palyra-daemon/src/application/session_compaction.rs#13-19.
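The eligibility, condensation, and checkpointing steps can be sketched as a single planning function. This is a simplified model under stated assumptions: `TapeEntry`, `CompactionPlan`, `plan_compaction`, and the `keep_recent` parameter are hypothetical, the threshold constant reuses the 480 figure quoted above, and joining the older entries stands in for whatever summarization the daemon actually performs.

```rust
/// Threshold taken from the text above; the constant name is shortened for the sketch.
pub const MIN_INPUT_TOKENS: usize = 480;

/// Hypothetical stand-in for a tape record.
#[derive(Debug, Clone, PartialEq)]
pub struct TapeEntry {
    pub text: String,
    pub tokens: usize,
}

/// Result of planning one compaction pass (illustrative only).
#[derive(Debug, PartialEq)]
pub struct CompactionPlan {
    /// Condensed form of the older entries.
    pub summary_text: String,
    /// Copy of the full pre-compaction tape, kept so the state can be restored.
    pub checkpoint: Vec<TapeEntry>,
    /// Recent entries that stay in the session window verbatim.
    pub kept: Vec<TapeEntry>,
}

/// Returns `None` when the session is not eligible for compaction.
pub fn plan_compaction(tape: &[TapeEntry], keep_recent: usize) -> Option<CompactionPlan> {
    let total: usize = tape.iter().map(|e| e.tokens).sum();
    // Eligibility: only compact when usage exceeds the threshold and
    // there is something older than the retained window.
    if total <= MIN_INPUT_TOKENS || tape.len() <= keep_recent {
        return None;
    }

    let (older, recent) = tape.split_at(tape.len() - keep_recent);

    // Placeholder condensation: a real system would summarize, not concatenate.
    let summary_text = older
        .iter()
        .map(|e| e.text.as_str())
        .collect::<Vec<_>>()
        .join(" | ");

    Some(CompactionPlan {
        summary_text,
        checkpoint: tape.to_vec(),
        kept: recent.to_vec(),
    })
}
```

Keeping the checkpoint as a full copy of the original tape is the simplest way to make restoration possible; a production store would more likely reference existing records than duplicate them.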
Learning Runtime & Background Queue
The daemon runs a background supervisor that handles asynchronous maintenance and “learning” tasks.
Background Queue Loop
The `spawn_background_queue_loop` initializes a task worker that polls the `JournalStore` for pending `OrchestratorBackgroundTaskRecord` entries crates/palyra-daemon/src/background_queue.rs#31-44.
Task Processing
- Self-Healing: Tasks update `WorkHeartbeatUpdate` to prevent the self-healing monitor from flagging the process as hung crates/palyra-daemon/src/background_queue.rs#108-112.
- Post-Run Reflection: After a run completes, a `REFLECTION_TASK_KIND` task is queued to analyze the interaction for new “Learning Candidates” (facts or preferences) crates/palyra-daemon/src/background_queue.rs#11-12.
- Cleanup: Tasks that exceed `expires_at_unix_ms` are marked as `expired` crates/palyra-daemon/src/background_queue.rs#115-142.
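One polling pass over the queue might look like the sketch below, which records a heartbeat, expires stale tasks, and dispatches the rest. The types and the `poll_once` function are hypothetical simplifications: the real worker reads from the `JournalStore` and is asynchronous, while this example uses an in-memory slice and a synchronous handler. Only the `expires_at_unix_ms` field name is taken from the text.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TaskState {
    Pending,
    Succeeded,
    Expired,
}

/// Hypothetical stand-in for a background task record.
#[derive(Debug, Clone)]
pub struct BackgroundTask {
    pub kind: String,
    pub expires_at_unix_ms: u64,
    pub state: TaskState,
}

/// Runs one polling pass and returns the number of tasks dispatched.
/// `last_heartbeat` models the heartbeat that tells a monitor the worker is alive.
pub fn poll_once<F>(
    tasks: &mut [BackgroundTask],
    now_unix_ms: u64,
    last_heartbeat: &mut u64,
    mut handler: F,
) -> usize
where
    F: FnMut(&BackgroundTask),
{
    let mut dispatched = 0;
    for task in tasks.iter_mut().filter(|t| t.state == TaskState::Pending) {
        // Record liveness before touching each task.
        *last_heartbeat = now_unix_ms;

        // Cleanup: tasks past their deadline are marked expired, not run.
        if now_unix_ms > task.expires_at_unix_ms {
            task.state = TaskState::Expired;
            continue;
        }

        handler(&*task);
        task.state = TaskState::Succeeded;
        dispatched += 1;
    }
    dispatched
}
```

Passing the current time in as a parameter, rather than reading the clock inside the function, keeps the expiry logic deterministic and easy to test.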
Architectural Diagrams
Run Orchestration Data Flow
This diagram bridges the Natural Language Space (User Intent) to the Code Entity Space (Orchestration logic). Sources: crates/palyra-daemon/src/transport/http/handlers/console/chat.rs#139-183, crates/palyra-daemon/src/application/provider_input.rs#53-64, crates/palyra-daemon/src/application/run_stream/orchestration.rs#107-113.
Session Compaction & Persistence
This diagram maps the logical compaction process to the underlying code entities. Sources: crates/palyra-daemon/src/application/session_compaction.rs#76-94, crates/palyra-daemon/src/application/session_compaction.rs#129-147, crates/palyra-daemon/src/application/session_compaction.rs#13-27.
Key Components Reference
| Component | Code Entity | Responsibility |
|---|---|---|
| Run Loop | `execute_run_stream_provider_request` | Manages the async future for LLM calls and handles cancellation polling crates/palyra-daemon/src/application/run_stream/orchestration.rs#147-183. |
| Input Builder | `PrepareModelProviderInputRequest` | Struct containing all data needed to build a context-aware prompt crates/palyra-daemon/src/application/provider_input.rs#53-64. |
| Background Worker | `poll_background_queue` | Fetches and dispatches pending background tasks from the journal crates/palyra-daemon/src/background_queue.rs#46-93. |
| Compaction Strategy | `SESSION_COMPACTION_STRATEGY` | Defines how the session window is calculated and condensed crates/palyra-daemon/src/application/session_compaction.rs#31-31. |