Gateway Service Interface
The GatewayService is defined in Protobuf and implemented in the daemon to handle high-concurrency, low-latency agent operations. It serves as the entry point for clients (CLI, Web Console, and Connectors) to submit messages and receive real-time updates.
Key gRPC Methods
- RouteMessage: The primary entry point for inbound user intent. It resolves the appropriate agent, session, and channel context before initiating a Run.
- RunStream: A bidirectional stream that allows the client to receive granular updates (tape events) and provide real-time inputs (like tool approvals) during execution.
The Run Lifecycle
Every interaction in Palyra is encapsulated within an OrchestratorRun. The lifecycle of a run is managed by a formal state machine (RunStateMachine), which ensures that transitions (e.g., from Accepted to InProgress) are valid and persisted.
RunLifecycleState Transitions
The system tracks the progress of a run through the following states:

| State | Description |
|---|---|
| Accepted | The run has been created and queued for execution. |
| InProgress | The orchestrator is actively processing the run (e.g., calling a model). |
| Done | The run completed successfully. |
| Cancelled | The run was terminated by a user request or system timeout. |
| Failed | An unrecoverable error occurred during execution. |
State Machine Flow Diagram
The following diagram illustrates the transition logic within the RunStateMachine and how it maps to the GatewayRuntimeState.
Sources: crates/palyra-daemon/src/orchestrator.rs#77-77, crates/palyra-daemon/src/application/run_stream/orchestration.rs#113-143, crates/palyra-daemon/src/application/run_stream/orchestration.rs#197-202
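The transition checking described above can be sketched as follows. This is a minimal illustration, not the daemon's code: only the Accepted to InProgress transition is confirmed by the text, so the rest of the transition table is an assumption, and `RunStateMachineSketch`, `InvalidTransition`, `can_transition_to`, and `is_terminal` are hypothetical names. Persistence is omitted.

```rust
/// Lifecycle states, mirroring the state table in this section.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RunLifecycleState {
    Accepted,
    InProgress,
    Done,
    Cancelled,
    Failed,
}

impl RunLifecycleState {
    /// Done, Cancelled, and Failed are treated as terminal (assumption).
    pub fn is_terminal(self) -> bool {
        matches!(self, Self::Done | Self::Cancelled | Self::Failed)
    }

    /// ASSUMPTION: an illustrative transition table. Only
    /// Accepted -> InProgress is named in the documentation.
    pub fn can_transition_to(self, next: Self) -> bool {
        matches!(
            (self, next),
            (Self::Accepted, Self::InProgress)
                | (Self::Accepted, Self::Cancelled)
                | (Self::Accepted, Self::Failed)
                | (Self::InProgress, Self::Done)
                | (Self::InProgress, Self::Cancelled)
                | (Self::InProgress, Self::Failed)
        )
    }
}

/// Error returned when a transition is rejected (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub struct InvalidTransition {
    pub from: RunLifecycleState,
    pub to: RunLifecycleState,
}

/// Hypothetical stand-in for RunStateMachine; it keeps state in memory only.
pub struct RunStateMachineSketch {
    state: RunLifecycleState,
}

impl RunStateMachineSketch {
    /// Every run starts in Accepted.
    pub fn new() -> Self {
        Self { state: RunLifecycleState::Accepted }
    }

    pub fn state(&self) -> RunLifecycleState {
        self.state
    }

    /// Applies the transition if the table allows it; otherwise the
    /// current state is left untouched and an error is returned.
    pub fn transition(&mut self, next: RunLifecycleState) -> Result<(), InvalidTransition> {
        if self.state.can_transition_to(next) {
            self.state = next;
            Ok(())
        } else {
            Err(InvalidTransition { from: self.state, to: next })
        }
    }
}
```

Rejecting the transition before mutating state means a failed call leaves the run exactly where it was, which is the property a persisted state machine would need as well.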
The Tape Metaphor
Palyra uses a Tape metaphor for event sequencing. A “Tape” is a strictly ordered, append-only sequence of JournalEvent records associated with a specific Run.
Characteristics of the Tape
- Immutability: Once an event is written to the tape, it cannot be changed.
- Sequencing: Every event has a tape_seq (a monotonically increasing integer) crates/palyra-daemon/src/gateway.rs#111-112.
- Replayability: The UI (Web Console) uses the tape to reconstruct the chat transcript and tool execution history apps/web/src/console/sections/OperationsSection.tsx#160-165.
Tape Event Types
The tape captures various granularities of the run:
- StatusChange: Transitions in the RunLifecycleState.
- ModelTurn: Partial or complete tokens from an LLM.
- ToolCall: Intent to execute a specific tool.
- ToolOutput: The result of a tool execution.
- ApprovalRequest: A pause in the tape requiring user intervention.
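To make the tape properties concrete, here is a small in-memory sketch of an append-only tape with a monotonically increasing tape_seq and a replay step. The event names come from the list above; everything else is assumed for illustration: the `String` payloads, the `Tape` and `JournalEventSketch` types, the `replay_transcript` helper, and the choice to start tape_seq at 0.

```rust
/// Event kinds named in the documentation. The String payloads are
/// placeholders; the real event bodies are not described in the text.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum TapeEventKind {
    StatusChange(String),
    ModelTurn(String),
    ToolCall(String),
    ToolOutput(String),
    ApprovalRequest(String),
}

/// Hypothetical simplification of a JournalEvent record.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct JournalEventSketch {
    pub tape_seq: u64,
    pub kind: TapeEventKind,
}

/// An append-only tape for a single run.
#[derive(Debug, Default)]
pub struct Tape {
    events: Vec<JournalEventSketch>,
}

impl Tape {
    /// Appends an event and returns its sequence number.
    /// ASSUMPTION: tape_seq starts at 0 and increases by 1 per event.
    pub fn append(&mut self, kind: TapeEventKind) -> u64 {
        let tape_seq = self.events.len() as u64;
        self.events.push(JournalEventSketch { tape_seq, kind });
        tape_seq
    }

    /// Read-only view: no method hands out mutable access to past
    /// events, which is how this sketch models immutability.
    pub fn events(&self) -> &[JournalEventSketch] {
        &self.events
    }

    /// Replays the tape in order and concatenates ModelTurn tokens,
    /// a simplified version of rebuilding a chat transcript.
    pub fn replay_transcript(&self) -> String {
        self.events
            .iter()
            .filter_map(|event| match &event.kind {
                TapeEventKind::ModelTurn(text) => Some(text.as_str()),
                _ => None,
            })
            .collect()
    }
}
```

Because the only write path is `append`, a reader that remembers the last tape_seq it saw can resume by skipping everything up to that number.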
Bidirectional RunStream
The RunStream is the high-performance conduit between the Daemon and the Client. It uses tokio::sync::mpsc channels to bridge gRPC streaming with internal orchestrator logic.
Data Flow: RunStream
- Client -> Daemon: Sends RunStreamInput, which can contain CancelRequest or ToolApprovalDecision.
- Daemon -> Client: Sends RunStreamEvent envelopes containing TapeItem batches.
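The two directions of the stream can be sketched with a pair of channels. To keep the example dependency-free it uses std::sync::mpsc as a stand-in for the tokio::sync::mpsc channels the daemon uses, so it is synchronous rather than async. The message names follow the list above, but their fields, the `serve_run` function, the string payloads, and the behaviour on a denied approval are all assumptions.

```rust
use std::sync::mpsc::{Receiver, Sender};

/// Client -> Daemon messages. Field shapes are assumed.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum RunStreamInput {
    CancelRequest,
    ToolApprovalDecision { approved: bool },
}

/// One tape entry as seen by the client (simplified placeholder).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct TapeItem {
    pub tape_seq: u64,
    pub payload: String,
}

/// Daemon -> Client envelope carrying a batch of tape items.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RunStreamEvent {
    pub items: Vec<TapeItem>,
}

/// Hypothetical run loop: emits a batch that ends in an approval
/// request, waits for one client input, then emits a final batch.
pub fn serve_run(inputs: Receiver<RunStreamInput>, events: Sender<RunStreamEvent>) {
    let mut seq = 0u64;
    let mut emit = |payloads: &[&str]| {
        let items: Vec<TapeItem> = payloads
            .iter()
            .map(|p| {
                let item = TapeItem { tape_seq: seq, payload: p.to_string() };
                seq += 1;
                item
            })
            .collect();
        // A send error only means the client went away; ignore it here.
        let _ = events.send(RunStreamEvent { items });
    };

    emit(&["status:InProgress", "approval_request:tool"]);

    match inputs.recv() {
        Ok(RunStreamInput::ToolApprovalDecision { approved: true }) => {
            emit(&["tool_output:ok", "status:Done"])
        }
        // ASSUMPTION: the source does not say what a denial leads to.
        Ok(RunStreamInput::ToolApprovalDecision { approved: false }) => {
            emit(&["tool_output:denied", "status:Done"])
        }
        // An explicit cancel and a disconnected client are treated alike.
        Ok(RunStreamInput::CancelRequest) | Err(_) => emit(&["status:Cancelled"]),
    }
}
```

In a client, the input sender and event receiver would live on the other side of the gRPC stream; here both ends can be driven from one thread because the std channel is unbounded.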
Implementation Details
Latency Budgets
The Gateway enforces strict latency budgets to ensure system responsiveness:
- Journal Write: 25ms crates/palyra-daemon/src/gateway.rs#92-92.
- Tool Execution: 200ms (for overhead, excluding the tool’s own runtime) crates/palyra-daemon/src/gateway.rs#93-93.
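One way to check an operation against these budgets is to time it and compare the elapsed duration. The sketch below uses the two values listed above; the constant names, `within_budget`, and `timed` are hypothetical, and the text does not say what the daemon does when a budget is exceeded, so this only measures and reports.

```rust
use std::time::{Duration, Instant};

/// Budget values from the list above; the constant names are assumed.
pub const JOURNAL_WRITE_BUDGET: Duration = Duration::from_millis(25);
pub const TOOL_EXECUTION_BUDGET: Duration = Duration::from_millis(200);

/// True when the measured duration does not exceed the budget.
pub fn within_budget(elapsed: Duration, budget: Duration) -> bool {
    elapsed <= budget
}

/// Runs `op`, measures its wall-clock time, and reports whether it
/// stayed within `budget`. The operation is never interrupted.
pub fn timed<T>(budget: Duration, op: impl FnOnce() -> T) -> (T, Duration, bool) {
    let start = Instant::now();
    let output = op();
    let elapsed = start.elapsed();
    (output, elapsed, within_budget(elapsed, budget))
}
```

A caller could wrap a journal write as `timed(JOURNAL_WRITE_BUDGET, || write_event())`, where `write_event` is a placeholder, and log or count the cases where the returned flag is false.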
Cancellation Logic
Cancellation is handled asynchronously. The RunStream loop polls the GatewayRuntimeState for a cancellation flag using a tokio::select! block. If a cancel is detected, the run transitions to Cancelled and the model provider request is dropped.
Sources: crates/palyra-daemon/src/application/run_stream/orchestration.rs#113-118, crates/palyra-daemon/src/application/run_stream/orchestration.rs#158-181
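The cancellation check can be illustrated without an async runtime. The sketch below is a synchronous simplification, not the tokio::select! loop itself: it polls an AtomicBool between model chunks and returns early when the flag is set, which drops the chunk source in the same way the daemon drops the provider request. `drive_run`, `RunOutcome`, and the use of an iterator to model the provider are assumptions.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// How the run ended, with the number of chunks delivered before that.
#[derive(Debug, PartialEq, Eq)]
pub enum RunOutcome {
    Done { chunks: usize },
    Cancelled { chunks: usize },
}

/// Pulls chunks from `provider` (a stand-in for a streaming model
/// request) and checks the cancellation flag before each pull.
pub fn drive_run<I>(
    cancel: &AtomicBool,
    provider: I,
    mut on_chunk: impl FnMut(&str),
) -> RunOutcome
where
    I: IntoIterator<Item = String>,
{
    let mut source = provider.into_iter();
    let mut chunks = 0;
    loop {
        if cancel.load(Ordering::Acquire) {
            // Returning here drops `source`, abandoning the remaining
            // chunks without reading them.
            return RunOutcome::Cancelled { chunks };
        }
        match source.next() {
            Some(chunk) => {
                on_chunk(&chunk);
                chunks += 1;
            }
            None => return RunOutcome::Done { chunks },
        }
    }
}
```

In the async version, waiting on the flag and waiting on the next chunk would be two branches of one select, so a cancel can interrupt a pending provider call instead of waiting for the next chunk boundary as this sketch does.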