Post-Run Learning and Reflection Pipeline

The Post-Run Learning and Reflection Pipeline is a subsystem within Palyra responsible for distilling transient agent interactions into durable knowledge, preferences, and skills. It operates as an asynchronous background process that analyzes completed runs to generate LearningCandidate objects, which are then audited and applied to the user’s long-term memory or workspace.

Reflection Architecture and Data Flow

The pipeline is triggered upon the completion of an agent run. It uses a sampling heuristic to determine if a run warrants reflection and then dispatches a background task to the BackgroundQueue.

Implementation Flow

Task Ingestion: The GatewayRuntimeState identifies a terminal run state and may enqueue a reflection task crates/palyra-daemon/src/background_queue.rs#45-46.
Queue Dispatch: The spawn_background_queue_loop leases OrchestratorBackgroundTaskRecord entries with the kind REFLECTION_TASK_KIND crates/palyra-daemon/src/background_queue.rs#45-46, crates/palyra-daemon/src/background_queue.rs#95-108.
Processing: The function process_post_run_reflection_task invokes the LearningCurator to analyze the run tape crates/palyra-daemon/src/background_queue.rs#45-46.
Candidate Generation: The curator produces LearningCandidate objects of various kinds (e.g., preference, patch_skill) crates/palyra-daemon/src/transport/http/handlers/console/memory.rs#34-40.
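The four steps above can be sketched as a small in-memory model. This is only an illustration under stated assumptions: the names BackgroundQueue, REFLECTION_TASK_KIND, and LearningCandidate come from the text, while the string value of the constant, the TaskRecord fields, the method names, and the placeholder "analysis" are hypothetical and do not reflect the actual code in background_queue.rs.

```rust
use std::collections::VecDeque;

// Assumed value; only the constant's name appears in the source text.
const REFLECTION_TASK_KIND: &str = "post_run_reflection";

// Hypothetical, simplified stand-in for OrchestratorBackgroundTaskRecord.
#[derive(Debug, Clone, PartialEq)]
struct TaskRecord {
    kind: String,
    run_id: String,
}

// Hypothetical, simplified candidate shape.
#[derive(Debug, PartialEq)]
struct LearningCandidate {
    kind: String,
    summary: String,
}

#[derive(Default)]
struct BackgroundQueue {
    pending: VecDeque<TaskRecord>,
}

impl BackgroundQueue {
    /// Task ingestion: enqueue a reflection task for a run in a terminal state.
    fn enqueue_reflection(&mut self, run_id: &str) {
        self.pending.push_back(TaskRecord {
            kind: REFLECTION_TASK_KIND.to_string(),
            run_id: run_id.to_string(),
        });
    }

    /// Queue dispatch: lease the oldest task of the given kind, leaving
    /// tasks of other kinds in the queue.
    fn lease(&mut self, kind: &str) -> Option<TaskRecord> {
        let idx = self.pending.iter().position(|t| t.kind == kind)?;
        self.pending.remove(idx)
    }
}

/// Processing and candidate generation. The "analysis" here is a placeholder
/// that treats any tape line starting with "always " as a preference.
fn process_reflection(task: &TaskRecord, tape: &[&str]) -> Vec<LearningCandidate> {
    tape.iter()
        .filter(|line| line.to_lowercase().starts_with("always "))
        .map(|line| LearningCandidate {
            kind: "preference".to_string(),
            summary: format!("{}: {}", task.run_id, line),
        })
        .collect()
}
```

Leasing by kind keeps reflection work separate from any other background task kinds that share the queue, which is the property the dispatch step relies on.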

Reflection Pipeline Overview

Title: Post-Run Reflection and Learning Loop
Sources: crates/palyra-daemon/src/background_queue.rs#1-14, crates/palyra-daemon/src/background_queue.rs#95-108, crates/palyra-daemon/src/application/learning.rs#1-15

LearningCandidate Kinds

Learning candidates represent proposed changes to the agent’s behavior or knowledge base. They are categorized into specific kinds:

Kind	Description	Implementation
`preference`	User-specific stylistic or operational choices (e.g., “always use Python for data analysis”).	`apply_preference_candidate` crates/palyra-daemon/src/transport/http/handlers/console/memory.rs#34-36
`durable_fact`	Long-term information extracted from the session (e.g., project milestones).	`Continuity Planner` crates/palyra-daemon/src/application/session_compaction.rs#7-11
`procedure`	Step-by-step instructions for complex tasks discovered during the run.	`LearningCurator` crates/palyra-daemon/src/transport/http/handlers/console/memory.rs#37-40
`patch_skill`	Proposed modifications to existing `.palyra-skill` artifacts or logic.	`apply_patch_learning_candidate` crates/palyra-daemon/src/transport/http/handlers/console/memory.rs#34-36

Sources: crates/palyra-daemon/src/application/session_compaction.rs#1-17, crates/palyra-daemon/src/transport/http/handlers/console/memory.rs#34-40
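One way to model the four kinds is a closed enum that parses the kind strings from the table and maps each to the component listed in the Implementation column. This is a sketch, not the project's actual type: the enum name CandidateKind and its methods are hypothetical, and only the kind strings and component names are taken from the table.

```rust
use std::str::FromStr;

// Hypothetical enum; the variants mirror the four kinds in the table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CandidateKind {
    Preference,
    DurableFact,
    Procedure,
    PatchSkill,
}

impl FromStr for CandidateKind {
    type Err = String;

    /// Parse the wire-format kind string; unknown kinds are rejected
    /// rather than silently mapped to a default.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "preference" => Ok(Self::Preference),
            "durable_fact" => Ok(Self::DurableFact),
            "procedure" => Ok(Self::Procedure),
            "patch_skill" => Ok(Self::PatchSkill),
            other => Err(format!("unknown candidate kind: {}", other)),
        }
    }
}

impl CandidateKind {
    /// The kind string as it appears in the table.
    fn as_str(self) -> &'static str {
        match self {
            Self::Preference => "preference",
            Self::DurableFact => "durable_fact",
            Self::Procedure => "procedure",
            Self::PatchSkill => "patch_skill",
        }
    }

    /// The component named in the table's Implementation column.
    fn handler(self) -> &'static str {
        match self {
            Self::Preference => "apply_preference_candidate",
            Self::DurableFact => "Continuity Planner",
            Self::Procedure => "LearningCurator",
            Self::PatchSkill => "apply_patch_learning_candidate",
        }
    }
}
```

Using an exhaustive match means that adding a fifth kind forces every dispatch site to be updated at compile time.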

Learning Curator Audit and Hygiene

Before a candidate is applied, it undergoes an audit process to ensure safety and consistency.

Skill Invocation Hygiene

The system projects the impact of a learning candidate on future tool usage. The function project_skill_invocation_hygiene_for_candidate checks whether a new preference or procedure would cause the agent to violate tool security postures or egress policies crates/palyra-daemon/src/transport/http/handlers/console/memory.rs#36-37.
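The idea of a hygiene projection can be illustrated with a minimal sketch: given the tools and hosts a candidate would lead the agent to use, compare them against an allow-list policy and report every violation. All types and the function name below are hypothetical; the real signature and policy model of project_skill_invocation_hygiene_for_candidate are not shown in this document.

```rust
use std::collections::HashSet;

// Hypothetical policy: allow-lists for tools and egress hosts.
struct ToolPolicy {
    allowed_tools: HashSet<String>,
    allowed_egress_hosts: HashSet<String>,
}

// Hypothetical projection of what a candidate would make the agent use.
struct ProjectedUsage {
    tools: Vec<String>,
    egress_hosts: Vec<String>,
}

#[derive(Debug, PartialEq)]
enum HygieneViolation {
    ToolNotAllowed(String),
    EgressNotAllowed(String),
}

/// Return every projected tool or host that falls outside the policy.
/// An empty result means the candidate passes this check.
fn project_hygiene(usage: &ProjectedUsage, policy: &ToolPolicy) -> Vec<HygieneViolation> {
    let tools = usage
        .tools
        .iter()
        .filter(|t| !policy.allowed_tools.contains(*t))
        .map(|t| HygieneViolation::ToolNotAllowed(t.clone()));
    let hosts = usage
        .egress_hosts
        .iter()
        .filter(|h| !policy.allowed_egress_hosts.contains(*h))
        .map(|h| HygieneViolation::EgressNotAllowed(h.clone()));
    tools.chain(hosts).collect()
}
```

Collecting all violations, rather than stopping at the first, gives a reviewer the full picture of why a candidate was held back.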

Conflict Detection

The preference_procedure_conflict_report identifies contradictions between new candidates and existing curated knowledge crates/palyra-daemon/src/transport/http/handlers/console/memory.rs#36-37. It uses CONTRADICTION_PAIRS (e.g., “enable” vs “disable”) to flag a candidate for manual review when it and an existing entry use opposing terms crates/palyra-daemon/src/application/session_compaction.rs#119-127.
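A pair-based contradiction check might look like the following sketch. Only the constant name CONTRADICTION_PAIRS and the "enable"/"disable" pair come from the text; the second pair, the tokenizer, and the rule that both statements must share a topic word of at least four characters are assumptions made for illustration.

```rust
// The ("enable", "disable") pair is from the text; ("always", "never") is assumed.
const CONTRADICTION_PAIRS: &[(&str, &str)] = &[("enable", "disable"), ("always", "never")];

/// Lowercased alphanumeric tokens of a statement.
fn words(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|w| !w.is_empty())
        .map(|w| w.to_lowercase())
        .collect()
}

fn has(tokens: &[String], term: &str) -> bool {
    tokens.iter().any(|w| w.as_str() == term)
}

/// Return the first contradiction pair for which the two statements use
/// opposite terms while sharing a topic word (assumed: 4+ characters).
fn find_contradiction(candidate: &str, existing: &str) -> Option<(&'static str, &'static str)> {
    let a = words(candidate);
    let b = words(existing);
    CONTRADICTION_PAIRS.iter().copied().find(|&(x, y)| {
        let opposed = (has(&a, x) && has(&b, y)) || (has(&a, y) && has(&b, x));
        let shared_topic = a
            .iter()
            .any(|w| w.len() >= 4 && w.as_str() != x && w.as_str() != y && b.contains(w));
        opposed && shared_topic
    })
}
```

For example, "Enable telemetry uploads" checked against "disable telemetry for this workspace" yields the ("enable", "disable") pair, while "never push to main" against "always run tests" yields nothing because the statements share no topic word.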

Data Flow: Candidate Audit

Title: Learning Candidate Audit Projection
Sources: crates/palyra-daemon/src/transport/http/handlers/console/memory.rs#34-40, crates/palyra-daemon/src/application/session_compaction.rs#119-127

Session Compaction Integration

The learning pipeline is tightly coupled with Durable Session Compaction. While ephemeral pruning only affects the current prompt, compaction durably changes future prompts by writing “continuity” data into curated workspace documents crates/palyra-daemon/src/application/session_compaction.rs#1-17.

Continuity Planner: Scans the condensed event range for facts and decisions crates/palyra-daemon/src/application/session_compaction.rs#7-11.
Confidence Threshold: Candidates with a score below AUTO_WRITE_CONFIDENCE_THRESHOLD (0.82) are routed to the web console for operator review crates/palyra-daemon/src/application/session_compaction.rs#90-92.

Sources: crates/palyra-daemon/src/application/session_compaction.rs#1-17, crates/palyra-daemon/src/application/session_compaction.rs#90-92
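The threshold rule can be expressed as a single routing function. The constant name and its value of 0.82 are taken from the text; the f64 type, the Routing enum, and the function name are assumptions, and the sketch treats a score exactly at the threshold as eligible for automatic writing because the text only routes scores below it to review.

```rust
// Name and value from the text; the numeric type is assumed.
const AUTO_WRITE_CONFIDENCE_THRESHOLD: f64 = 0.82;

// Hypothetical outcome type.
#[derive(Debug, PartialEq)]
enum Routing {
    AutoWrite,
    OperatorReview,
}

/// Route a candidate by confidence score. Written as `>=` so that a NaN
/// score fails the comparison and falls through to operator review.
fn route_candidate(confidence: f64) -> Routing {
    if confidence >= AUTO_WRITE_CONFIDENCE_THRESHOLD {
        Routing::AutoWrite
    } else {
        Routing::OperatorReview
    }
}
```

Sending anything that fails the comparison to review is the conservative default: a malformed score never results in an unattended write.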

Web Console Memory/Learning UI

The reflection pipeline is managed via the Memory Section of the Web Console.

Implementation Details

React State: Managed by useConsoleAppState.tsx, specifically properties like memoryLearningCandidates and memoryLearningCuratorReport apps/web/src/console/sections/MemorySection.tsx#70-75.
API Interactions: The ConsoleApiClient provides methods such as reviewLearningCandidate and applyLearningCandidate to transition candidates from proposed to applied states apps/web/src/console/sections/MemorySection.tsx#110-111.
UI Components: The MemorySection.tsx component renders tables of learning history, active preferences, and the curator’s conflict reports apps/web/src/console/sections/MemorySection.tsx#131-150.

Status Handling

Candidates in terminal states such as applied, denied, or conflicted are filtered using NON_APPLYABLE_LEARNING_PATCH_STATUSES to prevent redundant operations apps/web/src/console/sections/MemorySection.tsx#122-129.

Sources: apps/web/src/console/sections/MemorySection.tsx#1-112, apps/web/src/console/useConsoleAppState.tsx#204-210
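The status filter amounts to excluding any candidate whose status appears in a fixed list. The sketch below is written in Rust for illustration, although the constant itself lives in the TypeScript console code. The three statuses are the ones named in the text and the real list may contain more; the Candidate struct and the function name are hypothetical.

```rust
// Statuses named in the text; the actual constant may list additional ones.
const NON_APPLYABLE_LEARNING_PATCH_STATUSES: &[&str] = &["applied", "denied", "conflicted"];

// Hypothetical, simplified candidate row.
#[derive(Debug)]
struct Candidate {
    id: String,
    status: String,
}

/// Keep only candidates that can still be applied, i.e. those whose
/// status is not in the non-applyable list.
fn applyable_candidates(candidates: &[Candidate]) -> Vec<&Candidate> {
    candidates
        .iter()
        .filter(|c| !NON_APPLYABLE_LEARNING_PATCH_STATUSES.contains(&c.status.as_str()))
        .collect()
}
```

Filtering before rendering the apply action means an operator cannot re-apply or re-deny a candidate that has already reached a terminal state.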
