LearningCandidate objects, which are then audited and applied to the user’s long-term memory or workspace.
Reflection Architecture and Data Flow
The pipeline is triggered upon the completion of an agent run. It uses a sampling heuristic to determine whether a run warrants reflection and then dispatches a background task to the BackgroundQueue.
Implementation Flow
- Task Ingestion: The GatewayRuntimeState identifies a terminal run state and may enqueue a reflection task crates/palyra-daemon/src/background_queue.rs#45-46.
- Queue Dispatch: The spawn_background_queue_loop leases OrchestratorBackgroundTaskRecord entries with the kind REFLECTION_TASK_KIND crates/palyra-daemon/src/background_queue.rs#45-46, 95-108.
- Processing: The function process_post_run_reflection_task invokes the LearningCurator to analyze the run tape crates/palyra-daemon/src/background_queue.rs#45-46.
- Candidate Generation: The curator produces LearningCandidate objects of various kinds (e.g., preference, patch_skill) crates/palyra-daemon/src/transport/http/handlers/console/memory.rs#34-40.
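The ingestion and dispatch steps above can be sketched with a small in-memory model. This is an illustration only: TaskRecord is a hypothetical stand-in for OrchestratorBackgroundTaskRecord, the string value of REFLECTION_TASK_KIND is assumed, and the sampling rule (a minimum tool-call count) is invented for the example because the real heuristic is not described here.

```rust
use std::collections::VecDeque;

// Assumption: the real constant's value is not shown in the source.
const REFLECTION_TASK_KIND: &str = "post_run_reflection";

/// Hypothetical stand-in for OrchestratorBackgroundTaskRecord.
#[derive(Debug, Clone, PartialEq)]
struct TaskRecord {
    kind: String,
    run_id: String,
}

/// Hypothetical sampling heuristic: only runs with enough tool calls
/// are considered worth reflecting on.
fn warrants_reflection(tool_calls: usize, min_tool_calls: usize) -> bool {
    tool_calls >= min_tool_calls
}

/// Task ingestion: enqueue a reflection task when a terminal run
/// passes the sampling check. Returns whether a task was enqueued.
fn maybe_enqueue(queue: &mut VecDeque<TaskRecord>, run_id: &str, tool_calls: usize) -> bool {
    if !warrants_reflection(tool_calls, 3) {
        return false;
    }
    queue.push_back(TaskRecord {
        kind: REFLECTION_TASK_KIND.to_string(),
        run_id: run_id.to_string(),
    });
    true
}

/// Queue dispatch: lease the oldest task of the reflection kind,
/// leaving tasks of other kinds in the queue.
fn lease_reflection_task(queue: &mut VecDeque<TaskRecord>) -> Option<TaskRecord> {
    let idx = queue.iter().position(|t| t.kind == REFLECTION_TASK_KIND)?;
    queue.remove(idx)
}
```

In this sketch a run with too few tool calls is never enqueued, and leasing skips over unrelated task kinds rather than consuming them. The real loop presumably also tracks lease expiry and retries, which are omitted here.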
Reflection Pipeline Overview
Title: Post-Run Reflection and Learning Loop
Sources: crates/palyra-daemon/src/background_queue.rs#1-14, crates/palyra-daemon/src/background_queue.rs#95-108, crates/palyra-daemon/src/application/learning.rs#1-15
LearningCandidate Kinds
Learning candidates represent proposed changes to the agent’s behavior or knowledge base. They are categorized into specific kinds:
| Kind | Description | Implementation |
|---|---|---|
| preference | User-specific stylistic or operational choices (e.g., “always use Python for data analysis”). | apply_preference_candidate crates/palyra-daemon/src/transport/http/handlers/console/memory.rs#34-36 |
| durable_fact | Long-term information extracted from the session (e.g., project milestones). | Continuity Planner crates/palyra-daemon/src/application/session_compaction.rs#7-11 |
| procedure | Step-by-step instructions for complex tasks discovered during the run. | LearningCurator crates/palyra-daemon/src/transport/http/handlers/console/memory.rs#37-40 |
| patch_skill | Proposed modifications to existing .palyra-skill artifacts or logic. | apply_patch_learning_candidate crates/palyra-daemon/src/transport/http/handlers/console/memory.rs#34-36 |
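One way to model the four kinds in the table is a closed enum that round-trips to the string tags. The enum name, its variants, and the as_str helper are hypothetical; only the four tag strings come from the table, and the actual representation in the daemon may differ.

```rust
use std::str::FromStr;

/// Hypothetical enum mirroring the kinds listed in the table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum LearningCandidateKind {
    Preference,
    DurableFact,
    Procedure,
    PatchSkill,
}

impl LearningCandidateKind {
    /// The string tag used for this kind.
    fn as_str(self) -> &'static str {
        match self {
            Self::Preference => "preference",
            Self::DurableFact => "durable_fact",
            Self::Procedure => "procedure",
            Self::PatchSkill => "patch_skill",
        }
    }
}

impl FromStr for LearningCandidateKind {
    type Err = String;

    /// Parses a kind tag, rejecting anything outside the four known kinds.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "preference" => Ok(Self::Preference),
            "durable_fact" => Ok(Self::DurableFact),
            "procedure" => Ok(Self::Procedure),
            "patch_skill" => Ok(Self::PatchSkill),
            other => Err(format!("unknown learning candidate kind: {other}")),
        }
    }
}
```

Using an exhaustive match means that adding a fifth kind would produce a compile error at every site that must handle it, which suits a pipeline where each kind has its own apply path.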
Learning Curator Audit and Hygiene
Before a candidate is applied, it undergoes an audit process to ensure safety and consistency.
Skill Invocation Hygiene
The system projects the impact of a learning candidate on future tool usage. The function project_skill_invocation_hygiene_for_candidate crates/palyra-daemon/src/transport/http/handlers/console/memory.rs#36-37 analyzes whether a new preference or procedure would cause the agent to violate tool security postures or egress policies.
Conflict Detection
The preference_procedure_conflict_report crates/palyra-daemon/src/transport/http/handlers/console/memory.rs#36-37 identifies contradictions between new candidates and existing curated knowledge. This uses CONTRADICTION_PAIRS (e.g., “enable” vs “disable”) to flag candidates for manual review if they sit on opposite sides of a logic boundary crates/palyra-daemon/src/application/session_compaction.rs#119-127.
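A minimal sketch of pair-based contradiction detection follows. Only the “enable”/“disable” pair is named in the text; the other pairs, the tokenizer, and the find_contradiction function are assumptions made for illustration and are not the daemon's actual logic.

```rust
// Assumption: only ("enable", "disable") is confirmed by the text;
// the remaining pairs are illustrative.
const CONTRADICTION_PAIRS: &[(&str, &str)] = &[
    ("enable", "disable"),
    ("always", "never"),
    ("allow", "deny"),
];

/// Splits text into lowercase alphanumeric words.
fn tokens(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|w| !w.is_empty())
        .map(|w| w.to_lowercase())
        .collect()
}

/// True when `term` appears as an exact word in `words`.
fn contains_term(words: &[String], term: &str) -> bool {
    words.iter().any(|w| w.as_str() == term)
}

/// Returns the first pair whose two terms are split across the candidate
/// and the existing entry, in either direction. Matching is on exact
/// words, so inflected forms such as "disabled" are not detected.
fn find_contradiction(candidate: &str, existing: &str) -> Option<(&'static str, &'static str)> {
    let cand = tokens(candidate);
    let exist = tokens(existing);
    CONTRADICTION_PAIRS.iter().copied().find(|&(a, b)| {
        (contains_term(&cand, a) && contains_term(&exist, b))
            || (contains_term(&cand, b) && contains_term(&exist, a))
    })
}
```

A check this coarse will flag entries that merely share opposing keywords without sharing a subject. That is tolerable when a hit only routes the candidate to manual review rather than rejecting it outright.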
Data Flow: Candidate Audit
Title: Learning Candidate Audit Projection
Sources: crates/palyra-daemon/src/transport/http/handlers/console/memory.rs#34-40, crates/palyra-daemon/src/application/session_compaction.rs#119-127
Session Compaction Integration
The learning pipeline is tightly coupled with Durable Session Compaction. While ephemeral pruning only affects the current prompt, compaction durably changes future prompts by writing “continuity” data into curated workspace documents crates/palyra-daemon/src/application/session_compaction.rs#1-17.
- Continuity Planner: Scans the condensed event range for facts and decisions crates/palyra-daemon/src/application/session_compaction.rs#7-11.
- Confidence Threshold: Candidates with a score below AUTO_WRITE_CONFIDENCE_THRESHOLD (0.82) are routed to the web console for operator review crates/palyra-daemon/src/application/session_compaction.rs#90-92.
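The threshold rule can be expressed as a small routing function. The constant name and its value of 0.82 come from the text above. The f64 type, the Disposition enum, and route_candidate are hypothetical, and treating a score exactly at the threshold as auto-writable is an inference from the word “below”.

```rust
// Name and value from the documentation; the f64 type is an assumption.
const AUTO_WRITE_CONFIDENCE_THRESHOLD: f64 = 0.82;

/// Hypothetical outcome of routing a continuity candidate.
#[derive(Debug, PartialEq, Eq)]
enum Disposition {
    /// Written to curated workspace documents without review.
    AutoWrite,
    /// Sent to the web console for operator review.
    OperatorReview,
}

/// Routes a candidate by confidence score. Scores below the threshold
/// go to review. A NaN score also goes to review, because every
/// comparison with NaN evaluates to false.
fn route_candidate(confidence: f64) -> Disposition {
    if confidence >= AUTO_WRITE_CONFIDENCE_THRESHOLD {
        Disposition::AutoWrite
    } else {
        Disposition::OperatorReview
    }
}
```

Writing the comparison as `>=` on the auto-write branch makes review the fallback, so a malformed score can never bypass the operator.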
Web Console Memory/Learning UI
The reflection pipeline is managed via the Memory Section of the Web Console.
Implementation Details
- React State: Managed by useConsoleAppState.tsx, specifically properties like memoryLearningCandidates and memoryLearningCuratorReport apps/web/src/console/sections/MemorySection.tsx#70-75.
- API Interactions: The ConsoleApiClient provides methods such as reviewLearningCandidate and applyLearningCandidate to transition candidates from proposed to applied states apps/web/src/console/sections/MemorySection.tsx#110-111.
- UI Components: The MemorySection.tsx component renders tables of learning history, active preferences, and the curator’s conflict reports apps/web/src/console/sections/MemorySection.tsx#131-150.
Status Handling
Candidates in terminal states such as applied, denied, or conflicted are filtered using NON_APPLYABLE_LEARNING_PATCH_STATUSES to prevent redundant operations apps/web/src/console/sections/MemorySection.tsx#122-129.
Sources: apps/web/src/console/sections/MemorySection.tsx#1-112, apps/web/src/console/useConsoleAppState.tsx#204-210