InstructionCompiler.
Retrieval Architecture and Hybrid Search
Palyra utilizes a multi-stage retrieval pipeline to resolve relevant context from the JournalStore. The system performs hybrid search by merging results from SQLite FTS5 (lexical) and vector similarity (semantic) indices crates/palyra-daemon/src/journal.rs#135-143.
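As a rough illustration of the merge step, the sketch below unions the hits from a lexical branch and a vector branch by item id, so that each candidate carries whichever branch scores it received. The Candidate type, the merge_branches function, and the use of u64 ids are assumptions made for this example; they are not taken from journal.rs.

```rust
use std::collections::BTreeMap;

/// Hypothetical candidate record: one optional score per retrieval branch.
#[derive(Debug, Clone, PartialEq, Default)]
pub struct Candidate {
    pub lexical: Option<f64>,
    pub vector: Option<f64>,
}

/// Unions two branch result lists by item id.
/// An item found by only one branch keeps `None` for the other score.
pub fn merge_branches(
    lexical_hits: &[(u64, f64)],
    vector_hits: &[(u64, f64)],
) -> BTreeMap<u64, Candidate> {
    let mut merged: BTreeMap<u64, Candidate> = BTreeMap::new();
    for &(id, score) in lexical_hits {
        merged.entry(id).or_default().lexical = Some(score);
    }
    for &(id, score) in vector_hits {
        merged.entry(id).or_default().vector = Some(score);
    }
    merged
}
```

Keeping the two scores separate until the final scoring step means an item matched by only one index is still a valid candidate rather than being dropped at merge time.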
Scoring Components
The final relevance score for a retrieval candidate is calculated using score_with_profile crates/palyra-daemon/src/gateway/runtime.rs#107-110, which aggregates:
- Lexical Score: BM25-based matching via SQLite FTS5 virtual tables crates/palyra-daemon/src/journal.rs#138-140.
- Vector Score: Cosine similarity against embeddings produced by a MemoryEmbeddingProvider crates/palyra-daemon/src/journal.rs#168-175.
- Recency Score: A decay function that favors newer information based on created_at_unix_ms crates/palyra-daemon/src/retrieval.rs#106-107.
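The three components can be combined in many ways; a weighted sum is one common choice. The sketch below is illustrative only and is not the body of score_with_profile: the ScoreProfile fields, the weights, the exponential half-life decay, and the assumption that the lexical score has already been normalized to the range 0 to 1 are all hypothetical.

```rust
/// Hypothetical weighting profile; field names and values are illustrative.
#[derive(Debug, Clone, Copy)]
pub struct ScoreProfile {
    pub lexical_weight: f64,
    pub vector_weight: f64,
    pub recency_weight: f64,
    /// Assumed half-life of the recency decay, in milliseconds (must be > 0).
    pub recency_half_life_ms: f64,
}

/// Cosine similarity of two embeddings.
/// Returns 0.0 for empty, mismatched-length, or zero-norm vectors.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> f64 {
    if a.len() != b.len() || a.is_empty() {
        return 0.0;
    }
    let (mut dot, mut norm_a, mut norm_b) = (0.0f64, 0.0f64, 0.0f64);
    for (x, y) in a.iter().zip(b.iter()) {
        let (x, y) = (f64::from(*x), f64::from(*y));
        dot += x * y;
        norm_a += x * x;
        norm_b += y * y;
    }
    if norm_a == 0.0 || norm_b == 0.0 {
        return 0.0;
    }
    dot / (norm_a.sqrt() * norm_b.sqrt())
}

/// Exponential decay: 1.0 for a brand-new item, 0.5 after one half-life.
/// Items timestamped in the future are treated as brand new.
pub fn recency_score(created_at_unix_ms: i64, now_unix_ms: i64, half_life_ms: f64) -> f64 {
    let age_ms = now_unix_ms.saturating_sub(created_at_unix_ms).max(0) as f64;
    0.5f64.powf(age_ms / half_life_ms)
}

/// Weighted sum of the three component scores.
pub fn combined_score(profile: &ScoreProfile, lexical: f64, vector: f64, recency: f64) -> f64 {
    profile.lexical_weight * lexical
        + profile.vector_weight * vector
        + profile.recency_weight * recency
}
```

With weights that sum to 1 and inputs in the range 0 to 1, the combined score also stays in that range, which makes a min_score threshold easier to reason about.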
Retrieval Flow Diagram
This diagram illustrates the flow from a natural language query to the assembly of a RecallPlan within the GatewayRuntimeState.
Natural Language to Retrieval Plan Flow
Sources: crates/palyra-daemon/src/gateway/runtime.rs#105-110, crates/palyra-daemon/src/journal.rs#135-143, crates/palyra-daemon/src/application/tool_runtime/memory.rs#1-10
Recall Plan and External Indexing
The RecallPlan is the structured output of the retrieval process. It includes RetrievalBranchDiagnostics to provide transparency into why specific items were selected or filtered crates/palyra-daemon/src/journal.rs#81-82.
External Retrieval Index
For large-scale workspace documents, Palyra supports an external retrieval index. This is managed via the ExternalRetrievalRuntime crates/palyra-daemon/src/retrieval/external_index.rs.
- Chunking: Documents are split into segments based on WORKSPACE_CHUNK_TARGET_BYTES (default 1024), with consecutive segments overlapping by WORKSPACE_CHUNK_OVERLAP_BYTES (default 160) crates/palyra-daemon/src/journal.rs#156-157.
- Status Tracking: The WorkspaceRetrievalIndexStatus tracks the health and backfill progress of the vector index crates/palyra-daemon/src/journal/retrieval_index_status.rs#73-73.
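To make the chunking parameters concrete, here is a minimal sketch of fixed-size chunking with overlap. The two constants use the defaults quoted above; the chunk_text function, its helper, and the choice to cut only on UTF-8 character boundaries are assumptions for illustration, and the actual splitter may also respect line or paragraph boundaries.

```rust
/// Defaults quoted in the text above.
pub const WORKSPACE_CHUNK_TARGET_BYTES: usize = 1024;
pub const WORKSPACE_CHUNK_OVERLAP_BYTES: usize = 160;

/// Moves `index` back to the nearest UTF-8 character boundary at or before it.
fn floor_to_char_boundary(text: &str, mut index: usize) -> usize {
    if index >= text.len() {
        return text.len();
    }
    while !text.is_char_boundary(index) {
        index -= 1;
    }
    index
}

/// Splits `text` into chunks of at most `target` bytes, where each chunk
/// starts roughly `overlap` bytes before the end of the previous one.
/// (A single character wider than `target` is emitted as its own chunk.)
pub fn chunk_text(text: &str, target: usize, overlap: usize) -> Vec<&str> {
    assert!(target > 0 && overlap < target, "overlap must be smaller than target");
    let mut chunks = Vec::new();
    let mut start = 0;
    while start < text.len() {
        let mut end = floor_to_char_boundary(text, start + target);
        if end <= start {
            // Guarantee progress by taking one whole character.
            end = start + text[start..].chars().next().map_or(1, char::len_utf8);
        }
        chunks.push(&text[start..end]);
        if end == text.len() {
            break;
        }
        // Step back by the overlap, but never to or before the current start.
        let next = floor_to_char_boundary(text, end.saturating_sub(overlap));
        start = if next > start { next } else { end };
    }
    chunks
}
```

For a 2000-byte ASCII document and the default constants, this yields three chunks covering bytes 0..1024, 864..1888, and 1728..2000, so text near a chunk boundary is always present in full in at least one neighbor.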
Memory Scopes
Retrieval is strictly partitioned by MemoryLifecycleScope crates/palyra-daemon/src/application/memory.rs#42-42:
- principal: Global user-level facts and preferences.
- session: Context specific to the current conversation.
- channel: Platform-specific context (e.g., Discord server info).
- workspace/project: Filesystem-backed knowledge crates/palyra-daemon/src/application/tool_runtime/memory.rs#19-21.
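One way to picture the partitioning is as an enum that every stored item carries and every query filters on. The sketch below is hypothetical: the Scope enum, the ScopedItem type, and the decision to treat "workspace" and "project" as two names for one partition are assumptions, not the definition of MemoryLifecycleScope.

```rust
use std::str::FromStr;

/// Hypothetical stand-in for the scopes listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum Scope {
    Principal,
    Session,
    Channel,
    Workspace,
}

impl FromStr for Scope {
    type Err = String;

    fn from_str(value: &str) -> Result<Self, Self::Err> {
        match value {
            "principal" => Ok(Scope::Principal),
            "session" => Ok(Scope::Session),
            "channel" => Ok(Scope::Channel),
            // Assumption: both names refer to the same partition.
            "workspace" | "project" => Ok(Scope::Workspace),
            other => Err(format!("unknown memory scope: {other}")),
        }
    }
}

/// Hypothetical stored item tagged with its scope.
#[derive(Debug, Clone, PartialEq)]
pub struct ScopedItem {
    pub scope: Scope,
    pub text: String,
}

/// Returns only the items that belong to the requested scope.
pub fn filter_by_scope(items: &[ScopedItem], scope: Scope) -> Vec<&ScopedItem> {
    items.iter().filter(|item| item.scope == scope).collect()
}
```

Rejecting unknown scope names at parse time, rather than falling back to a default, keeps a mistyped scope from silently widening a search.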
Instruction Compilation and Context Assembly
Once retrieval is complete, the InstructionCompiler assembles the final prompt context. This process ensures that retrieved information is presented to the model with appropriate “Trust Labels” and “Claim Boundaries” crates/palyra-daemon/src/application/instruction_compiler.rs#1-10.
Instruction Selection Logic
The InstructionCompiler generates CompiledInstructions based on:
- Tool Catalog: Available tools for the turn crates/palyra-daemon/src/application/instruction_compiler.rs#61-61.
- Trust Summary: Aggregated safety posture of the retrieved context blocks crates/palyra-daemon/src/application/instruction_compiler.rs#31-38.
- Temporal Evidence: Current UTC/Unix timestamps to prevent model hallucination of dates crates/palyra-daemon/src/application/instruction_compiler.rs#149-150.
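As an illustration of the temporal evidence input, the sketch below renders a Unix millisecond timestamp as both an ISO 8601 UTC string and the raw number, using only the standard library. The function names and the key=value line format are assumptions for this example; the text the InstructionCompiler actually emits is not shown in the source.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Converts days since 1970-01-01 to a (year, month, day) civil date
/// in the proleptic Gregorian calendar.
fn civil_from_days(days: i64) -> (i64, i64, i64) {
    let z = days + 719_468;
    let era = z.div_euclid(146_097);
    let doe = z.rem_euclid(146_097); // day of 400-year era
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365;
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // day of year, March-based
    let mp = (5 * doy + 2) / 153; // month index, March = 0
    let day = doy - (153 * mp + 2) / 5 + 1;
    let month = if mp < 10 { mp + 3 } else { mp - 9 };
    let year = yoe + era * 400;
    let year = if month <= 2 { year + 1 } else { year };
    (year, month, day)
}

/// Formats a Unix timestamp in milliseconds as `YYYY-MM-DDTHH:MM:SSZ`.
pub fn format_utc(unix_ms: i64) -> String {
    let secs = unix_ms.div_euclid(1_000);
    let (year, month, day) = civil_from_days(secs.div_euclid(86_400));
    let second_of_day = secs.rem_euclid(86_400);
    format!(
        "{:04}-{:02}-{:02}T{:02}:{:02}:{:02}Z",
        year,
        month,
        day,
        second_of_day / 3_600,
        (second_of_day % 3_600) / 60,
        second_of_day % 60
    )
}

/// Hypothetical evidence line carrying both representations of "now".
pub fn temporal_evidence_line(now_unix_ms: i64) -> String {
    format!(
        "current_time_utc={} current_time_unix_ms={}",
        format_utc(now_unix_ms),
        now_unix_ms
    )
}

/// Current wall-clock time in Unix milliseconds (0 if the clock predates 1970).
pub fn now_unix_ms() -> i64 {
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|elapsed| elapsed.as_millis() as i64)
        .unwrap_or(0)
}
```

Calling temporal_evidence_line(now_unix_ms()) at the start of each turn gives the model an explicit reference point, so relative phrases such as "last week" can be resolved against real timestamps instead of training-time guesses.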
Context Assembly Diagram
This diagram shows how retrieved memory items are transformed into provider messages.
Recall to Instruction Compiler Pipeline
Sources: crates/palyra-daemon/src/application/instruction_compiler.rs#1-26, crates/palyra-daemon/src/application/instruction_compiler.rs#89-104, crates/palyra-daemon/src/application/tool_runtime/memory.rs#106-121
Key Implementation Details
Claim Boundaries
To prevent the model from making false assertions about missing information, Palyra uses “Claim Boundaries” crates/palyra-daemon/src/application/tool_runtime/memory.rs#12-16. For example:
- MEMORY_HITS_ABSENT_CLAIM_BOUNDARY: “no memory hits were returned; do not invent stored preferences or prior facts” crates/palyra-daemon/src/application/tool_runtime/memory.rs#85-86.
- SESSION_SEARCH_HITS_PRESENT_CLAIM_BOUNDARY: Instructs the model to cite hits as session recall, not durable memory crates/palyra-daemon/src/application/tool_runtime/memory.rs#98-98.
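The selection of a boundary string can be pictured as a small function of where the hits came from and how many there were. In the sketch below, the first constant uses the wording quoted above; the wording of the second constant, the RecallSource enum, and the claim_boundary function are hypothetical, and cases the source does not describe simply return None.

```rust
/// Wording quoted in the text above.
pub const MEMORY_HITS_ABSENT_CLAIM_BOUNDARY: &str =
    "no memory hits were returned; do not invent stored preferences or prior facts";

/// Hypothetical wording: the source describes this boundary's intent only.
pub const SESSION_SEARCH_HITS_PRESENT_CLAIM_BOUNDARY: &str =
    "cite these hits as session recall, not as durable memory";

/// Hypothetical tag for where a set of hits originated.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RecallSource {
    DurableMemory,
    SessionSearch,
}

/// Picks the claim boundary to attach to a tool result, if one applies.
pub fn claim_boundary(source: RecallSource, hit_count: usize) -> Option<&'static str> {
    match (source, hit_count) {
        (RecallSource::DurableMemory, 0) => Some(MEMORY_HITS_ABSENT_CLAIM_BOUNDARY),
        (RecallSource::SessionSearch, n) if n > 0 => {
            Some(SESSION_SEARCH_HITS_PRESENT_CLAIM_BOUNDARY)
        }
        // Other combinations are not described in the text; emit nothing here.
        _ => None,
    }
}
```

Attaching the boundary to the tool result itself, rather than relying on the system prompt alone, puts the constraint next to the evidence it governs.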
Memory Maintenance
The system performs background maintenance via the spawn_scheduler_loop crates/palyra-daemon/src/lib.rs#143-143. This includes:
- Embedding Backfill: Processing items missing vectors crates/palyra-daemon/src/journal.rs#77-78.
- Retention Enforcement: Purging expired memory items based on MEMORY_RETENTION_DAY_MS crates/palyra-daemon/src/journal.rs#150-150.
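Retention enforcement reduces to comparing each item's creation time against a cutoff. The sketch below assumes that the retention constant is the number of milliseconds in one day and that the retention window is expressed in days; the RETENTION_DAY_MS constant, the MemoryItem type, and the purge_expired function are all hypothetical and operate on an in-memory list rather than the SQLite store.

```rust
/// Assumed meaning of the retention constant: milliseconds in one day.
pub const RETENTION_DAY_MS: i64 = 24 * 60 * 60 * 1_000;

/// Hypothetical in-memory stand-in for a stored memory item.
#[derive(Debug, Clone, PartialEq)]
pub struct MemoryItem {
    pub id: String,
    pub created_at_unix_ms: i64,
}

/// Removes items older than `retention_days` and returns how many were purged.
pub fn purge_expired(items: &mut Vec<MemoryItem>, now_unix_ms: i64, retention_days: i64) -> usize {
    // Items created before the cutoff are considered expired.
    let window_ms = retention_days.saturating_mul(RETENTION_DAY_MS);
    let cutoff = now_unix_ms.saturating_sub(window_ms);
    let before = items.len();
    items.retain(|item| item.created_at_unix_ms >= cutoff);
    before - items.len()
}
```

Returning the purge count gives a scheduler loop something cheap to log or expose as a metric on each maintenance pass.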
Tool Registry Integration
The palyra.memory.search and palyra.memory.recall tools are defined in the builtin tool registry with specific schemas for top_k, min_score, and scope crates/palyra-daemon/src/application/tool_registry/builtin.rs#105-124.
| Tool Name | Purpose | Parallelism Policy |
|---|---|---|
| palyra.memory.search | Low-level search across lifecycle and workspace | ReadOnly |
| palyra.memory.recall | High-level RAG with automatic context injection | ReadOnly |
| palyra.memory.retain | Durable storage of new facts/preferences | Idempotent |
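The top_k and min_score parameters of the search schema are conventionally applied as a threshold followed by a sort and a truncation. The sketch below shows that pattern under stated assumptions: the SearchHit type and the apply_search_limits function are hypothetical, scope filtering is omitted, and the real tool handlers may apply these limits inside the query instead.

```rust
/// Hypothetical scored search result.
#[derive(Debug, Clone, PartialEq)]
pub struct SearchHit {
    pub id: u64,
    pub score: f64,
}

/// Drops hits scoring below `min_score`, orders the rest from best to worst,
/// and keeps at most `top_k` of them. Hits with a NaN score are dropped.
pub fn apply_search_limits(mut hits: Vec<SearchHit>, top_k: usize, min_score: f64) -> Vec<SearchHit> {
    hits.retain(|hit| hit.score >= min_score);
    hits.sort_by(|a, b| b.score.total_cmp(&a.score));
    hits.truncate(top_k);
    hits
}
```

Filtering before truncating matters: applying top_k first could fill the result with low-scoring hits that min_score would then remove, returning fewer results than are actually available.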