Memory Architecture & Data Flow
The memory system runs as a background service integrated into the `palyrad` daemon. It bridges the gap between raw conversation data (the “Tape”) and searchable knowledge.
Memory Item Lifecycle
- Ingestion: Items are created under one of the `MemorySource` categories.
- Persistence: Records are stored as `MemoryItemRecord` in the `journal_events` table.
- Embedding: A background process backfills vector embeddings for new or updated items.
- Retrieval: The `palyra.memory.search` tool or the `Recall` system queries items using a hybrid of lexical and vector search.
System Components Diagram
This diagram illustrates the relationship between the natural language inputs and the internal code entities that manage them.

[Diagram: “Memory System Architecture”]

Sources: crates/palyra-daemon/src/journal/mod.rs#11-27, crates/palyra-daemon/src/gateway/runtime.rs#71-82, crates/palyra-daemon/src/application/run_stream/orchestration.rs#20-27

MemorySource Categories
Memory items are categorized by their origin, which influences their priority and retrieval behavior:

| Source | Code Identifier | Description |
|---|---|---|
| User Message | tape:user_message | Directly extracted from conversation history. |
| Summary | summary | Generated by the session compaction/reflection pipeline. |
| Manual | manual | Explicitly added by the user via CLI or Console. |
| Durable Fact | durable_fact | Validated facts promoted by the learning system. |
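
The mapping in the table can be pictured as a small enum with a string form. The following is a minimal sketch, not the daemon's actual definition: the variant names and the `as_str` / `parse` helpers are hypothetical, and only the four code identifiers come from the table.

```rust
/// Hypothetical mirror of the source categories listed in the table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MemorySource {
    UserMessage,
    Summary,
    Manual,
    DurableFact,
}

impl MemorySource {
    /// Returns the code identifier from the table for this category.
    pub fn as_str(self) -> &'static str {
        match self {
            MemorySource::UserMessage => "tape:user_message",
            MemorySource::Summary => "summary",
            MemorySource::Manual => "manual",
            MemorySource::DurableFact => "durable_fact",
        }
    }

    /// Parses a code identifier; unknown identifiers yield `None`.
    pub fn parse(value: &str) -> Option<Self> {
        match value {
            "tape:user_message" => Some(MemorySource::UserMessage),
            "summary" => Some(MemorySource::Summary),
            "manual" => Some(MemorySource::Manual),
            "durable_fact" => Some(MemorySource::DurableFact),
            _ => None,
        }
    }
}
```

Keeping the identifier strings in one place means a record read back from storage can be mapped to a category without scattering string literals through the code.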
Vector Embeddings & Backfill
Palyra uses a background worker to ensure all memory items are indexed for semantic search. This process is managed by `run_memory_embeddings_backfill` within the `GatewayRuntimeState`.
Backfill Logic
The backfill process operates in batches to limit pressure on the model provider’s rate limits. Each pass selects items in the `MemoryItemRecord` store that either lack a vector embedding or whose `content_hash` no longer matches the content the embedding was computed from.
- Batching: Configurable via `batch_size` (default 64, max 256) (crates/palyra-daemon/src/transport/http/handlers/console/memory.rs#136-137).
- Maintenance: A periodic `run_memory_maintenance_now` task handles TTL expiration and vacuuming (crates/palyra-daemon/src/transport/http/handlers/console/memory.rs#141-142).
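
The selection and batching rules above can be sketched as follows. This is an illustration under stated assumptions rather than the code in `run_memory_embeddings_backfill`: the `MemoryItem` struct, its `embedded_hash` field, both function names, and the lower bound of 1 on the batch size are hypothetical. Only the default of 64 and the maximum of 256 come from the text.

```rust
/// Default and maximum batch sizes quoted in the text.
pub const DEFAULT_BATCH_SIZE: usize = 64;
pub const MAX_BATCH_SIZE: usize = 256;

/// Hypothetical, simplified view of a stored memory item.
#[derive(Debug, Clone)]
pub struct MemoryItem {
    pub id: u64,
    /// Hash of the item's current content.
    pub content_hash: String,
    /// Hash the existing embedding was computed from, if any (assumed field).
    pub embedded_hash: Option<String>,
}

impl MemoryItem {
    /// An item needs (re-)embedding when it has no embedding yet, or when
    /// the embedding was computed from content that has since changed.
    pub fn needs_embedding(&self) -> bool {
        match &self.embedded_hash {
            None => true,
            Some(hash) => hash != &self.content_hash,
        }
    }
}

/// Falls back to the default when no size is requested, then clamps the
/// value into 1..=MAX_BATCH_SIZE (the lower bound is an assumption).
pub fn effective_batch_size(requested: Option<usize>) -> usize {
    requested
        .unwrap_or(DEFAULT_BATCH_SIZE)
        .clamp(1, MAX_BATCH_SIZE)
}

/// Returns the ids of the next batch of items that need embedding.
pub fn next_backfill_batch(items: &[MemoryItem], requested: Option<usize>) -> Vec<u64> {
    items
        .iter()
        .filter(|item| item.needs_embedding())
        .take(effective_batch_size(requested))
        .map(|item| item.id)
        .collect()
}
```

Calling `next_backfill_batch` repeatedly, and recording the new hash after each successful embedding call, drains the backlog one bounded batch at a time.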
Auto-Injection & Retrieval
The system can automatically inject relevant memory items into the model prompt during the `prepare_model_provider_input` phase.
Configuration
The `MemoryRuntimeConfig` struct defines the behavior of this subsystem:
- `auto_inject_enabled`: Toggles automatic RAG (crates/palyra-daemon/src/gateway/runtime.rs#75).
- `auto_inject_max_items`: Limits the number of items injected (default 3) (crates/palyra-daemon/src/gateway/runtime.rs#76).
- `max_item_tokens`: Caps the size of individual items to prevent context overflow (crates/palyra-daemon/src/gateway/runtime.rs#74).
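
To make the interaction of the three settings concrete, here is a minimal sketch. Only the field names and the default of 3 for `auto_inject_max_items` come from the text. The field types, the other default values, the `select_for_injection` function, and the word-based token approximation are all assumptions made for illustration.

```rust
/// Hypothetical shape of the configuration struct.
#[derive(Debug, Clone)]
pub struct MemoryRuntimeConfig {
    pub max_item_tokens: usize,
    pub auto_inject_enabled: bool,
    pub auto_inject_max_items: usize,
}

impl Default for MemoryRuntimeConfig {
    fn default() -> Self {
        Self {
            max_item_tokens: 256,      // assumed value, not from the source
            auto_inject_enabled: true, // assumed value, not from the source
            auto_inject_max_items: 3,  // default stated in the text
        }
    }
}

/// Picks the items to inject from candidates already ordered by relevance.
/// Tokens are approximated by whitespace-separated words (an assumption;
/// a real tokenizer would count differently).
pub fn select_for_injection(config: &MemoryRuntimeConfig, ranked: &[&str]) -> Vec<String> {
    if !config.auto_inject_enabled {
        return Vec::new();
    }
    ranked
        .iter()
        .take(config.auto_inject_max_items)
        .map(|text| {
            text.split_whitespace()
                .take(config.max_item_tokens)
                .collect::<Vec<_>>()
                .join(" ")
        })
        .collect()
}
```

The item cap bounds how many memories reach the prompt, while the per-item token cap bounds how much each one can contribute, so the worst-case prompt growth is roughly the product of the two.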
Search Tooling
The `palyra.memory.search` tool (implemented via `SearchMemoryRequest`) allows the agent to perform explicit lookups. It supports:
- Hybrid Scoring: Combines lexical, vector, and recency scores (crates/palyra-cli/src/commands/memory.rs#132-138).
- Scoping: Can be scoped to a `session_id`, `channel`, or `principal` (crates/palyra-cli/src/commands/memory.rs#55-61).
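
One common way to combine lexical, vector, and recency signals is a weighted sum. The sketch below assumes that approach. The source does not say how Palyra weights or normalizes the components, so the `ScoreBreakdown` and `HybridWeights` types, the `rank` function, and the assumption that each component lies in the range 0.0 to 1.0 are all hypothetical.

```rust
/// Per-item score components, each assumed to be normalized to 0.0..=1.0.
#[derive(Debug, Clone, Copy)]
pub struct ScoreBreakdown {
    pub lexical: f64,
    pub vector: f64,
    pub recency: f64,
}

/// Hypothetical weights for the three components.
#[derive(Debug, Clone, Copy)]
pub struct HybridWeights {
    pub lexical: f64,
    pub vector: f64,
    pub recency: f64,
}

impl ScoreBreakdown {
    /// Weighted sum of the three components.
    pub fn combined(&self, w: &HybridWeights) -> f64 {
        w.lexical * self.lexical + w.vector * self.vector + w.recency * self.recency
    }
}

/// Orders (item id, breakdown) pairs by combined score, highest first,
/// and returns the ids in that order.
pub fn rank(mut hits: Vec<(u64, ScoreBreakdown)>, w: &HybridWeights) -> Vec<u64> {
    hits.sort_by(|a, b| b.1.combined(w).total_cmp(&a.1.combined(w)));
    hits.into_iter().map(|(id, _)| id).collect()
}
```

Keeping the components separate until the final sum makes it possible to report each one alongside the total when debugging why an item ranked where it did.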
CLI Memory Commands
The `palyra` CLI provides administrative and debug access to the memory system through `MemoryCommand`.
[Diagram: “CLI to Runtime Mapping”]
Command Reference
- `palyra memory status`: Displays usage, embedding counts, and maintenance schedules (crates/palyra-cli/src/args/memory.rs#5-8).
- `palyra memory search <query>`: Performs a manual search with optional `--show-metadata` and `--include-score-breakdown` (crates/palyra-cli/src/args/memory.rs#20-42).
- `palyra memory index --until-complete`: Manually triggers the embedding backfill process (crates/palyra-cli/src/args/memory.rs#10-19).
- `palyra memory purge`: Deletes items based on session or principal scope (crates/palyra-cli/src/args/memory.rs#43-52).