The Memory System in Palyra is a Retrieval-Augmented Generation (RAG) subsystem designed to provide long-term persistence and context retrieval for orchestrator runs. It manages the lifecycle of “Memory Items”—discrete units of information associated with specific principals, channels, or sessions—and handles their conversion into vector embeddings for semantic search.

Memory Architecture & Data Flow

The memory system runs as a background service inside the palyrad daemon. It turns raw conversation data (the “Tape”) into searchable knowledge.

Memory Item Lifecycle

  1. Ingestion: Items are created via MemorySource categories.
  2. Persistence: Records are stored as MemoryItemRecord in the journal_events table.
  3. Embedding: A background process backfills vector embeddings for new or updated items.
  4. Retrieval: The palyra.memory.search tool or the Recall system queries items using a hybrid of lexical and vector search.
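
The retrieval step combines lexical and vector signals, but the source does not say how they are combined. The following is a minimal sketch of one common approach, a weighted blend of a term-overlap score and cosine similarity. The function names (`cosine`, `lexical_score`, `hybrid_score`) and the `vector_weight` parameter are hypothetical and are not taken from the Palyra codebase.

```rust
use std::collections::HashSet;

/// Cosine similarity between two embedding vectors.
/// Returns 0.0 for mismatched lengths or zero-length vectors.
pub fn cosine(a: &[f32], b: &[f32]) -> f32 {
    if a.len() != b.len() {
        return 0.0;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Fraction of distinct query terms that occur in `text`
/// (case-insensitive, whitespace-tokenised). Assumed scoring rule.
pub fn lexical_score(query: &str, text: &str) -> f32 {
    let text_terms: HashSet<String> =
        text.split_whitespace().map(|t| t.to_lowercase()).collect();
    let query_terms: HashSet<String> =
        query.split_whitespace().map(|t| t.to_lowercase()).collect();
    if query_terms.is_empty() {
        return 0.0;
    }
    let hits = query_terms
        .iter()
        .filter(|t| text_terms.contains(*t))
        .count();
    hits as f32 / query_terms.len() as f32
}

/// Linear blend of the two scores; `vector_weight` is clamped to [0, 1].
pub fn hybrid_score(lexical: f32, vector: f32, vector_weight: f32) -> f32 {
    let w = vector_weight.clamp(0.0, 1.0);
    (1.0 - w) * lexical + w * vector
}
```

With a weight of 0.5, an item that matches half of the query terms and has a perfectly aligned embedding would score 0.75 under this scheme. The real ranking in Palyra may use a different formula entirely.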

System Components Diagram

This diagram illustrates the relationship between the natural language inputs and the internal code entities that manage them.
Diagram: “Memory System Architecture”
Sources: crates/palyra-daemon/src/journal/mod.rs#11-27, crates/palyra-daemon/src/gateway/runtime.rs#71-82, crates/palyra-daemon/src/application/run_stream/orchestration.rs#20-27

MemorySource Categories

Memory items are categorized by their origin, which influences their priority and retrieval behavior:
Source | Code Identifier | Description
User Message | tape:user_message | Directly extracted from conversation history.
Summary | summary | Generated by the session compaction/reflection pipeline.
Manual | manual | Explicitly added by the user via CLI or Console.
Durable Fact | durable_fact | Validated facts promoted by the learning system.
Sources: crates/palyra-daemon/src/journal/mod.rs#24-27, crates/palyra-daemon/src/application/learning.rs#21-28
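
As an illustration, the categories above could be modelled as an enum that maps to and from the listed code identifiers. This is a sketch only: the variant names and the `as_str` and `parse` methods are hypothetical, and the actual MemorySource definition in the journal module may differ. Only the four identifier strings come from the table.

```rust
/// Hypothetical mirror of the memory source categories listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MemorySource {
    UserMessage,
    Summary,
    Manual,
    DurableFact,
}

impl MemorySource {
    /// Code identifier as listed in the category table.
    pub fn as_str(self) -> &'static str {
        match self {
            MemorySource::UserMessage => "tape:user_message",
            MemorySource::Summary => "summary",
            MemorySource::Manual => "manual",
            MemorySource::DurableFact => "durable_fact",
        }
    }

    /// Parse a code identifier; unknown identifiers yield `None`.
    pub fn parse(s: &str) -> Option<Self> {
        match s {
            "tape:user_message" => Some(MemorySource::UserMessage),
            "summary" => Some(MemorySource::Summary),
            "manual" => Some(MemorySource::Manual),
            "durable_fact" => Some(MemorySource::DurableFact),
            _ => None,
        }
    }
}
```

Returning `Option` from `parse` keeps unknown identifiers from being silently mapped to a default category.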

Vector Embeddings & Backfill

Palyra uses a background worker to ensure all memory items are indexed for semantic search. This process is managed by run_memory_embeddings_backfill within the GatewayRuntimeState.

Backfill Logic

The backfill process operates in batches to minimize impact on the model provider’s rate limits. It identifies items in the MemoryItemRecord store that lack a corresponding vector embedding or have a stale content_hash.
Sources: crates/palyra-daemon/src/gateway/runtime.rs#11-27, crates/palyra-daemon/src/transport/http/handlers/console/memory.rs#143-156
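
The selection rule described above (no embedding yet, or an embedding computed from an outdated content_hash) can be sketched as follows. The `Item` struct, the `plan_backfill` function, and the `batch_size` parameter are hypothetical simplifications for illustration. They are not the actual types or signature of run_memory_embeddings_backfill.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for a stored memory item.
#[derive(Debug, Clone)]
pub struct Item {
    pub id: u64,
    pub content_hash: String,
}

/// Returns the ids that need (re-)embedding, grouped into batches of at
/// most `batch_size` ids. `embedded` maps an item id to the content hash
/// its existing embedding was computed from.
pub fn plan_backfill(
    items: &[Item],
    embedded: &HashMap<u64, String>,
    batch_size: usize,
) -> Vec<Vec<u64>> {
    let pending: Vec<u64> = items
        .iter()
        .filter(|item| match embedded.get(&item.id) {
            // No embedding exists yet.
            None => true,
            // An embedding exists but was built from different content.
            Some(hash) => hash != &item.content_hash,
        })
        .map(|item| item.id)
        .collect();

    // `chunks` panics on a size of 0, so fall back to a batch size of 1.
    pending
        .chunks(batch_size.max(1))
        .map(|chunk| chunk.to_vec())
        .collect()
}
```

A worker could then embed one batch per provider call and pause between batches, which is one way to stay under a provider's rate limits.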

Auto-Injection & Retrieval

The system can automatically inject relevant memory items into the model prompt during the prepare_model_provider_input phase.

Configuration

The MemoryRuntimeConfig struct defines the behavior of this subsystem:

Search Tooling

The palyra.memory.search tool (implemented via SearchMemoryRequest) allows the agent to perform explicit lookups. It supports:
Sources: crates/palyra-daemon/src/gateway/runtime.rs#71-82, crates/palyra-daemon/src/application/run_stream/orchestration.rs#16-18, crates/palyra-cli/src/commands/memory.rs#37-67

CLI Memory Commands

The palyra CLI provides administrative and debug access to the memory system through MemoryCommand.
Diagram: “CLI to Runtime Mapping”

Command Reference

Sources: crates/palyra-cli/src/args/memory.rs#4-119, crates/palyra-cli/src/commands/memory.rs#4-23, crates/palyra-daemon/src/transport/http/handlers/console/memory.rs#11-95