Memory System & Embeddings

The Memory System in Palyra is a Retrieval-Augmented Generation (RAG) subsystem designed to provide long-term persistence and context retrieval for orchestrator runs. It manages the lifecycle of “Memory Items”—discrete units of information associated with specific principals, channels, or sessions—and handles their conversion into vector embeddings for semantic search.

Memory Architecture & Data Flow

The memory system runs as a background service inside the palyrad daemon. It turns raw conversation data (the “Tape”) into searchable knowledge.

Memory Item Lifecycle

Ingestion: Items are created and tagged with a MemorySource category that records their origin.
Persistence: Records are stored as MemoryItemRecord in the journal_events table.
Embedding: A background process backfills vector embeddings for new or updated items.
Retrieval: The palyra.memory.search tool or the Recall system queries items using a hybrid of lexical and vector search.

System Components Diagram

This diagram illustrates the relationship between the natural language inputs and the internal code entities that manage them.

Diagram: “Memory System Architecture”

Sources: crates/palyra-daemon/src/journal/mod.rs#11-27, crates/palyra-daemon/src/gateway/runtime.rs#71-82, crates/palyra-daemon/src/application/run_stream/orchestration.rs#20-27

MemorySource Categories

Memory items are categorized by their origin, which influences their priority and retrieval behavior:

Source	Code Identifier	Description
User Message	`tape:user_message`	Directly extracted from conversation history.
Summary	`summary`	Generated by the session compaction/reflection pipeline.
Manual	`manual`	Explicitly added by the user via CLI or Console.
Durable Fact	`durable_fact`	Validated facts promoted by the learning system.

Sources: crates/palyra-daemon/src/journal/mod.rs#24-27, crates/palyra-daemon/src/application/learning.rs#21-28
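
As an illustration of how the identifiers in the table could map onto a type, here is a minimal sketch. The enum and its variant names are hypothetical and not taken from the Palyra source; only the four string identifiers come from the table above.

```rust
use std::fmt;
use std::str::FromStr;

/// Origin of a memory item. The variant names are illustrative assumptions;
/// only the string identifiers are taken from the table above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MemorySource {
    TapeUserMessage,
    Summary,
    Manual,
    DurableFact,
}

impl MemorySource {
    /// Returns the code identifier listed for this source.
    pub fn as_str(self) -> &'static str {
        match self {
            MemorySource::TapeUserMessage => "tape:user_message",
            MemorySource::Summary => "summary",
            MemorySource::Manual => "manual",
            MemorySource::DurableFact => "durable_fact",
        }
    }
}

impl fmt::Display for MemorySource {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(self.as_str())
    }
}

impl FromStr for MemorySource {
    type Err = String;

    /// Parses a code identifier back into a source category.
    fn from_str(value: &str) -> Result<Self, Self::Err> {
        match value {
            "tape:user_message" => Ok(MemorySource::TapeUserMessage),
            "summary" => Ok(MemorySource::Summary),
            "manual" => Ok(MemorySource::Manual),
            "durable_fact" => Ok(MemorySource::DurableFact),
            other => Err(format!("unknown memory source: {other}")),
        }
    }
}
```

With this shape, `"summary".parse::<MemorySource>()` yields the `Summary` variant and an unrecognized identifier returns an error instead of silently defaulting.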

Vector Embeddings & Backfill

Palyra uses a background worker to ensure all memory items are indexed for semantic search. This process is managed by run_memory_embeddings_backfill within the GatewayRuntimeState.

Backfill Logic

The backfill process operates in batches to minimize impact on the model provider’s rate limits. It identifies items in the MemoryItemRecord store that lack a corresponding vector embedding or have a stale content_hash.

Batching: Configurable via batch_size (default 64, max 256). Source: crates/palyra-daemon/src/transport/http/handlers/console/memory.rs#136-137
Maintenance: A periodic run_memory_maintenance_now task handles TTL expiration and vacuuming. Source: crates/palyra-daemon/src/transport/http/handlers/console/memory.rs#141-142

Sources: crates/palyra-daemon/src/gateway/runtime.rs#11-27, crates/palyra-daemon/src/transport/http/handlers/console/memory.rs#143-156
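
The selection rule described above (missing embedding or stale content_hash, processed in bounded batches) can be sketched as follows. This is not the daemon's implementation: the MemoryItem struct, its fields, the use of DefaultHasher as a stand-in hash, and the lower bound of 1 on the batch size are all assumptions made for illustration. Only the default of 64 and the maximum of 256 come from the text.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Default and maximum batch sizes as stated in the text.
pub const DEFAULT_BATCH_SIZE: usize = 64;
pub const MAX_BATCH_SIZE: usize = 256;

/// Hypothetical, simplified view of a stored memory item.
#[derive(Debug, Clone)]
pub struct MemoryItem {
    pub id: u64,
    pub content: String,
    /// Hash of the content the stored embedding was computed from,
    /// or `None` when no embedding exists yet.
    pub embedded_content_hash: Option<u64>,
}

/// Stand-in content hash (assumption; the real hashing scheme is not shown here).
pub fn content_hash(content: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    content.hash(&mut hasher);
    hasher.finish()
}

/// Falls back to the default and clamps into `1..=MAX_BATCH_SIZE`.
/// The lower bound of 1 is an assumption.
pub fn effective_batch_size(requested: Option<usize>) -> usize {
    requested.unwrap_or(DEFAULT_BATCH_SIZE).clamp(1, MAX_BATCH_SIZE)
}

/// An item needs (re-)embedding when it has no embedding or its hash is stale.
pub fn needs_embedding(item: &MemoryItem) -> bool {
    match item.embedded_content_hash {
        None => true,
        Some(hash) => hash != content_hash(&item.content),
    }
}

/// Picks the next batch of items to embed, in storage order.
pub fn next_backfill_batch(items: &[MemoryItem], requested: Option<usize>) -> Vec<&MemoryItem> {
    items
        .iter()
        .filter(|item| needs_embedding(item))
        .take(effective_batch_size(requested))
        .collect()
}
```

A caller would invoke `next_backfill_batch` repeatedly, embedding each batch and recording the new hash, until it returns an empty vector.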

Auto-Injection & Retrieval

The system can automatically inject relevant memory items into the model prompt during the prepare_model_provider_input phase.

Configuration

The MemoryRuntimeConfig struct defines the behavior of this subsystem:

auto_inject_enabled: Toggles automatic RAG. Source: crates/palyra-daemon/src/gateway/runtime.rs#75
auto_inject_max_items: Limits the number of items injected (default 3). Source: crates/palyra-daemon/src/gateway/runtime.rs#76
max_item_tokens: Caps the size of individual items to prevent context overflow. Source: crates/palyra-daemon/src/gateway/runtime.rs#74
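
To make the interaction of these three settings concrete, here is a minimal sketch of how they might gate auto-injection. The field names come from the list above, but the field types, the default values other than auto_inject_max_items = 3, the select_for_injection helper, and the approximation of tokens as whitespace-separated words are all assumptions rather than the daemon's actual behavior.

```rust
/// Sketch of the configuration described above. Field types are assumed.
#[derive(Debug, Clone)]
pub struct MemoryRuntimeConfig {
    pub auto_inject_enabled: bool,
    pub auto_inject_max_items: usize,
    pub max_item_tokens: usize,
}

impl Default for MemoryRuntimeConfig {
    fn default() -> Self {
        Self {
            auto_inject_enabled: true, // assumed default
            auto_inject_max_items: 3,  // default stated in the text
            max_item_tokens: 256,      // assumed default
        }
    }
}

/// Keeps at most `max_tokens` whitespace-separated words.
/// This is a crude stand-in for real tokenization.
fn truncate_tokens(text: &str, max_tokens: usize) -> String {
    text.split_whitespace()
        .take(max_tokens)
        .collect::<Vec<_>>()
        .join(" ")
}

/// Hypothetical helper: selects the items to inject into the prompt.
/// `ranked` is assumed to be sorted best match first.
pub fn select_for_injection(config: &MemoryRuntimeConfig, ranked: &[&str]) -> Vec<String> {
    if !config.auto_inject_enabled {
        return Vec::new();
    }
    ranked
        .iter()
        .take(config.auto_inject_max_items)
        .map(|text| truncate_tokens(text, config.max_item_tokens))
        .collect()
}
```

The worst-case prompt overhead in this sketch is bounded by auto_inject_max_items multiplied by max_item_tokens, which is why the two limits are configured together.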

Search Tooling

The palyra.memory.search tool (implemented via SearchMemoryRequest) allows the agent to perform explicit lookups. It supports:

Hybrid Scoring: Combines lexical, vector, and recency scores. Source: crates/palyra-cli/src/commands/memory.rs#132-138
Scoping: Can be scoped to a session_id, channel, or principal. Source: crates/palyra-cli/src/commands/memory.rs#55-61
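
One common way to combine lexical, vector, and recency signals is a weighted sum, sketched below. The source does not specify Palyra's actual formula, so the weights, the term-overlap lexical score, the cosine similarity, and the half-life decay are all illustrative assumptions, as are the type and function names.

```rust
/// Hypothetical weights for the three score components.
#[derive(Debug, Clone, Copy)]
pub struct ScoreWeights {
    pub lexical: f64,
    pub vector: f64,
    pub recency: f64,
}

/// Per-component scores plus the weighted total.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct ScoreBreakdown {
    pub lexical: f64,
    pub vector: f64,
    pub recency: f64,
    pub total: f64,
}

/// Cosine similarity of two embeddings; returns 0.0 for empty,
/// mismatched, or zero-length vectors.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> f64 {
    if a.len() != b.len() || a.is_empty() {
        return 0.0;
    }
    let dot: f64 = a.iter().zip(b).map(|(x, y)| f64::from(*x) * f64::from(*y)).sum();
    let norm_a = a.iter().map(|x| f64::from(*x).powi(2)).sum::<f64>().sqrt();
    let norm_b = b.iter().map(|y| f64::from(*y).powi(2)).sum::<f64>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Fraction of query terms that occur in the content (case-insensitive).
pub fn lexical_score(query: &str, content: &str) -> f64 {
    let content = content.to_lowercase();
    let terms: Vec<String> = query.split_whitespace().map(str::to_lowercase).collect();
    if terms.is_empty() {
        return 0.0;
    }
    let hits = terms.iter().filter(|term| content.contains(term.as_str())).count();
    hits as f64 / terms.len() as f64
}

/// Exponential decay: 1.0 for a brand-new item, 0.5 after one half-life.
pub fn recency_score(age_secs: f64, half_life_secs: f64) -> f64 {
    0.5_f64.powf(age_secs.max(0.0) / half_life_secs)
}

/// Weighted sum of the three components.
pub fn hybrid_score(weights: ScoreWeights, lexical: f64, vector: f64, recency: f64) -> ScoreBreakdown {
    let total = weights.lexical * lexical + weights.vector * vector + weights.recency * recency;
    ScoreBreakdown { lexical, vector, recency, total }
}
```

Returning the individual components alongside the total is what makes a per-result score breakdown possible, which is useful when tuning the weights.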

Sources: crates/palyra-daemon/src/gateway/runtime.rs#71-82, crates/palyra-daemon/src/application/run_stream/orchestration.rs#16-18, crates/palyra-cli/src/commands/memory.rs#37-67

CLI Memory Commands

The palyra CLI provides administrative and debug access to the memory system through MemoryCommand.

Diagram: “CLI to Runtime Mapping”

Command Reference

palyra memory status: Displays usage, embedding counts, and maintenance schedules. Source: crates/palyra-cli/src/args/memory.rs#5-8
palyra memory search <query>: Performs a manual search with optional --show-metadata and --include-score-breakdown. Source: crates/palyra-cli/src/args/memory.rs#20-42
palyra memory index --until-complete: Manually triggers the embedding backfill process. Source: crates/palyra-cli/src/args/memory.rs#10-19
palyra memory purge: Deletes items based on session or principal scope. Source: crates/palyra-cli/src/args/memory.rs#43-52

Sources: crates/palyra-cli/src/args/memory.rs#4-119, crates/palyra-cli/src/commands/memory.rs#4-23, crates/palyra-daemon/src/transport/http/handlers/console/memory.rs#11-95
