The Memory and Retrieval-Augmented Generation (RAG) subsystem provides the daemon with long-term persistence and context retrieval capabilities. It manages “Memory Items” (vector-embedded snippets), “Workspace Documents” (curated knowledge), and “Learning Reflections” (automated background processing of session history into structured facts).

System Architecture

The memory system is integrated into the JournalStore and orchestrated by the palyrad gateway. It bridges the gap between raw session events and high-level agent knowledge.

Memory Data Flow

The following diagram illustrates how natural language input from a session is transformed into stored memory and subsequently retrieved for RAG.

[Diagram: Memory Ingestion and Retrieval Pipeline]

Sources: crates/palyra-daemon/src/journal.rs#64-100, crates/palyra-daemon/src/gateway.rs#116-121, crates/palyra-common/src/daemon_config_schema.rs#193-196

Memory Items & Embedding Providers

Memory items are the atomic units of the RAG system. Each item consists of text content, a vector embedding, and metadata (source, tags, and TTL).

Embedding Implementation

The system supports pluggable embedding providers via the MemoryEmbeddingProvider trait (crates/palyra-daemon/src/journal.rs#64-68).
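A minimal sketch of what a pluggable provider could look like. Only the trait name MemoryEmbeddingProvider comes from the source. The method names (dimensions, embed), the HashEmbeddingProvider type, the embed_with helper, and the token-hashing scheme are assumptions made for illustration and may not match the definitions in journal.rs.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical shape of the provider trait; the real signature may differ.
pub trait MemoryEmbeddingProvider {
    /// Number of dimensions in every vector this provider returns.
    fn dimensions(&self) -> usize;
    /// Turn text into a fixed-length embedding vector.
    fn embed(&self, text: &str) -> Vec<f32>;
}

/// Toy deterministic provider (hypothetical): hashes each
/// whitespace-separated token into one of `dims` buckets.
pub struct HashEmbeddingProvider {
    pub dims: usize,
}

impl MemoryEmbeddingProvider for HashEmbeddingProvider {
    fn dimensions(&self) -> usize {
        self.dims
    }

    fn embed(&self, text: &str) -> Vec<f32> {
        let mut vector = vec![0.0f32; self.dims];
        if self.dims == 0 {
            return vector; // nothing to fill, and avoids a modulo by zero
        }
        for token in text.split_whitespace() {
            let mut hasher = DefaultHasher::new();
            token.to_lowercase().hash(&mut hasher);
            let bucket = (hasher.finish() % self.dims as u64) as usize;
            vector[bucket] += 1.0;
        }
        // L2-normalize so a dot product between two vectors is their cosine similarity.
        let norm = vector.iter().map(|v| v * v).sum::<f32>().sqrt();
        if norm > 0.0 {
            for v in vector.iter_mut() {
                *v /= norm;
            }
        }
        vector
    }
}

/// Callers depend only on the trait, so providers can be swapped freely.
pub fn embed_with(provider: &dyn MemoryEmbeddingProvider, text: &str) -> Vec<f32> {
    provider.embed(text)
}
```

Because callers take a `&dyn MemoryEmbeddingProvider`, a hash-based stand-in like this one could serve in tests while a model-backed provider is used in production, assuming the real trait is object-safe.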

Constraints & Constants

| Constant | Value | Description |
| --- | --- | --- |
| MAX_MEMORY_ITEM_BYTES | 16 KB | Maximum size of a single memory text snippet (crates/palyra-daemon/src/gateway.rs#118-118). |
| MAX_MEMORY_ITEM_TOKENS | 2,048 | Token limit for embedding generation (crates/palyra-daemon/src/gateway.rs#119-119). |
| MAX_MEMORY_SEARCH_TOP_K | 64 | Maximum number of hits returned per search (crates/palyra-daemon/src/gateway.rs#117-117). |

Sources: crates/palyra-daemon/src/gateway.rs#116-121, crates/palyra-daemon/src/journal.rs#50-55
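As an illustration of how these limits might be enforced, the sketch below checks a snippet before it is embedded. The constant names and values come from the table; everything else is an assumption: 16 KB is taken to mean 16 * 1024 bytes, tokens are approximated by whitespace splitting rather than a real tokenizer, and the MemoryLimitError type, validate_memory_text, and clamp_top_k are hypothetical names. Whether the daemon clamps or rejects an oversized top-k is not stated in the source.

```rust
// Values from the constants table; 16 KB is assumed to mean 16 * 1024 bytes.
pub const MAX_MEMORY_ITEM_BYTES: usize = 16 * 1024;
pub const MAX_MEMORY_ITEM_TOKENS: usize = 2_048;
pub const MAX_MEMORY_SEARCH_TOP_K: usize = 64;

/// Hypothetical error type describing which limit was exceeded.
#[derive(Debug, PartialEq)]
pub enum MemoryLimitError {
    TooManyBytes { actual: usize },
    TooManyTokens { actual: usize },
}

/// Check a snippet against the byte and token limits.
/// Token counting is a whitespace approximation, not a real tokenizer.
pub fn validate_memory_text(text: &str) -> Result<(), MemoryLimitError> {
    let bytes = text.len(); // UTF-8 byte length, not character count
    if bytes > MAX_MEMORY_ITEM_BYTES {
        return Err(MemoryLimitError::TooManyBytes { actual: bytes });
    }
    let tokens = text.split_whitespace().count();
    if tokens > MAX_MEMORY_ITEM_TOKENS {
        return Err(MemoryLimitError::TooManyTokens { actual: tokens });
    }
    Ok(())
}

/// Keep a requested hit count within 1..=MAX_MEMORY_SEARCH_TOP_K.
/// Clamping (instead of rejecting) is an assumption of this sketch.
pub fn clamp_top_k(requested: usize) -> usize {
    requested.clamp(1, MAX_MEMORY_SEARCH_TOP_K)
}
```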

Auto-Inject & Recall Preview

The “Auto-Inject” mechanism automatically searches memory during the orchestrator’s run loop to provide relevant context to the agent without explicit user intervention.
  1. Recall Preview: Before a message is sent to the LLM, the useRecallPreview hook in the web console fetches a preview of what the memory system would inject based on the current composer state (apps/web/src/chat/useRecallPreview.ts#1-10).
  2. Injection Logic: Controlled by FileMemoryAutoInjectConfig, the daemon performs a vector search and prepends the top results to the system prompt (crates/palyra-common/src/daemon_config_schema.rs#193-196).
Sources: apps/web/src/chat/useRecallPreview.ts#1-10, crates/palyra-common/src/daemon_config_schema.rs#193-196
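The injection step described above (vector search, then prepending the top results to the system prompt) can be sketched as follows. This is not the daemon's implementation: the MemoryItem struct, the function names, the min_score threshold, the use of cosine similarity as the ranking metric, and the "Relevant memory:" prompt layout are all assumptions made for illustration.

```rust
/// Hypothetical in-memory view of a stored item.
#[derive(Debug, Clone)]
pub struct MemoryItem {
    pub text: String,
    pub embedding: Vec<f32>,
}

/// Cosine similarity; returns 0.0 for empty, mismatched, or zero-length vectors.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    if a.len() != b.len() || a.is_empty() {
        return 0.0;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Brute-force search: score every item, drop weak matches, keep the best `top_k`.
pub fn search_top_k<'a>(
    items: &'a [MemoryItem],
    query: &[f32],
    top_k: usize,
    min_score: f32, // hypothetical relevance cutoff
) -> Vec<(&'a MemoryItem, f32)> {
    let mut scored: Vec<(&MemoryItem, f32)> = items
        .iter()
        .map(|item| (item, cosine_similarity(&item.embedding, query)))
        .filter(|(_, score)| *score >= min_score)
        .collect();
    // Highest score first; total_cmp gives a total order over f32.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(top_k);
    scored
}

/// Prepend the hits to the system prompt; leave the prompt untouched if there are none.
pub fn inject_into_system_prompt(system_prompt: &str, hits: &[(&MemoryItem, f32)]) -> String {
    if hits.is_empty() {
        return system_prompt.to_string();
    }
    let mut out = String::from("Relevant memory:\n");
    for (item, _score) in hits {
        out.push_str("- ");
        out.push_str(&item.text);
        out.push('\n');
    }
    out.push('\n');
    out.push_str(system_prompt);
    out
}
```

A linear scan like this is only reasonable for small stores; the source does not say how the daemon indexes embeddings.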

Retention & Maintenance

The memory system enforces strict TTL (Time-To-Live) and storage quotas to prevent unbounded database growth.

Sources: crates/palyra-daemon/src/journal.rs#50-57, crates/palyra-daemon/src/gateway.rs#64-66, crates/palyra-common/src/daemon_config_schema.rs#200-205
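One way a TTL-plus-quota maintenance pass could work is sketched below. The source only states that TTLs and storage quotas are enforced; the StoredItem struct, the function names, the byte-based quota, and the oldest-first eviction order are assumptions of this sketch.

```rust
use std::time::{Duration, SystemTime};

/// Hypothetical record carrying only the fields that retention needs.
pub struct StoredItem {
    pub id: u64,
    pub created_at: SystemTime,
    pub ttl: Option<Duration>, // None means the item never expires
    pub size_bytes: usize,
}

/// An item is expired once its age reaches its TTL.
pub fn is_expired(item: &StoredItem, now: SystemTime) -> bool {
    match item.ttl {
        Some(ttl) => match now.duration_since(item.created_at) {
            Ok(age) => age >= ttl,
            Err(_) => false, // created_at is later than `now`; keep the item
        },
        None => false,
    }
}

/// Remove expired items, then evict the oldest survivors until the total
/// size fits within `max_total_bytes`. Returns removed ids in removal order.
pub fn run_maintenance(
    items: &mut Vec<StoredItem>,
    now: SystemTime,
    max_total_bytes: usize,
) -> Vec<u64> {
    let mut removed = Vec::new();

    // Pass 1: TTL expiry.
    items.retain(|item| {
        let expired = is_expired(item, now);
        if expired {
            removed.push(item.id);
        }
        !expired
    });

    // Pass 2: quota enforcement, oldest first (an assumed policy).
    items.sort_by_key(|item| item.created_at);
    let mut total: usize = items.iter().map(|item| item.size_bytes).sum();
    let mut evict = 0;
    while total > max_total_bytes && evict < items.len() {
        total -= items[evict].size_bytes;
        removed.push(items[evict].id);
        evict += 1;
    }
    items.drain(..evict);

    removed
}
```

Passing `now` as a parameter rather than calling `SystemTime::now()` inside keeps the pass deterministic and easy to test.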

Memory CLI & Web Interface

Users can manage the memory state through both the palyra CLI and the Web Console.

CLI Commands

The memory command family allows for manual searching and maintenance:
  • palyra memory search <query>: Performs a vector search.
  • palyra memory purge: Clears items based on filters (session ID, channel, or “all”).
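The purge filters listed above (session ID, channel, or "all") map naturally onto an enum. The sketch below models that matching logic only; the PurgeFilter and ItemScope types and the purge function are hypothetical names, and the actual CLI flags and daemon-side types are not shown in the source.

```rust
/// Hypothetical model of the three purge scopes named in the CLI description.
#[derive(Debug, Clone, PartialEq)]
pub enum PurgeFilter {
    Session(String),
    Channel(String),
    All,
}

/// Hypothetical subset of item metadata that a purge needs to inspect.
#[derive(Debug, Clone, PartialEq)]
pub struct ItemScope {
    pub id: u64,
    pub session_id: String,
    pub channel: String,
}

impl PurgeFilter {
    /// True when the item falls inside this filter's scope.
    pub fn matches(&self, item: &ItemScope) -> bool {
        match self {
            PurgeFilter::Session(session) => item.session_id == *session,
            PurgeFilter::Channel(channel) => item.channel == *channel,
            PurgeFilter::All => true,
        }
    }
}

/// Remove every matching item and report how many were purged.
pub fn purge(items: &mut Vec<ItemScope>, filter: &PurgeFilter) -> usize {
    let before = items.len();
    items.retain(|item| !filter.matches(item));
    before - items.len()
}
```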

Web Console (Memory Section)

The MemorySection in the React app provides a visual management surface for the memory system.

[Diagram: Memory Management Components]

Sources: apps/web/src/console/sections/MemorySection.tsx#27-94, crates/palyra-daemon/src/journal.rs#63-71, apps/web/src/console/useConsoleAppState.tsx#165-165