Media Artifact Store
The MediaArtifactStore is the central authority for managing attachments and generated media. It handles ingestion from external connectors (e.g., Discord), manual uploads via the Console, and derived artifacts produced by the processing pipeline.
Dual Storage Architecture
Media is stored using a hybrid approach to balance query performance with filesystem efficiency:
- Metadata (SQLite): Stores artifact records, content types, SHA256 hashes, and the relationship mapping between source and derived artifacts crates/palyra-daemon/src/media.rs#120-124.
- Content (Filesystem): Raw bytes are stored in a content-addressed structure under the media_content_root crates/palyra-daemon/src/media.rs#122.
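Content addressing means the file location is derived from the digest of the bytes, so identical uploads land on the same path. The following is a minimal sketch of that mapping, not the daemon's actual code: the function name content_path and the two-character fan-out directory are assumptions made for illustration, and the digest is taken as an already computed hex string.

```rust
use std::path::{Path, PathBuf};

/// Maps a hex-encoded SHA256 digest to a file path under the content root.
/// Returns None when the digest is not exactly 64 hex characters.
/// ASSUMPTION: the fan-out directory (first two hex characters) is a common
/// content-addressed layout chosen for this sketch, not a confirmed one.
fn content_path(media_content_root: &Path, sha256_hex: &str) -> Option<PathBuf> {
    let well_formed = sha256_hex.len() == 64
        && sha256_hex.bytes().all(|b| b.is_ascii_hexdigit());
    if !well_formed {
        return None;
    }
    // Normalise case so the same digest always maps to the same file.
    let digest = sha256_hex.to_ascii_lowercase();
    Some(media_content_root.join(&digest[..2]).join(&digest))
}
```

Because the path depends only on the digest, the SQLite record can store the hash and recompute the location on demand instead of storing a separate file path.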
Media Ingestion Flow
The ChannelPlatform orchestrates the ingestion of attachments received from external channels crates/palyra-daemon/src/channels.rs#106-110.
| Stage | Code Entity | Description |
|---|---|---|
| Ingest Request | InboundAttachmentIngestRequest | Encapsulates source URL, filename, and expected content type crates/palyra-daemon/src/media.rs#25. |
| Content Sniffing | MediaArtifactStore::ingest_inbound | Downloads content and validates against MediaRuntimeConfig allowed types crates/palyra-daemon/src/media.rs#49-69. |
| Validation | netguard | Validates source URLs against allowed hosts (e.g., cdn.discordapp.com) crates/palyra-daemon/src/media.rs#25-26. |
| Persistence | MediaArtifactPayload | Computes SHA256 and writes to filesystem and SQLite crates/palyra-daemon/src/media.rs#152-161. |
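The host check in the Validation stage can be illustrated with a small allow-list comparison. This is a simplified sketch under stated assumptions: host_of and is_allowed_source are hypothetical helpers, only https URLs are accepted, and the URL parsing is deliberately minimal. It does not reproduce the netguard implementation.

```rust
/// Hypothetical helper: extracts the host from an https URL.
/// ASSUMPTION: non-https schemes and URLs carrying userinfo ("user@host")
/// are rejected outright in this sketch.
fn host_of(url: &str) -> Option<&str> {
    let rest = url.strip_prefix("https://")?;
    // The authority ends at the first path, query, or fragment delimiter.
    let authority = rest
        .split(|c: char| c == '/' || c == '?' || c == '#')
        .next()?;
    if authority.contains('@') {
        return None;
    }
    // Drop an optional ":port" suffix.
    let host = authority.split(':').next()?;
    if host.is_empty() {
        None
    } else {
        Some(host)
    }
}

/// Returns true only when the URL's host exactly matches an allowed host.
fn is_allowed_source(url: &str, allowed_hosts: &[&str]) -> bool {
    match host_of(url) {
        Some(host) => allowed_hosts
            .iter()
            .any(|allowed| allowed.eq_ignore_ascii_case(host)),
        None => false,
    }
}
```

Exact host matching is the point of the sketch: a prefix or substring comparison would accept a host such as cdn.discordapp.com.evil.example.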
Derived Artifact Pipeline
Palyra automatically processes ingested media to extract useful information for the LLM context. This is managed by the MediaDerivedArtifact logic.
Supported Extractors
The system supports three primary kinds of derived artifacts defined in DerivedArtifactKind crates/palyra-derived.rs#27-31:
- MetadataSummary: Basic file info (dimensions, size, hash) crates/palyra-derived.rs#105-137.
- ExtractedText: OCR or text extraction from PDF, DOCX, XLSX, and HTML crates/palyra-derived.rs#165-201.
- Transcript: Audio-to-text conversion for supported audio formats crates/palyra-derived.rs#156-163.
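The three kinds listed above can be pictured as an enum plus a routing function keyed on the content type. This is a sketch only: kinds_for is a hypothetical helper, the rule that every artifact receives a MetadataSummary is an assumption, and treating every audio/ type as transcribable is a simplification of "supported audio formats".

```rust
/// The three derived artifact kinds named in the list above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DerivedArtifactKind {
    MetadataSummary,
    ExtractedText,
    Transcript,
}

/// Hypothetical helper: lists the kinds worth producing for a content type.
fn kinds_for(content_type: &str) -> Vec<DerivedArtifactKind> {
    // Standard MIME types for PDF, DOCX, XLSX, and HTML.
    const DOCUMENT_TYPES: [&str; 4] = [
        "application/pdf",
        "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
        "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
        "text/html",
    ];

    // Drop parameters such as "; charset=utf-8" and normalise case.
    let mime = content_type
        .split(';')
        .next()
        .unwrap_or("")
        .trim()
        .to_ascii_lowercase();

    // ASSUMPTION: every artifact gets a metadata summary.
    let mut kinds = vec![DerivedArtifactKind::MetadataSummary];
    if DOCUMENT_TYPES.contains(&mime.as_str()) {
        kinds.push(DerivedArtifactKind::ExtractedText);
    } else if mime.starts_with("audio/") {
        // ASSUMPTION: all audio types are treated as supported here.
        kinds.push(DerivedArtifactKind::Transcript);
    }
    kinds
}
```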
Processing Logic
- Sniffing: supports_document_extraction checks the MIME type to route to the correct parser crates/palyra-derived.rs#140-153.
- Anchoring: Extracted text is broken into chunks with DerivedArtifactAnchor to allow the LLM to cite specific sections crates/palyra-derived.rs#51-59.
- Selection: During prompt preparation, select_prompt_chunks filters derived content based on relevance and token budgets crates/palyra-derived.rs#20-22.
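One plausible reading of the Selection step is a greedy pass: drop chunks below a relevance floor, then take the most relevant chunks that still fit the token budget. The sketch below illustrates that technique only. The Chunk struct, its fields, and select_chunks_greedy are hypothetical names, and the real select_prompt_chunks may rank or budget differently.

```rust
/// Hypothetical chunk shape: an anchor label, a relevance score,
/// and the chunk's cost in tokens.
#[derive(Debug, Clone, PartialEq)]
struct Chunk {
    anchor: String,
    relevance: f32,
    tokens: usize,
}

/// Greedy selection: highest relevance first, skipping any chunk that
/// would push the running total over the token budget.
fn select_chunks_greedy(
    mut chunks: Vec<Chunk>,
    min_relevance: f32,
    token_budget: usize,
) -> Vec<Chunk> {
    chunks.retain(|c| c.relevance >= min_relevance);
    // total_cmp gives a total order over f32, so sorting cannot panic on NaN.
    chunks.sort_by(|a, b| b.relevance.total_cmp(&a.relevance));

    let mut used = 0usize;
    let mut selected = Vec::new();
    for chunk in chunks {
        let next = used.saturating_add(chunk.tokens);
        if next <= token_budget {
            used = next;
            selected.push(chunk);
        }
    }
    selected
}
```

A chunk that does not fit is skipped rather than ending the loop, so a smaller, less relevant chunk can still use the remaining budget.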
Workspace Documents and Risk Scanning
Workspace documents are managed within the journal but governed by strict domain rules in crates/palyra-daemon/src/domain/workspace.rs.
Risk States and Prompt Injection
Every document in the workspace undergoes a risk scan to prevent “Indirect Prompt Injection”, where a file contains instructions intended to hijack the LLM’s behavior.
| Risk State | Constant/Pattern | Action |
|---|---|---|
| Clean | No matches | Normal processing. |
| Warning | PROMPT_INJECTION_WARNING_PATTERNS | Flagged in UI; user must confirm use crates/palyra-daemon/src/domain/workspace.rs#20-29. |
| Quarantined | PROMPT_INJECTION_HIGH_RISK_PATTERNS | Blocked from automatic injection into LLM context crates/palyra-daemon/src/domain/workspace.rs#10-19. |
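The table implies a precedence: a high-risk match wins over a warning match. A minimal sketch of that classification follows. The pattern strings are placeholders invented for illustration (the actual contents of PROMPT_INJECTION_HIGH_RISK_PATTERNS and PROMPT_INJECTION_WARNING_PATTERNS are not shown here), and case-insensitive substring matching is an assumption.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RiskState {
    Clean,
    Warning,
    Quarantined,
}

// PLACEHOLDER patterns for illustration only; not the daemon's real lists.
const HIGH_RISK_PATTERNS: [&str; 2] = [
    "ignore all previous instructions",
    "reveal your system prompt",
];
const WARNING_PATTERNS: [&str; 2] = ["you are now", "disregard the above"];

/// Classifies a document by case-insensitive substring matching.
/// High-risk patterns are checked first so they take precedence.
fn scan_risk(text: &str) -> RiskState {
    let haystack = text.to_lowercase();
    if HIGH_RISK_PATTERNS.iter().any(|p| haystack.contains(*p)) {
        RiskState::Quarantined
    } else if WARNING_PATTERNS.iter().any(|p| haystack.contains(*p)) {
        RiskState::Warning
    } else {
        RiskState::Clean
    }
}
```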
Path Security
The system enforces strict path validation to prevent directory traversal and access to sensitive system areas:
- Blocked Components: .git, .ssh, .aws, secrets, node_modules crates/palyra-daemon/src/domain/workspace.rs#8-9.
- Validation: WorkspacePathError handles traversal attempts and invalid extensions crates/palyra-daemon/src/domain/workspace.rs#168-178.
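The two rules above can be sketched as a single pass over the path components. The blocked names come from the list above; everything else is an assumption for illustration, including the PathRejection enum (a stand-in for WorkspacePathError, whose real variants are not shown), the case-insensitive comparison, and the omission of the extension check.

```rust
use std::path::{Component, Path};

/// Component names taken from the blocked list above.
const BLOCKED_COMPONENTS: [&str; 5] = [".git", ".ssh", ".aws", "secrets", "node_modules"];

/// Hypothetical stand-in for WorkspacePathError.
#[derive(Debug, PartialEq, Eq)]
enum PathRejection {
    Empty,
    Traversal,
    BlockedComponent(String),
}

/// Accepts only relative paths that stay inside the workspace root and
/// avoid blocked directories. Extension checks are left out of this sketch.
fn check_workspace_path(relative: &str) -> Result<(), PathRejection> {
    if relative.is_empty() {
        return Err(PathRejection::Empty);
    }
    for component in Path::new(relative).components() {
        match component {
            Component::Normal(name) => {
                let name = name.to_string_lossy();
                // ASSUMPTION: blocked names are matched case-insensitively.
                if BLOCKED_COMPONENTS.iter().any(|b| b.eq_ignore_ascii_case(&name)) {
                    return Err(PathRejection::BlockedComponent(name.into_owned()));
                }
            }
            Component::CurDir => {}
            // "..", a leading root, or a Windows drive prefix would all
            // let the path escape the workspace root.
            Component::ParentDir | Component::RootDir | Component::Prefix(_) => {
                return Err(PathRejection::Traversal);
            }
        }
    }
    Ok(())
}
```

Checking components rather than searching the raw string for ".." avoids false positives on file names such as notes..md, while still catching docs/../../x.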
Prompt Augmentation and Recall
When a user sends a message, the PreparedModelProviderInput logic determines which media and workspace artifacts to include in the LLM prompt.
Attachment Recall
If a user refers to an attachment, AttachmentRecallSelection is used to retrieve the specific chunks or derived text crates/palyra-daemon/src/application/provider_input.rs#100-106.
Vision Pipeline
Images are converted into ProviderImageInput objects. The system enforces:
- Count Limits: vision_max_image_count crates/palyra-daemon/src/application/provider_input.rs#118.
- Dimension Limits: vision_max_dimension_px crates/palyra-daemon/src/application/provider_input.rs#139-143.
- Encoding: Base64 encoding of raw bytes for transmission to the Model Provider crates/palyra-daemon/src/application/provider_input.rs#147-148.
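The count and dimension limits can be illustrated with a small planning function. How the daemon reacts when a limit is exceeded is not stated above, so this sketch assumes that surplus images are dropped and oversized images are scaled down with their aspect ratio preserved. VisionLimits, fit_within, and plan_images are hypothetical names, and Base64 encoding is left out.

```rust
/// Hypothetical container for the two limits named in the list above.
struct VisionLimits {
    vision_max_image_count: usize,
    vision_max_dimension_px: u32,
}

/// Scales (width, height) down so the longer side fits the limit,
/// preserving the aspect ratio. Images already within the limit are unchanged.
fn fit_within(width: u32, height: u32, max_dimension_px: u32) -> (u32, u32) {
    let longest = width.max(height);
    if longest <= max_dimension_px || longest == 0 {
        return (width, height);
    }
    // Integer scaling in u64 avoids overflow; each side is kept at >= 1 px.
    let scale = |side: u32| -> u32 {
        let scaled = u64::from(side) * u64::from(max_dimension_px) / u64::from(longest);
        (scaled as u32).max(1)
    };
    (scale(width), scale(height))
}

/// ASSUMPTION: images beyond the count limit are dropped, not rejected.
fn plan_images(sizes: &[(u32, u32)], limits: &VisionLimits) -> Vec<(u32, u32)> {
    sizes
        .iter()
        .take(limits.vision_max_image_count)
        .map(|&(w, h)| fit_within(w, h, limits.vision_max_dimension_px))
        .collect()
}
```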
Context References
The preview_context_references function parses the user input for specific markers (e.g., @file, @url) and resolves them into ResolvedContextReference objects, which include “provenance” (where the data came from) and “preview_text” crates/palyra-daemon/src/application/context_references.rs#48-77.
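The marker-parsing half of that step can be sketched as a scan over whitespace-separated tokens. The exact marker syntax is not specified above, so this example assumes the form @file:<path> and @url:<address>. ContextMarker and find_markers are hypothetical names, and resolution into ResolvedContextReference (with provenance and preview text) is not attempted.

```rust
/// Hypothetical parsed marker, prior to any resolution.
#[derive(Debug, Clone, PartialEq, Eq)]
enum ContextMarker {
    File(String),
    Url(String),
}

/// Collects markers from user input, in order of appearance.
/// ASSUMPTION: markers are whitespace-delimited tokens of the form
/// "@file:<path>" or "@url:<address>"; empty targets are ignored.
fn find_markers(input: &str) -> Vec<ContextMarker> {
    input
        .split_whitespace()
        .filter_map(|token| {
            if let Some(path) = token.strip_prefix("@file:") {
                (!path.is_empty()).then(|| ContextMarker::File(path.to_string()))
            } else if let Some(url) = token.strip_prefix("@url:") {
                (!url.is_empty()).then(|| ContextMarker::Url(url.to_string()))
            } else {
                None
            }
        })
        .collect()
}
```

Matching on a token prefix, rather than searching for "@" anywhere, keeps ordinary text such as email addresses from being read as references.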
Sources: crates/palyra-daemon/src/application/provider_input.rs#100-156, crates/palyra-daemon/src/application/context_references.rs#40-77