Provider Abstractions
Palyra uses two primary traits to define how the system interacts with AI models. These abstractions allow the palyrad daemon to remain agnostic of the specific backend implementation.
- ModelProvider: Handles chat completions, tool proposals, and audio transcriptions crates/palyra-daemon/src/model_provider.rs#252-269.
- EmbeddingsProvider: Handles vectorization of text inputs for memory and retrieval crates/palyra-daemon/src/model_provider.rs#271-278.
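To make the split of responsibilities concrete, the following is a minimal sketch of what two such traits could look like. The method names, the fields of `ProviderRequest` and `ProviderResponse`, the `ProviderResult` alias, and `EchoProvider` are all assumptions for illustration; the actual definitions live in model_provider.rs and are probably asynchronous and richer than shown here.

```rust
// Hypothetical, simplified shapes; not the definitions from model_provider.rs.
#[derive(Debug, Clone)]
pub struct ProviderRequest {
    pub input_text: String,
}

#[derive(Debug, Clone, PartialEq)]
pub struct ProviderResponse {
    pub output_text: String,
}

pub type ProviderResult<T> = Result<T, String>;

/// Chat completions (tool proposals and transcription omitted for brevity).
pub trait ModelProvider {
    fn complete(&self, request: &ProviderRequest) -> ProviderResult<ProviderResponse>;
}

/// Vectorization of text inputs.
pub trait EmbeddingsProvider {
    fn embed(&self, inputs: &[String]) -> ProviderResult<Vec<Vec<f32>>>;
}

/// A trivial backend, used only to show that callers depend on the traits
/// rather than on a concrete implementation.
pub struct EchoProvider;

impl ModelProvider for EchoProvider {
    fn complete(&self, request: &ProviderRequest) -> ProviderResult<ProviderResponse> {
        Ok(ProviderResponse {
            output_text: format!("echo: {}", request.input_text),
        })
    }
}

impl EmbeddingsProvider for EchoProvider {
    fn embed(&self, inputs: &[String]) -> ProviderResult<Vec<Vec<f32>>> {
        // One-dimensional placeholder "embedding": the byte length of each input.
        Ok(inputs.iter().map(|s| vec![s.len() as f32]).collect())
    }
}

/// Caller code is written against the trait object, not the backend type.
pub fn run(provider: &dyn ModelProvider, text: &str) -> ProviderResult<String> {
    let request = ProviderRequest {
        input_text: text.to_string(),
    };
    provider.complete(&request).map(|r| r.output_text)
}
```

Because `run` only sees `&dyn ModelProvider`, swapping the backend does not require touching the caller, which is the property the abstractions are meant to provide.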
Model Provider Kinds
The system supports two implementation modes defined by the ModelProviderKind enum crates/palyra-daemon/src/model_provider.rs#32-35:
| Kind | Description |
|---|---|
| OpenAiCompatible | Targets any backend following the OpenAI API specification (e.g., OpenAI, Azure, LocalLLM) crates/palyra-daemon/src/model_provider.rs#49. |
| Deterministic | A mock provider used for testing that returns pre-defined or hashed responses without network calls crates/palyra-daemon/src/model_provider.rs#48. |
OpenAI-Compatible Backend
The OpenAiModelProvider is the primary production implementation. It translates internal ProviderRequest structures into OpenAI-compatible JSON payloads.
Data Flow: Request Transformation
When a request is initiated, the provider performs the following:
- Vision Handling: If vision_inputs are present, it constructs a multi-modal image_url payload crates/palyra-daemon/src/model_provider.rs#59-79.
- Tool Mapping: Internal tool definitions are converted to the OpenAI tools schema.
- URL Validation: Before dispatching, the openai_base_url is validated. By default, private/loopback IP addresses are blocked unless allow_private_base_url is enabled in FileModelProviderConfig crates/palyra-common/src/daemon_config_schema.rs#198, crates/palyra-daemon/src/model_provider.rs#604-620.
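The vision step can be illustrated with a small sketch that builds the `content` array of a user message in the multi-part form the OpenAI chat completions API accepts (a `text` part followed by `image_url` parts carrying data URLs). `VisionInput`, `json_escape`, and `build_user_content` are hypothetical names, and the JSON is assembled by hand only to keep the example dependency-free; real code would more likely use a JSON library such as serde_json.

```rust
/// Hypothetical vision input: image bytes that are already base64-encoded.
pub struct VisionInput {
    pub mime_type: String,
    pub base64_data: String,
}

/// Escape a string so it can be embedded inside a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build the `content` array of a user message: one text part, then one
/// `image_url` part per image, each carrying a `data:` URL.
pub fn build_user_content(text: &str, images: &[VisionInput]) -> String {
    let mut parts = vec![format!(
        r#"{{"type":"text","text":"{}"}}"#,
        json_escape(text)
    )];
    for img in images {
        let url = format!("data:{};base64,{}", img.mime_type, img.base64_data);
        parts.push(format!(
            r#"{{"type":"image_url","image_url":{{"url":"{}"}}}}"#,
            json_escape(&url)
        ));
    }
    format!("[{}]", parts.join(","))
}
```

When no images are supplied, the function degrades to a single text part, so the same code path can serve text-only and multi-modal requests.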
Authentication Flow
Authentication supports multiple sources via the ModelProviderCredentialSource crates/palyra-daemon/src/model_provider.rs#103-108:
- InlineConfig: API key stored directly in palyra.toml crates/palyra-daemon/src/model_provider.rs#114.
- VaultRef: A reference to a secret stored in the platform’s secure vault (e.g., Keychain) crates/palyra-daemon/src/model_provider.rs#115.
- AuthProfile: Dynamic profiles managed via the Web Console, supporting both static API keys and OAuth2 flows crates/palyra-daemon/src/openai_surface.rs#11-65.
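A rough sketch of how the three sources might be resolved to a usable key is shown below. The enum here is a simplified stand-in named `CredentialSource`; its payload fields, the `SecretStores` struct (two in-memory maps standing in for the platform vault and the profile store), and `resolve_api_key` are assumptions, and OAuth2 token refresh is left out entirely.

```rust
use std::collections::HashMap;

/// Hypothetical mirror of the three credential sources described above.
#[derive(Debug, Clone)]
pub enum CredentialSource {
    InlineConfig { api_key: String },
    VaultRef { vault_key: String },
    AuthProfile { profile_id: String },
}

/// In-memory stand-ins for the secure vault and the auth profile store.
pub struct SecretStores {
    pub vault: HashMap<String, String>,
    pub auth_profiles: HashMap<String, String>,
}

/// Resolve a credential source to an API key, or explain why it failed.
pub fn resolve_api_key(
    source: &CredentialSource,
    stores: &SecretStores,
) -> Result<String, String> {
    match source {
        // The key is already present in the configuration file.
        CredentialSource::InlineConfig { api_key } => Ok(api_key.clone()),
        // The configuration only names a vault entry; look it up.
        CredentialSource::VaultRef { vault_key } => stores
            .vault
            .get(vault_key)
            .cloned()
            .ok_or_else(|| format!("vault entry not found: {vault_key}")),
        // The configuration names a profile managed outside the file.
        CredentialSource::AuthProfile { profile_id } => stores
            .auth_profiles
            .get(profile_id)
            .cloned()
            .ok_or_else(|| format!("auth profile not found: {profile_id}")),
    }
}
```

Keeping resolution behind a single function means the request path never needs to know where a secret came from, only whether one was found.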
Reliability: Circuit Breaker and Retries
To ensure system stability, the ModelProvider implementation wraps requests in logic that handles transient failures and prevents cascading outages.
Retry Logic
The provider monitors HTTP status codes and automatically retries on 429 (Rate Limit), 500, 502, 503, and 504 crates/palyra-daemon/src/model_provider.rs#22.
- Max Retries: Configurable via max_retries (default: 2) crates/palyra-daemon/src/model_provider.rs#157.
- Backoff: Uses a linear backoff defined by retry_backoff_ms crates/palyra-daemon/src/model_provider.rs#158.
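The retry behaviour can be sketched as a small loop. This is an illustration under assumptions: "linear" is taken to mean that the delay before retry number n is `retry_backoff_ms * n`, and `send_with_retries` with its injected `send` and `sleep` closures is a hypothetical helper, not the daemon's API. Injecting `sleep` keeps the loop testable without real waiting.

```rust
use std::time::Duration;

/// Status codes treated as transient, per the list above.
const RETRYABLE_STATUS: [u16; 5] = [429, 500, 502, 503, 504];

pub fn is_retryable(status: u16) -> bool {
    RETRYABLE_STATUS.contains(&status)
}

/// Delay before retry number `attempt` (1-based).
/// Assumption: linear backoff means base delay multiplied by the attempt number.
pub fn backoff_delay(retry_backoff_ms: u64, attempt: u32) -> Duration {
    Duration::from_millis(retry_backoff_ms.saturating_mul(u64::from(attempt)))
}

/// Call `send` until it returns a non-retryable status or the retry budget
/// is exhausted. Returns the last status observed.
pub fn send_with_retries<F, S>(
    max_retries: u32,
    retry_backoff_ms: u64,
    mut send: F,
    mut sleep: S,
) -> u16
where
    F: FnMut() -> u16,
    S: FnMut(Duration),
{
    let mut attempt: u32 = 0;
    loop {
        let status = send();
        if !is_retryable(status) || attempt >= max_retries {
            return status;
        }
        attempt += 1;
        sleep(backoff_delay(retry_backoff_ms, attempt));
    }
}
```

With `max_retries` set to 2, a persistently failing backend is called three times in total: the initial attempt plus two retries.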
Circuit Breaker
A circuit breaker prevents the daemon from repeatedly attempting calls to a failing provider.
- Threshold: The circuit “opens” after circuit_breaker_failure_threshold consecutive failures (default: 3) crates/palyra-daemon/src/model_provider.rs#159.
- Cooldown: The provider remains in a failed state for circuit_breaker_cooldown_ms (default: 30s) before allowing a “half-open” trial request crates/palyra-daemon/src/model_provider.rs#160.
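The threshold and cooldown rules map onto a small state machine. The sketch below is one plausible shape, not the daemon's implementation: the `CircuitBreaker` type and its method names are hypothetical, and time is passed in explicitly so the behaviour can be exercised without waiting for a real cooldown.

```rust
use std::time::{Duration, Instant};

/// Minimal consecutive-failure circuit breaker (illustrative only).
pub struct CircuitBreaker {
    failure_threshold: u32,
    cooldown: Duration,
    consecutive_failures: u32,
    /// Set while the circuit is open; records when the cooldown started.
    opened_at: Option<Instant>,
}

impl CircuitBreaker {
    pub fn new(failure_threshold: u32, cooldown: Duration) -> Self {
        Self {
            failure_threshold,
            cooldown,
            consecutive_failures: 0,
            opened_at: None,
        }
    }

    /// True if a request may be attempted at `now`. While the circuit is
    /// open, this only becomes true once the cooldown has elapsed, which
    /// corresponds to the half-open trial request.
    pub fn allow_request(&self, now: Instant) -> bool {
        match self.opened_at {
            None => true,
            Some(opened) => now.saturating_duration_since(opened) >= self.cooldown,
        }
    }

    /// A success closes the circuit and resets the failure streak.
    pub fn record_success(&mut self) {
        self.consecutive_failures = 0;
        self.opened_at = None;
    }

    /// A failure extends the streak; reaching the threshold opens the
    /// circuit, and a failed trial request restarts the cooldown.
    pub fn record_failure(&mut self, now: Instant) {
        self.consecutive_failures = self.consecutive_failures.saturating_add(1);
        if self.consecutive_failures >= self.failure_threshold {
            self.opened_at = Some(now);
        }
    }

    pub fn is_open(&self) -> bool {
        self.opened_at.is_some()
    }
}
```

A caller would check `allow_request` before dispatching and then report the outcome through `record_success` or `record_failure`, so a single successful trial request is enough to close the circuit again.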
Netguard: URL Validation
Palyra implements a “Netguard” pattern to prevent Server-Side Request Forgery (SSRF) when the daemon communicates with model providers. The validate_provider_base_url function checks the configured openai_base_url crates/palyra-daemon/src/model_provider.rs#604-620:
- Resolves the hostname to IP addresses crates/palyra-daemon/src/model_provider.rs#624-626.
- Checks if any resolved IP is a loopback, link-local, or private address crates/palyra-daemon/src/model_provider.rs#628-632.
- Bails with an error unless allow_private_base_url is explicitly true in the configuration crates/palyra-daemon/src/model_provider.rs#633-638.
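The three steps above can be sketched with the standard library alone. This is an approximation under stated assumptions: `validate_base_url_host` and `is_blocked_ip` are hypothetical names, the host and port are assumed to have been extracted from `openai_base_url` already, and the IPv6 ranges are checked by prefix (`fc00::/7` unique local, `fe80::/10` link-local). The real `validate_provider_base_url` may cover more cases.

```rust
use std::net::{IpAddr, Ipv6Addr, ToSocketAddrs};

/// Loopback, unique-local (fc00::/7), or link-local (fe80::/10) IPv6.
fn is_blocked_v6(ip: &Ipv6Addr) -> bool {
    let first = ip.segments()[0];
    ip.is_loopback() || (first & 0xfe00) == 0xfc00 || (first & 0xffc0) == 0xfe80
}

/// True for loopback, private, or link-local addresses.
pub fn is_blocked_ip(ip: &IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => v4.is_loopback() || v4.is_private() || v4.is_link_local(),
        IpAddr::V6(v6) => is_blocked_v6(v6),
    }
}

/// Resolve `host` and reject it if any resolved address is local or private,
/// unless `allow_private_base_url` is set. Returns the resolved addresses.
pub fn validate_base_url_host(
    host: &str,
    port: u16,
    allow_private_base_url: bool,
) -> Result<Vec<IpAddr>, String> {
    // Step 1: resolve the hostname (IP literals resolve to themselves).
    let addrs: Vec<IpAddr> = (host, port)
        .to_socket_addrs()
        .map_err(|e| format!("failed to resolve {host}: {e}"))?
        .map(|sa| sa.ip())
        .collect();
    if addrs.is_empty() {
        return Err(format!("{host} did not resolve to any address"));
    }
    // Steps 2 and 3: a single blocked address is enough to reject the host.
    if !allow_private_base_url {
        if let Some(bad) = addrs.iter().find(|ip| is_blocked_ip(ip)) {
            return Err(format!(
                "{host} resolves to local or private address {bad}"
            ));
        }
    }
    Ok(addrs)
}
```

Rejecting the host when any resolved address is blocked, rather than only when all are, is the conservative choice: a hostname with mixed public and private records could otherwise be steered to the private one.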
DeterministicProvider for Testing
For CI/CD and local development without API keys, the DeterministicProvider offers stable, non-networked behavior.
- Completion: Returns a hashed version of the input text to ensure unique but repeatable outputs crates/palyra-daemon/src/model_provider.rs#528-535.
- Embeddings: Generates a deterministic vector by seeding a PRNG with the hash of the input string crates/palyra-daemon/src/model_provider.rs#563-575. This allows vector search logic to be tested for functional correctness without a real embedding model.
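The hash-then-seed idea can be illustrated as follows. The specific algorithms are assumptions chosen so the sketch is stable across builds: FNV-1a for the hash and SplitMix64 for the PRNG. The output format of `deterministic_completion` and the L2 normalization step are likewise illustrative; the source does not state which hash, generator, or post-processing the DeterministicProvider uses.

```rust
/// FNV-1a 64-bit hash (assumed here; stable across platforms and builds).
fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325;
    for b in bytes {
        hash ^= u64::from(*b);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3);
    }
    hash
}

/// One step of the SplitMix64 generator.
fn splitmix64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

/// Repeatable "completion": a label plus the hex hash of the input.
pub fn deterministic_completion(input: &str) -> String {
    format!("deterministic:{:016x}", fnv1a64(input.as_bytes()))
}

/// Repeatable embedding: seed the PRNG with the input hash, draw `dims`
/// values in [-1, 1), then L2-normalize the vector.
pub fn deterministic_embedding(input: &str, dims: usize) -> Vec<f32> {
    let mut state = fnv1a64(input.as_bytes());
    let mut v: Vec<f32> = (0..dims)
        .map(|_| {
            // Keep the top 24 bits so the value is exactly representable in f32.
            let bits = (splitmix64(&mut state) >> 40) as f32;
            bits / (1u32 << 24) as f32 * 2.0 - 1.0
        })
        .collect();
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        for x in &mut v {
            *x /= norm;
        }
    }
    v
}
```

Identical inputs always produce identical vectors, so index and retrieval code can be tested end to end, although the vectors carry no semantic similarity between related texts.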
Provider Status and Metrics
The system tracks the health and usage of providers through the ProviderStatusSnapshot crates/palyra-daemon/src/model_provider.rs#228-237.
Snapshot Fields
| Field | Description |
|---|---|
| is_healthy | Boolean indicating if the circuit breaker is closed crates/palyra-daemon/src/model_provider.rs#229. |
| failure_count | Number of consecutive failures recorded crates/palyra-daemon/src/model_provider.rs#231. |
| last_failure_message | The error message from the last failed attempt crates/palyra-daemon/src/model_provider.rs#232. |
| total_tokens_consumed | Aggregate of prompt and completion tokens used in the current session crates/palyra-daemon/src/model_provider.rs#234. |
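As a rough illustration of how these fields relate to one another, the sketch below derives a snapshot from a small tracker. The field names follow the table, but the field types, the `StatusSnapshot` and `StatusTracker` types, and the update rules are assumptions rather than the actual ProviderStatusSnapshot definition.

```rust
/// Hypothetical snapshot with the four fields from the table above.
#[derive(Debug, Clone, PartialEq)]
pub struct StatusSnapshot {
    pub is_healthy: bool,
    pub failure_count: u32,
    pub last_failure_message: Option<String>,
    pub total_tokens_consumed: u64,
}

/// Mutable state from which snapshots are taken (illustrative only).
#[derive(Debug, Default)]
pub struct StatusTracker {
    failure_count: u32,
    last_failure_message: Option<String>,
    circuit_open: bool,
    prompt_tokens: u64,
    completion_tokens: u64,
}

impl StatusTracker {
    /// A success resets the failure streak and accumulates token usage.
    pub fn record_success(&mut self, prompt_tokens: u64, completion_tokens: u64) {
        self.failure_count = 0;
        self.circuit_open = false;
        self.prompt_tokens += prompt_tokens;
        self.completion_tokens += completion_tokens;
    }

    /// A failure extends the streak and may open the circuit.
    pub fn record_failure(&mut self, message: &str, failure_threshold: u32) {
        self.failure_count += 1;
        self.last_failure_message = Some(message.to_string());
        if self.failure_count >= failure_threshold {
            self.circuit_open = true;
        }
    }

    pub fn snapshot(&self) -> StatusSnapshot {
        StatusSnapshot {
            // Healthy means the circuit breaker is closed.
            is_healthy: !self.circuit_open,
            failure_count: self.failure_count,
            last_failure_message: self.last_failure_message.clone(),
            // Aggregate of prompt and completion tokens.
            total_tokens_consumed: self.prompt_tokens + self.completion_tokens,
        }
    }
}
```

Note that in this sketch the last failure message survives a later success, matching the description of the field as the message from the last failed attempt.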
Implementation Diagrams
Model Request Pipeline
This diagram bridges the natural-language request from the Orchestrator to the concrete OpenAiModelProvider implementation.
Sources: crates/palyra-daemon/src/model_provider.rs#252-269, crates/palyra-daemon/src/model_provider.rs#604-620
Provider Configuration & Auth Mapping
This diagram maps the RootFileConfig to the runtime ModelProviderConfig and its associated AuthProfile.
Sources: crates/palyra-common/src/daemon_config_schema.rs#64-81, crates/palyra-daemon/src/model_provider.rs#123-140, crates/palyra-daemon/src/agents.rs#27-37