Subsystem Overview
The subsystem is built on a “Registry-Backed” architecture. Instead of hardcoding a single provider, Palyra maintains a registry of configured model entries that can be ranked and selected based on the requirements of each request (e.g., vision support, reasoning effort, or tool-calling capabilities).
Core Components and Entities
The following diagram illustrates the relationship between the core traits and the concrete implementations within the palyra-daemon and palyra-model-providers crates.
Model Provider Interface Mapping
Sources: crates/palyra-daemon/src/model_provider.rs#5-12, crates/palyra-daemon/src/model_provider/adapters.rs#16-57, crates/palyra-model-providers/src/lib.rs#71-77
5.1 Provider Registry and Routing
The RegistryBackedModelProvider serves as the primary entry point for the orchestrator. It manages the lifecycle of provider requests, including:
- Candidate Selection: Filtering the ModelProviderRegistryConfig to find models that support specific features like ProviderReasoningEffort or ProviderImageInput (crates/palyra-daemon/src/model_provider.rs#80-87).
- Circuit Breaking & Failover: Monitoring provider health and automatically failing over to secondary candidates if the primary provider returns retryable errors such as HTTP 429 or 503 (crates/palyra-daemon/src/model_provider.rs#118-120).
- Response Caching: TTL-bounded caching of responses to reduce latency and cost for repetitive queries (crates/palyra-daemon/src/model_provider.rs#1-3).
- Normalization: Converting provider-specific events into a uniform ProviderEvent stream, allowing the AgentRunLoop to process tokens and tool calls regardless of the underlying backend (crates/palyra-daemon/src/model_provider.rs#14-18).
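The candidate selection and failover steps above can be sketched roughly as follows. This is a minimal illustration, not Palyra's actual code: the Feature, ModelEntry, and ProviderError types, the priority-based ranking rule, and both function names are hypothetical stand-ins, and circuit breaking, caching, and event normalization are left out.

```rust
// Hypothetical types for illustration only; the real Palyra definitions may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Feature {
    Vision,
    ReasoningEffort,
    ToolCalling,
}

#[derive(Debug, Clone)]
pub struct ModelEntry {
    pub name: String,
    /// Lower values are tried first (an assumed ranking rule).
    pub priority: u32,
    pub features: Vec<Feature>,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ProviderError {
    Http(u16),
    Other(String),
}

impl ProviderError {
    /// Treats HTTP 429 and 503 as retryable, matching the examples in the text.
    pub fn is_retryable(&self) -> bool {
        matches!(self, ProviderError::Http(429) | ProviderError::Http(503))
    }
}

/// Keeps the entries that support every required feature, ordered by priority.
pub fn select_candidates<'a>(
    registry: &'a [ModelEntry],
    required: &[Feature],
) -> Vec<&'a ModelEntry> {
    let mut candidates: Vec<&ModelEntry> = registry
        .iter()
        .filter(|entry| required.iter().all(|f| entry.features.contains(f)))
        .collect();
    candidates.sort_by_key(|entry| entry.priority);
    candidates
}

/// Tries candidates in order, moving to the next one only after a retryable error.
pub fn complete_with_failover<F>(
    candidates: &[&ModelEntry],
    mut call: F,
) -> Result<String, ProviderError>
where
    F: FnMut(&ModelEntry) -> Result<String, ProviderError>,
{
    let mut last_error = ProviderError::Other("no candidates available".to_string());
    for &entry in candidates {
        match call(entry) {
            Ok(response) => return Ok(response),
            // Retryable: remember the error and fall through to the next candidate.
            Err(err) if err.is_retryable() => last_error = err,
            // Non-retryable: surface immediately without trying other providers.
            Err(err) => return Err(err),
        }
    }
    Err(last_error)
}
```

A caller would first run select_candidates with the features a request needs, then pass the resulting list to complete_with_failover together with a closure that performs the actual provider call. Returning early on non-retryable errors is a design choice of this sketch; it avoids resending a request that would fail the same way on every backend.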
5.2 Provider Adapters and Configuration
The adapter layer handles the translation between Palyra’s internal ProviderRequest and the specific JSON dialects required by external APIs.
- Dialect Support: Concrete adapters like OpenAiCompatibleChatAdapter and AnthropicCompatibleChatAdapter project the ToolVisibleToolCatalogSnapshot into the correct schema, e.g., OpenAI Function Calling vs. Anthropic Tool Use (crates/palyra-daemon/src/model_provider/adapters.rs#25-57).
- Configuration Pipeline: The system uses a multi-layer loading strategy where palyra.toml settings are merged with PALYRA_MODEL_PROVIDER_* environment variables (crates/palyra-daemon/src/config/load.rs#109-110).
- Secret Management: Sensitive credentials (API keys) are handled via SecretRef, allowing them to be sourced from environment variables, files, or the palyra-vault (crates/palyra-common/src/daemon_config_schema.rs#22-38).
- Tool Repair: An experimental feature (TOOL_REPAIR_ROLLOUT_ENV) that attempts to normalize and fix malformed tool-call markup generated by models (crates/palyra-daemon/src/model_provider.rs#71-77).
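To make the configuration and secret-handling bullets more concrete, here is a rough sketch of layered loading with a secret reference. Everything beyond the PALYRA_MODEL_PROVIDER_ prefix is an assumption made for illustration: the ProviderSettings fields, the BASE_URL and MODEL suffixes, the SecretSource enum (a stand-in for SecretRef), and the rule that environment values win over file values are not confirmed by the source. The environment is passed in as a map so the example stays deterministic.

```rust
use std::collections::HashMap;
use std::fs;

/// Hypothetical stand-in for SecretRef; the real type may look different.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum SecretSource {
    Env(String),
    File(String),
    Vault(String),
}

/// Hypothetical subset of provider settings as they might be read from palyra.toml.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ProviderSettings {
    pub base_url: String,
    pub model: String,
    pub api_key: SecretSource,
}

/// Prefix taken from the text; the suffixes used below are assumed.
const ENV_PREFIX: &str = "PALYRA_MODEL_PROVIDER_";

/// Overlays environment values on file-derived settings.
/// Assumes environment variables take precedence when present.
pub fn apply_env_overrides(
    mut settings: ProviderSettings,
    env: &HashMap<String, String>,
) -> ProviderSettings {
    if let Some(url) = env.get(&format!("{}BASE_URL", ENV_PREFIX)) {
        settings.base_url = url.clone();
    }
    if let Some(model) = env.get(&format!("{}MODEL", ENV_PREFIX)) {
        settings.model = model.clone();
    }
    settings
}

/// Resolves a secret reference to its value at the point of use,
/// so the raw key never has to live in the configuration file itself.
pub fn resolve_secret(
    source: &SecretSource,
    env: &HashMap<String, String>,
) -> Result<String, String> {
    match source {
        SecretSource::Env(name) => env
            .get(name)
            .cloned()
            .ok_or_else(|| format!("environment variable {} is not set", name)),
        SecretSource::File(path) => fs::read_to_string(path)
            .map(|contents| contents.trim().to_string())
            .map_err(|err| format!("cannot read {}: {}", path, err)),
        // The vault backend is not modeled here; a real lookup would go through palyra-vault.
        SecretSource::Vault(key) => {
            Err(format!("vault lookup for {} is not modeled in this sketch", key))
        }
    }
}
```

In real use the map would be built from std::env::vars().collect(), and the file-derived ProviderSettings would come from parsing palyra.toml. Keeping the secret as a reference until resolve_secret is called means configuration can be logged or serialized without exposing the credential.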