Provider Architecture & Input Pipeline
The daemon integrates with LLMs through a unified `ModelProvider` interface that supports chat completions, embeddings, and audio transcription.
Key Data Structures
- `ModelProviderKind`: Categorizes providers into `Deterministic` (local/mock), `OpenAiCompatible`, and `Anthropic` crates/palyra-daemon/src/model_provider.rs#39-43.
- `ProviderModelRole`: Defines the capability of a specific model: `Chat`, `Embeddings`, or `AudioTranscription` crates/palyra-daemon/src/model_provider.rs#67-71.
- `ProviderRegistryEntryConfig`: Contains connection metadata including `base_url`, `api_key_vault_ref`, and circuit breaker settings crates/palyra-daemon/src/model_provider.rs#144-161.
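The three types above could be sketched roughly as follows. The variant names and the `base_url`, `api_key_vault_ref`, and circuit breaker field names come from the text; the field types, the `kind` field, and the `requires_credentials` helper are assumptions made for illustration and may not match the actual definitions in `model_provider.rs`.

```rust
/// Provider families named in the text above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ModelProviderKind {
    Deterministic, // local/mock provider
    OpenAiCompatible,
    Anthropic,
}

/// What a given model can be used for.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ProviderModelRole {
    Chat,
    Embeddings,
    AudioTranscription,
}

/// Connection metadata for one provider. Field types are assumed.
#[derive(Debug, Clone)]
pub struct ProviderRegistryEntryConfig {
    pub kind: ModelProviderKind, // assumed field, not confirmed by the text
    pub base_url: String,
    /// Reference to a Vault entry; the secret itself is never stored here.
    pub api_key_vault_ref: Option<String>,
    pub circuit_breaker_failure_threshold: u32,
    pub circuit_breaker_cooldown_ms: u64,
}

impl ProviderRegistryEntryConfig {
    /// Hypothetical helper: a local/mock provider needs no credential.
    pub fn requires_credentials(&self) -> bool {
        self.kind != ModelProviderKind::Deterministic
    }
}
```

Keeping only a vault reference in the entry means the struct can be logged or serialized without exposing a secret.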
Transformation Pipeline
When a request is sent to a provider, the system performs several transformations:
- Vision Handling: If the request contains images, `build_openai_chat_content` or `build_anthropic_messages_payload` constructs the appropriate multi-modal payload crates/palyra-daemon/src/model_provider.rs#211-220.
- Token Estimation: Before dispatch, `estimate_token_count` ensures the payload fits within `MAX_MODEL_TOKENS_PER_EVENT` crates/palyra-daemon/src/model_provider.rs#17-18.
- Tool Mapping: Internal tool definitions are converted to provider-specific function/tool schemas.
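The token estimation step can be illustrated with a minimal sketch. The names `estimate_token_count` and `MAX_MODEL_TOKENS_PER_EVENT` come from the text, but the four-characters-per-token heuristic, the constant's value, and the `check_token_budget` helper are assumptions; the daemon's actual estimator may work differently.

```rust
/// Placeholder value for illustration; the real constant is defined in
/// model_provider.rs and may differ.
pub const MAX_MODEL_TOKENS_PER_EVENT: usize = 8_192;

/// Assumed heuristic: roughly four characters per token, rounded up.
pub fn estimate_token_count(text: &str) -> usize {
    (text.chars().count() + 3) / 4
}

/// Hypothetical pre-dispatch guard: returns the estimate when the payload
/// fits, or an error message when it exceeds the limit.
pub fn check_token_budget(payload: &str) -> Result<usize, String> {
    let estimated = estimate_token_count(payload);
    if estimated > MAX_MODEL_TOKENS_PER_EVENT {
        Err(format!(
            "estimated {} tokens exceeds limit of {}",
            estimated, MAX_MODEL_TOKENS_PER_EVENT
        ))
    } else {
        Ok(estimated)
    }
}
```

A character-based estimate is cheap and needs no tokenizer, at the cost of being approximate, which is usually acceptable for a pre-dispatch size guard.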
Auth Profiles & OpenAI OAuth Flow
Palyra uses Auth Profiles to decouple model configuration from sensitive credentials. Credentials are never stored in plain text in the main `config.toml`; instead, they are stored in the Vault and referenced by a `VaultRef` crates/palyra-daemon/src/openai_surface.rs#42-48.
API Key Connection
The `connect_openai_api_key` and `connect_anthropic_api_key` functions validate the key against the provider’s `/models` endpoint before persisting it to the registry crates/palyra-daemon/src/openai_surface.rs#34-40, crates/palyra-daemon/src/openai_surface.rs#96-102.
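The validate-then-persist ordering can be sketched as below. This is not the signature of `connect_openai_api_key`; the function `connect_api_key`, its closure parameters, and the error strings are hypothetical. The `probe` closure stands in for the HTTP call to the provider's `/models` endpoint, and `store_in_vault` stands in for writing the secret to the Vault and returning a reference to it.

```rust
/// Hypothetical sketch: validate a key first, and only persist it when the
/// provider accepts it. Returns the vault reference to record in the registry.
pub fn connect_api_key<P, S>(
    api_key: &str,
    probe: P,
    store_in_vault: S,
) -> Result<String, String>
where
    P: Fn(&str) -> bool,          // stand-in for the `/models` probe
    S: FnOnce(&str) -> String,    // stand-in for the Vault write
{
    let key = api_key.trim();
    if key.is_empty() {
        return Err("API key is empty".to_string());
    }
    if !probe(key) {
        // Nothing has been persisted at this point.
        return Err("provider rejected the API key".to_string());
    }
    Ok(store_in_vault(key))
}
```

Because the Vault write happens only after the probe succeeds, a rejected key never reaches storage.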
OAuth Lifecycle
For OpenAI, a PKCE-based OAuth flow is supported to manage short-lived access tokens and long-lived refresh tokens:
- Bootstrap: `start_openai_oauth_attempt` generates a PKCE verifier and challenge crates/palyra-daemon/src/openai_auth.rs#99-107.
- Authorize: The user is redirected to the OpenAI authorization URL with a `state` parameter tracking the `attempt_id` crates/palyra-daemon/src/openai_auth.rs#109-130.
- Callback: The `OPENAI_OAUTH_CALLBACK_PATH` receives the code and exchanges it for tokens using `exchange_authorization_code` crates/palyra-daemon/src/openai_auth.rs#132-155.
- Persistence: Tokens are stored in the Vault, and an `AuthProfileRecord` is created in the `AuthProfileRegistry` crates/palyra-auth/src/lib.rs#10-21.
| Function | Role | File Reference |
|---|---|---|
| `generate_pkce_verifier` | Creates entropy for OAuth security | crates/palyra-daemon/src/openai_auth.rs#99 |
| `validate_openai_bearer_token` | Probes `/v1/models` for key validity | crates/palyra-daemon/src/openai_auth.rs#189 |
| `persist_openai_auth_profile` | Saves metadata to `agents.toml` or registry | crates/palyra-daemon/src/openai_surface.rs#61 |
| `refresh_openai_profile` | Uses refresh token to get new access token | apps/web/src/console/hooks/useAuthDomain.ts#220 |
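The role of the `state` parameter in the OAuth lifecycle can be illustrated with a small sketch of attempt tracking. `OAuthAttempt` and `OAuthAttemptStore` are hypothetical names, and using the `attempt_id` directly as the `state` value is a simplification. Verifier generation and the token exchange are out of scope here, so the verifier is passed in by the caller.

```rust
use std::collections::HashMap;

/// Data remembered between the bootstrap step and the callback.
pub struct OAuthAttempt {
    pub pkce_verifier: String,
}

/// Hypothetical in-memory store of pending attempts, keyed by attempt id.
#[derive(Default)]
pub struct OAuthAttemptStore {
    pending: HashMap<String, OAuthAttempt>,
}

impl OAuthAttemptStore {
    /// Bootstrap: remember the verifier and return the `state` value to
    /// append to the authorization URL.
    pub fn start(&mut self, attempt_id: &str, pkce_verifier: &str) -> String {
        self.pending.insert(
            attempt_id.to_string(),
            OAuthAttempt {
                pkce_verifier: pkce_verifier.to_string(),
            },
        );
        attempt_id.to_string()
    }

    /// Callback: look up the attempt by `state` and consume it, so the same
    /// callback cannot be replayed. Unknown states yield `None`.
    pub fn complete(&mut self, state: &str) -> Option<OAuthAttempt> {
        self.pending.remove(state)
    }
}
```

The verifier returned by `complete` is what the callback handler would send along with the authorization code during the token exchange.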
Smart Routing & Failover
The `ModelProviderRegistryConfig` manages how the daemon selects models for different tasks crates/palyra-daemon/src/model_provider.rs#175-187.
- Default Selection: Users can define `default_chat_model_id` and `default_embeddings_model_id`.
- Failover: When `failover_enabled` is true, the system can route requests to alternative models if the primary provider returns a retryable status code (429, 500, 502, 503, 504) crates/palyra-daemon/src/model_provider.rs#24, crates/palyra-daemon/src/model_provider.rs#181.
- Circuit Breaker: Monitored via `circuit_breaker_failure_threshold`. Once triggered, the provider is bypassed for `circuit_breaker_cooldown_ms` crates/palyra-daemon/src/model_provider.rs#159-160.
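A minimal sketch of the failover check and circuit breaker described above. The retryable status codes and the two setting names are taken from the text; the `CircuitBreaker` type, its method names, and counting consecutive (rather than windowed) failures are assumptions. Time is passed in as milliseconds so the logic stays deterministic.

```rust
/// Status codes the text lists as retryable.
pub fn is_retryable_status(status: u16) -> bool {
    matches!(status, 429 | 500 | 502 | 503 | 504)
}

/// Hypothetical breaker for a single provider.
#[derive(Debug)]
pub struct CircuitBreaker {
    failure_threshold: u32, // circuit_breaker_failure_threshold
    cooldown_ms: u64,       // circuit_breaker_cooldown_ms
    consecutive_failures: u32,
    opened_at_ms: Option<u64>,
}

impl CircuitBreaker {
    pub fn new(failure_threshold: u32, cooldown_ms: u64) -> Self {
        Self {
            failure_threshold,
            cooldown_ms,
            consecutive_failures: 0,
            opened_at_ms: None,
        }
    }

    /// True while the provider should be bypassed.
    pub fn is_open(&self, now_ms: u64) -> bool {
        match self.opened_at_ms {
            Some(opened) => now_ms.saturating_sub(opened) < self.cooldown_ms,
            None => false,
        }
    }

    /// Record a failed call; opens the breaker once the threshold is reached.
    /// After the cooldown, a single further failure reopens it immediately.
    pub fn record_failure(&mut self, now_ms: u64) {
        self.consecutive_failures += 1;
        if self.consecutive_failures >= self.failure_threshold {
            self.opened_at_ms = Some(now_ms);
        }
    }

    /// Record a successful call; closes the breaker and resets the count.
    pub fn record_success(&mut self) {
        self.consecutive_failures = 0;
        self.opened_at_ms = None;
    }
}
```

A router would typically skip any provider whose breaker is open and, when failover is enabled, try the next candidate whenever `is_retryable_status` returns true for the response.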
Cron-Triggered Agent Runs
Model integration extends to background tasks via the Cron system. Agents can be scheduled to run autonomously, utilizing the configured LLM providers.
Execution Flow
- Scheduler: The `CronMatcher` identifies jobs due for execution crates/palyra-daemon/src/cron.rs#139-158.
- Dispatch: The system triggers a `gateway_v1::RunStream` request with `SYSTEM_DAEMON_PRINCIPAL` crates/palyra-daemon/src/cron.rs#55.
- Context Injection: The cron job provides the necessary `RequestContext`, including the `agent_id` and associated `auth_profile_id` crates/palyra-daemon/src/cron.rs#29-33.
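The execution flow above can be sketched in a simplified form. The names `SYSTEM_DAEMON_PRINCIPAL`, `RequestContext`, `agent_id`, and `auth_profile_id` appear in the text; everything else is assumed, including the constant's value, the `CronJob` shape, the `principal` field, and the use of a precomputed `next_run_ms` in place of real cron expression matching.

```rust
/// Placeholder value for illustration; the real constant lives in cron.rs.
pub const SYSTEM_DAEMON_PRINCIPAL: &str = "system:daemon";

/// Hypothetical scheduled job with a precomputed next run time.
pub struct CronJob {
    pub agent_id: String,
    pub auth_profile_id: Option<String>,
    pub next_run_ms: u64,
}

/// Context attached to the run request. Fields beyond `agent_id` and
/// `auth_profile_id` are assumptions.
#[derive(Debug, PartialEq)]
pub struct RequestContext {
    pub principal: String,
    pub agent_id: String,
    pub auth_profile_id: Option<String>,
}

/// Select the jobs that are due and build the context each run would carry.
pub fn due_contexts(jobs: &[CronJob], now_ms: u64) -> Vec<RequestContext> {
    jobs.iter()
        .filter(|job| job.next_run_ms <= now_ms)
        .map(|job| RequestContext {
            principal: SYSTEM_DAEMON_PRINCIPAL.to_string(),
            agent_id: job.agent_id.clone(),
            auth_profile_id: job.auth_profile_id.clone(),
        })
        .collect()
}
```

Running every scheduled job under one system principal keeps background runs distinguishable from user-initiated ones, while the `auth_profile_id` still selects which provider credentials the agent uses.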
Configuration & Discovery
Providers and models are configured in the `model_provider` section of the daemon config crates/palyra-daemon/src/config/load.rs#23-28.
- Static Config: Manual entries in `config.toml`.
- Discovery: Models can be dynamically discovered from the provider’s API, controlled by `discovery_ttl_ms` crates/palyra-daemon/src/model_provider.rs#185.
- Health Checks: The system periodically probes provider health, governed by `health_ttl_ms` crates/palyra-daemon/src/model_provider.rs#186.
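One way to read the two TTL settings is as the lifetime of a cached result: a discovered model list or a health probe outcome is reused until it is older than the TTL, then refreshed. The sketch below assumes that interpretation; `TtlCache` is a hypothetical type, not part of the daemon.

```rust
/// Hypothetical single-value cache whose entry expires after `ttl_ms`.
pub struct TtlCache<T> {
    ttl_ms: u64, // e.g. discovery_ttl_ms or health_ttl_ms
    entry: Option<(u64, T)>, // (stored_at_ms, value)
}

impl<T> TtlCache<T> {
    pub fn new(ttl_ms: u64) -> Self {
        Self { ttl_ms, entry: None }
    }

    /// Returns the cached value while it is still fresh, otherwise `None`,
    /// which signals the caller to rediscover models or re-probe health.
    pub fn get(&self, now_ms: u64) -> Option<&T> {
        match &self.entry {
            Some((stored_at, value))
                if now_ms.saturating_sub(*stored_at) < self.ttl_ms =>
            {
                Some(value)
            }
            _ => None,
        }
    }

    /// Store a freshly fetched value along with its timestamp.
    pub fn put(&mut self, now_ms: u64, value: T) {
        self.entry = Some((now_ms, value));
    }
}
```

With separate caches for discovery and health, each can use its own TTL, so a slow-changing model list does not force frequent health probes or vice versa.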