Model Providers, Cron & Usage Governance

This section details the subsystems responsible for LLM integration, scheduled task execution, and financial/usage oversight within the Palyra daemon.

Model Providers & Registry

The model_provider.rs module manages the integration with external LLM providers (OpenAI, Anthropic, and OpenAI-compatible endpoints). It handles request normalization, credential resolution, and reliability patterns like circuit breaking and failover.

Provider Architecture

Palyra uses a registry-based approach in which multiple providers and models can be configured simultaneously. The system supports three ModelProviderKind variants: Deterministic, OpenAiCompatible, and Anthropic crates/palyra-daemon/src/model_provider.rs#37-43. Models are categorized by ProviderModelRole:

Chat: Standard completion and tool-calling interfaces crates/palyra-daemon/src/model_provider.rs#77.
Embeddings: Vector generation for RAG and memory crates/palyra-daemon/src/model_provider.rs#78.
AudioTranscription: Converting speech to text crates/palyra-daemon/src/model_provider.rs#79.
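As a rough illustration of the registry idea, the sketch below indexes models by role. The enum variant names come from the text above; the `ModelEntry` and `Registry` types, their fields, and the lookup method are hypothetical and are not taken from model_provider.rs.

```rust
use std::collections::HashMap;

/// Provider kinds named in the text; any variant payloads are omitted.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum ModelProviderKind {
    Deterministic,
    OpenAiCompatible,
    Anthropic,
}

/// Roles a configured model can fill.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum ProviderModelRole {
    Chat,
    Embeddings,
    AudioTranscription,
}

/// Hypothetical registry entry: one model offered by one provider.
#[derive(Debug, Clone, PartialEq)]
struct ModelEntry {
    model_id: String,
    provider: ModelProviderKind,
    role: ProviderModelRole,
}

/// Hypothetical registry that groups entries by role.
#[derive(Default)]
struct Registry {
    by_role: HashMap<ProviderModelRole, Vec<ModelEntry>>,
}

impl Registry {
    fn register(&mut self, entry: ModelEntry) {
        self.by_role.entry(entry.role).or_default().push(entry);
    }

    /// Returns every model registered for `role`, in registration order.
    fn models_for(&self, role: ProviderModelRole) -> &[ModelEntry] {
        self.by_role.get(&role).map(Vec::as_slice).unwrap_or(&[])
    }
}
```

Grouping by role keeps a chat request from ever being matched against an embeddings-only model, which is one plausible reason for the role split described above.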

Reliability & Performance

The registry implements several patterns to ensure high availability:

Circuit Breaker: Tracks failures per provider and enters a cooldown state if a threshold is met crates/palyra-daemon/src/model_provider.rs#159-160.
Failover: If enabled, the system can automatically route requests to alternative models if the primary fails crates/palyra-daemon/src/model_provider.rs#181.
Response Caching: Optional TTL-based caching for model responses to reduce costs and latency crates/palyra-daemon/src/model_provider.rs#182-184.
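The circuit-breaker pattern can be sketched as follows. This is a minimal illustration under assumptions, not the daemon's implementation: the `CircuitBreaker` type, its field names, and the choice to count consecutive failures (rather than failures within a window) are all hypothetical.

```rust
use std::time::{Duration, Instant};

/// Hypothetical per-provider breaker: opens after `threshold` consecutive
/// failures and rejects requests until `cooldown` has elapsed.
struct CircuitBreaker {
    threshold: u32,
    cooldown: Duration,
    consecutive_failures: u32,
    opened_at: Option<Instant>,
}

impl CircuitBreaker {
    fn new(threshold: u32, cooldown: Duration) -> Self {
        Self { threshold, cooldown, consecutive_failures: 0, opened_at: None }
    }

    /// True when a request may be sent at time `now`.
    fn allows(&self, now: Instant) -> bool {
        match self.opened_at {
            // Open: allow a retry only once the cooldown has passed.
            Some(opened) => now.duration_since(opened) >= self.cooldown,
            // Closed: always allow.
            None => true,
        }
    }

    /// Counts a failure and opens (or re-opens) the breaker at the threshold.
    fn record_failure(&mut self, now: Instant) {
        self.consecutive_failures += 1;
        if self.consecutive_failures >= self.threshold {
            self.opened_at = Some(now);
        }
    }

    /// Any success closes the breaker and resets the failure count.
    fn record_success(&mut self) {
        self.consecutive_failures = 0;
        self.opened_at = None;
    }
}
```

Passing `now` explicitly instead of calling `Instant::now()` inside the methods keeps the breaker deterministic under test.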

Model Selection Logic

The following diagram illustrates how the system selects a model and provider for a specific request.

[Diagram: Model Selection & Execution Flow]

Sources: crates/palyra-daemon/src/model_provider.rs#144-187, crates/palyra-daemon/src/model_provider.rs#11-14
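A minimal sketch of primary-then-failover selection, assuming each candidate carries a flag saying whether its provider is currently accepting requests. The `Candidate` type, the `select_model` function, and the rule that fallbacks are tried in the order given are hypothetical; the actual selection logic may weigh additional inputs.

```rust
/// Hypothetical candidate: a model id plus whether its provider is currently
/// accepting requests (for example, it is not in a cooldown state).
struct Candidate<'a> {
    model_id: &'a str,
    available: bool,
}

/// Picks the primary when it is available; otherwise, when failover is
/// enabled, picks the first available fallback in the order given.
fn select_model<'a>(
    primary: &Candidate<'a>,
    fallbacks: &[Candidate<'a>],
    failover_enabled: bool,
) -> Option<&'a str> {
    if primary.available {
        return Some(primary.model_id);
    }
    if !failover_enabled {
        return None;
    }
    fallbacks.iter().find(|c| c.available).map(|c| c.model_id)
}
```

Returning `None` rather than an unavailable model makes the "no usable model" case explicit for the caller.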

Cron & Scheduled Runs

The cron.rs module implements a high-precision scheduler for automated tasks, including recurring agent runs, memory maintenance, and system health checks.

Scheduler Implementation

The scheduler runs as a background loop, waking up periodically to check for due jobs crates/palyra-daemon/src/cron.rs#42. It supports three schedule types:

Cron: Standard 5-field cron expressions crates/palyra-daemon/src/cron.rs#123.
Every: Fixed interval-based execution (e.g., every 5 minutes) crates/palyra-daemon/src/cron.rs#124.
At: One-time execution at a specific timestamp crates/palyra-daemon/src/cron.rs#125.
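The three schedule types map naturally onto an enum. In the sketch below the variant names follow the text, while the payload types, the use of Unix seconds, and the helper functions are assumptions. Cron expressions are only checked for field count, not evaluated.

```rust
/// Schedule kinds named in the text. Timestamps are Unix seconds here,
/// which is an assumption of this sketch.
#[derive(Debug, Clone, PartialEq)]
enum Schedule {
    /// Raw 5-field cron expression; evaluating it is out of scope here.
    Cron(String),
    /// Fixed interval in seconds.
    Every(u64),
    /// One-time execution at the given timestamp.
    At(u64),
}

/// Checks only that an expression has exactly five whitespace-separated fields.
fn has_five_fields(expr: &str) -> bool {
    expr.split_whitespace().count() == 5
}

/// Next due time given the previous run (if any), evaluated at `now`.
/// Returns None for a completed one-time job and for Cron, which this
/// sketch does not evaluate.
fn next_due(schedule: &Schedule, last_run: Option<u64>, now: u64) -> Option<u64> {
    match schedule {
        // A job that never ran is due immediately; otherwise one interval later.
        Schedule::Every(secs) => Some(last_run.map_or(now, |t| t + *secs)),
        Schedule::At(ts) => {
            if last_run.is_none() {
                Some(*ts)
            } else {
                None
            }
        }
        Schedule::Cron(_) => None,
    }
}
```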

System Maintenance Tasks

The daemon registers several internal cron jobs for self-management:

Memory Maintenance: Runs every 5 minutes to prune or optimize the journal crates/palyra-daemon/src/cron.rs#56.
Embeddings Backfill: Runs every 10 minutes to ensure all journal entries have vector representations crates/palyra-daemon/src/cron.rs#57.
Skill Re-audit: Periodically verifies the integrity of installed WASM skills crates/palyra-daemon/src/cron.rs#54.
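One way to picture the periodic wake-up is a tick function that collects whichever internal jobs are due. The 5-minute and 10-minute intervals come from the list above; the `SystemJob` type, the job name strings, and the seconds-since-start clock are hypothetical. The skill re-audit job is left out because its interval is not stated.

```rust
use std::time::Duration;

/// Hypothetical internal job: a name, a fixed interval, and the time it last
/// ran (seconds since daemon start in this sketch).
struct SystemJob {
    name: &'static str,
    interval: Duration,
    last_run_secs: u64,
}

/// Returns the names of jobs whose interval has elapsed at `now_secs`
/// and marks each of them as run at `now_secs`.
fn collect_due(jobs: &mut [SystemJob], now_secs: u64) -> Vec<&'static str> {
    let mut due = Vec::new();
    for job in jobs.iter_mut() {
        if now_secs.saturating_sub(job.last_run_secs) >= job.interval.as_secs() {
            job.last_run_secs = now_secs;
            due.push(job.name);
        }
    }
    due
}

/// Intervals from the text: 5 minutes and 10 minutes.
fn system_jobs() -> Vec<SystemJob> {
    vec![
        SystemJob {
            name: "memory_maintenance",
            interval: Duration::from_secs(5 * 60),
            last_run_secs: 0,
        },
        SystemJob {
            name: "embeddings_backfill",
            interval: Duration::from_secs(10 * 60),
            last_run_secs: 0,
        },
    ]
}
```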

[Diagram: Cron Dispatch Pipeline]

Sources: crates/palyra-daemon/src/cron.rs#158-174, crates/palyra-daemon/src/cron.rs#29-39

Usage Governance & Budgeting

The usage_governance.rs module provides a policy engine for tracking token consumption and enforcing financial limits across different scopes (e.g., per user, per session, or global).

Budget Policies

Administrators can define UsageBudgetPolicyRecord entries that specify:

Metric Kind: tokens or usd_cost crates/palyra-daemon/src/usage_governance.rs#96.
Interval: daily, weekly, or monthly crates/palyra-daemon/src/usage_governance.rs#97.
Limits: Both soft_limit (alerts only) and hard_limit (blocks execution) crates/palyra-daemon/src/usage_governance.rs#105-107.
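The soft/hard distinction can be sketched as a three-way classification. The field names `soft_limit` and `hard_limit` come from the text; the `BudgetLimits` and `BudgetStatus` types, the `f64` representation, and the strict "greater than" comparison are assumptions of this sketch.

```rust
/// Hypothetical subset of a budget policy. Values are in the policy's metric
/// (tokens or USD); `None` means no limit of that kind is set.
struct BudgetLimits {
    soft_limit: Option<f64>,
    hard_limit: Option<f64>,
}

#[derive(Debug, PartialEq)]
enum BudgetStatus {
    /// Within both limits.
    Ok,
    /// Soft limit exceeded: raise an alert, still allow the run.
    SoftLimitExceeded,
    /// Hard limit exceeded: block the run.
    HardLimitExceeded,
}

/// Classifies projected usage: what was consumed in the current interval
/// plus the estimate for the pending run.
fn evaluate(limits: &BudgetLimits, consumed: f64, projected_delta: f64) -> BudgetStatus {
    let projected = consumed + projected_delta;
    if limits.hard_limit.map_or(false, |h| projected > h) {
        BudgetStatus::HardLimitExceeded
    } else if limits.soft_limit.map_or(false, |s| projected > s) {
        BudgetStatus::SoftLimitExceeded
    } else {
        BudgetStatus::Ok
    }
}
```

Checking the hard limit first ensures a run that crosses both thresholds is reported as blocked rather than merely alerted.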

Smart Routing

The system can dynamically select models based on the complexity of the prompt and the current budget status via RoutingMode:

Suggest: Recommends a model but doesn’t enforce it crates/palyra-daemon/src/usage_governance.rs#29.
DryRun: Logs what would have happened under enforcement crates/palyra-daemon/src/usage_governance.rs#30.
Enforced: Actively overrides model selection to stay within budget or optimize for cost/latency crates/palyra-daemon/src/usage_governance.rs#31.
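The difference between the three modes is easiest to see in terms of which model ends up serving the request. The mode names below follow the text; the `Outcome` type, its fields, and `apply_routing` are hypothetical and only illustrate the described behavior.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum RoutingMode {
    Suggest,
    DryRun,
    Enforced,
}

/// Hypothetical result of applying a routing recommendation.
#[derive(Debug, PartialEq)]
struct Outcome<'a> {
    /// Model that will actually serve the request.
    effective_model: &'a str,
    /// Recommendation surfaced to the caller, if any.
    suggestion: Option<&'a str>,
    /// Whether a "would have overridden" record should be logged.
    log_would_override: bool,
}

fn apply_routing<'a>(
    mode: RoutingMode,
    requested: &'a str,
    recommended: &'a str,
) -> Outcome<'a> {
    let differs = requested != recommended;
    match mode {
        // Keep the requested model; surface the recommendation only.
        RoutingMode::Suggest => Outcome {
            effective_model: requested,
            suggestion: differs.then_some(recommended),
            log_would_override: false,
        },
        // Keep the requested model; record what enforcement would have done.
        RoutingMode::DryRun => Outcome {
            effective_model: requested,
            suggestion: None,
            log_would_override: differs,
        },
        // Override the requested model with the recommendation.
        RoutingMode::Enforced => Outcome {
            effective_model: recommended,
            suggestion: None,
            log_would_override: false,
        },
    }
}
```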

Data Structures

Struct	Purpose
`PricingEstimate`	Calculates projected cost for a run based on input/output tokens crates/palyra-daemon/src/usage_governance.rs#69-77.
`UsageBudgetEvaluation`	The result of checking a run against active policies crates/palyra-daemon/src/usage_governance.rs#92-109.
`RoutingDecision`	Final determination of which model to use and why crates/palyra-daemon/src/usage_governance.rs#112-130.

Sources: crates/palyra-daemon/src/usage_governance.rs#8-13, crates/palyra-daemon/src/usage_governance.rs#112-130
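A cost projection from input and output token counts might look like the sketch below. The `ModelRates` type, the per-million-token pricing unit, and the function name are assumptions; the fields of the real `PricingEstimate` struct are not shown in this section.

```rust
/// Hypothetical per-model rates in USD per one million tokens.
struct ModelRates {
    input_usd_per_mtok: f64,
    output_usd_per_mtok: f64,
}

/// Projected cost in USD for a run with the given token counts.
fn estimate_cost_usd(rates: &ModelRates, input_tokens: u64, output_tokens: u64) -> f64 {
    let input = input_tokens as f64 / 1_000_000.0 * rates.input_usd_per_mtok;
    let output = output_tokens as f64 / 1_000_000.0 * rates.output_usd_per_mtok;
    input + output
}
```

For example, with rates of 2.0 and 8.0 USD per million tokens, a run with 500,000 input tokens and 250,000 output tokens is projected at 1.0 + 2.0 = 3.0 USD.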

Configuration Loading

Configuration is loaded via config/load.rs, which merges defaults with the palyra.toml file and environment variables.

Loading Sequence

Search: Looks for palyra.toml in standard paths crates/palyra-daemon/src/config/load.rs#50.
Parse & Migrate: Parses TOML and applies version migrations if the file is from an older schema version crates/palyra-daemon/src/config/load.rs#53-58.
Credential Resolution: Resolves api_key_vault_ref into actual keys using the Vault system crates/palyra-daemon/src/model_provider.rs#154.
Runtime State: The LoadedConfig is used to build the AppState crates/palyra-daemon/src/app/runtime.rs#44-48.
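The merge of defaults, file, and environment can be pictured as layered overlays. The text does not state the precedence order, so the order used here (defaults, then file, then environment, with later layers winning) is an assumption, as are the `ConfigLayer` type and its two example fields.

```rust
/// Hypothetical partial configuration layer; `None` means "not set here".
#[derive(Debug, Clone, Default, PartialEq)]
struct ConfigLayer {
    default_chat_model: Option<String>,
    failover_enabled: Option<bool>,
}

impl ConfigLayer {
    /// Overlays `higher` on `self`: a value set in `higher` wins,
    /// otherwise the value from `self` is kept.
    fn overlay(self, higher: ConfigLayer) -> ConfigLayer {
        ConfigLayer {
            default_chat_model: higher.default_chat_model.or(self.default_chat_model),
            failover_enabled: higher.failover_enabled.or(self.failover_enabled),
        }
    }
}

/// Merges layers in the assumed order: defaults < file < environment.
fn merge(defaults: ConfigLayer, file: ConfigLayer, env: ConfigLayer) -> ConfigLayer {
    defaults.overlay(file).overlay(env)
}
```

Using `Option` per field lets each layer stay silent about settings it does not define, so an empty environment layer leaves the file's values untouched.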

Model CLI Integration

The palyra CLI provides commands to inspect and modify these configurations:

models status: Shows current provider health and configured defaults crates/palyra-cli/src/commands/models.rs#196-199.
models list: Catalogs all available models from all registered providers crates/palyra-cli/src/commands/models.rs#200-203.
models explain: Provides a trace of the model selection logic for a specific prompt crates/palyra-cli/src/commands/models.rs#152-162.

Sources: crates/palyra-daemon/src/config/load.rs#31-49, crates/palyra-cli/src/commands/models.rs#1-25
