This section details the subsystems responsible for LLM integration, scheduled task execution, and financial/usage oversight within the Palyra daemon.

Model Providers & Registry

The model_provider.rs module manages the integration with external LLM providers (OpenAI, Anthropic, and OpenAI-compatible endpoints). It handles request normalization, credential resolution, and reliability patterns like circuit breaking and failover.

Provider Architecture

Palyra uses a registry-based approach in which multiple providers and models can be configured simultaneously. The system supports three ModelProviderKind variants: Deterministic, OpenAiCompatible, and Anthropic (crates/palyra-daemon/src/model_provider.rs#37-43). Each model is also categorized by a ProviderModelRole.
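
To make the registry idea concrete, here is a minimal sketch of several providers registered side by side and looked up by model. Only the three ModelProviderKind variant names come from the text above; ProviderEntry, ProviderRegistry, their fields, and the model ids are hypothetical names chosen for illustration and may not match the actual module.

```rust
use std::collections::HashMap;

/// The three provider kinds named in the text.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ModelProviderKind {
    Deterministic,
    OpenAiCompatible,
    Anthropic,
}

/// Hypothetical registry entry; the field names are assumptions.
#[derive(Debug, Clone)]
pub struct ProviderEntry {
    pub kind: ModelProviderKind,
    pub models: Vec<String>,
}

/// Hypothetical registry keyed by provider id.
#[derive(Debug, Default)]
pub struct ProviderRegistry {
    providers: HashMap<String, ProviderEntry>,
}

impl ProviderRegistry {
    /// Adds or replaces the provider stored under `id`.
    pub fn register(&mut self, id: &str, entry: ProviderEntry) {
        self.providers.insert(id.to_string(), entry);
    }

    /// Returns the ids of all providers that offer `model`, sorted for stable output.
    pub fn providers_for(&self, model: &str) -> Vec<&str> {
        let mut ids: Vec<&str> = self
            .providers
            .iter()
            .filter(|(_, entry)| entry.models.iter().any(|m| m == model))
            .map(|(id, _)| id.as_str())
            .collect();
        ids.sort();
        ids
    }
}
```

Keying the registry by provider id lets the same model id appear under more than one provider, which is what makes failover between providers possible.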

Reliability & Performance

The registry implements several reliability patterns, including circuit breaking and failover, to keep model access highly available.
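
As an illustration of the circuit-breaking pattern mentioned in this section, the sketch below opens the circuit after a number of consecutive failures and allows traffic again once a cooldown has elapsed. This is a generic version of the pattern, not Palyra's implementation: the CircuitBreaker type, its fields, and the threshold and cooldown parameters are all assumptions.

```rust
use std::time::{Duration, Instant};

/// Generic consecutive-failure circuit breaker (hypothetical, for illustration).
#[derive(Debug)]
pub struct CircuitBreaker {
    failure_threshold: u32,
    cooldown: Duration,
    consecutive_failures: u32,
    opened_at: Option<Instant>,
}

impl CircuitBreaker {
    pub fn new(failure_threshold: u32, cooldown: Duration) -> Self {
        Self {
            failure_threshold,
            cooldown,
            consecutive_failures: 0,
            opened_at: None,
        }
    }

    /// Returns true if a request may be sent at `now`.
    /// While the circuit is open, requests are rejected until the cooldown has elapsed.
    pub fn allows(&self, now: Instant) -> bool {
        match self.opened_at {
            None => true,
            Some(opened) => now.saturating_duration_since(opened) >= self.cooldown,
        }
    }

    /// A success closes the circuit and resets the failure counter.
    pub fn record_success(&mut self) {
        self.consecutive_failures = 0;
        self.opened_at = None;
    }

    /// A failure increments the counter and opens the circuit at the threshold.
    pub fn record_failure(&mut self, now: Instant) {
        self.consecutive_failures += 1;
        if self.consecutive_failures >= self.failure_threshold {
            self.opened_at = Some(now);
        }
    }
}
```

Passing `now` in explicitly, rather than calling `Instant::now()` inside the breaker, keeps the state machine deterministic and easy to test.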

Model Selection Logic

The following diagram illustrates how the system selects a model and provider for a specific request.

[Diagram: Model Selection & Execution Flow]

Sources: crates/palyra-daemon/src/model_provider.rs#144-187, crates/palyra-daemon/src/model_provider.rs#11-14
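
The selection logic itself is not reproduced on this page. As a rough sketch of what a failover-style selection and execution flow can look like, the function below walks an ordered candidate list, skips candidates whose circuit is open, and returns the first successful response. Candidate, execute_with_failover, and the error strings are hypothetical and are not taken from model_provider.rs.

```rust
/// Hypothetical candidate produced by the selection step.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Candidate {
    pub provider_id: String,
    pub model_id: String,
    /// True if the provider's circuit breaker currently rejects traffic.
    pub circuit_open: bool,
}

/// Tries candidates in order and returns `(provider_id, response)` for the
/// first success, or the collected per-candidate errors if all of them fail.
pub fn execute_with_failover<F>(
    candidates: &[Candidate],
    mut call: F,
) -> Result<(String, String), Vec<String>>
where
    F: FnMut(&Candidate) -> Result<String, String>,
{
    let mut errors = Vec::new();
    for candidate in candidates {
        if candidate.circuit_open {
            errors.push(format!("{}: skipped (circuit open)", candidate.provider_id));
            continue;
        }
        match call(candidate) {
            Ok(response) => return Ok((candidate.provider_id.clone(), response)),
            Err(err) => errors.push(format!("{}: {}", candidate.provider_id, err)),
        }
    }
    Err(errors)
}
```

Returning every per-candidate error, instead of only the last one, makes it easier to see why a request fell through the whole chain.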

Cron & Scheduled Runs

The cron.rs module implements a high-precision scheduler for automated tasks, including recurring agent runs, memory maintenance, and system health checks.

Scheduler Implementation

The scheduler runs as a background loop that wakes periodically to check for due jobs (crates/palyra-daemon/src/cron.rs#42). It supports three schedule types:
  1. Cron: Standard 5-field cron expressions (crates/palyra-daemon/src/cron.rs#123).
  2. Every: Fixed interval-based execution, e.g., every 5 minutes (crates/palyra-daemon/src/cron.rs#124).
  3. At: One-time execution at a specific timestamp (crates/palyra-daemon/src/cron.rs#125).
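
The three schedule types can be modelled as an enum. The sketch below is an assumption-laden illustration: the Schedule type, its field names, and the use of Unix seconds are hypothetical, and cron expressions are only checked for their field count rather than evaluated.

```rust
/// Hypothetical model of the three schedule types described above.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Schedule {
    /// Standard 5-field cron expression (minute, hour, day of month, month, day of week).
    Cron(String),
    /// Fixed interval between runs, in seconds.
    Every { interval_secs: u64 },
    /// One-time run at a Unix timestamp, in seconds.
    At { unix_secs: u64 },
}

impl Schedule {
    /// Returns the next due time in Unix seconds, if this sketch can compute one.
    /// `last_run` is the time of the previous run, if any.
    pub fn next_due(&self, last_run: Option<u64>, now: u64) -> Option<u64> {
        match self {
            // Interval jobs run immediately the first time, then every `interval_secs`.
            Schedule::Every { interval_secs } => match last_run {
                Some(last) => Some(last.saturating_add(*interval_secs)),
                None => Some(now),
            },
            // One-time jobs are never due again after they have run once.
            Schedule::At { unix_secs } => {
                if last_run.is_some() {
                    None
                } else {
                    Some(*unix_secs)
                }
            }
            // Evaluating cron expressions is out of scope for this sketch.
            Schedule::Cron(_) => None,
        }
    }
}

/// Checks only that a cron expression has exactly five whitespace-separated fields.
pub fn has_five_fields(expr: &str) -> bool {
    expr.split_whitespace().count() == 5
}
```

A background loop would compare `next_due` against the current time on each wake-up and dispatch any job whose due time has passed.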

System Maintenance Tasks

The daemon registers several internal cron jobs for self-management, such as memory maintenance and system health checks.

[Diagram: Cron Dispatch Pipeline]

Sources: crates/palyra-daemon/src/cron.rs#158-174, crates/palyra-daemon/src/cron.rs#29-39

Usage Governance & Budgeting

The usage_governance.rs module provides a policy engine for tracking token consumption and enforcing financial limits across different scopes (e.g., per user, per session, or global).

Budget Policies

Administrators can define UsageBudgetPolicyRecord entries that specify the limits to enforce.
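
The fields of UsageBudgetPolicyRecord are not listed on this page. As a hedged sketch of how a scoped token budget could be checked before a run, the example below uses hypothetical types (BudgetScope, BudgetPolicy, BudgetVerdict) and a simple token ceiling; the real record may track cost, time windows, or other limits.

```rust
/// Hypothetical scopes, mirroring the per-user, per-session, and global examples.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum BudgetScope {
    Global,
    User(String),
    Session(String),
}

/// Hypothetical policy: a token ceiling for one scope.
#[derive(Debug, Clone)]
pub struct BudgetPolicy {
    pub scope: BudgetScope,
    pub max_tokens: u64,
}

/// Outcome of checking one run against one policy.
#[derive(Debug, PartialEq, Eq)]
pub enum BudgetVerdict {
    Allowed { remaining_tokens: u64 },
    Denied { over_by_tokens: u64 },
}

/// Checks whether `requested_tokens` still fits under the policy ceiling,
/// given the `used_tokens` already consumed in the policy's scope.
pub fn evaluate(policy: &BudgetPolicy, used_tokens: u64, requested_tokens: u64) -> BudgetVerdict {
    let projected = used_tokens.saturating_add(requested_tokens);
    if projected <= policy.max_tokens {
        BudgetVerdict::Allowed {
            remaining_tokens: policy.max_tokens - projected,
        }
    } else {
        BudgetVerdict::Denied {
            over_by_tokens: projected - policy.max_tokens,
        }
    }
}
```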

Smart Routing

The system can dynamically select a model based on the complexity of the prompt and the current budget status. The routing strategy is controlled by RoutingMode.
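
The RoutingMode variants are not listed on this page, so the following is only a sketch of the general idea: choose a cheaper model tier unless the prompt looks complex and enough budget remains. RoutingModeSketch, ModelTier, the variant names, and both thresholds are invented for illustration and should not be read as Palyra's actual routing rules.

```rust
/// Hypothetical routing modes (not the real RoutingMode variants).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RoutingModeSketch {
    /// Always use the premium tier, regardless of prompt or budget.
    AlwaysPremium,
    /// Pick a tier from prompt size and remaining budget.
    Adaptive,
}

/// Hypothetical model tiers.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ModelTier {
    Economy,
    Premium,
}

/// Chooses a tier. `budget_remaining` is the unspent fraction of the budget (0.0 to 1.0).
/// Prompt length in characters is used here as a crude stand-in for complexity.
pub fn choose_tier(mode: RoutingModeSketch, prompt_chars: usize, budget_remaining: f64) -> ModelTier {
    match mode {
        RoutingModeSketch::AlwaysPremium => ModelTier::Premium,
        RoutingModeSketch::Adaptive => {
            let complex = prompt_chars > 2_000; // assumed threshold
            let budget_ok = budget_remaining > 0.2; // assumed threshold
            if complex && budget_ok {
                ModelTier::Premium
            } else {
                ModelTier::Economy
            }
        }
    }
}
```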

Data Structures

| Struct | Purpose |
| --- | --- |
| PricingEstimate | Calculates projected cost for a run based on input/output tokens (crates/palyra-daemon/src/usage_governance.rs#69-77). |
| UsageBudgetEvaluation | The result of checking a run against active policies (crates/palyra-daemon/src/usage_governance.rs#92-109). |
| RoutingDecision | Final determination of which model to use and why (crates/palyra-daemon/src/usage_governance.rs#112-130). |

Sources: crates/palyra-daemon/src/usage_governance.rs#8-13, crates/palyra-daemon/src/usage_governance.rs#112-130
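
To illustrate the kind of calculation a pricing estimate involves, the sketch below projects the cost of a run from input and output token counts. The TokenPrices type, the per-1,000-token pricing unit, and the use of integer micro-dollars are assumptions made for this example; the actual PricingEstimate struct may be shaped differently.

```rust
/// Hypothetical price table: micro-dollars per 1,000 tokens.
#[derive(Debug, Clone, Copy)]
pub struct TokenPrices {
    pub input_micros_per_1k: u64,
    pub output_micros_per_1k: u64,
}

/// Projects the cost of a run in micro-dollars.
/// Integer arithmetic avoids floating-point rounding drift; the result is
/// truncated toward zero and saturates at `u64::MAX`.
pub fn estimate_cost_micros(prices: TokenPrices, input_tokens: u64, output_tokens: u64) -> u64 {
    let input = u128::from(input_tokens) * u128::from(prices.input_micros_per_1k);
    let output = u128::from(output_tokens) * u128::from(prices.output_micros_per_1k);
    let total = input.saturating_add(output) / 1_000;
    if total > u128::from(u64::MAX) {
        u64::MAX
    } else {
        total as u64
    }
}
```

With assumed prices of 2,000 and 6,000 micro-dollars per 1,000 input and output tokens, a run of 1,500 input and 500 output tokens comes to (1,500 × 2,000 + 500 × 6,000) / 1,000 = 6,000 micro-dollars.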

Configuration Loading

Configuration is loaded via config/load.rs, which merges defaults with the palyra.toml file and environment variables.
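
The layered merge can be sketched as a base of defaults with optional overlays applied on top. The precedence used here (defaults, then file, then environment) is an assumption, as are the Settings and Overlay types, their fields, and the default values; TOML parsing and environment lookup are left out so the example stays self-contained.

```rust
/// Hypothetical resolved settings; the field names are illustrative only.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Settings {
    pub bind_addr: String,
    pub default_model: String,
}

/// Hypothetical partial settings from one source (file or environment).
/// `None` means the source did not set that value.
#[derive(Debug, Default, Clone)]
pub struct Overlay {
    pub bind_addr: Option<String>,
    pub default_model: Option<String>,
}

impl Settings {
    /// Built-in defaults (assumed values).
    pub fn defaults() -> Self {
        Self {
            bind_addr: "127.0.0.1:8080".to_string(),
            default_model: "model-a".to_string(),
        }
    }

    /// Applies an overlay: every value the overlay sets replaces the current one.
    pub fn apply(mut self, overlay: Overlay) -> Self {
        if let Some(value) = overlay.bind_addr {
            self.bind_addr = value;
        }
        if let Some(value) = overlay.default_model {
            self.default_model = value;
        }
        self
    }
}

/// Merges in the assumed order: defaults, then file, then environment.
pub fn load(file: Overlay, env: Overlay) -> Settings {
    Settings::defaults().apply(file).apply(env)
}
```

Modelling each source as an overlay of optional values keeps "not set" distinct from "set to the default", which is what allows a later layer to override an earlier one selectively.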

Loading Sequence

  1. Search: Looks for palyra.toml in standard paths (crates/palyra-daemon/src/config/load.rs#50).
  2. Parse & Migrate: Parses the TOML and applies version migrations if the file uses an older schema version (crates/palyra-daemon/src/config/load.rs#53-58).
  3. Credential Resolution: Resolves api_key_vault_ref into actual keys using the Vault system (crates/palyra-daemon/src/model_provider.rs#154).
  4. Runtime State: The LoadedConfig is used to build the AppState (crates/palyra-daemon/src/app/runtime.rs#44-48).

Model CLI Integration

The palyra CLI provides commands to inspect and modify these configurations.

Sources: crates/palyra-daemon/src/config/load.rs#31-49, crates/palyra-cli/src/commands/models.rs#1-25