This page covers the configuration and management of Large Language Model (LLM) providers, the secure storage and rotation of their credentials via Auth Profiles, and the governance framework that handles smart routing, cost tracking, and budget enforcement.

Model Provider Configuration

Palyra supports multiple model providers through a unified registry. The system abstracts provider-specific APIs (OpenAI, Anthropic) into a common internal representation for chat completions, embeddings, and audio transcriptions.

Supported Provider Kinds

The daemon recognizes three primary provider types:

Registry and Model Metadata

The ModelProviderRegistryConfig manages a collection of ProviderRegistryEntryConfig (the “where” and “how” of a connection) and ProviderModelEntryConfig (the specific models available) crates/palyra-daemon/src/model_provider.rs#144-187.
| Entity | Purpose | Key Fields |
| --- | --- | --- |
| Provider Entry | Connection settings | base_url, auth_profile_id, max_retries, circuit_breaker |
| Model Entry | Model capabilities | role (Chat/Embed), capabilities (Vision/Tools), metadata_source |
Sources: crates/palyra-daemon/src/model_provider.rs#37-187, crates/palyra-daemon/src/config/load.rs#23-28
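
To make the split between provider entries and model entries concrete, here is a minimal Rust sketch. The field names base_url, auth_profile_id, max_retries, role, and capabilities come from the table above; everything else (the type names ProviderEntry, ModelEntry, and Registry, the provider_id and model_id fields, the field types, and the models_with helper) is a hypothetical stand-in and not the actual definitions in model_provider.rs.

```rust
// Illustrative only: these types approximate the registry described above.
// Names and types not listed in the table are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub enum ModelRole {
    Chat,
    Embed,
}

#[derive(Debug, Clone, PartialEq)]
pub enum Capability {
    Vision,
    Tools,
}

/// The "where" and "how" of a connection.
#[derive(Debug, Clone)]
pub struct ProviderEntry {
    pub provider_id: String, // assumed identifier linking models to providers
    pub base_url: String,
    pub auth_profile_id: Option<String>,
    pub max_retries: u32,
}

/// A specific model offered by one provider entry.
#[derive(Debug, Clone)]
pub struct ModelEntry {
    pub model_id: String,    // assumed identifier
    pub provider_id: String, // assumed back-reference to ProviderEntry
    pub role: ModelRole,
    pub capabilities: Vec<Capability>,
}

#[derive(Debug, Default)]
pub struct Registry {
    pub providers: Vec<ProviderEntry>,
    pub models: Vec<ModelEntry>,
}

impl Registry {
    /// Return the IDs of models that have the given role and capability
    /// and whose provider entry is actually present in the registry.
    pub fn models_with(&self, role: ModelRole, cap: Capability) -> Vec<&str> {
        self.models
            .iter()
            .filter(|m| m.role == role && m.capabilities.contains(&cap))
            .filter(|m| self.providers.iter().any(|p| p.provider_id == m.provider_id))
            .map(|m| m.model_id.as_str())
            .collect()
    }
}
```

Keeping connection settings on the provider entry means several model entries can share one base_url and one auth profile, which matches the separation the table describes.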

Auth Profile Registry

The Auth Profile system separates identity and credentials from model configuration. This allows multiple agents or components to share a single set of credentials or rotate them without modifying the provider registry.

Credential Types and Storage

Credentials are never stored in plain text in the configuration files. Instead, they are persisted in the palyra-vault and referenced by a VaultRef crates/palyra-daemon/src/openai_surface.rs#42-48.
  1. API Key: A static bearer token stored in the vault crates/palyra-auth/src/lib.rs#10-16.
  2. OAuth2: Managed profiles that support the openid, profile, and offline_access scopes, allowing for automatic token refresh apps/web/src/console/hooks/useAuthDomain.ts#15-16.
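
The indirection described above (configuration holds a reference, the vault holds the secret) can be sketched as follows. This is an assumption-laden illustration: the real VaultRef layout and the palyra-vault API are not shown on this page, so VaultRef as a string wrapper, the Credential enum, InMemoryVault, and bearer_header are all hypothetical names.

```rust
use std::collections::HashMap;

/// Hypothetical reference to a vault entry; the real VaultRef may differ.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct VaultRef(pub String);

/// The two credential types listed above, holding references only.
#[derive(Debug, Clone, PartialEq)]
pub enum Credential {
    ApiKey { key: VaultRef },
    OAuth2 { refresh_token: VaultRef, scopes: Vec<String> },
}

/// Stand-in for the vault: an in-memory map, for illustration only.
pub struct InMemoryVault {
    secrets: HashMap<VaultRef, String>,
}

impl InMemoryVault {
    pub fn new() -> Self {
        Self { secrets: HashMap::new() }
    }

    /// Store a secret and hand back the reference that configuration may keep.
    pub fn put(&mut self, name: &str, secret: &str) -> VaultRef {
        let reference = VaultRef(name.to_string());
        self.secrets.insert(reference.clone(), secret.to_string());
        reference
    }

    /// Look up the secret behind a reference, if it exists.
    pub fn resolve(&self, reference: &VaultRef) -> Option<&str> {
        self.secrets.get(reference).map(|s| s.as_str())
    }
}

/// Build an Authorization header value for an API-key credential at request time.
pub fn bearer_header(vault: &InMemoryVault, cred: &Credential) -> Option<String> {
    match cred {
        Credential::ApiKey { key } => vault.resolve(key).map(|s| format!("Bearer {}", s)),
        // Token refresh for OAuth2 profiles is out of scope for this sketch.
        Credential::OAuth2 { .. } => None,
    }
}
```

Because only the VaultRef appears in the credential, rotating a key amounts to replacing the vault entry; the configuration that points at it stays unchanged.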

Profile Scoping

Profiles can be scoped to limit where credentials can be used:

Auth Data Flow (API Key Connection)

The following diagram illustrates the flow when a user connects a new OpenAI API key via the Web Console.
Title: OpenAI API Key Connection Flow
Sources: apps/web/src/console/sections/AuthSection.tsx#130-166, crates/palyra-daemon/src/openai_surface.rs#18-78, crates/palyra-daemon/tests/openai_auth_surface.rs#29-73

Usage Governance and Smart Routing

The governance subsystem tracks token consumption and applies “Smart Routing” logic to select the most cost-effective or performant model based on the complexity of the prompt.

Smart Routing Logic

The RoutingDecision is calculated by evaluating the complexity_score of a prompt against available model capabilities and health_state crates/palyra-daemon/src/usage_governance.rs#112-130.
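
One plausible reading of this evaluation is "pick the cheapest model that is available and can handle the prompt's complexity". The sketch below implements that rule only; the actual scoring in usage_governance.rs is not shown here. The names complexity_score, health_state, and RoutingDecision come from the text, while Candidate, max_complexity, cost_per_1k_tokens, the HealthState variants, and the route function are assumptions.

```rust
/// Assumed health states; the real enum may have different variants.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum HealthState {
    Healthy,
    Degraded,
    Unavailable,
}

#[derive(Debug, Clone)]
pub struct Candidate {
    pub model_id: &'static str,
    pub max_complexity: f64,     // assumed: highest complexity_score the model handles well
    pub cost_per_1k_tokens: f64, // assumed pricing unit
    pub health_state: HealthState,
}

#[derive(Debug, Clone, PartialEq)]
pub struct RoutingDecision {
    pub model_id: &'static str,
    pub reason: &'static str,
}

/// Choose the cheapest candidate that is not Unavailable and whose
/// max_complexity covers the prompt's complexity_score.
/// Degraded models remain eligible in this sketch.
pub fn route(complexity_score: f64, candidates: &[Candidate]) -> Option<RoutingDecision> {
    candidates
        .iter()
        .filter(|c| c.health_state != HealthState::Unavailable)
        .filter(|c| c.max_complexity >= complexity_score)
        .min_by(|a, b| a.cost_per_1k_tokens.total_cmp(&b.cost_per_1k_tokens))
        .map(|c| RoutingDecision {
            model_id: c.model_id,
            reason: "cheapest available model that covers the complexity score",
        })
}
```

Returning None when no candidate qualifies leaves the fallback policy (reject, queue, or escalate) to the caller.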

Usage Budgets

The system enforces UsageBudgetPolicyRecord rules to prevent runaway costs crates/palyra-daemon/src/usage_governance.rs#11-13.
| Metric | Enforcement Actions |
| --- | --- |
| Consumed Value (USD) | Soft Limit (Alert), Hard Limit (Block) |
| Token Count | Per Session, Per Agent, or Global |
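
The soft and hard limits on consumed USD value can be illustrated with a small check that runs before a request is dispatched. This is a sketch under stated assumptions: BudgetPolicy, BudgetVerdict, and evaluate are hypothetical names, and the fields of the real UsageBudgetPolicyRecord are not shown on this page.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum BudgetVerdict {
    Allow,
    Alert, // soft limit exceeded: proceed, but notify
    Block, // hard limit exceeded: refuse the request
}

/// Assumed policy shape: two USD thresholds with soft <= hard.
#[derive(Debug, Clone, Copy)]
pub struct BudgetPolicy {
    pub soft_limit_usd: f64,
    pub hard_limit_usd: f64,
}

/// Compare projected spend (already consumed plus the estimated cost of the
/// next request) against the policy. Landing exactly on a limit does not trip it.
pub fn evaluate(policy: &BudgetPolicy, consumed_usd: f64, next_cost_usd: f64) -> BudgetVerdict {
    let projected = consumed_usd + next_cost_usd;
    if projected > policy.hard_limit_usd {
        BudgetVerdict::Block
    } else if projected > policy.soft_limit_usd {
        BudgetVerdict::Alert
    } else {
        BudgetVerdict::Allow
    }
}
```

Checking the projected total rather than the current total is a design choice of this sketch: it blocks the request that would cross the hard limit instead of the one after it.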

Cost Tracking Implementation

The UsageEnrichedRun structure joins OrchestratorUsageInsightsRunRecord with UsagePricingRecord to provide real-time USD estimates crates/palyra-daemon/src/usage_governance.rs#208-214.
Title: Governance and Routing Architecture
Sources: crates/palyra-daemon/src/usage_governance.rs#26-146, crates/palyra-daemon/src/journal.rs#8-13
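
The join between run records and pricing records described in this section can be sketched as a lookup by model ID followed by a per-token multiplication. The types below are simplified stand-ins for OrchestratorUsageInsightsRunRecord, UsagePricingRecord, and UsageEnrichedRun; their fields, the per-million-token pricing unit, and the enrich function are assumptions.

```rust
use std::collections::HashMap;

/// Assumed shape of a per-run usage record (token counts only).
#[derive(Debug, Clone)]
pub struct RunUsage {
    pub run_id: String,
    pub model_id: String,
    pub prompt_tokens: u64,
    pub completion_tokens: u64,
}

/// Assumed pricing record, expressed in USD per million tokens.
#[derive(Debug, Clone, Copy)]
pub struct Pricing {
    pub input_usd_per_million: f64,
    pub output_usd_per_million: f64,
}

#[derive(Debug, Clone, PartialEq)]
pub struct EnrichedRun {
    pub run_id: String,
    /// None when no pricing record exists for the run's model.
    pub estimated_usd: Option<f64>,
}

/// Join each run with the pricing for its model and estimate the cost.
pub fn enrich(runs: &[RunUsage], pricing: &HashMap<String, Pricing>) -> Vec<EnrichedRun> {
    runs.iter()
        .map(|run| {
            let estimated_usd = pricing.get(&run.model_id).map(|p| {
                let input = run.prompt_tokens as f64 / 1_000_000.0 * p.input_usd_per_million;
                let output = run.completion_tokens as f64 / 1_000_000.0 * p.output_usd_per_million;
                input + output
            });
            EnrichedRun { run_id: run.run_id.clone(), estimated_usd }
        })
        .collect()
}
```

Runs without a matching pricing record keep an estimate of None instead of a silent zero, so missing pricing data stays visible in reports.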

OpenAI-Compatible Surface

The daemon exposes an OpenAI-compatible HTTP surface at /v1/*, allowing standard OpenAI SDKs and tools to interact with Palyra as if it were the upstream provider.

Key Handlers

Request Transformation

When a request hits the OpenAI-compatible surface, the ModelProvider translates the generic ProviderRequest into the specific format required by the target (e.g., converting OpenAI JSON to Anthropic XML/JSON structures) crates/palyra-daemon/src/model_provider.rs#211-216.
Sources: crates/palyra-daemon/src/model_provider.rs#19-35, crates/palyra-daemon/src/openai_surface.rs#15-15
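
As one example of such a translation, OpenAI-style chat requests carry system prompts as messages with role "system", whereas Anthropic's Messages API takes the system prompt as a separate top-level field and requires max_tokens. The sketch below shows only that reshaping step with plain structs. ChatMessage, AnthropicStyleRequest, and to_anthropic_style are hypothetical names; the daemon's ProviderRequest type and its real conversion code are not reproduced here.

```rust
/// A role/content pair as used in OpenAI-style chat requests.
#[derive(Debug, Clone, PartialEq)]
pub struct ChatMessage {
    pub role: String,
    pub content: String,
}

/// Simplified target shape: system prompt lifted out of the message list.
#[derive(Debug, Clone, PartialEq)]
pub struct AnthropicStyleRequest {
    pub model: String,
    pub system: Option<String>,
    pub messages: Vec<ChatMessage>,
    pub max_tokens: u32,
}

/// Move all "system" messages into the top-level system field (joined by
/// blank lines) and keep the remaining messages in their original order.
pub fn to_anthropic_style(
    model: &str,
    messages: &[ChatMessage],
    max_tokens: u32,
) -> AnthropicStyleRequest {
    let system_parts: Vec<&str> = messages
        .iter()
        .filter(|m| m.role == "system")
        .map(|m| m.content.as_str())
        .collect();
    let system = if system_parts.is_empty() {
        None
    } else {
        Some(system_parts.join("\n\n"))
    };
    let remaining: Vec<ChatMessage> = messages
        .iter()
        .filter(|m| m.role != "system")
        .cloned()
        .collect();
    AnthropicStyleRequest {
        model: model.to_string(),
        system,
        messages: remaining,
        max_tokens,
    }
}
```

A full translation would also need to map tool calls, multimodal content, and streaming options, which this sketch leaves out.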

Maintenance and Background Tasks

The cron subsystem handles recurring tasks related to model health and credential maintenance.
Sources: crates/palyra-daemon/src/cron.rs#42-60