The Model Provider Registry is the central abstraction within the Palyra daemon responsible for managing Large Language Model (LLM) backends. It provides a unified interface for chat completions, embeddings, and audio transcriptions while handling the complexities of authentication, failover, circuit breaking, and SSRF protection.

Registry Architecture

The registry is built around the RegistryBackedModelProvider struct, which implements the core logic for routing requests to specific backend implementations based on the ModelProviderKind crates/palyra-daemon/src/model_provider.rs#39-43.

Supported Backend Kinds

| Kind | Description | Protocol |
| --- | --- | --- |
| Deterministic | Used for testing or local mocks; returns fixed responses crates/palyra-daemon/src/model_provider.rs#40. | Internal |
| OpenAiCompatible | Supports OpenAI’s API and compatible providers (e.g., Together, Groq, LocalAI) crates/palyra-daemon/src/model_provider.rs#41. | REST / JSON |
| Anthropic | Dedicated support for the Anthropic Messages API crates/palyra-daemon/src/model_provider.rs#42. | REST / JSON |
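As a rough illustration of how the three kinds map onto a protocol, the sketch below models them as a Rust enum. Only the variant names come from the table; the `protocol` helper is hypothetical and is not claimed to exist in model_provider.rs.

```rust
/// Backend kinds named in the table above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ModelProviderKind {
    Deterministic,
    OpenAiCompatible,
    Anthropic,
}

/// Hypothetical helper: returns the protocol column of the table for a kind.
pub fn protocol(kind: ModelProviderKind) -> &'static str {
    match kind {
        // Fixed responses are produced in-process, so no network protocol is involved.
        ModelProviderKind::Deterministic => "Internal",
        // Both remote kinds speak JSON over REST.
        ModelProviderKind::OpenAiCompatible | ModelProviderKind::Anthropic => "REST / JSON",
    }
}
```

An exhaustive `match` like this means that adding a fourth kind fails to compile until every routing site handles it, which is one plausible reason to route on an enum rather than on strings.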

Data Flow: Model Request Lifecycle

The following diagram illustrates how a request from the Orchestrator is resolved and executed through the registry.

Diagram: Model Request Resolution Flow

Sources: crates/palyra-daemon/src/model_provider.rs#175-187, crates/palyra-daemon/src/usage_governance.rs#112-130, crates/palyra-daemon/src/model_provider.rs#156-161

Key Components & Implementation

Configuration Schema

The registry is configured via the model_provider table in palyra.toml. The table holds arrays of providers and models, which makes multi-provider failover and specialized routing possible crates/palyra-daemon/src/model_provider.rs#175-187.
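To make the shape of such a configuration concrete, here is a minimal sketch of in-memory types it might deserialize into. The field names `failover_enabled`, `base_url`, `allow_private_base_url`, `auth_profile_id`, and `api_key_vault_ref` appear elsewhere on this page; the struct names, the `id` and `provider_id` fields, and the `dangling_models` check are assumptions made for illustration and may not match the actual schema.

```rust
/// Hypothetical provider entry; field names other than `id` are taken from this page.
#[derive(Debug, Clone, Default)]
pub struct ProviderEntry {
    pub id: String, // assumed identifier used to link models to providers
    pub base_url: String,
    pub allow_private_base_url: bool,
    pub auth_profile_id: Option<String>,
    pub api_key_vault_ref: Option<String>,
}

/// Hypothetical model entry; both fields are assumptions.
#[derive(Debug, Clone, Default)]
pub struct ModelEntry {
    pub id: String,
    pub provider_id: String,
}

/// Hypothetical view of the `model_provider` table.
#[derive(Debug, Clone, Default)]
pub struct ModelProviderConfig {
    pub failover_enabled: bool,
    pub providers: Vec<ProviderEntry>,
    pub models: Vec<ModelEntry>,
}

/// Returns the ids of models that reference a provider missing from `providers`.
pub fn dangling_models(cfg: &ModelProviderConfig) -> Vec<&str> {
    cfg.models
        .iter()
        .filter(|m| !cfg.providers.iter().any(|p| p.id == m.provider_id))
        .map(|m| m.id.as_str())
        .collect()
}
```

A load-time check of this kind would catch a model that points at a provider that was renamed or removed, before any request is routed to it.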

Smart Routing & Failover

The UsageGovernance system performs “Smart Routing” by evaluating a RoutingDecisionContext crates/palyra-daemon/src/usage_governance.rs#133-146. Routing and failover involve three mechanisms:
  1. Complexity Scoring: It calculates a score based on prompt tokens and vision inputs crates/palyra-daemon/src/usage_governance.rs#121.
  2. Health Checks: Providers in a “failed” state are deprioritized crates/palyra-daemon/src/usage_governance.rs#122.
  3. Failover: If failover_enabled is true, the registry can switch to a secondary provider if the primary returns a retryable status code (429, 500, 502, 503, 504) crates/palyra-daemon/src/model_provider.rs#24, crates/palyra-daemon/src/model_provider.rs#197.
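The health-based ordering and failover steps above can be sketched as follows. The retryable status codes and the `failover_enabled` flag come from the text; the `Candidate` type and the `is_retryable`, `prioritize`, and `dispatch` functions are hypothetical names, and the real logic in usage_governance.rs and model_provider.rs may differ (complexity scoring is left out, since the page does not describe its formula).

```rust
/// Status codes the text lists as retryable.
const RETRYABLE_STATUS: [u16; 5] = [429, 500, 502, 503, 504];

/// Hypothetical routing candidate.
#[derive(Debug, Clone, PartialEq)]
pub struct Candidate {
    pub id: &'static str,
    pub healthy: bool,
}

pub fn is_retryable(status: u16) -> bool {
    RETRYABLE_STATUS.contains(&status)
}

/// Deprioritizes failed providers: healthy candidates move to the front.
/// The sort is stable, so the configured order is kept within each group.
pub fn prioritize(mut candidates: Vec<Candidate>) -> Vec<Candidate> {
    candidates.sort_by_key(|c| !c.healthy); // false (healthy) sorts before true
    candidates
}

/// Tries candidates in order and returns the last attempt as (provider id, status).
/// `send` stands in for the actual HTTP call and returns the response status.
pub fn dispatch<F>(
    candidates: &[Candidate],
    failover_enabled: bool,
    mut send: F,
) -> Option<(&'static str, u16)>
where
    F: FnMut(&Candidate) -> u16,
{
    let mut last = None;
    for candidate in candidates {
        let status = send(candidate);
        last = Some((candidate.id, status));
        // Move to the next provider only when failover is on and the status is retryable.
        if !(failover_enabled && is_retryable(status)) {
            break;
        }
    }
    last
}
```

With failover disabled the first provider's response is returned as-is, even if it is a 503; with failover enabled the loop continues until a provider returns a non-retryable status or the candidates run out.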

SSRF Protection (NetGuard)

To prevent Server-Side Request Forgery, the registry uses a validation layer for base_url. By default, it blocks private network IP ranges unless allow_private_base_url is explicitly enabled for a specific provider crates/palyra-daemon/src/model_provider.rs#149.
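A minimal sketch of the kind of check this implies is shown below, using only the standard library. The function names are hypothetical, and the exact set of blocked ranges is an assumption: the page only says "private network IP ranges", so loopback, link-local, and unspecified addresses are included here as a conservative guess.

```rust
use std::net::IpAddr;

/// Returns true for addresses that should not be reachable from a provider base URL
/// by default. The chosen ranges are an assumption, not the documented NetGuard list.
pub fn is_private_ip(ip: IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => {
            v4.is_private() || v4.is_loopback() || v4.is_link_local() || v4.is_unspecified()
        }
        IpAddr::V6(v6) => {
            let first = v6.segments()[0];
            v6.is_loopback()
                || v6.is_unspecified()
                || (first & 0xfe00) == 0xfc00 // unique local, fc00::/7
                || (first & 0xffc0) == 0xfe80 // link-local, fe80::/10
        }
    }
}

/// Applies the per-provider override described above.
pub fn base_url_ip_allowed(ip: IpAddr, allow_private_base_url: bool) -> bool {
    allow_private_base_url || !is_private_ip(ip)
}
```

A check like this is only effective if it is applied to the addresses a hostname actually resolves to at connection time, not just to the literal text of base_url; whether the registry does so is not stated on this page.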

Token Estimation

The registry includes utilities for estimating token counts without requiring a round-trip to the provider.

Sources: crates/palyra-daemon/src/model_provider.rs#144-205, crates/palyra-daemon/src/usage_governance.rs#112-146, crates/palyra-daemon/src/config/load.rs#23-28
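The page does not say which estimation method the registry uses. As an illustration of what a local, round-trip-free estimate can look like, the sketch below applies the common rule of thumb of roughly four characters per token; both the ratio and the `estimate_tokens` name are assumptions.

```rust
/// Assumed average number of characters per token (a rough rule of thumb for English text).
const CHARS_PER_TOKEN: usize = 4;

/// Hypothetical estimator: character count divided by CHARS_PER_TOKEN, rounded up.
pub fn estimate_tokens(text: &str) -> usize {
    let chars = text.chars().count(); // counts Unicode scalar values, not bytes
    (chars + CHARS_PER_TOKEN - 1) / CHARS_PER_TOKEN
}
```

An estimate of this sort is cheap enough to run before every request, for example to feed a complexity score, but it can be off by a wide margin for code or non-English text.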

Authentication & Vault Integration

The registry integrates with palyra-vault to ensure API keys are never stored in plaintext in the configuration file.

Secret Redaction

The redact_provider_registry_secrets function crates/palyra-common/src/daemon_config_schema.rs#65-98 ensures that when configuration is exported or logged, sensitive fields like api_key or api_key_vault_ref are replaced with <redacted> crates/palyra-common/src/daemon_config_schema.rs#4.
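The idea can be sketched on a flat key/value map. The field names and the `<redacted>` placeholder come from the text above; `redact_secrets` is a simplified, hypothetical stand-in and does not reproduce the signature or traversal of redact_provider_registry_secrets.

```rust
use std::collections::BTreeMap;

/// Placeholder written in place of sensitive values.
const REDACTED: &str = "<redacted>";
/// Sensitive field names mentioned in the text.
const SENSITIVE_KEYS: [&str; 2] = ["api_key", "api_key_vault_ref"];

/// Returns a copy of `fields` in which the value of every sensitive key is replaced.
pub fn redact_secrets(fields: &BTreeMap<String, String>) -> BTreeMap<String, String> {
    fields
        .iter()
        .map(|(key, value)| {
            let shown = if SENSITIVE_KEYS.contains(&key.as_str()) {
                REDACTED.to_string()
            } else {
                value.clone()
            };
            (key.clone(), shown)
        })
        .collect()
}
```

Returning a redacted copy, rather than mutating in place, keeps the live configuration usable while the copy is the only thing handed to loggers or exporters.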

Auth Profiles

Providers can link to an AuthProfile via auth_profile_id crates/palyra-daemon/src/model_provider.rs#151.

Sources: crates/palyra-common/src/daemon_config_schema.rs#6-16, crates/palyra-daemon/src/openai_surface.rs#16-76, crates/palyra-daemon/src/model_provider.rs#151-154

CLI Management

The palyra models command group provides administrative access to the registry.

| Command | Code Entity | Description |
| --- | --- | --- |
| status | load_models_status | Shows current default models and provider health crates/palyra-cli/src/commands/models.rs#196-199. |
| list | build_models_list | Lists all models available in the catalog and registry crates/palyra-cli/src/commands/models.rs#201-206. |
| set | run_models | Updates the default chat model in palyra.toml crates/palyra-cli/src/commands/models.rs#194-206. |
| connect | run_models | Performs a live connectivity check and model discovery crates/palyra-cli/src/commands/models.rs#126-133. |

Sources: crates/palyra-cli/src/commands/models.rs#27-45, crates/palyra-cli/src/commands/models.rs#194-206