
# Provider Registry and Routing

<details>
  <summary>Relevant source files</summary>

  The following files were used as context for generating this wiki page:

  * crates/palyra-cli/src/client/control\_plane.rs
  * crates/palyra-cli/src/commands/runtime\_reload.rs
  * crates/palyra-common/src/daemon\_config\_schema.rs
  * crates/palyra-common/src/feature\_rollouts.rs
  * crates/palyra-daemon/src/application/route\_message/response.rs
  * crates/palyra-daemon/src/config/load.rs
  * crates/palyra-daemon/src/config/schema.rs
  * crates/palyra-daemon/src/model\_provider.rs
  * crates/palyra-daemon/src/model\_provider/adapters.rs
  * crates/palyra-daemon/tests/current\_state\_inventory.rs
  * crates/palyra-daemon/tests/support/mod.rs
  * crates/palyra-model-providers/Cargo.toml
  * crates/palyra-model-providers/src/config.rs
  * crates/palyra-model-providers/src/contract.rs
  * crates/palyra-model-providers/src/error\_envelope.rs
  * crates/palyra-model-providers/src/errors.rs
  * crates/palyra-model-providers/src/lib.rs
  * crates/palyra-model-providers/src/providers.rs
  * crates/palyra-model-providers/src/providers/antropic.rs
  * crates/palyra-model-providers/src/providers/google.rs
  * crates/palyra-model-providers/src/providers/minimax.rs
  * crates/palyra-model-providers/src/providers/openai.rs
  * crates/palyra-model-providers/src/redaction.rs
  * crates/palyra-model-providers/src/streaming.rs
  * crates/palyra-model-providers/src/tool\_repair.rs
</details>

The Model Provider subsystem is the gateway through which Palyra interacts with Large Language Models (LLMs). It abstracts various upstream APIs (OpenAI, Anthropic, Google, MiniMax) into a unified interface, providing resilient routing, circuit breaking, and automated failover.

## Core Abstractions: ModelProvider and EmbeddingsProvider

The system is built around two primary traits defined in `crates/palyra-daemon/src/model_provider.rs`. These traits decouple the agent's request for intelligence from the specific transport or provider implementation.

* **`ModelProvider`**: Handles chat completions, tool proposals, and audio transcriptions [crates/palyra-daemon/src/model\_provider.rs#5-9](http://crates/palyra-daemon/src/model_provider.rs#5-9).
* **`EmbeddingsProvider`**: Handles vector generation for RAG and semantic search [crates/palyra-daemon/src/model\_provider.rs#6-6](http://crates/palyra-daemon/src/model_provider.rs#6-6).

The `RegistryBackedModelProvider` is the concrete implementation that manages a collection of these providers, performing candidate selection based on the current configuration [crates/palyra-daemon/src/model\_provider.rs#10-12](http://crates/palyra-daemon/src/model_provider.rs#10-12).
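
The relationship between the two traits and a registry that delegates to them can be sketched as follows. This is a minimal illustration, not the daemon's actual API: the method names (`complete`, `embed`), the `ChatRequest` and `ProviderError` types, `EchoProvider`, and `Registry` are all hypothetical, and the real traits are likely asynchronous and stream events rather than returning a `String`.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified request type (not the daemon's `ProviderRequest`).
#[derive(Debug, Clone, PartialEq)]
pub struct ChatRequest {
    pub prompt: String,
}

/// Hypothetical error type used only by this sketch.
#[derive(Debug)]
pub struct ProviderError(pub String);

/// Simplified stand-in for the chat-facing trait.
pub trait ModelProvider {
    fn complete(&self, request: &ChatRequest) -> Result<String, ProviderError>;
}

/// Simplified stand-in for the embeddings-facing trait.
pub trait EmbeddingsProvider {
    fn embed(&self, inputs: &[String]) -> Result<Vec<Vec<f32>>, ProviderError>;
}

/// Toy provider that echoes the prompt; it stands in for an HTTP-backed adapter.
pub struct EchoProvider {
    pub name: &'static str,
}

impl ModelProvider for EchoProvider {
    fn complete(&self, request: &ChatRequest) -> Result<String, ProviderError> {
        Ok(format!("[{}] {}", self.name, request.prompt))
    }
}

/// Registry that owns providers by id and delegates to the one requested.
#[derive(Default)]
pub struct Registry {
    providers: BTreeMap<String, Box<dyn ModelProvider>>,
}

impl Registry {
    pub fn register(&mut self, id: &str, provider: Box<dyn ModelProvider>) {
        self.providers.insert(id.to_string(), provider);
    }

    pub fn complete_with(&self, id: &str, request: &ChatRequest) -> Result<String, ProviderError> {
        let provider = self
            .providers
            .get(id)
            .ok_or_else(|| ProviderError(format!("unknown provider: {id}")))?;
        provider.complete(request)
    }
}
```

Because callers only see the trait object, swapping `EchoProvider` for a real adapter would not change the calling code, which is the decoupling the traits are meant to provide.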

### Natural Language Space to Code Entity Space: Provider Contracts

The following diagram maps high-level provider concepts to the specific Rust entities that implement them.

Title: Provider Contract Mapping

```mermaid theme={null}
graph TD
    subgraph "Natural Language Space"
        A["Chat Completion"]
        B["Vector Embedding"]
        C["Tool Call Repair"]
        D["Error Handling"]
    end

    subgraph "Code Entity Space"
        direction LR
        E["ModelProvider (Trait)"]
        F["EmbeddingsProvider (Trait)"]
        G["ToolRepairStreamNormalizer"]
        H["ProviderFailureClassification"]
    end

    A --- E
    B --- F
    C --- G
    D --- H

    style E stroke-width:2px
    style F stroke-width:2px
    style G stroke-width:2px
    style H stroke-width:2px
```

Sources: [crates/palyra-daemon/src/model\_provider.rs#5-18](http://crates/palyra-daemon/src/model_provider.rs#5-18), [crates/palyra-model-providers/src/lib.rs#89-97](http://crates/palyra-model-providers/src/lib.rs#89-97)

## Registry and Routing Logic

The `RegistryBackedModelProvider` coordinates routing across the registered providers. For each request, it evaluates the `ModelProviderRegistryConfig` to identify viable candidates.

### Candidate Selection and Ranking

Candidates are filtered by capability and health, then ranked by priority:

1. **Capability**: Does the model support the requested feature (e.g., vision, tool use, or reasoning effort)? [crates/palyra-daemon/src/model\_provider.rs#54-55](http://crates/palyra-daemon/src/model_provider.rs#54-55).
2. **Health**: Is the provider currently circuit-broken? [crates/palyra-daemon/src/model\_provider.rs#91-96](http://crates/palyra-daemon/src/model_provider.rs#91-96).
3. **Priority**: Defined by the `ProviderRegistryEntryConfig` in the configuration [crates/palyra-daemon/src/model\_provider.rs#60-61](http://crates/palyra-daemon/src/model_provider.rs#60-61).
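
The three criteria above can be sketched as a filter followed by a sort. The `Candidate` and `Required` types, their fields, and the convention that a lower `priority` value is tried first are assumptions made for illustration; the real entry type is `ProviderRegistryEntryConfig`, whose exact shape is not shown here.

```rust
/// Hypothetical, reduced view of a registry entry.
#[derive(Debug, Clone)]
pub struct Candidate {
    pub provider_id: String,
    /// Assumed convention: a lower value is tried first.
    pub priority: u32,
    pub supports_tools: bool,
    pub supports_vision: bool,
    /// True while the provider is circuit-broken.
    pub circuit_open: bool,
}

/// Hypothetical description of what the request needs.
#[derive(Debug, Clone, Copy, Default)]
pub struct Required {
    pub tools: bool,
    pub vision: bool,
}

/// Drops candidates that lack a required capability or are circuit-broken,
/// then orders the remainder by priority (stable for equal priorities).
pub fn select_candidates(entries: &[Candidate], required: Required) -> Vec<&Candidate> {
    let mut viable: Vec<&Candidate> = entries
        .iter()
        .filter(|c| !required.tools || c.supports_tools)
        .filter(|c| !required.vision || c.supports_vision)
        .filter(|c| !c.circuit_open)
        .collect();
    viable.sort_by_key(|c| c.priority);
    viable
}
```

Filtering before sorting keeps unhealthy or incapable providers out of the ranked list entirely, so a failover step only ever advances to a candidate that could actually serve the request.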

### Circuit Breaking and Failover

The system implements a fail-closed classification strategy. If a primary candidate fails, the `ProviderFailureClassification` determines the next step [crates/palyra-daemon/src/model\_provider.rs#99-101](http://crates/palyra-daemon/src/model_provider.rs#99-101):

* **`Retry`**: For transient errors like HTTP 429 or 503 [crates/palyra-daemon/src/model\_provider.rs#120-120](http://crates/palyra-daemon/src/model_provider.rs#120-120).
* **`Failover`**: Switch to the next ranked provider in the registry [crates/palyra-daemon/src/model\_provider.rs#64-64](http://crates/palyra-daemon/src/model_provider.rs#64-64).
* **`FailClosed`**: Terminate the request if the error is unrecoverable (e.g., authentication failure) [crates/palyra-daemon/src/model\_provider.rs#64-64](http://crates/palyra-daemon/src/model_provider.rs#64-64).
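
A minimal sketch of how the three outcomes might drive a routing loop is shown below. Only the 429/503 and authentication examples come from the text above; the `Classification` enum, `classify_status`, `route`, the treatment of other 5xx codes as `Failover`, and the rule that an exhausted retry budget falls through to the next candidate are assumptions for illustration.

```rust
/// Simplified stand-in for `ProviderFailureClassification`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Classification {
    Retry,
    Failover,
    FailClosed,
}

/// Maps an HTTP status to a classification.
pub fn classify_status(status: u16) -> Classification {
    match status {
        429 | 503 => Classification::Retry,      // transient, per the list above
        401 | 403 => Classification::FailClosed, // authentication failure (403 is an assumption)
        500..=599 => Classification::Failover,   // assumption: other server errors
        _ => Classification::FailClosed,         // anything unrecognized fails closed
    }
}

/// Tries candidates in order. `call` is a hypothetical transport hook that
/// returns the response body or the failing HTTP status.
/// Returns `Err(0)` when `candidates` is empty.
pub fn route<F>(candidates: &[&str], max_retries: u32, mut call: F) -> Result<String, u16>
where
    F: FnMut(&str) -> Result<String, u16>,
{
    let mut last_status = 0;
    for &candidate in candidates {
        let mut attempts = 0;
        loop {
            match call(candidate) {
                Ok(body) => return Ok(body),
                Err(status) => {
                    last_status = status;
                    match classify_status(status) {
                        // Retry the same candidate while budget remains.
                        Classification::Retry if attempts < max_retries => attempts += 1,
                        // Budget exhausted or explicit failover: next candidate.
                        Classification::Retry | Classification::Failover => break,
                        // Unrecoverable: stop without trying other providers.
                        Classification::FailClosed => return Err(status),
                    }
                }
            }
        }
    }
    Err(last_status)
}
```

The final wildcard arm is what makes the classification fail-closed: a status the classifier does not recognize terminates the request instead of being retried or routed elsewhere.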

Title: Request Routing and Failover Flow

```mermaid theme={null}
sequenceDiagram
    participant O as Orchestrator
    participant R as RegistryBackedModelProvider
    participant C as CircuitBreaker
    participant P as RemoteProvider (OpenAI first, then Anthropic)

    O->>R: request_chat_completion(ProviderRequest)
    R->>R: select_candidates()
    R->>C: check_health(candidate)
    C-->>R: healthy
    R->>P: POST /chat/completions
    P-->>R: HTTP 429 (Rate Limit)
    R->>R: classify_failure(429)
    Note over R: Result: FailoverRequested
    R->>R: select_next_candidate()
    R->>P: POST /v1/messages (Anthropic)
    P-->>R: 200 OK (Stream)
    R-->>O: ProviderEventStream
```

Sources: [crates/palyra-daemon/src/model\_provider.rs#10-18](http://crates/palyra-daemon/src/model_provider.rs#10-18), [crates/palyra-daemon/src/model\_provider.rs#62-67](http://crates/palyra-daemon/src/model_provider.rs#62-67), [crates/palyra-daemon/src/model\_provider.rs#120-120](http://crates/palyra-daemon/src/model_provider.rs#120-120)

## Response Caching and Normalization

To optimize latency and cost, Palyra employs TTL-bounded response caching.

* **Cache Strategy**: Controlled by `PromptCachePolicy`, which selects either `Streaming` or `Response-body` caching [crates/palyra-daemon/src/model\_provider.rs#81-83](http://crates/palyra-daemon/src/model_provider.rs#81-83).
* **Normalization**: Regardless of the upstream provider's format, the output is normalized into `ProviderEvent::ModelToken` events. This allows the `Orchestrator` to remain provider-agnostic [crates/palyra-daemon/src/model\_provider.rs#14-18](http://crates/palyra-daemon/src/model_provider.rs#14-18).
* **Tool Repair**: If a model produces malformed tool arguments, the `ToolRepairStreamNormalizer` attempts to fix the JSON before it reaches the execution layer [crates/palyra-daemon/src/model\_provider.rs#71-77](http://crates/palyra-daemon/src/model_provider.rs#71-77).
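
To make "TTL-bounded" concrete, the sketch below shows one way such a cache could behave. The `TtlCache` type, its string keys and values, and the choice to pass the current time explicitly are assumptions for illustration; the daemon's actual cache key derivation and storage are not described in this page.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical response cache whose entries expire after a fixed TTL.
pub struct TtlCache {
    ttl: Duration,
    entries: HashMap<String, (Instant, String)>,
}

impl TtlCache {
    pub fn new(ttl: Duration) -> Self {
        TtlCache { ttl, entries: HashMap::new() }
    }

    /// Stores `value` under `key`, stamped with the caller-supplied time.
    pub fn insert(&mut self, key: &str, value: &str, now: Instant) {
        self.entries.insert(key.to_string(), (now, value.to_string()));
    }

    /// Returns the cached value if it is younger than the TTL;
    /// expired entries are evicted on lookup.
    pub fn get(&mut self, key: &str, now: Instant) -> Option<String> {
        let fresh = match self.entries.get(key) {
            Some((stored_at, _)) => now.duration_since(*stored_at) < self.ttl,
            None => return None,
        };
        if fresh {
            self.entries.get(key).map(|(_, value)| value.clone())
        } else {
            self.entries.remove(key);
            None
        }
    }
}
```

Taking `now` as a parameter rather than calling `Instant::now()` internally keeps expiry deterministic in tests.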

## Configuration and Secrets

The provider configuration is loaded through a multi-layered pipeline (Defaults -> TOML -> Env Vars) [crates/palyra-daemon/src/config/load.rs#4-9](http://crates/palyra-daemon/src/config/load.rs#4-9).

| Feature            | Description                                                  | Code Reference                                                                                                            |
| :----------------- | :----------------------------------------------------------- | :------------------------------------------------------------------------------------------------------------------------ |
| **API Keys**       | Loaded via `SecretRef` to prevent accidental logging.        | [crates/palyra-common/src/daemon\_config\_schema.rs#27-32](http://crates/palyra-common/src/daemon_config_schema.rs#27-32) |
| **Network Policy** | Validates base URLs to prevent SSRF or unauthorized egress.  | [crates/palyra-daemon/src/model\_provider.rs#56-56](http://crates/palyra-daemon/src/model_provider.rs#56-56)              |
| **Service Tiers**  | Configures `ProviderServiceTier` (e.g., scale vs. standard). | [crates/palyra-daemon/src/model\_provider.rs#87-87](http://crates/palyra-daemon/src/model_provider.rs#87-87)              |
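
The layered loading order and the secret-handling row can be illustrated together. This sketch assumes that each later layer (TOML, then environment variables) overrides values set by earlier ones, which is the usual reading of such a pipeline but is not confirmed here. The `Secret` and `Layer` types are hypothetical; `Secret` only imitates the intent of `SecretRef` by keeping the value out of `Debug` output.

```rust
use std::fmt;

/// Hypothetical secret wrapper whose `Debug` output never reveals the value.
#[derive(Clone, PartialEq)]
pub struct Secret(String);

impl Secret {
    pub fn new(value: &str) -> Self {
        Secret(value.to_string())
    }

    /// Explicit accessor, so every read of the raw value is visible in code review.
    pub fn expose(&self) -> &str {
        &self.0
    }
}

impl fmt::Debug for Secret {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("Secret(<redacted>)")
    }
}

/// One configuration layer; `None` means "not set at this layer".
#[derive(Debug, Clone, Default)]
pub struct Layer {
    pub base_url: Option<String>,
    pub api_key: Option<Secret>,
}

impl Layer {
    /// Values in `higher` win; anything it leaves unset falls through to `self`.
    pub fn overlay(self, higher: Layer) -> Layer {
        Layer {
            base_url: higher.base_url.or(self.base_url),
            api_key: higher.api_key.or(self.api_key),
        }
    }
}
```

Under these assumptions, `defaults.overlay(toml).overlay(env)` yields the effective configuration, and formatting it with `{:?}` prints `Secret(<redacted>)` in place of the key.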

### Code Entity Space: Configuration Structure

Title: Model Provider Configuration Schema

```mermaid theme={null}
classDiagram
    class ModelProviderConfig {
        +ModelProviderKind kind
        +Option~String~ openai_api_key
        +Option~String~ anthropic_api_key
    }
    class ModelProviderRegistryConfig {
        +Vec~ProviderRegistryEntryConfig~ entries
    }
    class ProviderRegistryEntryConfig {
        +String provider_id
        +ProviderPriority priority
        +ProviderCapabilities capabilities
    }
    ModelProviderConfig --> ModelProviderRegistryConfig
    ModelProviderRegistryConfig --> ProviderRegistryEntryConfig
```

Sources: [crates/palyra-daemon/src/config/load.rs#109-110](http://crates/palyra-daemon/src/config/load.rs#109-110), [crates/palyra-daemon/src/model\_provider.rs#57-61](http://crates/palyra-daemon/src/model_provider.rs#57-61)

## Implementation Details

### Key Functions and Modules

* **`load_config()`**: Resolves provider credentials and network policies from `palyra.toml` or environment variables [crates/palyra-daemon/src/config/load.rs#91-109](http://crates/palyra-daemon/src/config/load.rs#91-109).
* **`process_route_provider_response()`**: Post-processes raw provider events, handles tool execution inline, and persists results to the `OrchestratorTape` [crates/palyra-daemon/src/application/route\_message/response.rs#163-174](http://crates/palyra-daemon/src/application/route_message/response.rs#163-174).
* **`adapters` module**: Contains provider-specific logic for translating Palyra's internal `ProviderRequest` into wire-format JSON for OpenAI or Anthropic [crates/palyra-daemon/src/model\_provider/adapters.rs#16-19](http://crates/palyra-daemon/src/model_provider/adapters.rs#16-19).

### Constraints

* **Embeddings Batch Size**: Limited to 64 to ensure provider stability [crates/palyra-daemon/src/model\_provider.rs#121-121](http://crates/palyra-daemon/src/model_provider.rs#121-121).
* **Text Limits**: Outbound messages are chunked to `DEFAULT_ROUTE_MESSAGE_OUTPUT_MAX_BYTES` (2,000 bytes) to accommodate connector limits [crates/palyra-daemon/src/application/route\_message/response.rs#35-35](http://crates/palyra-daemon/src/application/route_message/response.rs#35-35).
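
Both constraints amount to splitting input into bounded pieces, which can be sketched as below. The constant and function names are hypothetical, and splitting on UTF-8 character boundaries is an assumption about how a byte limit would be applied safely to text; the numeric limits (2,000 bytes and 64 inputs) come from the list above.

```rust
/// Mirrors the 2,000-byte limit described above (name is hypothetical).
pub const MAX_OUTPUT_BYTES: usize = 2_000;
/// Mirrors the embeddings batch limit described above (name is hypothetical).
pub const MAX_EMBEDDINGS_BATCH: usize = 64;

/// Splits `text` into pieces of at most `max_bytes` bytes without cutting
/// through a multi-byte UTF-8 character.
pub fn chunk_utf8(text: &str, max_bytes: usize) -> Vec<&str> {
    // A UTF-8 scalar is at most 4 bytes, so this guarantees forward progress.
    assert!(max_bytes >= 4, "max_bytes must fit any UTF-8 scalar");
    let mut chunks = Vec::new();
    let mut rest = text;
    while !rest.is_empty() {
        let mut end = rest.len().min(max_bytes);
        // Back up to the nearest character boundary at or below the limit.
        while !rest.is_char_boundary(end) {
            end -= 1;
        }
        let (head, tail) = rest.split_at(end);
        chunks.push(head);
        rest = tail;
    }
    chunks
}

/// Yields embedding inputs in batches of at most `MAX_EMBEDDINGS_BATCH` items.
pub fn embedding_batches(inputs: &[String]) -> std::slice::Chunks<'_, String> {
    inputs.chunks(MAX_EMBEDDINGS_BATCH)
}
```

Slicing a `&str` at a non-boundary index panics in Rust, which is why the sketch backs up to a character boundary instead of cutting at exactly `max_bytes`.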

Sources: crates/palyra-daemon/src/model\_provider.rs, crates/palyra-daemon/src/config/load.rs, crates/palyra-daemon/src/application/route\_message/response.rs, crates/palyra-model-providers/src/lib.rs
