# Worker Fleet Management

<details>
  <summary>Relevant source files</summary>

  The following files were used as context for generating this wiki page:

  * apps/web/src/console/sections/LogsSection.tsx
  * crates/palyra-cli/examples/run\_release\_eval\_gate.rs
  * crates/palyra-common/src/release\_evals/catalog.rs
  * crates/palyra-common/src/release\_evals/evaluator.rs
  * crates/palyra-common/src/release\_evals/mod.rs
  * crates/palyra-common/src/release\_evals/projections.rs
  * crates/palyra-common/src/release\_evals/schema.rs
  * crates/palyra-common/tests/release\_eval\_contract.rs
  * crates/palyra-daemon/src/gateway/approvals.rs
  * crates/palyra-daemon/src/gateway/canvas.rs
  * crates/palyra-daemon/src/gateway/cron\_support.rs
  * crates/palyra-daemon/src/gateway/messages.rs
  * crates/palyra-daemon/src/gateway/util.rs
  * crates/palyra-daemon/src/gateway/vault.rs
  * crates/palyra-daemon/src/node\_rpc.rs
  * crates/palyra-daemon/src/node\_runtime.rs
  * crates/palyra-daemon/src/transport/http/handlers/console/approvals.rs
  * crates/palyra-daemon/src/transport/http/handlers/console/devices.rs
  * crates/palyra-daemon/src/transport/http/handlers/console/logs.rs
  * crates/palyra-daemon/src/transport/http/handlers/console/nodes.rs
  * crates/palyra-daemon/src/transport/http/handlers/console/pairing.rs
  * crates/palyra-daemon/tests/node\_rpc\_mtls.rs
  * crates/palyra-egress-proxy/src/lib.rs
  * crates/palyra-egress-proxy/tests/critical\_attack\_scenarios.rs
  * crates/palyra-identity/src/pairing/manager.rs
  * crates/palyra-safety/tests/critical\_attack\_scenarios.rs
  * crates/palyra-workerd/src/lib.rs
  * crates/palyra-workerd/tests/critical\_attack\_scenarios.rs
  * fixtures/golden/release\_eval\_inventory.json
</details>

Worker Fleet Management provides a fail-closed system for coordinating distributed execution across networked worker nodes. It ensures that any worker participating in the fleet is verified through cryptographic attestation, bound by strict egress policies, and managed through a state-controlled lease lifecycle.

## WorkerFleetManager Ledger

The `WorkerFleetManager` serves as the in-memory ledger within the daemon. It tracks the availability, health, and current assignments of all registered workers.

* **Registration**: Workers must present a `WorkerAttestation` to join the fleet. The manager validates this against the `WorkerFleetPolicy` [crates/palyra-workerd/src/lib.rs#3-6](http://crates/palyra-workerd/src/lib.rs#3-6).
* **State Tracking**: It manages the `WorkerLifecycleState`, transitioning workers between states such as `Available`, `Busy`, `Quarantined`, or `Orphaned` [crates/palyra-workerd/src/lib.rs#10-11](http://crates/palyra-workerd/src/lib.rs#10-11).
* **Capacity Management**: It enforces limits defined in the policy, such as the maximum number of concurrent workers or leases per worker.
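
The registration and capacity rules above can be sketched as a small in-memory ledger. This is a minimal illustration, not the crate's implementation: the `max_workers` field, the `RegisterError` variants, the `state_of` helper, and the `attestation_ok` flag (standing in for the result of attestation validation) are all assumptions introduced here.

```rust
use std::collections::HashMap;

/// Lifecycle states named in the text. An unregistered worker is simply absent from the ledger.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WorkerLifecycleState {
    Available,
    Busy,
    Quarantined,
    Orphaned,
}

/// Hypothetical policy shape: the field name is an assumption.
pub struct WorkerFleetPolicy {
    pub max_workers: usize,
}

/// Hypothetical error type for rejected registrations.
#[derive(Debug, PartialEq, Eq)]
pub enum RegisterError {
    AttestationRejected,
    AlreadyRegistered,
    FleetAtCapacity,
}

pub struct WorkerFleetManager {
    policy: WorkerFleetPolicy,
    workers: HashMap<String, WorkerLifecycleState>,
}

impl WorkerFleetManager {
    pub fn new(policy: WorkerFleetPolicy) -> Self {
        Self { policy, workers: HashMap::new() }
    }

    /// Fail closed: a worker only enters the ledger if every check passes.
    pub fn register_worker(
        &mut self,
        worker_id: &str,
        attestation_ok: bool,
    ) -> Result<(), RegisterError> {
        if !attestation_ok {
            return Err(RegisterError::AttestationRejected);
        }
        if self.workers.contains_key(worker_id) {
            return Err(RegisterError::AlreadyRegistered);
        }
        if self.workers.len() >= self.policy.max_workers {
            return Err(RegisterError::FleetAtCapacity);
        }
        self.workers
            .insert(worker_id.to_string(), WorkerLifecycleState::Available);
        Ok(())
    }

    /// Returns the tracked state, or `None` for an unregistered worker.
    pub fn state_of(&self, worker_id: &str) -> Option<WorkerLifecycleState> {
        self.workers.get(worker_id).copied()
    }
}
```

Checking the attestation before the capacity limit means a worker that fails verification never learns how full the fleet is; whether the real manager orders its checks this way is not stated in the source.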

### Worker Lifecycle State Machine

The following diagram illustrates the transitions managed by the `WorkerFleetManager`.

```mermaid theme={null}
stateDiagram-v2
    [*] --> Unregistered
    Unregistered --> Available : register_worker() (Attestation Valid)
    Available --> Busy : acquire_lease()
    Busy --> Available : release_lease() (Success/Clean)
    Busy --> Quarantined : release_lease() (Security Violation)
    Available --> Orphaned : Heartbeat Timeout
    Quarantined --> Unregistered : Evict/Re-attest
    Orphaned --> Available : re_register_worker()
```
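
The transitions in the diagram can also be written as a pure function that returns `None` for any edge the diagram does not list. This is a sketch under assumptions: the `LifecycleState` and `LifecycleEvent` enums and the `transition` function are hypothetical names, and the real `WorkerFleetManager` may model events differently.

```rust
/// States from the lifecycle diagram.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LifecycleState {
    Unregistered,
    Available,
    Busy,
    Quarantined,
    Orphaned,
}

/// Hypothetical event names, one per edge in the diagram.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LifecycleEvent {
    RegisterAttested,
    AcquireLease,
    ReleaseClean,
    ReleaseViolation,
    HeartbeatTimeout,
    Evict,
    ReRegister,
}

/// Returns the next state, or `None` when the diagram has no such edge.
/// Treating unknown edges as rejections keeps the machine fail-closed.
pub fn transition(state: LifecycleState, event: LifecycleEvent) -> Option<LifecycleState> {
    use LifecycleEvent::*;
    use LifecycleState::*;
    match (state, event) {
        (Unregistered, RegisterAttested) => Some(Available),
        (Available, AcquireLease) => Some(Busy),
        (Busy, ReleaseClean) => Some(Available),
        (Busy, ReleaseViolation) => Some(Quarantined),
        (Available, HeartbeatTimeout) => Some(Orphaned),
        (Quarantined, Evict) => Some(Unregistered),
        (Orphaned, ReRegister) => Some(Available),
        _ => None,
    }
}
```

Note that a `Quarantined` worker has no direct edge back to `Available`: it must be evicted and pass registration again.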

Sources: [crates/palyra-workerd/src/lib.rs#1-11](http://crates/palyra-workerd/src/lib.rs#1-11), [crates/palyra-workerd/src/lib.rs#73-91](http://crates/palyra-workerd/src/lib.rs#73-91)

## Attestation and Verification

Attestation is the process by which a worker proves its identity and integrity to the daemon. The `WorkerAttestation` struct contains claims about the worker's environment and build [crates/palyra-workerd/src/lib.rs#35-62](http://crates/palyra-workerd/src/lib.rs#35-62).

### Key Attestation Fields

| Field                   | Description                                                                                                                                             |
| :---------------------- | :------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `worker_id`             | Unique identifier (ULID) for the worker [crates/palyra-workerd/src/lib.rs#37](http://crates/palyra-workerd/src/lib.rs#37).                              |
| `image_digest_sha256`   | SHA-256 hash of the container/VM image [crates/palyra-workerd/src/lib.rs#38](http://crates/palyra-workerd/src/lib.rs#38).                               |
| `egress_proxy_attested` | Boolean indicating if the worker is bound to a verified egress proxy [crates/palyra-workerd/src/lib.rs#42](http://crates/palyra-workerd/src/lib.rs#42). |
| `wit_abi_version`       | The Wasm Interface Type ABI version supported by the worker [crates/palyra-workerd/src/lib.rs#52-54](http://crates/palyra-workerd/src/lib.rs#52-54).    |

Verification is performed by `WorkerAttestation::validate` against a `WorkerAttestationExpectation` [crates/palyra-workerd/src/lib.rs#113-117](http://crates/palyra-workerd/src/lib.rs#113-117). The system fails closed if any digest (image, build, or artifact) mismatches or if the egress proxy binding is missing when required [crates/palyra-workerd/src/lib.rs#127-151](http://crates/palyra-workerd/src/lib.rs#127-151).
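
A fail-closed check of this kind might look like the sketch below. Only `worker_id`, `image_digest_sha256`, and `egress_proxy_attested` come from the field table above; the layout of `WorkerAttestationExpectation`, the `require_egress_proxy` flag, and the `AttestationError` variants are assumptions, and the build and artifact digests mentioned in the text are omitted for brevity.

```rust
/// Subset of the attestation fields listed in the table above.
#[derive(Debug, Clone)]
pub struct WorkerAttestation {
    pub worker_id: String,
    pub image_digest_sha256: String,
    pub egress_proxy_attested: bool,
}

/// Hypothetical expectation shape: field names are assumptions.
pub struct WorkerAttestationExpectation {
    pub image_digest_sha256: String,
    pub require_egress_proxy: bool,
}

/// Hypothetical error variants, one per failed check.
#[derive(Debug, PartialEq, Eq)]
pub enum AttestationError {
    EmptyWorkerId,
    ImageDigestMismatch,
    EgressProxyNotAttested,
}

impl WorkerAttestation {
    /// Every check must pass; the first failure rejects the worker.
    pub fn validate(
        &self,
        expected: &WorkerAttestationExpectation,
    ) -> Result<(), AttestationError> {
        if self.worker_id.is_empty() {
            return Err(AttestationError::EmptyWorkerId);
        }
        if self.image_digest_sha256 != expected.image_digest_sha256 {
            return Err(AttestationError::ImageDigestMismatch);
        }
        if expected.require_egress_proxy && !self.egress_proxy_attested {
            return Err(AttestationError::EgressProxyNotAttested);
        }
        Ok(())
    }
}
```

Returning `Result<(), _>` rather than a boolean forces callers to handle the rejection path explicitly, which suits a fail-closed design.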

Sources: [crates/palyra-workerd/src/lib.rs#35-153](http://crates/palyra-workerd/src/lib.rs#35-153), [crates/palyra-workerd/tests/critical\_attack\_scenarios.rs#73-91](http://crates/palyra-workerd/tests/critical_attack_scenarios.rs#73-91)

## Node RPC and mTLS Communication

Networked workers (and other remote nodes) communicate with the daemon via the `NodeService` gRPC interface [crates/palyra-daemon/src/node\_rpc.rs#1-4](http://crates/palyra-daemon/src/node_rpc.rs#1-4).

### mTLS Enforcement

The `NodeRpcServiceImpl` enforces Mutual TLS (mTLS) to ensure only paired devices can communicate.

1. **Fingerprint Extraction**: The daemon extracts the SHA-256 fingerprint of the client certificate from the `TlsConnectInfo` [crates/palyra-daemon/src/node\_rpc.rs#80-109](http://crates/palyra-daemon/src/node_rpc.rs#80-109).
2. **Revocation Check**: The fingerprint is checked against the `IdentityManager` to ensure it hasn't been revoked [crates/palyra-daemon/src/node\_rpc.rs#115-119](http://crates/palyra-daemon/src/node_rpc.rs#115-119).
3. **Device Binding**: The daemon ensures the `device_id` in the request matches the ID bound to that specific certificate fingerprint [crates/palyra-daemon/src/node\_rpc.rs#125-155](http://crates/palyra-daemon/src/node_rpc.rs#125-155).

"Node RPC Authentication Flow"

```mermaid theme={null}
sequenceDiagram
    participant W as Worker Node
    participant G as NodeRpcServiceImpl
    participant I as IdentityManager
    participant R as NodeRuntimeState

    W->>G: gRPC Request (with Client Cert)
    G->>G: extract peer_certificate_fingerprint()
    G->>I: is_revoked_certificate_fingerprint(fp)
    I-->>G: Not Revoked
    G->>I: device_id_for_certificate_fingerprint(fp)
    I-->>G: returns device_id
    G->>G: enforce_cert_bound_device(req, device_id)
    G->>R: Update RegisteredNodeRecord
```
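
The three checks in this flow can be sketched with plain lookups. The `IdentityStore` type below is a hypothetical stand-in for the queries made against `IdentityManager`, and `authorize_node_request` and the `NodeAuthError` variants are illustrative names; extracting the fingerprint from the TLS session is assumed to have happened already.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-in for the fingerprint lookups on `IdentityManager`.
pub struct IdentityStore {
    revoked_fingerprints: HashSet<String>,
    device_by_fingerprint: HashMap<String, String>,
}

impl IdentityStore {
    pub fn new() -> Self {
        Self {
            revoked_fingerprints: HashSet::new(),
            device_by_fingerprint: HashMap::new(),
        }
    }

    /// Records that a certificate fingerprint belongs to a paired device.
    pub fn bind(&mut self, fingerprint: &str, device_id: &str) {
        self.device_by_fingerprint
            .insert(fingerprint.to_string(), device_id.to_string());
    }

    pub fn revoke(&mut self, fingerprint: &str) {
        self.revoked_fingerprints.insert(fingerprint.to_string());
    }
}

#[derive(Debug, PartialEq, Eq)]
pub enum NodeAuthError {
    MissingClientCertificate,
    RevokedCertificate,
    UnpairedCertificate,
    DeviceIdMismatch,
}

/// Runs the checks in the order shown in the diagram and rejects on the first failure.
pub fn authorize_node_request(
    store: &IdentityStore,
    peer_fingerprint: Option<&str>,
    requested_device_id: &str,
) -> Result<(), NodeAuthError> {
    // 1. A request without a client certificate is never accepted.
    let fingerprint = peer_fingerprint.ok_or(NodeAuthError::MissingClientCertificate)?;
    // 2. Revocation check.
    if store.revoked_fingerprints.contains(fingerprint) {
        return Err(NodeAuthError::RevokedCertificate);
    }
    // 3. The device_id in the request must match the one bound to the certificate.
    let bound = store
        .device_by_fingerprint
        .get(fingerprint)
        .ok_or(NodeAuthError::UnpairedCertificate)?;
    if bound.as_str() != requested_device_id {
        return Err(NodeAuthError::DeviceIdMismatch);
    }
    Ok(())
}
```

Binding the `device_id` to the certificate prevents a paired device from acting on behalf of another device simply by changing the ID in its request.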

Sources: [crates/palyra-daemon/src/node\_rpc.rs#80-155](http://crates/palyra-daemon/src/node_rpc.rs#80-155), [crates/palyra-daemon/src/node\_runtime.rs#161-170](http://crates/palyra-daemon/src/node_runtime.rs#161-170), [crates/palyra-daemon/tests/node\_rpc\_mtls.rs#76-96](http://crates/palyra-daemon/tests/node_rpc_mtls.rs#76-96)

## Lease Lifecycle and Quarantine

When an agent requires a networked tool execution, it requests a lease from the `WorkerFleetManager`.

1. **Grant Authorization**: An approval grant (`WorkerApprovalGrant`) must exist for the specific `run_id` [crates/palyra-workerd/src/lib.rs#173-177](http://crates/palyra-workerd/src/lib.rs#173-177).
2. **Assignment**: The manager selects an `Available` worker and transitions it to `Busy`.
3. **Workspace Scoping**: The lease includes a `WorkerWorkspaceScope` defining the root directory and allowed paths for the worker [crates/palyra-workerd/src/lib.rs#157-162](http://crates/palyra-workerd/src/lib.rs#157-162).
4. **Quarantine/Orphan Handling**:
   * **Quarantine**: If a worker crashes or violates security policy (e.g., unauthorized network access detected by the egress proxy), it is placed in the `Quarantined` state and cannot be reassigned until an operator intervenes [crates/palyra-workerd/src/lib.rs#5-6](http://crates/palyra-workerd/src/lib.rs#5-6).
   * **Orphan**: If a worker stops sending heartbeats, it is marked `Orphaned`. It returns to `Available` only through `re_register_worker()`.
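
The grant check, assignment, and workspace scoping steps above can be sketched together. Everything here is illustrative: `LeaseDesk`, `Lease`, `LeaseError`, and the single `root` field on `WorkerWorkspaceScope` are assumptions, and the path check is purely lexical (it does not resolve symlinks).

```rust
use std::collections::HashSet;
use std::path::{Component, Path, PathBuf};

/// Simplified scope: only a root directory (the real struct also lists allowed paths).
#[derive(Debug)]
pub struct WorkerWorkspaceScope {
    pub root: PathBuf,
}

impl WorkerWorkspaceScope {
    /// Lexical containment check: rejects `..` components and absolute paths outside the root.
    pub fn contains(&self, candidate: &Path) -> bool {
        if candidate.components().any(|c| matches!(c, Component::ParentDir)) {
            return false;
        }
        // Joining an absolute path replaces the root, so the prefix check then fails.
        self.root.join(candidate).starts_with(&self.root)
    }
}

#[derive(Debug)]
pub struct Lease {
    pub worker_id: String,
    pub run_id: String,
    pub scope: WorkerWorkspaceScope,
}

#[derive(Debug, PartialEq, Eq)]
pub enum LeaseError {
    NoApprovalGrant,
    NoAvailableWorker,
}

/// Hypothetical helper that holds approved run IDs and worker pools.
pub struct LeaseDesk {
    approved_runs: HashSet<String>,
    available: Vec<String>,
    busy: Vec<String>,
}

impl LeaseDesk {
    pub fn new(approved_runs: Vec<String>, available: Vec<String>) -> Self {
        Self {
            approved_runs: approved_runs.into_iter().collect(),
            available,
            busy: Vec::new(),
        }
    }

    /// Grants a lease only when an approval exists for `run_id` and a worker is free.
    pub fn acquire_lease(&mut self, run_id: &str, root: PathBuf) -> Result<Lease, LeaseError> {
        if !self.approved_runs.contains(run_id) {
            return Err(LeaseError::NoApprovalGrant);
        }
        let worker_id = self.available.pop().ok_or(LeaseError::NoAvailableWorker)?;
        // Available -> Busy
        self.busy.push(worker_id.clone());
        Ok(Lease {
            worker_id,
            run_id: run_id.to_string(),
            scope: WorkerWorkspaceScope { root },
        })
    }
}
```

The grant is checked before a worker is selected, so an unapproved run never changes fleet state.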

Sources: [crates/palyra-workerd/src/lib.rs#1-15](http://crates/palyra-workerd/src/lib.rs#1-15), [crates/palyra-workerd/src/lib.rs#157-177](http://crates/palyra-workerd/src/lib.rs#157-177)

## Egress Proxy Integration

Networked workers must boot bound to an attested egress proxy. The `EgressProxyPolicyService` provides the logic used to validate outbound requests [crates/palyra-egress-proxy/src/lib.rs#3-7](http://crates/palyra-egress-proxy/src/lib.rs#3-7).

* **Address Blocking**: It blocks requests whose target resolves to a private, loopback, or link-local address unless explicitly opted in via `allow_private_targets` [crates/palyra-egress-proxy/src/lib.rs#38-39](http://crates/palyra-egress-proxy/src/lib.rs#38-39).
* **DNS Rebinding Protection**: The service pins resolved socket addresses to prevent DNS rebinding attacks between policy evaluation and request execution [crates/palyra-egress-proxy/src/lib.rs#65-67](http://crates/palyra-egress-proxy/src/lib.rs#65-67).
* **Credential Injection**: Only vault-backed secrets can be injected into proxied headers via `CredentialBindingPlan` [crates/palyra-egress-proxy/src/lib.rs#20-29](http://crates/palyra-egress-proxy/src/lib.rs#20-29).
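
Address blocking and pinning can be illustrated with the standard library's address classification. This is a sketch under assumptions: `is_private_target`, `pin_resolved`, and `EgressError` are hypothetical names, and the real `EgressProxyPolicyService` may block additional ranges. Only the `allow_private_targets` flag comes from the text above.

```rust
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr, SocketAddr};

fn is_blocked_v4(ip: &Ipv4Addr) -> bool {
    ip.is_private() || ip.is_loopback() || ip.is_link_local() || ip.is_unspecified()
}

fn is_blocked_v6(ip: &Ipv6Addr) -> bool {
    let first = ip.segments()[0];
    ip.is_loopback()
        || ip.is_unspecified()
        || (first & 0xfe00) == 0xfc00 // unique local, fc00::/7
        || (first & 0xffc0) == 0xfe80 // link-local, fe80::/10
}

/// True for private, loopback, link-local, or unspecified targets.
pub fn is_private_target(ip: &IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => is_blocked_v4(v4),
        // An IPv4-mapped IPv6 address is judged by the IPv4 address it wraps.
        IpAddr::V6(v6) => match v6.to_ipv4_mapped() {
            Some(v4) => is_blocked_v4(&v4),
            None => is_blocked_v6(v6),
        },
    }
}

#[derive(Debug, PartialEq, Eq)]
pub enum EgressError {
    NoAddresses,
    PrivateTargetBlocked,
}

/// Evaluates policy once over the resolved addresses and returns the pinned set.
/// The caller connects only to these addresses instead of resolving the host again,
/// which closes the window a DNS rebinding attack relies on.
pub fn pin_resolved(
    resolved: &[SocketAddr],
    allow_private_targets: bool,
) -> Result<Vec<SocketAddr>, EgressError> {
    if resolved.is_empty() {
        return Err(EgressError::NoAddresses);
    }
    // Fail closed: one private address in the answer rejects the whole request.
    if !allow_private_targets && resolved.iter().any(|a| is_private_target(&a.ip())) {
        return Err(EgressError::PrivateTargetBlocked);
    }
    Ok(resolved.to_vec())
}
```

Rejecting the whole answer when any record is private is a deliberately strict choice for this sketch; it prevents an attacker from mixing one public and one internal record in the same DNS response.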

Sources: [crates/palyra-egress-proxy/src/lib.rs#1-48](http://crates/palyra-egress-proxy/src/lib.rs#1-48), [crates/palyra-egress-proxy/src/lib.rs#127-150](http://crates/palyra-egress-proxy/src/lib.rs#127-150)

## Node Runtime and Pairing

The `NodeRuntimeState` manages the persistence of node metadata and pairing requests in `node-runtime.v1.json` [crates/palyra-daemon/src/node\_runtime.rs#4-9](http://crates/palyra-daemon/src/node_runtime.rs#4-9).

* **Pairing Codes**: Operators mint a `DevicePairingCodeRecord` (PIN or QR) with a fixed TTL [crates/palyra-daemon/src/node\_runtime.rs#65-71](http://crates/palyra-daemon/src/node_runtime.rs#65-71).
* **Approval Flow**: Pairing requests enter the `PendingApproval` state. An operator must approve the request via the Console (which calls `console_approval_decision_handler`) [crates/palyra-daemon/src/transport/http/handlers/console/approvals.rs#79-84](http://crates/palyra-daemon/src/transport/http/handlers/console/approvals.rs#79-84).
* **Capability Inventory**: Once registered, nodes report their `DeviceCapabilityView`, which the daemon uses to route tool execution requests [crates/palyra-daemon/src/node\_runtime.rs#152-156](http://crates/palyra-daemon/src/node_runtime.rs#152-156).
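
The pairing-code TTL and the approval step can be sketched as follows. Only `DevicePairingCodeRecord` and `PendingApproval` are named in the source; the record's fields, the `Approved` and `Rejected` states, the function names, and the use of millisecond timestamps are assumptions made for illustration.

```rust
/// Hypothetical field layout for a pairing code with a TTL.
#[derive(Debug, Clone)]
pub struct DevicePairingCodeRecord {
    pub code: String,
    pub issued_at_unix_ms: u64,
    pub ttl_ms: u64,
}

impl DevicePairingCodeRecord {
    /// A code is expired once `now` reaches the end of its TTL window.
    pub fn is_expired(&self, now_unix_ms: u64) -> bool {
        now_unix_ms >= self.issued_at_unix_ms.saturating_add(self.ttl_ms)
    }
}

/// `PendingApproval` comes from the text; the other variants are assumed.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PairingRequestState {
    PendingApproval,
    Approved,
    Rejected,
}

#[derive(Debug, PartialEq, Eq)]
pub enum PairingError {
    CodeExpired,
    CodeMismatch,
    AlreadyDecided,
}

/// Redeeming a valid code does not pair the device; it only opens a pending request.
pub fn open_pairing_request(
    record: &DevicePairingCodeRecord,
    presented_code: &str,
    now_unix_ms: u64,
) -> Result<PairingRequestState, PairingError> {
    if record.is_expired(now_unix_ms) {
        return Err(PairingError::CodeExpired);
    }
    // Plain comparison for brevity; a production check would likely be constant-time.
    if record.code != presented_code {
        return Err(PairingError::CodeMismatch);
    }
    Ok(PairingRequestState::PendingApproval)
}

/// Applies the operator's decision; a request can be decided only once.
pub fn apply_operator_decision(
    state: PairingRequestState,
    approve: bool,
) -> Result<PairingRequestState, PairingError> {
    match state {
        PairingRequestState::PendingApproval if approve => Ok(PairingRequestState::Approved),
        PairingRequestState::PendingApproval => Ok(PairingRequestState::Rejected),
        _ => Err(PairingError::AlreadyDecided),
    }
}
```

Keeping code redemption and operator approval as separate steps means a leaked pairing code alone is not enough to join a device.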

"Code Entities: Node Management"

| System Concept      | Code Entity                  | File                                                                                                                                                      |
| :------------------ | :--------------------------- | :-------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Node Ledger         | `PersistedNodeRuntimeState`  | [crates/palyra-daemon/src/node\_runtime.rs#171-181](http://crates/palyra-daemon/src/node_runtime.rs#171-181)                                              |
| Pairing Request     | `DevicePairingRequestRecord` | [crates/palyra-daemon/src/node\_runtime.rs#134-149](http://crates/palyra-daemon/src/node_runtime.rs#134-149)                                              |
| mTLS Implementation | `NodeRpcServiceImpl`         | [crates/palyra-daemon/src/node\_rpc.rs#56-61](http://crates/palyra-daemon/src/node_rpc.rs#56-61)                                                          |
| Capability Dispatch | `CapabilityRequestRecord`    | [crates/palyra-daemon/src/node\_runtime.rs#180](http://crates/palyra-daemon/src/node_runtime.rs#180)                                                      |
| Console Handler     | `console_nodes_list_handler` | [crates/palyra-daemon/src/transport/http/handlers/console/nodes.rs#55-58](http://crates/palyra-daemon/src/transport/http/handlers/console/nodes.rs#55-58) |

Sources: [crates/palyra-daemon/src/node\_runtime.rs#1-181](http://crates/palyra-daemon/src/node_runtime.rs#1-181), [crates/palyra-daemon/src/transport/http/handlers/console/nodes.rs#55-63](http://crates/palyra-daemon/src/transport/http/handlers/console/nodes.rs#55-63)
