BrowserService is the primary gRPC interface provided by palyra-browserd to enable programmatic control over headless Chromium instances. It facilitates session-based browser automation, state observation, and profile management, serving as the bridge between the Palyra daemon (palyrad) and the underlying browser engine.
Service Definition and Transport
The service is defined in palyra/v1/browser.proto and uses tonic for its gRPC implementation in Rust. The build process uses protoc-bin-vendored to ensure consistent stub generation across platforms.
| Component | Entity | Role |
|---|---|---|
| Protocol | palyra.browser.v1.BrowserService | Service definition schemas/proto/palyra/v1/browser.proto#7-37 |
| Server | palyra-browserd | Binary implementing the gRPC server crates/palyra-browserd/Cargo.toml#9-10 |
| Engine | headless_chrome | Underlying CDP (Chrome DevTools Protocol) integration crates/palyra-browserd/Cargo.toml#17-17 |
Session Management
Browser interactions are encapsulated within a Session. Each session maintains its own lifecycle, budget constraints, and optional persistence.
Lifecycle Methods
- CreateSession: Initializes a new browser context. It accepts a SessionBudget to enforce resource limits and a profile_id for state persistence.
- CloseSession: Gracefully terminates the browser instance and cleans up temporary artifacts.
- GetSession / ListSessions: Provide metadata about active sessions, including tab counts and idle timers.
- InspectSession: A diagnostic method that can retrieve cookies, local storage, and action logs for debugging.
Session Resource Governance
The SessionBudget message is critical for security and stability: it defines hard limits on the automation environment.
| Field | Description |
|---|---|
| max_navigation_timeout_ms | Maximum time allowed for a page load schemas/proto/palyra/v1/browser.proto#53-53 |
| max_screenshot_bytes | Quota for image capture size schemas/proto/palyra/v1/browser.proto#55-55 |
| max_actions_per_session | Total automation steps allowed before auto-close schemas/proto/palyra/v1/browser.proto#59-59 |
| max_network_log_entries | Buffer limit for network activity tracking schemas/proto/palyra/v1/browser.proto#65-65 |
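As a rough illustration of how limits like these could be enforced, the sketch below mirrors the four budget fields in a plain Rust struct and checks two of them. The `u64` field types, the `SessionUsage` counter, the `BudgetError` enum, and both check functions are assumptions made for this example; the generated message type and the enforcement logic in palyra-browserd may look different. The sketch also returns an error when the action budget is spent rather than closing the session.

```rust
/// Hypothetical mirror of the SessionBudget fields listed in the table.
/// Field types are assumed; the real proto may use different widths.
#[derive(Debug, Clone)]
pub struct SessionBudget {
    pub max_navigation_timeout_ms: u64,
    pub max_screenshot_bytes: u64,
    pub max_actions_per_session: u64,
    pub max_network_log_entries: u64,
}

/// Hypothetical error type for budget violations.
#[derive(Debug, PartialEq, Eq)]
pub enum BudgetError {
    ActionsExhausted { limit: u64 },
    ScreenshotTooLarge { size: u64, limit: u64 },
}

/// Hypothetical per-session counter (not part of the documented API).
#[derive(Debug, Default)]
pub struct SessionUsage {
    pub actions_taken: u64,
}

impl SessionUsage {
    /// Records one automation step, or fails once the budget is spent.
    pub fn record_action(&mut self, budget: &SessionBudget) -> Result<(), BudgetError> {
        if self.actions_taken >= budget.max_actions_per_session {
            return Err(BudgetError::ActionsExhausted {
                limit: budget.max_actions_per_session,
            });
        }
        self.actions_taken += 1;
        Ok(())
    }
}

/// Rejects a screenshot whose encoded size exceeds the byte quota.
pub fn check_screenshot(size: u64, budget: &SessionBudget) -> Result<(), BudgetError> {
    if size > budget.max_screenshot_bytes {
        Err(BudgetError::ScreenshotTooLarge {
            size,
            limit: budget.max_screenshot_bytes,
        })
    } else {
        Ok(())
    }
}
```

Keeping the checks as pure functions over the budget makes each limit easy to unit test without a running browser.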
Automation and Observation
The API provides standard automation primitives that map to headless_chrome actions.
Core Actions
- Navigate: Directs the browser to a URL.
- Click: Simulates mouse events on elements matched by a selector.
- Type: Injects keyboard events into input fields.
- Scroll: Adjusts the window or element scroll position.
- WaitFor: Pauses execution until a DOM element matches a selector or a timeout occurs.
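The WaitFor semantics (return as soon as a selector matches, otherwise give up at a timeout) can be sketched as a generic polling loop. This is only an illustration: `wait_for`, `WaitOutcome`, and the `probe` closure, which stands in for "does any DOM element match the selector?", are hypothetical names, and the service may implement waiting differently.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Outcome of a wait: the condition matched, or the deadline passed first.
#[derive(Debug, PartialEq, Eq)]
pub enum WaitOutcome {
    Matched,
    TimedOut,
}

/// Polls `probe` until it returns true or `timeout` elapses.
/// `probe` is a stand-in for a selector lookup against the page.
pub fn wait_for<F: FnMut() -> bool>(
    mut probe: F,
    timeout: Duration,
    poll_interval: Duration,
) -> WaitOutcome {
    let deadline = Instant::now() + timeout;
    loop {
        if probe() {
            return WaitOutcome::Matched;
        }
        let now = Instant::now();
        if now >= deadline {
            return WaitOutcome::TimedOut;
        }
        // Never sleep past the deadline.
        sleep(poll_interval.min(deadline - now));
    }
}
```

The probe is always evaluated at least once, so a zero timeout still reports a match when the element is already present.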
Observation API
To allow AI agents to “see” the page, the service provides:
- Screenshot: Returns a PNG/JPEG buffer of the current viewport.
- Observe: Provides a structural snapshot of the DOM, often filtered for accessibility or visibility.
- NetworkLog: Retrieves a list of HTTP requests/responses captured during the session.
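One way to keep a network log within a fixed entry budget such as max_network_log_entries is a bounded queue that evicts the oldest entry when full. The sketch below assumes that eviction policy; the document does not say whether the service drops old entries or rejects new ones. `NetworkEntry` and `NetworkLogBuffer` are hypothetical types, not the fields of the real NetworkLog response.

```rust
use std::collections::VecDeque;

/// Hypothetical log entry; the real response fields are not shown here.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct NetworkEntry {
    pub method: String,
    pub url: String,
    pub status: u16,
}

/// Bounded log that keeps only the most recent `capacity` entries.
pub struct NetworkLogBuffer {
    capacity: usize,
    entries: VecDeque<NetworkEntry>,
}

impl NetworkLogBuffer {
    pub fn new(capacity: usize) -> Self {
        Self {
            capacity,
            entries: VecDeque::with_capacity(capacity),
        }
    }

    /// Appends an entry, evicting the oldest one when the buffer is full.
    pub fn push(&mut self, entry: NetworkEntry) {
        if self.capacity == 0 {
            return; // A zero budget records nothing.
        }
        if self.entries.len() == self.capacity {
            self.entries.pop_front();
        }
        self.entries.push_back(entry);
    }

    /// Returns the retained entries, oldest first.
    pub fn snapshot(&self) -> Vec<NetworkEntry> {
        self.entries.iter().cloned().collect()
    }
}
```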
Code Entity Mapping: Action Flow
The following diagram illustrates how a gRPC request traverses the system into the browser engine.
[Diagram: Browser Action Execution Flow]
Sources: schemas/proto/palyra/v1/browser.proto#19-27, schemas/generated/rust/protocol_stubs.rs#135-149, crates/palyra-browserd/Cargo.toml#17-17
Profile and Tab Management
The service manages browser profiles (user data directories) and individual tabs within a session.
- Profiles: CreateProfile, RenameProfile, and DeleteProfile allow for persistent identities (logged-in states) across multiple sessions.
- Tabs: OpenTab, SwitchTab, and CloseTab allow the orchestrator to manage multi-page workflows. ListTabs returns a list of BrowserTab objects containing titles and URLs.
Extension Interoperability (RelayAction)
The RelayAction RPC is a specialized endpoint for communication with the Palyra Browser Extension. It enables the daemon to trigger actions that require a privileged extension context or to receive data initiated by a human user in the browser.
Relay Payloads
- RelayOpenTabPayload: Requests the extension to open a specific URL in the user’s active browser.
- RelayCaptureSelectionPayload: Retrieves text currently selected by the user.
- RelayPageSnapshotPayload: Requests a full DOM/MHTML snapshot from the extension’s perspective.
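On the Rust side, a set of alternative payloads like this is naturally modeled as an enum, which is also how prost represents a protobuf oneof. The sketch below is a hand-written approximation: the `RelayPayload` enum, its `url` field, and the `relay_kind` and `validate` helpers are assumptions for illustration and do not reproduce the generated stubs.

```rust
/// Hypothetical hand-written model of the three relay payload kinds.
/// Only the variant names follow the document; fields are assumed.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum RelayPayload {
    OpenTab { url: String },
    CaptureSelection,
    PageSnapshot,
}

/// Returns a short label naming the relay operation.
pub fn relay_kind(payload: &RelayPayload) -> &'static str {
    match payload {
        RelayPayload::OpenTab { .. } => "open_tab",
        RelayPayload::CaptureSelection => "capture_selection",
        RelayPayload::PageSnapshot => "page_snapshot",
    }
}

/// Performs a minimal sanity check before a payload is relayed.
pub fn validate(payload: &RelayPayload) -> Result<(), String> {
    match payload {
        RelayPayload::OpenTab { url } if url.is_empty() => {
            Err("open_tab requires a non-empty url".to_string())
        }
        _ => Ok(()),
    }
}
```

Matching exhaustively on the enum means that adding a new payload kind forces every dispatch site to handle it at compile time.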
API Summary Table
| RPC Method | Request Type | Response Type | Primary Function |
|---|---|---|---|
| Health | BrowserHealthRequest | BrowserHealthResponse | Monitor daemon uptime and load 8-8 |
| CreateSession | CreateSessionRequest | CreateSessionResponse | Initialize sandbox 9-9 |
| Navigate | NavigateRequest | NavigateResponse | URL transition 19-19 |
| Screenshot | ScreenshotRequest | ScreenshotResponse | Visual capture 25-25 |
| Observe | ObserveRequest | ObserveResponse | Semantic page analysis 26-26 |
| RelayAction | RelayActionRequest | RelayActionResponse | Extension bridge 35-35 |