Clients
The fetch clients that retrieve page content.
A client is what actually retrieves a URL's content. A job sets one client under config.fetcher.client. This page catalogs the available kinds and their parameters.
How the client is configured
config.fetcher.client is a single object with a kind and, for the browser clients, a params object. The standard client retrieves raw HTML over plain HTTP and runs no JavaScript. The browser clients drive a real browser and render JavaScript, at a much higher cost per fetch.
The playwright and cdp clients run a browser inside the fetcher, so they require the browser image variant inndx-image:<tag>-browsers. The remote_cdp client connects to a browser running elsewhere over a WebSocket and does not need the browser image. See Images and the guide Fetch with a browser.
```yaml
config:
  fetcher:
    client:
      kind: standard
```

standard
Retrieves content over plain HTTP. It is the fastest and cheapest client and runs no JavaScript. It takes no parameters.
```yaml
fetcher:
  client:
    kind: standard
```

playwright
Drives a full browser through Playwright, rendering JavaScript.
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| javascript_enabled | boolean | no | true | Whether to execute JavaScript. |
| ignore_https_errors | boolean | no | false | Whether to ignore HTTPS errors such as self-signed certificates. |
| screenshot | Screenshot | no | none | Screenshot capture options. |
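
As an illustrative sketch that uses only the fields in the table above, a job fetching from a host with a self-signed certificate (a hypothetical staging site, for example) could turn on ignore_https_errors and leave the other fields at their defaults:

```yaml
fetcher:
  client:
    kind: playwright
    params:
      # Accept self-signed or otherwise invalid certificates (default: false).
      ignore_https_errors: true
```

Because javascript_enabled defaults to true and screenshot defaults to none, omitting them here should still render JavaScript and capture no screenshot.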
```yaml
fetcher:
  client:
    kind: playwright
    params:
      javascript_enabled: true
      screenshot:
        enabled: true
        full_page: true
```

cdp
Drives a Chrome instance directly over the Chrome DevTools Protocol, rendering JavaScript with finer control than Playwright.
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| javascript_enabled | boolean | no | true | Whether to execute JavaScript. |
| patch_fingerprint | boolean | no | true | Whether to patch the browser fingerprint. |
| intercept_resources | list of resource types | no | empty | Resource types to block from loading, which speeds up rendering when you do not need them. |
| screenshot | Screenshot | no | none | Screenshot capture options. |
The intercept_resources values are document, stylesheet, image, media, font, script, text_track, xhr, fetch, event_source, web_socket, and other.
```yaml
fetcher:
  client:
    kind: cdp
    params:
      javascript_enabled: true
      intercept_resources:
        - image
        - media
        - font
      patch_fingerprint: true
```

remote_cdp
Connects to a browser running outside the fetcher over the Chrome DevTools Protocol via a WebSocket. It accepts the same rendering options as cdp, plus a selector for which external browser pool to use.
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| groups | list of strings | no | none | The external browser groups to connect to. |
| javascript_enabled | boolean | no | true | Whether to execute JavaScript. |
| patch_fingerprint | boolean | no | true | Whether to patch the browser fingerprint. |
| intercept_resources | list of resource types | no | empty | Resource types to block from loading. Same values as cdp. |
| screenshot | Screenshot | no | none | Screenshot capture options. |
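
A sketch of how the remote_cdp fields in the table above can combine. It assumes an external browser group named default exists in your deployment; group names are deployment-specific, so treat that value as a placeholder:

```yaml
fetcher:
  client:
    kind: remote_cdp
    params:
      groups:
        - default            # placeholder group name; use one that exists in your setup
      intercept_resources:   # same resource type values as the cdp client
        - image
        - font
      screenshot:
        enabled: true
```

With image and font blocked, a captured screenshot would presumably show the page without those assets, so combine the two options only when that is acceptable.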
```yaml
fetcher:
  client:
    kind: remote_cdp
    params:
      groups:
        - default
      javascript_enabled: true
```

Screenshot
The screenshot options used by the playwright, cdp, and remote_cdp clients.
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| enabled | boolean | no | false | Whether to capture a screenshot during the fetch. |
| full_page | boolean | no | true | Whether to capture the full page rather than just the visible viewport. |
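
Since enabled defaults to false and full_page defaults to true, a viewport-only capture has to set both fields explicitly. A minimal sketch, shown here with the cdp client (the same screenshot object applies to playwright and remote_cdp):

```yaml
fetcher:
  client:
    kind: cdp
    params:
      screenshot:
        enabled: true      # default is false, so capture must be opted into
        full_page: false   # capture only the visible viewport
```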