
Clients

The fetch clients that retrieve page content.

A client is what actually retrieves a URL's content. A job sets one client under config.fetcher.client. This page catalogs the available kinds and their parameters.

How the client is configured

config.fetcher.client is a single object with a kind and, for the browser clients, a params object. The standard client retrieves raw HTML over plain HTTP and runs no JavaScript. The browser clients drive a real browser and render JavaScript, at a much higher cost per fetch.

The playwright and cdp clients run a browser inside the fetcher, so they require the browser image variant inndx-image:<tag>-browsers. The remote_cdp client connects to a browser running elsewhere over a WebSocket and does not need the browser image. See Images and the guide Fetch with a browser.

config:
  fetcher:
    client:
      kind: standard
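
For contrast, here is a minimal sketch of the same path with a browser client, where the params object sits next to kind. The field values are illustrative and use only parameters documented further down this page.

```yaml
# Sketch: a browser client under config.fetcher.client.
# Browser clients take a params object; the standard client does not.
config:
  fetcher:
    client:
      kind: cdp
      params:
        javascript_enabled: true
```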

standard

Retrieves content over plain HTTP. It is the fastest and cheapest client and runs no JavaScript. It takes no parameters.

fetcher:
  client:
    kind: standard

playwright

Drives a full browser through Playwright, rendering JavaScript.

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| javascript_enabled | boolean | no | true | Whether to execute JavaScript. |
| ignore_https_errors | boolean | no | false | Whether to ignore HTTPS errors such as self-signed certificates. |
| screenshot | Screenshot | no | none | Screenshot capture options. |

fetcher:
  client:
    kind: playwright
    params:
      javascript_enabled: true
      screenshot:
        enabled: true
        full_page: true
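
The example above leaves ignore_https_errors at its default. As a sketch, a job that fetches from a host with a self-signed certificate might enable it. The scenario is an assumption for illustration; only fields from the playwright table are used.

```yaml
# Sketch: playwright client for a host with a self-signed certificate.
fetcher:
  client:
    kind: playwright
    params:
      javascript_enabled: true
      ignore_https_errors: true   # default is false
```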

cdp

Drives a Chrome instance directly over the Chrome DevTools Protocol, rendering JavaScript with finer control than Playwright.

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| javascript_enabled | boolean | no | true | Whether to execute JavaScript. |
| patch_fingerprint | boolean | no | true | Whether to patch the browser fingerprint. |
| intercept_resources | list of resource types | no | empty | Resource types to block from loading, which speeds up rendering when you do not need them. |
| screenshot | Screenshot | no | none | Screenshot capture options. |

The intercept_resources values are document, stylesheet, image, media, font, script, text_track, xhr, fetch, event_source, web_socket, and other.

fetcher:
  client:
    kind: cdp
    params:
      javascript_enabled: true
      intercept_resources:
        - image
        - media
        - font
      patch_fingerprint: true
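
A more aggressive variant, sketched here for jobs that only need rendered text, also blocks stylesheets. Whether a given page still renders the content you need without them depends on the site, so treat this list as a starting point, not a recommendation.

```yaml
# Sketch: cdp client that blocks everything not needed for text extraction.
# All values come from the intercept_resources list above.
fetcher:
  client:
    kind: cdp
    params:
      javascript_enabled: true
      intercept_resources:
        - stylesheet
        - image
        - media
        - font
```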

remote_cdp

Connects to a browser running outside the fetcher over the Chrome DevTools Protocol via a WebSocket. It accepts the same rendering options as cdp, plus a groups parameter that selects which external browser groups to connect to.

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| groups | list of strings | no | none | The external browser groups to connect to. |
| javascript_enabled | boolean | no | true | Whether to execute JavaScript. |
| patch_fingerprint | boolean | no | true | Whether to patch the browser fingerprint. |
| intercept_resources | list of resource types | no | empty | Resource types to block from loading. Same values as cdp. |
| screenshot | Screenshot | no | none | Screenshot capture options. |

fetcher:
  client:
    kind: remote_cdp
    params:
      groups:
        - default
      javascript_enabled: true
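
Because groups is a list and remote_cdp accepts the cdp rendering options, a fuller configuration might look like the sketch below. The group name backup is hypothetical, and this page does not describe how the client chooses among several groups.

```yaml
# Sketch: remote_cdp with two groups and resource blocking.
# "backup" is a hypothetical group name used only for illustration.
fetcher:
  client:
    kind: remote_cdp
    params:
      groups:
        - default
        - backup
      javascript_enabled: true
      intercept_resources:
        - image
        - media
```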

Screenshot

The screenshot options used by the playwright, cdp, and remote_cdp clients.

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| enabled | boolean | no | false | Whether to capture a screenshot during the fetch. |
| full_page | boolean | no | true | Whether to capture the full page rather than just the visible viewport. |
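
Since full_page defaults to true, a viewport-only capture has to turn it off explicitly. A minimal sketch using the cdp client (the same screenshot object applies to playwright and remote_cdp):

```yaml
# Sketch: capture only the visible viewport instead of the full page.
fetcher:
  client:
    kind: cdp
    params:
      screenshot:
        enabled: true      # default is false
        full_page: false   # default is true
```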
