Navigators

Pipeline components that discover child links to widen the crawl.

A navigator finds new links in a fetched page, and the orchestrator inserts them into the frontier. Each pipeline sets at most one navigator. This page catalogs the available kinds.

How the navigator is configured

A pipeline's navigator is a single object with a kind. The links a navigator discovers are fed back to the orchestrator, where the job's filters and mutators decide which of them are admitted. A pipeline without a navigator discovers no new links and so does not widen the crawl.

pipelines:
  - navigator:
      kind: anchor
    steps:
      - kind: extractor
        params:
          kind: markdown
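
The feedback loop described in this section can be sketched in a few lines of Python. This is an illustration only, not inndx's implementation: the names crawl, fetch, navigate, and admit are hypothetical, mutators are left out, and a single predicate stands in for the job's filters.

```python
from collections import deque
from typing import Callable, Iterable, Optional


def crawl(
    seeds: Iterable[str],
    fetch: Callable[[str], str],
    navigate: Optional[Callable[[str, str], Iterable[str]]],
    admit: Callable[[str], bool],
    max_pages: int = 100,
) -> list[str]:
    """Visit pages breadth-first, widening the frontier with admitted links."""
    frontier = deque(seeds)
    seen = set(frontier)  # URLs already queued, so none is queued twice
    visited = []
    while frontier and len(visited) < max_pages:
        url = frontier.popleft()
        page = fetch(url)
        visited.append(url)
        if navigate is None:
            # No navigator: nothing new is discovered, the crawl stays at the seeds.
            continue
        for link in navigate(url, page):
            # The (hypothetical) filter decides which discovered links are admitted.
            if link not in seen and admit(link):
                seen.add(link)
                frontier.append(link)
    return visited
```

With navigate set to None the function visits only the seed URLs, which mirrors the statement that a pipeline without a navigator does not widen the crawl.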

anchor

Discovers links from the page's anchor tags (the href attribute of each <a> element). It takes no parameters.

navigator:
  kind: anchor
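
To make the behavior concrete, here is a minimal Python sketch of anchor-based discovery using only the standard library. It is not inndx's code: AnchorCollector and discover_links are hypothetical names, and resolving relative hrefs against the page URL is an assumption about what a crawler would want, not something this page states.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class AnchorCollector(HTMLParser):
    """Collects the href of every <a> tag seen while parsing."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Only anchor tags count; <link>, <img>, and others are ignored.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Assumption: relative hrefs are resolved against the page URL.
                self.links.append(urljoin(self.base_url, value))


def discover_links(base_url, html):
    """Return the absolute URLs found in the anchor tags of an HTML string."""
    collector = AnchorCollector(base_url)
    collector.feed(html)
    collector.close()
    return collector.links
```

Anchors without an href (for example, named anchors) yield no link in this sketch, and duplicates are kept in document order; deduplication is left to whatever consumes the list.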
