Navigators
Pipeline components that discover child links to widen the crawl.
A navigator finds new links in a fetched page, and the orchestrator inserts those links into the frontier. A pipeline sets at most one navigator. This page catalogs the available kinds.
How the navigator is configured
A pipeline's navigator is a single object with a kind. The links a navigator discovers are fed back to the orchestrator, where the job's filters and mutators decide which of them are admitted. A pipeline without a navigator discovers no new links and so does not widen the crawl.
pipelines:
  - navigator:
      kind: anchor
    steps:
      - kind: extractor
        params:
          kind: markdownanchor
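The feedback loop described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the orchestrator's actual code: the function name `admit_links`, the order in which mutators and filters run, and the `seen` set used for deduplication are all hypothetical.

```python
from collections import deque
from typing import Callable, Deque, Iterable, List, Set

# Hypothetical shapes: a mutator rewrites a URL, a filter accepts or rejects it.
Mutator = Callable[[str], str]
Filter = Callable[[str], bool]


def admit_links(
    discovered: Iterable[str],
    mutators: List[Mutator],
    filters: List[Filter],
    frontier: Deque[str],
    seen: Set[str],
) -> List[str]:
    """Pass navigator output through mutators and filters, then enqueue it."""
    admitted = []
    for url in discovered:
        # Assumption: mutators run first, so filters see the rewritten URL.
        for mutate in mutators:
            url = mutate(url)
        # Assumption: a URL is admitted once, and only if every filter accepts it.
        if url in seen or not all(accept(url) for accept in filters):
            continue
        seen.add(url)
        frontier.append(url)
        admitted.append(url)
    return admitted


frontier: Deque[str] = deque()
seen: Set[str] = set()
links = [
    "https://example.com/a#top",
    "https://example.com/a",
    "https://other.test/b",
]
strip_fragment = lambda u: u.split("#", 1)[0]             # example mutator
same_host = lambda u: u.startswith("https://example.com/")  # example filter

result = admit_links(links, [strip_fragment], [same_host], frontier, seen)
print(result)
```

With no navigator there is nothing to pass as `discovered`, so the frontier never grows beyond its seed entries.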
The anchor navigator discovers links from the page's anchor tags (the href attribute of each anchor element). It takes no parameters.
navigator:
  kind: anchor
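To make the behavior concrete, here is a minimal sketch of anchor-based link discovery using only the Python standard library. It is not the navigator's implementation: the names `AnchorCollector` and `discover_anchor_links` are hypothetical, and resolving relative hrefs against the page URL is an assumption about what a crawler would want.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class AnchorCollector(HTMLParser):
    """Collects the href of every <a> tag seen while parsing."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            # Skip anchors without an href (or with an empty one).
            if name == "href" and value:
                # Assumption: relative links are resolved against the page URL.
                self.links.append(urljoin(self.base_url, value))


def discover_anchor_links(html: str, base_url: str) -> list:
    collector = AnchorCollector(base_url)
    collector.feed(html)
    collector.close()
    return collector.links


page = (
    '<p><a href="/docs">Docs</a> '
    '<a name="top">no href</a> '
    '<a href="next.html">Next</a></p>'
)
found = discover_anchor_links(page, "https://example.com/guide/index.html")
print(found)
```

The list returned here corresponds to what the navigator hands back to the orchestrator; deciding which of those links are actually crawled remains the job of the filters and mutators.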