Jobs

Create, read, update, delete, and list crawl jobs.

A crawl job is the stored definition of a crawl: a name, a set of labels, and a configuration. The configuration uses the same shape as a crawl manifest's config object. Running a job produces runs.

These endpoints follow the API's shared pagination and error conventions.

List jobs

GET /v1/jobs

Returns a page of crawl jobs, most recent first.

Query parameters (in: query)

page: Pagination cursor. See pagination.
per_page: Maximum number of jobs to return. (min: 1, max: 100)
include_ids: Return only jobs with these IDs. (format: uuid)
exclude_ids: Omit jobs with these IDs. (format: uuid)
names: Return only jobs with these exact names.
created_before: Return only jobs created before this timestamp. (format: date-time)
created_after: Return only jobs created after this timestamp. (format: date-time)
labels: Return only jobs whose labels match every supplied key and value.

Responses

curl 'http://localhost:8022/v1/jobs?per_page=20' \
  -H 'X-Tenant-Id: acme'
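The filters above combine into a single query string. The following is a minimal Python sketch of that, not an official client: `build_list_jobs_url` is a hypothetical helper, and the base URL is taken from the curl example. It covers only single-valued filters, because this page does not say how multi-valued filters (include_ids, exclude_ids, names, labels) are encoded.

```python
from datetime import datetime, timezone
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "http://localhost:8022"  # taken from the curl examples on this page


def build_list_jobs_url(
    per_page: int = 20,
    created_after: Optional[datetime] = None,
    created_before: Optional[datetime] = None,
    page: Optional[str] = None,
) -> str:
    """Build a GET /v1/jobs URL from single-valued filters (hypothetical helper)."""
    # per_page is documented as min 1, max 100; fail early on anything else.
    if not 1 <= per_page <= 100:
        raise ValueError("per_page must be between 1 and 100")

    params = {"per_page": str(per_page)}
    if page is not None:
        params["page"] = page  # pagination cursor; see the pagination page
    # Timestamps are documented as date-time; isoformat() on a timezone-aware
    # datetime produces an RFC 3339 style string.
    if created_after is not None:
        params["created_after"] = created_after.isoformat()
    if created_before is not None:
        params["created_before"] = created_before.isoformat()

    # urlencode percent-escapes characters such as ":" and "+".
    return f"{BASE_URL}/v1/jobs?{urlencode(params)}"


url = build_list_jobs_url(
    per_page=50,
    created_after=datetime(2024, 1, 1, tzinfo=timezone.utc),
)
print(url)
```

The resulting URL can be passed to curl or any HTTP client together with the X-Tenant-Id header.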

Create a job

POST /v1/jobs

Creates a crawl job. Optionally creates and starts its first run in the same request.

Query parameters (in: query)

create_run: When true, create an initial run for the job immediately.

Request body (in: body)

name: A human-readable name for the job. (minLength: 1, maxLength: 255)
config: The crawl configuration. This is the config object of a crawl manifest; see that page for every field.
labels: Arbitrary string key and value pairs attached to the job.
id: The ID to assign. Generated when omitted. (format: uuid)

Responses

curl -X POST 'http://localhost:8022/v1/jobs?create_run=true' \
  -H 'X-Tenant-Id: acme' \
  -H 'Content-Type: application/json' \
  -d '{
    "name": "docs-crawl",
    "config": { "seeds": [{ "kind": "sitemap", "params": { "url": "https://example.com/sitemap.xml" } }] }
  }'
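The same request can be assembled programmatically. The sketch below prepares, but does not send, the POST request using only the Python standard library. `build_create_job_request` is a hypothetical helper; the base URL, tenant header, and config value are copied from the curl example, and the name length check mirrors the documented minLength and maxLength.

```python
import json
import uuid
from typing import Optional
from urllib.request import Request

BASE_URL = "http://localhost:8022"  # taken from the curl examples on this page


def build_create_job_request(
    name: str,
    config: dict,
    labels: Optional[dict] = None,
    job_id: Optional[uuid.UUID] = None,
    create_run: bool = False,
    tenant_id: str = "acme",
) -> Request:
    """Prepare (but do not send) a POST /v1/jobs request. Hypothetical helper."""
    # name is documented with minLength 1 and maxLength 255.
    if not 1 <= len(name) <= 255:
        raise ValueError("name must be 1 to 255 characters long")

    body = {"name": name, "config": config}
    if labels is not None:
        body["labels"] = labels
    if job_id is not None:
        body["id"] = str(job_id)  # when omitted, the ID is generated

    url = f"{BASE_URL}/v1/jobs"
    if create_run:
        url += "?create_run=true"  # also create an initial run

    return Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"X-Tenant-Id": tenant_id, "Content-Type": "application/json"},
        method="POST",
    )


request = build_create_job_request(
    name="docs-crawl",
    config={
        "seeds": [
            {"kind": "sitemap", "params": {"url": "https://example.com/sitemap.xml"}}
        ]
    },
    create_run=True,
)
print(request.get_method(), request.full_url)
```

To send it, pass the prepared object to `urllib.request.urlopen(request)`. Keeping construction separate from sending makes the payload easy to inspect before it reaches the server.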

Get a job

GET /v1/jobs/{job_id}

Returns a single crawl job by ID. The {job_id} path segment is the job's UUID.

Responses

curl 'http://localhost:8022/v1/jobs/3f1a…' \
  -H 'X-Tenant-Id: acme'

Update a job

PUT /v1/jobs/{job_id}

Updates a crawl job. Only the supplied fields are changed; omitted fields are left as they are.

Request body (in: body)

name: A new name for the job. (minLength: 1, maxLength: 255)
config: A replacement crawl configuration. See crawl manifest.
labels: Replacement labels for the job.

Responses

curl -X PUT 'http://localhost:8022/v1/jobs/3f1a…' \
  -H 'X-Tenant-Id: acme' \
  -H 'Content-Type: application/json' \
  -d '{ "name": "docs-crawl-v2" }'
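Because omitted fields are left unchanged, a client should serialize only the fields it means to change. A minimal Python sketch of that idea follows; `build_update_body` is a hypothetical helper, and treating `None` as "not supplied" is a choice made for this example rather than something the API specifies.

```python
import json
from typing import Optional


def build_update_body(
    name: Optional[str] = None,
    config: Optional[dict] = None,
    labels: Optional[dict] = None,
) -> str:
    """Serialize only the fields the caller supplied (hypothetical helper)."""
    # name is documented with minLength 1 and maxLength 255.
    if name is not None and not 1 <= len(name) <= 255:
        raise ValueError("name must be 1 to 255 characters long")

    candidates = {"name": name, "config": config, "labels": labels}
    # Drop fields left at None so the job keeps its current values for them.
    body = {key: value for key, value in candidates.items() if value is not None}
    return json.dumps(body)


print(build_update_body(name="docs-crawl-v2"))
# prints {"name": "docs-crawl-v2"}
```

An empty dict passed as `labels` is still sent, since it is a supplied value; how the server treats an empty replacement is not stated on this page.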

Delete a job

DELETE /v1/jobs/{job_id}

Deletes a crawl job.

Responses

curl -X DELETE 'http://localhost:8022/v1/jobs/3f1a…' \
  -H 'X-Tenant-Id: acme'

Responses

List jobs

200: A page of jobs.

Create a job

200: The created job.
422: The request body failed validation.

Get a job

200: The job.
404: No job with that ID exists.

Update a job

200: The updated job.
404: No job with that ID exists.
422: The request body could not be processed.

Delete a job

204: The job was deleted.
404: No job with that ID exists.
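One way a client might act on these status codes is to keep them in a lookup table and raise on anything that is not a success. The sketch below is illustrative only: `DOCUMENTED_RESPONSES`, `JobApiError`, and `check_response` are hypothetical names, and the table contains just the codes listed on this page, so other codes the server may return are reported as undocumented.

```python
# Status codes listed on this page, keyed by operation.
DOCUMENTED_RESPONSES = {
    "list": {200: "A page of jobs."},
    "create": {
        200: "The created job.",
        422: "The request body failed validation.",
    },
    "get": {200: "The job.", 404: "No job with that ID exists."},
    "update": {
        200: "The updated job.",
        404: "No job with that ID exists.",
        422: "The request body could not be processed.",
    },
    "delete": {204: "The job was deleted.", 404: "No job with that ID exists."},
}


class JobApiError(Exception):
    """Raised for any non-success status (hypothetical error type)."""

    def __init__(self, operation: str, status: int, message: str) -> None:
        super().__init__(f"{operation} failed with {status}: {message}")
        self.operation = operation
        self.status = status


def check_response(operation: str, status: int) -> str:
    """Return the documented description for a success, raise otherwise."""
    message = DOCUMENTED_RESPONSES[operation].get(status, "Undocumented status.")
    if 200 <= status < 300:
        return message
    raise JobApiError(operation, status, message)


print(check_response("delete", 204))
# prints The job was deleted.
```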
