Jobs
Create, read, update, delete, and list crawl jobs.
A crawl job is the stored definition of a crawl: a name, a set of labels, and a configuration. The configuration uses the same shape as a crawl manifest's config object. Running a job produces runs.
These endpoints follow the API's shared pagination and error conventions.
List jobs
GET /v1/jobs
Returns a page of crawl jobs, most recent first.
- page string
Pagination cursor. See pagination.
- per_page number, default: 10
Maximum number of jobs to return.
min: 1, max: 100
- include_ids string[]
Return only jobs with these IDs.
format: uuid
- exclude_ids string[]
Omit jobs with these IDs.
format: uuid
- names string[]
Return only jobs with these exact names.
- created_before string
Return only jobs created before this timestamp.
format: date-time
- created_after string
Return only jobs created after this timestamp.
format: date-time
- labels object
Return only jobs whose labels match every supplied key and value.
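The filters above combine into a single query string. The following is a minimal Python sketch of building that URL, not an official client. The helper name `list_jobs_url` is hypothetical, and sending array parameters as repeated keys (`names=a&names=b`) is an assumption, since this section does not specify how arrays or the `labels` object are serialized. The `labels` filter is left out for that reason.

```python
from datetime import datetime, timezone
from typing import Optional, Sequence
from urllib.parse import urlencode

BASE_URL = "http://localhost:8022"  # host and port taken from the curl examples


def list_jobs_url(
    per_page: int = 10,
    page: Optional[str] = None,
    names: Sequence[str] = (),
    created_after: Optional[datetime] = None,
    created_before: Optional[datetime] = None,
) -> str:
    """Build the URL for one page of GET /v1/jobs (hypothetical helper)."""
    # per_page is documented as min: 1, max: 100.
    if not 1 <= per_page <= 100:
        raise ValueError("per_page must be between 1 and 100")
    params = [("per_page", per_page)]
    if page is not None:
        params.append(("page", page))
    # Assumption: array parameters are sent as repeated keys.
    for name in names:
        params.append(("names", name))
    # Timestamps are documented as format: date-time, so send ISO 8601 strings.
    if created_after is not None:
        params.append(("created_after", created_after.isoformat()))
    if created_before is not None:
        params.append(("created_before", created_before.isoformat()))
    return f"{BASE_URL}/v1/jobs?{urlencode(params)}"


if __name__ == "__main__":
    since = datetime(2024, 1, 1, tzinfo=timezone.utc)
    print(list_jobs_url(per_page=20, names=["docs-crawl"], created_after=since))
```

Requests built this way still need the `X-Tenant-Id` header shown in the curl example.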
curl 'http://localhost:8022/v1/jobs?per_page=20' \
-H 'X-Tenant-Id: acme'
Create a job
POST /v1/jobs
Creates a crawl job. Optionally creates and starts its first run in the same request.
- create_run boolean, default: false
When true, create an initial run for the job immediately. Passed as a query parameter.
- name string, required
A human-readable name for the job.
minLength: 1, maxLength: 255
- config object, required
The crawl configuration. This is the config object of a crawl manifest; see that page for every field.
- labels object, default: {}
Arbitrary string key and value pairs attached to the job.
- id string
The ID to assign. Generated when omitted.
format: uuid
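As a rough illustration of how these fields fit together, the sketch below assembles the create request in Python using only the standard library. The function name `create_job_request` and its defaults are hypothetical. The tenant header value and base URL are copied from the curl examples, and the name-length check mirrors the documented limits on the client side only.

```python
import json
import urllib.request


def create_job_request(name, config, labels=None, job_id=None, create_run=False,
                       tenant_id="acme", base_url="http://localhost:8022"):
    """Build (but do not send) a POST /v1/jobs request (hypothetical helper)."""
    # name is documented as minLength: 1, maxLength: 255.
    if not 1 <= len(name) <= 255:
        raise ValueError("name must be 1 to 255 characters long")
    body = {"name": name, "config": config}
    # labels and id are optional; omit them so the server applies its defaults.
    if labels is not None:
        body["labels"] = labels
    if job_id is not None:
        body["id"] = str(job_id)
    url = f"{base_url}/v1/jobs"
    if create_run:
        url += "?create_run=true"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"X-Tenant-Id": tenant_id, "Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = create_job_request(
        "docs-crawl",
        {"seeds": [{"kind": "sitemap",
                    "params": {"url": "https://example.com/sitemap.xml"}}]},
        create_run=True,
    )
    print(req.get_method(), req.full_url)
    # To send it: urllib.request.urlopen(req)
```

Building the request separately from sending it keeps the payload easy to inspect before it reaches the server.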
curl -X POST 'http://localhost:8022/v1/jobs?create_run=true' \
-H 'X-Tenant-Id: acme' \
-H 'Content-Type: application/json' \
-d '{
"name": "docs-crawl",
"config": { "seeds": [{ "kind": "sitemap", "params": { "url": "https://example.com/sitemap.xml" } }] }
}'
Get a job
GET /v1/jobs/{job_id}
Returns a single crawl job by ID. The {job_id} path segment is the job's UUID.
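Because the path segment must be a UUID, a client can reject malformed IDs before making a request. A minimal sketch, assuming a hypothetical helper named `job_url`; the UUID in the usage lines is a placeholder, not a real job.

```python
import uuid


def job_url(job_id, base_url="http://localhost:8022"):
    """Return the URL for one job, validating the ID first (hypothetical helper)."""
    # uuid.UUID raises ValueError for strings that are not well-formed UUIDs
    # and normalizes accepted input to lowercase hyphenated form.
    parsed = uuid.UUID(str(job_id))
    return f"{base_url}/v1/jobs/{parsed}"


if __name__ == "__main__":
    # Placeholder UUID for illustration only.
    print(job_url("123E4567-E89B-12D3-A456-426614174000"))
```

The same URL serves the get, update, and delete endpoints; only the HTTP method differs.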
curl 'http://localhost:8022/v1/jobs/3f1a…' \
-H 'X-Tenant-Id: acme'
Update a job
PUT /v1/jobs/{job_id}
Updates a crawl job. Only the supplied fields are changed; omitted fields are left as they are.
- name string
A new name for the job.
minLength: 1, maxLength: 255
- config object
A replacement crawl configuration. See crawl manifest.
- labels object
Replacement labels for the job.
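Since omitted fields are left unchanged, the request body should contain only the keys being changed. The sketch below shows one way to build such a body in Python. The helper `update_job_request` is hypothetical, and rejecting unknown keys on the client side is a design choice of this sketch rather than documented server behavior.

```python
import json
import urllib.request

UPDATABLE_FIELDS = {"name", "config", "labels"}  # fields listed for this endpoint


def update_job_request(job_id, tenant_id="acme",
                       base_url="http://localhost:8022", **changes):
    """Build (but do not send) a PUT /v1/jobs/{job_id} request (hypothetical helper)."""
    unknown = set(changes) - UPDATABLE_FIELDS
    if unknown:
        raise ValueError(f"not updatable: {sorted(unknown)}")
    # name keeps the same 1 to 255 character limit as on creation.
    if "name" in changes and not 1 <= len(changes["name"]) <= 255:
        raise ValueError("name must be 1 to 255 characters long")
    # Only the supplied keys are serialized; everything else stays as it is.
    return urllib.request.Request(
        f"{base_url}/v1/jobs/{job_id}",
        data=json.dumps(changes).encode("utf-8"),
        headers={"X-Tenant-Id": tenant_id, "Content-Type": "application/json"},
        method="PUT",
    )


if __name__ == "__main__":
    # Placeholder job ID for illustration only.
    req = update_job_request("123e4567-e89b-12d3-a456-426614174000",
                             name="docs-crawl-v2")
    print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```

Note that a supplied `config` or `labels` value is documented as a replacement, so a caller that wants to add one label would need to send the full label set.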
curl -X PUT 'http://localhost:8022/v1/jobs/3f1a…' \
-H 'X-Tenant-Id: acme' \
-H 'Content-Type: application/json' \
-d '{ "name": "docs-crawl-v2" }'
Delete a job
DELETE /v1/jobs/{job_id}
Deletes a crawl job.
curl -X DELETE 'http://localhost:8022/v1/jobs/3f1a…' \
-H 'X-Tenant-Id: acme'