Introduction
Run the inndx crawl system inside your own infrastructure.
Self-hosted inndx is the full crawl orchestration system, licensed to run entirely within your own environment. Your crawls, configuration, and extracted data never leave your boundary.
The system is composed of four stages: the orchestrator schedules and ranks crawl work; fetchers retrieve pages with renderless and browser clients; parsers turn raw content into markdown and structured records; and sinks deliver results to S3, MongoDB, or webhooks.
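To make the flow between the four stages concrete, here is a minimal, illustrative sketch in Python. It is not the inndx API: every name in it (CrawlTask, Orchestrator, fetch, parse, ListSink, run_pipeline) is hypothetical, the fetcher returns canned HTML instead of making a network request, and the sink is an in-memory list standing in for S3, MongoDB, or a webhook. It only shows how ranked work could move from scheduling through delivery.

```python
import heapq
from dataclasses import dataclass, field
from typing import Dict, Iterator, List


@dataclass(order=True)
class CrawlTask:
    """Hypothetical unit of crawl work; only `priority` is used for ranking."""
    priority: int
    url: str = field(compare=False)


class Orchestrator:
    """Hypothetical orchestrator: schedules tasks and ranks them by priority."""

    def __init__(self) -> None:
        self._queue: List[CrawlTask] = []

    def submit(self, url: str, priority: int) -> None:
        # Lower priority numbers are handed out first (min-heap).
        heapq.heappush(self._queue, CrawlTask(priority, url))

    def drain(self) -> Iterator[CrawlTask]:
        while self._queue:
            yield heapq.heappop(self._queue)


def fetch(task: CrawlTask) -> str:
    """Stand-in fetcher: returns canned HTML rather than touching the network."""
    return f"<h1>Page at {task.url}</h1>"


def parse(url: str, raw: str) -> Dict[str, str]:
    """Stand-in parser: turns the canned HTML into a markdown record."""
    title = raw.removeprefix("<h1>").removesuffix("</h1>")
    return {"url": url, "markdown": f"# {title}"}


class ListSink:
    """Stand-in sink: collects records in memory instead of S3, MongoDB, or a webhook."""

    def __init__(self) -> None:
        self.records: List[Dict[str, str]] = []

    def deliver(self, record: Dict[str, str]) -> None:
        self.records.append(record)


def run_pipeline(orchestrator: Orchestrator, sink: ListSink) -> None:
    """Move each task through fetch -> parse -> sink, in ranked order."""
    for task in orchestrator.drain():
        raw = fetch(task)
        record = parse(task.url, raw)
        sink.deliver(record)
```

In this sketch, submitting two URLs with different priorities and calling run_pipeline leaves the sink holding one record per URL, ordered by priority. A real deployment would swap each stand-in for the corresponding licensed component.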
If you want clean data without operating infrastructure, the inndx Cloud docs cover the managed APIs.
Getting access
Self-hosting is available under an enterprise license. Deployment guides, the configuration reference, and operations runbooks are published here as part of enterprise onboarding. To start that conversation, contact [email protected].