
Output formats

The available output formats and guidance on choosing among them.

The Scrape API returns a page in one or more formats per call. Specify what you want in the formats array of your request. If you omit formats, the API defaults to markdown.
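As a rough sketch of the request shape described above, the following Python helper assembles a request body. Only the formats array and the kind key come from this page; the build_scrape_request helper and the "url" request field are assumptions made for illustration, so check them against the API reference before relying on them.

```python
import json


def build_scrape_request(url, formats=None):
    """Build a request body for a scrape call (hypothetical helper).

    Assumption: the request carries the target page under a "url" key.
    """
    body = {"url": url}
    if formats:
        # Each entry is a format object such as {"kind": "markdown"}.
        body["formats"] = list(formats)
    # If "formats" is left out, the API is documented to default to markdown.
    return body


print(json.dumps(build_scrape_request("https://example.com", [{"kind": "html"}])))
```

Leaving formats out of the body, rather than sending an empty array, keeps the request on the documented default path.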

Markdown

The page converted to clean markdown. Navigation, ads, cookie banners, and other page chrome are stripped. Headings, lists, code blocks, and links are preserved.

{ "kind": "markdown" }

Use markdown when feeding content into an LLM, building a retrieval index, or in any other case where the text structure matters more than the original markup.

Skipping tags

Pass a skip_tags array to strip specific HTML elements before the markdown conversion:

{ "kind": "markdown", "skip_tags": ["nav", "footer", "aside"] }

This is useful when a page has persistent elements like sidebars or related-article sections that add noise to the output.
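If you assemble format objects in code, a small helper can keep the skip_tags handling in one place. This is a minimal sketch; markdown_format is a hypothetical name, not part of any SDK, and it only produces the object shape shown above.

```python
def markdown_format(skip_tags=None):
    """Return a markdown format object, optionally with skip_tags (hypothetical helper)."""
    spec = {"kind": "markdown"}
    if skip_tags:
        # Copy into a list so callers can pass any iterable of tag names.
        spec["skip_tags"] = list(skip_tags)
    return spec


# Strip navigation, footer, and sidebar elements before conversion.
formats = [markdown_format(["nav", "footer", "aside"])]
print(formats)
```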

HTML

The page's article content as cleaned HTML, without the surrounding page chrome.

{ "kind": "html" }

Use HTML when you need the structural markup, want to do further DOM processing, or are passing the content to a system that can handle HTML natively.
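As one illustration of further processing, the sketch below collects link targets from a returned HTML string using Python's standard html.parser module. The sample string is made up for the example, and LinkCollector and extract_links are hypothetical names; real output from the API will look different.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values from anchor tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def extract_links(html_content):
    """Return the href of every <a> element, in document order."""
    parser = LinkCollector()
    parser.feed(html_content)
    parser.close()
    return parser.links


# Illustrative input only, not real API output.
sample = '<article><h1>Title</h1><p>See <a href="https://example.com/a">A</a> and <a href="/b">B</a>.</p></article>'
print(extract_links(sample))
```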

JSON

The page content as a structured JSON representation.

{ "kind": "json" }

Use JSON when you need machine-readable structure rather than a document.

Binary

The raw page bytes, base64-encoded in the HTTP response. The SDKs decode the value for you.

{ "kind": "binary" }

Use binary when you need the original content before any parsing or conversion.
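If you call the HTTP API directly instead of using an SDK, you need to decode the base64 value yourself. The sketch below assumes the binary result carries its payload in the same "content" field shown for other formats on this page; that field name is an assumption for the binary case, and decode_binary_result is a hypothetical helper.

```python
import base64


def decode_binary_result(result):
    """Decode the base64 payload of a binary result into raw bytes.

    Assumption: the payload lives under "content", as for other formats.
    """
    if result.get("kind") != "binary":
        raise ValueError("expected a result with kind 'binary'")
    return base64.b64decode(result["content"])


# Illustrative result built locally, not real API output.
sample = {
    "kind": "binary",
    "content": base64.b64encode(b"<html></html>").decode("ascii"),
}
raw = decode_binary_result(sample)
print(raw)
```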

Requesting multiple formats

You can request more than one format in a single call. The response includes one result per format:

{
  "url": "https://example.com",
  "results": [
    { "kind": "markdown", "content": "..." },
    { "kind": "html", "content": "..." }
  ]
}

A single call that requests two formats is still billed as one unit at the base price, since the page is only fetched once.
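When a response carries several results, it is often convenient to index them by kind. The following sketch parses a response with the shape shown above; the content values are placeholders, and results_by_kind is a hypothetical helper rather than part of any SDK.

```python
import json

# Same shape as the example response above, with placeholder content.
response_text = """
{
  "url": "https://example.com",
  "results": [
    { "kind": "markdown", "content": "# Example" },
    { "kind": "html", "content": "<h1>Example</h1>" }
  ]
}
"""


def results_by_kind(response):
    """Map each result's kind to its content."""
    return {result["kind"]: result["content"] for result in response["results"]}


by_kind = results_by_kind(json.loads(response_text))
print(by_kind["markdown"])
```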
