
API Reference
Everything you need to fetch, screenshot, and crawl the web. Base URL: https://api.denkbot.dog
Authentication
All endpoints (except public GET /screenshot) require a Bearer token. Get your API key from the dashboard.
Authorization: Bearer dk_live_xxxxxxxxxxxxxxxxxxxx
Keep your API key secret. If it leaks, regenerate it from the dashboard immediately. No judgment, we've all committed secrets to git.
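On the client side, authentication is just two headers. A minimal sketch (auth_headers is a hypothetical helper, and the key below is a placeholder):

```python
def auth_headers(api_key: str) -> dict:
    """Build the headers every authenticated denkbot endpoint expects."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }

headers = auth_headers("dk_live_xxxxxxxxxxxxxxxxxxxx")
```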
POST /scrape
Fetches the content of any URL. Returns HTML, extracted text, title, metadata, links, and more. Optionally renders JavaScript with Playwright before returning. Responses are cached for 15 minutes per URL + options combination.

Request Body
| field | type | default | description |
|---|---|---|---|
| url* | string | — | The URL to fetch. Must be a valid HTTP/HTTPS URL. Private IPs and localhost are blocked. |
| js | boolean | true | Render the page with Playwright (Chromium). Set to false for faster static-only fetching. |
| format | string | "parsed" | Response format. "parsed" returns structured data, "raw" returns HTML only, "both" returns both. |
| waitUntil | string | "load" | Page lifecycle event to wait for: "load", "domcontentloaded", or "networkidle". |
| no_cache | boolean | false | Skip the 15-minute response cache and fetch fresh. |
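The request body is plain JSON with only the fields you want to override. One way to assemble it client-side, dropping unset options and catching obviously invalid URLs before they cost you a 400 (build_scrape_payload is a hypothetical helper, not part of any SDK):

```python
import json
from urllib.parse import urlparse

def build_scrape_payload(url, js=None, format=None, waitUntil=None, no_cache=None):
    """Assemble a /scrape request body, omitting any field left at None
    so the API's documented defaults apply."""
    if urlparse(url).scheme not in ("http", "https"):
        raise ValueError("url must be an http(s) URL")
    options = {"js": js, "format": format, "waitUntil": waitUntil, "no_cache": no_cache}
    payload = {"url": url}
    payload.update({k: v for k, v in options.items() if v is not None})
    return json.dumps(payload)
```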
Example Request
curl -X POST https://api.denkbot.dog/scrape \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"url": "https://example.com",
"js": false,
"format": "parsed"
}'
Response
{
"url": "https://example.com",
"format": "parsed",
"data": {
"title": "Example Domain",
"description": "This domain is for use in illustrative examples.",
"text": "Example Domain This domain is for use in...",
"headings": ["Example Domain"],
"links": [
{ "href": "https://www.iana.org/domains/reserved", "text": "More information..." }
],
"images": [],
"meta": {
"description": "This domain is for use in illustrative examples."
}
},
"duration_ms": 312,
"cached": false
}
POST /sitemap
Discovers and returns all URLs found in the sitemap(s) of a given domain. Follows sitemap index files recursively. Returns a flat list of URLs with their lastmod, changefreq, and priority if available.
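Because the response is a flat list, post-processing is straightforward. A sketch that keeps only entries modified since a given date, assuming the response shape shown in the example below (ISO dates compare correctly as strings; entries without a lastmod are kept, since their age is unknown):

```python
def urls_modified_since(entries, cutoff):
    """Return the loc of every sitemap entry whose lastmod is on or after
    cutoff (an ISO date string like "2025-01-01")."""
    return [e["loc"] for e in entries if e.get("lastmod", "9999") >= cutoff]
```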

Request Body
| field | type | default | description |
|---|---|---|---|
| url* | string | — | The domain URL. Can be a domain (https://example.com) or a direct sitemap URL (https://example.com/sitemap.xml). |
| limit | number | 500 | Maximum number of URLs to return. Max 5000. |
Example Request
curl -X POST https://api.denkbot.dog/sitemap \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{ "url": "https://example.com" }'Response
{
"sitemap_url": "https://example.com/sitemap.xml",
"total": 42,
"urls": [
{
"loc": "https://example.com/",
"lastmod": "2025-01-15",
"changefreq": "weekly",
"priority": "1.0"
},
{
"loc": "https://example.com/about",
"lastmod": "2024-11-01",
"changefreq": "monthly",
"priority": "0.8"
}
],
"cached": false,
"duration_ms": 890
}
POST /crawl
Crawls a website starting from the given URL and returns a nested tree of all discovered internal links. Respects robots.txt. Stays within the same domain by default. Up to 500 pages per crawl.
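The nested tree can be flattened into a plain list of URLs with a short recursive walk, assuming the node shape from the response example below (a url plus a children array):

```python
def flatten_tree(node):
    """Depth-first list of every URL in a /crawl response tree."""
    urls = [node["url"]]
    for child in node.get("children", []):
        urls.extend(flatten_tree(child))
    return urls
```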

Request Body
| field | type | default | description |
|---|---|---|---|
| url* | string | — | The starting URL for the crawl. |
| limit | number | 100 | Maximum number of pages to crawl. Upper limit: 500. |
| depth | number | 3 | Maximum link depth from the starting URL. Upper limit: 5. |
| js | boolean | false | Use Playwright to render each page before extracting links. |
| no_cache | boolean | false | Skip the cache and crawl fresh. |
Example Request
curl -X POST https://api.denkbot.dog/crawl \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"url": "https://example.com",
"limit": 100,
"depth": 4
}'
Response
{
"root": "https://example.com",
"total_found": 3,
"tree": {
"url": "https://example.com",
"children": [
{
"url": "https://example.com/about",
"children": []
},
{
"url": "https://example.com/blog",
"children": [
{
"url": "https://example.com/blog/hello-world",
"children": []
}
]
}
]
}
}
POST /screenshot (requires auth)
Takes a screenshot of the given URL and returns a PNG binary directly. No base64 encoding. Content-Type: image/png. Save it directly to a file. Uses Playwright Chromium with a 1280×800 viewport by default.
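Because the body is raw bytes, it's worth checking the PNG signature before writing to disk; an error response would be JSON, not an image. A sketch (save_png is a hypothetical helper, and no request is made here):

```python
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"  # the 8-byte signature every PNG starts with

def save_png(body: bytes, path: str) -> None:
    """Write screenshot bytes to disk, refusing anything that isn't a PNG."""
    if not body.startswith(PNG_MAGIC):
        # Error responses are JSON; surface them instead of saving garbage.
        raise ValueError(f"not a PNG, got: {body[:80]!r}")
    with open(path, "wb") as f:
        f.write(body)
```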

Request Body
| field | type | default | description |
|---|---|---|---|
| url* | string | — | The URL to screenshot. |
| wait_until | string | "load" | Page lifecycle event to wait for before capturing: "load", "domcontentloaded", or "networkidle". |
| no_cache | boolean | false | Skip the cache and take a fresh screenshot. |
Example Request
# Returns raw PNG bytes
curl -X POST https://api.denkbot.dog/screenshot \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{ "url": "https://example.com" }' \
--output screenshot.png
GET /screenshot (public)
Public screenshot endpoint. Pass the URL as a query parameter. No auth required. Useful for embedding live previews in HTML <img> tags. Rate limited to 20 requests per minute per IP.
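The url parameter must be URL-encoded, or anything after a ? or & in the target URL gets swallowed by the outer query string. Python's stdlib handles this safely; a sketch that builds the full request URL from the parameters in the table below (public_screenshot_url is a hypothetical helper):

```python
from urllib.parse import urlencode

BASE = "https://api.denkbot.dog/screenshot"

def public_screenshot_url(target: str, width: int = 1280, height: int = 800,
                          full_page: bool = False) -> str:
    """Build a public /screenshot URL with the target properly percent-encoded."""
    params = {"url": target, "width": width, "height": height}
    if full_page:
        params["fullPage"] = "true"
    return f"{BASE}?{urlencode(params)}"
```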
| param | type | default | description |
|---|---|---|---|
| url* | string | — | URL-encoded target URL. |
| width | number | 1280 | Viewport width in pixels. |
| height | number | 800 | Viewport height in pixels. |
| fullPage | boolean | false | Capture the full scrollable page instead of just the viewport. |
# Use directly in an <img> tag
<img src="https://api.denkbot.dog/screenshot?url=https%3A%2F%2Fexample.com" />

# Or fetch it
curl "https://api.denkbot.dog/screenshot?url=https://example.com" --output shot.png
Error Codes
All errors return JSON with a message field. HTTP status codes follow convention.
| Status | Description |
|---|---|
| 400 | Bad Request: Missing required fields or invalid URL. Check your request body. |
| 401 | Unauthorized: Missing or invalid API key. Check your Authorization header. |
| 403 | Forbidden: SSRF protection triggered. You tried to access a private IP or localhost. Sneaky. |
| 404 | Not Found: The endpoint you're hitting doesn't exist. Check the URL. |
| 429 | Rate Limited: Slow down. You've hit the rate limit. Wait a minute and try again. |
| 500 | Internal Server Error: Something broke on our end. Very rare. We're on it. |
| 502 | Bad Gateway: The target site returned an error or timed out. Not our fault. |
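In client code these statuses map naturally onto exceptions. A sketch (DenkbotError and raise_for_status are hypothetical; the API itself just returns the status code and a JSON body with a message field):

```python
class DenkbotError(Exception):
    def __init__(self, status: int, message: str):
        super().__init__(f"{status}: {message}")
        self.status = status
        # 429/500/502 are transient; 4xx client errors are not worth retrying.
        self.retryable = status in (429, 500, 502)

def raise_for_status(status: int, body: dict) -> None:
    """Turn a non-2xx response into an exception carrying the API's message."""
    if status >= 400:
        raise DenkbotError(status, body.get("message", "unknown error"))
```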
Rate Limits
Rate limits are per API key. Cached responses don't count against your limit, so repeat requests for the same URL + options within the 15-minute cache window are effectively free. Save no_cache for when you genuinely need fresh data.
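If you do hit a 429, backing off briefly is all that's needed. A sketch of a retry wrapper with exponential backoff (fetch and sleep are injected so the logic runs without a network; neither is part of the API):

```python
import time

def with_backoff(fetch, retries: int = 3, base_delay: float = 1.0, sleep=time.sleep):
    """Call fetch() until it returns a non-429 status or retries run out.
    fetch must return a (status, body) tuple."""
    for attempt in range(retries + 1):
        status, body = fetch()
        if status != 429:
            return status, body
        if attempt < retries:
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
    return status, body
```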

