Crawling & Sitemaps
Crawl entire websites, extract sitemaps, map site structure, and discover all internal links.

8 guides in this category

Crawl Entire Websites via API
Crawl any website and get a full nested URL tree. Up to 500 pages, respects robots.txt, stays on-domain by default.
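The "stays on-domain" rule can be sketched in a few lines. This is an illustrative host-equality check using stdlib `urllib.parse`, not denkbot.dog's actual matching logic (which may, for example, treat subdomains differently):

```python
from urllib.parse import urlparse

def is_on_domain(seed: str, link: str) -> bool:
    """Keep a link only if its host exactly matches the seed's host."""
    return urlparse(link).netloc == urlparse(seed).netloc

seed = "https://example.com/"
links = [
    "https://example.com/about",
    "https://cdn.example.com/logo.png",  # subdomain: rejected by exact match
    "https://other.org/",
]
on_domain = [u for u in links if is_on_domain(seed, u)]
# on_domain == ["https://example.com/about"]
```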
/website-crawler-api

Extract Sitemaps from Any Website
Automatically find and parse XML sitemaps. Returns all URLs with lastmod, changefreq, and priority. No XML parsing required.
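Under the hood, reading a sitemap is namespace-aware XML traversal. A minimal stdlib sketch of that parsing step (the sample XML and field names follow the sitemaps.org schema; the API's own response shape is not shown here):

```python
import xml.etree.ElementTree as ET

SITEMAP = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2024-01-15</lastmod>
    <changefreq>daily</changefreq>
    <priority>1.0</priority>
  </url>
</urlset>"""

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def parse_sitemap(xml_text: str) -> list[dict]:
    """Return one dict per <url> entry with the standard sitemap fields."""
    root = ET.fromstring(xml_text)
    return [
        {
            "loc": url.findtext("sm:loc", namespaces=NS),
            "lastmod": url.findtext("sm:lastmod", namespaces=NS),
            "changefreq": url.findtext("sm:changefreq", namespaces=NS),
            "priority": url.findtext("sm:priority", namespaces=NS),
        }
        for url in root.findall("sm:url", NS)
    ]

entries = parse_sitemap(SITEMAP)
# entries[0]["loc"] == "https://example.com/"
```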
/sitemap-extractor-api

Extract All Internal Links from a Website
Extract all internal links from any webpage or crawl an entire site's link structure. Returns hrefs, anchor text, and more.
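Collecting hrefs together with their anchor text is the core of this kind of extraction. A simplified stdlib sketch with `html.parser` (ignores nested `<a>` tags and attributes beyond `href`; the actual endpoint returns more fields):

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collect (href, anchor text) pairs from <a> tags."""
    def __init__(self):
        super().__init__()
        self.links = []    # finished (href, text) pairs
        self._href = None  # href of the currently open <a>, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None

parser = LinkExtractor()
parser.feed('<p><a href="/docs">Read the <b>docs</b></a> and <a href="/blog">blog</a></p>')
# parser.links == [("/docs", "Read the docs"), ("/blog", "blog")]
```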
/internal-links-extractor

Get the Full URL Tree of Any Website
Crawl any website and get a hierarchical tree of all pages. Perfect for site maps, content inventories, and architecture analysis.
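A hierarchical tree falls out of nesting crawled URLs by path segment. A small illustrative sketch (nested dicts as the tree type; the API's actual tree format may differ):

```python
from urllib.parse import urlparse

def build_tree(urls: list[str]) -> dict:
    """Nest URLs by path segment: each node maps segment -> child nodes."""
    tree: dict = {}
    for url in urls:
        node = tree
        for segment in urlparse(url).path.strip("/").split("/"):
            if segment:  # skip empty segments from "/" or trailing slashes
                node = node.setdefault(segment, {})
    return tree

tree = build_tree([
    "https://example.com/blog/post-1",
    "https://example.com/blog/post-2",
    "https://example.com/about",
])
# tree == {"blog": {"post-1": {}, "post-2": {}}, "about": {}}
```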
/website-tree-structure-api

Parse XML Sitemaps via API
Parse XML sitemaps and sitemap index files automatically. Returns structured JSON with all URLs, lastmod, changefreq, and priority.
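Handling index files means telling a `<sitemapindex>` apart from a plain `<urlset>` before recursing into each child `<loc>`. A sketch of that detection step (the recursion itself is omitted; this mirrors the sitemaps.org schema, not the API's internals):

```python
import xml.etree.ElementTree as ET

SM_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def sitemap_kind(xml_text: str) -> str:
    """Return 'index' for a sitemap index file, 'urlset' for a plain sitemap."""
    root = ET.fromstring(xml_text)
    tag = root.tag.removeprefix("{%s}" % SM_NS)  # strip the XML namespace
    return "index" if tag == "sitemapindex" else "urlset"

index_xml = (
    '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<sitemap><loc>https://example.com/sitemap-posts.xml</loc></sitemap>"
    "</sitemapindex>"
)
kind = sitemap_kind(index_xml)  # "index": fetch and parse each <loc> next
```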
/sitemap-parser-api

Fetch and Parse robots.txt via API
Fetch the robots.txt of any domain and understand crawling rules. denkbot.dog respects robots.txt by default.
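For a feel of what "understanding crawling rules" means, Python's stdlib `urllib.robotparser` can evaluate a robots.txt offline (the sample rules and the "denkbot" user agent string here are illustrative):

```python
from urllib.robotparser import RobotFileParser

ROBOTS = """\
User-agent: *
Disallow: /admin/
Crawl-delay: 2
"""

rp = RobotFileParser()
rp.parse(ROBOTS.splitlines())

allowed_admin = rp.can_fetch("denkbot", "https://example.com/admin/users")  # False
allowed_blog = rp.can_fetch("denkbot", "https://example.com/blog/")         # True
delay = rp.crawl_delay("denkbot")                                           # 2
```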
/robots-txt-reader

Check All Links on a Website
Crawl a website and extract all links, then verify which ones are broken. Combine /crawl and /scrape to build a link health checker.
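The verification half of that pipeline reduces to classifying each crawled URL by the HTTP status its scrape returned. A sketch of just that step, on illustrative data (the `/crawl` and `/scrape` calls themselves are the guide's subject and are not shown):

```python
def classify_links(statuses: dict[str, int]) -> dict[str, list[str]]:
    """Split crawled URLs into healthy and broken by HTTP status code."""
    report: dict[str, list[str]] = {"ok": [], "broken": []}
    for url, status in statuses.items():
        # 4xx/5xx count as broken; redirects (3xx) are treated as healthy here
        report["broken" if status >= 400 else "ok"].append(url)
    return report

report = classify_links({
    "https://example.com/": 200,
    "https://example.com/old-page": 404,
    "https://example.com/contact": 301,
})
# report["broken"] == ["https://example.com/old-page"]
```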
/website-link-checker

Scrape Multiple URLs in Batch
Process multiple URLs in parallel with denkbot.dog. No batch endpoint needed; just concurrent HTTP requests with async/await.
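The concurrent-requests pattern can be sketched with `asyncio`. The `fetch` coroutine below is a placeholder standing in for a real request (e.g. via `aiohttp` or `httpx` against a scrape endpoint); the semaphore caps how many requests are in flight at once:

```python
import asyncio

async def fetch(url: str) -> str:
    # Placeholder for a real HTTP request; sleeps to simulate network I/O.
    await asyncio.sleep(0.01)
    return f"scraped:{url}"

async def scrape_batch(urls: list[str], limit: int = 5) -> list[str]:
    sem = asyncio.Semaphore(limit)  # cap in-flight requests

    async def bounded(url: str) -> str:
        async with sem:
            return await fetch(url)

    # gather preserves input order regardless of completion order
    return await asyncio.gather(*(bounded(u) for u in urls))

results = asyncio.run(scrape_batch([f"https://example.com/p{i}" for i in range(3)]))
# results == ["scraped:https://example.com/p0", ..., "scraped:https://example.com/p2"]
```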
/batch-url-scraper