robots.txt is the polite note websites leave for bots. denkbot.dog reads it. When you use the /crawl endpoint, robots.txt rules are respected by default. And if you just need to read another site's robots.txt, /scrape handles that too. The dog is polite.
Understanding crawling policies before scraping, checking which URLs are blocked, SEO analysis of robots.txt rules, and respectful automated crawling.
# Fetch and read a robots.txt
curl -X POST https://api.denkbot.dog/scrape \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{ "url": "https://example.com/robots.txt", "format": "json" }' \
| jq '.text'

Yes, robots.txt is respected by default on the /crawl endpoint. You can disable this with respectRobotsTxt: false.
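For cases where you need to bypass robots.txt checks (e.g. crawling your own site), here is a sketch of a /crawl request with the flag disabled. The respectRobotsTxt flag is documented above; the exact request-body shape is assumed to mirror the /scrape example.

```shell
# Hypothetical /crawl request body with robots.txt checks disabled.
# respectRobotsTxt comes from the docs above; other fields are assumptions.
BODY='{ "url": "https://example.com", "respectRobotsTxt": false }'

# Validate and pretty-print the payload before sending:
echo "$BODY" | jq .

# Send it (requires a real API key):
# curl -X POST https://api.denkbot.dog/crawl \
#   -H "Authorization: Bearer YOUR_API_KEY" \
#   -H "Content-Type: application/json" \
#   -d "$BODY"
```

Only disable this for sites you own or have permission to crawl; the default keeps the dog polite.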
Fetch it with /scrape — the text field will contain the plain text of the file.
denkbot.dog identifies itself with its own user agent when respecting robots.txt.

€19/year. Unlimited requests. API key ready in 30 seconds.