Your model needs data. The web has data. Getting from A to B used to require serious infrastructure. denkbot.dog is the bridge. Scrape at scale, collect clean text, build your dataset. The dog fetches training data. You train the model.
Collecting fine-tuning datasets, building domain-specific corpora, creating benchmark datasets from live web content, and automating knowledge-base construction.
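Under the hood, each call is just an authenticated POST. A minimal sketch of how one request is assembled, mirroring the bearer-token headers and `{'url', 'format'}` payload of the batch example on this page (whether the API accepts `format` values other than `'json'` is not documented here):

```python
API_ENDPOINT = 'https://api.denkbot.dog/scrape'

def build_scrape_request(url, api_key, fmt='json'):
    # Bearer-token auth plus a JSON body, as in the batch example
    # elsewhere on this page.
    headers = {
        'Authorization': f'Bearer {api_key}',
        'Content-Type': 'application/json',
    }
    body = {'url': url, 'format': fmt}
    return API_ENDPOINT, headers, body
```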
import asyncio
import aiohttp

async def scrape_batch(urls, api_key):
    headers = {
        'Authorization': f'Bearer {api_key}',
        'Content-Type': 'application/json',
    }
    async with aiohttp.ClientSession(headers=headers) as session:
        # Fire all requests concurrently; exceptions come back as values
        # instead of cancelling the whole batch.
        tasks = [
            session.post(
                'https://api.denkbot.dog/scrape',
                json={'url': url, 'format': 'json'},
            )
            for url in urls
        ]
        responses = await asyncio.gather(*tasks, return_exceptions=True)
        return [await r.json() for r in responses if not isinstance(r, Exception)]

Don't scrape anything you wouldn't be legally allowed to scrape. We don't verify use cases, but we do block SSRF and rate-limit abuse.
Free tier: 100 req/day. Pro tier with higher limits coming soon.
Check the ToS of the sites you're scraping. That's between you and them.
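To stay inside the request limits (and clear of the abuse rate limiting mentioned above), cap client-side concurrency instead of firing an entire URL list at once. A sketch using an `asyncio.Semaphore`; the limit of 5 is an arbitrary assumption, not a documented API ceiling:

```python
import asyncio

async def gather_limited(coros, limit=5):
    # At most `limit` coroutines run at a time; the rest queue on the
    # semaphore. Exceptions are returned as values, matching the batch
    # example's return_exceptions behavior.
    sem = asyncio.Semaphore(limit)

    async def run(coro):
        async with sem:
            return await coro

    return await asyncio.gather(*(run(c) for c in coros), return_exceptions=True)
```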

€19/year. Unlimited requests. API key ready in 30 seconds.