🤖 AI & LLM Integrations

Collect Web Data for AI Training

Your model needs data. The web has data. Getting from A to B used to require serious infrastructure. denkbot.dog is the bridge: scrape at scale, collect clean text, build your dataset. The dog fetches the training data. You train the model.

What you'd use this for

Collecting fine-tuning datasets, building domain-specific corpora, creating benchmark datasets from live web content, and automated knowledge base construction.
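For the fine-tuning use case, the scraped results eventually need to land in a format your training pipeline accepts, typically JSONL with one example per line. A minimal sketch, assuming each scraped record is a dict with `url` and `text` keys (the actual /scrape response shape may differ):

```python
import json

def to_finetune_jsonl(records, out_path):
    """Write scraped records as one JSON object per line (JSONL).

    Assumes each record is a dict with 'url' and 'text' keys --
    adjust the field names to match the real /scrape response.
    """
    kept = 0
    with open(out_path, 'w', encoding='utf-8') as f:
        for rec in records:
            text = (rec.get('text') or '').strip()
            if not text:
                continue  # skip pages that yielded no usable text
            f.write(json.dumps({'source': rec['url'], 'text': text}) + '\n')
            kept += 1
    return kept  # number of examples written
```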

How it works

example
import asyncio
import aiohttp

async def scrape_batch(urls, api_key):
    headers = {
        'Authorization': f'Bearer {api_key}',
        'Content-Type': 'application/json',
    }
    async with aiohttp.ClientSession(headers=headers) as session:
        # Fire one POST per URL and gather them concurrently;
        # return_exceptions=True keeps one failed request from
        # killing the whole batch.
        tasks = [
            session.post(
                'https://api.denkbot.dog/scrape',
                json={'url': url, 'format': 'json'},
            )
            for url in urls
        ]
        responses = await asyncio.gather(*tasks, return_exceptions=True)
        # Read the bodies while the session is still open -- once the
        # session closes, its connections are released and the reads fail.
        return [
            await r.json()
            for r in responses
            if not isinstance(r, Exception)
        ]
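Scraped corpora tend to contain duplicate and near-duplicate pages, which hurt training. A minimal exact-dedup pass, hashing whitespace-normalized text (a sketch only; production pipelines often add near-duplicate detection such as MinHash on top):

```python
import hashlib

def dedupe_texts(texts):
    """Drop exact duplicates by hashing whitespace-normalized text.

    First occurrence wins; original order is preserved.
    """
    seen = set()
    unique = []
    for text in texts:
        normalized = ' '.join(text.split())
        key = hashlib.sha256(normalized.encode('utf-8')).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(text)
    return unique
```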

Questions & Answers

Are there any content restrictions?

Don't scrape anything you wouldn't be legally allowed to scrape. We don't verify use cases, but we do block SSRF and rate-limit abuse.

What's the rate limit for bulk collection?

Free tier: 100 requests/day. A Pro tier with higher limits is coming soon.
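If your URL list is bigger than the free-tier quota, you can chunk it into one batch per day's allowance. A hypothetical helper (the `quota` value mirrors the 100 req/day limit above):

```python
def daily_batches(urls, quota=100):
    """Split a URL list into chunks of at most `quota` items,
    one chunk per day's allowance on the free tier."""
    return [urls[i:i + quota] for i in range(0, len(urls), quota)]
```

Each chunk can then be fed to `scrape_batch` on consecutive days.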

Can I use this for commercial model training?

Check the ToS of the sites you're scraping. That's between you and them.

Ready to start fetching?

€19/year. Unlimited requests. API key ready in 30 seconds.