New: Per-job VM isolation is here

All your spiders.
One platform.

Deploy, schedule, and monitor Scrapy spiders at scale. Self-hosted orchestration with per-job VMs, cron scheduling, and real-time monitoring.

Python / Scrapy / Docker / Hetzner / Gluetun

Built with battle-tested technologies

Scrapy Python Docker Hetzner Redis PostgreSQL BullMQ SvelteKit TypeScript Tailwind Gluetun Node.js Supabase Playwright

Deploy your first spider. Right now.

terminal

$ git push origin main

Spider deployed. Job queued. VM provisioning...

INFO Scrape complete: 8,431 items in 13m

Setup instructions

From development to production.

Replace your hodgepodge of cron jobs, custom scripts, and manual deployments with a single platform built for Scrapy.

Deploy with Git Push

Connect your repository and deploy spiders with a simple git push. Automatic builds and versioning.

Cron Scheduling

Schedule spiders with cron expressions. Timezone-aware with automatic retries on failure.
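As a sketch of how an expression like `0 */6 * * *` is interpreted (minute 0 of every sixth hour), a minimal matcher for the minute and hour fields might look like this; the function names are illustrative, not part of the platform:

```python
def field_matches(field: str, value: int) -> bool:
    """Match one cron field ('*', '*/n', or a literal number) against a value."""
    if field == "*":
        return True
    if field.startswith("*/"):
        return value % int(field[2:]) == 0
    return int(field) == value

def cron_fires(expr: str, minute: int, hour: int) -> bool:
    """Check the minute and hour fields of a five-field cron expression."""
    minute_field, hour_field = expr.split()[:2]
    return field_matches(minute_field, minute) and field_matches(hour_field, hour)

# "0 */6 * * *" fires at 00:00, 06:00, 12:00, and 18:00
```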

Per-Job VMs

Every spider job runs on its own ephemeral Hetzner VM. Complete isolation, zero cross-contamination.

Real-time Monitoring

Watch items flow in real-time. Track requests, response codes, and error rates as they happen.

Proxy Rotation

All traffic routed through Gluetun proxy containers. Automatic rotation and failover built in.

You may also need to...

Webhook callbacks
Custom pipelines
Item export
Request throttling
Auto-retry
Spider versioning
Multi-format output
Log aggregation

How does it work?

Three layers working together: Deployment handles your code. Orchestration manages the lifecycle. Isolation protects every run.

Deployment

Deploy, orchestrate, and monitor your spiders.

Push your spider code to Git. The platform builds, stores, and deploys it to ephemeral VMs automatically. Every phase of the job lifecycle is orchestrated and observable.

spider.yml
# Spider configuration
spider:
  name: product_spider
  repo: github.com/acme/scrapers
  settings:
    concurrent_requests: 16
    download_delay: 0.5

schedule:
  cron: "0 */6 * * *"
  timezone: "America/New_York"

worker:
  type: cx21
  region: eu-central
  auto_destroy: true

Take action at any phase in the job lifecycle

1. Queue: job enters the queue
2. Provision: VM spins up on Hetzner
3. Run: spider executes in Docker
4. Collect: items streamed back
5. Cleanup: VM self-destructs

Triggers and hooks: Git deploy, cron triggers, webhook callbacks, item pipelines.
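The five phases above can be sketched as a simple orchestration loop. Every name here (`provision`, `run`, and so on) is a hypothetical stand-in for the platform's internals, not its actual API; the point is only that cleanup runs even when the spider fails:

```python
from dataclasses import dataclass, field

@dataclass
class Job:
    spider: str
    phases: list = field(default_factory=list)

def queue(job):
    job.phases.append("queued")        # job enters the queue

def provision(job):
    job.phases.append("provisioned")   # VM spins up on Hetzner

def run(job):
    job.phases.append("ran")           # spider executes in Docker

def collect(job):
    job.phases.append("collected")     # items streamed back

def cleanup(job):
    job.phases.append("cleaned")       # VM self-destructs

def execute(job):
    """Walk one job through the lifecycle; cleanup always runs."""
    queue(job)
    provision(job)
    try:
        run(job)
        collect(job)
    finally:
        cleanup(job)
    return job.phases
```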

Stop managing infrastructure manually

Auto-provisioning

VMs spin up on demand when jobs are queued. No manual server management.

Smart scheduling

Cron-based job triggers with timezone support and automatic retries.

Auto-cleanup

VMs self-destruct after job completion. Zero lingering infrastructure costs.

Multi-format export

Export scraped data to JSON, CSV, or stream directly to your database.
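A minimal sketch of what multi-format export amounts to, writing the same scraped items as JSON or CSV using only the standard library; the item shape is an assumption for illustration:

```python
import csv
import io
import json

def export_items(items: list[dict], fmt: str = "json") -> str:
    """Serialize a list of scraped item dicts to JSON or CSV text."""
    if fmt == "json":
        return json.dumps(items, indent=2)
    if fmt == "csv":
        buf = io.StringIO()
        writer = csv.DictWriter(buf, fieldnames=items[0].keys())
        writer.writeheader()
        writer.writerows(items)
        return buf.getvalue()
    raise ValueError(f"unsupported format: {fmt}")
```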

Isolation

Run every spider in complete isolation.

Three layers of protection. Gluetun proxy containers route all traffic. Docker network-mode isolation prevents escape. Every job runs on its own VM.

Per-job VMs

Every spider gets its own ephemeral Hetzner VM. No shared state, no cross-contamination between runs.

Gluetun VPN

All spider traffic routed through Gluetun proxy containers. IP rotation and DNS leak protection built in.

Network isolation

Docker network_mode prevents container escape. Spiders can only communicate through defined channels.

Auto-cleanup

VMs self-destruct after job completion using the Hetzner metadata API. Zero infrastructure drift.
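A self-destruct along these lines could read the server's own ID from the Hetzner metadata service and then issue a delete through the Hetzner Cloud API. Treat this as a sketch: the two endpoints come from Hetzner's public documentation, but token handling and error handling are elided, and this is not necessarily how the platform implements it:

```python
import urllib.request

METADATA_URL = "http://169.254.169.254/hetzner/v1/metadata/instance-id"
API_BASE = "https://api.hetzner.cloud/v1"

def own_instance_id() -> str:
    """Fetch this server's ID; only reachable from inside the VM itself."""
    with urllib.request.urlopen(METADATA_URL, timeout=5) as resp:
        return resp.read().decode().strip()

def delete_request(server_id: str, token: str) -> urllib.request.Request:
    """Build (but do not send) the DELETE call that destroys the server."""
    return urllib.request.Request(
        f"{API_BASE}/servers/{server_id}",
        method="DELETE",
        headers={"Authorization": f"Bearer {token}"},
    )

# After the job's items are collected:
# urllib.request.urlopen(delete_request(own_instance_id(), HCLOUD_TOKEN))
```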

Developer Experience

Developer experience matters.

Every tool you need to build, debug, and operate spiders at scale. All accessible from the dashboard or the API.

Live logs

Stream spider output in real-time. Filter by level with full text search.

Item browser

Inspect scraped data with instant search and filtering. JSON syntax highlighting.

Request inspector

Debug HTTP requests and responses. Filter by status code, domain, and timing.

Job history

Track every run with stats. Compare performance across runs with visual timelines.

Spider config

YAML-based declarative configuration. Define schedules, settings, and pipelines.

API-first

Full REST API for automation. Create jobs, query items, manage spiders from code.

Join developers deploying spiders at scale.

"We moved from Scrapy Cloud to this platform and cut our infrastructure costs by 70%. The Hetzner auto-scaling is incredible."

Sarah Chen

Head of Data Engineering, DataFlow Inc.

See it in action.

spider.yml
# Complete spider configuration
spider:
  name: product_spider
  repo: github.com/acme/scrapers
  settings:
    concurrent_requests: 16
    download_delay: 0.5
    timeout_seconds: 300

schedule:
  cron: "0 */6 * * *"
  timezone: "America/New_York"

proxy:
  provider: gluetun
  vpn_type: wireguard
  rotate: true

pipeline:
  webhook: https://api.example.com/items
  export: json

worker:
  type: cx21
  auto_destroy: true

One config file. Everything handled.

Define your spider, schedule, proxy, pipeline, and infrastructure in a single YAML file. The platform handles the rest.

  • Proxy setup via Gluetun containers
  • Item pipeline with webhook callbacks
  • Cron scheduling with timezone support
  • Auto-cleanup after job completion
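On the receiving end of `pipeline.webhook`, a handler only needs to validate and unpack each delivery. The payload shape below is an assumption for illustration; the platform's actual delivery format may differ:

```python
import json

def handle_item_webhook(raw_body: bytes) -> dict:
    """Validate and unpack one webhook delivery. Payload shape is assumed."""
    payload = json.loads(raw_body)
    for key in ("spider", "job_id", "items"):
        if key not in payload:
            raise ValueError(f"missing field: {key}")
    return payload

# Example of a delivery the platform might POST (hypothetical shape):
example_delivery = json.dumps({
    "spider": "product_spider",
    "job_id": "job_123",
    "items": [{"name": "widget", "price": 9.99}],
}).encode()
```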

You read the whole page.
What are you waiting for?