All your spiders.
One platform.
Deploy, schedule, and monitor Scrapy spiders at scale. Self-hosted orchestration with per-job VMs, cron scheduling, and real-time monitoring.
Built with battle-tested technologies
Deploy your first spider. Right now.
$ git push origin main
Spider deployed. Job queued. VM provisioning...
INFO Scraping started (8,431 items in 13m)
From development to production.
Replace your hodgepodge of cron jobs, custom scripts, and manual deployments with a single platform built for Scrapy.
Deploy with Git Push
Connect your repository and deploy spiders with a simple git push. Automatic builds and versioning.
Cron Scheduling
Schedule spiders with cron expressions. Timezone-aware with automatic retries on failure.
Per-Job VMs
Every spider job runs on its own ephemeral Hetzner VM. Complete isolation, zero cross-contamination.
Real-time Monitoring
Watch items flow in real time. Track requests, response codes, and error rates as they happen.
Proxy Rotation
All traffic routed through Gluetun proxy containers. Automatic rotation and failover built in.
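The schedules above use standard five-field cron syntax (minute, hour, day-of-month, month, day-of-week). As a rough illustration of how an expression like "0 */6 * * *" (every six hours, on the hour) is interpreted, here is a small sketch of our own, not the platform's actual scheduler:

```python
# Expand the hour field of a cron expression such as "0 */6 * * *".
# Illustrative sketch only -- not the platform's scheduler.

def expand_step_field(field: str, upper: int) -> list[int]:
    """Expand a cron field like '*/6' into the values it matches."""
    if field == "*":
        return list(range(upper))
    if field.startswith("*/"):
        step = int(field[2:])
        return list(range(0, upper, step))
    return [int(v) for v in field.split(",")]

minute_field, hour_field = "0 */6 * * *".split()[:2]
print(expand_step_field(hour_field, 24))  # -> [0, 6, 12, 18]
```

So a spider scheduled with "0 */6 * * *" fires at 00:00, 06:00, 12:00, and 18:00 in its configured timezone.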
How does it work?
Three layers working together: Deployment handles your code. Orchestration manages the lifecycle. Isolation protects every run.
Deployment
Deploy, orchestrate, and monitor your spiders.
Push your spider code to Git. The platform builds, stores, and deploys it to ephemeral VMs automatically. Every phase of the job lifecycle is orchestrated and observable.
# Spider configuration
spider:
  name: product_spider
  repo: github.com/acme/scrapers
  settings:
    concurrent_requests: 16
    download_delay: 0.5

schedule:
  cron: "0 */6 * * *"
  timezone: "America/New_York"

worker:
  type: cx21
  region: eu-central
  auto_destroy: true
Take action at any phase in the job lifecycle
Stop managing infrastructure manually
Auto-provisioning
VMs spin up on demand when jobs are queued. No manual server management.
Smart scheduling
Cron-based job triggers with timezone support and automatic retries.
Auto-cleanup
VMs self-destruct after job completion. Zero lingering infrastructure costs.
Multi-format export
Export scraped data to JSON, CSV, or stream directly to your database.
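To make the export formats concrete, here is a minimal sketch of our own (not the platform's exporter) that writes the same scraped items to both JSON and CSV using only the standard library:

```python
import csv
import io
import json

# Hypothetical scraped items, shaped like typical Scrapy output.
items = [
    {"title": "Widget A", "price": 19.99, "url": "https://example.com/a"},
    {"title": "Widget B", "price": 24.50, "url": "https://example.com/b"},
]

# JSON export: a single array of objects.
json_out = json.dumps(items, indent=2)

# CSV export: header row derived from the item keys.
csv_buf = io.StringIO()
writer = csv.DictWriter(csv_buf, fieldnames=list(items[0]))
writer.writeheader()
writer.writerows(items)
csv_out = csv_buf.getvalue()

print(csv_out.splitlines()[0])  # -> title,price,url
```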
Isolation
Run every spider in complete isolation.
Three layers of protection: Gluetun proxy containers route all traffic, Docker network isolation prevents container escape, and every job runs on its own VM.
Per-job VMs
Every spider gets its own ephemeral Hetzner VM. No shared state, no cross-contamination between runs.
Gluetun VPN
All spider traffic routed through Gluetun proxy containers. IP rotation and DNS leak protection built in.
Network isolation
Docker network_mode prevents container escape. Spiders can only communicate through defined channels.
Auto-cleanup
VMs self-destruct after job completion using the Hetzner metadata API. Zero infrastructure drift.
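The Gluetun pattern described above is commonly wired up in Docker Compose by pointing the spider container's network at the Gluetun container, so all outbound traffic has to pass through the VPN. A hedged sketch, where the service names and spider image are illustrative:

```yaml
services:
  gluetun:
    image: qmcgaw/gluetun           # VPN client container
    cap_add:
      - NET_ADMIN
    environment:
      VPN_SERVICE_PROVIDER: custom
      VPN_TYPE: wireguard

  spider:
    image: acme/product-spider      # illustrative image name
    network_mode: "service:gluetun" # all traffic exits via gluetun
    depends_on:
      - gluetun
```

With `network_mode: "service:gluetun"`, the spider container has no network stack of its own; if the VPN tunnel drops, its traffic drops with it rather than leaking out the host interface.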
Developer Experience
Developer experience matters.
Every tool you need to build, debug, and operate spiders at scale. All accessible from the dashboard or the API.
Live logs
Stream spider output in real time. Filter by level, with full-text search.
Item browser
Inspect scraped data with instant search and filtering. JSON syntax highlighting.
Request inspector
Debug HTTP requests and responses. Filter by status code, domain, and timing.
Job history
Track every run with stats. Compare performance across runs with visual timelines.
Spider config
YAML-based declarative configuration. Define schedules, settings, and pipelines.
API-first
Full REST API for automation. Create jobs, query items, manage spiders from code.
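As a sketch of what API-driven automation could look like: the base URL, endpoint path, field names, and auth header below are illustrative assumptions, not the platform's documented API.

```python
import json
import urllib.request

API_BASE = "https://platform.example.com/api/v1"  # illustrative base URL


def build_create_job_request(spider: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a POST request that queues a spider job."""
    payload = json.dumps({"spider": spider, "priority": "normal"}).encode()
    return urllib.request.Request(
        f"{API_BASE}/jobs",  # hypothetical endpoint
        data=payload,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


req = build_create_job_request("product_spider", token="secret")
# urllib.request.urlopen(req) would actually submit the job; omitted here.
print(req.full_url)  # -> https://platform.example.com/api/v1/jobs
```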
Join developers deploying spiders at scale.
"We moved from Scrapy Cloud to this platform and cut our infrastructure costs by 70%. The Hetzner auto-scaling is incredible."
Sarah Chen
Head of Data Engineering, DataFlow Inc.
See it in action.
# Complete spider configuration
spider:
  name: product_spider
  repo: github.com/acme/scrapers
  settings:
    concurrent_requests: 16
    download_delay: 0.5
    timeout_seconds: 300

schedule:
  cron: "0 */6 * * *"
  timezone: "America/New_York"

proxy:
  provider: gluetun
  vpn_type: wireguard
  rotate: true

pipeline:
  webhook: https://api.example.com/items
  export: json

worker:
  type: cx21
  auto_destroy: true
One config file. Everything handled.
Define your spider, schedule, proxy, pipeline, and infrastructure in a single YAML file. The platform handles the rest.
- Proxy setup via Gluetun containers
- Item pipeline with webhook callbacks
- Cron scheduling with timezone support
- Auto-cleanup after job completion
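For the webhook pipeline, each batch of items would typically be POSTed as JSON to the configured URL. A minimal sketch of the payload side; the field names are our assumption, not a documented schema:

```python
import json

WEBHOOK_URL = "https://api.example.com/items"  # from the config above


def build_webhook_payload(job_id: str, items: list[dict]) -> bytes:
    """Serialize a batch of scraped items for delivery to the webhook."""
    body = {"job_id": job_id, "count": len(items), "items": items}
    return json.dumps(body).encode("utf-8")


payload = build_webhook_payload("job-42", [{"title": "Widget A", "price": 19.99}])
print(json.loads(payload)["count"])  # -> 1
```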