Crawl Policy

The crawl block defines the policy for bounded site traversal. A minimal example:

crawl:
  mode: contract_focus
  budget:
    max_pages: 25
    max_depth: 2
  scheduler:
    max_workers: 3
    per_host_concurrency: 1
    politeness_delay: 1.0
  safety:
    respect_robots: true
    allow_redirects: false
    allow_cross_domain: false

Top-level fields

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| mode | enum | contract_focus | Crawl mode: seed_hunt, contract_focus, structure_guarded, or explorer. |
| budget | object | see below | Page, depth, attempt, and session limits. |
| scheduler | object | see below | Worker, per-host, politeness, timeout, and retry settings. |
| safety | object | see below | Robots, redirects, host allow/deny, path blocking. |
| escalation | object | default-deny | Paid/model escalation gates. |
| target_contracts | list | [] | Contracts the crawl is trying to find. |
| scrape_contracts | boolean or list | false | Whether to scrape representative URLs after crawl. |
| scrape_url_limit_per_contract | integer | 1 | Representative scrape cap. |
| fetcher_type | enum | auto | Fetch tier for crawl page acquisition. |
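
As a rough illustration of how the contract-related fields might be combined, the fragment below asks the crawl to look for one contract and then scrape a single representative URL for it. The contract name product_page is a hypothetical placeholder, not something defined on this page, and the exact shape of target_contracts entries is assumed to be a list of names.

```yaml
crawl:
  mode: explorer
  # Hypothetical contract name; substitute a contract defined in your project.
  target_contracts:
    - product_page
  # Boolean form: scrape representative URLs after the crawl finishes.
  scrape_contracts: true
  # Documented default is 1; shown explicitly for clarity.
  scrape_url_limit_per_contract: 1
  # Only the default value "auto" is documented on this page.
  fetcher_type: auto
```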

Budget

| Field | Default | Description |
| --- | --- | --- |
| max_pages | 1 | Maximum successful pages. |
| max_depth | 0 | Link depth from seeds. |
| max_attempts | null | Optional cap including failures and policy-blocked URLs. |
| max_pages_per_host | null | Optional host-level page cap. |
| crawl_session_id | null | Checkpoint/session id when persistence is enabled. |
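
The defaults (one page, depth 0) fetch little beyond the seed, so most crawls will raise them. The sketch below sets every budget field; the numbers and the session id are illustrative values chosen for this example, not recommendations from the source.

```yaml
crawl:
  budget:
    max_pages: 50            # stop after 50 successful pages
    max_depth: 3             # follow links up to 3 hops from the seeds
    max_attempts: 100        # also counts failures and policy-blocked URLs
    max_pages_per_host: 20   # cap for any single host
    # Hypothetical id; only meaningful when persistence is enabled.
    crawl_session_id: nightly-docs-crawl
```

Setting max_attempts above max_pages leaves headroom for failed or blocked fetches while still bounding the total work.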

Scheduler

| Field | Default | Description |
| --- | --- | --- |
| max_workers | 1 | Total concurrent crawl workers. |
| per_host_concurrency | 1 | Concurrent workers per host. |
| politeness_delay | 1.0 | Minimum delay between same-host fetches. |
| fetch_timeout_seconds | 15.0 | Fetch timeout for crawl pages. |
| max_fetch_retries | 2 | Fetch retry cap. |
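
A possible scheduler block for a crawl that spans several hosts is sketched below. It raises total concurrency while keeping one worker per host, so parallelism comes from crawling different hosts at once. The values are illustrative, and the unit of politeness_delay is assumed here to be seconds, which this page does not state.

```yaml
crawl:
  scheduler:
    max_workers: 4              # total concurrent workers across all hosts
    per_host_concurrency: 1     # never more than one in-flight fetch per host
    politeness_delay: 2.0       # assumed to be seconds; not stated in the source
    fetch_timeout_seconds: 10.0
    max_fetch_retries: 1
```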

Safety

| Field | Default | Description |
| --- | --- | --- |
| respect_robots | true | Honor robots.txt. |
| allow_redirects | false | Allow redirects during crawl fetches. |
| allow_cross_domain | false | Permit crawling across domains. |
| allowed_hosts | [] | Explicit host allowlist. If empty, seeds derive allowed hosts. |
| denied_hosts | [] | Explicit host denylist. |
| blocked_path_prefixes | [] | Path prefixes to skip before fetch. |
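
The host and path controls can be combined, as in the sketch below. The host names and path prefixes are placeholders invented for this example. How hosts are matched (exact name versus subdomain) and how the allowlist and denylist interact are not described on this page, so treat the behavior as an assumption to verify.

```yaml
crawl:
  safety:
    respect_robots: true
    allow_redirects: false
    allow_cross_domain: false
    # Explicit allowlist; when left empty, allowed hosts derive from the seeds.
    allowed_hosts:
      - example.com
      - docs.example.com
    # Placeholder host to exclude.
    denied_hosts:
      - tracker.example.com
    # URLs whose path starts with one of these are skipped before fetch.
    blocked_path_prefixes:
      - /login
      - /cart
```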

Escalation

Escalation is default-deny. The settings below keep model discovery and paid scrapers off, with both call caps at 0:

crawl:
  escalation:
    allow_model_discovery: false
    allow_paid_scrapers: false
    max_llm_calls: 0
    max_paid_scraper_calls: 0
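
To opt in to a limited amount of escalation, a configuration along the following lines may work. It assumes that a gate must be switched on and its matching cap raised above zero before any calls are made; this page does not confirm that interaction, and the cap of 5 is an arbitrary illustrative value.

```yaml
crawl:
  escalation:
    allow_model_discovery: true   # opt in to model-based discovery
    allow_paid_scrapers: false    # paid scrapers stay disabled
    max_llm_calls: 5              # illustrative cap, not a recommended value
    max_paid_scraper_calls: 0
```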