A3Node

Experimental

When Yosoi visits a page with a browser fetcher, DOMLoader runs a behavior tree to clear obstacles and exhaust content triggers before capturing the final HTML. This probe phase is thorough but takes time — on a page with load-more pagination it may run dozens of cycles.

A3Node is the caching layer that eliminates this overhead on repeat visits. After a successful probe, the sequence of actions that worked is stored as a stability recipe. On the next visit to the same domain, the recipe is replayed directly, skipping the probe entirely.

The Stability Recipe

A recipe is an ordered list of ActRecord entries, each recording what kind of action was taken and how many cycles it ran:

{
  "domain": "shop.example.com",
  "acts": [
    { "kind": "cookie", "cycles": 1 },
    { "kind": "load_more", "cycles": 7 }
  ],
  "discovered_at": "2026-05-23T14:00:00Z",
  "replay_count": 4,
  "last_replayed_at": "2026-05-23T18:00:00Z"
}

An empty acts list is a valid recipe — it means the domain was probed and needed no action. Yosoi stores it anyway so the next visit skips probing entirely.
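As a rough illustration of reading this shape, the sketch below parses a recipe with the standard library only. `summarize_recipe` and the `static.example.com` domain are hypothetical names used for the example; they are not part of Yosoi.

```python
import json

def summarize_recipe(recipe_text: str):
    """Return (action kinds in order, total cycles) for one recipe JSON document."""
    recipe = json.loads(recipe_text)
    kinds = [act["kind"] for act in recipe["acts"]]
    total_cycles = sum(act["cycles"] for act in recipe["acts"])
    return kinds, total_cycles

# A trimmed copy of the recipe above, plus a hypothetical empty recipe.
shop = (
    '{"domain": "shop.example.com", "acts": ['
    '{"kind": "cookie", "cycles": 1}, {"kind": "load_more", "cycles": 7}]}'
)
static = '{"domain": "static.example.com", "acts": []}'

print(summarize_recipe(shop))    # (['cookie', 'load_more'], 8)
print(summarize_recipe(static))  # ([], 0)
```

The empty recipe parses cleanly and simply reports zero work, which matches the idea that "no action needed" is a result worth caching.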

Storage Location

Recipes are stored in .yosoi/a3nodes/, one JSON file per domain:

.yosoi/
  a3nodes/
    a3node_shop_example_com.json
    a3node_finance_yahoo_com.json

The .yosoi/ directory is gitignored by default, so recipes stay local. To share them across a team, commit a3nodes/ the same way you would commit the selectors/ subdirectory: both hold stable, machine-readable data.
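The two example filenames suggest that dots in the domain become underscores. The helper below is a minimal sketch under that assumption; `recipe_path` is a hypothetical name, and how Yosoi treats other characters (hyphens, ports) is not covered here.

```python
from pathlib import Path

def recipe_path(domain: str, root: Path = Path(".yosoi")) -> Path:
    """Guess the recipe file path for a domain."""
    # Assumption: dots are replaced by underscores, inferred from the
    # example filenames; Yosoi's actual rule may differ.
    safe = domain.replace(".", "_")
    return root / "a3nodes" / f"a3node_{safe}.json"

print(recipe_path("shop.example.com").as_posix())
# .yosoi/a3nodes/a3node_shop_example_com.json
```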

Enabling A3Node

A3Node is experimental and opt-in. Pass experimental_a3node=True when constructing a browser fetcher:

from yosoi.core.fetcher.waterfall import JSFetcher

async with JSFetcher(experimental_a3node=True) as fetcher:
    result = await fetcher.fetch('https://shop.example.com/catalog')

Or via the waterfall fetcher in the pipeline:

from yosoi.core.fetcher import create_fetcher

# pipeline and url are assumed to be defined elsewhere in your code
fetcher = create_fetcher('waterfall', experimental_a3node=True)
async with fetcher:
    await pipeline.process_url(url, fetcher=fetcher)

Without experimental_a3node=True, every browser fetch runs a fresh DOMLoader probe regardless of what’s stored.

What Gets Stored

After each successful DOMLoader run, the acts are written to .yosoi/a3nodes/. The replay_count increments on each successful replay; last_replayed_at updates with the timestamp.

A3NodeStorage exposes the full recipe lifecycle:

from yosoi.storage.a3node import A3NodeStorage, ActRecord

storage = A3NodeStorage()

# Load all cached recipes at startup
recipes = storage.load_all()  # dict[str, A3Node]

# Inspect a specific domain
node = storage.load('shop.example.com')
if node is not None:
    print(node.acts)           # [ActRecord(kind='cookie', cycles=1), ...]
    print(node.replay_count)   # 4
    print(node.battle_tested)  # True if replay_count >= 3

# Delete a stale recipe
storage.delete('shop.example.com')

A3Node Properties

Property          Type             Description
domain            str              Bare domain string (e.g. 'shop.example.com')
acts              list[ActRecord]  Ordered list of recorded actions
discovered_at     str              ISO timestamp of when this recipe was first recorded
replay_count      int              Number of successful replays
last_replayed_at  str | None       ISO timestamp of the most recent replay
is_empty          bool             True when acts is empty (no action needed)
battle_tested     bool             True when replay_count >= 3
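To make the two derived properties concrete, here is a small stand-in model built with dataclasses. `ActRecordSketch` and `A3NodeSketch` are hypothetical classes written for this example; they mirror the table above but are not Yosoi's own definitions.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class ActRecordSketch:
    kind: str
    cycles: int

@dataclass
class A3NodeSketch:
    domain: str
    acts: List[ActRecordSketch] = field(default_factory=list)
    discovered_at: str = ""
    replay_count: int = 0
    last_replayed_at: Optional[str] = None

    @property
    def is_empty(self) -> bool:
        # True when the probe found nothing to do
        return not self.acts

    @property
    def battle_tested(self) -> bool:
        # Threshold of 3 taken from the table above
        return self.replay_count >= 3

node = A3NodeSketch(
    domain="shop.example.com",
    acts=[ActRecordSketch("cookie", 1), ActRecordSketch("load_more", 7)],
    replay_count=4,
)
print(node.is_empty, node.battle_tested)  # False True
```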

FAQs

What happens when a site’s page structure changes?

A3Node replay runs the stored actions against the current DOM. If the DOM changed (for example, the cookie banner selector is gone or the load-more button text changed), an action may silently succeed with zero effect. In that case, the content count comparison inside ClickTrigger detects that content stopped growing and marks the trigger exhausted.
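The idea of a content count comparison can be sketched as a loop that stops as soon as a click adds nothing new. This is an illustration of the general technique only, not ClickTrigger's implementation; `click_until_exhausted` and `FakePage` are hypothetical names.

```python
from typing import Callable

def click_until_exhausted(
    click: Callable[[], None],
    count_items: Callable[[], int],
    max_cycles: int = 50,
) -> int:
    """Click until the item count stops growing; return the productive cycles."""
    cycles = 0
    before = count_items()
    while cycles < max_cycles:
        click()
        after = count_items()
        if after <= before:
            break  # no new content appeared: treat the trigger as exhausted
        cycles += 1
        before = after
    return cycles

class FakePage:
    """Stand-in page: each click loads 10 more items until batches run out."""
    def __init__(self, batches: int) -> None:
        self.items = 10
        self.batches = batches

    def click(self) -> None:
        if self.batches > 0:
            self.items += 10
            self.batches -= 1

    def count(self) -> int:
        return self.items

healthy = FakePage(batches=2)
stale = FakePage(batches=0)  # the click "succeeds" but changes nothing
print(click_until_exhausted(healthy.click, healthy.count))  # 2
print(click_until_exhausted(stale.click, stale.count))      # 0
```

The stale page models the silent no-op case: the click raises no error, yet the count comparison ends the loop after a single attempt.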

If the site changed significantly, delete the recipe and let DOMLoader re-probe:

from yosoi.storage.a3node import A3NodeStorage
A3NodeStorage().delete('shop.example.com')

Or delete all recipes and let them rebuild on next use:

rm .yosoi/a3nodes/*.json

Why is replay_count not incrementing?

Today, storage.record_replay() is called only on the empty-recipe path. The non-empty replay path runs a fresh probe instead of replaying acts, so replay_count never increments for those domains. This is a known limitation of the current experimental implementation. Until full act replay lands, battle_tested is meaningful only for empty-recipe domains.

Is A3Node safe to use in production?

It is experimental. The replay path for non-empty recipes falls back to a fresh DOMLoader probe rather than actually replaying the stored acts. The caching infrastructure is stable; the replay optimization is what’s incomplete. Use it for the empty-recipe fast path and treat act replay as a preview.
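The behavior described on this page can be summarized as a small decision function. This is a sketch of the documented behavior, not Yosoi's code; `plan_fetch` and its return labels are hypothetical.

```python
from typing import List, Optional

def plan_fetch(a3node_enabled: bool, acts: Optional[List[dict]]) -> str:
    """Describe what a browser fetch does, per the behavior documented here.

    acts is None when no recipe is stored for the domain.
    """
    if not a3node_enabled:
        return "probe"       # flag off: always a fresh DOMLoader probe
    if acts is None:
        return "probe"       # no recipe yet: probe, then store the result
    if not acts:
        return "skip_probe"  # empty recipe: fast path, replay is recorded
    return "probe"           # non-empty recipe: falls back to a probe today

print(plan_fetch(True, []))                                   # skip_probe
print(plan_fetch(True, [{"kind": "load_more", "cycles": 7}]))  # probe
print(plan_fetch(False, []))                                  # probe
```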

Can I pre-seed recipes manually?

Yes. The file format is plain JSON. Create a file at .yosoi/a3nodes/a3node_<domain>.json with the shape shown above. Yosoi loads all files in that directory at fetcher startup via A3NodeStorage.load_all().
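A pre-seeding script might look like the sketch below. It rests on several assumptions that are not confirmed here: that dots in the domain map to underscores in the filename (inferred from the example filenames), and that Yosoi accepts a replay_count of 0 and a null last_replayed_at in a hand-written file. `seed_recipe` is a hypothetical helper.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path

def seed_recipe(domain: str, acts: list, root: Path) -> Path:
    """Write a recipe file under root/a3nodes/ and return its path."""
    recipe = {
        "domain": domain,
        "acts": acts,
        "discovered_at": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "replay_count": 0,          # assumption: a fresh recipe starts at zero
        "last_replayed_at": None,   # assumption: null means "never replayed"
    }
    # Assumption: dots become underscores, as in the example filenames.
    safe = domain.replace(".", "_")
    path = root / "a3nodes" / f"a3node_{safe}.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(recipe, indent=2), encoding="utf-8")
    return path

# Demo writes into a temporary directory; pass Path(".yosoi") for real use.
with tempfile.TemporaryDirectory() as tmp:
    path = seed_recipe("shop.example.com", [{"kind": "cookie", "cycles": 1}], Path(tmp))
    print(path.name)  # a3node_shop_example_com.json
```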

References

VoidCrawl. Cascading Labs. Rust-native CDP browser automation for Python via PyO3. https://github.com/CascadingLabs/VoidCrawl

DOMLoader. Cascading Labs. Behavior-tree page loader for Yosoi browser fetchers. /guides/dom-loader/