Skip to content

interactive: extract scope-tree renderer into a substrate-generic backend#757

Merged
frankmcsherry merged 1 commit into
master-nextfrom
interactive_render_extract
Jun 11, 2026
Merged

interactive: extract scope-tree renderer into a substrate-generic backend#757
frankmcsherry merged 1 commit into
master-nextfrom
interactive_render_extract

Conversation

@frankmcsherry

Copy link
Copy Markdown
Member

Both ddir_vec and ddir_col carried their own copy of the scope-tree walk (render_tree/resolve/RItem), differing only in the leaf operators and the Col/Variable type spellings. This lifts the walk into the library as interactive::backend::render_tree<B>, generic over a Backend trait that fixes the differential container and supplies the leaf operators (linear/join/reduce/arrange/inspect/leave_dynamic).

Design

Mirrors DD's container-generic style rather than reaching for GATs:

  • Generic over the container C = B::Container, with Time fixed concrete (ir::Time).
  • Col = Collection<'s, Time, C> is the plain container-generic collection; only the arrangement type is a GAT (type Arr<'scope>, bounded just Clone).
  • Region enter/leave stay container-generic via Enter/Leave<Time, Time, ..Container = C> identity bounds (a same-timestamp region preserves the container), so nested scopes recurse with the same backend.
  • leave_dynamic is a trait method because vec uses a Collection method and col a free columnar:: fn.

Result

  • backend::{vec,col} each become a Backend impl plus their substrate leaf ops.
  • ddir_{vec,col} thin to drivers calling the shared renderer (-219 / -353 lines).
  • The walk is single-sourced; vec/col produce identical outputs. Verified on reach, tc, and scc (nested scopes / enter_at / min).

This is the shared prerequisite for reviving the DDIR server and standing up a wasm front-end — both can now call interactive::backend::vec::render_tree instead of re-deriving a renderer.

🤖 Generated with Claude Code

…kend

Both example binaries carried their own copy of the scope-tree walk
(render_tree/resolve/RItem), differing only in the leaf operators and the
Col/Variable type spellings. Lift the walk into the library as
interactive::backend::render_tree<B>, generic over a Backend trait that
fixes the differential container and supplies the leaf operators
(linear/join/reduce/arrange/inspect/leave_dynamic).

Mirrors DD's container-generic style: generic over C = B::Container with
Time fixed concrete, Col = Collection<'s, Time, C> plain; only the
arrangement type is a (Clone-only) GAT. Region enter/leave stay
container-generic via Enter/Leave<Time, Time, ..Container = C> identity
bounds, so nested scopes recurse with the same backend.

backend::{vec,col} each become a Backend impl plus their substrate leaf
ops; ddir_{vec,col} thin to drivers calling the shared renderer. vec/col
produce identical outputs; reach/tc/scc verified.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@frankmcsherry frankmcsherry merged commit 0bdd76d into master-next Jun 11, 2026
6 checks passed
@frankmcsherry frankmcsherry deleted the interactive_render_extract branch June 11, 2026 22:55
frankmcsherry added a commit that referenced this pull request Jun 12, 2026
… as data

Adds backend::vec::evaluate(program, inputs): run a scope-tree Program on
explicitly provided rows (single worker, in-process) and return every
export's consolidated final contents as data.

This is the missing driver half of the #757 split: render_tree moved into
the library, but inputs could still only come from gen_row and outputs
could only leave via Inspect eprintln, which is why semantic questions kept
growing one-off env vars (QUERY, CLONE_RT, EDGES_FILE) on the example
binaries. With evaluate(), end-to-end properties — clone identity,
explanation sufficiency, demand-set excess — are ordinary #[test]s over
data, and the explain rewrite's demand:input* exports are read directly
instead of scraped from stderr.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
frankmcsherry added a commit that referenced this pull request Jun 12, 2026
…emand (#758)

* interactive/backend: in-process evaluate() — explicit inputs, exports as data

Adds backend::vec::evaluate(program, inputs): run a scope-tree Program on
explicitly provided rows (single worker, in-process) and return every
export's consolidated final contents as data.

This is the missing driver half of the #757 split: render_tree moved into
the library, but inputs could still only come from gen_row and outputs
could only leave via Inspect eprintln, which is why semantic questions kept
growing one-off env vars (QUERY, CLONE_RT, EDGES_FILE) on the example
binaries. With evaluate(), end-to-end properties — clone identity,
explanation sufficiency, demand-set excess — are ordinary #[test]s over
data, and the explain rewrite's demand:input* exports are read directly
instead of scraped from stderr.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* interactive/explain: filter both sides' times before splitting join demand

The join backward rule maps demand for (k, (v1, v2)) @ t to (k, v1) @ <= t
and (k, v2) @ <= t. The implementation applied each side's <= t filter only
to that side's contribution, projecting the partner's user-time coordinates
away unfiltered. Over the signed host tables (witness + forward carry stable
-1 rows at retraction iterations) that projection is not a faithful
existential: pair rows differing only in the partner's time -- a +1 while
the partner held the row and a -1 from its later retraction -- merge into
the same contribution row and cancel, annihilating demand the time-valid
configuration justifies.

Concretely (scc, 1000 nodes / 1100 edges, QUERY=773:466): demand for the
fwd-label proposal at node 25 paired labels(25)=0 with scc edge (25,236),
which scc retracts at outer iter 1 as trim shrinks. The left contribution's
+1/-1 pair cancelled, so the edge was demanded but label(25) never was; the
0-flood justification chase died there, the forward clone's labels diverged
(label(466) = 11 != 0), trim_fwd lost 466->28, and the demand-set (66 edges)
failed to regenerate the queried output.

Fix: one dep-pair join carrying BOTH user chains, with both outer-aligned
u_in <= u_out filters applied before either side is projected away. A
(left, right) pair only explains output demanded at u_out if both inputs
were present at times <= u_out. This is also a sound narrowing: pairs whose
partner arrived later no longer contribute demand.

Verified: all 22 scc-edge queries at 100 nodes / 110 edges and 7/7 sampled
at 1000/1100 now produce demand-sets that regenerate their queried row
(EDGES_FILE re-run); 773:466 grows 66 -> 278 edges and is sufficient;
several 100/110 queries shrink (e.g. 25 -> 20, 16 -> 11); the global query
is unchanged at 110; interactive lib tests pass.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* interactive: end-to-end sufficiency tests for the explanation rewrite

The central property: for a query against a program's output, the reported
demand-set — fed back through the original program as its only input — must
regenerate the queried output row.

- clone_identity_preserves_scc_output: CLONE_RT as a test.
- scc_row_explanations_sufficient_small: all 12 scc-edge queries at
  50 nodes / 55 edges, each demand-set replayed and checked.
- demand_is_a_subset_of_the_input.
- [ignored] scc_row_explanations_sufficient_100: the 22-query sweep.
- [ignored] scc_join_partner_time_regression: the 1000/1100 (773,466)
  query. Fails against the parent of the join-time fix with the exact
  signature from its investigation (66 demanded rows, queried row not
  regenerated); passes with it.

The whole suite, ignored tests included, runs in ~1.4s in release.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* interactive/explain: narrow join demand by the produced output value

The join backward rule matched demand to candidate (left, right) pairs on
K_out alone; the demanded output VALUE never constrained which pairs were
charged, so every same-key sibling was demanded — every in-edge of a node
and its ancestor label cone, not just the pair that carried the demanded
label. The pair table now also records the chained V_out each pair
produces (the projection's value fields, Pos-expanded against the
original arities), and the dep-pair join filters produced == demanded
element-wise, alongside the two time filters.

Measured on scc row queries (demand-set size vs greedy 1-minimal):
50 nodes / 55 edges: 119 -> 71 demanded over 12 queries (3.72x -> 2.22x);
100 / 110: 736 -> 395 over 22 queries (4.72x -> 2.53x). Sufficiency suite
(including the 1000/1100 partner-time regression) passes.

An earlier attempt at this narrowing (pre #757-era experiments) appeared
to break the demand fixed point's self-correction; that failure was the
partner-time cancellation fixed in the previous commit, not a property of
value narrowing.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* interactive: demand-excess report (greedy 1-minimal shrink)

An ignored, non-asserting test that prints per-query demand-set size
against a greedy 1-minimal shrink (drop rows while the replay still
regenerates the queried output) — the working metric for the
over-approximation work.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* interactive: tc and reach explanation tests; multi-input demand harness

Generalizes the sufficiency/excess helpers to per-import shapes and
multiple demand-set exports (reach demands both edges and roots), and adds
transitive closure and reachability cases to the default suite and the
excess report.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* interactive: fuzz sweep, multi-query, and count tests; CI gates the suite

- fuzz_explanations_sufficient: 10 seeded graphs x {scc, tc, reach}, every
  output row queried and its demand-set replayed (~300 sufficiency checks,
  ~11s release). Both soundness bugs so far surfaced only at particular
  scales/instances; this is the standing net for the next one.
- two_simultaneous_queries_sufficient: two queries seeded together with
  distinct q ids; the union demand must regenerate both rows. Exercises the
  q-id plumbing through every reverse rule.
- count_explanations_sufficient_small: count over duplicate-free inputs is
  sufficient today (the keyed all-rows demand is exactly the count's
  multiset).
- count_duplicate_inputs_known_gap (#[ignore]d, fails by design): demand-
  sets are sets but count outputs depend on input multiplicities; with a
  duplicated row, indeg(5)=3 demands {(1,5),(2,5)} and replays as 2. The
  motivating case for a multiplicity ("diff") dimension on demand.
- CI: run the suite with --include-ignored in release (skipping the metric
  report and known-gap tests), so the soundness regression gates merges.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* interactive: replay oracle restricts original inputs (reference multiplicities)

Demand means "this row, at its reference count": the rewrite's forward
clone consumes demand-sets by semijoining the actual input, preserving
source multiplicities. The test oracle instead replayed the demand-set
itself at multiplicity 1, which understates counts when inputs contain
duplicate rows — the supposed count 'known gap' was this oracle bug, not
a rewrite gap. Replays now restrict the original inputs to the demanded
rows; the count-with-duplicates case flips to a passing test and the CI
known-gap skip is removed.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* interactive: retire the env-var switches

- CLONE_RT: deleted; the identity check is the clone_identity_preserves_*
  test.
- EXPLAIN_DEBUG_DEP: the library no longer reads the environment; the
  rewrite takes explain::Options { debug_inspects } via explain_with(),
  and ddir_vec maps a --debug-demand flag onto it.
- QUERY / DIAG: become ddir_vec flags (--query=K:V,Q / --diag); flag args
  are stripped before positional parsing, and the timely tail is forwarded
  from the cleaned argument list instead of raw std::env::args().

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant