interactive: extract scope-tree renderer into a substrate-generic backend#757
Merged
Merged
Conversation
…kend
Both example binaries carried their own copy of the scope-tree walk
(render_tree/resolve/RItem), differing only in the leaf operators and the
Col/Variable type spellings. Lift the walk into the library as
interactive::backend::render_tree<B>, generic over a Backend trait that
fixes the differential container and supplies the leaf operators
(linear/join/reduce/arrange/inspect/leave_dynamic).
Mirrors DD's container-generic style: generic over C = B::Container with
Time fixed concrete, Col = Collection<'s, Time, C> plain; only the
arrangement type is a (Clone-only) GAT. Region enter/leave stay
container-generic via Enter/Leave<Time, Time, ..Container = C> identity
bounds, so nested scopes recurse with the same backend.
backend::{vec,col} each become a Backend impl plus their substrate leaf
ops; ddir_{vec,col} thin to drivers calling the shared renderer. vec/col
produce identical outputs; reach/tc/scc verified.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
frankmcsherry
added a commit
that referenced
this pull request
Jun 12, 2026
… as data Adds backend::vec::evaluate(program, inputs): run a scope-tree Program on explicitly provided rows (single worker, in-process) and return every export's consolidated final contents as data. This is the missing driver half of the #757 split: render_tree moved into the library, but inputs could still only come from gen_row and outputs could only leave via Inspect eprintln, which is why semantic questions kept growing one-off env vars (QUERY, CLONE_RT, EDGES_FILE) on the example binaries. With evaluate(), end-to-end properties — clone identity, explanation sufficiency, demand-set excess — are ordinary #[test]s over data, and the explain rewrite's demand:input* exports are read directly instead of scraped from stderr. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
frankmcsherry
added a commit
that referenced
this pull request
Jun 12, 2026
…emand (#758) * interactive/backend: in-process evaluate() — explicit inputs, exports as data Adds backend::vec::evaluate(program, inputs): run a scope-tree Program on explicitly provided rows (single worker, in-process) and return every export's consolidated final contents as data. This is the missing driver half of the #757 split: render_tree moved into the library, but inputs could still only come from gen_row and outputs could only leave via Inspect eprintln, which is why semantic questions kept growing one-off env vars (QUERY, CLONE_RT, EDGES_FILE) on the example binaries. With evaluate(), end-to-end properties — clone identity, explanation sufficiency, demand-set excess — are ordinary #[test]s over data, and the explain rewrite's demand:input* exports are read directly instead of scraped from stderr. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * interactive/explain: filter both sides' times before splitting join demand The join backward rule maps demand for (k, (v1, v2)) @ t to (k, v1) @ <= t and (k, v2) @ <= t. The implementation applied each side's <= t filter only to that side's contribution, projecting the partner's user-time coordinates away unfiltered. Over the signed host tables (witness + forward carry stable -1 rows at retraction iterations) that projection is not a faithful existential: pair rows differing only in the partner's time -- a +1 while the partner held the row and a -1 from its later retraction -- merge into the same contribution row and cancel, annihilating demand the time-valid configuration justifies. Concretely (scc, 1000 nodes / 1100 edges, QUERY=773:466): demand for the fwd-label proposal at node 25 paired labels(25)=0 with scc edge (25,236), which scc retracts at outer iter 1 as trim shrinks. The left contribution's +1/-1 pair cancelled, so the edge was demanded but label(25) never was; the 0-flood justification chase died there, the forward clone's labels diverged (label(466) = 11 != 0), trim_fwd lost 466->28, and the demand-set (66 edges) failed to regenerate the queried output. Fix: one dep-pair join carrying BOTH user chains, with both outer-aligned u_in <= u_out filters applied before either side is projected away. A (left, right) pair only explains output demanded at u_out if both inputs were present at times <= u_out. This is also a sound narrowing: pairs whose partner arrived later no longer contribute demand. Verified: all 22 scc-edge queries at 100 nodes / 110 edges and 7/7 sampled at 1000/1100 now produce demand-sets that regenerate their queried row (EDGES_FILE re-run); 773:466 grows 66 -> 278 edges and is sufficient; several 100/110 queries shrink (e.g. 25 -> 20, 16 -> 11); the global query is unchanged at 110; interactive lib tests pass. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * interactive: end-to-end sufficiency tests for the explanation rewrite The central property: for a query against a program's output, the reported demand-set — fed back through the original program as its only input — must regenerate the queried output row. - clone_identity_preserves_scc_output: CLONE_RT as a test. - scc_row_explanations_sufficient_small: all 12 scc-edge queries at 50 nodes / 55 edges, each demand-set replayed and checked. - demand_is_a_subset_of_the_input. - [ignored] scc_row_explanations_sufficient_100: the 22-query sweep. - [ignored] scc_join_partner_time_regression: the 1000/1100 (773,466) query. Fails against the parent of the join-time fix with the exact signature from its investigation (66 demanded rows, queried row not regenerated); passes with it. The whole suite, ignored tests included, runs in ~1.4s in release. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * interactive/explain: narrow join demand by the produced output value The join backward rule matched demand to candidate (left, right) pairs on K_out alone; the demanded output VALUE never constrained which pairs were charged, so every same-key sibling was demanded — every in-edge of a node and its ancestor label cone, not just the pair that carried the demanded label. The pair table now also records the chained V_out each pair produces (the projection's value fields, Pos-expanded against the original arities), and the dep-pair join filters produced == demanded element-wise, alongside the two time filters. Measured on scc row queries (demand-set size vs greedy 1-minimal): 50 nodes / 55 edges: 119 -> 71 demanded over 12 queries (3.72x -> 2.22x); 100 / 110: 736 -> 395 over 22 queries (4.72x -> 2.53x). Sufficiency suite (including the 1000/1100 partner-time regression) passes. An earlier attempt at this narrowing (pre #757-era experiments) appeared to break the demand fixed point's self-correction; that failure was the partner-time cancellation fixed in the previous commit, not a property of value narrowing. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * interactive: demand-excess report (greedy 1-minimal shrink) An ignored, non-asserting test that prints per-query demand-set size against a greedy 1-minimal shrink (drop rows while the replay still regenerates the queried output) — the working metric for the over-approximation work. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * interactive: tc and reach explanation tests; multi-input demand harness Generalizes the sufficiency/excess helpers to per-import shapes and multiple demand-set exports (reach demands both edges and roots), and adds transitive closure and reachability cases to the default suite and the excess report. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * interactive: fuzz sweep, multi-query, and count tests; CI gates the suite - fuzz_explanations_sufficient: 10 seeded graphs x {scc, tc, reach}, every output row queried and its demand-set replayed (~300 sufficiency checks, ~11s release). Both soundness bugs so far surfaced only at particular scales/instances; this is the standing net for the next one. - two_simultaneous_queries_sufficient: two queries seeded together with distinct q ids; the union demand must regenerate both rows. Exercises the q-id plumbing through every reverse rule. - count_explanations_sufficient_small: count over duplicate-free inputs is sufficient today (the keyed all-rows demand is exactly the count's multiset). - count_duplicate_inputs_known_gap (#[ignore]d, fails by design): demand- sets are sets but count outputs depend on input multiplicities; with a duplicated row, indeg(5)=3 demands {(1,5),(2,5)} and replays as 2. The motivating case for a multiplicity ("diff") dimension on demand. - CI: run the suite with --include-ignored in release (skipping the metric report and known-gap tests), so the soundness regression gates merges. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * interactive: replay oracle restricts original inputs (reference multiplicities) Demand means "this row, at its reference count": the rewrite's forward clone consumes demand-sets by semijoining the actual input, preserving source multiplicities. The test oracle instead replayed the demand-set itself at multiplicity 1, which understates counts when inputs contain duplicate rows — the supposed count 'known gap' was this oracle bug, not a rewrite gap. Replays now restrict the original inputs to the demanded rows; the count-with-duplicates case flips to a passing test and the CI known-gap skip is removed. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * interactive: retire the env-var switches - CLONE_RT: deleted; the identity check is the clone_identity_preserves_* test. - EXPLAIN_DEBUG_DEP: the library no longer reads the environment; the rewrite takes explain::Options { debug_inspects } via explain_with(), and ddir_vec maps a --debug-demand flag onto it. - QUERY / DIAG: become ddir_vec flags (--query=K:V,Q / --diag); flag args are stripped before positional parsing, and the timely tail is forwarded from the cleaned argument list instead of raw std::env::args(). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> --------- Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Both
ddir_vecandddir_colcarried their own copy of the scope-tree walk (render_tree/resolve/RItem), differing only in the leaf operators and theCol/Variabletype spellings. This lifts the walk into the library asinteractive::backend::render_tree<B>, generic over aBackendtrait that fixes the differential container and supplies the leaf operators (linear/join/reduce/arrange/inspect/leave_dynamic).Design
Mirrors DD's container-generic style rather than reaching for GATs:
C = B::Container, withTimefixed concrete (ir::Time).Col = Collection<'s, Time, C>is the plain container-generic collection; only the arrangement type is a GAT (type Arr<'scope>, bounded justClone).Enter/Leave<Time, Time, ..Container = C>identity bounds (a same-timestamp region preserves the container), so nested scopes recurse with the same backend.leave_dynamicis a trait method because vec uses aCollectionmethod and col a freecolumnar::fn.Result
backend::{vec,col}each become aBackendimpl plus their substrate leaf ops.ddir_{vec,col}thin to drivers calling the shared renderer (-219/-353lines).vec/colproduce identical outputs. Verified onreach,tc, andscc(nested scopes /enter_at/min).This is the shared prerequisite for reviving the DDIR server and standing up a wasm front-end — both can now call
interactive::backend::vec::render_treeinstead of re-deriving a renderer.🤖 Generated with Claude Code