Skip to content

interactive: explanation rewrite on the scope tree; retire the flat IR#753

Merged
frankmcsherry merged 6 commits into
scope-tree-irfrom
explain-tree
Jun 10, 2026
Merged

interactive: explanation rewrite on the scope tree; retire the flat IR#753
frankmcsherry merged 6 commits into
scope-tree-irfrom
explain-tree

Conversation

@frankmcsherry

Copy link
Copy Markdown
Member

Stacked on #752 (scope-tree IR) — review after it merges.

Ports the explanation rewrite to the scope-tree IR and then deletes the flat IR, which it was the last reader of.

The port

  • Clone-with-lifts: the flat rewrite's host map (a host-visible, iteration-coords-folded form of every value) becomes structure — a scope exports its lifted internals, cascading one lift_iter per enclosing exit. The flat version's positional scope tracking, pending pile, leave depth-offset arithmetic, and Leave-aliasing fix-up pass all dissolve into the recursion; the embedding depth is no longer a parameter.
  • The transform: output root { sources, query input, witness clone } + an iterative explain scope { demand-set vars, forward clone on demanded rows, reverse rules, demand exports }. The reverse dataflow is flat inside the explain scope by design (iteration time folds into values), so the per-op lookup rules port nearly verbatim. Boundary bookkeeping is what changes: references resolve through explicit import/export edges to the site they name, and one ordinary shape-preserving lookup against that site's host adapts chain depth in either direction — flat's special Leave rule and two-length Side both dissolve, and flat's latent cross-scope pass-through shape mismatch cannot arise.
  • Loud failure for junk queries: export_shape + a seeding-time assert. A shape-mismatched query previously yielded plausible-looking junk demand (and a false "regression" during verification); now it reports the expected shape.
  • Structural dump: scope_ir::Program::dump prints the tree; dump_explain shows before/after.

The retirement (−2194 lines)

explain.rs, the flat Lowering/lower(), ir::Node/ir::Program + impl, render_program ×2, and the FLAT=1 branches are deleted. ir.rs is now shared vocabulary (LinearOp, RowLike, evaluation, arity transfer). One IR remains.

Verification

  • Demand sets match the flat rewrite byte-for-byte on shape-valid queries before the deletion: scc pairs on two graphs (3 queries each), stable, reach (including its pre-existing empty-root-demand wart, faithfully shared).
  • Clone verified as an identity (CLONE_RT=1) across the corpus before being consumed.
  • After the sweep: 15 lib tests, corpus output counts unchanged on both backends, explain demand sets unchanged.

Noted, not done

  • The flat validate_lift_iter discipline check has no tree equivalent yet (LiftIter is rewrite-emitted; only the applicative parser can surface it).
  • ddir_col --explain (new capability — it never had explain); the future server's query must call export_shape at its seeding.
  • Renaming explain_treeexplain once this lands.

🤖 Generated with Claude Code

frankmcsherry and others added 6 commits June 9, 2026 21:04
First piece of the explanation rewrite's tree port: `clone_into` clones a
program's scopes into an output scope, and every nested scope additionally
lift_iters and exports each internal collection — so every op and feedback
var in the subtree has a host-visible form (user-iter coords folded into the
value, innermost first) at the embedding level, cascading one lift per
enclosing scope exit.

This is the flat rewrite's `host` map become structure: "a scope exports its
lifted internals". The flat version's positional scope tracking, pending
pile, leave depth-offset arithmetic, and the Leave-aliasing fix-up pass all
dissolve; the embedding depth is no longer a parameter (the renderer derives
depth structurally, so a clone needn't know where it will sit).

Verified: structural tests (host coverage of every site on scc, one lift per
level on the re-export chains, identity-clone shape), and behaviorally via
CLONE_RT=1 in ddir_vec — the cloned program's outputs match the plain run
byte-for-byte across the corpus (reach, scc, stable, kcore). The lift chains
execute but are not yet consumed; their content gets exercised when the
reverse rules build pair tables from them.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The full rewrite: output root { sources, query input, witness clone } plus an
iterative `explain` scope { demand-set vars, forward clone on demanded rows,
reverse-tracing ops, demand exports }. The reverse dataflow is flat inside
the explain scope by design (iteration time folds into values, the `folded`
layout), so the per-op lookup rules port nearly verbatim. What the tree
changes is boundary bookkeeping: a reference is *resolved* through explicit
import/export edges to the value site it names, and the ordinary shape-
preserving lookup against that site's host form injects or strips chain
coordinates as depths dictate — flat's special Leave rule and its two-length
Side both dissolve, and flat's latent cross-scope pass-through mismatch
can't arise (routing adapts depth uniformly).

ddir_vec --explain now runs the tree path by default (FLAT=1 --explain for
the flat rewrite); flat gains a matching demand_set debug inspect for A/B.

Verified: demand-sets MATCH flat byte-for-byte on shape-valid queries across
reach, scc on two graphs (3 queries each), and stable. An earlier apparent
scc over-inclusion traced to a SHAPE-INVALID query (key/val not matching the
aggregate export); on such junk queries the two paths produce different junk
- a loud shape check on the query input is the right guard (future: the
shapes pass). On valid queries the paths agree everywhere tested.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
`export_shape` exposes the first export's (k, v); ddir_vec asserts the QUERY
against it at seeding. A mismatched query addresses nothing and yields
plausible-looking junk demand (the source of a false "regression" during the
flat/tree comparison); now it fails with the expected shape spelled out.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
`Program::dump` prints the tree as indented structural text — per scope,
imports and vars first, items in order with Subs nested, then binds and
exports. dump_explain now shows lower_tree output and the explain_tree
rewrite (the flat printer goes with the flat IR).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Deletes, now that explain_tree removed the last reader:
- explain.rs (the flat rewrite, builder, clone, reverse rules) — its tree
  port is explain_tree; the arity transfer functions move to ir.rs.
- The flat Lowering and lower() in lower.rs (lower_tree remains).
- ir::Node / ir::Program and its impl (dump, depths, optimize, rewrite,
  validate_lift_iter). ir.rs is now the shared row/op vocabulary: LinearOp,
  RowLike, field/condition evaluation, arity transfer.
- render_program and the FLAT=1 branches in both backends; the tree path
  (regions, optimizer, --explain via explain_tree) is the only path.

Noted: the flat validate_lift_iter discipline check (LiftIter result must
not be referenced in its own scope) has no tree equivalent yet; LiftIter is
rewrite-emitted today, with only the applicative parser able to surface it.

Verified after the sweep: 15 lib tests, corpus output counts unchanged on
both backends, --explain demand sets unchanged (scc pairs query).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@frankmcsherry frankmcsherry merged commit 817018c into scope-tree-ir Jun 10, 2026
6 checks passed
@frankmcsherry frankmcsherry deleted the explain-tree branch June 10, 2026 10:29
frankmcsherry added a commit that referenced this pull request Jun 10, 2026
#753)

* explain_tree: clone-with-lifts on the scope tree

First piece of the explanation rewrite's tree port: `clone_into` clones a
program's scopes into an output scope, and every nested scope additionally
lift_iters and exports each internal collection — so every op and feedback
var in the subtree has a host-visible form (user-iter coords folded into the
value, innermost first) at the embedding level, cascading one lift per
enclosing scope exit.

This is the flat rewrite's `host` map become structure: "a scope exports its
lifted internals". The flat version's positional scope tracking, pending
pile, leave depth-offset arithmetic, and the Leave-aliasing fix-up pass all
dissolve; the embedding depth is no longer a parameter (the renderer derives
depth structurally, so a clone needn't know where it will sit).

Verified: structural tests (host coverage of every site on scc, one lift per
level on the re-export chains, identity-clone shape), and behaviorally via
CLONE_RT=1 in ddir_vec — the cloned program's outputs match the plain run
byte-for-byte across the corpus (reach, scc, stable, kcore). The lift chains
execute but are not yet consumed; their content gets exercised when the
reverse rules build pair tables from them.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* explain_tree: the explanation transform on the scope tree

The full rewrite: output root { sources, query input, witness clone } plus an
iterative `explain` scope { demand-set vars, forward clone on demanded rows,
reverse-tracing ops, demand exports }. The reverse dataflow is flat inside
the explain scope by design (iteration time folds into values, the `folded`
layout), so the per-op lookup rules port nearly verbatim. What the tree
changes is boundary bookkeeping: a reference is *resolved* through explicit
import/export edges to the value site it names, and the ordinary shape-
preserving lookup against that site's host form injects or strips chain
coordinates as depths dictate — flat's special Leave rule and its two-length
Side both dissolve, and flat's latent cross-scope pass-through mismatch
can't arise (routing adapts depth uniformly).

ddir_vec --explain now runs the tree path by default (FLAT=1 --explain for
the flat rewrite); flat gains a matching demand_set debug inspect for A/B.

Verified: demand-sets MATCH flat byte-for-byte on shape-valid queries across
reach, scc on two graphs (3 queries each), and stable. An earlier apparent
scc over-inclusion traced to a SHAPE-INVALID query (key/val not matching the
aggregate export); on such junk queries the two paths produce different junk
- a loud shape check on the query input is the right guard (future: the
shapes pass). On valid queries the paths agree everywhere tested.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* explain_tree: reject shape-invalid queries loudly

`export_shape` exposes the first export's (k, v); ddir_vec asserts the QUERY
against it at seeding. A mismatched query addresses nothing and yields
plausible-looking junk demand (the source of a false "regression" during the
flat/tree comparison); now it fails with the expected shape spelled out.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* scope_ir: structural tree dump; dump_explain on the tree pipeline

`Program::dump` prints the tree as indented structural text — per scope,
imports and vars first, items in order with Subs nested, then binds and
exports. dump_explain now shows lower_tree output and the explain_tree
rewrite (the flat printer goes with the flat IR).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* Retire the flat IR: the scope tree is the only program representation

Deletes, now that explain_tree removed the last reader:
- explain.rs (the flat rewrite, builder, clone, reverse rules) — its tree
  port is explain_tree; the arity transfer functions move to ir.rs.
- The flat Lowering and lower() in lower.rs (lower_tree remains).
- ir::Node / ir::Program and its impl (dump, depths, optimize, rewrite,
  validate_lift_iter). ir.rs is now the shared row/op vocabulary: LinearOp,
  RowLike, field/condition evaluation, arity transfer.
- render_program and the FLAT=1 branches in both backends; the tree path
  (regions, optimizer, --explain via explain_tree) is the only path.

Noted: the flat validate_lift_iter discipline check (LiftIter result must
not be referenced in its own scope) has no tree equivalent yet; LiftIter is
rewrite-emitted today, with only the applicative parser able to surface it.

Verified after the sweep: 15 lib tests, corpus output counts unchanged on
both backends, --explain demand sets unchanged (scc pairs query).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* explain_tree: module doc describes the full transform

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
frankmcsherry added a commit that referenced this pull request Jun 10, 2026
#753) (#754)

* explain_tree: clone-with-lifts on the scope tree

First piece of the explanation rewrite's tree port: `clone_into` clones a
program's scopes into an output scope, and every nested scope additionally
lift_iters and exports each internal collection — so every op and feedback
var in the subtree has a host-visible form (user-iter coords folded into the
value, innermost first) at the embedding level, cascading one lift per
enclosing scope exit.

This is the flat rewrite's `host` map become structure: "a scope exports its
lifted internals". The flat version's positional scope tracking, pending
pile, leave depth-offset arithmetic, and the Leave-aliasing fix-up pass all
dissolve; the embedding depth is no longer a parameter (the renderer derives
depth structurally, so a clone needn't know where it will sit).

Verified: structural tests (host coverage of every site on scc, one lift per
level on the re-export chains, identity-clone shape), and behaviorally via
CLONE_RT=1 in ddir_vec — the cloned program's outputs match the plain run
byte-for-byte across the corpus (reach, scc, stable, kcore). The lift chains
execute but are not yet consumed; their content gets exercised when the
reverse rules build pair tables from them.



* explain_tree: the explanation transform on the scope tree

The full rewrite: output root { sources, query input, witness clone } plus an
iterative `explain` scope { demand-set vars, forward clone on demanded rows,
reverse-tracing ops, demand exports }. The reverse dataflow is flat inside
the explain scope by design (iteration time folds into values, the `folded`
layout), so the per-op lookup rules port nearly verbatim. What the tree
changes is boundary bookkeeping: a reference is *resolved* through explicit
import/export edges to the value site it names, and the ordinary shape-
preserving lookup against that site's host form injects or strips chain
coordinates as depths dictate — flat's special Leave rule and its two-length
Side both dissolve, and flat's latent cross-scope pass-through mismatch
can't arise (routing adapts depth uniformly).

ddir_vec --explain now runs the tree path by default (FLAT=1 --explain for
the flat rewrite); flat gains a matching demand_set debug inspect for A/B.

Verified: demand-sets MATCH flat byte-for-byte on shape-valid queries across
reach, scc on two graphs (3 queries each), and stable. An earlier apparent
scc over-inclusion traced to a SHAPE-INVALID query (key/val not matching the
aggregate export); on such junk queries the two paths produce different junk
- a loud shape check on the query input is the right guard (future: the
shapes pass). On valid queries the paths agree everywhere tested.



* explain_tree: reject shape-invalid queries loudly

`export_shape` exposes the first export's (k, v); ddir_vec asserts the QUERY
against it at seeding. A mismatched query addresses nothing and yields
plausible-looking junk demand (the source of a false "regression" during the
flat/tree comparison); now it fails with the expected shape spelled out.



* scope_ir: structural tree dump; dump_explain on the tree pipeline

`Program::dump` prints the tree as indented structural text — per scope,
imports and vars first, items in order with Subs nested, then binds and
exports. dump_explain now shows lower_tree output and the explain_tree
rewrite (the flat printer goes with the flat IR).



* Retire the flat IR: the scope tree is the only program representation

Deletes, now that explain_tree removed the last reader:
- explain.rs (the flat rewrite, builder, clone, reverse rules) — its tree
  port is explain_tree; the arity transfer functions move to ir.rs.
- The flat Lowering and lower() in lower.rs (lower_tree remains).
- ir::Node / ir::Program and its impl (dump, depths, optimize, rewrite,
  validate_lift_iter). ir.rs is now the shared row/op vocabulary: LinearOp,
  RowLike, field/condition evaluation, arity transfer.
- render_program and the FLAT=1 branches in both backends; the tree path
  (regions, optimizer, --explain via explain_tree) is the only path.

Noted: the flat validate_lift_iter discipline check (LiftIter result must
not be referenced in its own scope) has no tree equivalent yet; LiftIter is
rewrite-emitted today, with only the applicative parser able to surface it.

Verified after the sweep: 15 lib tests, corpus output counts unchanged on
both backends, --explain demand sets unchanged (scc pairs query).



* explain_tree: module doc describes the full transform



---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant