feat(ir): native ADO pipeline IR (WIP)#960
Conversation
First commit of the "Native ADO Pipeline IR" refactor (see plan).
Adds the typed IR root module `src/compile/ir/` with eight
submodules:
- `ids` - StageId / JobId / StepId newtypes, validated
against the ADO identifier grammar.
- `output` - OutputDecl + OutputRef.
- `env` - EnvValue { Literal | AdoMacro | PipelineVar |
Secret | StepOutput | Coalesce }; AdoMacro is
constructor-checked against ALLOWED_ADO_MACROS.
- `condition` - Condition AST + Expr (codegen lands in
ir-condition-codegen).
- `step` - Step enum + BashStep / TaskStep / CheckoutStep /
DownloadStep / PublishStep with builder helpers.
- `job` - Job + Pool (vmImage / 1ES named pool).
- `stage` - Stage.
- `mod.rs` - Pipeline / PipelineBody / PipelineShape
(Standalone | OneEs | JobTemplate | StageTemplate)
+ placeholder Parameter / Resources / Triggers /
PipelineVar shapes.
No production callers yet - everything is reachable only from the
in-module unit tests. The module carries a deliberate, scoped
`#![allow(dead_code)]` until the `extension-trait-port` commit
wires real callers; the unit tests exercise constructors so silent
breakage would still surface.
Unit-test coverage (59 tests) covers id validation, env-macro
allowlist, condition / step / job / stage / pipeline constructors,
and the OutputRef / OutputDecl / Coalesce shapes.
`cargo build` / `cargo test` / `cargo clippy --all-targets
--all-features` all green.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Adds the IR-to-YAML lowering and emit passes: - `src/compile/ir/lower.rs` walks the typed Pipeline / Stage / Job / Step tree and produces a `serde_yaml::Value` with canonical key order (identity keys -> static config -> payload). Handles every IR variant that does not need cross-step resolution. - `src/compile/ir/emit.rs` is the thin entry point: it composes `lower` with `serde_yaml::to_string`. Result is byte-compatible with the canonical baseline that the prep PR established (#957). Variants that need cross-step resolution (`EnvValue::StepOutput`, `EnvValue::Coalesce`, `Expr::StepOutput`) return a structured error that names the commit which fills them in (`ir-output-lowering`). Unit tests cover the success path for the simple variants and the explicit error path for the deferred ones so the boundary stays load-bearing as later commits land. Round-trip acceptance test in `emit::tests` builds a handcrafted Pipeline with Setup + Agent jobs (containing Checkout / Bash / Publish / Download steps), emits via `emit`, re-parses the YAML through `serde_yaml`, and asserts structural equality against a hand-built reference `Value` - locking the wire shape. Still no production callers; the `#![allow(dead_code)]` on `src/compile/ir/mod.rs` stays for now and is removed in `extension-trait-port`. `cargo build` / `cargo test` (17 groups, 0 failed) / `cargo clippy --all-targets --all-features` all green. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Adds the dependency-graph pass (`src/compile/ir/graph.rs`):
- Walks every step's `env` and `condition` (including
`Condition` AST + nested `Coalesce` children) to collect every
`OutputRef`.
- For each ref, looks up the producer step's location
(`StepLocation { stage, job, declared outputs }`) and adds either
a cross-job edge (same stage, different jobs) or a cross-stage edge
(different stages). Same-job refs contribute nothing - ADO orders
steps within a job by YAML position.
- Validates as a side-effect: `UnknownProducer`, `AnonymousProducer`,
`UnknownOutput` (the producer must declare the named output),
`DuplicateJobId`, `DuplicateStageId`, `DuplicateStepId`, plus
`MixedStagedAndUnstaged` (cross-stage refs between staged and
flat-jobs sections are not supported).
- Runs Kahn's algorithm on both job and stage edge sets to detect
cycles. The error message lists every node still with positive
in-degree so an operator can locate the offending sub-graph.
- Two entry points:
* `resolve(p)` - all-in-one: build graph, detect cycles, merge
derived edges into `job.depends_on` / `stage.depends_on`
(existing values preserved).
* `build_graph(p)` - returns the typed graph without mutating the
pipeline. Useful for diagnostics and tests.
Nine unit tests cover the major paths: cross-job edge derivation,
cross-stage edge derivation, same-job no-op, unknown producer,
unknown output, duplicate ids, cycle detection (with the listed-nodes
error message), coalesce-child edge collection, and a 5-stage chain
that exercises both per-step env and job-level condition walks.
Still no production callers - the graph pass is reachable only from
its unit tests until `extension-trait-port` wires real callers.
`cargo build` / `cargo test` (17 groups, 0 failed) /
`cargo clippy --all-targets --all-features` all green.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Adds the per-consumer-location lowering for `OutputRef` so that
extensions can write `OutputRef { step, name }` once and have the
IR pick exactly one of the three ADO syntaxes:
same job `\$(stepName.X)\`
cross-job same stage `dependencies.<job>.outputs[\stepName.X\]`
cross-stage `stageDependencies.<stage>.<job>.outputs[\stepName.X\]`
Implementation:
- `output::lower_outputref` is the single source of truth for the
three-syntax decision; it takes `ConsumerLocation` and
`ProducerLocation` newtypes so call sites cannot mix them up.
- `lower::LoweringContext` carries the `Graph` + the current
consumer's stage/job through every recursive `lower_*` helper.
`lower` (no-arg public entry) now builds the graph internally
via `graph::build_graph` + `detect_cycles`;
`lower_with_graph` is exposed for callers that already hold a
built graph.
- `EnvValue::Coalesce` lowers to a single `\$[ coalesce(<a>, <b>, ..., '') ]\`
with the trailing `''` safety value appended automatically.
Nested `Coalesce` is flattened. Children inside `\$[ ... ]\` use
ADO expression-atom form: `variables['Name']` for predefined
vars, the un-wrapped step-output reference for `StepOutput`.
- `Condition::Eq` / `Condition::Ne` over `Expr::StepOutput`
thread the same context into the existing condition codegen
(the static subset is unchanged; the dynamic subset now resolves
correctly).
Auto-`isOutput=true` map:
- `graph::Graph` gains `outputs_needing_is_output: BTreeMap<StepId, BTreeSet<String>>`
populated as a side effect of walking every consumer's
`OutputRef`. Same-job references count here (ADO needs
`isOutput=true` for `\$(step.name)\` too).
- `graph::resolve` propagates the map back onto
`OutputDecl::auto_is_output` so producer extensions can read the
flag at emit time without re-deriving it. New unit test
`auto_is_output_flag_only_promotes_referenced_outputs` locks
the contract: only outputs with at least one reader are promoted.
Test coverage: `output::tests` (4 new lowering-syntax tests),
`lower::tests` (Coalesce round-trip + context threading),
`graph::tests` (auto_is_output + cross-step reader counting).
`cargo build` / `cargo test` (17 groups, 0 failed) /
`cargo clippy --all-targets --all-features` all green.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Moves the `Condition` / `Expr` lowering out of `lower.rs` and
into a dedicated `condition::codegen` module so the AST and its
codegen stay colocated.
Functional additions over what `ir-yaml-emit` shipped:
- `Condition::And` / `Condition::Or` flatten nested operators
of the same kind before lowering. `and(a, and(b, c))` emits as
`and(a, b, c)` - matches the existing layout in
`compile_gate_step_external` and stays readable in the YAML.
- `Condition::Custom(s)` now runs through a two-vector injection
check at lower time:
* `crate::validate::contains_pipeline_command(s)` rejects
`##vso[` and `##[` - these would be acted on at runtime if
echoed by an executor.
* `crate::validate::contains_newline(s)` rejects embedded
newlines that would flip a YAML scalar from inline to block
form and change parse semantics.
Crucially, the check does NOT reject `\$(...)\`, `\$[...]\`,
`\${{...}}\` - those are exactly the ADO expressions the escape
hatch exists for. Tests verify both the pass-through (real ADO
expressions accepted) and rejection (pipeline-command markers and
newlines rejected) paths.
- New `CondCodegenCtx { graph, stage, job }` is the per-consumer
context that `lower::LoweringContext::cond_ctx()` builds on
demand. The codegen module no longer borrows `lower::LoweringContext`
directly, so we avoid an internal-module cycle.
Test coverage: 8 new tests in `condition::codegen::tests` cover
every Condition variant, every Expr variant, nested And/Or
flattening, apostrophe-in-Literal escaping, and the two Custom
injection paths. Existing `lower::tests` keep their integration
coverage for env+condition round-trip.
`cargo build` / `cargo test` (17 groups, 0 failed) /
`cargo clippy --all-targets --all-features` all green.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Introduces the surface that the upcoming per-extension `port-*` commits will populate, without breaking any existing call sites. Two new IR concepts: - `Step::RawYaml(String)` is the migration bridge - carries legacy `Vec<String>` step bodies (which are pre-formatted YAML strings) through the IR unchanged. `ir::lower::lower_raw_yaml` parses the body into a `serde_yaml::Value` (stripping a leading `- ` + de-indenting continuation lines so emitters that produced sequence-item form still work) and re-emits it via the canonical normalisation. Invalid bodies surface a clear error rather than producing malformed YAML. Removed by `delete-deprecated-trait-aliases` once no `RawYaml` instances remain. - `extensions::Declarations` is the typed aggregate every extension will eventually return: agent_prepare_steps, setup_steps, agent_finalize_steps, detection_prepare_steps, safe_outputs_steps (all `Vec<Step>`), plus network_hosts, bash_commands, prompt_supplement, mcpg_servers, copilot_allow_tools, pipeline_env, awf_mounts, awf_path_prepends, agent_env_vars, warnings. `CompilerExtension` gains a `declarations(ctx) -> Result<Declarations>` method with a default impl that wraps every legacy per-method output - `prepare_steps` / `setup_steps` results land in `agent_prepare_steps` / `setup_steps` as `Step::RawYaml` entries; every other field is copied through verbatim. The `extension_enum!` macro delegates the new method alongside the existing ones. `#[allow(dead_code)]` covers production paths during the migration window; the smoke test in `extensions::tests::declarations_default_bridges_lean_extension_legacy_methods` locks the bridge contract end-to-end against `LeanExtension`. Subsequent `port-*` commits override `declarations` per extension with real typed Steps and drop the corresponding legacy overrides; the final `delete-deprecated-trait-aliases` commit strips `Step::RawYaml`, the legacy trait methods, and the `#[allow(dead_code)]` annotations together. Pragmatic deviation from the plan's "old method names are gone" acceptance: that would have required updating ~150 call sites (production + tests) in a single commit and was too risky. The default-impl bridge keeps every existing call site working while still establishing the new surface; the migration story for each extension is unchanged. `cargo build` / `cargo test` (1883 tests, 0 failed) / `cargo clippy --all-targets --all-features` / `cargo test --test bash_lint_tests` all green. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Adds a declarations() override on AdoAwMarkerExtension returning the two prepare-phase steps as typed Step::Bash(BashStep) values (no Step::RawYaml). Coexists with the legacy prepare_steps method until compile-target-standalone switches production consumption to declarations(). New helpers marker_bash_step() and aw_info_bash_step() build the typed BashStep with the same bash bodies as the legacy YAML strings, so lowering through ir::emit produces equivalent output. The aw_info step carries Condition::Always (today the YAML string embeds condition: always() verbatim). New unit test declarations_returns_typed_bash_steps_not_raw_yaml locks the shape: must return exactly two Step::Bash values with the canonical display names. Detailed bash-body assertions stay on the legacy-form tests. cargo build / cargo test (1884 tests, 0 failed) / cargo clippy --all-targets --all-features all green. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Adds a declarations() override on GitHubExtension that routes the single 'github' allow-tool through the Declarations bundle. The extension contributes nothing else (no steps, hosts, env vars). Coexists with the legacy allowed_copilot_tools method so production call sites in src/engine.rs keep working until compile-target-* switches to declarations() consumption. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Adds declarations() override routing mcpg_servers, allowed_copilot_tools, and prompt_supplement through the Declarations bundle. Coexists with legacy methods until target compilers switch to declarations() consumption. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Adds declarations() override returning the two prepare-phase steps
as typed Step::Bash(BashStep) values. The conditional prompt-append
step carries Condition::Ne(Expr::Variable('AW_AZ_MOUNTS'), Expr::Literal(''))
which lowers to the same condition string the YAML emits today:
ne(variables['AW_AZ_MOUNTS'], '').
AW_AZ_MOUNTS is a pipeline variable (set via task.setvariable), not
a step output, so it's referenced via Expr::Variable - no OutputRef
is involved and no isOutput=true is needed.
Coexists with the legacy prepare_steps method until target compilers
switch to declarations() consumption.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
🔍 Rust PR ReviewSummary: Solid IR foundation — well-structured, well-tested, and good error handling throughout. Two issues worth addressing before real callers are wired up. Findings🐛 Bugs / Logic Issues
|
Draft PR - work in progress. Companion to merged prep PR #957. 10 commits: 6 IR foundation (typed pipeline IR, serde_yaml emit, dependency graph + Kahn cycle detection, per-location OutputRef lowering, condition codegen with Custom-injection check, Declarations bundle + Step::RawYaml bridge) + 4 always-on extension ports (AdoAwMarker, GitHub, SafeOutputs, AzureCli). Pragmatic deviation: declarations() is a default impl wrapping legacy methods (~150 call sites unchanged); per-extension port-* commits override one at a time. Remaining work in plan.md: port-runtimes/tools (easy, 4hr); port-ado-script + port-exec-context (hard, 1-2 days each, unlock declarative synth-PR); compile-target-* (big bang, 3-5 days, delete src/data/*-base.yml); cleanup. Every commit green on build/test (1884 tests)/clippy/bash-lint.