feat(cli): rerun captured task bundles #1335

Merged
christso merged 1 commit into main from feat/av-wy0.4-rerun-bundles on Jun 9, 2026

christso commented Jun 9, 2026

Collaborator

Summary

  • Add agentv runs rerun <run-dir>, which executes the captured task bundles listed in a run's index.jsonl using the native bundle files (task/EVAL.yaml, task/targets.yaml, task/files/, task/graders/).
  • Support selection by --test-id and --source-target, target replacement via --targets / --target, dotenv loading with --env-file, explicit --output, and safe default output placement outside the captured source bundle.
  • Preserve rerun traceability by carrying source run/test metadata into normal AgentV result metadata.
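
The selection and traceability behaviour summarised above can be pictured with a minimal TypeScript sketch. It is not the code from this PR: the record shape, the field names inside rerun_source, and the helper names (parseIndexJsonl, selectRecords, withRerunSource) are all assumptions made for illustration. Only test_id and metadata.rerun_source are names taken from the PR text.

```typescript
// Hypothetical record shape: the real index.jsonl schema is not shown in this PR.
interface IndexRecord {
  test_id: string;
  target?: string;
  [key: string]: unknown;
}

// Hypothetical filter mirroring the --test-id and --source-target selections.
interface RerunFilter {
  testIds?: string[];
  sourceTargets?: string[];
}

// Hypothetical layout for the traceability block; only the key name
// `rerun_source` appears in the PR description.
interface RerunSource {
  run_dir: string;
  test_id: string;
  target?: string;
}

interface ResultMetadata {
  rerun_source?: RerunSource;
  [key: string]: unknown;
}

// Parse JSONL text: one JSON object per non-empty line.
function parseIndexJsonl(text: string): IndexRecord[] {
  return text
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as IndexRecord);
}

// Keep records that match every filter that was supplied; an absent or
// empty filter list matches everything.
function selectRecords(records: IndexRecord[], filter: RerunFilter): IndexRecord[] {
  return records.filter((record) => {
    const idMatches =
      !filter.testIds?.length || filter.testIds.includes(record.test_id);
    const targetMatches =
      !filter.sourceTargets?.length ||
      (record.target !== undefined && filter.sourceTargets.includes(record.target));
    return idMatches && targetMatches;
  });
}

// Copy existing metadata and attach where the rerun came from.
function withRerunSource(
  metadata: ResultMetadata,
  sourceRunDir: string,
  record: IndexRecord,
): ResultMetadata {
  const rerun_source: RerunSource = {
    run_dir: sourceRunDir,
    test_id: record.test_id,
    target: record.target,
  };
  return { ...metadata, rerun_source };
}

// Example with made-up index contents and a made-up source path.
const indexText = [
  '{"test_id":"case-alpha","target":"default"}',
  '{"test_id":"case-beta","target":"other"}',
].join("\n");

const selected = selectRecords(parseIndexJsonl(indexText), { testIds: ["case-alpha"] });
const rerunMetadata = selected.map((record) =>
  withRerunSource({}, "/tmp/source-run", record),
);
console.log(JSON.stringify(rerunMetadata));
```

In this sketch the filters are combined with AND, so passing both a test id and a source target narrows the selection. Whether the actual command combines them the same way is not stated in the PR.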

Bead

  • Bead: av-wy0.4

Base / Dependencies

Verification

Post-final-rebase local checks:

  • bun run --filter @agentv/core build
  • bun test apps/cli/test/commands/runs/rerun.test.ts apps/cli/test/eval.integration.test.ts (17 pass, 0 fail)
  • bun --filter agentv typecheck
  • bunx biome check apps/cli/src/index.ts apps/cli/src/commands/runs/index.ts apps/cli/src/commands/runs/rerun.ts apps/cli/src/commands/eval/run-eval.ts apps/cli/src/commands/results/manifest.ts apps/cli/test/commands/runs/rerun.test.ts apps/cli/test/fixtures/mock-run-evaluation.ts
  • GitHub CI on fa83b2cc16f2af686fb8fd18be2d3e9338ca7d43: Build, Typecheck, Lint, Test, Check Links, Validate Marketplace, Validate Evals, and Cloudflare Pages all succeeded.

Earlier full-branch verification before the unblock rebase:

  • bun run verify
  • bun run validate:examples (58 valid, 0 invalid)

Manual Red/Green UAT

  • Red on pre-feature main: bun --no-env-file /tmp/agentv-av-wy0-4-red/apps/cli/src/cli.ts runs rerun /tmp/nonexistent exited with Not a valid subcommand name for runs.
  • Green on this branch: agentv runs rerun against a temp captured bundle wrote a separate output run, preserved test_id: case-alpha, emitted metadata.rerun_source, and regenerated the answer as "Alpha answer" instead of replaying the captured "Captured old answer".
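
The "separate output run" behaviour exercised in the Green check can be sketched as a path-containment test. This is an illustration under stated assumptions, not the PR's implementation: the helper names (isInside, resolveRerunOutput) and the sibling-directory naming scheme with a "-rerun" suffix are hypothetical. Only the idea that the default output lands outside the captured source bundle, and that an explicit output is honoured, comes from the PR.

```typescript
import * as path from "node:path";

// True when `candidate` is `parent` itself or lives somewhere beneath it.
function isInside(parent: string, candidate: string): boolean {
  const rel = path.relative(path.resolve(parent), path.resolve(candidate));
  const escapes =
    rel === ".." || rel.startsWith(`..${path.sep}`) || path.isAbsolute(rel);
  return !escapes;
}

// Hypothetical default: a sibling directory next to the source run, so the
// captured bundle is never written into. An explicit output path is used as given.
function resolveRerunOutput(sourceRunDir: string, explicitOutput?: string): string {
  if (explicitOutput !== undefined) {
    return path.resolve(explicitOutput);
  }
  const source = path.resolve(sourceRunDir);
  const candidate = path.join(path.dirname(source), `${path.basename(source)}-rerun`);
  if (isInside(source, candidate)) {
    throw new Error(`Refusing to write rerun output inside source run: ${candidate}`);
  }
  return candidate;
}

// Example with a made-up source run directory.
const defaultOutput = resolveRerunOutput("/tmp/runs/run-1");
console.log(defaultOutput, isInside("/tmp/runs/run-1", defaultOutput));
```

Comparing the result of path.relative against ".." rather than doing a plain string-prefix check avoids treating a sibling such as run-1-rerun as if it were inside run-1.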

Coordination Notes

  • Code/test files were reserved in Agent Mail before editing and during the unblock rebase.
  • This PR should be reviewed but not merged by the worker.

christso force-pushed the feat/av-wy0.4-rerun-bundles branch from 9acfa46 to 6dfe495 on June 9, 2026 03:18

cloudflare-workers-and-pages Bot commented Jun 9, 2026

Deploying agentv with Cloudflare Pages

Latest commit: fa83b2c
Status: ✅  Deploy successful!
Preview URL: https://28e27d30.agentv.pages.dev
Branch Preview URL: https://feat-av-wy0-4-rerun-bundles.agentv.pages.dev

christso force-pushed the feat/av-wy0.4-rerun-bundles branch from 6dfe495 to fa83b2c on June 9, 2026 03:22
christso marked this pull request as ready for review on June 9, 2026 03:24
christso merged commit f116231 into main on Jun 9, 2026
8 checks passed
christso deleted the feat/av-wy0.4-rerun-bundles branch on June 9, 2026 03:35
christso added a commit that referenced this pull request Jun 9, 2026
Simplify agentv eval output to canonical --output directories plus --export files. Remove legacy --out/--artifacts/--output-format and config output.format with migration guidance. Preserve rerun captured task bundle metadata after rebasing on #1335.