
chore(bench): harden skill smoke + delete dead mock loop#295

Merged
drewstone merged 1 commit into main from cleanup/bench-smoke-and-deadfile on Jun 14, 2026


@drewstone

Contributor

Two safe, capability-preserving cleanups.

Harden skill-sandbox-smoke.mts: the old verdict (/reproduce-first/ && /SKILL.md/ matched against the agent's prose) produced false positives on opencode's bundled /nix/store skills and checked the wrong discovery path. The verdict now keys on a unique marker that the in-session agent reads back out of the materialized file at the correct path, ~/.opencode/skill/<name>/SKILL.md (opencode uses the singular skill directory; claude-code uses ~/.claude/skills). The script also documents the two gotchas that produced false readings during bring-up: the singular skill path, and the fact that a bare box.exec (no sessionId) sees a different filesystem than the agent session.

Delete observe-steer-workspace-loop.mts: an unreferenced, dead mock-executor loop. cloud-loop.mts is the real observe→steer proof. The deleted file had been orphaned back onto main by an earlier squash-merge.

No capability change.
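For readers who want the shape of the verdict change, here is a minimal sketch, not the code in skill-sandbox-smoke.mts. It contrasts the old prose regex with a marker read-back check and builds the per-agent skill path described above. All function names (`skillFilePath`, `makeMarker`, `renderSkill`, `oldVerdict`, `markerVerdict`), the marker format, and the use of `reproduce-first` as the skill name are assumptions made for illustration. The `<name>/SKILL.md` layout under ~/.claude/skills is also assumed by analogy with the opencode path.

```typescript
import { randomUUID } from "node:crypto";
import { posix } from "node:path";

type Agent = "opencode" | "claude-code";

// Skill roots as stated in the PR: opencode uses the singular `skill`
// directory, claude-code uses the plural `skills` directory.
// Hypothetical helper; `home` is passed in so the function stays pure.
function skillFilePath(agent: Agent, name: string, home: string): string {
  const root =
    agent === "opencode"
      ? posix.join(home, ".opencode", "skill")
      : posix.join(home, ".claude", "skills");
  return posix.join(root, name, "SKILL.md");
}

// A per-run marker that cannot appear in any bundled skill.
// The `SMOKE_MARKER_` prefix is an assumption for this sketch.
function makeMarker(): string {
  return `SMOKE_MARKER_${randomUUID()}`;
}

// Skill body to materialize at skillFilePath(...); it embeds the marker so
// the agent can only report it by actually reading this file.
function renderSkill(name: string, marker: string): string {
  return `# ${name}\n\nSmoke marker: ${marker}\n`;
}

// Old-style verdict: passes whenever the agent's prose merely mentions the
// skill name and a SKILL.md file, wherever that file lives.
function oldVerdict(agentProse: string): boolean {
  return /reproduce-first/.test(agentProse) && /SKILL.md/.test(agentProse);
}

// Marker verdict: passes only if the agent echoed the unique marker back.
function markerVerdict(agentReply: string, marker: string): boolean {
  return agentReply.includes(marker);
}
```

With this shape, a reply such as "I see a bundled reproduce-first skill under /nix/store with a SKILL.md" satisfies `oldVerdict` but fails `markerVerdict`, which is the false positive the PR describes. The sketch does not cover the session-view gotcha: the read-back still has to run inside the agent session, because a bare box.exec without a sessionId sees a different filesystem.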

… mock loop

- skill-sandbox-smoke: old verdict (/reproduce-first/ && /SKILL.md/ over agent prose)
  false-positived on opencode's bundled /nix/store skills and checked the wrong path. Now keys
  on a UNIQUE marker the in-session agent reads back from the materialized file at the correct
  ~/.opencode/skill/<name>/SKILL.md (opencode singular dir; claude-code uses ~/.claude/skills),
  and documents the session-view gotcha (bare box.exec sees a different FS than the agent session).
- delete observe-steer-workspace-loop.mts: unreferenced dead mock-executor loop (cloud-loop.mts
  is the real observe->steer proof); orphaned back onto main by an earlier squash-merge.

@tangletools tangletools left a comment

Contributor


✅ Auto-approved PR — e45262ea

Blanket team auto-approval is enabled for this reviewer service.
The full PR reviewer audit still runs separately and will publish findings if it detects issues.

tangletools · auto-approval · reason: blanket_auto_approve · 2026-06-14T14:42:29Z

@drewstone drewstone merged commit 79ce774 into main Jun 14, 2026
1 check passed