Skip to content

feat: pre-execution critic gate for side-effecting tools#863

Draft
anandgupta42 wants to merge 4 commits into
mainfrom
feat/critic-gate
Draft

feat: pre-execution critic gate for side-effecting tools#863
anandgupta42 wants to merge 4 commits into
mainfrom
feat/critic-gate

Conversation

@anandgupta42

@anandgupta42 anandgupta42 commented Jun 1, 2026

Copy link
Copy Markdown
Contributor

What does this PR do?

Adds a flag-gated (ALTIMATE_CRITIC_GATE, default OFF) pre-execution "critic gate". Before a side-effecting tool (bash/write/edit/sql_execute/dbt_run/patch) runs, a pluggable Verifier checks the proposed args; on a hard verdict the call is skipped and the reason is fed back so the model can fix-and-retry, instead of executing a bad action. No behavior change when the flag is unset.

  • tool/critic.ts — pure, testable gate + pluggable Verifier interface.
    • ALLOW_ALL — ungated (opt-out / tests).
    • basicSafetyVerifier — the wired default: a conservative, dependency-free heuristic that blocks catastrophic, unambiguous host-destructive bash (rm -rf / incl. system-path / glob / compound / fully-qualified / ${HOME} variants, fork bombs, mkfs/dd on a raw device, recursive chmod of /). Best-effort safety net, not a security boundary — documented as such; defense-in-depth stays with the OS sandbox, the permission system, and a richer verifier a product may inject.
    • gate() is fail-open on verifier error AND on a verifier timeout, so a broken or hung verifier never breaks/hangs the agent.
  • session/prompt.ts — wired into the native-tool execute wrapper just before item.execute, marker-wrapped, emits tool.execute.after on a block so observability sees it. MCP tools use a separate wrapper and are intentionally not gated (documented on DEFAULT_GATED).

Split out of #857 / PR #858, where the critic was previously an unwired no-op flag. This PR makes the flag actually do something.

Type of change

  • New feature (non-breaking change which adds functionality)

Issue for this PR

Closes #862

How did you verify your code works?

  • 100 tests across three files: unit (critic.test.ts), adversarial bypass probes that document caught-vs-known-limits (critic-adversarial.test.ts), and e2e driving the REAL BashTool through the gate with real filesystem effects (critic-e2e.test.ts) — no mocked tool calls. All green; 448 tool-suite tests pass; tsgo typecheck clean; altimate_change markers in prompt.ts balanced (37/37); default-off path unchanged.
  • Adversarial testing found + fixed real bugs: compound commands where the fatal rm wasn't last (rm -rf / && rm -rf ./safe), and a separator glued to the target (rm -rf /;).
  • Reviewed by a multi-model panel (Gemini 3.1 Pro, GLM-5, MiniMax M2.7) before opening. Their findings were applied: removed false positives (rm -rf *, rm -rf ., and rm buried in an argument like git commit -m "...rm -rf /..."), closed false negatives (glob wipes rm -rf /var/*, /.//.., fully-qualified /bin/rm, ${HOME}, long-form/glob chmod), and added an input length cap + verifier timeout.

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my code
  • I have added tests that prove my feature works
  • New and existing unit tests pass locally with my changes

Summary by cubic

Adds a flag-gated pre-execution critic for side-effecting tools that blocks clearly dangerous bash before it runs. Default is off; when enabled, blocked calls return feedback so the model can fix-and-retry.

  • New Features

    • Controlled by ALTIMATE_CRITIC_GATE; pluggable Verifier with ALLOW_ALL and default basicSafetyVerifier.
    • Blocks catastrophic bash (e.g., rm -rf / variants, fork bombs, mkfs/dd to devices, recursive chmod/chown of /); fail-open on verifier error or 5s timeout.
    • Wired into session/prompt.ts just before execution; on block emits tool.execute.after and skips execution. MCP tools are not gated.
    • Gated tools: bash, write, edit, sql_execute, apply_patch. Tests: 104 (unit, adversarial, e2e with real BashTool).
  • Bug Fixes

    • Corrected gated tool id to apply_patch; removed non-existent dbt_run.
    • Closed bypass: inline env assignments (FOO=1 rm -rf /, IFS=x ... rm) now resolve command position correctly.
    • Fixed timer leak by clearing the verifier timeout after resolution; tightened types to Record<string, unknown> in Verifier.verify and gate, and updated the prompt wrapper cast (no behavior change).

Written for commit 517a134. Summary will update on new commits.

Review in cubic

Summary by CodeRabbit

  • New Features
    • Optional pre-execution safety gate for side-effecting tools (e.g., bash) that can block dangerous commands, return informative feedback when blocked, and otherwise allow normal execution.
  • Tests
    • Added comprehensive unit, adversarial, and end-to-end tests covering detection heuristics, gating behavior, timeouts, robustness, and fail-open guarantees.

Flag-gated (`ALTIMATE_CRITIC_GATE`, default off) gate that runs before a
side-effecting tool (bash/write/edit/sql_execute/dbt_run/patch) executes.
On a hard verdict the call is skipped and the reason is fed back so the
model can fix-and-retry, instead of executing a bad action.

- `tool/critic.ts` — pure, testable gate + pluggable `Verifier` interface.
  - `ALLOW_ALL` — ungated (opt-out / tests).
  - `basicSafetyVerifier` — the wired default: a conservative, dependency-free
    heuristic that blocks catastrophic, unambiguous host-destructive bash
    (`rm -rf /` incl. system-path/glob/compound/fully-qualified/brace variants,
    fork bombs, `mkfs`/`dd` on a raw device, recursive chmod of `/`). Best-effort
    safety NET, not a security boundary — documented as such; defense-in-depth
    stays with the OS sandbox, the permission system, and a richer verifier a
    product may inject.
  - `gate()` is fail-open on verifier error AND on a verifier timeout, so a
    broken or hung verifier never breaks/hangs the agent.
- `session/prompt.ts` — wired into the native-tool execute wrapper just before
  `item.execute`, marker-wrapped, emits `tool.execute.after` on a block so
  observability sees it. No-op when off. (MCP tools use a separate wrapper and
  are intentionally not gated; documented on `DEFAULT_GATED`.)

Tests (100): unit, adversarial bypass probes (caught-vs-documented-limits), and
e2e driving the REAL BashTool through the gate with real filesystem effects —
no mocked tool calls.

Hardening from multi-model review (Gemini / GLM-5 / MiniMax):
- FIX false positive: `rm -rf *`, `rm -rf .`, `rm -rf ./` no longer blocked
  (routine workspace cleanup; no workdir context to judge them).
- FIX false positive: `rm` mentioned inside another command's argument
  (`git commit -m "...rm -rf /..."`) no longer blocked — `rm` is only treated
  as the command when in command position (with a transparent-prefix allowlist
  so `sudo`/`bash -c` still match).
- FIX false negatives: glob wipes of system dirs (`rm -rf /var/*`), `/.`/`/..`,
  fully-qualified `/bin/rm`, `${HOME}` brace expansion, and long-form/glob
  recursive chmod (`chmod --recursive 777 /`, `chmod -R 777 /*`).
- Add input length cap + verifier timeout; `attachments: []` on blocked result.
- Adversarial testing also found+fixed two earlier holes: compound commands
  where the fatal `rm` wasn't last, and a separator glued to the target.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Jun 1, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Walkthrough

This PR adds a flag-gated (ALTIMATE_CRITIC_GATE) pre-execution critic gate for side-effecting tools, wiring a pluggable verifier into the session tool execute path and shipping a conservative default that detects catastrophic bash patterns; blocked verdicts return feedback and skip execution.

Changes

Critic gate implementation and integration

Layer / File(s) Summary
Critic gate module implementation
packages/opencode/src/tool/critic.ts
Implements the Critic namespace with configurable gating, a bash-danger heuristic, basicSafetyVerifier, and async gate() with timeout and fail-open semantics.
Tool execution wrapper integration
packages/opencode/src/session/prompt.ts
Integrates Critic.gate into resolveTools' tool execute wrapper; on deny returns a blocked result (feedback in title/metadata/output) and fires tool.execute.after without running the tool.
Critic module test suite
packages/opencode/test/tool/critic.test.ts, packages/opencode/test/tool/critic-adversarial.test.ts, packages/opencode/test/tool/critic-e2e.test.ts
Adds unit, adversarial, and e2e tests exercising detection heuristics, robustness, known bypasses, verifier timeout/ failure handling, and real BashTool integration with filesystem assertions.

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant SessionPrompt
  participant Critic
  participant BashTool

  Client->>SessionPrompt: request tool execution (toolName, args)
  SessionPrompt->>Critic: Critic.gate(toolName, args)
  Critic-->>SessionPrompt: { allow: true } / { allow: false, feedback }
  alt allowed
    SessionPrompt->>BashTool: execute(args)
    BashTool-->>SessionPrompt: tool result
    SessionPrompt-->>Client: result
  else blocked
    SessionPrompt-->>Client: blocked result (feedback)
  end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested labels

needs-review:blocked

Poem

A critic gate stands watch and tall, 🚪
Catching rm -rf / before the fall, 🛑
With bash-safe heuristics, sharp and keen,
No fork bombs here, no /dev/null scene, 🐰
Safe side-effects, at last, ensured. ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 42.86% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and concisely summarizes the main feature: a pre-execution critic gate for side-effecting tools, matching the core changeset.
Linked Issues check ✅ Passed The PR implements all core requirements from issue #862: flag-gated critic gate (ALTIMATE_CRITIC_GATE, default OFF) for side-effecting tools, pluggable Verifier interface with basicSafetyVerifier default, fail-open on errors/timeout, and extensive testing (unit, adversarial, e2e).
Out of Scope Changes check ✅ Passed All code changes are directly scoped to the critic gate feature: critic.ts implements the gate logic, prompt.ts wires it into native-tool execution, and test files provide unit/adversarial/e2e coverage. No unrelated changes detected.
Description check ✅ Passed PR description includes all required template sections: summary of changes/why, test plan details, and completed checklist.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/critic-gate

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@anandgupta42 anandgupta42 marked this pull request as ready for review June 1, 2026 06:37

@claude claude Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude Code Review

This repository is configured for manual code reviews. Comment @claude review to trigger a review and subscribe this PR to future pushes, or @claude review once for a one-time review.

Tip: disable this comment in your organization's Code Review settings.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/opencode/src/tool/critic.ts`:
- Around line 121-129: The isCommandPosition function treats shell assignment
words as normal arguments; update its backward token-scan to also skip
assignment words by recognizing tokens that match a shell variable assignment
pattern (e.g. /^[A-Za-z_][A-Za-z0-9_]*=.*$/) and treat them like flags/prefixes
(similar to TRANSPARENT_PREFIX), so tokens like FOO=1 or HOME=/ don't stop the
scan; apply the same change to the other identical token-scan logic elsewhere in
the file (the analogous loop that decides command position further down) so both
checks skip assignment words.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 82f5e058-6170-4af3-85b9-0262bd8ed2ac

📥 Commits

Reviewing files that changed from the base of the PR and between 6ad8b47 and 769af84.

📒 Files selected for processing (5)
  • packages/opencode/src/session/prompt.ts
  • packages/opencode/src/tool/critic.ts
  • packages/opencode/test/tool/critic-adversarial.test.ts
  • packages/opencode/test/tool/critic-e2e.test.ts
  • packages/opencode/test/tool/critic.test.ts

Comment on lines +121 to +129
function isCommandPosition(tokens: string[], i: number, sep: Set<string>): boolean {
for (let j = i - 1; j >= 0; j--) {
const t = tokens[j]
if (sep.has(t)) return true
if (t.startsWith("-")) continue // a flag (e.g. `sudo -E`, `bash -c`)
if (TRANSPARENT_PREFIX.has(t)) continue
return false // a real preceding word -> rm is an argument, not the command
}
return true // reached the start through only flags/prefixes

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Skip shell assignment words when deciding command position.

isCommandPosition() currently treats assignment words as normal arguments, so catastrophic commands like FOO=1 rm -rf / or env HOME=/ rm -rf / bypass the detector because rm is no longer seen in command position. That is a simple false negative for the default safety gate.

Suggested fix
+  function isAssignmentWord(token: string): boolean {
+    return /^[a-z_][a-z0-9_]*=.*/.test(token)
+  }
+
   function isCommandPosition(tokens: string[], i: number, sep: Set<string>): boolean {
     for (let j = i - 1; j >= 0; j--) {
       const t = tokens[j]
       if (sep.has(t)) return true
       if (t.startsWith("-")) continue // a flag (e.g. `sudo -E`, `bash -c`)
+      if (isAssignmentWord(t)) continue
       if (TRANSPARENT_PREFIX.has(t)) continue
       return false // a real preceding word -> rm is an argument, not the command
     }
     return true // reached the start through only flags/prefixes
   }

Also applies to: 180-184

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/opencode/src/tool/critic.ts` around lines 121 - 129, The
isCommandPosition function treats shell assignment words as normal arguments;
update its backward token-scan to also skip assignment words by recognizing
tokens that match a shell variable assignment pattern (e.g.
/^[A-Za-z_][A-Za-z0-9_]*=.*$/) and treat them like flags/prefixes (similar to
TRANSPARENT_PREFIX), so tokens like FOO=1 or HOME=/ don't stop the scan; apply
the same change to the other identical token-scan logic elsewhere in the file
(the analogous loop that decides command position further down) so both checks
skip assignment words.

@cubic-dev-ai cubic-dev-ai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 issues found across 5 files

Reply with feedback, questions, or to request a fix.

Re-trigger cubic

Comment thread packages/opencode/src/tool/critic.ts Outdated
Comment thread packages/opencode/src/tool/critic.ts

@dev-punia-altimate dev-punia-altimate left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Multi-Persona Review — Verdict: comment

Multi-persona review completed.

6/6 agents completed · 32s · 0 findings (0 critical, 0 high, 0 medium)


Multi-Persona Review · vllm:qwen3-next-80b (waves) + vllm-fallback (synth) ·

anandgupta42 added a commit that referenced this pull request Jun 4, 2026
)

* fix: two tests flaky under parallel CI load (S27 + trace snapshot)

Both pass locally but fail consistently in CI's heavy parallel run (9474
tests / 378 files) — the repo's "no flaky tests under resource contention"
case. Neither is caused by any feature change; they fail identically on
unrelated PRs (#854/#858/#863), blocking all of them.

- `real-tool-simulation` S27: the progressive-suggestion dedup state is a
  module-global Set. The test's `beforeEach` reset used a dynamic
  `await import`, which under parallel CI can resolve to a different module
  instance than the tool's static import — so the real Set is never reset and
  accumulates `sql_analyze` from S25/S26 → S27 sees no suggestion. Fix: import
  `PostConnectSuggestions` statically (same instance the tools use); reset in
  S27 too.
- `tracing-adversarial-snapshot` "shows 'running' status": waited a fixed 50ms
  for a debounced async snapshot write, too short under CI load → read a stale
  snapshot. Fix: poll the on-disk status until expected (timeout 4s) instead
  of a fixed sleep.

Closes #879

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix: raise CI test timeout 30s→90s to kill resource-contention flakiness

The "TypeScript" job runs all 9500+ tests in one parallel bun process. Under
CPU contention a few slower tests (real fs/spawn/git-bootstrap) get starved and
exceed the 30s per-test timeout NON-deterministically — different tests each run
(observed: 32s and 51s timeouts). This blocks every PR with failures unrelated
to the diff. 90s gives ~3x headroom over the worst observed, removing the
flakiness without masking genuinely-hung tests.

Part of #879.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@dev-punia-altimate

Copy link
Copy Markdown
Contributor

❌ Tests — Failures Detected

TypeScript — 15 failure(s)

  • connection_refused [1.00ms]
  • timeout
  • permission_denied
  • parse_error
  • network_error
  • auth_failure [1.00ms]
  • rate_limit
  • internal_error
  • empty_error
  • connection_refused
  • timeout
  • permission_denied
  • parse_error
  • oom [1.00ms]
  • network_error

Next Step

Please address the failing cases above and re-run verification.

cc @anandgupta42

@dev-punia-altimate dev-punia-altimate left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 Code Review — OpenCodeReview (Gemini) — 4 finding(s)

  • 4 anchored to a line (posted inline when the comment stream is on)
  • 0 without a line anchor
All findings (full text)

1. packages/opencode/src/session/prompt.ts (L1529-L1533)

[🔵 LOW] According to the review checklist, avoid using the any type. If it is necessary (e.g., to match external interface signatures or satisfy generic constraints), please provide a comment explaining the reason. Consider replacing it with unknown or a more specific type if possible.

Suggested change:

            // Explain why `any` is used here, or use `unknown` if possible
            const verdict = await Critic.gate(item.id, args as Record<string, unknown>, Critic.basicSafetyVerifier)
            if (!verdict.allow) {
              const blocked = {
                title: `Blocked: ${item.id}`,
                metadata: { critic: { blocked: true, reason: verdict.feedback } } as unknown,

2. packages/opencode/src/tool/critic.ts (L124-L127)

[🔴 HIGH] The isCommandPosition function fails to account for inline environment variable assignments (e.g., FOO=bar rm -rf /). Because of this, it misclassifies the subsequent rm as an argument rather than a command, completely bypassing the safety check. Consider adding a regular expression to skip over inline variable assignments.

Suggested change:

      if (sep.has(t)) return true
      if (t.startsWith("-")) continue // a flag (e.g. `sudo -E`, `bash -c`)
      if (TRANSPARENT_PREFIX.has(t)) continue
      if (/^[a-zA-Z_][a-zA-Z0-9_]*=/.test(t)) continue // inline variable assignment
      return false // a real preceding word -> rm is an argument, not the command

3. packages/opencode/src/tool/critic.ts (L246-L252)

[🟠 MEDIUM] The timeout Promise uses setTimeout to enforce VERIFIER_TIMEOUT_MS, but it never calls clearTimeout when verifyPromise resolves or rejects first. Although unref() is called, pending timers will accumulate in memory and needlessly execute their callbacks after 5 seconds, causing a potential timer leak under high throughput. We should keep a reference to the timer and clear it after Promise.race.

Suggested change:

      let timerId: any
      const timeout = new Promise<Verdict>((resolve) => {
        timerId = setTimeout(() => resolve({ ok: true, reason: "__timeout__" }), timeoutMs)
        // don't keep the event loop alive for this guard timer
        ;(timerId as any)?.unref?.()
      })
      const v = await Promise.race([verifyPromise, timeout])
      clearTimeout(timerId)
      if (v.ok) return { allow: true }

4. packages/opencode/src/tool/critic.ts (L39-L42)

[🔵 LOW] According to the TypeScript Types guideline, the use of any type should be avoided, or a comment should be provided explaining the reason. Please add a comment explaining why any is necessary here (e.g., tool arguments having dynamic structures), or consider using unknown.

Suggested change:

  /** The judgment interface. Default impl allows all (open). Product plugs a richer verifier. */
  export interface Verifier {
    // Note: `any` is used here because tool arguments have dynamic structures.
    verify(toolName: string, args: Record<string, any>): Verdict | Promise<Verdict>
  }

Comment thread packages/opencode/src/session/prompt.ts Outdated
Comment on lines +1529 to +1533
const verdict = await Critic.gate(item.id, args as Record<string, any>, Critic.basicSafetyVerifier)
if (!verdict.allow) {
const blocked = {
title: `Blocked: ${item.id}`,
metadata: { critic: { blocked: true, reason: verdict.feedback } } as any,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[🔵 LOW] According to the review checklist, avoid using the any type. If it is necessary (e.g., to match external interface signatures or satisfy generic constraints), please provide a comment explaining the reason. Consider replacing it with unknown or a more specific type if possible.

Suggested change:

Suggested change
const verdict = await Critic.gate(item.id, args as Record<string, any>, Critic.basicSafetyVerifier)
if (!verdict.allow) {
const blocked = {
title: `Blocked: ${item.id}`,
metadata: { critic: { blocked: true, reason: verdict.feedback } } as any,
// Explain why `any` is used here, or use `unknown` if possible
const verdict = await Critic.gate(item.id, args as Record<string, unknown>, Critic.basicSafetyVerifier)
if (!verdict.allow) {
const blocked = {
title: `Blocked: ${item.id}`,
metadata: { critic: { blocked: true, reason: verdict.feedback } } as unknown,

Comment on lines +124 to +127
if (sep.has(t)) return true
if (t.startsWith("-")) continue // a flag (e.g. `sudo -E`, `bash -c`)
if (TRANSPARENT_PREFIX.has(t)) continue
return false // a real preceding word -> rm is an argument, not the command

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[🔴 HIGH] The isCommandPosition function fails to account for inline environment variable assignments (e.g., FOO=bar rm -rf /). Because of this, it misclassifies the subsequent rm as an argument rather than a command, completely bypassing the safety check. Consider adding a regular expression to skip over inline variable assignments.

Suggested change:

Suggested change
if (sep.has(t)) return true
if (t.startsWith("-")) continue // a flag (e.g. `sudo -E`, `bash -c`)
if (TRANSPARENT_PREFIX.has(t)) continue
return false // a real preceding word -> rm is an argument, not the command
if (sep.has(t)) return true
if (t.startsWith("-")) continue // a flag (e.g. `sudo -E`, `bash -c`)
if (TRANSPARENT_PREFIX.has(t)) continue
if (/^[a-zA-Z_][a-zA-Z0-9_]*=/.test(t)) continue // inline variable assignment
return false // a real preceding word -> rm is an argument, not the command

Comment on lines +246 to +252
const timeout = new Promise<Verdict>((resolve) => {
const t = setTimeout(() => resolve({ ok: true, reason: "__timeout__" }), timeoutMs)
// don't keep the event loop alive for this guard timer
;(t as any)?.unref?.()
})
const v = await Promise.race([verifyPromise, timeout])
if (v.ok) return { allow: true }

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[🟠 MEDIUM] The timeout Promise uses setTimeout to enforce VERIFIER_TIMEOUT_MS, but it never calls clearTimeout when verifyPromise resolves or rejects first. Although unref() is called, pending timers will accumulate in memory and needlessly execute their callbacks after 5 seconds, causing a potential timer leak under high throughput. We should keep a reference to the timer and clear it after Promise.race.

Suggested change:

Suggested change
const timeout = new Promise<Verdict>((resolve) => {
const t = setTimeout(() => resolve({ ok: true, reason: "__timeout__" }), timeoutMs)
// don't keep the event loop alive for this guard timer
;(t as any)?.unref?.()
})
const v = await Promise.race([verifyPromise, timeout])
if (v.ok) return { allow: true }
let timerId: any
const timeout = new Promise<Verdict>((resolve) => {
timerId = setTimeout(() => resolve({ ok: true, reason: "__timeout__" }), timeoutMs)
// don't keep the event loop alive for this guard timer
;(timerId as any)?.unref?.()
})
const v = await Promise.race([verifyPromise, timeout])
clearTimeout(timerId)
if (v.ok) return { allow: true }

Comment on lines +39 to +42
/** The judgment interface. Default impl allows all (open). Product plugs a richer verifier. */
export interface Verifier {
verify(toolName: string, args: Record<string, any>): Verdict | Promise<Verdict>
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[🔵 LOW] According to the TypeScript Types guideline, the use of any type should be avoided, or a comment should be provided explaining the reason. Please add a comment explaining why any is necessary here (e.g., tool arguments having dynamic structures), or consider using unknown.

Suggested change:

Suggested change
/** The judgment interface. Default impl allows all (open). Product plugs a richer verifier. */
export interface Verifier {
verify(toolName: string, args: Record<string, any>): Verdict | Promise<Verdict>
}
/** The judgment interface. Default impl allows all (open). Product plugs a richer verifier. */
export interface Verifier {
// Note: `any` is used here because tool arguments have dynamic structures.
verify(toolName: string, args: Record<string, any>): Verdict | Promise<Verdict>
}

@dev-punia-altimate

Copy link
Copy Markdown
Contributor

🤖 Code Review — OpenCodeReview (Gemini) — 4 finding(s)

  • 4 anchored to a line (posted inline when the comment stream is on)
  • 0 without a line anchor
All findings (full text)

1. packages/opencode/src/session/prompt.ts (L1529-L1533)

[🔵 LOW] According to the review checklist, avoid using the any type. If it is necessary (e.g., to match external interface signatures or satisfy generic constraints), please provide a comment explaining the reason. Consider replacing it with unknown or a more specific type if possible.

Suggested change:

            // Explain why `any` is used here, or use `unknown` if possible
            const verdict = await Critic.gate(item.id, args as Record<string, unknown>, Critic.basicSafetyVerifier)
            if (!verdict.allow) {
              const blocked = {
                title: `Blocked: ${item.id}`,
                metadata: { critic: { blocked: true, reason: verdict.feedback } } as unknown,

2. packages/opencode/src/tool/critic.ts (L124-L127)

[🔴 HIGH] The isCommandPosition function fails to account for inline environment variable assignments (e.g., FOO=bar rm -rf /). Because of this, it misclassifies the subsequent rm as an argument rather than a command, completely bypassing the safety check. Consider adding a regular expression to skip over inline variable assignments.

Suggested change:

      if (sep.has(t)) return true
      if (t.startsWith("-")) continue // a flag (e.g. `sudo -E`, `bash -c`)
      if (TRANSPARENT_PREFIX.has(t)) continue
      if (/^[a-zA-Z_][a-zA-Z0-9_]*=/.test(t)) continue // inline variable assignment
      return false // a real preceding word -> rm is an argument, not the command

3. packages/opencode/src/tool/critic.ts (L246-L252)

[🟠 MEDIUM] The timeout Promise uses setTimeout to enforce VERIFIER_TIMEOUT_MS, but it never calls clearTimeout when verifyPromise resolves or rejects first. Although unref() is called, pending timers will accumulate in memory and needlessly execute their callbacks after 5 seconds, causing a potential timer leak under high throughput. We should keep a reference to the timer and clear it after Promise.race.

Suggested change:

      let timerId: any
      const timeout = new Promise<Verdict>((resolve) => {
        timerId = setTimeout(() => resolve({ ok: true, reason: "__timeout__" }), timeoutMs)
        // don't keep the event loop alive for this guard timer
        ;(timerId as any)?.unref?.()
      })
      const v = await Promise.race([verifyPromise, timeout])
      clearTimeout(timerId)
      if (v.ok) return { allow: true }

4. packages/opencode/src/tool/critic.ts (L39-L42)

[🔵 LOW] According to the TypeScript Types guideline, the use of any type should be avoided, or a comment should be provided explaining the reason. Please add a comment explaining why any is necessary here (e.g., tool arguments having dynamic structures), or consider using unknown.

Suggested change:

  /** The judgment interface. Default impl allows all (open). Product plugs a richer verifier. */
  export interface Verifier {
    // Note: `any` is used here because tool arguments have dynamic structures.
    verify(toolName: string, args: Record<string, any>): Verdict | Promise<Verdict>
  }

@dev-punia-altimate dev-punia-altimate marked this pull request as draft June 8, 2026 09:13
@dev-punia-altimate

Copy link
Copy Markdown
Contributor

🔴 1 critical finding(s) — converted back to draft

Address the critical items below, then mark this PR Ready for review to re-run the review. Medium/low findings are posted inline above and do not block.

  • packages/opencode/src/tool/critic.ts:127 — The isCommandPosition function fails to account for inline environment variable assignments (e.g., FOO=bar rm -rf /). Because of this, it misclassifies the subsequent rm as an argument rather th

anandgupta42 and others added 2 commits June 9, 2026 19:49
Multi-reviewer findings on the safety detector (all verified against real code):

- P1 (cubic): DEFAULT_GATED listed "patch", but the real tool id is "apply_patch"
  (Tool.define("apply_patch")), so apply_patch edits silently bypassed the gate.
  Also "dbt_run" is not a tool (dbt runs via bash). Corrected the list to real ids.
- HIGH/Major (coderabbit + cubic + multi-persona, 3 reviewers): isCommandPosition
  missed inline env-var assignments, so `FOO=1 rm -rf /` (and `IFS=x ... rm`)
  evaded detection. Now skips assignment tokens (`NAME=...`) when resolving command
  position. Verified: the 3 bypass variants now block; benign `FOO=1 echo` / `X=1 ls`
  do not false-positive.
- MEDIUM (multi-persona): the verifier-timeout setTimeout was never cleared; added a
  finally{ clearTimeout } so it doesn't linger after verify resolves.

Tests: +3 env-assignment-bypass regressions, +DEFAULT_GATED real-tool-id assertion.
104 critic tests pass; typecheck clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- critic.ts: `Verifier.verify(args: Record<string, any>)` and `gate(args)` now
  use `Record<string, unknown>` — arbitrary tool args, but stricter than `any`
  (callers must narrow; `String(args["command"])` still compiles).
- prompt.ts: the gate-block `args` cast updated to `Record<string, unknown>` to
  match; the metadata `as any` kept (open tool-metadata record, consistent with
  the surrounding tool-wrapper casts) with an explanatory comment per the
  no-bare-`any` guideline.

104 critic tests pass; typecheck clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@anandgupta42

Copy link
Copy Markdown
Contributor Author

All review findings now addressed:

Finding Reviewer(s) Status
DEFAULT_GATED uses patch (real id apply_patch); dbt_run not a tool cubic P1 ✅ fixed → real ids (50a0f6e)
isCommandPosition misses inline env-assignments (FOO=1 rm -rf /) coderabbit Major · cubic P2 · multi-persona HIGH ✅ fixed + 3 regression tests; benign FOO=1 echo doesn't false-positive
verifier-timeout setTimeout never cleared multi-persona MEDIUM finally { clearTimeout }
Record<string, any> in Verifier/gate multi-persona LOW ✅ → Record<string, unknown> (517a1344)
as any on blocked metadata in prompt.ts multi-persona LOW ✅ kept (open tool-metadata record, matches neighboring tool-wrapper casts) + explanatory comment

104 critic tests pass (incl. new env-bypass + real-tool-id regressions); typecheck + marker guard clean.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Pre-execution critic gate for side-effecting tools

2 participants