Implement solver-based branch pruning by tautschnig · Pull Request #8979 · diffblue/cbmc

tautschnig · 2026-04-27T10:49:37Z

Query the SAT solver at each branch point to eliminate infeasible branches. Works in both --paths mode (push/pop scoping) and monolithic mode (incremental conversion with converted_for_pruning flag). Uses a non-simplifying MiniSat solver to avoid backwardSubsumptionCheck overhead; falls back to solver_factoryt when MiniSat is not available (e.g., CaDiCaL builds).

Adaptive pruning: cumulative 2-second time budget auto-disables pruning when overhead exceeds benefit.

Assumes are explicitly asserted as guard => cond. Includes fix for symex::args name collisions across paths via a static type registry.

Each commit message has a non-empty body, explaining why the change was made.
Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
My commit message includes data points confirming performance improvements (if claimed).
My PR is restricted to a single feature or bugfix.
n/a White-space or formatting changes outside the feature-related changed lines are in commits of their own.

Copilot

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds solver-assisted feasibility checks at branch points to prune unreachable branches during symbolic execution, reducing path explosion.

Changes:

Introduces a shared, push/pop-capable SAT solver for branch pruning (with MiniSat-no-simplifier fast path and solver-factory fallback).
Implements pruning in both --paths and monolithic modes, with an adaptive time-budget auto-disable.
Adds CLI toggle --no-branch-pruning, regression tests, and an architecture note.

Reviewed changes

Copilot reviewed 34 out of 34 changed files in this pull request and generated 4 comments.

Show a summary per file

File	Description
src/goto-symex/symex_target_equation.cpp	Avoids argument symbol name/type collisions across shared solver conversions
src/goto-symex/symex_goto.cpp	Adds solver queries to prune infeasible branches in paths + monolithic execution
src/goto-symex/ssa_step.h	Adds `converted_for_pruning` flag for incremental pruning conversions
src/goto-symex/goto_symex.h	Stores pruning solver pointer and pruning budget state; exposes setter
src/goto-checker/single_path_symex_only_checker.h	Adds persistent pruning solver member and `setup_symex` hook
src/goto-checker/single_path_symex_only_checker.cpp	Builds/configures pruning solver and wires it into symex
src/goto-checker/multi_path_symex_only_checker.h	Adds persistent pruning solver member
src/goto-checker/multi_path_symex_only_checker.cpp	Builds/configures pruning solver and wires it into symex
src/cbmc/cbmc_parse_options.h	Adds `no-branch-pruning` command-line option
src/cbmc/cbmc_parse_options.cpp	Parses `--no-branch-pruning` and documents it in `--help`
regression/cbmc/unwind_counters4/test.desc	Updates expected output regex
regression/cbmc/paths-branch-pruning/*.c/.desc	Adds new regression tests covering pruning behavior and refine-soundness
doc/architectural/branch-pruning.md	Documents pruning architecture and limitations

Comments suppressed due to low confidence (4)

src/goto-symex/symex_goto.cpp:1

In monolithic pruning, assignments/constraints are asserted unconditionally via set_to_true(step.cond_expr). SSA steps are typically guarded, and the normal conversion asserts them as guard => cond_expr; dropping the guard can over-constrain the pruning solver and incorrectly mark branches infeasible. Fix by asserting guarded steps as implies_exprt{step.guard, step.cond_expr} (or the equivalent used by the normal equation conversion) for assignments and constraints as well.
src/goto-symex/symex_goto.cpp:1
SSA_stept::converted is documented/used for incremental conversion. Resetting it here for pruning risks interfering with later conversions into other solvers (and can force repeated reconversion), which can change performance and potentially duplicate constraints depending on how conversion is orchestrated downstream. Prefer tracking pruning conversion exclusively with converted_for_pruning (already added) and avoid mutating converted in the pruning path.
src/goto-symex/symex_target_equation.cpp:1
This function-local static registry is shared across all runs/instances and is not thread-safe. It also grows monotonically with every unique symex::args::<n> ever produced, which can create unbounded memory growth in long-running or multi-check workflows. A more robust approach is to make the registry solver- or symex-instance scoped (or incorporate a type-derived discriminator into the identifier) so state doesn’t leak globally and can be managed/reset deterministically.
src/goto-symex/symex_goto.cpp:1
The pruning query is checking satisfiability under the converted equation, but it doesn’t appear to assert the current path condition explicitly (e.g., state.guard). That can significantly reduce pruning power (SAT models can satisfy implications by setting earlier guards false). Consider adding the current path guard as an assumption (or conjoining it with new_guard / !new_guard) when calling push(...) so the solver checks feasibility under the active path condition.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

 ^SIGNAL=0$
 ^VERIFICATION FAILED$
-^\** 4 of 4 failed
+^\*\* [34] of [34] failed


+#ifdef HAVE_MINISAT2
+#include <solvers/sat/satcheck_minisat2.h>
+#endif
+#endif


@@ -0,0 +1,86 @@
+/// \file
+/// Branch Pruning in --paths Mode


+/// ## Push/Pop Overhead
+///
+/// Each branch check converts the entire equation inside a push/pop scope.


…pport; add CI tests The `solver-time-limit` option has been read by `solver_factoryt` since 38a370c (Jan 2019), but was never registered as a command-line flag in `OPT_SOLVER` -- the original use case was purely programmatic, via dependent projects calling `options.set_option("solver-time-limit", N)` directly. That left the option's plumbing untested in CI and the only backend that honoured it (MiniSat 2) had no end-to-end coverage. This change: * Registers `--solver-time-limit N` in `OPT_SOLVER` / `HELP_SOLVER` and wires the cmdline-to-options forwarding in `parse_solver_options`. `cbmc`, `jbmc`, `goto-synthesizer` and `libcprover-cpp` (which all use `parse_solver_options`) now accept the flag from the command line. * Documents the flag in `doc/man/cbmc.1`, `doc/man/goto-synthesizer.1` and `doc/man/jbmc.1`, so `scripts/check_help.sh` continues to pass. * Implements `set_time_limit_seconds` for the IPASIR back-end (`satcheck_ipasirt`). The implementation registers an `ipasir_set_terminate` callback that returns 1 once the per-solve deadline has passed; IPASIR solvers poll the callback during solving and abort cleanly when it does. On timeout the back-end returns `P_ERROR` (matching the MiniSat 2 back-end's behaviour) instead of throwing `analysis_exceptiont`. Tests: * `regression/cbmc/solver-time-limit/`: a `CORE` regression test that pins down the happy path -- the option is parsed and propagated through to the solver layer without breaking a trivial solve. Runs on every CI build. * `unit/solvers/sat/satcheck_minisat2.cpp`: extended with a pigeonhole-PHP(20) `SCENARIO` that builds the formula directly via `new_variable` / `lcnf`, sets a 1-second time limit, and asserts that `prop_solve` returns `P_ERROR` within five seconds. PHP is provably exponential for resolution-based SAT (Haken, 1985), so the timeout is reliably exceeded on any reasonable hardware. * `unit/solvers/sat/satcheck_ipasir.cpp`: new file, mirroring the satcheck_minisat2 test. Conditional on `HAVE_IPASIR`, so it only compiles in CI builds that link an IPASIR back-end (currently the `check-ubuntu-24_04-make-gcc` job, which uses Riss as the IPASIR provider). This is the first piece of foundation work for a wider redesign of CBMC's solver-timeout infrastructure (sub-second granularity, cross-OS, watchdog-based interrupt). Subsequent commits will build on a now-tested baseline. Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>

The solver-time-limit infrastructure used to be expressed in whole seconds, was implemented for MiniSat 2 via POSIX `alarm()` + `SIGALRM` (so it was POSIX-only and process-wide), and was explicitly skipped on Windows with a "Time limit ignored (not supported on Win32 yet)" warning. In practice this meant: * Sub-second budgets -- needed for fine-grained per-check timeouts inside symbolic execution -- could not be expressed. * Windows builds had no time-limit support at all. * The signal handler relied on a process-wide `solver_to_interrupt` pointer; nesting solve calls across threads or having two solvers active in the same process was fragile. This change reworks the API and the MiniSat 2 implementation: * `solver_resource_limitst::set_time_limit_milliseconds` and `propt::set_time_limit_milliseconds` are now the primary virtual API. The default `propt` implementation logs a warning, mirroring the previous behaviour for back-ends with no native timeout support. * `set_time_limit_seconds` survives as a non-virtual helper that forwards to `set_time_limit_milliseconds(lim * 1000)`, so external callers (e.g. the CLI option-parser, dependent projects) keep working without source changes. * The MiniSat 2 back-end now implements the time limit using a `std::thread` watchdog: when the limit is non-zero, a watchdog thread waits on a `std::condition_variable` for either the solve to complete (signalled via a flag) or the deadline to elapse, and on timeout calls `solver->interrupt()` (which is just a flag set on the Minisat::Solver object and is safe to invoke from another thread). The watchdog is joined before `do_prop_solve` returns. Result: cross-OS, sub-millisecond granularity, no global signal state, no `SIGALRM` collision risk. * The IPASIR back-end's existing `set_time_limit_seconds` override is renamed/repurposed as `set_time_limit_milliseconds`; the deadline is computed in `std::chrono::milliseconds` instead of seconds. The callback already polled a `chrono::steady_clock` deadline, so this is essentially a stored-precision change. * The `solver_factoryt` call site in `solver_factory.cpp` keeps using the seconds helper (the CLI flag `--solver-time-limit` remains expressed in seconds for user familiarity); only the internal API is now in milliseconds. Tests: * Existing pigeonhole(20) SCENARIOs in `unit/solvers/sat/satcheck_minisat2.cpp` and `unit/solvers/sat/satcheck_ipasir.cpp` now request a 200-millisecond limit and verify (a) that `prop_solve` returns `P_ERROR`, and (b) that wall-clock elapsed time is well under five seconds (a generous slack for slow CI hardware and watchdog-join latency). Locally the satcheck_minisat2 test completes in ~207 ms, confirming sub-millisecond accuracy of the watchdog. This is the second piece of the wider solver-timeout rework. Subsequent work will add CaDiCaL terminator support and let the branch-pruning code use sub-second per-call budgets directly. Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>

The CaDiCaL back-end was the last SAT back-end in CBMC for which `set_time_limit_milliseconds` (and its `set_time_limit_seconds` helper) silently fell through to `propt`'s default no-op + warning. This change adds native time-limit support via CaDiCaL's `Terminator` callback API: * `satcheck_cadical_baset` declares a nested `terminatort` class in the header (forward-declared to keep `cadical.hpp` out of the header). * `terminatort` is defined in the .cpp and inherits `CaDiCaL::Terminator`. It stores a deadline as a `steady_clock::time_point` and returns true once the deadline has passed. * `set_time_limit_milliseconds(uint32_t)` stores the limit on the solver instance. * `do_prop_solve` constructs / refreshes the deadline before calling `solver->solve()`, calls `solver->connect_terminator(...)` while solving, and calls `solver->disconnect_terminator()` afterwards. CaDiCaL polls the terminator at decision points during solving and aborts cleanly on a true return. * On interrupt, `do_prop_solve` now returns `propt::resultt ::P_ERROR` -- matching the MiniSat 2 and IPASIR back-ends -- instead of throwing `analysis_exceptiont`. The `<util/exception_utils.h>` include is dropped as it is no longer needed. A `set_time_limit_milliseconds(200)` SCENARIO is added to `unit/solvers/sat/satcheck_cadical.cpp` mirroring the existing MiniSat 2 / IPASIR timeout tests: a pigeonhole(20) CNF (provably exponential for resolution-based SAT) is built directly via `new_variable` / `lcnf`, and the solver is asserted to return `P_ERROR` within five seconds. Locally the test completes in ~208 ms (200 ms deadline + ~8 ms terminator-polling latency), confirming sub-millisecond accuracy of the CaDiCaL implementation. After this commit, all three SAT back-ends with native interrupt infrastructure (MiniSat 2 via std::thread watchdog, IPASIR via `ipasir_set_terminate`, CaDiCaL via `connect_terminator`) honour `--solver-time-limit` and `set_time_limit_milliseconds` with sub-second granularity, are portable across operating systems, and return `P_ERROR` on interrupt for callers to detect. Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>

Query the SAT solver at each branch point to eliminate infeasible branches. Works in both --paths mode (push/pop scoping) and monolithic mode (incremental conversion with converted_for_pruning flag). Uses a non-simplifying MiniSat solver to avoid backwardSubsumptionCheck overhead; falls back to solver_factoryt when MiniSat is not available (e.g., CaDiCaL builds). Adaptive pruning: cumulative 2-second time budget auto-disables pruning when overhead exceeds benefit. Assumes are explicitly asserted as guard => cond. Includes fix for symex::args name collisions across paths via a static type registry. Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>

The branch pruning solver performs a single SAT check without the iterative refinement loop. With --refine, --refine-arrays, or --refine-strings, this can give unsound results on the initial over-approximation (e.g., incorrectly determining a branch is infeasible). Disable branch pruning for all refinement modes. The --refine flag (used by --incremental-loop) also creates a bv_refinement solver. The branch pruning solver doesn't perform the refinement loop, so it must be disabled for all refinement modes, not just --refine-arrays and --refine-strings. Co-authored-by: Kiro <kiro-agent@users.noreply.github.com> The branch pruning solver performs a single SAT check without the iterative refinement loop. With --refine, --refine-arrays, or --refine-strings, this can give unsound results on the initial over-approximation (e.g., incorrectly determining a branch is infeasible). Disable branch pruning for all refinement modes.

Add a command-line option to disable the solver-based branch pruning that is now active by default in both --paths and monolithic modes. This is useful for: - Comparative benchmarking (measuring pruning impact) - Working around potential edge cases in branch pruning - Debugging verification results Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>

Tests covering: prune_assume, no_prune_nondet, prune_chained, prune_loop, prune_malloc, refine_arrays_sound, refine_sound, guarded_assume, monolithic_prune, malloc_dynamic_bounds, stdlib_no_paths_explosion. Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>

tautschnig self-assigned this Apr 27, 2026

Copilot AI review requested due to automatic review settings April 27, 2026 10:49

tautschnig requested review from TGWDB, jimgrundy, kroening and peterschrammel as code owners April 27, 2026 10:49

Copilot AI reviewed Apr 27, 2026

View reviewed changes

Copilot started reviewing on behalf of tautschnig April 27, 2026 12:40 View session

tautschnig force-pushed the branch-pruning branch from 6498011 to 3834a1a Compare May 27, 2026 22:26

tautschnig and others added 8 commits May 28, 2026 10:05

Document branch pruning design for --paths mode

9c48042

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>

tautschnig force-pushed the branch-pruning branch from 3834a1a to 9c48042 Compare May 28, 2026 11:04

tautschnig requested a review from martin-cs as a code owner May 28, 2026 11:04

tautschnig added the dependent - do not merge label May 28, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Implement solver-based branch pruning#8979

Implement solver-based branch pruning#8979
tautschnig wants to merge 8 commits into
diffblue:developfrom
tautschnig:branch-pruning

tautschnig commented Apr 27, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

tautschnig commented Apr 27, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants