Skip to content

Address v12 bugs in consensus#607

Open
illuzen wants to merge 8 commits into
mainfrom
illuzen/msig-v12
Open

Address v12 bugs in consensus#607
illuzen wants to merge 8 commits into
mainfrom
illuzen/msig-v12

Conversation

@illuzen

@illuzen illuzen commented Jun 25, 2026

Copy link
Copy Markdown
Contributor

Security Fixes from V12 Audit Review

This PR addresses multiple security findings from the V12 audit review.

Fixes Included

1. Reduce MAX_MESSAGE_SIZE from 16MB to 1KB (miner-api)

Issue: read_message allocates a buffer sized by an untrusted 4-byte length prefix before reading payload. With 16MB limit, an attacker could cause ~4,000,000x memory amplification per connection.

Fix: Reduced to 1KB. Real MinerMessage payloads (Ready, NewJob, JobResult) are only ~200-350 bytes.

2. Use get_difficulty() in adjust_difficulty() (pallet-qpow)

Issue: adjust_difficulty() read raw storage directly, bypassing the zero-check in get_difficulty(). When storage was zero/missing, verification used InitialDifficulty but adjustment computed from zero → clamped to min_difficulty, causing inconsistent behavior.

Fix: Changed to use Self::get_difficulty() for consistent zero-storage handling.

3. Honor GenesisConfig.initial_difficulty (pallet-qpow)

Issue: GenesisConfig exposed an initial_difficulty field, but build() ignored it and always used the runtime constant T::InitialDifficulty::get(), making chain-spec overrides a silent no-op.

Fix: Changed build() to use self.initial_difficulty.

4. Only clean up achieved work when finalization advances (sc-consensus-qpow)

Issue: Cleanup deleted the work entry at last_finalized_before unconditionally, even when finalize_block was a no-op (re-finalizing same block). This could delete the current finalized tip's entry, which is needed as parent work for child blocks.

Fix: Added check last_finalized_after > last_finalized_before before cleanup.

5. Atomic seal verification in submit() (sc-consensus-qpow)

Issue: submit() consumed the build (via build.take()) before validating the seal. A TOCTOU race could occur: verify_seal() passes, rebuild lands, submit() consumes the new build with a stale seal, import fails, valid build is destroyed.

Fix: Refactored submit() to verify and take atomically under one lock. Removed separate verify_seal() method and redundant call site.

6. Use overflowing_add in mine_range() (qpow-math)

Issue: mine_range used saturating_add to increment nonces, which clamps at U512::MAX instead of wrapping. This would cause repeated hashing of the same nonce at the boundary (though practically unreachable).

Fix: Changed to overflowing_add().0 for correct wrapping behavior, matching the node's local mining loop.

Not Fixed (Accepted Limitations)

Aux storage leak for non-canonical fork blocks

Non-canonical fork blocks' achieved work entries are not cleaned up because we cannot enumerate block hashes by height. This is acceptable because:

  • Fork blocks require valid PoW (expensive to create)
  • Entries are small (~96 bytes each)
  • Normal operation has very few forks

Eclipse attack → permanent split

Auto-finalization at fixed depth means an eclipsed node can finalize on an attacker's fork and become permanently split from the honest network. This is a fundamental PoW + finality tradeoff, documented but not changed.


Note

Medium Risk
Touches consensus finalization aux state, block submission races, and on-chain difficulty genesis/adjustment—important for fork choice and mining correctness, but changes are targeted security fixes rather than broad refactors.

Overview
Addresses multiple V12 audit findings across consensus, runtime, miner IPC, and mining math.

Miner API: MAX_MESSAGE_SIZE drops from 16MB to 1KB so untrusted length prefixes cannot force huge allocations on QUIC reads.

Mining worker: submit() now verifies the seal and takes the build under one lock, closing a TOCTOU where a rebuild could invalidate a stale seal after a separate verify. verify_seal() is removed; external mining in the node calls submit() only.

Finalization aux cleanup: Achieved-work deletion runs only when finalization actually advances (last_finalized_after > last_finalized_before), so a no-op re-finalize does not delete the current finalized tip’s parent-work entry. Docs note fork aux entries are not enumerated.

pallet-qpow: Genesis build() persists self.initial_difficulty (chain-spec overrides). adjust_difficulty() and on_finalize use get_difficulty() so zero/missing storage matches verification instead of adjusting from zero.

qpow-math: mine_range uses overflowing_add for nonce stepping instead of saturating_add at U512::MAX.

Reviewed by Cursor Bugbot for commit 384405e. Configure here.

@n13 n13 left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: Address v12 bugs in consensus

Verdict: Approve (LGTM). All six fixes are correct, narrowly scoped, and backward-compatible. Default/existing behavior is preserved (the genesis initial_difficulty field defaults to InitialDifficulty, and the non-zero difficulty path is unchanged), so there is no consensus break. A few non-blocking notes below.

Per-fix assessment

  1. MAX_MESSAGE_SIZE 16MB → 1KB (miner-api) — Correct. The largest real frame is JobResult (~450 B: 128-char nonce + 128-char work hex) and NewJob (~280 B: 64-char hash + ≤155-digit U512 difficulty + numeric job_id, which is just job_counter.to_string()). 1KB leaves comfortable headroom and cuts the pre-read allocation amplification from ~4M× to ~256×.

  2. adjust_difficulty() / on_finalize via get_difficulty() (pallet-qpow) — Correct and important. With raw storage at zero, the old path produced increment = 0new = 0 → clamped to min_difficulty (131072), while verification used InitialDifficulty — an inconsistency that could collapse difficulty to the floor. Both now share the same zero-fallback. The non-zero (normal) path is unchanged.

  3. Genesis honors self.initial_difficulty (pallet-qpow) — Safe. GenesisConfig::default() sets the field to T::InitialDifficulty::get(), so default chain-specs and all existing tests (mock.rs sets it to TestInitialDifficulty::get()) are unaffected; only explicit chain-spec overrides change behavior, and a 0 override self-heals through get_difficulty().

  4. Cleanup guard last_finalized_after > last_finalized_before (sc-consensus-qpow) — Correct. Prevents deleting the finalized tip's achieved-work entry on a no-op re-finalize (that entry is the parent-work source for children). Finalization is monotonic, so after > before is exactly "advanced"; in steady state (advance by 1) it deletes the entry one below the tip, whose children are already finalized — safe, and retention stays bounded.

  5. Atomic verify+take in submit() (sc-consensus-qpow) — Correct. Collapsing verification and build.take() under one lock closes the TOCTOU where a rebuild between a separate verify_seal() and submit() could consume a fresh build with a stale seal. On verify failure the build is now preserved so the miner keeps trying. The lock is dropped before the await on import_block (no lock-across-await). verify_seal() is removed and its only caller (service.rs) updated — the verify_seal in frame/support is an unrelated trait method.

  6. overflowing_add in mine_range() (qpow-math) — Correct. Identical to saturating_add everywhere except at U512::MAX, where wrapping avoids re-hashing the same nonce. Practically unreachable, and now consistent with the node's local mining loop.

Non-blocking suggestions

  • worker.rs lost diagnostics: the old verify_seal logged pre_hash / best_hash / difficulty / nonce / extrinsic_count at error level on a failed seal; the new path logs a single warn. Consider keeping the richer fields — failed-seal logs are valuable for debugging mining issues.
  • chain_management.rs multi-step jumps: only the entry at last_finalized_before is deleted, so if finalization ever advances by >1 height (e.g. bursty import during sync) the intermediate canonical entries leak. Pre-existing and minor (same accepted tradeoff as fork entries), but a before+1..=after-1 cleanup loop would close it.
  • Tests: no regression tests added. The zero-storage adjust_difficulty case (#2) is cheap to cover at the pallet level and guards against a real difficulty-collapse path — worth a unit test.
  • Nit: value.unwrap() in submit() could be .expect("build presence checked under lock") to document the invariant.

CI

Fast Checks (Format), Analysis (Clippy & Doc), and Bugbot pass; the Build & Test matrix (Linux/macOS) was still pending at review time — worth confirming green before merge.

@n13 n13 left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GTG

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants