Address v12 bugs in consensus#607
Conversation
n13
left a comment
There was a problem hiding this comment.
Review: Address v12 bugs in consensus
Verdict: Approve (LGTM). All six fixes are correct, narrowly scoped, and backward-compatible. Default/existing behavior is preserved (the genesis initial_difficulty field defaults to InitialDifficulty, and the non-zero difficulty path is unchanged), so there is no consensus break. A few non-blocking notes below.
Per-fix assessment
-
MAX_MESSAGE_SIZE16MB → 1KB (miner-api) — Correct. The largest real frame isJobResult(~450 B: 128-charnonce+ 128-charworkhex) andNewJob(~280 B: 64-char hash + ≤155-digit U512 difficulty + numericjob_id, which is justjob_counter.to_string()). 1KB leaves comfortable headroom and cuts the pre-read allocation amplification from ~4M× to ~256×. -
adjust_difficulty()/on_finalizeviaget_difficulty()(pallet-qpow) — Correct and important. With raw storage at zero, the old path producedincrement = 0→new = 0→ clamped tomin_difficulty(131072), while verification usedInitialDifficulty— an inconsistency that could collapse difficulty to the floor. Both now share the same zero-fallback. The non-zero (normal) path is unchanged. -
Genesis honors
self.initial_difficulty(pallet-qpow) — Safe.GenesisConfig::default()sets the field toT::InitialDifficulty::get(), so default chain-specs and all existing tests (mock.rssets it toTestInitialDifficulty::get()) are unaffected; only explicit chain-spec overrides change behavior, and a0override self-heals throughget_difficulty(). -
Cleanup guard
last_finalized_after > last_finalized_before(sc-consensus-qpow) — Correct. Prevents deleting the finalized tip's achieved-work entry on a no-op re-finalize (that entry is the parent-work source for children). Finalization is monotonic, soafter > beforeis exactly "advanced"; in steady state (advance by 1) it deletes the entry one below the tip, whose children are already finalized — safe, and retention stays bounded. -
Atomic verify+take in
submit()(sc-consensus-qpow) — Correct. Collapsing verification andbuild.take()under one lock closes the TOCTOU where a rebuild between a separateverify_seal()andsubmit()could consume a fresh build with a stale seal. On verify failure the build is now preserved so the miner keeps trying. The lock is dropped before theawaitonimport_block(no lock-across-await).verify_seal()is removed and its only caller (service.rs) updated — theverify_sealinframe/supportis an unrelated trait method. -
overflowing_addinmine_range()(qpow-math) — Correct. Identical tosaturating_addeverywhere except atU512::MAX, where wrapping avoids re-hashing the same nonce. Practically unreachable, and now consistent with the node's local mining loop.
Non-blocking suggestions
worker.rslost diagnostics: the oldverify_sealloggedpre_hash/best_hash/difficulty/nonce/extrinsic_countaterrorlevel on a failed seal; the new path logs a singlewarn. Consider keeping the richer fields — failed-seal logs are valuable for debugging mining issues.chain_management.rsmulti-step jumps: only the entry atlast_finalized_beforeis deleted, so if finalization ever advances by >1 height (e.g. bursty import during sync) the intermediate canonical entries leak. Pre-existing and minor (same accepted tradeoff as fork entries), but abefore+1..=after-1cleanup loop would close it.- Tests: no regression tests added. The zero-storage
adjust_difficultycase (#2) is cheap to cover at the pallet level and guards against a real difficulty-collapse path — worth a unit test. - Nit:
value.unwrap()insubmit()could be.expect("build presence checked under lock")to document the invariant.
CI
Fast Checks (Format), Analysis (Clippy & Doc), and Bugbot pass; the Build & Test matrix (Linux/macOS) was still pending at review time — worth confirming green before merge.
Security Fixes from V12 Audit Review
This PR addresses multiple security findings from the V12 audit review.
Fixes Included
1. Reduce
MAX_MESSAGE_SIZEfrom 16MB to 1KB (miner-api)Issue:
read_messageallocates a buffer sized by an untrusted 4-byte length prefix before reading payload. With 16MB limit, an attacker could cause ~4,000,000x memory amplification per connection.Fix: Reduced to 1KB. Real
MinerMessagepayloads (Ready, NewJob, JobResult) are only ~200-350 bytes.2. Use
get_difficulty()inadjust_difficulty()(pallet-qpow)Issue:
adjust_difficulty()read raw storage directly, bypassing the zero-check inget_difficulty(). When storage was zero/missing, verification usedInitialDifficultybut adjustment computed from zero → clamped tomin_difficulty, causing inconsistent behavior.Fix: Changed to use
Self::get_difficulty()for consistent zero-storage handling.3. Honor
GenesisConfig.initial_difficulty(pallet-qpow)Issue:
GenesisConfigexposed aninitial_difficultyfield, butbuild()ignored it and always used the runtime constantT::InitialDifficulty::get(), making chain-spec overrides a silent no-op.Fix: Changed
build()to useself.initial_difficulty.4. Only clean up achieved work when finalization advances (
sc-consensus-qpow)Issue: Cleanup deleted the work entry at
last_finalized_beforeunconditionally, even whenfinalize_blockwas a no-op (re-finalizing same block). This could delete the current finalized tip's entry, which is needed as parent work for child blocks.Fix: Added check
last_finalized_after > last_finalized_beforebefore cleanup.5. Atomic seal verification in
submit()(sc-consensus-qpow)Issue:
submit()consumed the build (viabuild.take()) before validating the seal. A TOCTOU race could occur:verify_seal()passes, rebuild lands,submit()consumes the new build with a stale seal, import fails, valid build is destroyed.Fix: Refactored
submit()to verify and take atomically under one lock. Removed separateverify_seal()method and redundant call site.6. Use
overflowing_addinmine_range()(qpow-math)Issue:
mine_rangeusedsaturating_addto increment nonces, which clamps atU512::MAXinstead of wrapping. This would cause repeated hashing of the same nonce at the boundary (though practically unreachable).Fix: Changed to
overflowing_add().0for correct wrapping behavior, matching the node's local mining loop.Not Fixed (Accepted Limitations)
Aux storage leak for non-canonical fork blocks
Non-canonical fork blocks' achieved work entries are not cleaned up because we cannot enumerate block hashes by height. This is acceptable because:
Eclipse attack → permanent split
Auto-finalization at fixed depth means an eclipsed node can finalize on an attacker's fork and become permanently split from the honest network. This is a fundamental PoW + finality tradeoff, documented but not changed.
Note
Medium Risk
Touches consensus finalization aux state, block submission races, and on-chain difficulty genesis/adjustment—important for fork choice and mining correctness, but changes are targeted security fixes rather than broad refactors.
Overview
Addresses multiple V12 audit findings across consensus, runtime, miner IPC, and mining math.
Miner API:
MAX_MESSAGE_SIZEdrops from 16MB to 1KB so untrusted length prefixes cannot force huge allocations on QUIC reads.Mining worker:
submit()now verifies the seal and takes the build under one lock, closing a TOCTOU where a rebuild could invalidate a stale seal after a separate verify.verify_seal()is removed; external mining in the node callssubmit()only.Finalization aux cleanup: Achieved-work deletion runs only when finalization actually advances (
last_finalized_after > last_finalized_before), so a no-op re-finalize does not delete the current finalized tip’s parent-work entry. Docs note fork aux entries are not enumerated.pallet-qpow: Genesisbuild()persistsself.initial_difficulty(chain-spec overrides).adjust_difficulty()andon_finalizeuseget_difficulty()so zero/missing storage matches verification instead of adjusting from zero.qpow-math:mine_rangeusesoverflowing_addfor nonce stepping instead ofsaturating_addatU512::MAX.Reviewed by Cursor Bugbot for commit 384405e. Configure here.