tool: add full-scan EVM logical digest #3611
Add an `evm-logical-digest` seidb operation for comparing EVM state across FlatKV and memIAVL at the same height. The command normalizes both backends into FlatKV physical keys, strips height-dependent value metadata, reports per-bucket `bucket_digest` values, and emits one `FINAL_DIGEST` line for backend comparison.
PR Summary
Low Risk
Overview
Both backends are normalized to FlatKV-style physical keys and digested via an order-independent XOR-of-SHA256 over [...]. memIAVL supports [...]. Unit tests assert semantic digest/inspect parity with translator for core EVM keys and that invalid normalization fails before opening snapshots.
Reviewed by Cursor Bugbot for commit b7bb0fc.
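The order-independent XOR-of-SHA256 idea can be sketched as follows. This is a minimal illustration, not the PR's code: the type `bucketDigest`, the helper `recordHash`, and the 8-byte big-endian length prefix are assumptions made for this sketch. Only the record shape `len(key)||key||len(value)||value` comes from the PR description.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"encoding/hex"
	"fmt"
)

// bucketDigest XORs per-record SHA-256 hashes, so the result does not
// depend on the order in which records are consumed. (Hypothetical type.)
type bucketDigest struct {
	acc   [sha256.Size]byte
	count uint64
}

// recordHash hashes len(key)||key||len(value)||value. The 8-byte
// big-endian length encoding is an assumption of this sketch.
func recordHash(key, value []byte) [sha256.Size]byte {
	h := sha256.New()
	var n [8]byte
	binary.BigEndian.PutUint64(n[:], uint64(len(key)))
	h.Write(n[:])
	h.Write(key)
	binary.BigEndian.PutUint64(n[:], uint64(len(value)))
	h.Write(n[:])
	h.Write(value)
	var out [sha256.Size]byte
	copy(out[:], h.Sum(nil))
	return out
}

// consume folds one record into the accumulator.
func (d *bucketDigest) consume(key, value []byte) {
	rh := recordHash(key, value)
	for i := range d.acc {
		d.acc[i] ^= rh[i]
	}
	d.count++
}

// String returns the accumulator as lowercase hex.
func (d *bucketDigest) String() string { return hex.EncodeToString(d.acc[:]) }

func main() {
	var a, b bucketDigest
	a.consume([]byte("k1"), []byte("v1"))
	a.consume([]byte("k2"), []byte("v2"))
	// Same records, opposite order.
	b.consume([]byte("k2"), []byte("v2"))
	b.consume([]byte("k1"), []byte("v1"))
	fmt.Println(a.String() == b.String())
}
```

The length prefixes keep `("ab", "c")` and `("a", "bc")` from hashing identically. Because XOR is self-inverse, two identical records cancel out, so a row count alongside the digest is a cheap extra check.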
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3611      +/-   ##
==========================================
- Coverage   59.02%   58.00%   -1.03%
==========================================
  Files        2215     2142      -73
  Lines      182521   174692    -7829
==========================================
- Hits       107734   101329    -6405
+ Misses      65091    64343     -748
+ Partials     9696     9020     -676
Cursor Bugbot has reviewed your changes using default effort and found 1 potential issue.
        return fmt.Errorf("open kvs %s: %w", kvsPath, err)
    }
    defer func() { _ = f.Close() }()
    r := bufio.NewReaderSize(f, 16*1024*1024)
Use unit constants for buffers
Low Severity
New `bufio.NewReaderSize` calls use raw `16*1024*1024` and `1024*1024` literals for buffer sizes. In sei-db, byte sizes should use `sei-db/common/unit` constants (for example `16 * unit.MB` and `unit.MB`) instead of hand-written numeric expressions.
Additional Locations (4)
Triggered by learned rule: sei-db: use unit.MB/GB constants for byte sizes, not bit-shift literals
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b7bb0fc7a5
    func EvmLogicalDigestCmd() *cobra.Command {
        cmd := &cobra.Command{
            Use:   "evm-logical-digest",
            Short: "Backend-independent digest of EVM logical state (account/code/storage) for memiavl vs flatkv comparison",
            RunE:  runEvmLogicalDigest,
        }
Register the digest command with the root CLI
The new `EvmLogicalDigestCmd()` factory is never added to `rootCmd.AddCommand` in `sei-db/tools/cmd/seidb/main.go` (checked the existing command list at lines 18-33), so users cannot run the advertised `seidb evm-logical-digest ...` command at all. Please add this command to the root command registration; the unit tests call `runEvmLogicalDigest` directly, so they do not catch the CLI being unreachable.
    for ; iter.Valid(); iter.Next() {
        k := iter.Key()
        seen++
        if err := d.consume(k, iter.Value()); err != nil {
Filter FlatKV rows to the EVM module before digesting
When the FlatKV backend contains non-EVM module rows (for example in later migration modes such as `MigrateAllButBank`/`FlatKVOnly`), this loop digests every `RawGlobalIterator` row into the legacy bucket, while the memIAVL path resolves only `<snapshot>/evm` via `resolveMemIAVLEvmSnapshotDir`. The advertised EVM-only comparison then reports mismatches caused solely by bank/staking/etc. rows in FlatKV. Skip non-`evm/` physical keys, except for the migration marker adjustment.
…sk panic isolation (#211)

Adds the **full-keyspace digest gate** (sei-protocol/sei-chain#3611, `seidb evm-logical-digest`) as a discrete sidecar task: the per-segment boundary seal that closes the touched-key comparator's **cold-state blind spot** (a key migrated wrong and never touched again is invisible to per-block Layer 2). Plus three seams the systems-engineering review called for. No "ShadowResultProducer" abstraction; that's deferred to the 3rd producer (YAGNI).

### What's here

- **`sidecar/s3/emit.go`**: one S3 emission helper (`StreamGzipNDJSON`/`StreamGzipJSON`/`StreamGzipFunc`), collapsing 3 duplicated gzip-pipe paths. Twofold integrity seal: an aws-chunked SHA-256 **wire** checksum over the compressed body (io.Pipe streaming/backpressure preserved) + an **uncompressed-payload** SHA-256 surfaced via `EmitResult` for out-of-band verification. `result_compare`/`result_export` refactored onto it (no behavior change).
- **`sidecar/engine`**: `recover()` in `runTask` turns a handler panic into a failed `TaskResult` (+ `seictl_task_panics_total`) instead of crashing the sidecar.
- **`sidecar/tasks/evm_logical_digest.go`**: the discrete task. It shells out to `seidb` for flatkv + memiavl (`semantic` + `translator`), asserts **both** backends' opened version `== height` (fail-closed: no wrong-height false match), parses the `FINAL_DIGEST`/per-bucket contract, and publishes an `EndpointDigestRecord`. `axes_proved` deliberately omits **balance** (the semantic account digest zeroes it; that axis stays the per-block comparator's job).

### Cross-review (systems-engineer + idiomatic-reviewer), applied

- **Symmetric memiavl version assertion** (the flatkv-only check left a wrong-height false-match hole if seidb clamps to the nearest snapshot).
- **`recover()` inside the s3 writer goroutine**: a panic there (e.g. `MarshalJSON` over chain data) runs on a task-spawned goroutine *outside* the engine's handler recover; converted to a returned error so the upload aborts (no truncated-but-valid object) and the process survives.
- **Dropped the empty-by-construction `uncompressed_sha256`** from the published record (a record can't carry the hash of its own bytes; the seal is out-of-band in the log/TaskResult).
- Comment-precision fixes (memory bound is the uploader part-pool, not "gzip window"; S3 checksum is per-part-composite for multipart) + a `version:`-line length guard.

### Notes

- `seidb` is **shelled out to** (configurable `seidbPath`), not vendored. #3611 also needs a one-line registration fix (`EvmLogicalDigestCmd()` isn't in `seidb`'s root `AddCommand`), flagged to the author.
- Trigger is the **out-of-band task API**; no controller/CRD change (consistent with the avoided `ResultExportConfig` one-way door).
- **One-way-door surfaces for confirmation before a consumer reads them:** the task-type string `"evm-logical-digest"`, the param field names, and the `EndpointDigestRecord` schema.
- `GOWORK=off go build ./...` clean; `go test ./sidecar/...` green (incl. new memiavl-version-mismatch + writer-panic regression tests); `gofmt -s` clean.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>


Summary
Add an `evm-logical-digest` seidb operation for comparing EVM state across FlatKV and memIAVL at the same height. The command normalizes both backends into FlatKV physical keys, strips height-dependent value metadata, reports per-bucket `bucket_digest` values, and emits one `FINAL_DIGEST` line for backend comparison.

- `sei-db/tools/cmd/seidb/operations/evm_logical_digest.go`: Adds the `evm-logical-digest` command with FlatKV native scanning and memIAVL snapshot scanning. FlatKV reads use `RawGlobalIterator`; memIAVL reads stream snapshot `kvs` records sequentially so scan order does not affect correctness.
- `sei-db/tools/cmd/seidb/operations/evm_logical_digest.go`: Enforces an order-independent bucket accumulator over `sha256(len(key)||key||len(value)||value)`. The final digest combines account, code, storage, and marker-adjusted legacy bucket digests so a FlatKV-only migration-version row does not create a false mismatch.
- `sei-db/tools/cmd/seidb/operations/evm_logical_digest.go`: Supports semantic memIAVL normalization by default and an opt-in translator mode through `--memiavl-normalization translator`. Semantic mode decodes raw EVM leaves directly; translator mode routes leaves through `flatkv.ImportTranslator` to validate the migration mapping.
- `sei-db/tools/cmd/seidb/operations/evm_logical_digest.go`: Adds `--inspect-bucket`, prefix sharding, row listing, backend metadata details, and `--find-hash` support for isolating mismatched entries. memIAVL inspect honors the same normalization flag as the global digest, so diagnostics match the selected digest path.

Test plan
- `sei-db/tools/cmd/seidb/operations/evm_logical_digest_test.go`: `TestSemanticMemiavlDigestMatchesTranslatorForCoreEVMKeys` verifies semantic normalization matches translator normalization for account, code, storage, and legacy buckets, including delete-equivalent zero storage and empty code rows.
- `sei-db/tools/cmd/seidb/operations/evm_logical_digest_test.go`: `TestSemanticMemiavlInspectMatchesTranslatorForCoreEVMKeys` verifies inspect bucket results match translator output for all normalized buckets.
- `sei-db/tools/cmd/seidb/operations/evm_logical_digest_test.go`: `TestInspectMemiavlRejectsUnknownNormalizationBeforeOpeningSnapshot` guarantees invalid `--memiavl-normalization` values are rejected before filesystem access.
- `go test ./sei-db/tools/cmd/seidb/operations`