Skip to content

Migrate snapshots to task system, rather than ringbuffer#7947

Open
eddyashton wants to merge 16 commits into
microsoft:mainfrom
eddyashton:snapshot_tasks
Open

Migrate snapshots to task system, rather than ringbuffer#7947
eddyashton wants to merge 16 commits into
microsoft:mainfrom
eddyashton:snapshot_tasks

Conversation

@eddyashton

Copy link
Copy Markdown
Member

Perhaps not an idiomatic example of this migration, but a relatively isolated one to start with.

  • The old flow with a pre-allocation (in host memory) of the snapshot has been redundant for a long time. This removes it - its all one memory space, the "host" (really, the "thing that does IO") doesn't need to do any book-keeping/prep before ahead of seeing a ready-for-commit snapshot. See the first commit for this in isolation.
  • There's still 2 "tasks" (chunks of work) that we want to dispatch - the serialisation of the snapshot, and the IO writing of the snapshot. Those are now ccf::Tasks (technically Actions, for fiddly implementation details), and the old snapshot-related ringbuffer messages are gone.
  • The host-side SnapshotManager is near-stateless, depending only on config-provided directories. So rather than continuing to construct it early and threading a reference to the Snapshotter, we just push it inside and have it owned by the Snapshotter. Its renamed too, and significantly simplified.
  • We no longer use uv for the actual file writes! We previously did these on a uv worker pool thread, to make sure they didn't block the "main" thread. They're now on a task system thread. Worth exploring if we can find even more async-y APIs for these, to minimise any potentially-slow blocking calls, but that's a separate (and broader) concern.

Robot assisted, here's some before-and-after diagrams.
The separate task system actors in the latter diagram aren't particularly helpful, but the big "this is a simplification" evidence should be "all the arrows flow to the right".

sequenceDiagram
    participant C as Consensus thread<br/>(enclave)
    participant T as Snapshot task thread<br/>(enclave)
    participant RB as Ringbuffer
    participant H as Host main thread<br/>(dispatcher)
    participant W as Host uv workpool

    C->>C: Snapshotter.commit(idx)
    C->>T: add_task(SnapshotTask)
    T->>T: serialise snapshot + commit evidence tx
    Note over T: save bytes in pending_snapshots[generation]

    T->>RB: snapshot_allocate(version, evidence_idx, size, generation)
    RB->>H: SnapshotManager.allocate pending buffer
    H->>RB: snapshot_allocated(span, generation)
    RB->>C: write_snapshot(span, generation)
    C->>C: memcpy bytes to host buffer, mark stored=true

    Note over C: later: evidence committed + cose sig + tree available
    C->>C: update_indices gate passes
    C->>RB: snapshot_commit(version, receipt)
    RB->>H: SnapshotManager.commit_snapshot
    H->>H: open + write(snapshot) + write(receipt)
    H->>W: uv_queue_work(fsync + close + rename .committed)
Loading
sequenceDiagram
    participant C as Consensus thread
    participant OT as OrderedTasks[generation]
    participant TS as Task worker<br/>(serialise action)
    participant TP as Task worker<br/>(persist action)
    participant F as Filesystem

    C->>C: Snapshotter.commit(idx)
    C->>OT: schedule_snapshot(): create OrderedTasks + add Serialise action

    OT->>TS: run SerialiseSnapshotAction
    TS->>TS: serialise snapshot + commit evidence tx
    TS->>TS: store bytes/digests in shared SnapshotSerialisation

    Note over C: later: evidence_idx recorded, idx>evidence_idx, COSE sig + tree present
    C->>C: update_indices gate passes
    C->>OT: add PersistSnapshotAction (queued after serialise)

    OT->>TP: run PersistSnapshotAction
    TP->>TP: build receipt from COSE/tree/evidence
    TP->>TP: SnapshotWriter.persist_snapshot(...)

    TP->>F: open tmp snapshot file
    TP->>F: write(snapshot bytes)
    TP->>F: write(receipt bytes)
    TP->>F: fsync + close
    TP->>F: rename to .committed
Loading

@eddyashton eddyashton requested a review from a team as a code owner June 17, 2026 10:35
Copilot AI review requested due to automatic review settings June 17, 2026 10:35
Comment thread src/snapshots/snapshot_writer.h
Comment thread src/node/test/snapshotter.cpp
@eddyashton eddyashton added the run-long-test Run Long Test job label Jun 17, 2026

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR migrates snapshot persistence away from host/enclave ringbuffer messages to the task system, with snapshot writing performed in-process via a new SnapshotWriter. This simplifies the snapshot flow by removing host-side pre-allocation and ringbuffer snapshot message types, and by scheduling serialisation and persistence as ordered task actions.

Changes:

  • Introduces snapshots::SnapshotWriter and wires Snapshotter to persist committed snapshots via ordered task actions.
  • Removes snapshot-specific ringbuffer message types and deletes the host-side SnapshotManager/dispatcher handlers.
  • Updates unit tests to validate snapshot persistence by inspecting committed snapshot files on disk.

Custom instructions used:

  • .github/copilot-instructions.md
  • .github/instructions/reviewing.instructions.md

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
src/snapshots/snapshot_writer.h Adds new snapshot persistence helper that writes snapshot+receipt to disk and renames to .committed.
src/snapshots/snapshot_manager.h Deletes the old host-side snapshot ringbuffer manager and uv-based commit flow.
src/node/snapshotter.h Reworks snapshot flow to use OrderedTasks with serialise/persist actions and calls SnapshotWriter.
src/node/node_state.h Constructs Snapshotter with snapshot directory instead of a ringbuffer writer path.
src/enclave/enclave.h Removes handling of the host->enclave snapshot_allocated message.
src/consensus/ledger_enclave_types.h Removes snapshot-related ringbuffer message definitions and payload declarations.
src/host/run.cpp Removes host-side SnapshotManager construction and message handler registration.
src/host/test/ledger.cpp Updates snapshot tests to use SnapshotWriter and directory-based snapshot discovery helpers.
src/node/test/snapshotter.cpp Updates snapshotter tests to validate on-disk committed snapshot outputs rather than ringbuffer traffic.

Comment thread src/snapshots/snapshot_writer.h
Comment thread src/node/snapshotter.h Outdated
Comment thread src/snapshots/snapshot_writer.h Outdated
Comment thread src/snapshots/snapshot_writer.h Outdated
@achamayou achamayou added this to the 7.0.6 milestone Jun 17, 2026
@eddyashton

Copy link
Copy Markdown
Member Author

@copilot resolve the merge conflicts in this pull request

@achamayou

Copy link
Copy Markdown
Member

We no longer use uv for the actual file writes! We previously did these on a uv worker pool thread, to make sure they didn't block the "main" thread. They're now on a task system thread. Worth exploring if we can find even more async-y APIs for these, to minimise any potentially-slow blocking calls, but that's a separate (and broader) concern.

This is interesting, and I have mixed feelings about it. So long as we at least tend towards a consistent situation where uv is only used for networking, it's not too objectionable.
Using libuv gave a degree of platform independence that we are losing by going to OS APIs directly, but maybe that is fine in return for simplified task handling, at least on the file side of things.

I am worried that the ledger I/O is inextricably tied to the networking though, and won't be movable out of the UV loop without a wider change.

Comment thread src/node/snapshotter.h
@achamayou achamayou modified the milestones: 7.0.6, 7.0.7 Jun 22, 2026
try
{
snapshot_fd =
files::open_fd(full_snapshot_path, O_CREAT | O_EXCL | O_WRONLY);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume here we are deferring cleanup of broken files to another process? rather than cleaning up?
(the O_EXCL)

}
}

close_fd(snapshot_fd, file_name);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any error handling for close_fd?

@cjen1-msft

Copy link
Copy Markdown
Contributor

Some comments from an AI review, pruned for relevance / realism:

  • Snapshot tasks IO will now block enclave threads
    • I think this is fine but good to note.
  • max_pending_snapshots_count only uses pending_snapshots
    • We might want to amend this to include in-progress writes now that that is feasible.
  • Stopping the job board may leak Snapshotter (snapshotter -> pending_snapshots -> tasks -> SerialiseSnapshotAction -> snapshotter)
    • This seems legit, but afaicr we close the enclave threads after kicking off the destruction of the node_state so this should be fine?
  • No fsync after rename to .committed
  • Failure cases for SnapshotWriter tests
  • doc/operations/data_persistence.rst not updated

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

run-long-test Run Long Test job

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants