Skip to content

Cherry-pick Fee Claim commits to client branch#27

Open
amackillop wants to merge 10 commits into
lsp-0.2.0_accept-underpaying-htlcs_with_timing_logsfrom
austin_fee-claim-cherry-picks
Open

Cherry-pick Fee Claim commits to client branch#27
amackillop wants to merge 10 commits into
lsp-0.2.0_accept-underpaying-htlcs_with_timing_logsfrom
austin_fee-claim-cherry-picks

Conversation

@amackillop

@amackillop amackillop commented Jun 11, 2026

Copy link
Copy Markdown

These are the commits from #23, #24, #25, and #26. some of this stuff is needed on the client but moving it all to help keep things in sync and hopefully make it easier to reconcile these one day.

amackillop and others added 10 commits June 11, 2026 05:54
Introduce the fee_policy module: a FeePolicy/FeeTier ADT and a single
pure resolve_skim mapping a policy plus an HTLC amount to the msat to
skim. Shared foundation for upcoming work that waives the JIT-channel
skim for grant recipients; nothing wires it up yet.

When the skim would eat the whole HTLC, resolve_skim waives it rather
than clamping to amount - 1: a residual that small may fall below
htlc_minimum_msat and be rejected anyway, so clamping would bank a fee
on a payment that never settles. This only triggers for a rate >= 100%
or an outsized Custom base; the standard 2% never reaches it.

This is also where resolve_skim deliberately diverges from the code it
replaces. The service computed the proportional fee in u64, so
amount * ppm overflowed for very large HTLCs and skimmed nothing,
forwarding the whole HTLC for free. resolve_skim multiplies in u128 and
skims correctly. At the standard 2% the result is identical for any
realistic HTLC; only the previously-overflowing range changes, from
"free" to "charged". Tests pin the standard tier against the legacy
computation for non-overflow sizes and document the large-HTLC case.

The other arms are encoding-only. ZeroFee resolves to zero; Custom
{ ppm, base_msat } adds a flat base to the proportional component.
Custom is kept though v1 never constructs it: it's free in the TLV and
documents the shape of an explicit rate.

Both enums serialize via impl_writeable_tlv_based_enum! with reserved
type bytes (FeeTier: Standard=0, ZeroFee=2, Custom=4; FeePolicy: Flat=0)
so future variants are additive. Flat is a length-prefixed tuple variant
wrapping FeeTier, letting a later commit store a policy as a TLV field
that defaults cleanly on old records.
Persist a FeePolicy per intercept SCID record so a later milestone can
look up a peer's policy at the skim site. ScidWithPeer::new still hands
every record Flat(Standard), so nothing changes yet.

The field uses a length-delimited TLV with a default_value of
Flat(Standard) at type 4. Records written before this field existed
carry only types 0 and 2, so they read back as Standard with no
migration. A test encodes a copy of the old two-field layout and decodes
it through the new struct to pin that default.
Replace the inline compute_forward_fee block in
calculate_htlc_actions_for_peer with a single resolve_skim call against a
literal Flat(Standard) policy. This is the one place the LSP decides what
to skim; routing it through the pure function is what later milestones
need to swap the literal for a per-peer policy lookup.

Not a strict no-op: it inherits the u128 overflow fix from resolve_skim,
so a >9223-BTC HTLC is now skimmed 2% instead of overflowing and
forwarding free. Every realistic HTLC is unchanged. The peer's stored
policy is still ignored; every forward resolves Flat(Standard) until the
lookup lands.

The two old log lines (overflow, skim-ate-the-HTLC) collapse into one:
resolve_skim can't overflow, so a zero skim on a non-zero HTLC can only
mean the fee would have eaten the whole amount.
Introduce a Nix dev shell and a justfile so work on this fork has a
reproducible toolchain and a one-command local gate, mirroring the setup
in the ldk-node repo.

The toolchain is pinned to stable 1.90.0 via fenix rather than tracking
latest. The version is bounded on both ends: clippy must be at least
1.87 so ci/check-lint.sh can resolve the lint names it allows (older
clippy hard-errors on the unknown clippy::manual_is_multiple_of), but
1.92+ adds clippy::assertions_on_constants which fires on existing code
in lightning/src/chain/channelmonitor.rs and would break -D warnings on
the inherited tree. 1.90.0 sits in that window and matches what CI's
stable resolves to today. fenix is used over rust-overlay because it
pins an exact channel by hash, so the shell can't drift.

just check only compiles and runs the workspace test suite. It
deliberately skips clippy and rustfmt: this branch is not clean under
either of CI's gates (pre-existing dead code in lsps2/lsps4 trips
check-lint.sh's -D warnings, and the tree predates the pinned rustfmt),
and reformatting or de-linting code we don't own is out of scope.
MDK-980 lets a registering node carry a signed grant for a
non-standard FeePolicy that the LSP verifies locally before honouring.
This is just the pure core: the claim ADT and the verifier. Nothing
calls it yet, so behaviour is unchanged.

A claim is a versioned TLV pair. ClaimPayload binds {scheme, node_id,
policy}; SignedFeeClaim wraps the encoded payload as an opaque byte
string plus a detached BIP340 signature over SHA256 of those bytes.
Keeping the payload opaque means the verifier hashes exactly what was
signed and never reproduces the payload's TLV layout to check the
signature. TLV rather than a fixed concatenation keeps later schemes
(more policy arms, an issuer key-id) additive: a new tag, not a re-issue.

verify_claim takes a slice of issuer keys, not a single Option. An empty
slice rejects every claim, which is how the feature stays inert until a
key is configured. A non-empty slice accepts a signature from any one of
the keys. That covers a no-key-id claim today and key rotation later,
and per-key scoping can slot in without a breaking config change.

The verifier borrows a caller-supplied secp context rather than building
its own, so the verifier stays allocation free and context lifetime is
the caller's call. The handler wiring that follows will own a long-lived
verify context on the service and hand it in. The scheme byte is read
and matched before the signature check because it selects which
verification rules apply, so an unknown scheme is rejected before any
crypto runs.

The wire format is the contract MDK-981 (ldk-node) and MDK-982 (the TS
minter) must reproduce byte-for-byte, so an in-tree test vector pins a
known issuer key and claim hex. Drift in the TLV layout or the signing
input fails that test instead of silently diverging across repos. The
signing helper is test-only; the verifier does no I/O.
Adds an optional fee_claim string to RegisterNodeRequest so a node can
present a signed grant when it registers. The field is hex of the
SignedFeeClaim bytes from the previous commit. Nothing reads it yet; the
service-side verify and persist land later.

serde(default, skip_serializing_if = "Option::is_none") keeps the wire
backward compatible both ways. An old client that sends {} still decodes
to fee_claim: None through the existing params.unwrap_or(json!({})) path,
and a node with no claim emits {} rather than an explicit null, so the
request bytes match today exactly when no claim is set.

The client constructs the request with fee_claim: None; only a node that
has been granted a claim populates it, which is wired up in a later
change.
The persisted ScidWithPeer already carries a policy field, but the store
only built peer<->scid lookups, so the granted policy was written to disk
and then lost in memory. This adds a policy_by_peer map kept in lockstep
with the existing two: populated on load from the decoded records, upserted
on insert, dropped on remove. A get_policy accessor reads it back.

ScidWithPeer::new now takes the policy explicitly rather than defaulting it,
so there is one way to build the record and the policy is always named at
the construction site. add_intercepted_scid passes Flat(Standard), the path
a peer with no claim takes; the register handler will pass a verified grant.
The on-disk default is unchanged: it comes from the TLV default_value, not
the constructor. get_policy warns as dead code until the handler reads it.

Keeping policy in its own map rather than handing out whole ScidWithPeer
records keeps the skim site's read to a single lookup by node id, which is
all it needs.
Wires the claim verifier into register_node. The service now holds a set of
trusted issuer keys and a long-lived verification context; when a peer
registers, any fee_claim it presented is verified against those keys and the
granted policy is persisted onto the peer's SCID record before the response
is enqueued, so the policy is in place by the time the SCID is handed out.

The feature is inert by default. An empty issuer_pubkeys set short-circuits
before any crypto and resolves every peer to the standard policy, byte for
byte identical to today. Population of the key set is the operator's job
downstream; nothing in-tree sets it.

A grant is only ever upserted, never downgraded. An absent or unverifiable
claim returns None and leaves an existing record untouched, so a transient
miss or a malformed claim can't wipe a live grant; a brand-new record falls
back to standard. This deviates from a literal "else standard" on purpose:
once a node has been granted zero-fee, a dropped claim on a later
re-registration must not silently restore the 2% skim.

The verifier borrows the service's context rather than allocating per call,
and resolve_claim_policy logs and swallows verification failures rather than
failing the registration: a bad claim should cost the node its discount, not
its channel.

add_intercepted_scid is removed. It only built a standard-policy record,
which the new persist path now does directly with the resolved policy, so
the old helper had no remaining caller.
The forwarding skim now reads the peer's granted policy from the SCID store
instead of hard-coding Flat(Standard). A peer with no grant resolves to
Standard, so the skim is byte-for-byte the historical 2% for everyone the
issuer set has not waived. This is the read side that makes the persisted
grant actually take effect; before it, a verified ZeroFee was stored and
then ignored.

The lookup is hoisted out of the per-HTLC loop because the policy is keyed
by peer, not by HTLC, so it is one store read per batch rather than one per
forward.

A zero skim is now two distinct cases, so the log is split. ZeroFee is the
expected path and logs at info. Any other tier skimming nothing means the
fee would have consumed the whole HTLC, which keeps the error-level log:
a forward of the full amount that we expected to take a cut on. Previously
ZeroFee was unreachable, so the single error log was correct; now that a
grant can legitimately waive the fee, the error would be noise.
MDK-980 added the fee_claim field to RegisterNodeRequest and the verifier
that reads it, but the client still hardcoded None, so no node could ever
present a claim. This lets the caller pass one.

The claim rides as a per-call argument rather than a field on
LSPS4ClientConfig: that config derives Copy, which an Option<String> would
break, and the value belongs to the layer above (ldk-node), which already
holds it and relays it on every registration. There is a single call site,
so threading it as a parameter is the smaller, honest change.

The value is opaque here: a lowercase-hex signed grant the LSP alone decodes
and verifies. Passing None reproduces today's behavior exactly, so nothing
changes until a node is configured with a claim.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant