This is the claude output of the following prompt:
write a design doc for a new approach to resolving HTLCs and payments.
Currently, the resolution of HTLCs (and decisions on when HTLCs can be forwarded) is the responsibility of Channel objects (a part of ChannelManager) until the channel is closed, and then the ChannelMonitor thereafter. This leads to some complexity around race conditions for HTLCs right around channel closure. Additionally, there is lots of complexity reconstructing the state of all HTLCs in the ChannelManager deserialization/loading logic.
Instead, we want to do all resolution in ChannelMonitors (in response to ChannelMonitorUpdates) and pass them back to ChannelManager in the form of MonitorEvents (similar to how HTLCs are resolved after channels are closed). In order to have reliable resolution, we'll need to keep MonitorEvents around in the ChannelMonitor until the ChannelManager has finished processing them - adding a new MonitorEvent resolution path through a new method (rather than via ChannelMonitorUpdates). This should remove the need for the RAA-blocking updates entirely - on restart instead of examining the set of HTLCs in monitors we can simply replay all the pending MonitorEvents.
Further, for claiming inbound (MPP) payments, the ChannelMonitor should push a pending MonitorEvent to replay that claim, but not provide it to the ChannelManager immediately. That way, if we crash and restart the MonitorEvents to claim the MPP parts will be replayed automatically and we can claim any MPP Parts that we didn't manage to claim before crashing. We'll still have to track once all channels part of the MPP payment have the preimage and delete the MonitorEvent, but should be able to remove some other on-load logic.
Finally, decisions on when an HTLC is able to be forwarded to the outbound channel should similarly be done using a MonitorEvent, avoiding more work in ChannelManager HTLC reconstruction.
Please write a thorough design doc detailing how much of the ChannelManager on-load complexity can go away, listing specific hunks of code and fields.
Design: Unified HTLC Resolution via ChannelMonitor
Status: Draft
Date: 2026-03-12 (rev 1)
Problem Statement
Currently, HTLC resolution (and decisions about when HTLCs can be forwarded) is
split between two systems:
- Channel objects (part of ChannelManager) handle HTLC resolution while
  the channel is open: tracking commitment state transitions, forwarding
  decisions, and preimage propagation.
- ChannelMonitors handle HTLC resolution after the channel is closed:
  watching for on-chain transactions, claiming outputs, and failing HTLCs back.
This split creates two major problems:
Race conditions around channel closure. An HTLC might be in the process of
being resolved in the Channel when the channel closes. The Channel buffers
HTLC actions in monitor_pending_* fields (channel.rs:3128–3131) while a
monitor update is in progress. If the channel is dropped during this window, we
don't know whether the ChannelMonitor has responsibility for those HTLCs. This
is explicitly called out as an open bug in channel.rs:3124–3127:
If a channel is drop'd, we don't know whether the ChannelMonitor is
ultimately responsible for some of the HTLCs here or not — we don't know
whether the update in question completed or not. We currently ignore these
fields entirely when force-closing a channel, but need to handle this somehow
or we run the risk of losing HTLCs!
Enormous complexity in ChannelManager deserialization. On restart, the
ChannelManager must reconstruct the state of all in-flight HTLCs by
cross-referencing Channel state, ChannelMonitor state, in-flight monitor
updates, blocked monitor updates, RAA-blocking actions, and various legacy maps.
This reconstruction logic spans ~1600 lines of from_channel_manager_data()
(channelmanager.rs:18635–20263) and is one of the most complex and error-prone
parts of LDK.
Proposed Solution
Move all HTLC resolution to ChannelMonitors, driven by
ChannelMonitorUpdates, with results communicated back to ChannelManager via
MonitorEvents. The ChannelManager becomes a pure routing/forwarding
engine that tells monitors what to do, and monitors tell the manager what
happened.
Core Principles
- ChannelMonitor is the sole authority on HTLC resolution state. Whether a
  channel is open or closed, the monitor decides when an HTLC is resolved and
  communicates this to the ChannelManager.
- MonitorEvents are persistent until acknowledged. The ChannelMonitor
  keeps MonitorEvents in its persistent state until the ChannelManager
  explicitly acknowledges them via a new method (not via ChannelMonitorUpdate).
- Restart == replay. On restart, the ChannelManager simply replays all
  pending (unacknowledged) MonitorEvents from all monitors. No reconstruction
  logic needed.
- Inbound MPP claims use deferred MonitorEvents. When claiming an MPP
  payment, the ChannelMonitor stores a MonitorEvent for the claim but does
  not provide it to the ChannelManager immediately. On restart, these events
  are replayed, allowing crash-safe MPP claiming without special on-load logic.
Current Architecture (What We Have Today)
HTLC Lifecycle in an Open Channel
When a channel is open, an inbound HTLC goes through these states
(channel.rs InboundHTLCState, lines 174–233):
RemoteAnnounced(InboundHTLCResolution)
│ commitment_signed received
▼
AwaitingRemoteRevokeToAnnounce(InboundHTLCResolution)
│ counterparty revoke_and_ack
▼
AwaitingAnnouncedRemoteRevoke(InboundHTLCResolution)
│ our revoke_and_ack + their commitment_signed
▼
Committed { update_add_htlc: InboundUpdateAdd }
│ HTLC now irrevocably committed; forwarding decision made
│ fail_htlc/fulfill_htlc
▼
LocalRemoved(InboundHTLCRemovalReason)
│ counterparty revoke_and_ack
▼
[removed from tracking]
When the HTLC reaches Committed, the InboundUpdateAdd payload
(channel.rs:337–376) indicates its readiness:
- WithOnion { update_add_htlc }: onion not yet decoded, added to
  decode_update_add_htlcs for processing
- Forwarded { ... }: already forwarded to the outbound edge, onion pruned
- Legacy: pre-0.3 HTLC without onion persistence
The transition from AwaitingAnnouncedRemoteRevoke to Committed happens in
revoke_and_ack() (channel.rs:~8587), where WithOnion HTLCs are pushed to
monitor_pending_update_adds (channel.rs:8696) and eventually decoded via
process_pending_update_add_htlcs() (channelmanager.rs:7195–7535).
Forwarding Decisions
Decoded HTLCs flow through process_pending_htlc_forwards()
(channelmanager.rs:7558–7645):
- process_pending_update_add_htlcs() decodes onions from the
  decode_update_add_htlcs map (channelmanager.rs:2807)
- Decoded HTLCs are added to the forward_htlcs map
  (channelmanager.rs:2789–2791)
- forward_htlcs is drained; for each HTLC:
  - If short_chan_id != 0: process_forward_htlcs() sends it to an outbound
    channel via queue_add_htlc() (channelmanager.rs:7836–8105)
  - If short_chan_id == 0: process_receive_htlcs() handles it as a final
    payment
Monitor Update Blocking and the monitor_pending_* Fields
When a ChannelMonitorUpdate is being persisted, the Channel cannot proceed
with certain protocol messages. Pending work is buffered:
- monitor_pending_forwards: Vec<(PendingHTLCInfo, u64)> (channel.rs:3128):
  inbound HTLCs ready to forward
- monitor_pending_failures: Vec<(HTLCSource, PaymentHash, HTLCFailReason)>
  (channel.rs:3129): inbound HTLCs to fail backwards
- monitor_pending_finalized_fulfills: Vec<(HTLCSource, Option<AttributionData>)>
  (channel.rs:3130): fulfilled HTLCs awaiting acknowledgment (persisted, TLV 11)
- monitor_pending_update_adds: Vec<msgs::UpdateAddHTLC> (channel.rs:3131):
  inbound update_add messages awaiting onion decode
These are released via monitor_updating_restored() (channel.rs:9100–9234)
which returns a MonitorRestoreUpdates struct (channel.rs:1176–1197) containing:
pub struct MonitorRestoreUpdates {
pub raa: Option<msgs::RevokeAndACK>,
pub commitment_update: Option<msgs::CommitmentUpdate>,
pub commitment_order: RAACommitmentOrder,
pub accepted_htlcs: Vec<(PendingHTLCInfo, u64)>, // from monitor_pending_forwards
pub failed_htlcs: Vec<(HTLCSource, PaymentHash, HTLCFailReason)>,
pub finalized_claimed_htlcs: Vec<(HTLCSource, Option<AttributionData>)>,
pub pending_update_adds: Vec<msgs::UpdateAddHTLC>, // from monitor_pending_update_adds
pub funding_broadcastable: Option<Transaction>,
pub channel_ready: Option<msgs::ChannelReady>,
// ... other fields
}
Preimage Claiming (Forwarded Payments)
When a downstream channel receives a preimage:
- ChannelManager::claim_funds_internal() is called
- For the upstream (inbound) channel,
  Channel::get_update_fulfill_htlc_and_commit() (channel.rs:7106–7166)
  generates a ChannelMonitorUpdate with a PaymentPreimage step
  (channel.rs:7018–7025)
- A RAAMonitorUpdateBlockingAction::ForwardedPaymentInboundClaim
  (channelmanager.rs:1672–1677) is set on the downstream channel, blocking
  its next RAA monitor update
- A MonitorUpdateCompletionAction::EmitEventOptionAndFreeOtherChannel
  (channelmanager.rs:1474–1477) pairs Event::PaymentForwarded with
  unblocking the downstream channel via EventUnblockedChannel
  (channelmanager.rs:1414–1420)
- When the upstream monitor update completes,
  handle_monitor_update_completion_actions() (channelmanager.rs:10103–10255)
  emits the event and frees the RAA blocker
RAA-Blocking Infrastructure
The RAA-blocking system involves multiple types and fields:
Types:
- RAAMonitorUpdateBlockingAction (channelmanager.rs:1668–1700): Enum with
  ForwardedPaymentInboundClaim and ClaimedMPPPayment variants
- MonitorUpdateCompletionAction (channelmanager.rs:1454–1495): Enum with
  PaymentClaimed, EmitEventOptionAndFreeOtherChannel, and
  FreeDuplicateClaimImmediately variants
- EventCompletionAction::ReleaseRAAChannelMonitorUpdate
  (channelmanager.rs:1557–1562): Deferred RAA release on event processing
- EventUnblockedChannel (channelmanager.rs:1414–1420): Pointer to channel to
  unblock
- PendingChannelMonitorUpdate (channel.rs:1472–1478): Blocked update wrapper
Fields (PeerState, channelmanager.rs:1709–1782):
- in_flight_monitor_updates: BTreeMap<ChannelId, (OutPoint, Vec<ChannelMonitorUpdate>)>
  (line 1740)
- monitor_update_blocked_actions: BTreeMap<ChannelId, Vec<MonitorUpdateCompletionAction>>
  (line 1760)
- actions_blocking_raa_monitor_updates: BTreeMap<ChannelId, Vec<RAAMonitorUpdateBlockingAction>>
  (line 1765)
- closed_channel_monitor_update_ids: BTreeMap<ChannelId, u64> (line 1775)
Fields (ChannelContext, channel.rs):
- blocked_monitor_updates: Vec<PendingChannelMonitorUpdate> (line 3339)
Functions:
- raa_monitor_updates_held() (channelmanager.rs:12671–12688): Checks the
  actions_blocking_raa_monitor_updates map AND the pending events queue for
  ReleaseRAAChannelMonitorUpdate actions
- handle_monitor_update_release() (channelmanager.rs:~14962–15036): Removes
  blockers and unblocks the channel's blocked_monitor_updates queue
- revoke_and_ack(..., hold_mon_update: bool) (channel.rs:~8359): The
  hold_mon_update parameter conditionally blocks the resulting monitor update
Inbound MPP Claiming
The MPP claim flow is particularly complex:
- User calls claim_funds(preimage) (channelmanager.rs:9206)
- begin_claiming_payment() moves payment from claimable_payments to
  pending_claiming_payments (channelmanager.rs:~1319–1380)
- For each MPP part, claim_mpp_part() (channelmanager.rs:9563+):
  a. Calls Channel::get_update_fulfill_htlc_and_commit() for open channels
  b. Creates ChannelMonitorUpdate with PaymentPreimage step + PaymentClaimDetails
  c. Sets up shared PendingMPPClaim (channelmanager.rs:1609–1612):
     pub(crate) struct PendingMPPClaim {
         channels_without_preimage: Vec<(PublicKey, ChannelId)>,
         channels_with_preimage: Vec<(PublicKey, ChannelId)>,
     }
  d. Creates RAAMonitorUpdateBlockingAction::ClaimedMPPPayment per channel
  e. Creates MonitorUpdateCompletionAction::PaymentClaimed per channel
- As each monitor update completes,
  handle_monitor_update_completion_actions() (channelmanager.rs:10147–10155)
  moves entries from channels_without_preimage to channels_with_preimage
- When channels_without_preimage is empty: free all RAA blockers, emit
  Event::PaymentClaimed
Supporting types:
- PendingMPPClaimPointer(Arc<Mutex<PendingMPPClaim>>) (line 1650): Shared
  pointer for cross-channel coordination
- MPPClaimHTLCSource (lines 1618–1623): Identifies each MPP part channel
- PaymentClaimDetails (lines 1637–1642): Stored in ChannelMonitor for
  restart claim replay
- HTLCClaimSource (lines 1590–1595): Deserialization-time equivalent of
  MPPClaimHTLCSource
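The bookkeeping that PendingMPPClaim performs today can be sketched in a few lines. This is a simplified illustration, not the LDK implementation: channels are represented by plain u32 ids rather than (PublicKey, ChannelId) pairs, and the type and method names (PendingMppClaimSketch, monitor_update_completed) are hypothetical.

```rust
/// Simplified stand-in for `PendingMPPClaim`; channel ids are plain `u32`s
/// here instead of `(PublicKey, ChannelId)` pairs.
#[derive(Debug, Default)]
pub struct PendingMppClaimSketch {
    pub channels_without_preimage: Vec<u32>,
    pub channels_with_preimage: Vec<u32>,
}

impl PendingMppClaimSketch {
    /// Start a claim in which no channel has the preimage persisted yet.
    pub fn new(channels: &[u32]) -> Self {
        Self {
            channels_without_preimage: channels.to_vec(),
            channels_with_preimage: Vec::new(),
        }
    }

    /// Record that the preimage update for `channel` has been persisted.
    /// Returns `true` once every part has the preimage, which is the point
    /// at which the RAA blockers would be freed and the claim event emitted.
    pub fn monitor_update_completed(&mut self, channel: u32) -> bool {
        if let Some(pos) = self
            .channels_without_preimage
            .iter()
            .position(|c| *c == channel)
        {
            let done = self.channels_without_preimage.remove(pos);
            self.channels_with_preimage.push(done);
        }
        self.channels_without_preimage.is_empty()
    }
}
```

A repeated completion for the same channel is a no-op in this sketch, so the return value only flips to true when the last outstanding part completes.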
HTLC Resolution After Channel Closure
After closure, the ChannelMonitor takes over:
- Watches for on-chain HTLC timeouts/claims (channelmonitor.rs:5257–5756)
- Creates MonitorEvent::HTLCEvent with preimage (line 6134) or without
  (line 5607) as HTLCs resolve on-chain
- Creates MonitorEvent::CommitmentTxConfirmed (line 5432) when commitment tx
  is detected
- ChannelManager::process_monitor_events_for_failover()
  (channelmanager.rs:13247–13373) consumes these events to fail/claim upstream
The MonitorEvent enum (channelmonitor.rs:188–227) currently has:
- HTLCEvent(HTLCUpdate): HTLC resolved on-chain (claim or timeout)
- HolderForceClosedWithInfo { reason, outpoint, channel_id }: we force-closed
- HolderForceClosed(OutPoint): legacy force-close
- CommitmentTxConfirmed(()): commitment tx confirmed on-chain
- Completed { funding_txo, channel_id, monitor_update_id }: monitor update
  persisted
Events are currently fire-and-forget: get_and_clear_pending_monitor_events()
(channelmonitor.rs:4373–4377) does a mem::swap to drain them.
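The fire-and-forget behaviour is easy to see in isolation. The sketch below uses a hypothetical DrainingQueue type with String events as a stand-in for the monitor's pending event list; it only illustrates the mem::swap drain pattern and is not the actual ChannelMonitor code.

```rust
use std::mem;

/// Hypothetical stand-in for a monitor's pending event list.
pub struct DrainingQueue {
    pub pending: Vec<String>,
}

impl DrainingQueue {
    /// Hand every pending event to the caller and leave the queue empty.
    /// If the caller crashes before acting on the result, nothing in this
    /// structure remembers that the events ever existed.
    pub fn get_and_clear(&mut self) -> Vec<String> {
        let mut ret = Vec::new();
        mem::swap(&mut ret, &mut self.pending);
        ret
    }
}
```

After one call the queue is empty, so a second call (for example after a restart that reloads this already-drained state) returns nothing to replay.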
The Painful On-Load Reconstruction
On deserialization, from_channel_manager_data() (channelmanager.rs:18635–20263)
must perform a vast reconstruction. Here is every section with exact line ranges:
Step 1: Channel vs. Monitor State Validation (lines 18688–18876)
For each deserialized FundedChannel, compare its commitment transaction
numbers against the corresponding ChannelMonitor:
channel.get_cur_holder_commitment_transaction_number()
> monitor.get_cur_holder_commitment_number()
|| channel.get_revoked_counterparty_commitment_transaction_number()
> monitor.get_min_seen_secret()
|| channel.get_cur_counterparty_commitment_transaction_number()
> monitor.get_cur_counterparty_commitment_number()
|| channel.context.get_latest_monitor_update_id()
< monitor.get_latest_update_id()
If the channel is behind the monitor: force-close with
ClosureReason::OutdatedChannelManager and fail any orphaned HTLCs not in the
monitor. This queues BackgroundEvent::MonitorUpdateRegeneratedOnStartup with
a ChannelForceClosed step.
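The staleness condition above can be restated as a standalone predicate. This is a sketch under assumptions: ChannelSnapshot and MonitorSnapshot are hypothetical structs holding the values the real getters would return, and channel_is_stale is an illustrative name.

```rust
/// Hypothetical snapshot of the values read from a deserialized channel.
pub struct ChannelSnapshot {
    pub holder_commitment_number: u64,
    pub revoked_counterparty_commitment_number: u64,
    pub counterparty_commitment_number: u64,
    pub latest_monitor_update_id: u64,
}

/// Hypothetical snapshot of the matching values read from the monitor.
pub struct MonitorSnapshot {
    pub holder_commitment_number: u64,
    pub min_seen_secret: u64,
    pub counterparty_commitment_number: u64,
    pub latest_update_id: u64,
}

/// Commitment numbers count down as the channel advances, so a larger
/// number on the channel side means the channel has seen fewer states than
/// the monitor. Update ids count up, so a smaller id means the same thing.
pub fn channel_is_stale(chan: &ChannelSnapshot, mon: &MonitorSnapshot) -> bool {
    chan.holder_commitment_number > mon.holder_commitment_number
        || chan.revoked_counterparty_commitment_number > mon.min_seen_secret
        || chan.counterparty_commitment_number > mon.counterparty_commitment_number
        || chan.latest_monitor_update_id < mon.latest_update_id
}
```

Any single comparison failing is enough to treat the channel as outdated, which matches the `||` chain in the condition.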
Step 2: Closed Channel Monitor Processing (lines 18878–18935)
For monitors without a corresponding Channel (already closed), track their
latest update IDs in closed_channel_monitor_update_ids and queue a force-close
monitor update for any monitor whose state still needs one.
Step 3: In-Flight Monitor Update Replay (lines 18970–19205)
The handle_in_flight_updates! macro (lines 18982–19048) processes each
in_flight_monitor_updates entry:
- Compare each update's update_id against monitor.get_latest_update_id()
- If all completed: queue BackgroundEvent::MonitorUpdatesComplete with
  highest_update_id_completed
- If some pending: retain only incomplete updates, queue as
  BackgroundEvent::MonitorUpdateRegeneratedOnStartup for replay
- Validate that channel's unblocked update ID doesn't exceed monitor's ID
This macro is invoked twice: once for open channels (lines ~19050–19096) and
once for remaining closed-channel updates (lines ~19097–19139).
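The core comparison in this step can be sketched as a pure function. The names here (InFlightOutcome, classify_in_flight) are hypothetical, updates are reduced to their ids, and reporting the monitor's latest id as highest_update_id_completed is an assumption made for the sketch rather than a statement about what the macro does.

```rust
/// Outcome of comparing in-flight updates with what the monitor persisted.
#[derive(Debug, PartialEq)]
pub enum InFlightOutcome {
    /// Every in-flight update is already reflected in the monitor.
    AllComplete { highest_update_id_completed: u64 },
    /// These update ids are not yet in the monitor and must be replayed.
    Replay(Vec<u64>),
}

/// `in_flight` holds update ids in ascending order; `monitor_latest` stands
/// in for the value returned by `monitor.get_latest_update_id()`.
pub fn classify_in_flight(in_flight: &[u64], monitor_latest: u64) -> InFlightOutcome {
    // Keep only the updates the monitor has not applied yet.
    let pending: Vec<u64> = in_flight
        .iter()
        .copied()
        .filter(|id| *id > monitor_latest)
        .collect();
    if pending.is_empty() {
        // Assumption for this sketch: report the monitor's own latest id.
        InFlightOutcome::AllComplete { highest_update_id_completed: monitor_latest }
    } else {
        InFlightOutcome::Replay(pending)
    }
}
```

Under the proposal this classification disappears, because the manager would no longer persist its own copy of in-flight updates to compare against.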
Step 4: Reconstruct/Deserialize Decision (lines 19207–19239)
The key branch: should we reconstruct HTLC state from monitors or use
persisted ChannelManager state?
// Non-test: always reconstruct for version >= RECONSTRUCT_HTLCS_FROM_CHANS_VERSION (2)
let reconstruct_manager_from_monitors = _version >= RECONSTRUCT_HTLCS_FROM_CHANS_VERSION;
// Test: random or controlled via env var
Step 5: HTLC Forwarding State Reconstruction (lines 19267–19362)
Two passes over all channel monitors:
First pass (lines 19267–19333): For each monitor with an open channel
(when reconstruct_manager_from_monitors):
- Call inbound_htlcs_pending_decode() (channel.rs:7439–7448) to get WithOnion
  HTLCs → populate decode_update_add_htlcs
- Call inbound_forwarded_htlcs() (channel.rs:7452–7507) to get
  already-forwarded HTLCs → populate already_forwarded_htlcs
- For closed channels: call insert_from_monitor_on_startup() for outbound
  payments, process preimage claims via pending_outbounds.claim_htlc()
Second pass (lines 19334–19512): For each monitor:
- For open channels with reconstruct_manager_from_monitors: call
  outbound_htlc_forwards() (channel.rs:7512–7533) and prune via
  dedup_decode_update_add_htlcs() and prune_forwarded_htlc()
- For closed channels: call get_all_current_outbound_htlcs() and
  reconcile_pending_htlcs_with_monitor() for each; also handle
  get_onchain_failed_outbound_htlcs() → failed_htlcs
Step 6: Preimage Claim Replay from Monitors (lines 19514–19591)
For each monitor (open or closed), find outbound HTLCs with preimages:
- Filter via get_all_current_outbound_htlcs() for HTLCs where
  preimage_opt.is_some()
- Check that the inbound edge's monitor still exists (not archived)
- Check claimable_balances().is_empty() to skip fully-resolved monitors
- Verify counterparty_node_id.is_some() (required since 0.0.124)
- Push to pending_claims_to_replay for later execution
Step 7: RAA-Blocking Restoration (lines 19695–19770)
Reconstruct actions_blocking_raa_monitor_updates from the persisted
monitor_update_blocked_actions_per_peer:
For each MonitorUpdateCompletionAction::EmitEventOptionAndFreeOtherChannel:
- Find the blocked channel's peer state
- Push blocking_action (an RAAMonitorUpdateBlockingAction) into
  actions_blocking_raa_monitor_updates[blocked_channel_id]
- Handle edge case: pre-0.1 MPP claims where a channel blocked itself
Step 8: HTLC Deduplication (lines 19772–19798)
When reconstruct_manager_from_monitors:
- Dedup failed_htlcs against decode_update_add_htlcs
- Dedup claimable_payments against decode_update_add_htlcs
- Choose between reconstructed maps vs legacy maps (lines 19800–19809)
Step 9: ChannelManager Construction (lines 19864–19926)
The ChannelManager struct is built with the reconstructed state, including
forward_htlcs, decode_update_add_htlcs, claimable_payments, and
pending_background_events.
Step 10: MPP Claim Replay from Monitor Preimages (lines 19928–20088)
For each monitor, call get_stored_preimages() to retrieve
(PaymentHash, (PaymentPreimage, Vec<PaymentClaimDetails>)):
- Cross-reference with already_forwarded_htlcs: if an inbound HTLC was
  forwarded to a downstream channel and the downstream has the preimage,
  push it to pending_claims_to_replay (lines 19935–19968)
- For each PaymentClaimDetails:
  - Dedup via processed_claims: HashSet<Vec<MPPClaimHTLCSource>>
  - Skip if already in pending_claiming_payments
  - Create fresh PendingMPPClaim with all channels in
    channels_without_preimage
  - Call begin_claiming_payment() + claim_mpp_part() for each part
    (lines 20001–20088)
Step 11: Legacy Preimage-Without-ClaimDetails Path (lines 20090–20196)
For preimages in monitors that have no PaymentClaimDetails (pre-0.3):
- Remove payment from claimable_payments
- For each HTLC part:
  - Call claim_htlc_while_disconnected_dropping_mon_update_legacy()
    on the channel (lines 20141–20146)
  - Call provide_payment_preimage_unsafe_legacy() directly on the monitor
    (line 20164); this is explicitly unsafe and noted as only for the
    upgrade path
- Push Event::PaymentClaimed manually
Step 12: Failed HTLC and Claim Execution (lines 20200–20257)
- Call fail_htlc_backwards_internal() for all failed_htlcs
- Fail any remaining already_forwarded_htlcs that weren't pruned
  (lines 20213–20227): these are HTLCs the inbound channel thought were
  forwarded but the outbound channel doesn't have, implying they were failed
- Call claim_funds_internal() for all pending_claims_to_replay
  (lines 20229–20257)
Step 13: Helper Functions (lines 20266–20352)
- prune_forwarded_htlc() (lines 20266–20281): Remove specific HTLC from
  already_forwarded_htlcs
- reconcile_pending_htlcs_with_monitor() (lines 20285–20352): Master dedup
  function that removes HTLCs from decode_update_add_htlcs,
  forward_htlcs_legacy, and pending_intercepted_htlcs_legacy when the
  monitor has taken responsibility
Proposed Architecture
New MonitorEvent Variants
Extend MonitorEvent (channelmonitor.rs:188) to cover all HTLC resolution
outcomes, not just post-close on-chain events:
pub enum MonitorEvent {
// Existing variants (retained)
HTLCEvent(HTLCUpdate),
HolderForceClosedWithInfo { .. },
HolderForceClosed(OutPoint),
CommitmentTxConfirmed(()),
Completed { .. },
// New variants
/// An HTLC was irrevocably committed to both commitment transactions and
/// can now be forwarded/received. Generated when the ChannelMonitor
/// processes a LatestHolderCommitment update containing the HTLC and the
/// counterparty's revocation for the prior state has been received.
///
/// Replaces the current flow where Channel pushes to
/// monitor_pending_update_adds → decode_update_add_htlcs → forward_htlcs.
HTLCAccepted {
channel_id: ChannelId,
htlc: msgs::UpdateAddHTLC,
},
/// A forwarded HTLC was claimed with a preimage. The ChannelManager should
/// propagate the preimage to the inbound edge.
///
/// Replaces the current flow where claim_funds_internal() directly drives
/// the inbound channel + sets up RAA blocking on the outbound channel.
ForwardedHTLCClaimed {
source: HTLCSource,
preimage: PaymentPreimage,
downstream_value_msat: u64,
},
/// An inbound MPP payment part has been durably claimed with a preimage.
/// This event is generated but NOT immediately surfaced — it is stored in
/// deferred_restart_events and only replayed on restart to enable
/// crash-safe MPP claiming without ChannelManager-side tracking.
///
/// Replaces PendingMPPClaim, PendingMPPClaimPointer, and the complex
/// on-load reconstruction in lines 19928-20088.
InboundMPPClaimPersisted {
payment_hash: PaymentHash,
preimage: PaymentPreimage,
htlc_source: HTLCPreviousHopData,
claim_details: PaymentClaimDetails,
},
}
New MonitorEvent Acknowledgment Path
Add a method on ChannelMonitor (and the chain::Watch trait) to acknowledge
processed events:
/// A unique identifier for a MonitorEvent, used for acknowledgment.
/// Monotonically increasing per-monitor counter.
pub struct MonitorEventId(u64);
impl ChannelMonitor {
/// Acknowledge that the given MonitorEvents have been processed by the
/// ChannelManager. The monitor will remove them from its persistent state.
///
/// This should be called after the ChannelManager has durably processed
/// the events (i.e., after the ChannelManager has been re-persisted with
/// the resulting state changes).
pub fn acknowledge_monitor_events(&self, up_to_id: MonitorEventId);
}
Each MonitorEvent gets a unique MonitorEventId (monotonic counter per
monitor). Events remain in the monitor's persistent state until acknowledged.
On restart, unacknowledged events are replayed.
This is deliberately not a ChannelMonitorUpdate — acknowledgments flow in
the opposite direction and don't need the same ordering guarantees. However,
acknowledging events does trigger a monitor re-persist (since the monitor's
serialized state changed).
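A minimal sketch of the acknowledgment model described above, assuming a per-monitor monotonic counter and an in-memory queue. PendingEvents and SketchEvent are hypothetical names; SketchEvent stands in for the real MonitorEvent enum, and persistence is left out entirely.

```rust
use std::collections::VecDeque;

/// Mirrors the `MonitorEventId` proposed above: a per-monitor counter.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct MonitorEventId(pub u64);

/// Hypothetical stand-in for `MonitorEvent`; only the shape matters here.
#[derive(Clone, Debug, PartialEq)]
pub enum SketchEvent {
    HtlcAccepted { htlc_id: u64 },
    ForwardedHtlcClaimed { htlc_id: u64 },
}

/// Events stay queued until acknowledged instead of being drained on read.
#[derive(Default)]
pub struct PendingEvents {
    next_id: u64,
    queue: VecDeque<(MonitorEventId, SketchEvent)>,
}

impl PendingEvents {
    /// Queue an event and assign it the next monotonic id.
    pub fn push(&mut self, event: SketchEvent) -> MonitorEventId {
        let id = MonitorEventId(self.next_id);
        self.next_id += 1;
        self.queue.push_back((id, event));
        id
    }

    /// Return every unacknowledged event without removing it, so a reader
    /// after a restart sees the same list as a reader before it.
    pub fn pending(&self) -> Vec<(MonitorEventId, SketchEvent)> {
        self.queue.iter().cloned().collect()
    }

    /// Drop all events whose id is less than or equal to `up_to_id`.
    pub fn acknowledge(&mut self, up_to_id: MonitorEventId) {
        while self.queue.front().map_or(false, |(id, _)| *id <= up_to_id) {
            self.queue.pop_front();
        }
    }
}
```

Because ids are assigned in queue order, acknowledging up to an id is a simple pop from the front. In a real monitor the queue and counter would be part of the serialized state, which is why an acknowledgment implies a re-persist.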
New ChannelMonitorUpdateStep Variants
enum ChannelMonitorUpdateStep {
// Existing variants retained...
/// An HTLC has been irrevocably committed. The monitor should generate
/// an HTLCAccepted MonitorEvent. This step is sent when the Channel
/// determines the HTLC is in both commitment txns and the prior
/// counterparty state is revoked.
///
/// Replaces the monitor_pending_update_adds → decode_update_add_htlcs flow.
HTLCIrrevocablyCommitted {
update_add_htlc: msgs::UpdateAddHTLC,
},
/// The ChannelManager has decided to fulfill an HTLC with a preimage.
/// For forwarded HTLCs, the monitor should generate a ForwardedHTLCClaimed
/// event. The source identifies the inbound edge for preimage propagation.
///
/// This extends the existing PaymentPreimage step to carry source info.
FulfillHTLC {
htlc_id: u64,
preimage: PaymentPreimage,
source: HTLCSource,
},
}
Note: We may not need a new FailHTLC step. HTLC failures on open channels
still flow through normal commitment transaction negotiation. The monitor only
needs to handle failures post-close (which it already does via HTLCEvent
with payment_preimage: None).
Deferred MonitorEvents for MPP Claims
When the user calls claim_funds(preimage) for an MPP payment:
- The ChannelManager sends ChannelMonitorUpdates with
  PaymentPreimage + PaymentClaimDetails steps to each channel's monitor
  (same as today).
- Each ChannelMonitor, upon processing the preimage update, stores an
  InboundMPPClaimPersisted event in a new deferred_restart_events list
  (NOT in pending_monitor_events). This event is persisted with the monitor.
- On restart, the ChannelManager calls a new get_restart_events() method
  (or the existing get_and_clear_pending_monitor_events() is enhanced).
  Monitors return InboundMPPClaimPersisted events. The ChannelManager
  uses these to identify which MPP parts have been claimed and which haven't,
  then claims any missing parts.
- Once all MPP parts across all channels have the preimage durably stored
  (confirmed by all monitors having the InboundMPPClaimPersisted event),
  the ChannelManager acknowledges all the InboundMPPClaimPersisted
  events, removing them from the monitors.
This replaces the current on-load logic that iterates all monitors via
get_stored_preimages() and cross-references with claimable_payments /
pending_claiming_payments state (lines 19928–20088).
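The restart decision in the steps above reduces to a set difference. The sketch below assumes the full list of MPP parts is recoverable from the replayed claim details and identifies each part by a hypothetical u32 channel id; parts_missing_preimage and can_acknowledge are illustrative names, not proposed API.

```rust
use std::collections::BTreeSet;

/// Returns the MPP parts that still need the preimage after a restart.
/// `all_parts` is every part of the payment; `persisted` is the set of parts
/// whose monitor replayed an `InboundMPPClaimPersisted` event.
pub fn parts_missing_preimage(all_parts: &[u32], persisted: &[u32]) -> Vec<u32> {
    let have: BTreeSet<u32> = persisted.iter().copied().collect();
    all_parts
        .iter()
        .copied()
        .filter(|part| !have.contains(part))
        .collect()
}

/// The deferred events may only be acknowledged once no part is missing.
pub fn can_acknowledge(all_parts: &[u32], persisted: &[u32]) -> bool {
    parts_missing_preimage(all_parts, persisted).is_empty()
}
```

The manager would claim each part returned by parts_missing_preimage and re-check can_acknowledge as the resulting monitor updates complete.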
Resolution Flow (Open Channel — HTLC Acceptance)
ChannelManager Channel ChannelMonitor
| | |
|-- receive update_add_htlc --> | |
| |-- CMU: LatestHolder ->|
| | |
|<-- commitment_signed ---------| |
| |-- CMU: CommitSecret ->|
| | |
|<-- revoke_and_ack ------------| |
| | |
| [Channel confirms HTLC irrevocably committed] |
| |-- CMU: HTLCIrrev. -->|
| | Committed |
| |
|<------------- MonitorEvent::HTLCAccepted ------------|
| |
|-- [decode onion, forward/receive decision] |
| |
Resolution Flow (Claiming a Forwarded HTLC)
ChannelManager ChannelMonitor (downstream)
| |
|<-- MonitorEvent::HTLCEvent ------| (preimage from counterparty claim)
| (or during normal operation: |
| preimage arrives via |
| update_fulfill_htlc) |
| |
|-- CMU: FulfillHTLC + source ---->|
| |
| [Monitor stores preimage, generates ForwardedHTLCClaimed]
| |
|<-- MonitorEvent::ForwardedHTLC --|
| Claimed |
| |
|-- [send preimage to inbound |
| channel's monitor via CMU] |
| |
|-- [once inbound confirmed: |
| acknowledge event] |
| |
Resolution Flow (Restart)
ChannelManager (new) ChannelMonitor (from disk)
| |
|-- get_pending_monitor_events() ->|
| + get_restart_events() |
| |
|<-- [all unacknowledged events] --|
| (HTLCAccepted, ForwardedHTLCClaimed,
| InboundMPPClaimPersisted, etc.)
| |
|-- [process each event as if |
| receiving it for first time] |
| |
|-- acknowledge_monitor_events() ->|
| |
No reconstruction logic needed. The monitor state IS the source of truth.
What Can Be Removed
Fields That Can Be Eliminated
In PeerState (channelmanager.rs:1709–1782)
| Field | Line | Why Removable |
|---|---|---|
| monitor_update_blocked_actions | 1760 | Completion actions move into monitor; ChannelManager no longer queues post-completion work |
| actions_blocking_raa_monitor_updates | 1765 | RAA blocking is eliminated entirely; safety comes from event acknowledgment |
| closed_channel_monitor_update_ids | 1775 | Monitors self-track their update IDs; ChannelManager no longer mirrors this for on-load dedup |
In ChannelContext (channel.rs:3120–3340)
| Field | Lines | Why Removable |
|---|---|---|
| monitor_pending_forwards | 3128 | Forwarding driven by MonitorEvent::HTLCAccepted; no buffering needed |
| monitor_pending_failures | 3129 | Failure propagation driven by MonitorEvent::HTLCEvent; no buffering needed |
| monitor_pending_finalized_fulfills | 3130 | Fulfill tracking moves to monitor's persistent events |
| monitor_pending_update_adds | 3131 | Replaced by MonitorEvent::HTLCAccepted |
| blocked_monitor_updates | 3339 | RAA blocking eliminated; all updates flow through immediately |
This also eliminates the race condition described in channel.rs:3124–3127.
In ChannelManagerData (channelmanager.rs:18013–18041)
| Field | Line | Why Removable |
|---|---|---|
| monitor_update_blocked_actions_per_peer | 18025–18026 | No more blocked actions to persist |
| in_flight_monitor_updates | 18030 | Monitor knows its own state; no need for CM to track |
| forward_htlcs_legacy | 18036 | Legacy map replaced by monitor events |
| pending_intercepted_htlcs_legacy | 18037 | Legacy map replaced by monitor events |
| decode_update_add_htlcs_legacy | 18038 | Legacy map replaced by monitor events |
In ChannelManager (runtime state, channelmanager.rs:2780–2820)
| Field | Line | Why Removable |
|---|---|---|
| decode_update_add_htlcs | 2807 | HTLCs-to-decode communicated via MonitorEvent::HTLCAccepted; onion decode happens inline |
Note: forward_htlcs (line 2789) and pending_intercepted_htlcs (line 2800)
are still needed for the forwarding pipeline. They are populated from monitor
events rather than from channel state.
Enums/Types That Can Be Simplified or Removed
| Type | Location | Why Removable |
|---|---|---|
| RAAMonitorUpdateBlockingAction | channelmanager.rs:1668–1700 | Entire enum: both variants serve RAA blocking which is eliminated |
| MonitorUpdateCompletionAction | channelmanager.rs:1454–1495 | EmitEventOptionAndFreeOtherChannel and FreeDuplicateClaimImmediately removed; PaymentClaimed simplified or moved to monitor |
| EventCompletionAction::ReleaseRAAChannelMonitorUpdate | channelmanager.rs:1557–1562 | No RAA blocking to release |
| EventUnblockedChannel | channelmanager.rs:1414–1420 | Only existed to carry RAA blocker info |
| PostMonitorUpdateChanResume | channelmanager.rs:1521–1538 | Drastically simplified: no more htlc_forwards, decode_update_add_htlcs, failed_htlcs fields needed |
| BackgroundEvent::MonitorUpdateRegeneratedOnStartup | channelmanager.rs:1397–1402 | No in-flight updates to regenerate on load |
| BackgroundEvent::MonitorUpdatesComplete | channelmanager.rs:1406–1410 | Simplified: completion tracking moves to monitor |
| PendingMPPClaim | channelmanager.rs:1609–1612 | Replaced by deferred InboundMPPClaimPersisted events |
| PendingMPPClaimPointer | channelmanager.rs:1650 | Goes with PendingMPPClaim |
| MPPClaimHTLCSource | channelmanager.rs:1618–1623 | Goes with MPP claim tracking |
| HTLCClaimSource | channelmanager.rs:1590–1595 | Only used during on-load reconstruction |
| MonitorRestoreUpdates | channel.rs:1176–1197 | Most fields removable; only raa, commitment_update, and protocol messages retained |
| InboundUpdateAdd::Forwarded | channel.rs:349–363 | Channel no longer needs to track forwarding state for on-load reconstruction |
Functions/Methods That Can Be Removed or Drastically Simplified
On-Load Reconstruction (the big win)
| Function/Section | Lines | Current Purpose | After Change |
|---|---|---|---|
| handle_in_flight_updates! macro | 18982–19048 | Replay in-flight monitor updates | Remove entirely: monitors know their own state |
| In-flight update handling for open channels | 19050–19096 | Match in-flight updates to open channels | Remove entirely |
| In-flight update handling for closed channels | 19097–19139 | Match in-flight updates to closed channels | Remove entirely |
| HTLC reconstruction from channels | 19274–19301 | Rebuild decode_update_add_htlcs and already_forwarded_htlcs from Channel state | Remove entirely: replaced by MonitorEvent::HTLCAccepted replay |
| Outbound forward dedup | 19334–19362 | Call outbound_htlc_forwards() and dedup_decode_update_add_htlcs() | Remove entirely |
| Outbound HTLC processing for closed channels | 19365–19512 | Cross-reference monitors with pending_outbound_payments, handle preimage claims and failures | Drastically simplify: monitor events carry all needed info; outbound payment tracking may still be needed for PaymentSent events |
| Preimage claim replay | 19514–19591 | Find preimages in monitors, check upstream monitors, queue for replay | Remove entirely: ForwardedHTLCClaimed events are replayed automatically |
| RAA-blocking restoration | 19695–19770 | Reconstruct actions_blocking_raa_monitor_updates from persisted monitor_update_blocked_actions_per_peer | Remove entirely |
| HTLC deduplication | 19772–19798 | Remove already-processed HTLCs from decode queues | Remove entirely: no decode queues to dedup |
| Legacy map selection | 19800–19809 | Choose between reconstructed and legacy maps | Remove entirely |
| dedup_decode_update_add_htlcs() | ~18525–18555 | Prevent double-forwarding by matching on prev_outbound_scid_alias and htlc_id | Remove entirely |
| prune_forwarded_htlc() | 20266–20281 | Remove forwarded HTLCs from tracking | Remove entirely |
| reconcile_pending_htlcs_with_monitor() | 20285–20352 | Master dedup function across decode_update_add_htlcs, forward_htlcs_legacy, pending_intercepted_htlcs_legacy | Remove entirely |
| MPP claim replay from get_stored_preimages() | 19928–20088 | Reconstruct PendingMPPClaim, call begin_claiming_payment() + claim_mpp_part() | Remove entirely: replaced by InboundMPPClaimPersisted replay |
| Legacy preimage path (no PaymentClaimDetails) | 20090–20196 | claim_htlc_while_disconnected_dropping_mon_update_legacy() + provide_payment_preimage_unsafe_legacy() | Remove entirely: no longer needed post-migration |
| Failed HTLC backwards propagation | 20200–20212 | fail_htlc_backwards_internal() for failed_htlcs | Simplified: some of this may still be needed for outbound payments, but forwarded-HTLC failures are handled by monitor events |
| Already-forwarded HTLC failure | 20213–20227 | Fail HTLCs that appear forwarded but are missing from outbound edge | Remove entirely: monitor events make this reconciliation unnecessary |
| Claim replay execution | 20229–20257 | claim_funds_internal() for pending_claims_to_replay | Remove entirely: replayed via MonitorEvents |
Estimated lines removed from from_channel_manager_data(): ~1200–1400 lines.
RAA-Blocking Infrastructure
| Function | Lines (channelmanager.rs) | Purpose | After Change |
|---|---|---|---|
| raa_monitor_updates_held() | 12671–12688 | Check actions_blocking_raa_monitor_updates + pending events for ReleaseRAAChannelMonitorUpdate | Remove entirely |
| test_raa_monitor_updates_held() | 12691–12707 | Test helper | Remove entirely |
| get_and_clear_pending_raa_blockers() | ~14935–14955 | Extract blockers for startup | Remove entirely |
| handle_monitor_update_release() | ~14962–15036 | Remove RAAMonitorUpdateBlockingAction, unblock channel's blocked_monitor_updates via unblock_next_blocked_monitor_update() | Remove entirely |
| handle_monitor_update_completion_actions() | 10103–10255 | Process MonitorUpdateCompletionAction variants: track PendingMPPClaim progress, emit events, free RAA blockers | Drastically simplify; only simple event emission remains |
| handle_post_event_actions() (ReleaseRAA path) | ~15040–15113 | When user handles PaymentForwarded, release the downstream channel's RAA via EventCompletionAction::ReleaseRAAChannelMonitorUpdate | Remove ReleaseRAA path |
Channel Methods (channel.rs)
| Method | Lines (channel.rs) | Purpose | After Change |
|---|---|---|---|
| monitor_updating_paused() | 9079–9094 | Push pending forwards/failures/fulfills to monitor_pending_* fields | Remove entirely; no pending queues |
| monitor_updating_restored() | 9100–9234 | Drain monitor_pending_* fields into MonitorRestoreUpdates | Drastically simplify; only protocol message resends (raa, commitment_update) remain |
| unblock_next_blocked_monitor_update() | ~10755–10763 | Dequeue from blocked_monitor_updates | Remove entirely |
| push_ret_blockable_mon_update() | ~10768–10779 | Conditionally block or return monitor update | Remove entirely; updates always flow through |
| on_startup_drop_completed_blocked_mon_updates_through() | ~10784–10799 | Drop stale blocked updates on startup | Remove entirely |
| get_latest_unblocked_monitor_update_id() | ~4149–4154 | Track boundary of unblocked updates | Remove entirely; no blocking concept |
| inbound_htlcs_pending_decode() | 7439–7448 | Extract WithOnion HTLCs for on-load decode queue rebuild | Remove entirely |
| inbound_forwarded_htlcs() | 7452–7507 | Extract forwarded HTLCs for on-load already_forwarded_htlcs rebuild | Remove entirely |
| has_legacy_inbound_htlcs() | 7428–7435 | Detect pre-0.3 HTLC state (InboundUpdateAdd::Legacy) | Remove entirely (version migration) |
| outbound_htlc_forwards() | 7512–7533 | Extract outbound forwards for on-load dedup | Remove entirely |
| claim_htlc_while_disconnected_dropping_mon_update_legacy() | (channel.rs) | Legacy on-load claim that bypasses normal monitor update flow | Remove entirely |
Why RAA-Blocking Is Eliminated
The current RAA-blocking mechanism exists because:
- When a forwarded payment is claimed, the downstream channel's RAA monitor
  update (which, as a side effect of revoking the prior state, removes the
  preimage from one commitment transaction) must not complete before the
  upstream channel's monitor update (which adds the preimage) is durable.
  Otherwise, on restart, the preimage might be lost from the downstream
  monitor while the upstream monitor never received it.
- For MPP payments, all channel monitors must have the preimage before any of
  them can have it removed from a commitment transaction via revocation. The
  PendingMPPClaim shared pointer coordinates this across channels.
In the new architecture, this is handled naturally:
- The ChannelMonitor stores preimages in payment_preimages
  (channelmonitor.rs:1272) durably. The preimage is never "lost" from the
  monitor's state due to a revocation, because it lives in a separate map.
- ForwardedHTLCClaimed events persist until the ChannelManager acknowledges
  them. The ChannelManager only acknowledges after confirming the preimage is
  durable on the inbound edge.
- For MPP, InboundMPPClaimPersisted events persist in each monitor until all
  parts are confirmed claimed. On restart, any missing parts are re-claimed.
- ChannelMonitorUpdates flow through immediately: there is no hold_mon_update
  parameter on revoke_and_ack() and no blocked_monitor_updates queue. The
  safety guarantee comes from the acknowledgment path, not from blocking the
  update path.
Detailed Design: Inbound MPP Claiming
Current Approach (Complex)
- User calls claim_funds(preimage) (channelmanager.rs:9206)
- begin_claiming_payment() moves payment from claimable_payments to
  pending_claiming_payments (channelmanager.rs:~1319–1380)
- For each MPP part, claim_mpp_part() (channelmanager.rs:9563+):
  a. Calls Channel::get_update_fulfill_htlc_and_commit() for open channels
  b. Creates ChannelMonitorUpdate with PaymentPreimage step + PaymentClaimDetails
  c. Sets up shared PendingMPPClaim (channels_without_preimage/channels_with_preimage)
  d. Creates RAAMonitorUpdateBlockingAction::ClaimedMPPPayment per channel
  e. Creates MonitorUpdateCompletionAction::PaymentClaimed per channel
- As each monitor update completes,
  handle_monitor_update_completion_actions() (lines 10147–10155) moves
  entries between PendingMPPClaim lists
- When all channels have preimage: free all RAA blockers, emit
  Event::PaymentClaimed
- On restart: iterate all monitors' get_stored_preimages(), reconstruct
  PendingMPPClaim, dedup via processed_claims, call
  begin_claiming_payment() + claim_mpp_part() for each
New Approach (Simple)
- User calls claim_funds(preimage)
- ChannelManager sends ChannelMonitorUpdate with PaymentPreimage +
  PaymentClaimDetails to each channel's monitor
- Each ChannelMonitor, upon processing the update:
  a. Stores the preimage in payment_preimages
  b. Stores an InboundMPPClaimPersisted event in deferred_restart_events
  c. For open channels: fulfills the HTLC in the commitment transaction
     normally
- The ChannelManager tracks confirmed parts via MonitorEvent::Completed
- Once all parts confirmed: emit Event::PaymentClaimed, acknowledge all
  InboundMPPClaimPersisted events
- On restart: monitors replay InboundMPPClaimPersisted events →
  ChannelManager identifies which parts were claimed → claims missing
  parts → done
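The bookkeeping the ChannelManager still needs for this flow can be modeled in a few lines. The following is only a sketch under stated assumptions: MppClaimProgress, part_confirmed and missing_parts are hypothetical names, channel identifiers are plain u64 stand-ins for ChannelId, and it assumes a replayed InboundMPPClaimPersisted event (through its PaymentClaimDetails) lists every part of the payment.

```rust
use std::collections::BTreeSet;

/// Stand-in for a channel identifier; real code would use `ChannelId`.
type ChanId = u64;

/// Tracks which parts of one MPP claim have a durably stored preimage.
/// Hypothetical helper, not an existing LDK type.
pub struct MppClaimProgress {
    all_parts: BTreeSet<ChanId>,
    confirmed: BTreeSet<ChanId>,
}

impl MppClaimProgress {
    /// `all_parts` is assumed to come from the claim details carried by the event.
    pub fn new(all_parts: Vec<ChanId>) -> Self {
        Self { all_parts: all_parts.into_iter().collect(), confirmed: BTreeSet::new() }
    }

    /// Record that the monitor for `chan` persisted the preimage update.
    /// Returns true once every part is confirmed, i.e. when it would be safe
    /// to emit Event::PaymentClaimed and acknowledge the deferred events.
    pub fn part_confirmed(&mut self, chan: ChanId) -> bool {
        if self.all_parts.contains(&chan) {
            self.confirmed.insert(chan);
        }
        self.confirmed == self.all_parts
    }

    /// On restart: the parts that still need to be claimed, in sorted order.
    pub fn missing_parts(&self) -> Vec<ChanId> {
        self.all_parts.difference(&self.confirmed).copied().collect()
    }
}
```

On restart, the confirmed set would be seeded from the monitors that already hold the preimage, and missing_parts() gives the channels to re-claim on.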
What Goes Away
- PendingMPPClaim / PendingMPPClaimPointer (channelmanager.rs:1609–1663)
- RAAMonitorUpdateBlockingAction::ClaimedMPPPayment variant (line 1685)
- The completion-tracking in handle_monitor_update_completion_actions()
  (channelmanager.rs:10116–10223)
- On-load MPP reconstruction (channelmanager.rs:19928–20088, ~160 lines)
- Legacy preimage path (channelmanager.rs:20090–20196, ~106 lines)
- MPPClaimHTLCSource, HTLCClaimSource, processed_claims HashSet
Detailed Design: Forwarded HTLC Claiming
Current Approach
- Downstream channel receives preimage from counterparty (via
  update_fulfill_htlc or on-chain claim)
- ChannelManager::claim_funds_internal() is called
- For the upstream (inbound) channel:
  a. Channel::get_update_fulfill_htlc_and_commit() generates a
     ChannelMonitorUpdate on the upstream channel with a PaymentPreimage
     step
  b. A RAAMonitorUpdateBlockingAction::ForwardedPaymentInboundClaim
     (channelmanager.rs:1672–1677) blocks the downstream channel's next RAA
  c. A MonitorUpdateCompletionAction::EmitEventOptionAndFreeOtherChannel
     (channelmanager.rs:1474–1477) is stored on the upstream channel's
     monitor_update_blocked_actions, pairing Event::PaymentForwarded with
     an EventUnblockedChannel that will free the downstream channel
- When the upstream monitor update completes:
  handle_monitor_update_completion_actions() emits PaymentForwarded and
  calls handle_monitor_update_release() to remove the RAA blocker
- On restart: reconstruct blockers from
  monitor_update_blocked_actions_per_peer (lines 19695–19770)
New Approach
- Downstream channel receives preimage from counterparty
- ChannelManager sends a FulfillHTLC ChannelMonitorUpdate to the
  downstream ChannelMonitor, including the HTLCSource identifying the
  upstream edge
- Downstream ChannelMonitor generates ForwardedHTLCClaimed event
  (persistent until acknowledged)
- ChannelManager receives ForwardedHTLCClaimed, sends preimage to
  upstream channel via ChannelMonitorUpdate with PaymentPreimage step
- When upstream monitor confirms preimage storage (via
  MonitorEvent::Completed): ChannelManager acknowledges the
  ForwardedHTLCClaimed event on the downstream monitor
- ChannelManager emits Event::PaymentForwarded
- On restart: downstream monitor replays unacknowledged
  ForwardedHTLCClaimed → ChannelManager re-sends preimage to upstream →
  safe
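The ordering that makes this safe (send the preimage upstream first, acknowledge downstream only once the upstream write is durable) can be expressed as a tiny state machine. This is a sketch only: Action, InFlightForwardClaim and both method names are hypothetical, and channels and event IDs are plain u64 stand-ins.

```rust
/// Actions the ChannelManager would take; all names are hypothetical.
#[derive(Debug, PartialEq, Eq)]
pub enum Action {
    SendPreimageUpstream { upstream_chan: u64 },
    AckDownstreamEvent { downstream_chan: u64, event_id: u64 },
    EmitPaymentForwarded,
}

/// One forwarded claim waiting for the upstream monitor to persist the preimage.
pub struct InFlightForwardClaim {
    downstream_chan: u64,
    downstream_event_id: u64,
    upstream_durable: bool,
}

impl InFlightForwardClaim {
    /// Called when ForwardedHTLCClaimed is received, including on replay after
    /// a restart. Only the upstream preimage update is issued at this point.
    pub fn on_forwarded_htlc_claimed(
        upstream_chan: u64, downstream_chan: u64, downstream_event_id: u64,
    ) -> (Self, Vec<Action>) {
        let claim = Self { downstream_chan, downstream_event_id, upstream_durable: false };
        (claim, vec![Action::SendPreimageUpstream { upstream_chan }])
    }

    /// Called when the upstream monitor reports the preimage update as persisted.
    /// Only now is the downstream event acknowledged and the user event emitted.
    pub fn on_upstream_persisted(&mut self) -> Vec<Action> {
        if self.upstream_durable {
            return Vec::new(); // duplicate completion, nothing more to do
        }
        self.upstream_durable = true;
        vec![
            Action::AckDownstreamEvent {
                downstream_chan: self.downstream_chan,
                event_id: self.downstream_event_id,
            },
            Action::EmitPaymentForwarded,
        ]
    }
}
```

A crash between the two calls leaves the downstream event unacknowledged, so the first step simply runs again on restart.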
What Goes Away
- RAAMonitorUpdateBlockingAction::ForwardedPaymentInboundClaim variant
  (channelmanager.rs:1672–1677)
- MonitorUpdateCompletionAction::EmitEventOptionAndFreeOtherChannel
  (channelmanager.rs:1474–1477)
- EventCompletionAction::ReleaseRAAChannelMonitorUpdate
  (channelmanager.rs:1557–1562)
- EventUnblockedChannel struct (channelmanager.rs:1414–1447)
- The entire blocked_monitor_updates mechanism in Channel (channel.rs:3339)
- All hold_mon_update logic in Channel::revoke_and_ack()
  (channel.rs:~8675–8694)
- The RAA-blocking restoration on load (channelmanager.rs:19695–19770)
Detailed Design: HTLC Forwarding via MonitorEvent
Current Approach
When an HTLC becomes irrevocably committed in the Channel:
- revoke_and_ack() (channel.rs:~8587) transitions it to Committed with
  InboundUpdateAdd::WithOnion
- The update_add_htlc message is pushed to monitor_pending_update_adds
- When monitor update completes, monitor_updating_restored() drains
  monitor_pending_update_adds into MonitorRestoreUpdates::pending_update_adds
- ChannelManager puts them into decode_update_add_htlcs map (keyed by
  outbound SCID alias)
- process_pending_update_add_htlcs() (channelmanager.rs:7195–7535) decodes
  each onion and routes to forward_htlcs
- process_pending_htlc_forwards() (channelmanager.rs:7558–7645) forwards or
  receives
On restart, inbound_htlcs_pending_decode() extracts WithOnion HTLCs to
rebuild the decode_update_add_htlcs map, and complex deduplication prevents
double-forwarding.
New Approach
- When revoke_and_ack() confirms an HTLC as irrevocably committed, the
  Channel sends a ChannelMonitorUpdate with HTLCIrrevocablyCommitted
  step containing the update_add_htlc message
- The ChannelMonitor processes this step and generates a
  MonitorEvent::HTLCAccepted event (persistent until acknowledged)
- ChannelManager receives HTLCAccepted, decodes the onion, and routes
  to forward_htlcs or handles as a final payment
- After forwarding/receiving, ChannelManager acknowledges the event
- On restart: unacknowledged HTLCAccepted events are replayed → onion is
  decoded again → forwarding happens again → idempotent (the downstream
  channel will reject the duplicate update_add_htlc)
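The replay step relies on the outbound side dropping a forward it has already made. How that rejection works is not spelled out here; one possible shape, purely as an assumption, is to key outbound HTLCs on the inbound edge they came from. PrevHop, OutboundChannel and forward_once below are hypothetical names for illustration.

```rust
use std::collections::HashMap;

/// Identifies the inbound edge of a forward (stand-in for previous-hop data).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct PrevHop {
    pub inbound_chan: u64,
    pub inbound_htlc_id: u64,
}

/// Minimal model of the outbound channel's forwarding state.
#[derive(Default)]
pub struct OutboundChannel {
    next_htlc_id: u64,
    /// Outbound HTLC id for each inbound source that was already forwarded.
    forwarded: HashMap<PrevHop, u64>,
}

impl OutboundChannel {
    /// Returns Some(outbound id) when the HTLC is newly added, or None when
    /// the same inbound source was already forwarded (a replayed event).
    pub fn forward_once(&mut self, source: PrevHop) -> Option<u64> {
        if self.forwarded.contains_key(&source) {
            return None;
        }
        let id = self.next_htlc_id;
        self.next_htlc_id += 1;
        self.forwarded.insert(source, id);
        Some(id)
    }
}
```

With a check of this kind, replaying an HTLCAccepted event after a crash is a no-op on the outbound edge instead of a second forward.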
What Goes Away
- decode_update_add_htlcs map (channelmanager.rs:2807)
- monitor_pending_update_adds field (channel.rs:3131)
- monitor_pending_forwards field (channel.rs:3128): forwarding is driven
  by events, not buffered in the channel
- InboundUpdateAdd::WithOnion variant: the monitor holds the raw
  update_add_htlc until acknowledged, so the channel doesn't need to
  store it
- InboundUpdateAdd::Forwarded variant: no longer needed for on-load
  reconstruction
- inbound_htlcs_pending_decode() (channel.rs:7439–7448)
- inbound_forwarded_htlcs() (channel.rs:7452–7507)
- outbound_htlc_forwards() (channel.rs:7512–7533)
- All on-load dedup logic (dedup_decode_update_add_htlcs,
  reconcile_pending_htlcs_with_monitor, prune_forwarded_htlc)
- The already_forwarded_htlcs temporary map in from_channel_manager_data()
  (lines 19251–19254)
New ChannelMonitor State
The ChannelMonitorImpl (channelmonitor.rs:1199–1400) needs new fields:
    pub(crate) struct ChannelMonitorImpl<Signer: EcdsaChannelSigner> {
        // ... existing fields ...

        /// MonitorEvents that have been generated but not yet acknowledged by the
        /// ChannelManager. These survive serialization and are replayed on restart.
        /// Replaces the fire-and-forget `pending_monitor_events` for new event types.
        pending_unacknowledged_events: Vec<(MonitorEventId, MonitorEvent)>,

        /// The next MonitorEventId to assign.
        next_event_id: u64,

        /// Deferred events (e.g., InboundMPPClaimPersisted) that should not be
        /// surfaced to the ChannelManager during normal operation but should be
        /// replayed on restart. These are stored separately so that
        /// get_and_clear_pending_monitor_events() doesn't return them.
        deferred_restart_events: Vec<(MonitorEventId, MonitorEvent)>,
    }
The existing pending_monitor_events: Vec<MonitorEvent> field
(channelmonitor.rs:1283) is kept for backwards compatibility with existing
MonitorEvent variants (on-chain events) during the migration period, then
eventually deprecated.
Serialization Changes
pending_unacknowledged_events and deferred_restart_events must be
serialized as new TLV fields in ChannelMonitorImpl. The existing
MonitorEvent serialization (channelmonitor.rs:228–246) supports existing
variants; new variants need new TLV tags in the MonitorEvent enum.
get_and_clear_pending_monitor_events() Changes
The current implementation (channelmonitor.rs:4373–4377) does mem::swap to
drain events. In the new design:
    fn get_pending_monitor_events(&self) -> Vec<(MonitorEventId, MonitorEvent)> {
        // Return copies of unacknowledged events without clearing.
        self.pending_unacknowledged_events.clone()
    }

    fn get_restart_events(&self) -> Vec<(MonitorEventId, MonitorEvent)> {
        // Called only on restart. Returns deferred events.
        self.deferred_restart_events.clone()
    }

    fn acknowledge_events(&mut self, up_to_id: MonitorEventId) {
        self.pending_unacknowledged_events.retain(|(id, _)| id.0 > up_to_id.0);
        self.deferred_restart_events.retain(|(id, _)| id.0 > up_to_id.0);
    }
The ChannelManager tracks the highest acknowledged MonitorEventId per
monitor (either in its own state or by querying the monitor) to distinguish
"new" from "already-processed" events during normal operation. A simple
approach: after processing new events, immediately acknowledge them. The
monitor then re-persists without them; if the node crashes before that
re-persist completes, the events replay on restart, which is the desired
behavior.
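As a rough, self-contained model of these persist-until-acknowledged semantics (not LDK code): MonitorEvent here is a heavily simplified stand-in, and EventStore, push, push_deferred and acknowledge_deferred are hypothetical names. One assumption is worth flagging. Because both queues draw IDs from the single next_event_id counter, a cumulative up_to_id acknowledgment applied to both queues would also drop older deferred events, so this sketch acknowledges deferred events by exact ID instead.

```rust
/// Stand-in for the `MonitorEventId` newtype used above.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct MonitorEventId(pub u64);

/// Hypothetical, heavily simplified event payloads (not LDK's real types).
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum MonitorEvent {
    HTLCAccepted { htlc_id: u64 },
    ForwardedHTLCClaimed { htlc_id: u64 },
    InboundMPPClaimPersisted { payment_hash: [u8; 32] },
}

#[derive(Default)]
pub struct EventStore {
    pending_unacknowledged_events: Vec<(MonitorEventId, MonitorEvent)>,
    deferred_restart_events: Vec<(MonitorEventId, MonitorEvent)>,
    next_event_id: u64,
}

impl EventStore {
    fn next_id(&mut self) -> MonitorEventId {
        let id = MonitorEventId(self.next_event_id);
        self.next_event_id += 1;
        id
    }

    /// Queue an event that is surfaced during normal operation.
    pub fn push(&mut self, event: MonitorEvent) -> MonitorEventId {
        let id = self.next_id();
        self.pending_unacknowledged_events.push((id, event));
        id
    }

    /// Queue an event that is only surfaced on restart.
    pub fn push_deferred(&mut self, event: MonitorEvent) -> MonitorEventId {
        let id = self.next_id();
        self.deferred_restart_events.push((id, event));
        id
    }

    /// Reading does not clear: events stay until acknowledged.
    pub fn get_pending_monitor_events(&self) -> Vec<(MonitorEventId, MonitorEvent)> {
        self.pending_unacknowledged_events.clone()
    }

    pub fn get_restart_events(&self) -> Vec<(MonitorEventId, MonitorEvent)> {
        self.deferred_restart_events.clone()
    }

    /// Cumulative acknowledgment, applied to normal events only.
    pub fn acknowledge_events(&mut self, up_to_id: MonitorEventId) {
        self.pending_unacknowledged_events.retain(|(id, _)| *id > up_to_id);
    }

    /// Exact-ID acknowledgment for deferred events (an assumption of this sketch).
    pub fn acknowledge_deferred(&mut self, id: MonitorEventId) {
        self.deferred_restart_events.retain(|(eid, _)| *eid != id);
    }
}
```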
Interaction with Existing Constraints
"MonitorEvents MUST NOT be generated during update processing"
The existing constraint (channelmonitor.rs:~1274–1282) says:
MonitorEvents MUST NOT be generated during update processing, only generated
during chain data processing.
This constraint exists because of a race in ChainMonitor::update_channel
where the in-memory state is updated under a read-lock, but persistence hasn't
completed yet. If events were generated during update processing and consumed
before persistence, a restart would replay the update but the event would be
lost.
In the new design, this constraint is relaxed because:
- Events are persistent-until-acknowledged
- Even if an event is generated during update processing and the update isn't
  persisted, on restart the update will be replayed and the event regenerated
- The acknowledgment path ensures the ChannelManager won't "lose" events
However, we must ensure idempotent event generation — replaying a
ChannelMonitorUpdate must not duplicate events. The monitor should check
whether an event for a given HTLC already exists before generating a new one.
This is straightforward since events carry enough identifying information
(channel_id + htlc_id) for dedup.
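A minimal sketch of that dedup, keyed on (channel_id, htlc_id) as suggested above. AcceptedEventLog and its methods are hypothetical names. The sketch also assumes the monitor remembers keys it has already generated events for, since checking only the still-pending list would regenerate an event after it had been acknowledged.

```rust
use std::collections::HashSet;

/// Dedup key suggested by the text: (channel_id, htlc_id).
type EventKey = ([u8; 32], u64);

/// Hypothetical per-monitor log of generated HTLCAccepted events.
#[derive(Default)]
pub struct AcceptedEventLog {
    /// Every key an event was ever generated for (assumed to be pruned once
    /// the HTLC is fully resolved; pruning is not modeled here).
    seen: HashSet<EventKey>,
    /// Events generated but not yet acknowledged.
    pending: Vec<EventKey>,
}

impl AcceptedEventLog {
    /// Apply (or re-apply) an HTLCIrrevocablyCommitted step. Returns true if
    /// a new event was queued and false if the replay was ignored.
    pub fn on_htlc_irrevocably_committed(&mut self, channel_id: [u8; 32], htlc_id: u64) -> bool {
        let key = (channel_id, htlc_id);
        if !self.seen.insert(key) {
            return false;
        }
        self.pending.push(key);
        true
    }

    /// Drop the pending event once the ChannelManager has processed it.
    pub fn acknowledge(&mut self, channel_id: [u8; 32], htlc_id: u64) {
        self.pending.retain(|k| *k != (channel_id, htlc_id));
    }

    pub fn pending_len(&self) -> usize {
        self.pending.len()
    }
}
```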
Chain Watch Trait
The chain::Watch trait's release_pending_monitor_events() method
(chain/mod.rs:345–347) needs to change:
    /// Returns pending MonitorEvents with their IDs for acknowledgment tracking.
    fn release_pending_monitor_events(&self)
        -> Vec<(OutPoint, ChannelId, Vec<(MonitorEventId, MonitorEvent)>, PublicKey)>;

    /// Acknowledge processed events up to the given ID per monitor.
    fn acknowledge_monitor_events(&self,
        channel_id: &ChannelId, up_to_id: MonitorEventId);
Migration Strategy
Backwards Compatibility
- Old ChannelMonitor state can be read by new code. New fields are additive
  TLVs with defaults (empty vecs, zero counter).
- Old ChannelManager state can still be loaded. On first load with new code,
  the on-load reconstruction runs one final time (the existing logic is
  retained behind the version check). After the ChannelManager is
  re-persisted, all state is in the new format.
- Bump SERIALIZATION_VERSION (channelmanager.rs:17246) and
  RECONSTRUCT_HTLCS_FROM_CHANS_VERSION (channelmanager.rs:17258) to gate
  new behavior.
Phased Approach
Phase 1: Persistent MonitorEvents with acknowledgment
- Add MonitorEventId, pending_unacknowledged_events,
  deferred_restart_events to ChannelMonitorImpl
- Add acknowledge_monitor_events() to ChannelMonitor and chain::Watch
- Change get_and_clear_pending_monitor_events() to not clear for new events
- Existing MonitorEvent variants (HTLCEvent, CommitmentTxConfirmed, etc.)
  continue to use the old fire-and-forget path during this phase
Phase 2: Move HTLC forwarding to monitor-driven events
- Add HTLCIrrevocablyCommitted step and HTLCAccepted event
- Channel generates the new step in revoke_and_ack() instead of pushing
  to monitor_pending_update_adds
- ChannelManager processes HTLCAccepted in the event loop
- Remove decode_update_add_htlcs map, monitor_pending_update_adds,
  monitor_pending_forwards
- Remove inbound_htlcs_pending_decode(), inbound_forwarded_htlcs(),
  outbound_htlc_forwards(), all dedup helpers
Phase 3: Move forwarded HTLC claiming to monitor events
- Add FulfillHTLC step and ForwardedHTLCClaimed event
- Remove RAA-blocking infrastructure entirely:
  RAAMonitorUpdateBlockingAction,
  EventCompletionAction::ReleaseRAAChannelMonitorUpdate,
  blocked_monitor_updates, hold_mon_update parameter, etc.
- Remove MonitorUpdateCompletionAction::EmitEventOptionAndFreeOtherChannel
Phase 4: Move inbound MPP claiming to monitor-driven events
- Add InboundMPPClaimPersisted event
- Modify claim_funds_internal() to rely on deferred monitor events
- Remove PendingMPPClaim, PendingMPPClaimPointer,
  ClaimedMPPPayment RAA blocker
- Remove MPPClaimHTLCSource, HTLCClaimSource
Phase 5: Simplify on-load logic
- Replace all reconstruction logic with simple MonitorEvent replay loop
- Remove ~1200–1400 lines from from_channel_manager_data()
- Remove legacy map handling, legacy preimage paths,
  _legacy suffixed fields
- The on-load code reduces to:
    for (channel_id, monitor) in channel_monitors {
        for (event_id, event) in monitor.get_pending_monitor_events() {
            process_monitor_event(event);
        }
        for (event_id, event) in monitor.get_restart_events() {
            process_monitor_event(event);
        }
    }
Risks and Open Questions
- Monitor persistence size: Storing events until acknowledged increases the
  persistent monitor size. Events are relatively small (one per HTLC), but
  high-volume nodes could accumulate events if the ChannelManager is slow
  to persist. Mitigation: batch acknowledgments; keep events compact; bound
  the maximum number of unacknowledged events.
- Idempotent event generation: When a ChannelMonitorUpdate is replayed
  on restart, the monitor must not duplicate events. The implementation must
  check for existing events with the same HTLC identifier before generating
  new ones.
- Backwards compatibility on upgrade: First load with new code must bootstrap
  the new event state from the existing reconstruction. The existing
  reconstruction logic runs one final time, then the ChannelManager is
  re-persisted in the new format. This means the existing reconstruction
  code must be maintained (but can be feature-gated) until we're confident
  all users have upgraded.
- InboundHTLCState / InboundUpdateAdd simplification: The Channel
  still needs to track HTLCs for commitment transaction negotiation, but
  InboundUpdateAdd::Forwarded (which exists solely for on-load
  reconstruction) can be removed. The Channel can transition directly from
  WithOnion to "onion consumed" without tracking forwarding state.
- Timing of HTLCAccepted events: The monitor needs to know when an
  HTLC is irrevocably committed. Today, this is derived from the Channel's
  InboundHTLCState machine. In the new design, the Channel sends an
  explicit HTLCIrrevocablyCommitted step at the right moment. The monitor
  doesn't need to replicate the state machine; it just needs to generate the
  event when it receives the step. This is simpler than having the monitor
  independently track HTLC lifecycle states.
- Performance: The current "clear on read" approach for MonitorEvents is
  zero-cost at read time. Persistent events require cloning on read and
  additional serialization. However:
  - Events are small (~100 bytes each)
  - Acknowledgment can batch (one call covers many events)
  - The massive reduction in on-load complexity saves far more developer time
    than the small runtime cost
- Trampoline forwards: HTLCSource::TrampolineForward contains multiple
  previous_hop_data entries. The new event system must handle this: a
  single ForwardedHTLCClaimed event for a trampoline forward should
  carry all upstream sources. This is straightforward since the
  HTLCSource enum already handles this.
Summary
| Metric | Current | Proposed |
|---|---|---|
| On-load reconstruction lines | ~1600 (18635–20263) | ~50 (event replay loop) |
| RAA-blocking types/functions | ~15 (types, fields, methods) | 0 |
| ChannelManager-persisted HTLC state | 6+ maps/fields | 0 (all in monitors) |
| Restart correctness argument | Cross-reference Channel, Monitor, in-flight updates, blocked updates, RAA blockers, legacy maps | Replay unacknowledged events |
| blocked_monitor_updates mechanism | Complex FIFO queue with ordering constraints | Not needed |
| monitor_pending_* fields | 4 Vec fields with race condition bug (channel.rs:3124–3127) | 0; eliminated along with the bug |
| MPP claim coordination | PendingMPPClaim + PendingMPPClaimPointer + Arc<Mutex> + RAA blockers + completion actions | Deferred monitor events |
| Forward HTLC claiming safety | RAA blocking + event completion actions + EventUnblockedChannel | Persistent events + acknowledgment |
| Channel closure HTLC handoff | Race-prone (known bug) | Race-free; monitor already has all state |
The key insight is that persistent, acknowledged MonitorEvents replace
both the RAA-blocking mechanism (which existed to ensure preimages aren't
lost across monitor updates) and the on-load reconstruction logic (which
existed because MonitorEvents were fire-and-forget). By making events
durable and acknowledgment-driven, we get correctness by construction — the
monitor holds onto events until the ChannelManager has processed them, and
on restart we simply replay.
The channel closure race condition (channel.rs:3124–3127) disappears because
there are no monitor_pending_* fields to lose — the ChannelMonitor
generates events directly from its own state, which is always durable.
This is the claude output of the following prompt:
write a design doc for a new approach to resolving HTLCs and payments.
Currently, the resolution of HTLCs (and decisions on when HTLCs can be forwarded) is the responsibility of
Channelobjects (a part ofChannelManager) until the channel is closed, and then theChannelMonitorthereafter. This leads to some complexity around race conditions for HTLCs right around channel closure. Additionally, there is lots of complexity reconstructing the state of all HTLCs in theChannelManagerdeserialization/loading logic.Instead, we want to do all resolution in
ChannelMonitors (in response toChannelMonitorUpdates) and pass them back toChannelManagerin the form ofMonitorEvents (similar to how HTLCs are resolved after channels are closed). In order to have reliable resolution, we'll need to keepMonitorEvents around in theChannelMonitoruntil theChannelManagerhas finished processing them - adding a newMonitorEventresolution path through a new method (rather than viaChannelMonitorUpdates). This should remove the need for the RAA-blocking updates entirely - on restart instead of examining the set of HTLCs in monitors we can simply replay all the pendingMonitorEvents.Further, for claiming inbound (MPP) payments, the
ChannelMonitorshould push a pendingMonitorEventto replay that claim, but not provide it to theChannelManagerimmediately. That way, if we crash and restart theMonitorEvents to claim the MPP parts will be replayed automatically and we can claim any MPP Parts that we didn't manage to claim before crashing. We'll still have to track once all channels part of the MPP payment have the preimage and delete theMonitorEvent, but should be able to remove some other on-load logic.Finally, decisions on when an HTLC is able to be forwarded to the outbound channel should similarly be done using a
MonitorEvent, avoiding more work inChannelManagerHTLC reconstruction.Please write a thorough design doc detailing how much of the
ChannelManageron-load complexity can go away, listing specific hunks of code and fields.Design: Unified HTLC Resolution via ChannelMonitor
Status: Draft
Date: 2026-03-12 (rev 1)
Problem Statement
Currently, HTLC resolution (and decisions about when HTLCs can be forwarded) is
split between two systems:
Channelobjects (part ofChannelManager) handle HTLC resolution whilethe channel is open — tracking commitment state transitions, forwarding
decisions, and preimage propagation.
ChannelMonitors handle HTLC resolution after the channel is closed —watching for on-chain transactions, claiming outputs, and failing HTLCs back.
This split creates two major problems:
Race conditions around channel closure. An HTLC might be in the process of
being resolved in the
Channelwhen the channel closes. TheChannelbuffersHTLC actions in
monitor_pending_*fields (channel.rs:3128–3131) while amonitor update is in progress. If the channel is dropped during this window, we
don't know whether the
ChannelMonitorhas responsibility for those HTLCs. Thisis explicitly called out as an open bug in channel.rs:3124–3127:
Enormous complexity in
ChannelManagerdeserialization. On restart, theChannelManagermust reconstruct the state of all in-flight HTLCs bycross-referencing
Channelstate,ChannelMonitorstate, in-flight monitorupdates, blocked monitor updates, RAA-blocking actions, and various legacy maps.
This reconstruction logic spans ~1600 lines of
from_channel_manager_data()(channelmanager.rs:18635–20263) and is one of the most complex and error-prone
parts of LDK.
Proposed Solution
Move all HTLC resolution to
ChannelMonitors, driven byChannelMonitorUpdates, with results communicated back toChannelManagerviaMonitorEvents. TheChannelManagerbecomes a pure routing/forwardingengine that tells monitors what to do, and monitors tell the manager what
happened.
Core Principles
ChannelMonitoris the sole authority on HTLC resolution state. Whether achannel is open or closed, the monitor decides when an HTLC is resolved and
communicates this to the
ChannelManager.MonitorEvents are persistent until acknowledged. TheChannelMonitorkeeps
MonitorEvents in its persistent state until theChannelManagerexplicitly acknowledges them via a new method (not via
ChannelMonitorUpdate).Restart == replay. On restart, the
ChannelManagersimply replays allpending (unacknowledged)
MonitorEvents from all monitors. No reconstructionlogic needed.
Inbound MPP claims use deferred
MonitorEvents. When claiming an MPPpayment, the
ChannelMonitorstores aMonitorEventfor the claim but doesnot provide it to the
ChannelManagerimmediately. On restart, these eventsare replayed, allowing crash-safe MPP claiming without special on-load logic.
Current Architecture (What We Have Today)
HTLC Lifecycle in an Open Channel
When a channel is open, an inbound HTLC goes through these states
(channel.rs
InboundHTLCState, lines 174–233):When the HTLC reaches
Committed, theInboundUpdateAddpayload(channel.rs:337–376) indicates its readiness:
WithOnion { update_add_htlc }— onion not yet decoded, added todecode_update_add_htlcsfor processingForwarded { ... }— already forwarded to the outbound edge, onion prunedLegacy— pre-0.3 HTLC without onion persistenceThe transition from
AwaitingAnnouncedRemoteRevoketoCommittedhappens inrevoke_and_ack()(channel.rs:~8587), whereWithOnionHTLCs are pushed tomonitor_pending_update_adds(channel.rs:8696) and eventually decoded viaprocess_pending_update_add_htlcs()(channelmanager.rs:7195–7535).Forwarding Decisions
Decoded HTLCs flow through
process_pending_htlc_forwards()(channelmanager.rs:7558–7645):
process_pending_update_add_htlcs()decodes onions from thedecode_update_add_htlcsmap (channelmanager.rs:2807)forward_htlcsmap(channelmanager.rs:2789–2791)
forward_htlcsis drained; for each HTLC:short_chan_id != 0:process_forward_htlcs()sends it to an outboundchannel via
queue_add_htlc()(channelmanager.rs:7836–8105)short_chan_id == 0:process_receive_htlcs()handles it as a finalpayment
Monitor Update Blocking and the
monitor_pending_*FieldsWhen a
ChannelMonitorUpdateis being persisted, theChannelcannot proceedwith certain protocol messages. Pending work is buffered:
monitor_pending_forwards: Vec<(PendingHTLCInfo, u64)>(channel.rs:3128)— inbound HTLCs ready to forward
monitor_pending_failures: Vec<(HTLCSource, PaymentHash, HTLCFailReason)>(channel.rs:3129) — inbound HTLCs to fail backwards
monitor_pending_finalized_fulfills: Vec<(HTLCSource, Option<AttributionData>)>(channel.rs:3130) — fulfilled HTLCs awaiting acknowledgment (persisted, TLV 11)
monitor_pending_update_adds: Vec<msgs::UpdateAddHTLC>(channel.rs:3131)— inbound update_add messages awaiting onion decode
These are released via
monitor_updating_restored()(channel.rs:9100–9234)which returns a
MonitorRestoreUpdatesstruct (channel.rs:1176–1197) containing:Preimage Claiming (Forwarded Payments)
When a downstream channel receives a preimage:
ChannelManager::claim_funds_internal()is calledChannel::get_update_fulfill_htlc_and_commit()(channel.rs:7106–7166)generates a
ChannelMonitorUpdatewith aPaymentPreimagestep(channel.rs:7018–7025)
RAAMonitorUpdateBlockingAction::ForwardedPaymentInboundClaim(channelmanager.rs:1672–1677) is set on the downstream channel, blocking
its next RAA monitor update
MonitorUpdateCompletionAction::EmitEventOptionAndFreeOtherChannel(channelmanager.rs:1474–1477) pairs
Event::PaymentForwardedwithunblocking the downstream channel via
EventUnblockedChannel(channelmanager.rs:1414–1420)
handle_monitor_update_completion_actions()(channelmanager.rs:10103–10255)emits the event and frees the RAA blocker
RAA-Blocking Infrastructure
The RAA-blocking system involves multiple types and fields:
Types:
RAAMonitorUpdateBlockingAction (channelmanager.rs:1668–1700): Enum with ForwardedPaymentInboundClaim and ClaimedMPPPayment variants
MonitorUpdateCompletionAction (channelmanager.rs:1454–1495): Enum with PaymentClaimed, EmitEventOptionAndFreeOtherChannel, and FreeDuplicateClaimImmediately variants
EventCompletionAction::ReleaseRAAChannelMonitorUpdate (channelmanager.rs:1557–1562): Deferred RAA release on event processing
EventUnblockedChannel (channelmanager.rs:1414–1420): Pointer to channel to unblock
PendingChannelMonitorUpdate (channel.rs:1472–1478): Blocked update wrapper
Fields (PeerState, channelmanager.rs:1709–1782):
in_flight_monitor_updates: BTreeMap<ChannelId, (OutPoint, Vec<ChannelMonitorUpdate>)> (line 1740)
monitor_update_blocked_actions: BTreeMap<ChannelId, Vec<MonitorUpdateCompletionAction>> (line 1760)
actions_blocking_raa_monitor_updates: BTreeMap<ChannelId, Vec<RAAMonitorUpdateBlockingAction>> (line 1765)
closed_channel_monitor_update_ids: BTreeMap<ChannelId, u64> (line 1775)
Fields (ChannelContext, channel.rs):
blocked_monitor_updates: Vec<PendingChannelMonitorUpdate> (line 3339)
Functions:
raa_monitor_updates_held() (channelmanager.rs:12671–12688): Checks the actions_blocking_raa_monitor_updates map AND the pending events queue for ReleaseRAAChannelMonitorUpdate actions
handle_monitor_update_release() (channelmanager.rs:~14962–15036): Removes blockers and unblocks the channel's blocked_monitor_updates queue
revoke_and_ack(..., hold_mon_update: bool) (channel.rs:~8359): The hold_mon_update parameter conditionally blocks the resulting monitor update
Inbound MPP Claiming
The MPP claim flow is particularly complex:
claim_funds(preimage) (channelmanager.rs:9206)
begin_claiming_payment() moves payment from claimable_payments to pending_claiming_payments (channelmanager.rs:~1319–1380)
claim_mpp_part() (channelmanager.rs:9563+):
a. Calls Channel::get_update_fulfill_htlc_and_commit() for open channels
b. Creates ChannelMonitorUpdate with PaymentPreimage step + PaymentClaimDetails
c. Sets up shared PendingMPPClaim (channelmanager.rs:1609–1612)
d. Creates RAAMonitorUpdateBlockingAction::ClaimedMPPPayment per channel
e. Creates MonitorUpdateCompletionAction::PaymentClaimed per channel
handle_monitor_update_completion_actions() (channelmanager.rs:10147–10155) moves entries from channels_without_preimage to channels_with_preimage
When channels_without_preimage is empty: free all RAA blockers, emit Event::PaymentClaimed
Supporting types:
PendingMPPClaimPointer(Arc<Mutex<PendingMPPClaim>>) (line 1650): Shared pointer for cross-channel coordination
MPPClaimHTLCSource (line 1618–1623): Identifies each MPP part channel
PaymentClaimDetails (line 1637–1642): Stored in ChannelMonitor for restart claim replay
HTLCClaimSource (line 1590–1595): Deserialization-time equivalent of MPPClaimHTLCSource
HTLC Resolution After Channel Closure
After closure, the ChannelMonitor takes over:
MonitorEvent::HTLCEvent with preimage (line 6134) or without (line 5607) as HTLCs resolve on-chain
MonitorEvent::CommitmentTxConfirmed (line 5432) when commitment tx is detected
ChannelManager::process_monitor_events_for_failover() (channelmanager.rs:13247–13373) consumes these events to fail/claim upstream
The MonitorEvent enum (channelmonitor.rs:188–227) currently has:
HTLCEvent(HTLCUpdate): HTLC resolved on-chain (claim or timeout)
HolderForceClosedWithInfo { reason, outpoint, channel_id }: we force-closed
HolderForceClosed(OutPoint): legacy force-close
CommitmentTxConfirmed(()): commitment tx confirmed on-chain
Completed { funding_txo, channel_id, monitor_update_id }: monitor update persisted
Events are currently fire-and-forget: get_and_clear_pending_monitor_events() (channelmonitor.rs:4373–4377) does a mem::swap to drain them.
The Painful On-Load Reconstruction
On deserialization, from_channel_manager_data() (channelmanager.rs:18635–20263) must perform a vast reconstruction. Here is every section with exact line ranges:
Step 1: Channel vs. Monitor State Validation (lines 18688–18876)
For each deserialized FundedChannel, compare its commitment transaction numbers against the corresponding ChannelMonitor:
If the channel is behind the monitor: force-close with ClosureReason::OutdatedChannelManager and fail any orphaned HTLCs not in the monitor. This queues BackgroundEvent::MonitorUpdateRegeneratedOnStartup with a ChannelForceClosed step.
Step 2: Closed Channel Monitor Processing (lines 18878–18935)
For monitors without a corresponding Channel (already closed), track their latest update IDs in closed_channel_monitor_update_ids and queue force-close monitor updates for monitors with state needing update.
Step 3: In-Flight Monitor Update Replay (lines 18970–19205)
The handle_in_flight_updates! macro (lines 18982–19048) processes each in_flight_monitor_updates entry:
Compares update_id against monitor.get_latest_update_id()
Queues BackgroundEvent::MonitorUpdatesComplete with highest_update_id_completed
Queues BackgroundEvent::MonitorUpdateRegeneratedOnStartup for replay
This macro is invoked twice: once for open channels (lines ~19050–19096) and once for remaining closed-channel updates (lines ~19097–19139).
Step 4: Reconstruct/Deserialize Decision (lines 19207–19239)
The key branch: should we reconstruct HTLC state from monitors or use persisted ChannelManager state?
Step 5: HTLC Forwarding State Reconstruction (lines 19267–19362)
Two passes over all channel monitors:
First pass (lines 19267–19333): For each monitor with an open channel (when reconstruct_manager_from_monitors):
Call inbound_htlcs_pending_decode() (channel.rs:7439–7448) to get WithOnion HTLCs → populate decode_update_add_htlcs
Call inbound_forwarded_htlcs() (channel.rs:7452–7507) to get already-forwarded HTLCs → populate already_forwarded_htlcs
Call insert_from_monitor_on_startup() for outbound payments, process preimage claims via pending_outbounds.claim_htlc()
Second pass (lines 19334–19512): For each monitor:
If reconstruct_manager_from_monitors: call outbound_htlc_forwards() (channel.rs:7512–7533) and prune via dedup_decode_update_add_htlcs() and prune_forwarded_htlc()
get_all_current_outbound_htlcs() and reconcile_pending_htlcs_with_monitor() for each; also handle get_onchain_failed_outbound_htlcs() → failed_htlcs
Step 6: Preimage Claim Replay from Monitors (lines 19514–19591)
For each monitor (open or closed), find outbound HTLCs with preimages:
get_all_current_outbound_htlcs() for HTLCs where preimage_opt.is_some()
claimable_balances().is_empty() to skip fully-resolved monitors
counterparty_node_id.is_some() (required since 0.0.124)
Push to pending_claims_to_replay for later execution
Step 7: RAA-Blocking Restoration (lines 19695–19770)
Reconstruct actions_blocking_raa_monitor_updates from the persisted monitor_update_blocked_actions_per_peer:
For each MonitorUpdateCompletionAction::EmitEventOptionAndFreeOtherChannel: insert its blocking_action (an RAAMonitorUpdateBlockingAction) into actions_blocking_raa_monitor_updates[blocked_channel_id]
Step 8: HTLC Deduplication (lines 19772–19798)
When reconstruct_manager_from_monitors:
Dedup failed_htlcs against decode_update_add_htlcs
Dedup claimable_payments against decode_update_add_htlcs
Step 9: ChannelManager Construction (lines 19864–19926)
The ChannelManager struct is built with the reconstructed state, including forward_htlcs, decode_update_add_htlcs, claimable_payments, and pending_background_events.
Step 10: MPP Claim Replay from Monitor Preimages (lines 19928–20088)
For each monitor, call get_stored_preimages() to retrieve (PaymentHash, (PaymentPreimage, Vec<PaymentClaimDetails>)):
already_forwarded_htlcs: if an inbound HTLC was forwarded to a downstream channel and the downstream has the preimage, push it to pending_claims_to_replay (lines 19935–19968)
For each PaymentClaimDetails:
a. Dedup via processed_claims: HashSet<Vec<MPPClaimHTLCSource>>
b. pending_claiming_payments
c. Reconstruct PendingMPPClaim with all channels in channels_without_preimage
d. Call begin_claiming_payment() + claim_mpp_part() for each part (lines 20001–20088)
Step 11: Legacy Preimage-Without-ClaimDetails Path (lines 20090–20196)
For preimages in monitors that have no PaymentClaimDetails (pre-0.3):
claimable_payments
claim_htlc_while_disconnected_dropping_mon_update_legacy() on the channel (line 20141–20146)
provide_payment_preimage_unsafe_legacy() directly on the monitor (line 20164): explicitly unsafe, noted as only for upgrade path
Event::PaymentClaimed manually
Step 12: Failed HTLC and Claim Execution (lines 20200–20257)
fail_htlc_backwards_internal() for all failed_htlcs
already_forwarded_htlcs that weren't pruned (lines 20213–20227): these are HTLCs the inbound channel thought were forwarded but the outbound channel doesn't have, implying they were failed
claim_funds_internal() for all pending_claims_to_replay (lines 20229–20257)
Step 13: Helper Functions (lines 20266–20352)
prune_forwarded_htlc() (lines 20266–20281): Remove specific HTLC from already_forwarded_htlcs
reconcile_pending_htlcs_with_monitor() (lines 20285–20352): Master dedup function that removes HTLCs from decode_update_add_htlcs, forward_htlcs_legacy, and pending_intercepted_htlcs_legacy when the monitor has taken responsibility
Proposed Architecture
New MonitorEvent Variants
Extend MonitorEvent (channelmonitor.rs:188) to cover all HTLC resolution outcomes, not just post-close on-chain events.
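As a rough illustration of what such an extension might look like, the sketch below models only the three new event kinds this document names (HTLCAccepted, ForwardedHTLCClaimed, InboundMPPClaimPersisted). The enum name NewMonitorEvent, every field, the helper methods, and the placeholder ID types are assumptions made for illustration and are not taken from the actual codebase.

```rust
// Placeholder types standing in for the real definitions (assumed shapes).
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct ChannelId(pub [u8; 32]);
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct PaymentHash(pub [u8; 32]);
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct PaymentPreimage(pub [u8; 32]);

/// Hypothetical new variants only; the existing on-chain variants are omitted.
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum NewMonitorEvent {
    /// An inbound HTLC is irrevocably committed and may be decoded/forwarded.
    /// `update_add` is a placeholder for the serialized update_add_htlc message.
    HTLCAccepted { channel_id: ChannelId, htlc_id: u64, update_add: Vec<u8> },
    /// A forwarded HTLC was claimed downstream; the preimage must reach upstream.
    ForwardedHTLCClaimed {
        channel_id: ChannelId,
        htlc_id: u64,
        payment_hash: PaymentHash,
        payment_preimage: PaymentPreimage,
    },
    /// One part of an inbound MPP claim is durably stored in this monitor.
    InboundMPPClaimPersisted { channel_id: ChannelId, htlc_id: u64, payment_hash: PaymentHash },
}

impl NewMonitorEvent {
    /// Deferred events are held back during normal operation and only
    /// surfaced on restart (assumed to apply to the MPP variant only).
    pub fn is_deferred(&self) -> bool {
        matches!(self, NewMonitorEvent::InboundMPPClaimPersisted { .. })
    }

    /// The (channel, HTLC) pair that identifies which HTLC an event refers to.
    pub fn htlc_key(&self) -> (ChannelId, u64) {
        match self {
            NewMonitorEvent::HTLCAccepted { channel_id, htlc_id, .. }
            | NewMonitorEvent::ForwardedHTLCClaimed { channel_id, htlc_id, .. }
            | NewMonitorEvent::InboundMPPClaimPersisted { channel_id, htlc_id, .. } => {
                (channel_id.clone(), *htlc_id)
            }
        }
    }
}
```

Carrying a (channel_id, htlc_id) pair in every variant is one way to support the deduplication discussed later in this document; a real design may identify HTLCs differently.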
New MonitorEvent Acknowledgment Path
Add a method on ChannelMonitor (and the chain::Watch trait) to acknowledge processed events.
Each MonitorEvent gets a unique MonitorEventId (monotonic counter per monitor). Events remain in the monitor's persistent state until acknowledged.
On restart, unacknowledged events are replayed.
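The acknowledgment path described above could be modeled roughly as follows. This is a minimal sketch, not the proposed implementation: MonitorSketch, Event, push_event, and pending_events are hypothetical names, the event payload is a placeholder string, and cloning the struct stands in for persisting and reloading the monitor. Only MonitorEventId, pending_unacknowledged_events, and acknowledge_monitor_events() are names used in this document.

```rust
use std::collections::BTreeMap;

pub type MonitorEventId = u64;

/// Placeholder event payload (hypothetical).
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Event(pub String);

/// Minimal model of the event-related monitor state that would be persisted.
#[derive(Clone, Debug, Default)]
pub struct MonitorSketch {
    next_event_id: MonitorEventId,
    pending_unacknowledged_events: BTreeMap<MonitorEventId, Event>,
}

impl MonitorSketch {
    /// Assigns the next monotonic ID and stores the event until acknowledged.
    pub fn push_event(&mut self, ev: Event) -> MonitorEventId {
        let id = self.next_event_id;
        self.next_event_id += 1;
        self.pending_unacknowledged_events.insert(id, ev);
        id
    }

    /// Returns all unacknowledged events without draining them.
    pub fn pending_events(&self) -> Vec<(MonitorEventId, Event)> {
        self.pending_unacknowledged_events
            .iter()
            .map(|(id, ev)| (*id, ev.clone()))
            .collect()
    }

    /// Removes acknowledged events. Returns true if anything changed,
    /// i.e. the monitor would need to be re-persisted.
    pub fn acknowledge_monitor_events(&mut self, ids: &[MonitorEventId]) -> bool {
        let mut changed = false;
        for id in ids {
            changed |= self.pending_unacknowledged_events.remove(id).is_some();
        }
        changed
    }
}
```

Because acknowledgment is a plain removal keyed by ID, acknowledging the same event twice is harmless, which keeps the manager side simple if it retries after a crash.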
This is deliberately not a ChannelMonitorUpdate: acknowledgments flow in the opposite direction and don't need the same ordering guarantees. However, acknowledging events does trigger a monitor re-persist (since the monitor's serialized state changed).
New ChannelMonitorUpdateStep Variants
Note: We may not need a new FailHTLC step. HTLC failures on open channels still flow through normal commitment transaction negotiation. The monitor only needs to handle failures post-close (which it already does via HTLCEvent with payment_preimage: None).
Deferred MonitorEvents for MPP Claims
When the user calls claim_funds(preimage) for an MPP payment:
The ChannelManager sends ChannelMonitorUpdates with PaymentPreimage + PaymentClaimDetails steps to each channel's monitor (same as today).
Each ChannelMonitor, upon processing the preimage update, stores an InboundMPPClaimPersisted event in a new deferred_restart_events list (NOT in pending_monitor_events). This event is persisted with the monitor.
On restart, the ChannelManager calls a new get_restart_events() method (or the existing get_and_clear_pending_monitor_events() is enhanced).
Monitors return InboundMPPClaimPersisted events. The ChannelManager uses these to identify which MPP parts have been claimed and which haven't, then claims any missing parts.
Once all MPP parts across all channels have the preimage durably stored (confirmed by all monitors having the InboundMPPClaimPersisted event), the ChannelManager acknowledges all the InboundMPPClaimPersisted events, removing them from the monitors.
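The restart-side bookkeeping in the steps above can be sketched as a set difference. The example assumes, purely for illustration, that each replayed InboundMPPClaimPersisted event carries the full list of parts of the payment (similar in spirit to PaymentClaimDetails); the struct fields, PartId, and both function names are hypothetical.

```rust
use std::collections::BTreeSet;

/// Hypothetical identifier for one MPP part: (channel tag, HTLC id).
pub type PartId = (u8, u64);

/// Assumed event shape: which part this monitor holds, plus all parts of the payment.
#[derive(Clone, Debug)]
pub struct InboundMPPClaimPersisted {
    pub this_part: PartId,
    pub all_parts: Vec<PartId>,
}

/// Parts that no monitor reported as persisted and therefore must be re-claimed.
pub fn parts_to_reclaim(events: &[InboundMPPClaimPersisted]) -> Vec<PartId> {
    let expected: BTreeSet<PartId> =
        events.iter().flat_map(|e| e.all_parts.iter().copied()).collect();
    let persisted: BTreeSet<PartId> = events.iter().map(|e| e.this_part).collect();
    // BTreeSet iteration is ordered, so the result is sorted.
    expected.difference(&persisted).copied().collect()
}

/// The events may be acknowledged only once every part is durably claimed.
pub fn safe_to_acknowledge(events: &[InboundMPPClaimPersisted]) -> bool {
    !events.is_empty() && parts_to_reclaim(events).is_empty()
}
```

With events from two of three channels, parts_to_reclaim returns the one missing part; once the third monitor also reports its event, safe_to_acknowledge becomes true and the events can be removed from all monitors.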
This replaces the current on-load logic that iterates all monitors via get_stored_preimages() and cross-references with claimable_payments / pending_claiming_payments state (lines 19928–20088).
Resolution Flow (Open Channel: HTLC Acceptance)
Resolution Flow (Claiming a Forwarded HTLC)
Resolution Flow (Restart)
No reconstruction logic needed. The monitor state IS the source of truth.
What Can Be Removed
Fields That Can Be Eliminated
In PeerState (channelmanager.rs:1709–1782)
monitor_update_blocked_actions
actions_blocking_raa_monitor_updates
closed_channel_monitor_update_ids
In ChannelContext (channel.rs:3120–3340)
monitor_pending_forwards → MonitorEvent::HTLCAccepted; no buffering needed
monitor_pending_failures → MonitorEvent::HTLCEvent; no buffering needed
monitor_pending_finalized_fulfills
monitor_pending_update_adds → MonitorEvent::HTLCAccepted
blocked_monitor_updates
This also eliminates the race condition described in channel.rs:3124–3127.
In ChannelManagerData (channelmanager.rs:18013–18041)
monitor_update_blocked_actions_per_peer
in_flight_monitor_updates
forward_htlcs_legacy
pending_intercepted_htlcs_legacy
decode_update_add_htlcs_legacy
In ChannelManager (runtime state, channelmanager.rs:2780–2820)
decode_update_add_htlcs → MonitorEvent::HTLCAccepted; onion decode happens inline
Note: forward_htlcs (line 2789) and pending_intercepted_htlcs (line 2800) are still needed for the forwarding pipeline. They are populated from monitor events rather than from channel state.
Enums/Types That Can Be Simplified or Removed
RAAMonitorUpdateBlockingAction
MonitorUpdateCompletionAction: EmitEventOptionAndFreeOtherChannel and FreeDuplicateClaimImmediately removed; PaymentClaimed simplified or moved to monitor
EventCompletionAction::ReleaseRAAChannelMonitorUpdate
EventUnblockedChannel
PostMonitorUpdateChanResume: htlc_forwards, decode_update_add_htlcs, failed_htlcs fields
BackgroundEvent::MonitorUpdateRegeneratedOnStartup
BackgroundEvent::MonitorUpdatesComplete
PendingMPPClaim → InboundMPPClaimPersisted events
PendingMPPClaimPointer (PendingMPPClaim)
MPPClaimHTLCSource
HTLCClaimSource
MonitorRestoreUpdates: raa, commitment_update, and protocol messages retained
InboundUpdateAdd::Forwarded
Functions/Methods That Can Be Removed or Drastically Simplified
On-Load Reconstruction (the big win)
handle_in_flight_updates! macro
decode_update_add_htlcs and already_forwarded_htlcs from Channel state → MonitorEvent::HTLCAccepted replay
outbound_htlc_forwards() and dedup_decode_update_add_htlcs()
pending_outbound_payments, handle preimage claims and failures
PaymentSent events
ForwardedHTLCClaimed events are replayed automatically
actions_blocking_raa_monitor_updates from persisted monitor_update_blocked_actions_per_peer
dedup_decode_update_add_htlcs(): prev_outbound_scid_alias and htlc_id
prune_forwarded_htlc()
reconcile_pending_htlcs_with_monitor(): decode_update_add_htlcs, forward_htlcs_legacy, pending_intercepted_htlcs_legacy
get_stored_preimages(), PendingMPPClaim, call begin_claiming_payment() + claim_mpp_part() → InboundMPPClaimPersisted replay
PaymentClaimDetails: claim_htlc_while_disconnected_dropping_mon_update_legacy() + provide_payment_preimage_unsafe_legacy()
fail_htlc_backwards_internal() for failed_htlcs
claim_funds_internal() for pending_claims_to_replay → MonitorEvents
Estimated lines removed from from_channel_manager_data(): ~1200–1400 lines.
RAA-Blocking Infrastructure
raa_monitor_updates_held(): actions_blocking_raa_monitor_updates + pending events for ReleaseRAAChannelMonitorUpdate
test_raa_monitor_updates_held()
get_and_clear_pending_raa_blockers()
handle_monitor_update_release(): RAAMonitorUpdateBlockingAction, unblock channel's blocked_monitor_updates via unblock_next_blocked_monitor_update()
handle_monitor_update_completion_actions(): MonitorUpdateCompletionAction variants: track PendingMPPClaim progress, emit events, free RAA blockers
handle_post_event_actions() (ReleaseRAA path): PaymentForwarded, release the downstream channel's RAA via EventCompletionAction::ReleaseRAAChannelMonitorUpdate
Channel Methods (channel.rs)
monitor_updating_paused(): monitor_pending_* fields
monitor_updating_restored(): monitor_pending_* fields into MonitorRestoreUpdates; (raa, commitment_update) remain
unblock_next_blocked_monitor_update(): blocked_monitor_updates
push_ret_blockable_mon_update()
on_startup_drop_completed_blocked_mon_updates_through()
get_latest_unblocked_monitor_update_id()
inbound_htlcs_pending_decode(): WithOnion HTLCs for on-load decode queue rebuild
inbound_forwarded_htlcs(): already_forwarded_htlcs rebuild
has_legacy_inbound_htlcs() (InboundUpdateAdd::Legacy)
outbound_htlc_forwards()
claim_htlc_while_disconnected_dropping_mon_update_legacy()
Why RAA-Blocking Is Eliminated
The current RAA-blocking mechanism exists because:
When a forwarded payment is claimed, the downstream channel's RAA monitor
update (which, as a side effect of revoking the prior state, removes the
preimage from one commitment transaction) must not complete before the
upstream channel's monitor update (which adds the preimage) is durable.
Otherwise, on restart, the preimage might be lost from the downstream
monitor while the upstream monitor never received it.
For MPP payments, all channel monitors must have the preimage before any of them can have it removed from a commitment transaction via revocation. The PendingMPPClaim shared pointer coordinates this across channels.
In the new architecture, this is handled naturally:
The ChannelMonitor stores preimages in payment_preimages (channelmonitor.rs:1272) durably. The preimage is never "lost" from the monitor's state due to a revocation: it lives in a separate map.
ForwardedHTLCClaimed events persist until the ChannelManager acknowledges them. The ChannelManager only acknowledges after confirming the preimage is durable on the inbound edge.
For MPP, InboundMPPClaimPersisted events persist in each monitor until all parts are confirmed claimed. On restart, any missing parts are re-claimed.
ChannelMonitorUpdates flow through immediately: no hold_mon_update parameter on revoke_and_ack(), no blocked_monitor_updates queue. The safety guarantee comes from the acknowledgment path, not from blocking the update path.
Detailed Design: Inbound MPP Claiming
Current Approach (Complex)
claim_funds(preimage) (channelmanager.rs:9206)
begin_claiming_payment() moves payment from claimable_payments to pending_claiming_payments (channelmanager.rs:~1319–1380)
claim_mpp_part() (channelmanager.rs:9563+):
a. Calls Channel::get_update_fulfill_htlc_and_commit() for open channels
b. Creates ChannelMonitorUpdate with PaymentPreimage step + PaymentClaimDetails
c. Sets up shared PendingMPPClaim (channels_without_preimage / channels_with_preimage)
d. Creates RAAMonitorUpdateBlockingAction::ClaimedMPPPayment per channel
e. Creates MonitorUpdateCompletionAction::PaymentClaimed per channel
handle_monitor_update_completion_actions() (lines 10147–10155) moves entries between PendingMPPClaim lists
Emit Event::PaymentClaimed
On restart: get_stored_preimages(), reconstruct PendingMPPClaim, dedup via processed_claims, call begin_claiming_payment() + claim_mpp_part() for each
New Approach (Simple)
claim_funds(preimage)
ChannelManager sends ChannelMonitorUpdate with PaymentPreimage + PaymentClaimDetails to each channel's monitor
ChannelMonitor, upon processing the update:
a. Stores the preimage in payment_preimages
b. Stores an InboundMPPClaimPersisted event in deferred_restart_events
c. For open channels: fulfills the HTLC in the commitment transaction normally
ChannelManager tracks confirmed parts via MonitorEvent::Completed
Emit Event::PaymentClaimed, acknowledge all InboundMPPClaimPersisted events
On restart: InboundMPPClaimPersisted events → ChannelManager identifies which parts were claimed → claims missing parts → done
What Goes Away
PendingMPPClaim / PendingMPPClaimPointer (channelmanager.rs:1609–1663)
RAAMonitorUpdateBlockingAction::ClaimedMPPPayment variant (line 1685)
handle_monitor_update_completion_actions() (channelmanager.rs:10116–10223)
MPPClaimHTLCSource, HTLCClaimSource, processed_claims HashSet
Detailed Design: Forwarded HTLC Claiming
Current Approach
Downstream channel receives a preimage (update_fulfill_htlc or on-chain claim)
ChannelManager::claim_funds_internal() is called
a. Channel::get_update_fulfill_htlc_and_commit() generates a ChannelMonitorUpdate on the upstream channel with a PaymentPreimage step
b. A RAAMonitorUpdateBlockingAction::ForwardedPaymentInboundClaim (channelmanager.rs:1672–1677) blocks the downstream channel's next RAA
c. A MonitorUpdateCompletionAction::EmitEventOptionAndFreeOtherChannel (channelmanager.rs:1474–1477) is stored on the upstream channel's monitor_update_blocked_actions, pairing Event::PaymentForwarded with an EventUnblockedChannel that will free the downstream
handle_monitor_update_completion_actions() emits PaymentForwarded and calls handle_monitor_update_release() to remove the RAA blocker
On restart: monitor_update_blocked_actions_per_peer (lines 19695–19770)
New Approach
ChannelManager sends a FulfillHTLC ChannelMonitorUpdate to the downstream ChannelMonitor, including the HTLCSource identifying the upstream edge
ChannelMonitor generates ForwardedHTLCClaimed event (persistent until acknowledged)
ChannelManager receives ForwardedHTLCClaimed, sends preimage to upstream channel via ChannelMonitorUpdate with PaymentPreimage step
Once the upstream update is durable (MonitorEvent::Completed):
a. ChannelManager acknowledges the ForwardedHTLCClaimed event on the downstream monitor
b. ChannelManager emits Event::PaymentForwarded
On restart: ForwardedHTLCClaimed is replayed → ChannelManager re-sends preimage to upstream → safe
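The ordering argument in this flow (make the preimage durable upstream first, acknowledge downstream second) can be illustrated with a small simulation. Everything here is a toy model under stated assumptions: ForwardSketch and its fields are hypothetical, HTLCs are plain integers, and a boolean flag simulates a crash between the upstream update becoming durable and the acknowledgment.

```rust
/// Toy model of the persisted state involved in one forwarded claim (hypothetical).
#[derive(Clone, Debug, Default)]
pub struct ForwardSketch {
    /// Downstream monitor: unacknowledged ForwardedHTLCClaimed events (by HTLC id).
    pub downstream_pending: Vec<u64>,
    /// Upstream monitor: HTLC ids whose preimage is durably stored.
    pub upstream_preimages: Vec<u64>,
    /// How many PaymentForwarded events were surfaced to the user.
    pub payment_forwarded_emitted: u32,
}

impl ForwardSketch {
    /// The downstream monitor records a claim as a persistent event.
    pub fn downstream_claimed(&mut self, htlc: u64) {
        if !self.downstream_pending.contains(&htlc) {
            self.downstream_pending.push(htlc);
        }
    }

    /// Processes pending events. `crash_before_ack` simulates dying after the
    /// upstream preimage is durable but before the event is acknowledged.
    pub fn process(&mut self, crash_before_ack: bool) {
        for htlc in self.downstream_pending.clone() {
            // Step 1: make the preimage durable upstream (idempotent).
            if !self.upstream_preimages.contains(&htlc) {
                self.upstream_preimages.push(htlc);
            }
            if crash_before_ack {
                return; // the event stays pending and is replayed on restart
            }
            // Step 2: only now acknowledge downstream and surface the user event.
            self.downstream_pending.retain(|h| *h != htlc);
            self.payment_forwarded_emitted += 1;
        }
    }
}
```

Running process(true) followed by process(false) leaves exactly one upstream preimage entry, no pending events, and a single PaymentForwarded, which is the behavior the flow above relies on when a restart replays the event.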
What Goes Away
RAAMonitorUpdateBlockingAction::ForwardedPaymentInboundClaim variant (channelmanager.rs:1672–1677)
MonitorUpdateCompletionAction::EmitEventOptionAndFreeOtherChannel (channelmanager.rs:1474–1477)
EventCompletionAction::ReleaseRAAChannelMonitorUpdate (channelmanager.rs:1557–1562)
EventUnblockedChannel struct (channelmanager.rs:1414–1447)
blocked_monitor_updates mechanism in Channel (channel.rs:3339)
hold_mon_update logic in Channel::revoke_and_ack() (channel.rs:~8675–8694)
Detailed Design: HTLC Forwarding via MonitorEvent
Current Approach
When an HTLC becomes irrevocably committed in the Channel:
revoke_and_ack() (channel.rs:~8587) transitions it to Committed with InboundUpdateAdd::WithOnion
The update_add_htlc message is pushed to monitor_pending_update_adds
monitor_updating_restored() drains monitor_pending_update_adds into MonitorRestoreUpdates::pending_update_adds
ChannelManager puts them into decode_update_add_htlcs map (keyed by outbound SCID alias)
process_pending_update_add_htlcs() (channelmanager.rs:7195–7535) decodes each onion and routes to forward_htlcs
process_pending_htlc_forwards() (channelmanager.rs:7558–7645) forwards or receives
On restart, inbound_htlcs_pending_decode() extracts WithOnion HTLCs to rebuild the decode_update_add_htlcs map, and complex deduplication prevents double-forwarding.
New Approach
When revoke_and_ack() confirms an HTLC as irrevocably committed, the Channel sends a ChannelMonitorUpdate with HTLCIrrevocablyCommitted step containing the update_add_htlc message
ChannelMonitor processes this step and generates a MonitorEvent::HTLCAccepted event (persistent until acknowledged)
ChannelManager receives HTLCAccepted, decodes the onion, and routes to forward_htlcs or handles as a final payment
ChannelManager acknowledges the event
On restart: HTLCAccepted events are replayed → onion is decoded again → forwarding happens again → idempotent (the downstream channel will reject the duplicate update_add_htlc)
What Goes Away
decode_update_add_htlcs map (channelmanager.rs:2807)
monitor_pending_update_adds field (channel.rs:3131)
monitor_pending_forwards field (channel.rs:3128): forwarding is driven by events, not buffered in the channel
InboundUpdateAdd::WithOnion variant: the monitor holds the raw update_add_htlc until acknowledged, so the channel doesn't need to
InboundUpdateAdd::Forwarded variant: no longer needed for on-load reconstruction
inbound_htlcs_pending_decode() (channel.rs:7439–7448)
inbound_forwarded_htlcs() (channel.rs:7452–7507)
outbound_htlc_forwards() (channel.rs:7512–7533)
All dedup helpers (dedup_decode_update_add_htlcs, reconcile_pending_htlcs_with_monitor, prune_forwarded_htlc)
already_forwarded_htlcs temporary map in from_channel_manager_data() (lines 19251–19254)
New ChannelMonitor State
The ChannelMonitorImpl (channelmonitor.rs:1199–1400) needs new fields.
The existing pending_monitor_events: Vec<MonitorEvent> field (channelmonitor.rs:1283) is kept for backwards compatibility with existing MonitorEvent variants (on-chain events) during the migration period, then eventually deprecated.
Serialization Changes
pending_unacknowledged_events and deferred_restart_events must be serialized as new TLV fields in ChannelMonitorImpl. The existing MonitorEvent serialization (channelmonitor.rs:228–246) supports existing variants; new variants need new TLV tags in the MonitorEvent enum.
get_and_clear_pending_monitor_events() Changes
The current implementation (channelmonitor.rs:4373–4377) does mem::swap to drain events. In the new design, new-style events are not cleared on read; they stay in the monitor until acknowledged.
The ChannelManager tracks the highest acknowledged MonitorEventId per monitor (either in its own state or by querying the monitor) to distinguish "new" from "already-processed" events during normal operation. A simple approach: after processing new events, immediately acknowledge them (the monitor will re-persist, and if the ChannelManager crashes before re-persisting, the events will replay on restart, which is the desired behavior).
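One way the manager could separate "new" from "already-processed" events when the getter no longer drains is a per-monitor high-water mark, sketched below. ManagerCursor and take_new are hypothetical names, and the sketch assumes event IDs are strictly increasing per monitor as described earlier.

```rust
pub type MonitorEventId = u64;

/// Hypothetical per-monitor cursor kept in memory by the manager.
#[derive(Default)]
pub struct ManagerCursor {
    highest_seen: Option<MonitorEventId>,
}

impl ManagerCursor {
    /// Given every still-pending event returned by a non-draining getter,
    /// returns only the ones not seen before and advances the cursor.
    pub fn take_new<T: Clone>(
        &mut self,
        pending: &[(MonitorEventId, T)],
    ) -> Vec<(MonitorEventId, T)> {
        let seen = self.highest_seen;
        let new: Vec<(MonitorEventId, T)> = pending
            .iter()
            .filter(|(id, _)| seen.map_or(true, |h| *id > h))
            .cloned()
            .collect();
        // All new IDs exceed the old mark, so the max of them is the new mark.
        if let Some(max_id) = new.iter().map(|(id, _)| *id).max() {
            self.highest_seen = Some(max_id);
        }
        new
    }
}
```

Because the cursor lives only in memory in this sketch, a restart resets it and every unacknowledged event is delivered again, which matches the replay behavior described above.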
Interaction with Existing Constraints
"MonitorEvents MUST NOT be generated during update processing"
The existing constraint (channelmonitor.rs:~1274–1282) says:
This constraint exists because of a race in ChainMonitor::update_channel where the in-memory state is updated under a read-lock, but persistence hasn't
completed yet. If events were generated during update processing and consumed
before persistence, a restart would replay the update but the event would be
lost.
In the new design, this constraint is relaxed because:
persisted, on restart the update will be replayed and the event regenerated
ChannelManager won't "lose" events
However, we must ensure idempotent event generation: replaying a ChannelMonitorUpdate must not duplicate events. The monitor should check whether an event for a given HTLC already exists before generating a new one. This is straightforward since events carry enough identifying information (channel_id + htlc_id) for dedup.
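A minimal sketch of such a check, assuming events are keyed by (channel_id, htlc_id) as suggested above: PendingEvents and generate_once are hypothetical names, and the channel ID is modeled as a raw 32-byte array.

```rust
use std::collections::BTreeMap;

/// (channel_id, htlc_id): assumed to identify an event's HTLC.
pub type HtlcKey = ([u8; 32], u64);

/// Hypothetical store of unacknowledged events, keyed by event ID.
#[derive(Default)]
pub struct PendingEvents {
    next_event_id: u64,
    by_id: BTreeMap<u64, HtlcKey>,
}

impl PendingEvents {
    /// Queues an event for `key` unless one is already pending.
    /// Returns the new event ID, or None if the call was a replayed duplicate.
    pub fn generate_once(&mut self, key: HtlcKey) -> Option<u64> {
        if self.by_id.values().any(|existing| *existing == key) {
            return None;
        }
        let id = self.next_event_id;
        self.next_event_id += 1;
        self.by_id.insert(id, key);
        Some(id)
    }

    /// Number of events still awaiting acknowledgment.
    pub fn len(&self) -> usize {
        self.by_id.len()
    }
}
```

Note that this only guards against duplicates of events that are still pending. An event that was already acknowledged and removed would not be caught by this check, so a real design would presumably also need to ensure such updates are not re-applied.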
Chain Watch Trait
The chain::Watch trait's release_pending_monitor_events() method (chain/mod.rs:345–347) needs to change.
Migration Strategy
Backwards Compatibility
Existing ChannelMonitor state can be read by new code. New fields are additive TLVs with defaults (empty vecs, zero counter).
Existing ChannelManager state can still be loaded. On first load with new code, the on-load reconstruction runs one final time (the existing logic is retained behind the version check). After the ChannelManager is re-persisted, all state is in the new format.
Use SERIALIZATION_VERSION (channelmanager.rs:17246) and RECONSTRUCT_HTLCS_FROM_CHANS_VERSION (channelmanager.rs:17258) to gate new behavior.
Phased Approach
Phase 1: Persistent MonitorEvents with acknowledgment
Add MonitorEventId, pending_unacknowledged_events, deferred_restart_events to ChannelMonitorImpl
Add acknowledge_monitor_events() to ChannelMonitor and chain::Watch
Change get_and_clear_pending_monitor_events() to not clear for new events
Existing MonitorEvent variants (HTLCEvent, CommitmentTxConfirmed, etc.) continue to use the old fire-and-forget path during this phase
Phase 2: Move HTLC forwarding to monitor-driven events
Add HTLCIrrevocablyCommitted step and HTLCAccepted event
Channel generates the new step in revoke_and_ack() instead of pushing to monitor_pending_update_adds
ChannelManager processes HTLCAccepted in the event loop
Remove decode_update_add_htlcs map, monitor_pending_update_adds, monitor_pending_forwards
Remove inbound_htlcs_pending_decode(), inbound_forwarded_htlcs(), outbound_htlc_forwards(), all dedup helpers
Phase 3: Move forwarded HTLC claiming to monitor events
Add FulfillHTLC step and ForwardedHTLCClaimed event
Remove RAAMonitorUpdateBlockingAction, EventCompletionAction::ReleaseRAA, blocked_monitor_updates, hold_mon_update parameter, etc.
Remove MonitorUpdateCompletionAction::EmitEventOptionAndFreeOtherChannel
Phase 4: Move inbound MPP claiming to monitor-driven events
Add InboundMPPClaimPersisted event
Change claim_funds_internal() to rely on deferred monitor events
Remove PendingMPPClaim, PendingMPPClaimPointer, ClaimedMPPPayment RAA blocker
Remove MPPClaimHTLCSource, HTLCClaimSource
Phase 5: Simplify on-load logic
MonitorEvent replay loop
from_channel_manager_data()
_legacy suffixed fields
Risks and Open Questions
Monitor persistence size: Storing events until acknowledged increases the persistent monitor size. Events are relatively small (one per HTLC), but high-volume nodes could accumulate events if the ChannelManager is slow to persist. Mitigation: batch acknowledgments; keep events compact; bound the maximum number of unacknowledged events.
Idempotent event generation: When a ChannelMonitorUpdate is replayed on restart, the monitor must not duplicate events. The implementation must check for existing events with the same HTLC identifier before generating new ones.
Backwards compatibility on upgrade: First load with new code must bootstrap the new event state from the existing reconstruction. The existing reconstruction logic runs one final time, then the ChannelManager is re-persisted in the new format. This means the existing reconstruction code must be maintained (but can be feature-gated) until we're confident all users have upgraded.
InboundHTLCState / InboundUpdateAdd simplification: The Channel still needs to track HTLCs for commitment transaction negotiation, but InboundUpdateAdd::Forwarded (which exists solely for on-load reconstruction) can be removed. The Channel can transition directly from WithOnion to "onion consumed" without tracking forwarding state.
Timing of HTLCAccepted events: The monitor needs to know when an HTLC is irrevocably committed. Today, this is derived from the Channel's InboundHTLCState machine. In the new design, the Channel sends an explicit HTLCIrrevocablyCommitted step at the right moment. The monitor doesn't need to replicate the state machine: it just needs to generate the event when it receives the step. This is simpler than having the monitor independently track HTLC lifecycle states.
Performance: The current "clear on read" approach for MonitorEvents is zero-cost at read time. Persistent events require cloning on read and
additional serialization. However:
than the small runtime cost
Trampoline forwards: HTLCSource::TrampolineForward contains multiple previous_hop_data entries. The new event system must handle this: a single ForwardedHTLCClaimed event for a trampoline forward should carry all upstream sources. This is straightforward since the HTLCSource enum already handles this.
Summary
ChannelManager-persisted HTLC state
blocked_monitor_updates mechanism
monitor_pending_* fields
PendingMPPClaim + PendingMPPClaimPointer + Arc<Mutex> + RAA blockers + completion actions
EventUnblockedChannel
The key insight is that persistent, acknowledged MonitorEvents replace both the RAA-blocking mechanism (which existed to ensure preimages aren't lost across monitor updates) and the on-load reconstruction logic (which existed because MonitorEvents were fire-and-forget). By making events durable and acknowledgment-driven, we get correctness by construction: the monitor holds onto events until the ChannelManager has processed them, and on restart we simply replay.
The channel closure race condition (channel.rs:3124–3127) disappears because there are no monitor_pending_* fields to lose: the ChannelMonitor generates events directly from its own state, which is always durable.