Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

renepay: handle it if channel_capacity unavailable. #7194

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

rustyrussell
Copy link
Contributor

CI hit this, and it's almost certainly because gossipd was writing the records at the same time:

2024-04-02T15:27:49.4466059Z lightningd-1 2024-04-02T15:21:58.479Z UNUSUAL 035d2b1192dfba134e10e540875d366ebc8bc353d5aa766b80c090b39c3a5d885d-chan#2: gossipd lost track of announced channel: re-announcing!
2024-04-02T15:27:49.4467604Z lightningd-1 2024-04-02T15:21:58.618Z UNUSUAL 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-chan#1: gossipd lost track of announced channel: re-announcing!
...
2024-04-02T15:27:49.4484027Z lightningd-1 2024-04-02T15:21:59.115Z **BROKEN** plugin-cln-renepay: uncertainty_network_update (line 175) unable to fetch channel capacity

If we can't get the capacity, it's transient: use htlc_max, or 0 if we don't have one.

CI hit this, and it's almost certainly because gossipd was writing the records at the same time:

```
2024-04-02T15:27:49.4466059Z lightningd-1 2024-04-02T15:21:58.479Z UNUSUAL 035d2b1192dfba134e10e540875d366ebc8bc353d5aa766b80c090b39c3a5d885d-chan#2: gossipd lost track of announced channel: re-announcing!
2024-04-02T15:27:49.4467604Z lightningd-1 2024-04-02T15:21:58.618Z UNUSUAL 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-chan#1: gossipd lost track of announced channel: re-announcing!
...
2024-04-02T15:27:49.4484027Z lightningd-1 2024-04-02T15:21:59.115Z **BROKEN** plugin-cln-renepay: uncertainty_network_update (line 175) unable to fetch channel capacity
```

If we can't get the capacity, it's transient: use htlc_max, or 0 if we don't have one.

Signed-off-by: Rusty Russell <[email protected]>
@Lagrang3
Copy link
Collaborator

Lagrang3 commented Apr 3, 2024

If we temporarily set the channel capacity to hltc_max we should also, at some point, update the capacity to the real value; maybe adding a new chan_extra_update function that checks if the capacity in the database is the same or different from the one provided by the gossmap and try to update it. In the same loop I wold have:

        for(struct gossmap_chan *chan = gossmap_first_chan(gossmap);
	    chan;
	    chan=gossmap_next_chan(gossmap,chan))
	{
		struct short_channel_id scid =
			gossmap_chan_scid(gossmap,chan);
		struct chan_extra *ce = chan_extra_map_get(chan_extra_map,
							   gossmap_chan_scid(gossmap,chan));
		if(!ce)
		{
			struct amount_sat cap;
			struct amount_msat cap_msat;

			if(!gossmap_chan_get_capacity(gossmap,chan,&cap))
			{
				... // use htlc_max as proxy for the capacity
			}
			else if(!amount_sat_to_msat(&cap_msat,cap))
			{
				plugin_err(...);
			}
			new_chan_extra(chan_extra_map,scid,cap_msat);
		} else  if( !chan_extra_update(ce, cap_msat) )
                        plugin_err(...);
                }
	}

This will increase the computational burden of updating the uncertainty network (calling uncertainty_network_update).

Comment on lines +173 to +189
/* This can happen transiently if gossipd is
* writing it to the store right now. Set
* it to larger htlc_max */
cap_msat = AMOUNT_MSAT(0);
for (int dir = 0; dir < 2; dir++)
{
struct amount_msat htlc_max;
if (!gossmap_chan_set(chan, dir))
continue;
htlc_max = amount_msat(fp16_to_u64(chan->half[dir].htlc_max));
if (amount_msat_greater(htlc_max, cap_msat))
cap_msat = htlc_max;
}
plugin_log(pay_plugin->plugin, LOG_UNUSUAL,
"Cannot fetch capacity for channel %s: using %s",
fmt_short_channel_id(tmpctx, scid),
fmt_amount_msat(tmpctx, cap_msat));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An easier solution would be to ignore those channels temporarily, they can still be picked up next time uncertainty_network_update is called again.

Suggested change
/* This can happen transiently if gossipd is
* writing it to the store right now. Set
* it to larger htlc_max */
cap_msat = AMOUNT_MSAT(0);
for (int dir = 0; dir < 2; dir++)
{
struct amount_msat htlc_max;
if (!gossmap_chan_set(chan, dir))
continue;
htlc_max = amount_msat(fp16_to_u64(chan->half[dir].htlc_max));
if (amount_msat_greater(htlc_max, cap_msat))
cap_msat = htlc_max;
}
plugin_log(pay_plugin->plugin, LOG_UNUSUAL,
"Cannot fetch capacity for channel %s: using %s",
fmt_short_channel_id(tmpctx, scid),
fmt_amount_msat(tmpctx, cap_msat));
/* This can happen transiently if gossipd is
* writing it to the store right now.
* Ignore this channel for the moment, we can
* pick it up next time. */
plugin_log(pay_plugin->plugin, LOG_UNUSUAL,
"Cannot fetch capacity for channel %s: skipping it.",
fmt_short_channel_id(tmpctx, scid));

The only issue I can think of right now, is that we will have some channels in gossmap that have no counterpart in chan_extra_map and we would have to handle this case when the MCF graph is built (init_linear_network in mcf.c assumes that).

@Lagrang3
Copy link
Collaborator

I can fix this issue on top of #7125, by just skipping channels for which gossmap_chan_get_capacity fail.
If we merge these changes here there will be some conflicts with #7125.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants