[bug]: cannot connect to peer - "failing link: unable to handle upstream settle with error: invalid update", "unknown channel ID" #8130
Comments
Could you look up the hash of the HTLC and grep the logs for it? It seems that your node force-closed; could you verify that?
No, according to my logs the peer closed:
Shortly after this, regarding the HTLC:
The channel is still in pendingchannels (not closedchannels); the HTLC is:
Are you in contact with the peer? If so, it would be helpful if they could share logs (they can do that privately, perhaps via DM on Slack).
Maybe we should add a log here: https://github.com/lightningnetwork/lnd/blob/master/htlcswitch/link.go#L1752, because I am curious which HTLC your peer tried to settle with you.
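As a reference point, here is a tiny, self-contained Go sketch of the kind of log line that suggestion would produce: given the HTLC ID and preimage from the peer's settle message, it derives the payment hash so the exact HTLC can be grepped for in the logs. The helper name and log wording are illustrative assumptions, not lnd's actual logging code.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// settleLogLine is a hypothetical helper: given the HTLC ID and preimage
// carried by the peer's update_fulfill_htlc, it derives the payment hash so
// the operator can grep for the exact HTLC. Names and format are illustrative.
func settleLogLine(htlcID uint64, preimage [32]byte) string {
	payHash := sha256.Sum256(preimage[:])
	return fmt.Sprintf("peer attempted to settle htlc(id=%d, pay_hash=%x) "+
		"not found in the local update log", htlcID, payHash)
}

func main() {
	var pre [32]byte // placeholder preimage
	fmt.Println(settleLogLine(42, pre))
}
```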
I think it is the d1fb33 tx you mention; this is the full lncli closedchannels info for that channel:
Sorry for the horribly late reply. The original peer told me he only had logs for the past 30 minutes, so I assumed he did not have anything left when you asked. But as mentioned, I had the same problem with other peers and contacted one of them - KnockOnWood / 039e05e271f537cfa1c060d2364b960b85bd509ac89bae524e4a01948a07b3e8d1. In this case the outage only lasted 15 minutes. From my side:
They wrote:
Sounded very strange; I thought my node did try to contact theirs, so I would expect my node to be mentioned, but at least we have some info.
Ok, that brings a bit of light into this problem. So the HTLC which you have in your
These messages are normal; they basically mean the sender of the payment had an old policy update of your channel and you are rejecting this HTLC.
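To make the "stale policy" rejection concrete, here is a minimal, self-contained Go sketch of a forwarding fee check against the node's current channel policy. The type and function names are invented for illustration and are not lnd's implementation; the point is only that a sender routing with an outdated (cheaper) policy offers too little fee and gets its HTLC failed back, which is harmless.

```go
package main

import "fmt"

// forwardingPolicy mirrors the fee fields a node advertises in its channel
// update: base fee in msat and a proportional fee in parts per million.
type forwardingPolicy struct {
	BaseFeeMsat uint64
	FeeRatePPM  uint64
}

// expectedFee returns the minimum fee the node requires for forwarding the
// given outgoing amount under the current policy.
func expectedFee(p forwardingPolicy, outgoingMsat uint64) uint64 {
	return p.BaseFeeMsat + outgoingMsat*p.FeeRatePPM/1_000_000
}

// acceptHTLC fails a forward whose offered fee was computed from a stale,
// cheaper policy, which is the benign failure described above.
func acceptHTLC(p forwardingPolicy, incomingMsat, outgoingMsat uint64) error {
	if incomingMsat < outgoingMsat {
		return fmt.Errorf("incoming amount below outgoing amount")
	}
	offeredFee := incomingMsat - outgoingMsat
	if need := expectedFee(p, outgoingMsat); offeredFee < need {
		return fmt.Errorf("fee insufficient: offered %d msat, need %d msat",
			offeredFee, need)
	}
	return nil
}

func main() {
	// The node has since raised its base fee to 2 sat, but the sender
	// built the route with the old 1 sat base fee, offering 1100 msat.
	current := forwardingPolicy{BaseFeeMsat: 2000, FeeRatePPM: 100}
	fmt.Println(acceptHTLC(current, 1_001_100, 1_000_000))
}
```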
So I encountered the exact same problem with another node and could narrow the problem down. In case you still have some channels which do not reactivate because of this, @Roasbeef @yyforyongyu I think you should take a look at it. What happened to the other node runner: he had a channel with 10 HTLCs, with the following relevant forwarded HTLC:
Now his peer, also an LND node, tried to settle this exact HTLC. He tries to settle the HTLC with the ID:
But somehow his node thinks this HTLC is not locked in:
=> relevant code line: line 1755 at commit f005b24
What I think happens is that the hashes of the two onion blobs, remote and local, somehow differ, and therefore we do not count this HTLC as an active HTLC which is fully locked in. => relevant code line: https://github.com/lightningnetwork/lnd/blob/master/channeldb/channel.go#L2094-L2120

Looking briefly at this code, I am not sure whether we need this kind of strict check. Do we really need to make sure both onion blobs are the same in the settle case? We can just check whether the preimage is good, and if it is, we will never need this onion blob anyway. So I think this check can be loosened up. That leaves the question of how the two onion blobs could diverge; maybe we need to check in detail whether we flush different things in some cases.

Apart from that, the related channel got force-closed and the relevant HTLC was swept via the preimage by its peer, which is evidence that the peer tried to settle the correct HTLC and that the problem indeed lies locally, in a database inconsistency between the remote and the local onion blob. Sweep of the HTLC by the preimage: https://mempool.space/tx/9a98fcd342d575dbbd225a8d921b605664eed3b90c6306a4723b56521293f02d
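For illustration, a minimal, self-contained Go sketch of the distinction argued for above: a strict match that also requires the local and remote onion blobs to be byte-identical, versus a loosened settle check that only matches on HTLC index and payment hash and verifies that the preimage actually hashes to the payment hash. The struct and function names are invented for this sketch and do not correspond to lnd's actual code in channeldb/channel.go.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
)

// htlcEntry is an illustrative stand-in for the per-commitment HTLC record;
// field names are assumptions, not lnd's real types.
type htlcEntry struct {
	HtlcIndex   uint64
	PaymentHash [32]byte
	OnionBlob   []byte
}

// strictMatch mirrors the kind of check discussed above: the HTLC only counts
// as fully locked in if the local and remote copies agree on every field,
// including the onion blob, so a single diverging byte makes the settle fail.
func strictMatch(local, remote htlcEntry) bool {
	return local.HtlcIndex == remote.HtlcIndex &&
		local.PaymentHash == remote.PaymentHash &&
		bytes.Equal(local.OnionBlob, remote.OnionBlob)
}

// loosenedSettleCheck is the proposed relaxation: for a settle we only need
// the HTLC to exist on both commitments and the preimage to hash to the
// payment hash; the onion blob is irrelevant once the payment is fulfilled.
func loosenedSettleCheck(local, remote htlcEntry, preimage [32]byte) bool {
	return local.HtlcIndex == remote.HtlcIndex &&
		local.PaymentHash == remote.PaymentHash &&
		sha256.Sum256(preimage[:]) == local.PaymentHash
}

func main() {
	var pre [32]byte
	copy(pre[:], []byte("example-preimage"))
	hash := sha256.Sum256(pre[:])

	local := htlcEntry{HtlcIndex: 7, PaymentHash: hash, OnionBlob: []byte{1, 2, 3}}
	remote := local
	remote.OnionBlob = []byte{1, 2, 4} // diverged onion blob, the suspected bug

	fmt.Println("strict:", strictMatch(local, remote))                // false
	fmt.Println("loosened:", loosenedSettleCheck(local, remote, pre)) // true
}
```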
Could you guys patch #8220 so we can have more info around this area? Thanks!
In case your relevant channel is already closed, use lightninglabs/chantools#97 to dump the relevant data to the terminal.
We were able to get the relevant data, at least for the case I mentioned, and it underlines my assumption that the onion SHAs mismatch. But it's weird, because only a single HTLC matches in the onion-blob SHA between remote and local:
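As a small aid when comparing the dumped data, here is a self-contained sketch that hashes a hex-encoded onion blob so the local and remote SHAs can be compared directly. The hex input format is an assumption about how the dump prints the blob; the placeholder values are not real data.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// onionSHA decodes a hex-dumped onion blob and returns the hex-encoded
// SHA-256, which can then be compared between the local and remote dumps.
func onionSHA(hexBlob string) (string, error) {
	raw, err := hex.DecodeString(hexBlob)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(raw)
	return hex.EncodeToString(sum[:]), nil
}

func main() {
	localSHA, _ := onionSHA("00010203")  // placeholder local blob
	remoteSHA, _ := onionSHA("00010204") // placeholder remote blob
	fmt.Println("match:", localSHA == remoteSHA)
}
```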
I have been running commit e3761459f52e9c343a759aa55621ce11613bc33d since yesterday around 20:00 CET; should I enable debug logs? I already saw the "unknown channel" message twice with this version, but with log level INFO. Going to set CHAN to DBG.
Restarted lnd, now running with e376145 cherry-picked over 0.17.2, got this:
I have 3 HTLCs right now with that peer, 2 of them have forwarding_channel = 0 (I see this in
There is no log category CHAN, I am running with
Thanks for helping out, we are already working on a fix.
Fixed with #8220 (lnd 0.17.3).
Background
I had a channel, but could not connect to the peer, always getting disconnected with logs like this:
The channel was force-closed a day later, to claim the incoming HTLC I guess. The channel was working and forwarding at least until 28/Oct/2023 22:54 (the last recorded forward according to my node). I have / had 3-4 other channels with the same problem; in at least one case the complaints about the unknown channel ID stopped, without restarting lnd and without closing that channel.
At the very least, this problem makes a channel unusable. And it sounds a bit scary: did one of the nodes lose some data? How can a node suddenly stop recognizing its own channel?
Your environment
lnd: 0.17 (both peers). I use boltdb, peer probably postgres (on raspiblitz).
bitcoind: v25.1rc1
Steps to reproduce
No idea.
Expected behaviour
Should be able to connect.