Skip to content

LSPS2: Add error handling events for failed client requests #3804

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

martinsaposnic
Copy link
Contributor

Fixes #3459

Add GetInfoFailed and BuyRequestFailed event variants to LSPS2ClientEvent
to handle LSP error responses, similar to the OrderRequestFailed event
added to LSPS1.

@ldk-reviews-bot
Copy link

ldk-reviews-bot commented May 27, 2025

👋 Thanks for assigning @joostjager as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.

Copy link

codecov bot commented May 27, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 89.84%. Comparing base (c46d0df) to head (f1c95fc).
Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3804      +/-   ##
==========================================
+ Coverage   89.73%   89.84%   +0.10%     
==========================================
  Files         159      159              
  Lines      128910   128937      +27     
  Branches   128910   128937      +27     
==========================================
+ Hits       115682   115848     +166     
+ Misses      10535    10395     -140     
- Partials     2693     2694       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Contributor

@tnull tnull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Mostly looks good, just some comments.

Will review the test changes closer once #3712 landed and we rebased, as the same refactoring is happening there.

code: LSPS0_CLIENT_REJECTED_ERROR_CODE,
message: "Reached maximum number of pending requests. Please try again later."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it was very intentional that we hardcoded a standardized response here, because while we want to log the full details, the counterparty shouldn't learn anything about our rate limits.

Could you expand what else you mean with "incorrect error response types"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you expand what else you mean with "incorrect error response types"?

insert_pending_request is called by handle_get_info_request and handle_buy_request but it always returns BuyError, even if the request type is GetInfo. Now it returns GetInfoError for GetInfo requests and BuyError for Buy requests.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, right, that difference got lost in the diff here. Yes, then let's still keep returning the hardcoded error strings to the counterparties, but just make sure the right response type is returned.

@ldk-reviews-bot
Copy link

👋 The first review has been submitted!

Do you think this PR is ready for a second reviewer? If so, click here to assign a second reviewer.

@ldk-reviews-bot
Copy link

✅ Added second reviewer: @joostjager

service_node.liquidity_manager.handle_custom_message(get_info_request, client_node_id).unwrap();

let get_info_event = service_node.liquidity_manager.next_event().unwrap();
match get_info_event {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With something like this, you can keep the happy flow unindented for improved readability.

	let LiquidityEvent::LSPS2Service(LSPS2ServiceEvent::GetInfo {
		request_id,
		counterparty_node_id,
		token,
	}) = service_node.liquidity_manager.next_event().unwrap() else {
		panic!("Unexpected event");
	};

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let-else bindings are only stabilized since rustc 1.65, so we can't use them currently due to our 1.63 MSRV.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm too bad. Still if let can I think remove one level of indent?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

found a few places where indentation is reduced so I did the if let there 👍

"Peer {} reached maximum number of total pending requests: {}",
if self.total_pending_requests.load(Ordering::Relaxed) >= MAX_TOTAL_PENDING_REQUESTS {
let error_message = format!(
"reached maximum number of total pending requests {}: {}",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not capitalized anymore?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it useful to have counterparty_node_id in this message when the total is exceeded?

@@ -1226,45 +1226,43 @@ where
&self, peer_state_lock: &mut MutexGuard<'a, PeerState>, request_id: LSPSRequestId,
counterparty_node_id: PublicKey, request: LSPS2Request,
) -> (Result<(), LightningError>, Option<LSPSMessage>) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pre-existing but shouldn't the type be Result<(), (LightningError, LSPSMessage>)?

Copy link
Contributor

@tnull tnull May 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, the method returns either Ok(()) or an error, and optionally a message that needs to be sent to the counterparty.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought that in this fn, it's indeed either Ok(()), or an error always accompanied by a msg (not optional). But maybe this is a useful type further up the call stack?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the rest of the API follows the same pattern of returning Result<(), LightningError>. Changing the result type in this way here would mean we'd get an Err variant that we can't just bubble up at the callsite but rather awkwardly need to map, etc.

Also note that we chose that if let Some(msg) pattern to ensure we're able to drop all locks (in particular peer_state_lock) before enqueuing messages, which might lead to reentry when PeerManager processes the enqueued messages. I.e., it was exactly introduced to avoid handling/enqueing the messages in the match.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Err variant is already not just bubbled up. Both call sites do a match on the result already?

I was indeed wondering why the lock had to be dropped before enqueueing. Could be worthy of a comment.

This thread is mostly out of curiosity / to learn. No changes requested.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Err variant is already not just bubbled up

Yes it is, essentially at all callsites?

I was indeed wondering why the lock had to be dropped before enqueueing. Could be worthy of a comment.

Well, it's also best practice in general. But yeah. At some point we could even consider a refactoring that would make it infeasible to enqueue while still holding the lock.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it is, essentially at all callsites?

There is still a match at the call sites.

Well, it's also best practice in general.

You mean best practice to not hold the lock longer than necessary? Or not queueing while holding the lock?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean best practice to not hold the lock longer than necessary? Or not queueing while holding the lock?

The latter.

Add GetInfoFailed and BuyRequestFailed event variants to LSPS2ClientEvent
to handle LSP error responses, similar to the OrderRequestFailed event
added to LSPS1.

Fixes lightningdevkit#3459
Update handle_get_info_error and handle_buy_error to:
- Emit GetInfoFailed and BuyRequestFailed events to notify users
- Return LightningError instead of Ok() to properly signal failures
- Use the actual error parameter instead of ignoring it

This ensures consistent error handling behavior across LSPS implementations.
Replace hardcoded BuyError responses with appropriate error types based on
the actual request (GetInfo vs Buy) when rate limiting is triggered.

This fixes incorrect error response types and provides an easy way to test
the error event handling added in this PR.
Add tests for the new error events:
- Test per-peer request limit rejection (GetInfoFailed)
- Test global request limit rejection (BuyRequestFailed)
- Test invalid token handling (GetInfoFailed)

These tests verify that error events are properly emitted when LSP
requests fail, ensuring the error handling behavior works end-to-end.
… improve readability and reduce indentation in some places
@martinsaposnic
Copy link
Contributor Author

@tnull @joostjager thank you for your reviews! Comments are addressed on fixup commits

Will review the test changes closer once #3712 landed and we rebased, as the same refactoring is happening there.

Rebase with #3712 is done

@martinsaposnic
Copy link
Contributor Author

The build macos-latest 1.63.0 failed with error [build ](error: package lock_api v0.4.13 cannot be built because it requires rustc 1.64.0 or newer, while the currently active rustc version is 1.63.0)

I don't think it's related to my PR but will take a closer look tomorrow

assert_eq!(counterparty_node_id, service_node_id);
assert_eq!(error.code, 1); // LSPS0_CLIENT_REJECTED_ERROR_CODE
},
_ => panic!("Expected LSPS2ClientEvent::BuyRequestFailed event"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: more indent removal possible using if let? Also further up there are blocks.

@tnull
Copy link
Contributor

tnull commented May 30, 2025

The build macos-latest 1.63.0 failed with error [build ](error: package lock_api v0.4.13 cannot be built because it requires rustc 1.64.0 or newer, while the currently active rustc version is 1.63.0)

I don't think it's related to my PR but will take a closer look tomorrow

Yeah, no, this is unrelated as they unfortunately bumped the MSRV on a patch release. CI fix is here: #3812

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

LSPS2: Add LSPS2ClientEvent::RequestFailed event
4 participants