-
Notifications
You must be signed in to change notification settings - Fork 651
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handle bind failures when running NIO in a tight sandbox #2616
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file is not related to the PR right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
correct not related.
Sources/NIOCore/Channel.swift
Outdated
|
||
/// An attempt to bind a `Channel` failed. | ||
case bindFailed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sadly this is a breaking change but I don't think we need this case at all.
Sources/NIOPosix/SocketChannel.swift
Outdated
self.updateCachedAddressesFromSocket(updateRemote: false) | ||
try self.socket.listen(backlog: backlog) | ||
} catch { | ||
promise?.fail(ChannelError.bindFailed) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should definitely still fail the promise with the error produced by the socket.bind
class. Otherwise it is hard to tell what went wrong. Could you explain why this approach works around the assert? I would like to understand how we actually ended up in the assert in the first place since something must have caused the call to readable
and we just have to avoid that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
after re thinking my approach the way I set up my sandbox to test this was not correct, thus nor is this solution. In the actual testChannelInactiveCancelScheduled
I set up a class MockServerSocket: ServerSocket {}
that essentially always simulate bind failure. However I did this by overriding bind but by doing that I am not hitting all of NIOs binding process.
Now I am overriding the standard bind sys call thats called by NIO by creating a bind function that always returns EPERM
int bind(int sockfd, const struct sockaddr *addr, socklen_t addrlen) {
int (*original_bind)(int, const struct sockaddr *, socklen_t);
original_bind = dlsym(RTLD_NEXT, "bind");
errno = EPERM;
return -1;
}
compile it down to a shared object and preload it and run the test. Going to use this approach to create a new solution here.
In the actual testChannelInactiveCancelScheduled
we call readable
since we want to test whether scheduled reads are closed when the channel is closed. The issue here is readable
checks if the lifecycleManager
is active
, however at this stage it's only transitioned from fresh
to preRegistered
thus triggering the assert.
@@ -263,6 +263,7 @@ final class ServerSocketChannel: BaseSocketChannel<ServerSocket> { | |||
// It's important to call the methods before we actually notify the original promise for ordering reasons. | |||
self.becomeActive0(promise: promise) | |||
}.whenFailure{ error in | |||
self.close0(error: error, mode: .all, promise: nil) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Upon bind failing, the lifecycleManager
seems to remain in the preRegistered
state. thus we are transiting to the closed
state here.
@@ -884,7 +884,18 @@ extension SALTest { | |||
} | |||
} | |||
} | |||
|
|||
|
|||
func assertBindFailure(expectedAddress: SocketAddress, file: StaticString = #filePath, line: UInt = #line) throws { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we fold this into the existing assertBind
call, with an optional errorReturn
parameter that defaults to nil
? That's a little closer to how we typically structure things in the SAL.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done!
import NIOEmbedded | ||
import NIOHTTP1 | ||
|
||
private class MessageEndHandler<Head: Equatable, Body: Equatable>: ChannelInboundHandler { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Presumably you don't actually want to add this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed!
…t-nio into fix-sandbox-bind-error
…dbox-bind-error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! I would like @Lukasa to have one final looks as well though
@@ -322,6 +322,48 @@ final class SALChannelTest: XCTestCase, SALTest { | |||
} | |||
}.salWait()) | |||
} | |||
|
|||
func testBindFailureClosesChannel() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test isn't specific enough: it continues to pass even with the new close0
line removed. It looks like this is because the failed bind
promise is caught by the ClientBootstrap
in the connect
flow and triggers a close0
.
The same code appears to be present in the server bootstrap. Can we try to nail down where the bind
issue came from in your original reproducer? It doesn't currently look like bind0
needs to close where it does.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, I will be taking a closer look at this over the weekend!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comment
Addressed
AcceptBackoffHandlerTest
assertion failure when running NIO in a tight sandbox with bind disallowed.Motivation:
When running SwiftNIO tests in a restricted environment, an assertion error was encountered when bind and other network related operations were disallowed. The tests were expected to fail in such environment, but hitting an assertion was not expected. This change aims to handle such failure gracefully. Notably, the test in
AcceptBackoffHandlerTest
are all failing as expected, but it was also hitting an assertion, which was not the intended behavior. This PR closes #2465.Modifications:
XCTAssertNoThrow
insetupChannel
to allow bind failures to propagate upwards.close0
to release resources and ensure proper channel state management when bind fails.testBindFailureClosesChannel
inSALChannelTests
to simulate and test the behavior when a bind operation fails.assertBind
inSyscallAbstractionLayer
to also account for when a bind operation could fails as expected.Result:
After these changes, all tests in
AcceptBackoffHandlerTest
will fail gracefully instead of hitting an assertion if bind fails.