Skip to content

Conversation

ysbaddaden
Copy link
Contributor

@ysbaddaden ysbaddaden commented Sep 2, 2025

First pass at fixing #16127 using a basic ref counter and delayed close to when there are no more references.

The integration is limited to UNIX, needs to refactor a few system methods, and requires a new EventLoop#before_close, distinct from EventLoop#close, so we can close the fd lock, resume pending waiters (references can be decremented) then we can finally close the fd for real.

The refcount only happens for calls that may affect the fd, mere queries for the fd configuration skip it: at worst they return whatever config or fail normally.

closes #16127

@ysbaddaden
Copy link
Contributor Author

The GC prints a bunch of GC Warning: Finalization cycle involving (???) 😭

I'm at a loss to guess where the cycle can come from.

@ysbaddaden ysbaddaden force-pushed the feature/add-crystal-fd-lock branch from 68e1929 to 47bc6ef Compare September 2, 2025 18:10
end

def close(file_descriptor : Crystal::System::FileDescriptor) : Nil
def before_close(file_descriptor : Crystal::System::FileDescriptor) : Nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thought: The names before_close and close are not very descriptive about their actual purpose. Maybe we can find more explicit names?
before_close closes the wrapper object, while close closes the system resource. Maybe close_wrapper and close_system_fd? Not entirely convinced these are the best choices, but they seem better than the current ones...

Copy link
Contributor Author

@ysbaddaden ysbaddaden Sep 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what you mean with "wrapper object". Maybe #before_close could be #resume_all but it becomes a directive (do this) rather than a broader "we're about to close, do whatever you need to"

But #close becomes more to the point: close this fd/sock (and only useful for io_uring that can close async).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe "wrapper object" is not a good term. I was thinking about FileDescriptor as being a wrapper of the system API. And before_close takes care of closing the aspects specific to the runtime, while close closes the actual file descriptor resource for the OS.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that the behavior of #before_close here is identical to the #reopened hook called after we reopen a file descriptor: resume any pending waiter for the given IO::FileDescriptor or Socket. We might as well have a single #resume_all, maybe?


private def decrement_slow(m)
# last ref after close must resume the closing fiber
return unless (m & CLOSED) == CLOSED && (m & MASK) == REF
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: Simplify to CLOSED | REF

Suggested change
return unless (m & CLOSED) == CLOSED && (m & MASK) == REF
return unless m == CLOSED | REF

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could, but I prefer a more explicit "this is closed" + "this is the last ref", and it won't break if we add more bits later.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me it was a bit surprising because the expression is more complicated than it needs to be. The simplified version is much easier to understand that this matches actually one single state.

@ysbaddaden
Copy link
Contributor Author

@straight-shoota I'll see the suggestions after I figure out the issues (especially the GC finalization cycles).

@ysbaddaden
Copy link
Contributor Author

Now the interpreter is broken 😮‍💨

@ysbaddaden
Copy link
Contributor Author

I made FdLock a NOOP until the preview_mt flag is set.

On UNIX also delegates the implementation from `System::Socket` to
`System::FileDescriptor` since it's identical.
Instead use class methods on System::Socket on Win32 for the one case
where we need to configure the socket handle after accept.
@ysbaddaden
Copy link
Contributor Author

There's an issue on Windows. I should rebase from #16137 an investigate.

@ysbaddaden ysbaddaden force-pushed the feature/add-crystal-fd-lock branch from 555f008 to 9301e5b Compare September 8, 2025 12:18
straight-shoota pushed a commit that referenced this pull request Sep 9, 2025
…#16137)

The main point is to not pass `fd` explicitly as an argument, and to instead let `System::FileDescriptor` and `System::Socket` use the `fd` property. This allows #16128 to wrap calls affecting the `fd` in a reference counter.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Review
Development

Successfully merging this pull request may close these issues.

Closing fd is thread unsafe on UNIX targets
2 participants