-
Notifications
You must be signed in to change notification settings - Fork 5
refactor(dynamic mount): introduce dynamic server manager #11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
312876a to
19509c8
Compare
19509c8 to
6823740
Compare
|
@gemini-code-assist review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a significant and well-motivated refactoring to improve the management of dynamic CSI mounts. The introduction of DynamicServerManager to handle per-volume HTTP servers is a great step towards better security and stability. The code is well-structured, and backward compatibility for the old shared socket approach is thoughtfully maintained. My review includes a few suggestions to enhance robustness, particularly around context propagation for graceful shutdowns, ensuring proper logging severity for error conditions, and maintaining test coverage for the refactored code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request is a significant refactoring to improve the management of dynamic CSI mounts. It introduces a DynamicServerManager to create per-volume HTTP servers, which enhances security and stability by isolating volumes from each other and resolving issues with mount propagation. The changes are well-structured, moving logic into a new service layer and cleaning up the old implementation.
My review focuses on a few critical areas:
- Concurrency safety: I've identified a critical issue with the use of
os.Chdirin a concurrent environment, which can lead to race conditions. - Context propagation: There are places where
context.Background()is used instead of propagating the request-scoped context, which can affect graceful shutdown. - Error handling: I've pointed out a couple of spots where error handling could be more robust to prevent unexpected behavior.
Overall, this is a great improvement. Addressing these points will make the implementation even more robust.
We found that when a pod container is recreated, there's a chance the dynamically mounted CSI directory gets unmounted. Since this directory is shared among multiple pod containers via mount propagation from /var/run/model-csi (a tmpfs), it introduces instability and security isolation risks, for example, if one pod learns another pod's volume name, it could maliciously call APIs to mount or unmount that submount of volume. To ensure secure isolation for each dynamic mount and avoid unstable mount propagation, an independent csi.sock is currently created under each dynamic mount directory instead of using a shared csi.sock, these individual csi.sock servers are managed by the DynamicServerManager. Signed-off-by: imeoer <[email protected]>
6823740 to
de42dbc
Compare
aftersnow
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
chlins
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
We found that when a pod container is recreated, there's a chance the dynamically mounted CSI directory gets unmounted. Since this directory is shared among multiple pod containers via mount propagation from
/var/run/model-csi(a tmpfs), it introduces instability and security isolation risks, for example, if one pod learns another pod's volume name, it could maliciously call APIs to mount or unmount that submount of volume.This pull request refactors the management of dynamic HTTP servers for CSI mounts, improving security and stability by introducing the
DynamicServerManagerand deprecating the previous shared dynamic CSI endpoint approach. It also restructures the codebase for better maintainability by moving HTTP server logic frompkg/server/http.goto new service-layer files and updating handlers accordingly. Additionally, it enhances resource cleanup and recovery logic for dynamic servers, and updates mount/unmount workflows to leverage the new manager.Dynamic Server Management and API Refactor:
DynamicServerManagerinpkg/service/dynamic_server.goto handle lifecycle (creation, closing, recovery) of per-volume dynamic HTTP servers, replacing the old shared socket approach. This improves isolation and avoids mount propagation issues.dynamic_csi_endpointin favor of individualcsi.socksockets managed per dynamic mount directory, as documented inpkg/config/config.go.pkg/server/http.goand replaced it with dynamic server logic in the service layer. [1] [2]Handler and API Changes:
HttpHandlerinpkg/server/http_handler.gotoDynamicServerHandlerinpkg/service/dynamic_server_handler.go, updating method names and request types for clarity and consistency. [1] [2] [3] [4] [5] [6] [7]Mount/Unmount Workflow Updates:
pkg/service/node_dynamic.goto useDynamicServerManagerfor server lifecycle management, and added device checks to determine the correct cleanup path for backward compatibility. [1] [2]Resource Management and Utilities:
EnsureSockNotExists), replacing the previous inline implementation for better reuse. [1] [2]pkg/config/config.go.Test and Import Cleanup:
pkg/server/server_test.goand cleaned up unused imports related to the old HTTP server code. [1] [2]This refactor lays the groundwork for more robust and secure dynamic mount management, while maintaining backward compatibility for legacy setups.