Add must-gather from Forklift UI enhancement #37
Conversation
Proposing an enhancement that allows executing must-gather from the Forklift UI.
I like the idea of making it easier for the end-user to obtain the must-gather. @aufi do you see a path forward for implementing this in a reusable way so other projects could share it and tweak it for their integration? (Updated comment: just saw you already mentioned plans for this being shared with other projects via #36.) @eriknelson @ibolton336 @djwhatle @shawn-hurley please take a look and consider whether this would be useful for Crane. Assume we'd need to have:
Correct, I'm planning to have the same must-gather-service as a backend (just with configuration adjustments) and the UI part specific per project. The backend can be configured, e.g. with DEFAULT_IMAGE, so it should be suitable for everyone: https://github.com/aufi/must-gather-rest-wrapper#available-api-parameters
Good point. I think bigger files (hundreds of MB) could be quite common (also in OpenShift must-gather outside the migration toolkit), so it could be useful. We could add size information to the API: when must-gather has finished and the archive is ready for download, the archive size would be part of the API response, letting the UI or user decide whether downloading the archive actually makes sense.
Right, the must-gather execution usually takes from a few minutes to a few tens of minutes. The must-gather execution/archive generation on the backend is asynchronous, so the UI triggers must-gather with a POST request and then periodically queries the API until the must-gather execution status reaches "completed"; then it can serve the archive file to the user. The user needs to wait until it finishes to be able to download the result.
When a VM migration fails, the Forklift user needs quick access to the migration debugging information and other details provided by the must-gather feature. This should help identify where the problem is and make a successful migration possible once the problem is resolved.
To make it easy for the user, it should be possible to trigger must-gather directly from the UI without having to switch to the CLI.
I like this idea of making it obvious to the user that this is an option from the UI. The approach we've taken for some things like this in Crane has been to provide a "call to action" in the UI with a "Click this button to copy command" for the CLI command that the user can execute on their own (e.g. provide the logs command we recommend the user run).
Since must-gather was designed for CLI in the first place and provides all output on progress through streaming must-gather Pod logs (and given that we have no way to offer progress otherwise on the several-minutes long process) I wonder if giving the user a pointer to the CLI command would be a more empowering UX that lets the user have more direct access to info.
As a counter point to my above argument, if Forklift is designed so the user rarely has to drop to CLI and this would be a jarring experience for them, I could see the argument for the UI-driven must-gather button being less scary to the user. My main gripe if I was the user here would be complete lack of progress reporting on a rather long-running process.
I would also be concerned about how this info would be stored and accessed (auth?) if it's run by the wrapper service, given that info gathered could contain sensitive info depending on how you wrote your must-gather. Would we require a PV allocated to this wrapper service? How long will must-gather dumps be kept in storage?
One example of a system I've seen like this is Google Takeout, a system for getting all your personal info from Google in a .zip archive. They kick off the process but let you know it will take a long time, and email you when it's done. They don't provide progress info but at least it's very easy to come back later and get the data you asked for. Just something to consider.
+1 IMO it sounds like the flow here would be best managed by Node or something on the middle/backend, since it would involve large streams of data and user-specific information that could be authenticated with stored session cookies.
I like this idea of making it obvious to the user that this is an option from the UI. The approach we've taken for some things like this in Crane has been to provide a "call to action" in the UI with a "Click this button to copy command" for the CLI command that the user can execute on their own (e.g. provide the logs command we recommend the user run). ...snip... My main gripe if I was the user here would be complete lack of progress reporting on a rather long-running process.
Thanks for the review! This is clearly an alternative to the CLI; as mentioned, this should fit Forklift well. Crane's approach is a little bit different (the command is executed by the user).
To provide updates during the must-gather execution process, I'm looking at capturing the command's streamed output and updating it in the must-gather execution JSON, so when the API returns an execution that has started, the response will contain the latest snapshot of the actual command output.
I would also be concerned about how this info would be stored and accessed (auth?) if it's run by the wrapper service, given that info gathered could contain sensitive info depending on how you wrote your must-gather. Would we require a PV allocated to this wrapper service? How long will must-gather dumps be kept in storage?
On authentication: the initial plan was that the pod running the service would have a PV with an auth token mounted. I'm not sure about the oauth part passed from the UI. I'm going to investigate more and provide an update later.
Must-gather dumps are planned to be in the container (not a PV) and to be regularly cleared (files & sqlite records). This has been implemented already and can be configured with the CLEANUP_MAX_AGE option: https://github.com/aufi/must-gather-rest-wrapper#configuration
To provide updates during the must-gather execution process, I'm looking at capturing the command's streamed output and updating it in the must-gather execution JSON, so when the API returns an execution that has started, the response will contain the latest snapshot of the actual command output.
Oh that sounds very cool! +1
On authentication: the initial plan was that the pod running the service would have a PV with an auth token mounted. I'm not sure about the oauth part passed from the UI. I'm going to investigate more and provide an update later.
Sounds good, just need to make sure the must-gather dumps are protected if they contain, for instance, VM management credentials.
Must-gather dumps are planned to be in the container (not PV)
Not sure I'm following, I don't think you can just write data "in the container" without some sort of volume mount, but maybe you're just using the filesystem of whichever node the rest-wrapper is running on?
and be regularly cleared (files & sqlite records). This has been implemented already and it can be configured with the CLEANUP_MAX_AGE option
Nice, smart usage of sqlite features there 👍
Must-gather dumps are planned to be in the container (not PV)
Not sure I'm following, I don't think you can just write data "in the container" without some sort of volume mount, but maybe you're just using the filesystem of whichever node the rest-wrapper is running on?
Thanks for the feedback; I meant the filesystem where the container runs.
I would personally add the backend and this together. I wonder if we can consider some sort of TypeScript module for executing the backend, which anyone using TypeScript with the reusable backend implementation could use?
I like this idea! It would be a really good candidate for https://github.com/konveyor/lib-ui if we can pull something together.
Thanks for the feedback provided so far! Based on the suggestions I merged this backend enhancement with the Forklift UI enhancement #36. The discussion can continue here.
Added auth by bearer token provided via the API; no serviceaccount or secret needs to be passed to the must-gather-api backend pod deployment.
The must-gather wrapper service is an administration tool serving users with an oauth token for the ```cluster-admin``` role, so it does not accept requests from anonymous or unknown clients; accepting an arbitrary ```image``` as an API parameter therefore seems reasonable.
To ensure that the must-gather result archive is provided to the right user, a hash of the provided bearer token is stored for each must-gather execution, and the API filters must-gather results by that hash.
Bearer tokens themselves are not the identity of the user IIUC. Do we have any other alternatives?
Good point on naming; correct, it identifies the user's temporary session rather than a persisted user account. The con is that if the user re-logged in, the must-gather executions from their previous login would no longer be visible.
I was thinking about translating the bearer token to a username (like the oc whoami output), but that requires an oc login call for each request, even for get/list must-gather executions, which I wanted to avoid. Let me check more options by tomorrow.
It seems I don't have anything better than "translating" the bearer token to the username it belongs to and keeping that username (or its hash) in the must-gather execution record. The bearer token -> username mapping would most likely be cached to avoid many login requests to the OpenShift API.
@shawn-hurley, do you think this is an acceptable solution (or would you recommend something else)? Thanks!
If you have access to the user that is requesting a resource could you just do a SubjectAccessReview to the specific resource to see if they have access rather than caching it on the object itself?
Yes, thanks, I can do something like Jeff did in https://github.com/konveyor/forklift-controller/blob/main/pkg/controller/provider/web/base/auth.go
@shawn-hurley Thanks for your feedback! The authorization section was updated with 69f2c92
The API service will check the bearer token before processing each request. The check is a SubjectAccessReview against the cluster with * on * (all actions on all objects, https://github.com/konveyor/forklift-must-gather-api/blob/main/pkg/backend/auth.go#L128-L142). Would that be okay from the enhancement specification point of view?
Anyone know if further work is planned on this writeup, is this in a state to merge, etc?
@aufi are we good with merging this, are there any other updates you would like to make?
I'm good with merging this enhancement spec!
Related to the must-gather backend enhancement proposal #36.