Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add must-gather from Forklift UI enhancement #37

Merged
merged 5 commits into from
Mar 3, 2022
Merged

Add must-gather from Forklift UI enhancement #37

merged 5 commits into from
Mar 3, 2022

Conversation

aufi
Copy link
Member

@aufi aufi commented Jul 16, 2021

Proposing enhancement allowing to execute must-gather from Forklift UI.

Related to must-gather backend enhancement proposal #36

Proposing enhancement allowing to execute must-gather from Forklift UI.
@jwmatthews
Copy link
Member

jwmatthews commented Jul 16, 2021

I like the idea of making it easier to obtain the must-gather for end-user.

@aufi do you see a path forward of implementing this in a reusable approach so other projects could share this and tweak for their integration? (Updated comment: Just saw you already mentioned plans of this being shared with other projects via #36)

@eriknelson @ibolton336 @djwhatle @shawn-hurley please take a look and consider if this would be useful to consider for Crane.

Assume we'd need to have:

  • sane way to limit the size of the must-gather so the user is not downloading a multi-hundred MB+ archive
  • need to account for the generation of must-gather taking several minutes+, I think we've seen 10+ min times to generate in past

@aufi
Copy link
Member Author

aufi commented Jul 16, 2021

@aufi do you see a path forward of implementing this in a reusable approach so other projects could share this and tweak for their integration? (Updated comment: Just saw you already mentioned plans of this being shared with other projects via #36)

Correct, I'm planning to have the same must-gather-service as a backend (just with configuration adjustments) and the UI part specific per-project. The backend can be configured e.g. for DEFAULT_IMAGE, so it should be suitable for everyone, https://github.com/aufi/must-gather-rest-wrapper#available-api-parameters

Assume we'd need to have:

  • sane way to limit the size of the must-gather so the user is not downloading a multi-hundred MB+ archive

Good point, I think bigger files (hundreds MB) could be quite common (also in OpenShift must-gather outside migration toolkit) and it could be useful. We could add information about size to API, so when must-gather has finished and archive is ready for dowload, the archive size would be part of API respose to let UI or User decide if the archive download actually makes sense.

  • need to account for the generation of must-gather taking several minutes+, I think we've seen 10+ min times to generate in past

Right, the must-gather execution usually takes from few minutes to few tens of minutes.

The must-gather execution/archive generation on backend is asynchronous, so UI triggers must-gather with POST request and then periodically queries the API until must-gather execution status gets to "completed", then it can serve the archive file to user. But user needs to wait to ge able download result until it finishes.


When a VM migration fails, Forklift user needs quick access to migration debugging informations and other details provided by must-gather feature. It should help to identify where is a problem and to make a successful migration once the problem is resolved.

To make it easy for user, it should be supported to trigger must-gather directly from UI and don't have to switch to CLI.
Copy link
Contributor

@djwhatle djwhatle Jul 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this idea of making it obvious to the user that this is an option from the UI. The approach we've taken for some things like this in Crane has been to provide a "call to action" in the UI with a "Click this button to copy command" for the CLI command that the user can execute on their own (e.g. provide the logs command we recommend the user run).

Since must-gather was designed for CLI in the first place and provides all output on progress through streaming must-gather Pod logs (and given that we have no way to offer progress otherwise on the several-minutes long process) I wonder if giving the user a pointer to the CLI command would be a more empowering UX that lets the user have more direct access to info.

As a counter point to my above argument, if Forklift is designed so the user rarely has to drop to CLI and this would be a jarring experience for them, I could see the argument for the UI-driven must-gather button being less scary to the user. My main gripe if I was the user here would be complete lack of progress reporting on a rather long-running process.

I would also be concerned about how this info would be stored and accessed (auth?) if it's run by the wrapper service, given that info gathered could contain sensitive info depending on how you wrote your must-gather. Would we require a PV allocated to this wrapper service? How long will must-gather dumps be kept in storage?

One example of a system I've seen like this is Google Takeout, a system for getting all your personal info from Google in a .zip archive. They kick off the process but let you know it will take a long time, and email you when it's done. They don't provide progress info but at least it's very easy to come back later and get the data you asked for. Just something to consider.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 IMO it sounds like the flow here would be something best managed by node or something on the middle/backend since it would involve large streams of data and user specific information that could be authenticated with stored session cookies.

Copy link
Member Author

@aufi aufi Jul 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this idea of making it obvious to the user that this is an option from the UI. The approach we've taken for some things like this in Crane has been to provide a "call to action" in the UI with a "Click this button to copy command" for the CLI command that the user can execute on their own (e.g. provide the logs command we recommend the user run). ...snip... My main gripe if I was the user here would be complete lack of progress reporting on a rather long-running process.

Thanks for the review! This is clearly an alternative to CLI, as mentioned, this should fit well to Forklift. The Crane's approach is little bit different (execute command by user).

To provide updates during must-gather execution process, I'm looking on command streamed output capture and updating it in the must-gather execution JSON, so when API returns execution that have started, there will be the latest update of actual command output in response.

I would also be concerned about how this info would be stored and accessed (auth?) if it's run by the wrapper service, given that info gathered could contain sensitive info depending on how you wrote your must-gather. Would we require a PV allocated to this wrapper service? How long will must-gather dumps be kept in storage?

The authentication - initial plan was that the pod running the service will have mounted PV with auth token. I'm not sure on oauth part passed from UI. I'm going o investigate more and provide update later.

Must-gather dumps are planned to be in the container (not PV) and be regually cleared (files&sqlite records). This has been implemented already and it can be configured with CLEANUP_MAX_AGE option https://github.com/aufi/must-gather-rest-wrapper#configuration

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To provide updates during must-gather execution process, I'm looking on command streamed output capture and updating it in the must-gather execution JSON, so when API returns execution that have started, there will be the latest update of actual command output in response.

Oh that sounds very cool! +1

The authentication - initial plan was that the pod running the service will have mounted PV with auth token. I'm not sure on oauth part passed from UI. I'm going o investigate more and provide update later.

Sounds good, just need to make sure the must-gather dumps are protected if they contain, for instance, VM management credentials.

Must-gather dumps are planned to be in the container (not PV)

Not sure I'm following, I don't think you can just write data "in the container" without some sort of volume mount, but maybe you're just using the filesystem of whichever node the rest-wrapper is running on?

and be regually cleared (files&sqlite records). This has been implemented already and it can be configured with CLEANUP_MAX_AGE option

Nice, smart usage of sqlite features there 👍

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Must-gather dumps are planned to be in the container (not PV)

Not sure I'm following, I don't think you can just write data "in the container" without some sort of volume mount, but maybe you're just using the filesystem of whichever node the rest-wrapper is running on?

Thanks for feedback, I meant a filesystem where the container runs.

Copy link
Contributor

@shawn-hurley shawn-hurley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would personally add the backend and this together. I wonder if we can consider some sort of type script module for executing the backend that anyone using typescript and has the re-usable backend implementation could use?

@ibolton336
Copy link
Member

I would personally add the backend and this together. I wonder if we can consider some sort of type script module for executing the backend that anyone using typescript and has the re-usable backend implementation could use?

I like this idea! Would be really good candidate for https://github.com/konveyor/lib-ui if we can pull something together.

@aufi
Copy link
Member Author

aufi commented Jul 28, 2021

Thanks for feedback provided so far! Based on suggestions I merged this backend enhancement with a Forklift UI enhancement #36. The discussion can continue here.

@aufi aufi changed the title [WIP] Add must-gather from Forklift UI enhancement Add must-gather from Forklift UI enhancement Jul 28, 2021
@aufi aufi requested review from shawn-hurley and djwhatle July 29, 2021 08:39
@aufi aufi changed the title Add must-gather from Forklift UI enhancement [WIP] Add must-gather from Forklift UI enhancement Sep 3, 2021
@aufi
Copy link
Member Author

aufi commented Sep 13, 2021

Added auth by bearer token provided via API, no serviceaccount or secret is needed to be passed to the must-gather-api backend pod deployment.

@aufi aufi changed the title [WIP] Add must-gather from Forklift UI enhancement Add must-gather from Forklift UI enhancement Sep 13, 2021

The must-gather wrapper service is an administration tool serving users with oauth token for ```cluster-admin``` role, so it does not accept requests from anonymous or unknown clients, so e.g. accepting arbitrary ```image``` as API parameter seems reasonable.
To ensure that the must-gather result archive is provided to the right user, a hash of the provided bearer token is stored for each must-gather execution and API filters must-gather results by the bearer token hash.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bearer tokens themselves are not the identity of the user IIUC. Do we have any other alternatives?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point on naming, correct; it is more identification of User's temporary session, than persisted User account. The con is that if user would re-login, the must-gather execution from their previous login became not be visible anymore.

I was thinking about translating bearer token to username (like oc whoami output), but that requires oc login call for each request even for get/list must-gather executions, which I wanted to avoid. Let me check more options by tomorrow.

Copy link
Member Author

@aufi aufi Sep 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems I don't have anything better than "translating" bearer token to username that it belongs to and keep that username (or its hash) in the must-gather execution record. The bearer token -> username would be cached most likely to avoid maby login requests to OpenShift API.

@shawn-hurley, do you think this is an acceptable solution (or would you recommend something else)? Thanks!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you have access to the user that is requesting a resource could you just do a SubjectAccessReview to the specific resource to see if they have access rather than caching it on the object itself?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shawn-hurley Thanks for your feedback! The authorization section was updated with 69f2c92

The API service will check bearer token before processing each request. The check is the SubjectAccessReview against the cluster with * on * (all actions on all objects, https://github.com/konveyor/forklift-must-gather-api/blob/main/pkg/backend/auth.go#L128-L142). Would that be okay from the enhancement specification point of view?

@aufi aufi requested a review from shawn-hurley October 20, 2021 08:49
@jwmatthews
Copy link
Member

Anyone know if further work is planned on this writeup, is this in a state to merge, etc?

@jwmatthews
Copy link
Member

@aufi are we good with merging this, are there any other updates you would like to make?

@aufi
Copy link
Member Author

aufi commented Mar 3, 2022

I'm good with merging this enhancement spec!

@jwmatthews jwmatthews merged commit 9ec602c into konveyor:master Mar 3, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants