Improve qualification tool estimation #472

Closed
nartal1 opened this issue Aug 2, 2023 · 2 comments
Labels: core_tools (Scope the core module (scala)), feature request (New feature or request)

Comments

@nartal1
Collaborator

nartal1 commented Aug 2, 2023

While working on issue #385, I came across a few issues that, if addressed, would improve the qualification tool's speedup estimation.

  1. Some Execs do not have metrics populated, so we miss them when counting the Execs within a stage. We build the stageToExec mapping from accumulatorId, so an Exec with no populated metrics is never added to its corresponding stage. Tracked in #615: [FEA] Qualification tool - enhance exec to stage mapping.

  2. We don't consider the duration of each Exec within a stage, i.e. we distribute the stage's duration (the total taskMetrics duration) evenly across all the known Execs. The problem is that one Exec with a 5x speedup may be taking 90% of the total time while another Exec with a 2x speedup takes the remaining 10%. In that case we still assign the same duration to both and take the average of the speedups. I am not certain there is a way to fix this, but it would be good to investigate.

  3. promote_precision is reported as an unsupported expression. This should be fixed, since it is supported.
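
To make item 1 concrete, here is a minimal Python sketch of an accumulator-based exec-to-stage mapping. It is not the tool's code (the core module is Scala); the function name `map_execs_to_stages`, the input shapes, and the sample operators are all hypothetical, chosen only to show how an Exec with no metrics ends up attached to no stage.

```python
from collections import defaultdict


def map_execs_to_stages(execs, accum_to_stage):
    """Map each exec to the stages that updated any of its accumulators.

    execs: list of (exec_name, [accumulator_id, ...])   (hypothetical shape)
    accum_to_stage: dict of accumulator_id -> stage_id  (hypothetical shape)
    Returns (stage_to_execs, unmapped_exec_names).
    """
    stage_to_execs = defaultdict(list)
    unmapped = []
    for name, accum_ids in execs:
        # An exec is linked to a stage only through its metric accumulators.
        stages = {accum_to_stage[a] for a in accum_ids if a in accum_to_stage}
        if not stages:
            # No populated metrics means no accumulator, so no stage is found.
            unmapped.append(name)
        for stage_id in sorted(stages):
            stage_to_execs[stage_id].append(name)
    return dict(stage_to_execs), unmapped


# Hypothetical sample: "Project" has no metrics, so it is missed.
sample_execs = [("Filter", [10]), ("Project", []), ("HashAggregate", [11, 12])]
sample_accums = {10: 1, 11: 1, 12: 2}
stage_to_execs, unmapped = map_execs_to_stages(sample_execs, sample_accums)
```

In this sample, `Project` lands in `unmapped`, so it would not be counted among the Execs of stage 1 even if it ran there.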

@amahussein
Collaborator

@nartal1 I moved the promote_precision item to a separate issue, #517.

@nartal1
Collaborator Author

nartal1 commented Jan 9, 2024

  1. Fixed by PR-634.

  2. I investigated this further, and we cannot accomplish this because of the way metrics are generated per stage. Typically, CPU operators are consolidated within a single stage, with no breakdown of their individual durations. This grouping limits the granularity of the timing data: it does not measure the time taken by each specific operator. So the current implementation, which assigns equal durations to all the operators in a given stage, seems to be the best possible solution in this case.
    Closing this issue.
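
For reference, the gap described in item 2 of the original report can be worked through with its own numbers (a 5x Exec taking 90% of the time and a 2x Exec taking 10%). The Python sketch below is illustrative only: the helper names are hypothetical, the simple mean follows the "average of the speedups" wording in the report rather than the tool's actual formula, and the weighted figure assumes per-Exec CPU durations, which the per-stage metrics do not provide.

```python
def average_speedup(speedups):
    """Plain mean of the per-exec speedups, i.e. every exec weighted equally."""
    return sum(speedups) / len(speedups)


def duration_weighted_speedup(execs):
    """Overall speedup when each exec's CPU duration is known.

    execs: list of (cpu_duration, speedup) pairs (hypothetical shape).
    The estimated accelerated time of an exec is cpu_duration / speedup.
    """
    total_cpu = sum(duration for duration, _ in execs)
    total_accelerated = sum(duration / speedup for duration, speedup in execs)
    return total_cpu / total_accelerated


# Numbers from the report: 5x exec takes 90% of the time, 2x exec takes 10%.
execs = [(90.0, 5.0), (10.0, 2.0)]
equal_estimate = average_speedup([speedup for _, speedup in execs])
weighted_estimate = duration_weighted_speedup(execs)
print(f"equal weighting: {equal_estimate:.2f}x")
print(f"duration weighted: {weighted_estimate:.2f}x")
```

With these inputs the equal-weight estimate is 3.5x, while the duration-weighted value is 100/23, about 4.35x. Without per-operator durations only the first figure can be computed.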

@nartal1 nartal1 closed this as completed Jan 9, 2024