New way of comparing current run uuid with matching data runs #652
Conversation
Force-pushed from 294ea13 to a221fd2.
## Defining new config

Each config is a json file consisting of
* metrics: A list of dictionaries, each containing a list of the key/values you want to compare upon (regex can be used on some values, depending on the Elasticsearch setup). Multiple items in this list will still use the same below fields with every search, in addition to the key/values set in the dictionary.

Can we specify those same below fields used in addition to the key/values set in the dictionary here, please?
* additionalColumns: List of keys that you want to have in your results table; this can be used for only combining certain metrics. For example, if I set "samples" in additionalColumns, the data won't combine/aggregate different samples; it'll give me separate rows for each different sample type.
Can we specify those certain metrics that are combined/not combined, for easier understanding?
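For illustration, a hypothetical config of the shape described above might look like the following (shown as the Python dict that json.load would return; the field names inside "metrics" and all values are invented examples, not taken from this PR):

```python
# A hypothetical compare config (values are illustrative only).
example_config = {
    "metrics": [
        # Each dict is one search; its key/values are ANDed into the query.
        {"metricName": "podLatencyQuantilesMeasurement", "quantileName": "Ready"},
    ],
    # Keeping "samples" as its own column prevents runs with different
    # sample counts from being aggregated together.
    "additionalColumns": ["samples"],
}
```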
utils/compare/graph.py
Outdated
version = float(version) - .01

# https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-range-query.html
if not workerCount:
Suggested change (replacing the if not workerCount: branching):

query_params = {
    "benchmark": meta["benchmark"],
    "workerNodesType": meta["workerNodesType"],
    "masterNodesType": meta["masterNodesType"],
    "networkType": meta["networkType"],
    "platform": meta["platform"],
    "ocpVersion": f"{version}*",
    "jobStatus": "success",
}
if workerCount:
    query_params.update({
        "masterNodesCount": meta["masterNodesCount"],
        "workerNodesCount": meta["workerNodesCount"],
    })
query = {
    "_source": ["uuid"],
    "query": {
        "query_string": {
            "query": " AND ".join([f'{key}: "{value}"' for key, value in query_params.items()]),
        }
    }
}
es = ElasticService(index=index)
response = es.post(query)
es.close()
return [item['_source']['uuid'] for item in response["hits"]["hits"]]
This will only fetch the uuid portion of the docs instead of loading the entire docs in memory.
utils/compare/graph.py
Outdated
value = "{}".format(row['throughput']) | ||
x.append(value) | ||
y.append(test) | ||
new = {'y' : x, |
These x, y need not necessarily be interchanged. This implementation can be written in a more readable and simpler way. More details here.
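To illustrate the point, here is a sketch of building the same structure without the swap (assuming the rows come from a pandas dataframe; the dataframe below is placeholder data, not the PR's):

```python
import pandas as pd

df = pd.DataFrame({"test": ["t1", "t2"], "throughput": [100.0, 98.5]})  # placeholder data

# Name each list by the axis it will finally occupy, instead of appending
# throughput to `x`, tests to `y`, and then swapping them in the dict.
x_tests = df["test"].tolist()
y_throughputs = [str(v) for v in df["throughput"]]
plot_data = {"x": x_tests, "y": y_throughputs}
```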
utils/compare/graph.py
Outdated
if np.isinf(row['difference']):
    return "Pass"
# Fail if value is a more negative than tolerancy
# High positive valus mean new uuid is better
[typo]: High positive values
if np.isinf(row['difference']):
    return "Pass"
# Fail if value is a more negative than tolerancy
README.md line#16 says better performance in the negative case, but here we are marking it as Fail 🤔
I had just switched the way the dataframe was set up, switching the percent values; I will update to be sure they match.
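For context, the rule the quoted comments describe boils down to something like this sketch (the function name and default tolerancy are assumptions, not the PR's actual code):

```python
import numpy as np

def verdict(difference_pct: float, tolerancy: float = -5.0) -> str:
    # An infinite difference counts as a pass; high positive values mean
    # the new uuid is better, so a run fails only when the percent
    # difference is more negative than the tolerancy.
    if np.isinf(difference_pct):
        return "Pass"
    return "Pass" if difference_pct >= tolerancy else "Fail"
```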
utils/compare/graph.py
Outdated
print("ndf " + str(ndf)) | ||
return ndf | ||
|
||
def getResults(uuid: str, uuids: list, index_str: str, metrics: dict ): |
The idea behind this function looks cool. I think we can reuse the same to fetch other metrics like jobSummary etc.
#print("p tile" + str(ptile)) | ||
return ptile | ||
|
||
def jobFilter(pdata: dict, data: dict): |
No invocations for this function either. Let's remove dead code, as it creates unnecessary confusion.
It was an old function I rewrote to work for all workload types; it should be taken out now.
utils/compare/graph.py
Outdated
return all_result_tolerancy
return []


def jobSummary(uuids: list):
Just now realized that this function can also be removed.
I think a lot of these are from the graph-api in the dashboard work... Preferably, we should create a Python package to share these functions and reuse them across our projects.
Except for the cpt-dashboard, what is the alternative place where we are planning to visualize these graphs?
Honestly, nowhere to my knowledge. The idea was to share how we build the dataframes for the trending graphs and for the comparisons.
IMHO, let's wait for at least one more tool to have the same use case of displaying these graphs. Until then it's just dead code and can be removed.
utils/compare/graph.py
Outdated
return ids_df['uuid'].to_list()


def burnerFilter(data: dict, columns: list[str]):
    #
If we are interested in keeping docstrings, we can keep the format consistent across the codebase: https://google.github.io/styleguide/pyguide.html
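For reference, a Google-style docstring for burnerFilter might look like this (the argument and return descriptions are guesses from the surrounding discussion, not the PR's actual documentation):

```python
def burnerFilter(data: dict, columns: list[str]):
    """Filter kube-burner result data down to the requested columns.

    Args:
        data: Raw hits returned from the Elasticsearch query.
        columns: Names of the columns to keep in the resulting dataframe.

    Returns:
        A pandas DataFrame restricted to the requested columns.
    """
```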
kubelet_json = read_config_file("configs/kubelet.json")
etcd_json = read_config_file("configs/etcd.json")
master_json = read_config_file("configs/master_node.json")
worker_agg_json = read_config_file("configs/worker_node.json")
Did we miss adding/removing this from the metrics list?
Yes! I think I updated it now.

Do we want to remove or name the first column? Currently it seems empty.
Overall I like this idea. After going through the discussion on the attached JIRA, regarding "I'm wondering where a good spot for more helper type functions should go", here are my two cents:
- If we want visualizations on top of this data, cpt-dashboard would be a better place to go.
- Or else, if we just want to use it as a tool at the end of each CI run, just to look at the results in a csv/tabular/json format, I think the better place would be benchmark-comparison that we have as of today.
def process_all_data(metric_of_interest, find_metrics, uuid, ids, data_func, index, divider, additional_columns):
    # Process all

    columns = []
Can we make it a set, please, so that we can avoid duplicate columns?
Because I use this variable as the columns for the dataframe, I don't think it allows me to specify it as a set:

File "/Users/prubenda/PycharmProjects/e2e-benchmarking/utils/compare/venv3/lib/python3.10/site-packages/pandas/core/frame.py", line 702, in __init__
    raise ValueError("columns cannot be a set")
ValueError: columns cannot be a set
In that case, we can use a set (to handle duplicates, if any, in the additionalColumns) to collect all the columns and then cast it to a list, which is supported by dataframes.
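Something like this sketch of the suggested pattern (the variable names and sample data are placeholders, not the PR's actual code):

```python
import pandas as pd

# Collect column names in a set so duplicates in additionalColumns are
# dropped automatically.
additional_columns = ["samples", "uuid"]  # placeholder user input
columns = {"uuid", "timestamp"}
columns.update(additional_columns)

# pandas raises "columns cannot be a set", so cast to a sorted list for a
# stable column order before building the dataframe.
rows = [{"uuid": "abc", "timestamp": 1, "samples": 3}]  # placeholder data
df = pd.DataFrame(rows, columns=sorted(columns))
```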
I think this is just the row indices from the dataframe; I will look into how I can easily remove that.
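If it is the index, pandas can suppress it at render time; a sketch, reusing a dataframe df (the tool's actual output call may differ):

```python
# Render results without the dataframe's positional index column.
print(df.to_string(index=False))
df.to_csv("results.csv", index=False)
```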
IMHO the shared functions should go into a shared library -- I am thinking something similar to

The implementation belongs in benchmark-comparison -- agreed. Or we create a tool more specific to the use case.
We have created this repo ;) https://github.com/cloud-bulldozer/py-commons
Force-pushed from 9b1dca7 to 7d46a4c.
Force-pushed from 2a64a4b to 6bcbace.
Moving this work between py-commons and orion: a start of moving this functionality is in these PRs.
Type of change
Description
With an environment variable set to the current UUID of a run, we want to be able to see if any metrics have regressed enough to fail a job. This current implementation takes the metadata of the UUID and collects other matching runs. Using the similar runs, it aggregates the data of specific fields and gives a pass/fail based on a toleration value.
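In rough pseudocode, the described flow looks like this (every helper name below is hypothetical, used only to outline the steps; the real implementation lives in utils/compare):

```python
import os

current_uuid = os.environ["UUID"]              # current run's UUID from the environment
meta = get_metadata(current_uuid)              # metadata of the current run
matching_uuids = find_matching_runs(meta)      # other runs whose metadata matches
baseline = aggregate_metrics(matching_uuids)   # aggregate the configured fields
verdict = compare_to_baseline(current_uuid, baseline)  # "Pass"/"Fail" vs. toleration value
```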
This is still a work in progress and I would love any feedback. I tried to document the steps and configurations in utils/compare/README.md.
Related Tickets & Documents
Checklist before requesting a review
Testing
Example ingress comparison