Add BERT Inference Test #459
Conversation
```go
// Label the first node with the GPU label
nodeName := nodes.Items[0].Name
cmd := exec.Command("kubectl", "label", "node", nodeName, "nvidia.com/gpu.present=true")
```
Two things: (1) we need to investigate whether the label here is necessary, and (2) if so, can we use the k8s clients to do this instead of exec-ing out to kubectl?
Not necessary. Just updated.
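(For reference, had the label been needed, the same step could be done with client-go instead of shelling out to kubectl. A minimal sketch, assuming a configured clientset; the `labelNode` helper is illustrative, not part of this PR:)

```go
package inference

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/kubernetes"
)

// labelNode applies the GPU label with a strategic merge patch,
// equivalent to `kubectl label node <name> nvidia.com/gpu.present=true`.
func labelNode(ctx context.Context, client kubernetes.Interface, nodeName string) error {
	patch := []byte(`{"metadata":{"labels":{"nvidia.com/gpu.present":"true"}}}`)
	_, err := client.CoreV1().Nodes().Patch(
		ctx, nodeName, types.StrategicMergePatchType, patch, metav1.PatchOptions{})
	return err
}
```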
addition LGTM
```yaml
# See https://kubernetes.io/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/
priorityClassName: "system-node-critical"
containers:
  - image: nvcr.io/nvidia/k8s-device-plugin:v0.14.2
```
How important is it to keep this on the latest version?
To be honest, not totally sure. I chose to use the exact same one as our other tests to keep it consistent.
@@ -0,0 +1,75 @@
```go
package bert_inference
```
Can we just call this package `inference`? We realistically may add other inference cases that aren't BERT.
Yeah sure thing. Done.
This is fine for this PR, but we should find a way to share a single manifest for things like device plugins that are used by multiple suites.
Totally agree. Had briefly discussed this on the side with @weicongw, but decided to push any decisions around it to later.
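One possible shape for that sharing, sketched with Go's `embed`; the `manifests` package name and file path are hypothetical, and nothing like this is committed in this PR:

```go
// Package manifests would hold manifests shared across test suites
// (hypothetical layout for illustration only).
package manifests

import (
	_ "embed"
)

// NvidiaDevicePlugin is the single shared copy of the device-plugin
// DaemonSet manifest, embedded at build time.
//
//go:embed nvidia-device-plugin.yaml
var NvidiaDevicePlugin []byte
```

Each suite could then apply `manifests.NvidiaDevicePlugin` rather than carrying its own copy of the DaemonSet.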
Did you intend to commit this?
No, I did not... Removed.
Issue #, if available:
Description of changes:
This PR adds an E2E BERT inference test. The validation for this test was done on a cluster consisting of the `g5.2xlarge` instance type. The cluster has two nodes in total, in compliance with the requirement for the unit tests, but the inference test will only be run on one of the nodes. Two validation tests were run, one for each inference mode: `throughput` (default) and `latency`. The results of running the test for each inference mode can be seen below. These logs were obtained from the pod that ran the E2E BERT inference job.
Throughput
The logs from the worker pod:
Latency
The logs from the worker pod:
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.