Merge pull request #835 from jkottiku/master #838

Open. 1 commit to merge into base: release/rocm-rel-6.3
2 changes: 1 addition & 1 deletion iet.so/src/action.cpp
@@ -105,7 +105,7 @@ using std::fstream;
#define IET_DEFAULT_RAMP_INTERVAL 5000
#define IET_DEFAULT_LOG_INTERVAL 1000
#define IET_DEFAULT_MAX_VIOLATIONS 0
-#define IET_DEFAULT_TOLERANCE 0.1
+#define IET_DEFAULT_TOLERANCE 0
#define IET_DEFAULT_SAMPLE_INTERVAL 1000
#define IET_DEFAULT_MATRIX_SIZE 5760
#define IET_DEFAULT_MATRIX_SIZE_A 0
9 changes: 5 additions & 4 deletions iet.so/src/iet_worker.cpp
@@ -270,14 +270,15 @@ bool IETWorker::do_iet_power_stress(void) {

// json log the avg power
log_to_json(IET_AVERAGE_POWER_KEY, std::to_string(max_power),
-rvs::loginfo);
-//check whether we reached the target power
-if(max_power >= target_power) {
+rvs::loginfo);
+// check whether we reached the target power or within the tolerance limit
+if(max_power >= (target_power - (target_power * tolerance))) {
msg = "[" + action_name + "] " + MODULE_NAME + " " +
std::to_string(gpu_id) + " " + " Average power met the target power :" + " " + std::to_string(max_power);
rvs::lp::Log(msg, rvs::loginfo);
result = true;
-}else {
+}
+else {
msg = "[" + action_name + "] " + MODULE_NAME + " " +
std::to_string(gpu_id) + " " + " Average power could not meet the target power \
in the given interval, increase the duration and try again, \
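As a rough illustration of the check this hunk introduces, the sketch below isolates the threshold arithmetic. The helper names `min_passing_power` and `power_target_met` are hypothetical and do not exist in RVS, and `double` is an assumed type for all three values; only the comparison itself is taken from the diff.

```cpp
#include <cassert>

// Hypothetical helper, not part of RVS: lowest average power (in watts)
// that still counts as meeting the target for a fractional tolerance.
inline double min_passing_power(double target_power, double tolerance) {
  // Assumption: tolerance is a fraction of target_power in [0, 1].
  assert(tolerance >= 0.0 && tolerance <= 1.0);
  return target_power - (target_power * tolerance);
}

// Same comparison as the updated condition in do_iet_power_stress().
inline bool power_target_met(double max_power, double target_power,
                             double tolerance) {
  return max_power >= min_passing_power(target_power, tolerance);
}
```

When `tolerance` is 0, the check reduces to the previous comparison `max_power >= target_power`.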
1 change: 1 addition & 0 deletions rvs/conf/MI308X/iet_single.conf
@@ -106,4 +106,5 @@ actions:
target_power: 650
bw_workload: true
cp_workload: false
+tolerance: 0.05

2 changes: 2 additions & 0 deletions rvs/conf/MI308X/iet_stress.conf
@@ -32,6 +32,7 @@
# Set parallel execution to true (gemm workload execution on all GPUs in parallel)
# Test duration set to 10 mins.
# Target power set to 650W for each GPU.
+# Tolerance set to 5% of target power.
#
# Run test with:
# cd bin
@@ -51,6 +52,7 @@ actions:
sample_interval: 5000
log_interval: 5000
target_power: 650
+tolerance: 0.05
bw_workload: true
cp_workload: false
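
For the values in the MI308X configs, and assuming `tolerance` is applied as a fraction of `target_power` the way the updated condition in iet_worker.cpp does, the lowest average power that still passes works out to:

```latex
P_{\min} = P_{\text{target}} - P_{\text{target}} \cdot \text{tolerance}
         = 650 - 650 \times 0.05
         = 617.5\ \text{W}
```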