frival
left a comment
A few comments / issues:
First, it looks like this will create one archive per iteration - is that the delineation we want or do we want a single archive for all iterations? A single archive would make comparing between iterations easier, but maybe there's a good reason to keep them separate?
Second, I don't see an actual log of the wrapper. Given how we swallow all output so we don't choke the ansible log when run from Zathras, I'd really like to see something like a bash -x ./specjbb_run output to look for errors or warnings we might not be seeing.
Third, I'm concerned about the pcp output. There's one entry for each warehouse count, but all of them have the throughput of the last warehouse count and the first two report the warehouse count as 8 (the last point run) and the second two report the warehouse count as 0 (which is, well, impossible). I think it will take the ability to write out our own metric archive and merge it to solve this properly since SPECjbb2005 runs as a single Java binary, so until then does it make sense to just record the peak result per run rather than have $ndatapoints entries all with the same throughput and some weirdness with the reported warehouse count?
The test output is in https://github.com/user-attachments/files/24541236/SPECjbb.001.txt, which is also linked above in the results section. It is fairly large and I did not want to clutter the git repository with it. The link downloads the file; open it in the viewer of your choice.
We create one archive per NUMA node run. If desired, we can change that and add the NUMA node into the pcp archive instead. Either works for me, let me know.
Cut of the pcp output from a 2-NUMA-node system (filtered of empty fields). I filtered too much out earlier.
10:51:29 0.000 1.000 0.000 NaN NaN NaN 24.000 868972.0
corresponding csv file
numa nodes=6
corresponding csv file
That file is only SPECjbb's view of its run (i.e. it's only the output from the SPECjbb Java executable); it's not a log of the full wrapper.
Requested output from /bin/bash -x
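A trace like this can be captured without adding to the output that the wrapper already swallows. The following is a minimal sketch, not the wrapper's own logging: `run_traced` and the throwaway demo script are hypothetical, and it assumes `bash` and `mktemp` are available. It runs a script under `bash -x`, discards stdout, and keeps stderr (where bash writes the `+`-prefixed trace lines along with any error messages) in a separate file.

```shell
#!/bin/sh
# Hypothetical helper: run a script under `bash -x`, discard its stdout,
# and keep stderr (xtrace lines plus error messages) in a trace file.
run_traced() {
    trace_file=$1
    shift
    bash -x "$@" > /dev/null 2> "$trace_file"
}

# Demo with a throwaway script standing in for the real wrapper.
demo_script=$(mktemp)
demo_trace=$(mktemp)
printf '%s\n' \
    'echo "starting run"' \
    'ls /nonexistent-dir-for-demo || echo "recovered"' > "$demo_script"

run_traced "$demo_trace" "$demo_script"
demo_status=$?

# Print only the traced commands; the full file also holds the ls error.
grep '^+' "$demo_trace"
```

The same helper could wrap `./specjbb_run` with its usual arguments, leaving the trace in a file that can be attached to the PR.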
Now tracking NUMA binding. pmrep output
specj_out.txt
Description
Fixes the pcp output so
Before/After Comparison
Before:
Send result to PCP archive
Logging results nr_jvms_0 _pcp
Unexpected metric logged. Check for a typo.
Stopping PCP subset
adding: results_specjbb_virtual-guest.tar (deflated 86%)
pmrep -p -a specjbb.0 openmetrics.workloads > foo
Does not show the desired workload data.
After:
pmrep -p -a specjbb_jvms_0.0 openmetrics.workload
(partial output)
o.w.iteration o.w.running o.w.numthreads o.w.runtime o.w.throughput o.w.latency o.w.Warehouse o.w.BOPs o.w.JVMs
16:06:18 0.000 1.000 0.000 NaN NaN NaN 2.000 215137.0 1.000
16:06:19 0.000 1.000 0.000 NaN NaN NaN 2.000 215137.0 1.000
16:06:20 0.000 0.000 0.000 NaN NaN NaN NaN NaN NaN
16:06:21 0.000 0.000 0.000 NaN NaN NaN NaN NaN NaN
16:06:22 0.000 0.000 0.000 NaN NaN NaN 4.000 269278.0 1.000
16:06:23 0.000 0.000 0.000 NaN NaN NaN 4.000 269278.0 1.000
16:06:24 0.000 0.000 0.000 NaN NaN NaN NaN NaN NaN
16:06:25 0.000 0.000 0.000 NaN NaN NaN NaN NaN NaN
16:06:26 0.000 0.000 0.000 NaN NaN NaN 6.000 266119.0 1.000
16:06:27 0.000 0.000 0.000 NaN NaN NaN 6.000 266119.0 1.000
16:06:28 0.000 0.000 0.000 NaN NaN NaN NaN NaN NaN
16:06:29 0.000 0.000 0.000 NaN NaN NaN NaN NaN NaN
16:06:30 0.000 0.000 0.000 NaN NaN NaN NaN NaN NaN
16:06:31 0.000 0.000 0.000 NaN NaN NaN 8.000 261159.0 1.000
16:06:32 0.000 0.000 0.000 NaN NaN NaN 8.000 261159.0 1.000
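Each data point shows up on consecutive samples, with all-NaN rows in between. To compare against the csv file, the pmrep text can be reduced to one line per warehouse count. The following is a sketch only, assuming the column order of the header above (timestamp in field 1, Warehouse in field 8, BOPs in field 9); `collapse_pmrep` is a hypothetical helper and not part of the wrapper.

```shell
#!/bin/sh
# Hypothetical helper: read pmrep text on stdin and print one
# "warehouses,bops" line per distinct data point.
collapse_pmrep() {
    awk '
        # Only consider sample rows (those starting with HH:MM:SS) that
        # carry both a warehouse count and a BOPs value.
        $1 ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/ && $8 != "NaN" && $9 != "NaN" {
            key = $8 "," $9
            if (!(key in seen)) {
                seen[key] = 1
                printf "%d,%d\n", $8, $9
            }
        }
    '
}

# Demo on a few rows copied from the output above.
pmrep_sample=$(mktemp)
cat > "$pmrep_sample" <<'EOF'
16:06:18 0.000 1.000 0.000 NaN NaN NaN 2.000 215137.0 1.000
16:06:19 0.000 1.000 0.000 NaN NaN NaN 2.000 215137.0 1.000
16:06:20 0.000 0.000 0.000 NaN NaN NaN NaN NaN NaN
16:06:22 0.000 0.000 0.000 NaN NaN NaN 4.000 269278.0 1.000
16:06:23 0.000 0.000 0.000 NaN NaN NaN 4.000 269278.0 1.000
EOF

collapse_pmrep < "$pmrep_sample"
```

Keying on the warehouse/BOPs pair rather than on the timestamp keeps the repeated samples of one data point from being counted twice.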
Clerical Stuff
This closes #48
Relates to JIRA: RPOPC-760
Test information
Command executed:
/home/ec2-user/workloads/specjbb-wrapper/specjbb/specjbb_run --run_user ec2-user --home_parent /home --iterations 1 --tuned_setting tuned_none_sys_file_ --host_config "m5a.24xlarge" --sysname "m5a.24xlarge" --sys_type aws --use_pcp --java_version 21 --debug
===============================
csv file
Single JVM
Warehouses,Bops,Numb_JVMs,Start_Date,End_Date
24,847245,1,2026-01-28T14:12:35Z,2026-01-28T14:21:03Z
48,1158553,1,2026-01-28T14:12:35Z,2026-01-28T14:21:03Z
72,1065174,1,2026-01-28T14:12:35Z,2026-01-28T14:21:03Z
96,961328,1,2026-01-28T14:12:35Z,2026-01-28T14:21:03Z
120,918485,1,2026-01-28T14:12:35Z,2026-01-28T14:21:03Z
144,872202,1,2026-01-28T14:12:35Z,2026-01-28T14:21:03Z
168,808530,1,2026-01-28T14:12:35Z,2026-01-28T14:21:03Z
192,731861,1,2026-01-28T14:12:35Z,2026-01-28T14:21:03Z
Multiple JVMs
Warehouses,Bops,Numb_JVMs,Start_Date,End_Date
24,988377,6,2026-01-28T14:21:43Z,2026-01-28T14:30:06Z
48,1686562,6,2026-01-28T14:21:43Z,2026-01-28T14:30:06Z
72,1683347,6,2026-01-28T14:21:43Z,2026-01-28T14:30:06Z
96,1491024,6,2026-01-28T14:21:43Z,2026-01-28T14:30:06Z
120,1485539,6,2026-01-28T14:21:43Z,2026-01-28T14:30:06Z
144,1364024,6,2026-01-28T14:21:43Z,2026-01-28T14:30:06Z
168,1330200,6,2026-01-28T14:21:43Z,2026-01-28T14:30:06Z
192,1302089,6,2026-01-28T14:21:43Z,2026-01-28T14:30:06Z
===============================
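The review comment above suggests recording only the peak result per run. As a rough sketch of that reduction, assuming the csv layout shown here (Warehouses, Bops and Numb_JVMs in the first three columns), the peak row can be picked out with awk. `peak_bops` is a hypothetical helper and not part of the wrapper.

```shell
#!/bin/sh
# Hypothetical helper: print "warehouses,bops,jvms" for the row with the
# highest Bops value in a results csv file.
peak_bops() {
    awk -F, '
        # Skip the header and any other line whose first field is not a number.
        $1 ~ /^[0-9]+$/ && $2 + 0 > max {
            max = $2 + 0
            row = $1 "," $2 "," $3
        }
        END { if (row != "") print row }
    ' "$1"
}

# Demo on the first rows of the single-JVM results above.
csv_sample=$(mktemp)
cat > "$csv_sample" <<'EOF'
Warehouses,Bops,Numb_JVMs,Start_Date,End_Date
24,847245,1,2026-01-28T14:12:35Z,2026-01-28T14:21:03Z
48,1158553,1,2026-01-28T14:12:35Z,2026-01-28T14:21:03Z
72,1065174,1,2026-01-28T14:12:35Z,2026-01-28T14:21:03Z
EOF

peak_bops "$csv_sample"
```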
partial pcp output
14:21:08 0.000 0.000 0.000 NaN NaN NaN 1.000 48.000 1158553
14:21:09 0.000 0.000 0.000 NaN NaN NaN 1.000 48.000 1158553
14:21:12 0.000 0.000 0.000 NaN NaN NaN 1.000 72.000 1065174
14:21:13 0.000 0.000 0.000 NaN NaN NaN 1.000 72.000 1065174
14:21:17 0.000 0.000 0.000 NaN NaN NaN 1.000 96.000 961328.0
14:21:18 0.000 0.000 0.000 NaN NaN NaN 1.000 96.000 961328.0
14:21:21 0.000 0.000 0.000 NaN NaN NaN 1.000 120.000 918485.0
14:21:22 0.000 0.000 0.000 NaN NaN NaN 1.000 120.000 918485.0
14:21:25 0.000 0.000 0.000 NaN NaN NaN 1.000 144.000 872202.0
14:21:26 0.000 0.000 0.000 NaN NaN NaN 1.000 144.000 872202.0
14:21:29 0.000 0.000 0.000 NaN NaN NaN 1.000 168.000 808530.0
14:21:30 0.000 0.000 0.000 NaN NaN NaN 1.000 168.000 808530.0
14:21:33 0.000 0.000 0.000 NaN NaN NaN 1.000 192.000 731861.0
14:21:34 0.000 0.000 0.000 NaN NaN NaN 1.000 192.000 731861.0
14:30:03 0.000 1.000 0.000 NaN NaN NaN NaN NaN NaN
14:30:04 0.000 1.000 0.000 NaN NaN NaN NaN NaN NaN
14:30:05 0.000 1.000 0.000 NaN NaN NaN NaN NaN NaN
14:30:06 0.000 1.000 0.000 NaN NaN NaN NaN NaN NaN
14:30:07 0.000 1.000 0.000 NaN NaN NaN 6.000 24.000 988377.0
14:30:08 0.000 1.000 0.000 NaN NaN NaN 6.000 24.000 988377.0
14:30:12 0.000 0.000 0.000 NaN NaN NaN 6.000 48.000 1686562
14:30:13 0.000 0.000 0.000 NaN NaN NaN 6.000 48.000 1686562
14:30:16 0.000 0.000 0.000 NaN NaN NaN 6.000 72.000 1683347
14:30:17 0.000 0.000 0.000 NaN NaN NaN 6.000 72.000 1683347
14:30:20 0.000 0.000 0.000 NaN NaN NaN 6.000 96.000 1491024
14:30:21 0.000 0.000 0.000 NaN NaN NaN 6.000 96.000 1491024
14:30:25 0.000 0.000 0.000 NaN NaN NaN 6.000 120.000 1485539
14:30:26 0.000 0.000 0.000 NaN NaN NaN 6.000 120.000 1485539
14:30:29 0.000 0.000 0.000 NaN NaN NaN 6.000 144.000 1364024
14:30:30 0.000 0.000 0.000 NaN NaN NaN 6.000 144.000 1364024
14:30:34 0.000 0.000 0.000 NaN NaN NaN 6.000 168.000 1330200
14:30:37 0.000 0.000 0.000 NaN NaN NaN 6.000 192.000 1302089
14:30:38 0.000 0.000 0.000 NaN NaN NaN 6.000 192.000 1302089
================================
Screen output from test
specjbb_out.txt