PerfCounters: Add support for AMD Family 15h Model 2 (Piledriver) #3401

Merged
merged 1 commit into from
Aug 28, 2024

Conversation

vsrinivas
Contributor

Extends the existing Family 15h Model 30h (Steamroller) support to Piledriver. Piledriver supports PMCx0C4 (Retired Taken Branch Instructions) and PMCx0C6 (Retired Far Control Transfers), just like Model 30h. [1]

Note that PMCx0C4 counts all control flow changes, including exceptions and interrupts. AFAICT on 15h, there is no PMC for just retired conditional branches.
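
For readers unfamiliar with how "Family 15h Model 2" maps onto what the CPU reports, here is a minimal sketch of decoding the CPUID leaf 1 EAX signature using AMD's extended family/model rule. The names `cpu_signature`, `decode_cpuid_eax` and `is_family15h_model_02h` are hypothetical and are not taken from rr's sources; rr's own detection code may be organised differently.

```c
#include <stdint.h>

/* Family/model pair in the form AMD documents use (e.g. Family 15h Model 02h).
   Hypothetical type for illustration only. */
struct cpu_signature {
    unsigned family;
    unsigned model;
};

/* Decode the value that CPUID leaf 1 returns in EAX, following the AMD rule:
   the extended family and extended model fields only apply when the base
   family field is 0xF. */
static struct cpu_signature decode_cpuid_eax(uint32_t eax) {
    unsigned base_model  = (eax >> 4)  & 0xF;
    unsigned base_family = (eax >> 8)  & 0xF;
    unsigned ext_model   = (eax >> 16) & 0xF;
    unsigned ext_family  = (eax >> 20) & 0xFF;

    struct cpu_signature sig;
    sig.family = base_family;
    sig.model  = base_model;
    if (base_family == 0xF) {
        sig.family += ext_family;     /* 0xF + 0x6 = 0x15 */
        sig.model  |= ext_model << 4; /* e.g. 0x3 and 0x0 give Model 30h */
    }
    return sig;
}

/* Hypothetical predicate for the model this PR targets. */
static int is_family15h_model_02h(struct cpu_signature sig) {
    return sig.family == 0x15 && sig.model == 0x02;
}
```

As an illustration, a signature value of `0x00600F20` decodes to family 15h, model 02h, while `0x00630F01` decodes to family 15h, model 30h, so the predicate accepts the first and rejects the second.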

Tested:

  1. counters-test:
    vsrinivas@ubuntu:/tmp/rr/src/counters-test$ cc -O2 counters.c
    vsrinivas@ubuntu:/tmp/rr/src/counters-test$ sudo ./a.out
    Ticks mismatch; got 1003, expected 1002
    Aborted

    Varying the number of volatile matches, we always see one more
    tick than expected, which I think is a RET instruction.

  2. ctest:
    97% tests passed, 45 tests failed out of 1425

Total Test time (real) = 3092.96 sec

The following tests FAILED:
53 - x86/chew_cpu_cpuid-no-syscallbuf (Failed)
110 - detach_state (Failed)
162 - x86/fault_in_code_page (Failed)
558 - x86/rdtsc_flags (Failed)
724 - sioc (Failed)
725 - sioc-no-syscallbuf (Failed)
842 - vsyscall (Failed)
843 - vsyscall-no-syscallbuf (Failed)
844 - vsyscall_timeslice (Failed)
845 - vsyscall_timeslice-no-syscallbuf (Failed)
846 - x86/x87env (Failed)
847 - x86/x87env-no-syscallbuf (Failed)
888 - async_signal_syscalls (Failed)
890 - async_signal_syscalls2 (Failed)
910 - x86/blocked_sigsegv (Failed)
916 - breakpoint_overlap (Failed)
924 - checkpoint_dying_threads (Failed)
932 - clone_interruption (Failed)
938 - conditional_breakpoint_offload (Failed)
939 - conditional_breakpoint_offload-no-syscallbuf (Failed)
951 - daemon_read-no-syscallbuf (Failed)
962 - dlopen (Failed)
980 - exit_race (Failed)
981 - exit_race-no-syscallbuf (Failed)
984 - x86/explicit_checkpoints (Failed)
1072 - x86/rdtsc_loop (Failed)
1080 - reverse_continue_breakpoint (Failed)
1081 - reverse_continue_breakpoint-no-syscallbuf (Failed)
1089 - reverse_step_long-no-syscallbuf (Failed)
1092 - reverse_step_threads_break (Failed)
1100 - rseq_syscallbuf (Failed)
1130 - strict_priorities (Failed)
1131 - strict_priorities-no-syscallbuf (Failed)
1150 - x86/syscallbuf_rdtsc_page (Failed)
1174 - thread_open_race (Failed)
1206 - watchpoint_at_sched (Failed)
1208 - watchpoint_before_signal (Failed)
1209 - watchpoint_before_signal-no-syscallbuf (Failed)
1218 - async_signal_syscalls_100 (Failed)
1219 - async_signal_syscalls_100-no-syscallbuf (Failed)
1320 - record_replay (Failed)
1321 - record_replay-no-syscallbuf (Failed)
1354 - reverse_watchpoint_syscall (Failed)
1418 - vsyscall_singlestep (Failed)
1419 - vsyscall_singlestep-no-syscallbuf (Failed)

@khuey
Collaborator

khuey commented Nov 23, 2022

> Note that PMCx0C4 counts all control flow changes, including exceptions and interrupts. AFAICT on 15h, there is no PMC for just retired conditional branches.

That's going to be a fatal problem if true because interrupts are not deterministic.

@vsrinivas
Contributor Author

From the 15h Model 00-0Fh BKDG:

PMCx0C4 Retired Taken Branch Instructions
PERF_CTL[5:0]. The number of taken branches that were retired. This includes all types of architectural control flow changes, including exceptions and interrupts

What we do right now (for model 30h) is subtract out PMCx0C6 (minus_ticks_cntr_event), which counts "The number of far control transfers retired including far call/jump/return, IRET, SYSCALL and SYSRET, plus exceptions and interrupts". This patch does the same for 15h model 2.
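
The subtraction described above can be sketched as follows. This is an illustrative model only, not rr's implementation: the struct, the function name and the sample values are hypothetical, and it assumes both counters were read over exactly the same span of execution.

```c
#include <assert.h>
#include <stdint.h>

/* Event select codes quoted in this thread from the 15h BKDG. */
enum {
    PMC_RETIRED_TAKEN_BRANCHES = 0xC4, /* PMCx0C4: includes exceptions and interrupts */
    PMC_RETIRED_FAR_TRANSFERS  = 0xC6  /* PMCx0C6: far transfers, exceptions, interrupts */
};

/* Hypothetical snapshot of the two counters, read at the same point. */
struct counter_sample {
    uint64_t taken_branches; /* raw PMCx0C4 count */
    uint64_t far_transfers;  /* raw PMCx0C6 count */
};

/* Ticks = taken branches minus far control transfers.
   Assumption: every event counted by PMCx0C6 is also counted by PMCx0C4,
   so the difference is never negative. */
static uint64_t ticks_from_sample(struct counter_sample s) {
    assert(s.taken_branches >= s.far_transfers);
    return s.taken_branches - s.far_transfers;
}
```

With made-up readings of 1010 taken branches and 8 far transfers, `ticks_from_sample` returns 1002. The point of the subtraction is that the nondeterministic part (interrupts and exceptions) appears in both counters and cancels out, leaving only ordinary taken branches.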

@GitMensch
Contributor

@vsrinivas I do wonder if you have access to that CPU and would rebase + retest for failing tests.
Was the resulting rr usable for you on that machine or not?

Extends existing Family 15h Model 30 (Steamroller) support for
Piledriver. Piledriver supports PMCx0C4 (Retired Taken Branch
Instructions) and PMCx0C6 (Retired Far Control Transfer), just
like Model 30h. [1]

Note that PMCx0C4 counts all control flow changes, including
exceptions and interrupts. Like on 15h model 30-3Fh (Steamroller)
we subtract PMCx0C6 (Retired Far Control Transfers) to count
only regular taken branches.

[1]: https://www.amd.com/system/files/TechDocs/42301_15h_Mod_00h-0Fh_BKDG.pdf

Tested:
1) counters-test:
   vsrinivas@ubuntu:~/tmp/rr/src/counters-test$ sudo ./a.out
   Interrupted after 1000025 ticks, expected 1000000 ticks
   EXIT-SUCCESS
@vsrinivas
Contributor Author

> @vsrinivas I do wonder if you have access to that CPU and would rebase + retest for failing tests. Was the resulting rr usable for you on that machine or not?

I do still have access to the CPU; rebased, re-ran counters-test (the output is in the commit message).

@GitMensch
Contributor

Thanks for the rebase. Apart from that one failing test, is the resulting rr binary usable for record and replay?

@rocallahan rocallahan merged commit 2c7d867 into rr-debugger:master Aug 28, 2024
@vsrinivas
Contributor Author

@GitMensch yep!

int a;
int b;
void t1(void) {
	a = 1;
	b = 0;
}
void t2(void) {
	a = 0;
	b = 1;
}

int main(int argc, char *argv[]) {
	t1();
	t2();
}

...

Breakpoint 5, main (argc=1, argv=0x7ffd6d17f118) at test.c:14
14		t1();
(rr) c
Continuing.

Breakpoint 1, t1 () at test.c:5
5		a = 1;
(rr) c
Continuing.

Hardware watchpoint 3: a

Old value = 0
New value = 1
t1 () at test.c:6
6		b = 0;
(rr) c
Continuing.

Breakpoint 2, t2 () at test.c:9
9		a = 0;
(rr) c
Continuing.

Hardware watchpoint 3: a

Old value = 1
New value = 0
t2 () at test.c:10
10		b = 1;
(rr) reverse-cont
Continuing.

Breakpoint 2, t2 () at test.c:9
9		a = 0;
(rr) 
Continuing.

Breakpoint 1, t1 () at test.c:5
5		a = 1;
(rr) 
