Commit 7960307

Merge pull request icl-utk-edu#285 from Treece-Burgess/11.15.2024-libpfm4-update
Update libpfm4 to commit 07050c
2 parents 59008ee + f6ec753 commit 7960307

22 files changed

+1620
-19
lines changed

src/libpfm4/README

Lines changed: 2 additions & 3 deletions
Original file line number | Diff line number | Diff line change
@@ -82,9 +82,8 @@ The library supports many PMUs. The current version can handle:
8282
Applied Micro X-Gene
8383
Qualcomm Krait
8484
Fujitsu A64FX
85-
Arm Neoverse V1
86-
Arm Neoverse V2
87-
Arm Neoverse V3
85+
Arm Neoverse V1, V2, V3
86+
Arm Neoverse N1, N2, N3
8887
Huawei HiSilicon Kunpeng 920
8988

9089
- For SPARC

src/libpfm4/docs/Makefile

Lines changed: 2 additions & 0 deletions
@@ -155,6 +155,7 @@ ARCH_MAN += libpfm_arm_xgene.3 \
155155
libpfm_arm_qcom_krait.3 \
156156
libpfm_arm_neoverse_n1.3 \
157157
libpfm_arm_neoverse_n2.3 \
158+
libpfm_arm_neoverse_n3.3 \
158159
libpfm_arm_neoverse_v1.3 \
159160
libpfm_arm_neoverse_v2.3 \
160161
libpfm_arm_neoverse_v3.3
@@ -170,6 +171,7 @@ ARCH_MAN += libpfm_arm_xgene.3 \
170171
libpfm_arm_a64fx.3 \
171172
libpfm_arm_neoverse_n1.3 \
172173
libpfm_arm_neoverse_n2.3 \
174+
libpfm_arm_neoverse_n3.3 \
173175
libpfm_arm_neoverse_v1.3 \
174176
libpfm_arm_neoverse_v2.3 \
175177
libpfm_arm_neoverse_v3.3
Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
1+
.TH LIBPFM 3 "October, 2024" "" "Linux Programmer's Manual"
2+
.SH NAME
3+
libpfm_arm_neoverse_n3 - support for ARM Neoverse N3 core PMU
4+
.SH SYNOPSIS
5+
.nf
6+
.B #include <perfmon/pfmlib.h>
7+
.sp
8+
.B PMU name: arm_n3
9+
.B PMU desc: ARM Neoverse N3
10+
.sp
11+
.SH DESCRIPTION
12+
The library supports the ARM Neoverse N3 core PMU.
13+
14+
This PMU supports 6 or 20 64-bit counters and privilege levels filtering.
15+
It can operate in both 32 and 64 bit modes.
16+
17+
.SH MODIFIERS
18+
The following modifiers are supported on ARM Neoverse N3:
19+
.TP
20+
.B u
21+
Measure at the user level. This corresponds to \fBPFM_PLM3\fR.
22+
This is a boolean modifier.
23+
.TP
24+
.B k
25+
Measure at the kernel level. This corresponds to \fBPFM_PLM0\fR.
26+
This is a boolean modifier.
27+
.TP
28+
.B hv
29+
Measure at the hypervisor level. This corresponds to \fBPFM_PLMH\fR.
30+
This is a boolean modifier.
31+
32+
.SH AUTHORS
33+
.nf
34+
Stephane Eranian <[email protected]>
35+
.fi
36+
.PP

src/libpfm4/docs/man3/libpfm_intel_gnr.3

Lines changed: 5 additions & 0 deletions
@@ -79,6 +79,11 @@ On Intel GraniteRapids, the event is treated as a regular event with a flat set
7979
It is not possible to combine the various requests, supplier, snoop bits anymore. Therefore the
8080
library offers the list of validated combinations as per Intel's official event list.
8181

82+
.SH Topdown via PERF_METRICS
83+
84+
Intel GraniteRapids supports the PERF_METRICS MSR which provides support for Topdown Level 1 and 2 via a single PMU counter. This special counter provides percentages of slots for each metric. This feature must be used in conjunction with fixed counter 3 which counts SLOTS in order to work properly. The Linux kernel exposes PERF_METRICS metrics as individual pseudo events counting in slots units; however, to operate correctly, all events must be programmed together. The Linux kernel requires all PERF_METRICS events to be programmed as a single event group with SLOTS as the required first event. Example: '{slots,topdown-retiring,topdown-bad-spec,topdown-fe-bound,topdown-be-bound,topdown-heavy-ops,topdown-br-mispredict,topdown-fetch-lat,topdown-mem-bound}'. Libpfm4 provides access to the PERF_METRICS pseudo events via a dedicated event called \fBTOPDOWN_M\fR. This event uses the pseudo encodings assigned by the Linux kernel to PERF_METRICS pseudo events. Using these encodings ensures the kernel detects them as targeting the PERF_METRICS MSR. Note that libpfm4 only provides the encodings and that it is up to the user on Linux to group them and order them properly for the perf_events interface. There exist generic counter encodings for most of the Topdown metrics and libpfm4 provides support for those via the \fBTOPDOWN\fR event. Note that all subevents of \fBTOPDOWN_M\fR use fixed counters which have, by definition, no actual event codes. The library uses the Linux pseudo event codes for them, even when compiled on non-Linux operating systems. The same holds true for any fixed counter pseudo event exported by libpfm4.
85+
86+
8287
.SH AUTHORS
8388
.nf
8489
Stephane Eranian <[email protected]>
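
The ordering rule described in the Topdown section above (one event group, SLOTS first) can be illustrated with a small sketch. The helper below is hypothetical and not part of libpfm4 or the kernel: it only formats a list of event names into a perf-style group string with "slots" moved to the front, assuming the caller passes names like those in the man page example. It opens no counters.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not a libpfm4 API): write the given event names
 * into out as "{slots,ev1,ev2,...}", placing "slots" first regardless of
 * where it appears in the input.
 * Returns 0 on success, -1 if "slots" is missing or out is too small.
 */
int build_topdown_group(const char *const *events, size_t n,
                        char *out, size_t outsz)
{
    size_t i, used;
    int have_slots = 0;
    int ret;

    assert(events != NULL && out != NULL);

    /* The group is only valid if SLOTS is among the requested events. */
    for (i = 0; i < n; i++)
        if (strcmp(events[i], "slots") == 0)
            have_slots = 1;
    if (!have_slots)
        return -1;

    /* Emit the leader first. */
    ret = snprintf(out, outsz, "{slots");
    if (ret < 0 || (size_t)ret >= outsz)
        return -1;
    used = (size_t)ret;

    /* Append the remaining events in their original order. */
    for (i = 0; i < n; i++) {
        if (strcmp(events[i], "slots") == 0)
            continue;
        ret = snprintf(out + used, outsz - used, ",%s", events[i]);
        if (ret < 0 || (size_t)ret >= outsz - used)
            return -1;
        used += (size_t)ret;
    }

    ret = snprintf(out + used, outsz - used, "}");
    if (ret < 0 || (size_t)ret >= outsz - used)
        return -1;
    return 0;
}
```

A caller would pass the names it intends to measure and a buffer, then hand the resulting string to whatever tool or layer performs the actual grouping. With the perf_events interface directly, the same rule means the SLOTS event becomes the group leader and the other events join its group.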

src/libpfm4/docs/man3/libpfm_intel_icl.3

Lines changed: 4 additions & 0 deletions
@@ -77,6 +77,10 @@ On Intel IceLake unlike older processors, the event is treated as a regular even
7777
It is not possible to combine the various requests, supplier, snoop bits anymore. Therefore the
7878
library offers the list of validated combinations as per Intel's official event list.
7979

80+
.SH Topdown via PERF_METRICS
81+
82+
Intel Icelake supports the PERF_METRICS MSR which provides support for Topdown Level 1 via a single PMU counter. This special counter provides percentages of slots for each metric. This feature must be used in conjunction with fixed counter 3 which counts SLOTS in order to work properly. The Linux kernel exposes PERF_METRICS metrics as individual pseudo events counting in slots units; however, to operate correctly, all events must be programmed together. The Linux kernel requires all PERF_METRICS events to be programmed as a single event group with SLOTS as the required first event. Example: '{slots,topdown-retiring,topdown-bad-spec,topdown-fe-bound,topdown-be-bound,topdown-heavy-ops,topdown-br-mispredict,topdown-fetch-lat,topdown-mem-bound}'. Libpfm4 provides access to the PERF_METRICS pseudo events via a dedicated event called \fBTOPDOWN_M\fR. This event uses the pseudo encodings assigned by the Linux kernel to PERF_METRICS pseudo events. Using these encodings ensures the kernel detects them as targeting the PERF_METRICS MSR. Note that libpfm4 only provides the encodings and that it is up to the user on Linux to group them and order them properly for the perf_events interface. There exist generic counter encodings for most of the Topdown metrics and libpfm4 provides support for those via the \fBTOPDOWN\fR event. Note that all subevents of \fBTOPDOWN_M\fR use fixed counters which have, by definition, no actual event codes. The library uses the Linux pseudo event codes for them, even when compiled on non-Linux operating systems. The same holds true for any fixed counter pseudo event exported by libpfm4.
83+
8084
.SH AUTHORS
8185
.nf
8286
Stephane Eranian <[email protected]>

src/libpfm4/docs/man3/libpfm_intel_spr.3

Lines changed: 11 additions & 7 deletions
@@ -1,25 +1,25 @@
11
.TH LIBPFM 3 "April, 2022" "" "Linux Programmer's Manual"
22
.SH NAME
3-
libpfm_intel_spr - support for Intel SapphireRapid core PMU
3+
libpfm_intel_spr - support for Intel SapphireRapids core PMU
44
.SH SYNOPSIS
55
.nf
66
.B #include <perfmon/pfmlib.h>
77
.sp
88
.B PMU name: spr
9-
.B PMU desc: Intel SapphireRapid
9+
.B PMU desc: Intel SapphireRapids
1010
.sp
1111
.SH DESCRIPTION
12-
The library supports the Intel SapphireRapid core PMU. It should be noted that
12+
The library supports the Intel SapphireRapids core PMU. It should be noted that
1313
this PMU model only covers each core's PMU and not the socket level
1414
PMU.
1515

16-
On SapphireRapid, the number of generic counters depends on the Hyperthreading (HT) mode.
16+
On SapphireRapids, the number of generic counters depends on the Hyperthreading (HT) mode.
1717

1818
The \fBpfm_get_pmu_info()\fR function returns the maximum number
1919
of generic counters in \fBnum_cntrs\fR.
2020

2121
.SH MODIFIERS
22-
The following modifiers are supported on Intel SapphireRapid processors:
22+
The following modifiers are supported on Intel SapphireRapids processors:
2323
.TP
2424
.B u
2525
Measure at user level which includes privilege levels 1, 2, 3. This corresponds to \fBPFM_PLM3\fR.
@@ -62,7 +62,7 @@ counts. It acts as a threshold, i.e., at least a period of N core cycles where t
6262
be used with the IDQ_*_BUBBLES umasks. If not specified, the default threshold value is 1 cycle. The valid values are in [1-4095].
6363

6464
.SH OFFCORE_RESPONSE events
65-
Intel SapphireRapid supports two encodings for offcore_response events. In the library, these are called OFFCORE_RESPONSE_0 and OFFCORE_RESPONSE_1.
65+
Intel SapphireRapids supports two encodings for offcore_response events. In the library, these are called OFFCORE_RESPONSE_0 and OFFCORE_RESPONSE_1.
6666

6767
Those events need special treatment in the performance monitoring infrastructure
6868
because each event uses an extra register to store some settings. Thus, in
@@ -73,10 +73,14 @@ The offcore_response events are exposed as a normal events by the library. The e
7373
settings are exposed as regular umasks. The library takes care of encoding the
7474
events according to the underlying kernel interface.
7575

76-
On Intel SapphireRapid unlike older processors, the event is treated as a regular event with a flat set of umasks to choose from.
76+
On Intel SapphireRapids unlike older processors, the event is treated as a regular event with a flat set of umasks to choose from.
7777
It is not possible to combine the various requests, supplier, snoop bits anymore. Therefore the
7878
library offers the list of validated combinations as per Intel's official event list.
7979

80+
.SH Topdown via PERF_METRICS
81+
82+
Intel SapphireRapids supports the PERF_METRICS MSR which provides support for Topdown Level 1 and 2 via a single PMU counter. This special counter provides percentages of slots for each metric. This feature must be used in conjunction with fixed counter 3 which counts SLOTS in order to work properly. The Linux kernel exposes PERF_METRICS metrics as individual pseudo events counting in slots units; however, to operate correctly, all events must be programmed together. The Linux kernel requires all PERF_METRICS events to be programmed as a single event group with SLOTS as the required first event. Example: '{slots,topdown-retiring,topdown-bad-spec,topdown-fe-bound,topdown-be-bound,topdown-heavy-ops,topdown-br-mispredict,topdown-fetch-lat,topdown-mem-bound}'. Libpfm4 provides access to the PERF_METRICS pseudo events via a dedicated event called \fBTOPDOWN_M\fR. This event uses the pseudo encodings assigned by the Linux kernel to PERF_METRICS pseudo events. Using these encodings ensures the kernel detects them as targeting the PERF_METRICS MSR. Note that libpfm4 only provides the encodings and that it is up to the user on Linux to group them and order them properly for the perf_events interface. There exist generic counter encodings for most of the Topdown metrics and libpfm4 provides support for those via the \fBTOPDOWN\fR event. Note that all subevents of \fBTOPDOWN_M\fR use fixed counters which have, by definition, no actual event codes. The library uses the Linux pseudo event codes for them, even when compiled on non-Linux operating systems. The same holds true for any fixed counter pseudo event exported by libpfm4.
83+
8084
.SH AUTHORS
8185
.nf
8286
Stephane Eranian <[email protected]>

src/libpfm4/docs/man3/pfm_get_event_attr_info.3

Lines changed: 13 additions & 1 deletion
@@ -46,7 +46,9 @@ typedef struct {
4646
int is_dfl:1;
4747
int is_precise:1;
4848
int is_speculative:2;
49-
int reserved:28;
49+
int support_hw_smpl:1;
50+
int support_no_mods:1;
51+
int reserved:26;
5052
};
5153
union {
5254
uint64_t dfl_val64;
@@ -136,6 +138,16 @@ information is not available. A value of \fBPFM_EVENT_INFO_SPEC_TRUE\fR indicate
136138
that the attribute counts during speculative execution. A value of \fBPFM_EVENT_INFO_SPEC_FALSE\fR
137139
indicates that the attribute does not count during speculative execution.
138140
.TP
141+
.B support_hw_smpl
142+
This boolean field indicates that the attribute (in this case a umask) supports hardware sampling.
143+
That means the hardware can sample this event+umasks without involving the kernel at each sample.
144+
.TP
145+
.B support_no_mods
146+
This boolean field indicates that the attribute (in this case a umask) does not support any hardware
147+
or software modifiers, such as privilege level filters, sampling, precise sampling, and such. This
148+
is necessary when select umasks of an event have more restrictions than others, e.g., the event and
149+
most umasks support modifiers except a few umasks.
150+
.TP
139151
.B dfl_val64, dfl_str, dfl_bool, dfl_int
140152
This union contains the value of an attribute. For PFM_ATTR_UMASK, this is
141153
the unit mask code, for all other types this is the actual value of the
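
The new flag bits can be illustrated with a small standalone sketch. It does not include pfmlib.h: the struct below is a hypothetical stand-in whose field names and widths follow the hunks in this commit, and `modifiers_allowed()` is an assumed helper showing one way a caller might act on `support_no_mods`. The library's own validation logic is not shown in this diff and may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative stand-in for the flags bitfield of pfm_event_attr_info_t
 * after this change. The type name is hypothetical; the fields mirror
 * the ones listed in this commit.
 */
typedef struct {
    unsigned int is_dfl:1;
    unsigned int is_precise:1;
    unsigned int is_speculative:2;
    unsigned int support_hw_smpl:1;
    unsigned int support_no_mods:1;
    unsigned int reserved_bits:26;  /* 1 + 1 + 2 + 1 + 1 + 26 = 32 bits */
} attr_flags_sketch_t;

/*
 * Shrinking the reserved field keeps the total at 32 bits. Bitfield layout
 * is implementation-defined; this holds on common ABIs (GCC/Clang on Linux).
 */
_Static_assert(sizeof(attr_flags_sketch_t) == sizeof(uint32_t),
               "flags bitfield should still occupy 32 bits");

/*
 * Hypothetical check: return 1 if modifiers (u, k, precise sampling, ...)
 * may be applied to an event given its selected umasks, 0 if any selected
 * umask is flagged as not supporting modifiers.
 */
int modifiers_allowed(const attr_flags_sketch_t *umasks, size_t n)
{
    size_t i;

    assert(umasks != NULL || n == 0);

    for (i = 0; i < n; i++)
        if (umasks[i].support_no_mods)
            return 0;
    return 1;
}
```

In real code the flags would come from `pfm_get_event_attr_info()` for each umask of interest. The sketch only covers the decision that the man page text describes: a few umasks of an event can be more restricted than the rest.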

src/libpfm4/examples/showevtinfo.c

Lines changed: 5 additions & 0 deletions
@@ -403,6 +403,11 @@ print_attr_flags(pfm_event_attr_info_t *info)
403403
int n = 0;
404404
int spec = info->is_speculative;
405405

406+
if (info->support_no_mods) {
407+
printf("[no modifier supported] ");
408+
n++;
409+
}
410+
406411
if (info->is_dfl) {
407412
printf("[default] ");
408413
n++;

src/libpfm4/include/perfmon/pfmlib.h

Lines changed: 3 additions & 1 deletion
@@ -819,6 +819,7 @@ typedef enum {
819819
PFM_PMU_ARM_V3, /* Arm Neoverse V3 (ARMv9) */
820820
PFM_PMU_ARM_CORTEX_A55, /* ARM Cortex A55 (ARMv8) */
821821
PFM_PMU_ARM_CORTEX_A76, /* ARM Cortex A76 (ARMv8) */
822+
PFM_PMU_ARM_N3, /* Arm Neoverse N3 */
822823
/* MUST ADD NEW PMU MODELS HERE */
823824

824825
PFM_PMU_MAX /* end marker */
@@ -961,7 +962,8 @@ typedef struct {
961962
unsigned int is_precise:1; /* Intel X86: supports PEBS */
962963
unsigned int is_speculative:2; /* count correct and wrong path occurrences */
963964
unsigned int support_hw_smpl:1;/* can be recorded by hw buffer (Intel X86=EXTPEBS) */
964-
unsigned int reserved_bits:27;
965+
unsigned int support_no_mods:1;/* attribute does not support modifiers (umask only) */
966+
unsigned int reserved_bits:26;
965967
} SWIG_NAME(flags);
966968
union {
967969
uint64_t dfl_val64; /* default 64-bit value */
