.\" DO NOT MODIFY THIS FILE!  It was generated by help2man 1.47.13.
.TH TOPLEV.PY "1" "April 2020" "toplev.py toplev" "User Commands"
.SH NAME
toplev.py \- manual page for toplev.py toplev
.SH DESCRIPTION
usage: toplev [options] perf\-arguments
.PP
Estimate on which part of the CPU pipeline a workload bottlenecks using the TopDown model.
The bottlenecks are expressed as a tree with different levels.
Requires a modern Intel CPU.
.SH ENVIRONMENT
.TP
\fB\-\-force\-cpu\fR {snb,jkt,ivb,ivt,hsw,hsx,slm,bdw,bdx,simple,skl,knl,skx,clx,icl}
Force CPU type
.TP
\fB\-\-force\-topology\fR findsysoutput
Use the specified topology file (output of find \fI\,/sys/devices\/\fP)
.TP
\fB\-\-force\-cpuinfo\fR cpuinfo
Use specified cpuinfo file (\fI\,/proc/cpuinfo\/\fP)
.TP
\fB\-\-force\-hypervisor\fR
Assume running under hypervisor (no uncore, no
offcore, no PEBS)
.TP
\fB\-\-no\-uncore\fR
Disable uncore events
.TP
\fB\-\-no\-check\fR
Do not check that PMU units exist
.SS "Additional information:"
.TP
\fB\-\-print\-group\fR, \fB\-g\fR
Print event group assignments
.TP
\fB\-\-raw\fR
Print raw values
.TP
\fB\-\-valcsv\fR VALCSV, \fB\-V\fR VALCSV
Write raw counter values into CSV file
.TP
\fB\-\-stats\fR
Show statistics on which events were counted
.SS "Sampling:"
.TP
\fB\-\-show\-sample\fR
Show command line to rerun workload with sampling
.TP
\fB\-\-run\-sample\fR
Automatically rerun workload with sampling
.TP
\fB\-\-sample\-args\fR SAMPLE_ARGS
Extra arguments to pass to perf record for sampling.
Use + to specify \-.
.TP
\fB\-\-sample\-repeat\fR SAMPLE_REPEAT
Repeat measurement and sampling N times. This
interleaves counting and sampling. Useful for
background collection with \fB\-a\fR sleep X.
.TP
\fB\-\-sample\-basename\fR SAMPLE_BASENAME
Base name of sample perf.data files
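.PP
As a sketch of how the sampling options above might be combined (./program and the base name mysample are placeholders; +g stands for \-g, which is assumed to be accepted by the installed perf record):
.PP
.nf
toplev.py \-l2 \-\-run\-sample \-\-sample\-args "+g" \-\-sample\-basename mysample ./program
.fi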
.PP
Other perf arguments are allowed (see the perf documentation).
After \fB\-\-\fR, perf arguments that conflict with toplev options can be used.
.PP
Some caveats:
.PP
toplev defaults to measuring the full system and shows data
for all CPUs. Use taskset to limit the workload to known CPUs if needed.
In some cases (idle system, single threaded workload) \fB\-\-single\-thread\fR
can also be used.
.PP
The lower levels of the measurement tree are less reliable
than the higher levels.  They also rely on counter multiplexing
and cannot run each equation in a single group, which can cause larger
measurement errors with non\-steady\-state workloads.
.PP
(If you don't understand this terminology, it means measurements
at higher level numbers, deeper in the tree, are less accurate, and toplev works best with programs that primarily
do the same thing over and over.)
.PP
If the program is very reproducible, such as a simple kernel,
it is also possible to use \fB\-\-no\-multiplex\fR. In this case the
workload is rerun multiple times until all data is collected.
Do not use together with sleep.
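.PP
For instance, assuming a short deterministic benchmark named ./bench (a placeholder, not part of toplev), a run without multiplexing might look like:
.PP
.nf
toplev.py \-l3 \-\-no\-multiplex ./bench
.fi
.PP
Here ./bench is executed several times, so it should behave the same way on every run.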
.PP
toplev needs a new enough perf tool and has specific requirements on
the kernel. See http://github.com/andikleen/pmu\-tools/wiki/toplev\-kernel\-support
.PP
Other CPUs can be forced with FORCECPU=name.
This usually requires setting the correct event map with EVENTMAP=...
The topology can be overridden with TOPOLOGY=file (sysfs filenames) and CPUINFO=file
(\fI\,/proc/cpuinfo\/\fP replacement).
.PP
Valid CPU names: snb jkt ivb ivt hsw hsx slm bdw bdx simple skl knl skx clx icl
.SH EXAMPLES
.TP
toplev.py \-l2 program
Measure the whole system at level 2 while program is running.
.TP
toplev.py \-l1 \-\-single\-thread program
Measure a single\-threaded program. The system must be idle.
.TP
toplev.py \-l3 \-\-no\-desc \-I 100 \-x, sleep X
Measure the whole system for X seconds every 100ms, outputting in CSV format.
.TP
toplev.py \-\-all \-\-core C0 taskset \-c 0,1 program
Measure program running on core 0 with all nodes and metrics enabled.
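.PP
As a further sketch that combines only options documented on this page (the output file name out.csv is a placeholder, and the exact output layout is not described here):
.TP
toplev.py \-l2 \-I 1000 \-x, \-o out.csv \-\-summary sleep 10
Measure the whole system at level 2 every 1000ms for 10 seconds, writing comma\-separated output to out.csv and printing a summary at the end.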
.SH OPTIONS
.SS "optional arguments:"
.TP
\-h, \-\-help
show this help message and exit
.SS "General operation:"
.TP
\-\-interval INTERVAL, \-I INTERVAL
Measure every INTERVAL ms instead of only once
.TP
\-\-no\-multiplex
Do not multiplex, but run the workload multiple times
as needed. Requires reproducible workloads.
.TP
\-\-single\-thread, \-S
Measure workload as single thread. Workload must run
single threaded. In SMT mode the other thread must be
idle.
.TP
\-\-fast, \-F
Skip sanity checks to optimize CPU consumption
.TP
\-\-import _IMPORT
Import specified perf stat output file instead of
running perf. Must be for same CPU, same arguments,
same /proc/cpuinfo, same topology, unless overridden
.TP
\-\-gen\-script
Generate script to collect perfmon information for
\-\-import later
.SS "Measurement filtering:"
.TP
\-\-kernel
Only measure kernel code
.TP
\-\-user
Only measure user code
.TP
\-\-core CORE
Limit output to cores. Comma list of Sx\-Cx\-Tx. All
parts optional.
.SS "Select events:"
.TP
\-\-level LEVEL, \-l LEVEL
Measure up to level N (max 6)
.TP
\-\-metrics, \-m
Print extra metrics
.TP
\-\-sw
Measure perf Linux metrics
.TP
\-\-no\-util
Do not measure CPU utilization
.TP
\-\-tsx
Measure TSX metrics
.TP
\-\-all
Measure everything available
.TP
\-\-frequency
Measure frequency
.TP
\-\-power
Display power metrics
.TP
\-\-nodes NODES
Include or exclude nodes (with + to add, \-|^ to
remove, comma separated list, wildcards allowed)
.TP
\-\-reduced
Use reduced server subset of nodes/metrics
.TP
\-\-metric\-group METRIC_GROUP
Add (+) or remove (\-|^) groups of metrics,
comma separated list from \-\-list\-metric\-groups.
.SS "Query nodes:"
.TP
\-\-list\-metrics
List all metrics
.TP
\-\-list\-nodes
List all nodes
.TP
\-\-list\-metric\-groups
List metric groups
.TP
\-\-list\-all
List every supported node/metric/metricgroup
.SS "Workarounds:"
.TP
\-\-no\-group
Don't use groups
.TP
\-\-force\-events
Assume kernel supports all events. May give wrong
results.
.TP
\-\-ignore\-errata
Do not disable events with errata
.TP
\-\-handle\-errata
Disable events with errata
.SS "Output:"
.TP
\-\-per\-core
Aggregate output per core
.TP
\-\-per\-socket
Aggregate output per socket
.TP
\-\-per\-thread
Aggregate output per CPU thread
.TP
\-\-global
Aggregate output for all CPUs
.TP
\-\-no\-desc
Do not print event descriptions
.TP
\-\-desc
Force event descriptions
.TP
\-\-verbose, \-v
Print all results even when below threshold or
exceeding boundaries. Note this can result in bogus
values, as the TopDown methodology relies on
thresholds to correctly characterize workloads.
.TP
\-\-csv CSV, \-x CSV
Enable CSV mode with specified delimiter
.TP
\-\-output OUTPUT, \-o OUTPUT
Set output file
.TP
\-\-split\-output
Generate multiple output files, one for each specified
aggregation option (with \-o)
.TP
\-\-graph
Automatically graph interval output with tl\-barplot.py
.TP
\-\-graph\-cpu GRAPH_CPU
CPU to graph using \-\-graph
.TP
\-\-title TITLE
Set title of graph
.TP
\-\-quiet
Avoid unnecessary status output
.TP
\-\-long\-desc
Print long descriptions instead of abbreviated ones.
.TP
\-\-columns
Print CPU output in multiple columns for each node
.TP
\-\-summary
Print summary at the end. Only useful with \-I
.TP
\-\-no\-area
Hide area column
.TP
\-\-perf\-output PERF_OUTPUT
Save perf stat output in specified file
.TP
\-\-no\-perf
Don't print perf command line
.TP
\-\-print
Only print perf command line. Don't run it.