Re: [PATCH v4 1/2] perf: uncore: Adding documentation for ThunderX2 pmu uncore driver
From: Randy Dunlap
Date: Thu Apr 26 2018 - 16:56:10 EST
Hi,
Just a few typo corrections...
On 04/25/2018 02:00 AM, Ganapatrao Kulkarni wrote:
> Documentation for the UNCORE PMUs on Cavium's ThunderX2 SoC.
> The SoC has PMU support in its L3 cache controller (L3C) and in the
> DDR4 Memory Controller (DMC).
>
> Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@xxxxxxxxxx>
> ---
> Documentation/perf/thunderx2-pmu.txt | 66 ++++++++++++++++++++++++++++++++++++
> 1 file changed, 66 insertions(+)
> create mode 100644 Documentation/perf/thunderx2-pmu.txt
>
> diff --git a/Documentation/perf/thunderx2-pmu.txt b/Documentation/perf/thunderx2-pmu.txt
> new file mode 100644
> index 0000000..9e9f535
> --- /dev/null
> +++ b/Documentation/perf/thunderx2-pmu.txt
> @@ -0,0 +1,66 @@
> +
> +Cavium ThunderX2 SoC Performance Monitoring Unit (PMU UNCORE)
> +==========================================================================
> +
> +ThunderX2 SoC PMU consists of independent system wide per Socket PMUs such
> +as Level 3 Cache(L3C) and DDR4 Memory Controller(DMC).
> +
> +It has 8 independent DMC PMUs to capture performance events corresponding
> +to 8 channels of DDR4 Memory Controller. There are 16 independent L3C PMUs
> +to capture events corresponding to 16 tiles of L3 cache. Each PMU supports
> +up to 4 counters.
> +
> +Counters are independent programmable and can be started and stopped
independently
> +individually. Each counter can be set to sample specific perf events.
> +Counters are 32 bit and does not support overflow interrupt, they are
do not interrupt; they are
> +sampled at every 2 seconds. The Counters register access are multiplexed
> +across channels of DMC and L3C. The muxing(select channel) is done through
> +write to a Secure register using smcc calls.
> +
> +PMU UNCORE (perf) driver:
> +
> +The thunderx2-pmu driver registers several perf PMUs for DMC and L3C devices.
> +Each of the PMU provides description of its available events
of the PMUs
> +and configuration options in sysfs.
> + see /sys/devices/uncore_<l3c_S_X/dmc_S_X/>
> +
> +S is socket id and X represents channel number.
> +Each PMU can be used to sample up to 4 events simultaneously.
> +
> +The "format" directory describes format of the config (event ID).
> +The "events" directory provides configuration templates for all
> +supported event types that can be used with perf tool.
> +
> +For example, "uncore_dmc_0_0/cnt_cycles/" is an
> +equivalent of "uncore_dmc_0_0/config=0x1/".
> +
> +Each perf driver also provides a "cpumask" sysfs attribute, which contains a
> +single CPU ID of the processor which is likely to be used to handle all the
> +PMU events. It will be the first online CPU from the NUMA node of PMU device.
> +
> +Example for perf tool use:
> +
> +perf stat -a -e \
> +uncore_dmc_0_0/cnt_cycles/,\
> +uncore_dmc_0_1/cnt_cycles/,\
> +uncore_dmc_0_2/cnt_cycles/,\
> +uncore_dmc_0_3/cnt_cycles/,\
> +uncore_dmc_0_4/cnt_cycles/,\
> +uncore_dmc_0_5/cnt_cycles/,\
> +uncore_dmc_0_6/cnt_cycles/,\
> +uncore_dmc_0_7/cnt_cycles/ sleep 1
> +
> +perf stat -a -e \
> +uncore_dmc_0_0/cancelled_read_txns/,\
> +uncore_dmc_0_0/cnt_cycles/,\
> +uncore_dmc_0_0/consumed_read_txns/,\
> +uncore_dmc_0_0/data_transfers/ sleep 1
> +
> +perf stat -a -e \
> +uncore_l3c_0_0/l3_retry/,\
> +uncore_l3c_0_0/read_hit/,\
> +uncore_l3c_0_0/read_request/,\
> +uncore_l3c_0_0/inv_request/ sleep 1
> +
> +The driver does not support sampling, therefore "perf record" will
> +not work. Per-task (without "-a") perf sessions are not supported.
>
--
~Randy