Re: [RFC PATCH 0/6] Support raw event and DT for perf on RISC-V

From: Alan Kao
Date: Tue Jun 30 2020 - 21:18:39 EST


On Mon, Jun 29, 2020 at 11:19:09AM +0800, Zong Li wrote:
> This patch set adds raw event support on RISC-V. In addition, we
> introduce the DT mechanism to make our perf more generic and common.
>
> Currently, we set the hardware events by writing the mhpmeventN CSRs, it
> would raise an illegal instruction exception and trap into m-mode to
> emulate event selector CSRs access. It doesn't make sense because we
> shouldn't write the m-mode CSRs in s-mode. Ideally, we should set event
> selector through standard SBI call or the shadow CSRs of s-mode. We have
> prepared a proposal of a new SBI extension, called "PMU SBI extension",
> but we also discussing the feasibility of accessing these PMU CSRs on
> s-mode at the same time, such as delegation mechanism, so I was
> wondering if we could use SBI calls first and make the PMU SBI extension
> as legacy when s-mode access mechanism is accepted by Foundation? or
> keep the current situation to see what would happen in the future.
>
> This patch set also introduces the DT mechanism, we don't want to add too
> much platform-dependency code in perf like other architectures, so we
> put the mapping of generic hardware events to DT, then we can easy to
> transfer generic hardware events to vendor's own hardware events without
> any platfrom-dependency stuff in our perf.
>
> Zong Li (6):
> dt-bindings: riscv: Add YAML documentation for PMU
> riscv: dts: sifive: Add DT support for PMU
> riscv: add definition of hpmcounter CSRs
> riscv: perf: Add raw event support
> riscv: perf: introduce DT mechanism
> riscv: remove PMU menu of Kconfig
>

DT-based PMU registration looks good to me. Together with Anup's feedback,
we can anticipate the following work items:

- rewrite RISC-V PMU to a platform driver
- propose SBI PMU extension
- fixes: RV32 counter access, naming, etc.

Yes, these are all good directions toward a better counting (`perf stat`)
function. But as the original author of the RISC-V perf port, please allow me
to address the fundamental problem of RISC-V perf, again [0][1][2][3]: the
sampling (`perf record`) function has never earned enough respect. Counting
gives you a shallow view of an application, while sampling demystifies it for
you.

The problems are three-fold:
(1) Interrupt
Sampling in perf requires that an HPM counter raises an interrupt when it
overflows. Whether or not RISC-V perf becomes a platform driver has nothing to
do with this. This requires more discussion in the TGs.
(2) S-mode access to PMU CSRs
This is also addressed in this patch set, but to me it looks like an
SBI-solves-them-all mindset. Perf events are for performance monitoring, so
we should eliminate any possible overhead if we can. Setting event masks
through SBI calls may be OK for counting, but if we really take sampling and
interrupt handling into consideration, it is questionable whether it is still
a viable way.
(3) Registers, registers, registers
There are simply not enough CSRs/functions for perf sampling. The previous
proposal explains why [2].
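
For readers less familiar with why (1) matters: sampling usually works by
preloading a counter so that it wraps to zero after a chosen number of events,
and taking a sample in the overflow interrupt. Without that interrupt there is
nothing to trigger the sample. Below is a minimal sketch in plain user-space C,
not kernel code; `struct hpm_model` and the `hpm_*` helpers are hypothetical
names made up for illustration, and the "interrupt" is modelled as a simple
return value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical software model of one 64-bit HPM counter (illustration only). */
struct hpm_model {
	uint64_t value;   /* current counter value */
	uint64_t period;  /* number of events between two samples */
	uint64_t samples; /* overflow "interrupts" taken so far */
};

/* Preload the counter so that it wraps to zero after exactly `period` events. */
static void hpm_arm(struct hpm_model *c, uint64_t period)
{
	assert(period > 0);
	c->period = period;
	c->value = (uint64_t)0 - period; /* i.e. 2^64 - period */
}

/*
 * Count one event. When the counter wraps to zero, record a sample and
 * re-arm it, as an overflow interrupt handler would. Returns 1 if a sample
 * was taken, 0 otherwise.
 */
static int hpm_tick(struct hpm_model *c)
{
	c->value++;
	if (c->value != 0)
		return 0;
	c->samples++;
	c->value = (uint64_t)0 - c->period;
	return 1;
}

/* Feed `n` events into the counter and return how many samples they caused. */
static uint64_t hpm_run(struct hpm_model *c, uint64_t n)
{
	uint64_t before = c->samples;

	for (uint64_t i = 0; i < n; i++)
		hpm_tick(c);
	return c->samples - before;
}
```

With a period of 1000, the first 999 events produce no sample, the 1000th
produces one, and every further 1000 events produce one more. On real hardware
the wrap-around only helps if it raises an interrupt, which is exactly what is
missing today.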

Perf sampling is off-topic but somehow related, so I bring it up here just
for your information.

As this patch set goes to v2, the PMU porting guide in [0] should be removed,
since it no longer contains any useful information.

[0] Documentation/riscv/pmu.rst
[1] https://www.youtube.com/watch?v=Onvlcl4e2IU
[2] https://github.com/riscv/riscv-isa-manual/issues/402
This proposal has been posted to the Privileged Spec Task Group, in
https://lists.riscv.org/g/tech-privileged-archive/message/488?p=,,,20,0,0,0::Created,,Proposal,20,2,40,32306071
but never received any feedback.
[3] https://lists.riscv.org/g/tech-unixplatformspec/message/84
I intended to discuss [2] in the Unixplatform Spec Task Group at the
online meeting, but obviously people were too busy knowing who the new
RISC-V CTO is and what he has done to even follow the agenda.