Re: [RFC PATCH] perf: New start period for the freq mode
From: Liang, Kan
Date: Tue Sep 03 2024 - 11:29:31 EST
On 2024-09-02 6:38 a.m., Peter Zijlstra wrote:
> On Thu, Aug 29, 2024 at 11:13:42PM -0700, Namhyung Kim wrote:
>> Hi Kan,
>>
>> On Thu, Aug 29, 2024 at 08:20:36AM -0700, kan.liang@xxxxxxxxxxxxxxx wrote:
>>> From: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
>>>
>>> The freq mode is the current default mode of Linux perf. 1 period is
>>> used as a start period. The period is auto-adjusted in each tick or an
>>> overflow to meet the frequency target.
>>>
>>> The start period 1 is too low and may trigger some issues.
>>> - Many HWs do not support period 1 well.
>>> https://lore.kernel.org/lkml/875xs2oh69.ffs@tglx/
>
> So we already have x86_pmu::limit_period and pmu::check_period to deal
> with this. Don't they already capture the 1 and increase it where
> appropriate?
The limit_period only enforces the minimum period the HW can accept. If
the value is lower than that, I think HW errors may be triggered. It's a
mandatory requirement.
However, that minimum is not necessarily a good start value for perf to
use in the default freq mode.
As you can see in Thomas's experiment, setting the start period to 1
doesn't trigger a HW issue. But the message "perf: interrupt took too
long (2503 > 2500), lowering ..." is printed. That should be a false
alarm. To avoid it, 32 was eventually used for the limit_period.
https://lore.kernel.org/lkml/87plq9l5d2.ffs@tglx/
We cannot always address the above issue this way.
- It's impossible to test all the platforms to find a perfect "32" for
each platform.
- Some events may need a very low period. We cannot set the limit_period
too high.
Furthermore, a low start period for a frequently occurring event
challenges both HW and virtualization, where the path to handle a PMI
is longer.
I think we need a better start period for the default freq mode.
Yes, there is already pmu::check_period, which is period related. I
will check whether it can be modified to feed back a start value somehow.
>
>>> - For an event that occurs frequently, period 1 is too far away from the
>>> real period. Lots of the samples are generated at the beginning.
>>> The distribution of samples may not be even.
>
> Which is why samples include a WEIGHT option IIRC.
>
The WEIGHT provides various kinds of latency information to help
interpret a sample. But it doesn't change the distribution of the samples.
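
To illustrate why the samples cluster at the beginning, here is a toy
model. It is not the kernel code: it only assumes that each overflow moves
the period 1/8 of the remaining distance toward the ideal period, rounded
up. The toy_* names are made up for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Toy model, not the kernel implementation: one damped adjustment step.
 * The period moves 1/8 of the remaining distance toward 'ideal'
 * (rounded up when ramping up).
 */
static uint64_t toy_adjust_period(uint64_t period, uint64_t ideal)
{
	int64_t delta = (int64_t)ideal - (int64_t)period;

	delta = (delta + 7) / 8;
	return (uint64_t)((int64_t)period + delta);
}

/*
 * Count the samples taken while the period ramps up from 'start' to
 * 'ideal'. Every loop iteration stands for one overflow, i.e. one sample.
 */
static unsigned int toy_rampup_samples(uint64_t start, uint64_t ideal)
{
	unsigned int samples = 0;

	assert(start > 0 && ideal > 0);

	while (start < ideal) {
		start = toy_adjust_period(start, ideal);
		samples++;
	}
	return samples;
}
```

In this model a start period of 1 needs more overflows to reach an ideal
period of 1000 than a start period of 500 does, and those extra samples
are all taken within the first few events.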
>> Sounds like a per-pmu callback is fine. PMUs don't have the callback
>> (including SW) can use 1 same as of now.
>
> This, but also, be very careful to not over-estimate, because ramping up
> is fast, but having to adjust down can take a while.
Sure.
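
A rough sketch of the fallback, just to show the idea. struct toy_pmu and
the freq_start_period callback are hypothetical names, not the existing
struct pmu API, and the 32 in the example callback is only a placeholder:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a PMU description; not the kernel's struct pmu. */
struct toy_pmu {
	/* Optional hint for the freq mode start period; NULL if not provided. */
	uint64_t (*freq_start_period)(uint64_t sample_freq);
};

/*
 * Pick the start period for freq mode. PMUs without the callback
 * (including SW) keep using 1, the same as now.
 */
static uint64_t toy_freq_start_period(const struct toy_pmu *pmu,
				      uint64_t sample_freq)
{
	uint64_t period = 1;

	assert(sample_freq > 0);

	if (pmu && pmu->freq_start_period)
		period = pmu->freq_start_period(sample_freq);

	/* Never hand back 0, whatever the callback returned. */
	return period ? period : 1;
}

/* Example callback with a conservative placeholder value. */
static uint64_t toy_hw_start_period(uint64_t sample_freq)
{
	(void)sample_freq;
	return 32;
}
```

A real callback would have to stay conservative, since an over-estimated
start period has to be adjusted down later.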
Thanks,
Kan