Re: [PATCH V2 3/3] cpufreq: CPPC: Add support for frequency invariance

From: Viresh Kumar
Date: Thu Jun 17 2021 - 07:19:43 EST


On 17-06-21, 11:34, Ionela Voinescu wrote:
> I might be missing something, but when you offline a single CPU in a
> policy, the worse that can happen is that a last call to
> cppc_scale_freq_tick() would have sneaked in before irqs and the tick
> are disabled. But even if we have a last call to
> cppc_scale_freq_workfn(), the counter read methods would know how to
> cope with hotplug, and the cppc_cpudata structure would still be
> allocated and have valid desired_perf and highest_perf values.

Okay, I somehow assumed that cppc_scale_freq_workfn() needs to run on the local
CPU, while it can actually land anywhere. My fault.

But the irq-work being queued here is per-cpu and it will get queued on the
local CPU where the tick occurred.

Now I am not sure of what will happen to this irq-work in that case. I tried to
look now and I see that these irq-work items are processed first on tick and
then the tick handler of scheduler is called, so that will again queue the cppc
irq-work.

What happens if this happens along with CPU hotplug ? I am not sure I understand
that. There may or may not be any side effects of this. Lets assume the work
item is left in the queue as is and no tick happens after that as the CPU is
offlined. We are good.

Now if we unload the cpufreq driver at this moment, the driver will call
irq_work_sync(), which may end up in a while loop ? There is no
irq-work-cancel() API.

Peter: Can you help here on this ? Lemme try to explain a bit here:

We are starting an irq-work (in cppc cpufreq driver) from
scheduler_tick()->arch_scale_freq_tick(). What will happen if the driver doesn't
take care of CPU hotplug explicitly and make sure this work isn't queued again
from the next tick.

Is it important for user to make sure it gets rid of the irq-work during hotplug
here ?

> Worse case, the last scale factor set for the CPU will be meaningless,
> but it's already meaningless as the CPU is going down.
>
> When you are referring to the issue reported by Qian I suppose you are
> referring to this [1]. I think this is the case where you hotplug the
> last CPU in a policy and free cppc_cpudata.
>
> [1] https://lore.kernel.org/linux-pm/41f5195e-0e5f-fdfe-ba37-34e1fd8e4064@xxxxxxxxxxx/

Yes, I was talking about this report only, I am not sure now if I understand
what actually happened here :)

Ionela: I have skipped replying to rest of your email, will get back to that
once we have clarification on the above first.

Thanks a lot for your reviews, always on time :)

--
viresh