Re: [PATCH v2 02/10] cpufreq: provide data for frequency-invariant load-tracking support

From: Sudeep Holla
Date: Thu Jul 13 2017 - 10:04:43 EST

On 12/07/17 12:14, Peter Zijlstra wrote:
> On Wed, Jul 12, 2017 at 02:57:55PM +0530, Viresh Kumar wrote:
>> On 12-07-17, 10:31, Peter Zijlstra wrote:
>>> So the problem with the thread is two-fold; one the one hand we like the
>>> scheduler to directly set frequency, but then we need to schedule a task
>>> to change the frequency, which will change the frequency and around we
>>> go.
>>> On the other hand, there's very nasty issues with PI. This thread would
>>> have very high priority (otherwise the SCHED_DEADLINE stuff won't work)
>>> but that then means this thread needs to boost the owner of the i2c
>>> mutex. And that then creates a massive bandwidth accounting hole.
>>> The advantage of using an interrupt driven state machine is that all
>>> those issues go away.
>>> But yes, whichever way around you turn things, its crap. But given the
>>> hardware its the best we can do.
>> Thanks for the explanation Peter.
>> IIUC, it will take more time to change the frequency eventually with
>> the interrupt-driven state machine as there may be multiple bottom
>> halves involved here, for supply, clk, etc, which would run at normal
>> priorities now. And those were boosted currently due to the high
>> priority sugov thread. And we are fine with that (from performance
>> point of view) ?
> I'm not sure what you mean; bottom halves as in softirq? From what I can
> tell an i2c bus does clk_prepare_enable() on registration and from that
> point on clk_enable() is usable from atomic contexts. But afaict clk
> stuff doesn't do interrupts at all.
> (with a note that I absolutely hate the clk locking)

Agreed. Juri pointed out this as a blocker a while ago and when we
started implementing the new and shiny ARM SCMI specification, I dropped
the whole clock layer interaction for the CPUFreq driver. However, I
still have to deal with some mailbox locking(still experimenting currently)

> I think the interrupt driven thing can actually be faster than the
> 'regular' task waiting on the mutex. The regulator message can be
> locklessly queued (it only ever makes sense to have 1 such message
> pending, any later one will invalidate a prior one).

Ah OK, I just asked the same in the other thread, you have already
answered me. Good we can ignore.

> Then the i2c interrupt can detect the availability of this pending
> message and splice it into the transfer queue at an opportune moment.
> (of course, the current i2c bits don't support any of that)
>> Coming back to where we started from (where should we call
>> arch_set_freq_scale() from ?).
> The drivers.. the core cpufreq doesn't know when (if any) transition is
> completed.
>> I think we would still need some kind of synchronization between
>> cpufreq core and the cpufreq drivers to make sure we don't start
>> another freq change before the previous one is complete. Otherwise
>> the cpufreq drivers would be required to have similar support with
>> proper locking in place.
> Not sure what you mean; also not sure why. On x86 we never know, cannot
> know. So why would this stuff be any different.

Good, I was under the same assumption that it's okay to override the old
request with new.

>> And if the core is going to get notified about successful freq changes
>> (which it should IMHO), then it may still be better to call
>> arch_set_freq_scale() from the core itself and not from individual
>> drivers.
> I would not involve the core. All we want from the core is a unified
> interface towards requesting DVFS changes. Everything that happens after
> is not its business.

The question is whether we *need* to know the completion of frequency
transition. What is the impact of absence of it ? I am considering
platforms which may take up to a ms or more to do the actual transition
in the firmware.