Re: [PATCH 0/2] Add exit_prepare callback to the cpufreq_driver interface.
From: Srivatsa S. Bhat
Date: Tue Mar 18 2014 - 08:16:44 EST
On 03/15/2014 07:29 AM, Rafael J. Wysocki wrote:
> On Friday, March 14, 2014 11:29:04 AM Dirk Brandewie wrote:
>> On 03/14/2014 10:07 AM, Viresh Kumar wrote:
>>> On 14 March 2014 20:40, Dirk Brandewie <dirk.brandewie@xxxxxxxxx> wrote:
>>>> Are you proposing adding cpufreq_generic_suspend() to the core I can not
>>>> find
>>>> it in the mainline code.
>>>
>>> Its already there in linux-next. I am suggesting to reuse that
>>> infrastructure with
>>> some necessary modification to support both suspend and hotplug.
>>
>> Suspend and hotplug are two very different things and if we start
>> crossing those wires bad things are going to happen IMHO.
>>
>> In "normal" operation using the suspend path to do this work could
>> work in principal but doesn't handle the case where the user does
>> echo 0 | sudo tee /sys/devices/system/cpu/cpuX/online
>>
>> Trying force hotplug and suspend into a common mechanism would
>> lead to a bunch of special case code or a significant rework of the
>> core code IMHO.
>>
>>
>>>
>>>>> Over that I don't think Dirk's solution is going to work if we twist
>>>>> the systems a bit.
>>>>
>>>> Could you explain what "twist the systems a bit" means?
>>>
>>> The one I explained in the below paragraph.
>>>
>>>>> For example, Dirk probably wanted to set P-STATE of every core to MIN
>>>>> when it goes down. But his solution probably doesn't do that right now.
>>>>>
>>>>
>>>> No, I wanted to set the core that was being off-lined to min P state.
>>>
>>> Sorry, probably my words were a bit confusing. I meant exactly what
>>> you just wrote. Core going down will set its freq to min.
>>>
>>>>> As exit() is called only when the policy expires or all the CPUs of that
>>>>> policy
>>>>> are down. Suppose only one CPU out of 4 goes down from a policy, then
>>>>> pstate driver would never know that happened. And that core wouldn't go
>>>>> to min state.
>>>>
>>>> My patch does not change the semantics of exit() or when it is called. For
>>>> intel_pstate their is one cpu per policy so I moved all the cleanup to
>>>
>>> I didn't knew that its guaranteed by pstate driver. I thought it would still be
>>> hardware dependent as some cores might share clock line.
>>
>> This is guaranteed by the hardware. Each core has its own MSR for P state
>> request. Any coordination that is required between cores to select the
>> package P state is handled by the hardware.
>>
>>>
>>>> exit_prepare() if there were work I needed to do at CPU_POST_DEAD I would
>>>> have
>>>> continued to export the *optional* exit callback.
>>>>
>>>> The callback name exit_prepare() was probably unfortunate and might be
>>>> causing
>>>> some confusion. I will be changing the name per Rafael's feedback.
>>>
>>> Don't know.. There is another problem here that exit_prepare() would be called
>>> for each CPU whereas exit() would be called for each policy.
>>
>> Granted but I don't see this as a problem in this case there is a 1:1
>> relationship. If a driver chooses to use the *optional* exit_prepare() callback
>> and knows that there is a many:1 relationship between the policy and CPUs
>> then it would have to deal with it.
>
> Actually, I think we should make it clear that the new callback is for
> ->setpolicy drivers only, which will make things a bit clearer.
>
Agreed. As far as I understand, for ->target drivers, today we use GOV_STOP
to stop managing the CPU going offline. And for ->setpolicy drivers, we will
use this new callback to achieve the same goal.
Regards,
Srivatsa S. Bhat
> We seem to get caught by the difference between ->setpolicy and ->target
> drivers on a regular basis, so it might be a good idea to make the distinction
> more clear in the code. I have an idea how to do that, but need some time
> to prototype it.
>
>>> And I strongly feel that we shouldn't give another callback here but instead
>>> just set core to a specific freq as mentioned by driver with some other field.
>>>
>>>>> I think we have two solutions here:
>>>>> - If its just about setting core a particular freq when it goes down, I
>>>>> think it
>>>>> looks a generic enough problem and so better fix core for that. Probably
>>>>> with
>>>>> help of flags field/suspend-freq (maybe renamed) and without calling
>>>>> drivers
>>>>> exit() at all..
>>>>
>>>>
>>>> ATM the only thing that needs to be done in this case is to allow
>>>> intel_pstate
>>>> to set the P state on the core when it is going done. My solution from the
>>>> cores point of view is more generic, it allows any driver that needs to do
>>>> work
>>>> during CPU_DOWN_PREPARE to do it without adding any new logic to the core.
>>>
>>> Yeah, do we really need to give that freedom right now? Would be better
>>> to get this into core as that would be more generic and people looking to set
>>> core to some freq at shutdown wouldn't be replicating that code.
>
> Question is if it needs to be more generic.
>
> I honestly don't think that ->target drivers will ever do anything like it,
> because they need the governor to "exit" before. So we are talking about the
> only two ->setpolicy drivers in the tree here.
>
>> IMHO yes and it would be hard to be more generic, if your platform needs to
>> do architecture specific during the PREPARE phase of cpu hotplug use this
>> callback or not.
>>
>> BTW now that you have added a path where the cpufreq_suspend() could fail
>> it return a value and be checked in dpm_suspend() instead of printing an
>> error and just continuing.
>
> I'm not sure what you mean? Are you saying that it might be a good idea
> to allow cpufreq_suspend() to return error codes on failure?
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/