Re: [PATCH 01/18] sched: select_task_rq_fair clean up

From: Preeti U Murthy
Date: Wed Dec 12 2012 - 00:27:10 EST


On 12/11/2012 05:23 PM, Alex Shi wrote:
> On 12/11/2012 02:30 PM, Preeti U Murthy wrote:
>> On 12/11/2012 10:58 AM, Alex Shi wrote:
>>> On 12/11/2012 12:23 PM, Preeti U Murthy wrote:
>>>> Hi Alex,
>>>>
>>>> On 12/10/2012 01:52 PM, Alex Shi wrote:
>>>>> It is impossible to miss a task allowed cpu in a eligible group.
>>>>
>>>> The one thing I am concerned with here is if there is a possibility of
>>>> the task changing its tsk_cpus_allowed() while this code is running.
>>>>
>>>> i.e find_idlest_group() finds an idle group,then the tsk_cpus_allowed()
>>>> for the task changes,perhaps by the user himself,which might not include
>>>> the cpus in the idle group.After this find_idlest_cpu() is called.I mean
>>>> a race condition in short.Then we might not have an eligible cpu in that
>>>> group right?
>>>
>>> your worry make sense, but the code handle the situation, in
>>> select_task_rq(), it will check the cpu allowed again. if the answer is
>>> no, it will fallback to old cpu.
>>>>
>>>>> And since find_idlest_group only return a different group which
>>>>> excludes old cpu, it's also imporissible to find a new cpu same as old
>>>>> cpu.
>>
>> I doubt this will work correctly.Consider the following situation:sched
>> domain begins with sd that encloses both socket1 and socket2
>>
>> cpu0 cpu1 | cpu2 cpu3
>> -----------|-------------
>> socket1 | socket2
>>
>> old cpu = cpu1
>>
>> Iteration1:
>> 1.find_idlest_group() returns socket2 to be idlest.
>> 2.task changes tsk_allowed_cpus to 0,1
>> 3.find_idlest_cpu() returns cpu2
>>
>> * without your patch
>> 1.the condition after find_idlest_cpu() returns -1,and sd->child is
>> chosen which happens to be socket1
>> 2.in the next iteration, find_idlest_group() and find_idlest_cpu()
>> will probably choose cpu0 which happens to be idler than cpu1,which is
>> in tsk_allowed_cpu.
>
> Thanks for question Preeti! :)
>
> Yes, with more iteration you has more possibility to get task allowed
> cpu in select_task_rq_fair. but how many opportunity the situation
> happened? how much gain you get here?
> With LCPU increasing many many iterations cause scalability issue. that
> is the simplified forking patchset for. and that why 10% performance
> gain on hackbench process/thread.
>
> and if you insist want not to miss your chance in strf(), the current
> iteration is still not enough. How you know the idlest cpu is still
> idlest after this function finished? how to ensure the allowed cpu won't
> be changed again?
>
> A quick snapshot is enough in balancing here. we still has periodic
> balacning.

Hmm ok,let me look at this more closely.
>
>>
>> * with your patch
>> 1.the condition after find_idlest_cpu() does not exist,therefore
>> a sched domain to which cpu2 belongs to is chosen.this is socket2.(under
>> the for_each_domain() loop).
>> 2.in the next iteration, find_idlest_group() return NULL,because
>> there is no cpu which intersects with tsk_allowed_cpus.
>> 3.in select task rq,the fallback cpu is chosen even when an idle cpu
>> existed.
>>
>> So my concern is though select_task_rq() checks the
>> tsk_allowed_cpus(),you might end up choosing a different path of
>> sched_domains compared to without this patch as shown above.
>>
>> In short without the "if(new_cpu==-1)" condition we might get misled
>> doing unnecessary iterations over the wrong sched domains in
>> select_task_rq_fair().(Think about situations when not all the cpus of
>> socket2 are disallowed by the task,then there will more iterations in
>
> After read the first 4 patches, believe you will find the patchset is
> trying to reduce iterations, not increase them.

Right,sorry about not noticing this.
>
>> the wrong path of sched_domains before exit,compared to what is shown
>> above.)
>>
Regards
Preeti U Murthy

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/