Re: [PATCH 01/18] sched: select_task_rq_fair clean up

From: Alex Shi
Date: Tue Dec 11 2012 - 06:53:07 EST


On 12/11/2012 02:30 PM, Preeti U Murthy wrote:
> On 12/11/2012 10:58 AM, Alex Shi wrote:
>> On 12/11/2012 12:23 PM, Preeti U Murthy wrote:
>>> Hi Alex,
>>>
>>> On 12/10/2012 01:52 PM, Alex Shi wrote:
>>>> It is impossible to miss a task allowed cpu in a eligible group.
>>>
>>> The one thing I am concerned with here is if there is a possibility of
>>> the task changing its tsk_cpus_allowed() while this code is running.
>>>
>>> i.e find_idlest_group() finds an idle group,then the tsk_cpus_allowed()
>>> for the task changes,perhaps by the user himself,which might not include
>>> the cpus in the idle group.After this find_idlest_cpu() is called.I mean
>>> a race condition in short.Then we might not have an eligible cpu in that
>>> group right?
>>
>> your worry make sense, but the code handle the situation, in
>> select_task_rq(), it will check the cpu allowed again. if the answer is
>> no, it will fallback to old cpu.
>>>
>>>> And since find_idlest_group only return a different group which
>>>> excludes old cpu, it's also imporissible to find a new cpu same as old
>>>> cpu.
>
> I doubt this will work correctly.Consider the following situation:sched
> domain begins with sd that encloses both socket1 and socket2
>
> cpu0 cpu1 | cpu2 cpu3
> -----------|-------------
> socket1 | socket2
>
> old cpu = cpu1
>
> Iteration1:
> 1.find_idlest_group() returns socket2 to be idlest.
> 2.task changes tsk_allowed_cpus to 0,1
> 3.find_idlest_cpu() returns cpu2
>
> * without your patch
> 1.the condition after find_idlest_cpu() returns -1,and sd->child is
> chosen which happens to be socket1
> 2.in the next iteration, find_idlest_group() and find_idlest_cpu()
> will probably choose cpu0 which happens to be idler than cpu1,which is
> in tsk_allowed_cpu.

Thanks for question Preeti! :)

Yes, with more iteration you has more possibility to get task allowed
cpu in select_task_rq_fair. but how many opportunity the situation
happened? how much gain you get here?
With LCPU increasing many many iterations cause scalability issue. that
is the simplified forking patchset for. and that why 10% performance
gain on hackbench process/thread.

and if you insist want not to miss your chance in strf(), the current
iteration is still not enough. How you know the idlest cpu is still
idlest after this function finished? how to ensure the allowed cpu won't
be changed again?

A quick snapshot is enough in balancing here. we still has periodic
balacning.

>
> * with your patch
> 1.the condition after find_idlest_cpu() does not exist,therefore
> a sched domain to which cpu2 belongs to is chosen.this is socket2.(under
> the for_each_domain() loop).
> 2.in the next iteration, find_idlest_group() return NULL,because
> there is no cpu which intersects with tsk_allowed_cpus.
> 3.in select task rq,the fallback cpu is chosen even when an idle cpu
> existed.
>
> So my concern is though select_task_rq() checks the
> tsk_allowed_cpus(),you might end up choosing a different path of
> sched_domains compared to without this patch as shown above.
>
> In short without the "if(new_cpu==-1)" condition we might get misled
> doing unnecessary iterations over the wrong sched domains in
> select_task_rq_fair().(Think about situations when not all the cpus of
> socket2 are disallowed by the task,then there will more iterations in

After read the first 4 patches, believe you will find the patchset is
trying to reduce iterations, not increase them.

> the wrong path of sched_domains before exit,compared to what is shown
> above.)
>
> Regards
> Preeti U Murthy
>
>


--
Thanks
Alex
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/