[PATCH] sched: Fix infinite loop in update_blocked_averages() by reverting a9e7f6544b9c

From: Ingo Molnar
Date: Sun Dec 30 2018 - 07:31:33 EST



* Ingo Molnar <mingo@xxxxxxxxxx> wrote:

>
> * Vincent Guittot <vincent.guittot@xxxxxxxxxx> wrote:
>
> > > Reported-by: Zhipeng Xie <xiezhipeng1@xxxxxxxxxx>
> > > Cc: Bin Li <huawei.libin@xxxxxxxxxx>
> > > Cc: <stable@xxxxxxxxxxxxxxx> [4.10+]
> > > Fixes: 9c2791f936ef (sched/fair: Fix hierarchical order in rq->leaf_cfs_rq_list)
> >
> > If it only happens in update_blocked_averages(), the del leaf has been added by:
> > a9e7f6544b9c (sched/fair: Fix O(nr_cgroups) in load balance path)
>
> So I think until we are confident in the proposed fixes, how about
> applying Linus's patch that reverts a9e7f6544b9c and simplifies the list
> manipulation?
>
> That way we can re-introduce the O(nr_cgroups) optimization without
> pressure.
>
> I'll prepare a commit for sched/urgent that does this, please holler if
> any of you disagrees!

I've applied the patch below to tip:sched/urgent and I'll push it out if
all goes well in testing:

1e2adc76e619: ("sched: Fix infinite loop in update_blocked_averages() by reverting a9e7f6544b9c")

I've preemptively added the Tested-by tags of the gents who found and
analyzed this bug:

Tested-by: Zhipeng Xie <xiezhipeng1@xxxxxxxxxx>
Tested-by: Sargun Dhillon <sargun@xxxxxxxxx>

... on the assumption that you'll test Linus's fix to make sure it's
all good!

[ Will probably update the commit with acks and any other feedback before
sending it to Linus tomorrow-ish. We don't want to end 2018 with a
known scheduler bug in the upstream tree! ;-) ]

Thanks,

Ingo

===========================>