Re: [PATCH] sched: fix NULL pointer issue in pick_next_entity()

From: Peter Zijlstra
Date: Tue Aug 01 2017 - 05:12:23 EST


On Tue, Aug 01, 2017 at 04:57:43PM +0800, Yafang Shao wrote:
> > And how would that happen? We only call pick_next_entity(.curr=NULL)
> > when we _know_ cfs_rq->nr_running.
>
> It crashed my machine when I ran a hadoop test, and after I made this
> change it works now.
> On an SMP system, cfs_rq->nr_running isn't well protected: although we
> _know_ cfs_rq->nr_running, it can be modified by another thread running
> on another CPU, and the sched_entity is set to NULL as well.
> This thread then crashes here because it accesses the NULL pointer.

cfs_rq->nr_running should be protected by the rq->lock. If it is not,
something else is buggered.
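
To illustrate the invariant, here is a minimal userspace sketch, not the
kernel code. The names toy_rq, toy_entity, toy_enqueue() and toy_pick_next()
are hypothetical, and a pthread mutex stands in for rq->lock. It assumes the
count and the queued entities are only ever changed together with the lock
held, so a nonzero count seen under the lock implies a non-NULL entity.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's structures. */
struct toy_entity {
	struct toy_entity *next;
	int id;
};

struct toy_rq {
	pthread_mutex_t lock;		/* plays the role of rq->lock */
	unsigned int nr_running;	/* plays the role of cfs_rq->nr_running */
	struct toy_entity *head;	/* queued entities */
};

static void toy_rq_init(struct toy_rq *rq)
{
	pthread_mutex_init(&rq->lock, NULL);
	rq->nr_running = 0;
	rq->head = NULL;
}

/* The count and the list are updated together, with the lock held. */
static void toy_enqueue(struct toy_rq *rq, struct toy_entity *se)
{
	pthread_mutex_lock(&rq->lock);
	se->next = rq->head;
	rq->head = se;
	rq->nr_running++;
	pthread_mutex_unlock(&rq->lock);
}

/* The count check and the pick happen in one critical section. */
static struct toy_entity *toy_pick_next(struct toy_rq *rq)
{
	struct toy_entity *se = NULL;

	pthread_mutex_lock(&rq->lock);
	if (rq->nr_running) {
		se = rq->head;
		/* Holds as long as every writer takes the lock. */
		assert(se != NULL);
		rq->head = se->next;
		rq->nr_running--;
	}
	pthread_mutex_unlock(&rq->lock);
	return se;
}
```

If the assert in toy_pick_next() ever fires, some writer changed the count or
the list without taking the lock. Adding a NULL check at the pick site would
only hide that; the fix belongs in the unlocked writer.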