Re: NULL pointer dereference in pick_next_task_fair

From: Peter Zijlstra
Date: Wed Nov 06 2019 - 11:57:46 EST


On Wed, Nov 06, 2019 at 03:04:50PM +0000, Qais Yousef wrote:
> On 11/06/19 14:08, Peter Zijlstra wrote:
> > On Wed, Nov 06, 2019 at 01:05:25PM +0100, Peter Zijlstra wrote:

> > > The only thing I'm now considering is if we shouldn't be setting
> > > ->on_cpu=2 _before_ calling put_prev_task(). I'll go audit the RT/DL
> > > cases.
> >
> > So I think it all works, but that's more by accident than anything else.
> > I'll move the ->on_cpu=2 assignment earlier. That clearly avoids calling
> > put_prev_task() while we're in put_prev_task().
>
> Did you mean avoids calling *set_next_task()* while we're in put_prev_task()?

Either, really. The change pattern does put_prev_task() first, and then
restores state by calling set_next_task(). And it can do that while
we're in put_prev_task(), unless we're setting ->on_cpu=2.

> So what you're saying is that put_prev_task_{rt,dl}() could drop the rq_lock()
> too and the race could happen while we're inside these functions, correct? Or
> is it a different reason?

Indeed, except it looks like that actually works (mostly by accident).

> By the way, do all reads/writes to ->on_cpu happen with a lock held? I.e. we
> don't need to use any smp read/write barriers?

Yes, ->on_cpu is fully serialized by rq->lock. We use
smp_store_release() in finish_task() due to ttwu spin-waiting on it
(which reminds me, riel was seeing lots of that).