Re: [BUG] Use of probe_kernel_address() in task_rcu_dereference() without checking return value
From: Peter Zijlstra
Date: Mon Sep 02 2019 - 12:21:13 EST
On Mon, Sep 02, 2019 at 04:44:24PM +0200, Oleg Nesterov wrote:
> speaking of the users of task_rcu_dereference(), membarrier_global_expedited()
> does
>
> rcu_read_lock();
> p = task_rcu_dereference(&cpu_rq(cpu)->curr);
> if (p && p->mm && (atomic_read(&p->mm->membarrier_state) &
> MEMBARRIER_STATE_GLOBAL_EXPEDITED)) {
>
> This asks for READ_ONCE, but this is minor. Why can't p->mm be freed?
>
> I guess it is fine to read the garbage from &p->mm->membarrier_state if we race
> with the exiting task, but in theory this looks unsafe if CONFIG_DEBUG_PAGEALLOC.
>
> Another possible user of probe_slab_address() or I am totally confused?
You're quite right; that's busted.
Due to the lack of READ_ONCE() on p->mm, the above can in fact turn into
a NULL deref when we hit do_exit() around exit_mm(). The first p->mm
read is before and sees !NULL, the second is after and does observe
NULL, and *bang*.
I suppose it wants to be something like:
mm = READ_ONCE(p->mm);
if (mm && probe_address())
(I'm not sure _slab_ is a useful part of the name; it should work on
kernel memory irrespective of the allocator)
If it got freed, that CPU already just did something that implies
smp_mb() so we're good. So whatever garbage gets read, is fine. Either
we do a superfluous IPI or not is immaterial.