Re: [PATCH] sched/numa: fix unsafe get_task_struct() in task_numa_assign()
From: Kirill Tkhai
Date: Sun Oct 19 2014 - 04:21:37 EST
On 18.10.2014 23:36, Peter Zijlstra wrote:
> On Sat, Oct 18, 2014 at 12:33:27PM +0400, Kirill Tkhai wrote:
>> How about this?
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index b78280c..d46427e 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -1165,7 +1165,21 @@ static void task_numa_compare(struct task_numa_env *env,
>>
>>  	rcu_read_lock();
>>  	cur = ACCESS_ONCE(dst_rq->curr);
>> -	if (cur->pid == 0) /* idle */
>> +	/*
>> +	 * No need to move the exiting task, and this ensures that ->curr
>> +	 * wasn't reaped and thus get_task_struct() in task_numa_assign()
>> +	 * is safe; note that rcu_read_lock() can't protect from the final
>> +	 * put_task_struct() after the last schedule().
>> +	 */
>> +	if (is_idle_task(cur) || (cur->flags & PF_EXITING))
>> +		cur = NULL;
>> +	/*
>> +	 * Check once again to be sure curr is still on dst_rq. Even if
>> +	 * it points on a new task, which is using the memory of freed
>> +	 * cur, it's OK, because we've locked RCU before
>> +	 * delayed_put_task_struct() callback is called to put its struct.
>> +	 */
>> +	if (cur != ACCESS_ONCE(dst_rq->curr))
>>  		cur = NULL;
>>
>>  	/*
>
> So you worry about the refcount doing 0->1 ? In which case the above is
> still wrong and we should be using atomic_inc_not_zero() in order to
> acquire the reference count.
>
We can't use atomic_inc_not_zero(). The problem is that cur may point to
memory that is not even a task_struct any more, so there are no guarantees
at all about what we would be incrementing.
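
To illustrate the point, here is a minimal userspace sketch of the
increment-unless-zero pattern, written with C11 atomics. It is not kernel
code: struct obj and get_unless_zero() are hypothetical names used only for
this example. The helper has to read o->usage before it can decide anything,
so it is only meaningful while o still points at a live struct obj. If the
memory has been freed and reused for something else, that first read is
already invalid.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical refcounted object; a stand-in, not the kernel's task_struct. */
struct obj {
	atomic_int usage;
};

/*
 * Take a reference only if the count has not already dropped to zero.
 * Precondition (assumed, not checked): 'o' still points at a valid
 * struct obj. The function cannot detect freed or reused memory.
 */
static bool get_unless_zero(struct obj *o)
{
	int old;

	assert(o != NULL);
	old = atomic_load(&o->usage);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&o->usage, &old, old + 1))
			return true;
	}
	return false;	/* count was zero: the object is already dying */
}
```

With a count of 1 the call succeeds and the count becomes 2; with a count
of 0 it fails and leaves the count untouched. Neither outcome says anything
about whether the pointer itself was still valid.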