Re: [PATCH] sched/core: Create new task with twice disabled preemption

From: Kirill Tkhai
Date: Thu Feb 13 2014 - 12:45:56 EST


On 13.02.2014 20:00, Peter Zijlstra wrote:
> On Thu, Feb 13, 2014 at 07:51:56PM +0400, Kirill Tkhai wrote:
>> For archs without __ARCH_WANT_UNLOCKED_CTXSW set this means
>> that all newly created tasks execute finish_arch_post_lock_switch()
>> and post_schedule() with preemption enabled.
>
> That's IA64 and MIPS; do they have a 'good' reason to use this?

It seems my description was misleading; sorry about that.

I mean all architectures *except* IA64 and MIPS, i.e. all which
do not define __ARCH_WANT_UNLOCKED_CTXSW.

IA64 and MIPS already have preempt_enable() in schedule_tail():

#ifdef __ARCH_WANT_UNLOCKED_CTXSW
/* In this case, finish_task_switch does not reenable preemption */
preempt_enable();
#endif

Their initial preempt count is not decremented in finish_lock_switch().

So we are talking about x86, ARM64, etc.

Look at ARM64's finish_arch_post_lock_switch(). It looks like a task
must not be preempted between switch_mm() and this function.
But for a new task this is possible.

Example:
RT thread p0 and RT thread p1 share an mm. The system has 2 CPUs.

p0 is bound to CPU0.
p1 is bound to CPU1.

p1 has set a timer and is sleeping.

p0 creates fair thread f. Task f wakes on CPU1.

When f is between raw_spin_unlock_irq() and
finish_arch_post_lock_switch(), preemption is enabled.
At this moment p1 wakes up on CPU1 and preempts f.

For p1 the check

if (!cpumask_test_and_set_cpu(cpu, mm_cpumask(next)) || prev != next)

in switch_mm() fails, because the mm is the same. So later
we do not do cpu_switch_mm() in finish_arch_post_lock_switch(),
and we just go to userspace.

This is the problem I tried to solve. I don't know arm64 well, and I
can't say how serious it is.

But the place looks buggy.

Kirill

> That is; the alternative is to fix those two archs and remove the
> __ARCH_WANT_UNLOCKED_CTXSW clutter alltogether; which seems like a big
> win to me.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/