Re: [PATCH] PREEMPT_RT_FULL: ARM context switch needs IRQs enabled

From: Frank Rowand
Date: Fri Dec 16 2011 - 18:23:58 EST


On 12/16/11 03:01, Catalin Marinas wrote:
> Hi Frank,
>
> On Fri, Dec 16, 2011 at 03:20:45AM +0000, Frank Rowand wrote:
>> ARMv6 and later have VIPT caches and the TLBs are tagged with an ASID
>> (application specific ID). The number of ASIDs is limited to 256 and
>> the allocation algorithm requires IPIs when all the ASIDs have been
>> used. The IPIs require interrupts enabled during context switch for
>> deadlock avoidance.
>>
>> The RT patch mm-protect-activate-switch-mm.patch disables irqs around
>> activate_mm() and switch_mm(), which are the portion of the ARMv6
>> context switch that require interrupts enabled.
>>
>> The solution for the ARMv6 processors could be to _not_ disable irqs.
>> A more conservative solution is to provide the same environment that
>> the scheduler provides, that is preempt_disable(). This is more
>> resilient for possible future changes to the ARM context switch code
>> that is not aware of the RT patches.
>>
>> This patch will conflict slightly with Catalin's patch set to remove
>> __ARCH_WANT_INTERRUPTS_ON_CTXSW, when that is accepted:
>>
>> http://lkml.indiana.edu/hypermail/linux/kernel/1111.3/01893.html
>>
>> When Catalin's patch set is accepted, this RT patch will need to reverse
>> the change in patch 6 to arch/arm/include/asm/system.h:
>>
>> -#ifndef CONFIG_CPU_HAS_ASID
>> -#define __ARCH_WANT_INTERRUPTS_ON_CTXSW
>> -#endif
>>
>> Signed-off-by: Frank Rowand <frank.rowand@xxxxxxxxxxx>
>
> The whole point of my patches was to no longer define
> __ARCH_WANT_INTERRUPTS_ON_CTXSW on ARM, so bringing it back in is not
> feasible.

Looking over Catalin's patches again, it looks like my hacky RT patch
will no longer be needed after Catalin's patch set is in place. The
problem my patch deals with is that with the RT patches applied, use_mm()
calls switch_mm() with IRQs disabled. The current ARM switch_mm() can
not be called with IRQs disabled. But with Catalin's patch 4
(http://lkml.indiana.edu/hypermail/linux/kernel/1111.3/01898.html)
applied, switch_mm() can be called with IRQs enabled, because
switch_mm() no longer calls check_context() which calls __new_context()
which calls smp_call_function() which requires IRQs to be enabled....

>
> But by reading your patch I realised that there are two more switch_mm()
> calls outside the scheduler switch and the post-switch hook would not be
> called:
>
> 1. idle_task_exit() - this function switches to the init_mm. With my new
> patches, even if the switch does not explicitly happen because it
> requires a new ASID, the ARMv6+ code sets the init_mm.pgd anyway. On
> ARMv5 such hook would not happen but we don't have CONFIG_HOTPLUG_CPU
> either. But even if it happens to work, it's not correct behaviour in
> either case.
>
> 2. use_mm() - this also assumes an immediate switch_mm() which may not
> happen.
>
> My question - can we not use activate_mm() directly in both cases? This
> function would ensure that the new mm is actually active on ARM and it
> does not disable the interrupts for the critical code. You may still
> need some RT patch if the RT series disables the interrupts for the
> switch_mm() calls.

Yes, activate_mm() should work for those two cases for ARM (I don't know
about other architectures). To make this easier to follow for those reading
along, Catalin's patch 4 changes activate_mm() to call
finish_arch_post_lock_switch(), which will actually make the new mm active.

Here is the relevant code from Catalin's patch 4:

+static inline void finish_arch_post_lock_switch(void)
+{
+ if (test_and_clear_thread_flag(TIF_SWITCH_MM)) {
+ struct mm_struct *mm = current->mm;
+ unsigned long flags;
+
+ __new_context(mm);
+
+ local_irq_save(flags);
+ cpu_switch_mm(mm->pgd, mm);
+ local_irq_restore(flags);
+ }
+}

+static inline void activate_mm(struct mm_struct *prev, struct mm_struct *next)
+{
+#ifdef CONFIG_MMU
+ unsigned long flags;
+
+ local_irq_save(flags);
+ switch_mm(prev, next, current);
+ local_irq_restore(flags);
+
+ finish_arch_post_lock_switch();
+#endif
+}

But as Catalin points out, this means in use_mm() the RT patch will need to
call the ARM activate_mm() with interrupts enabled because
finish_arch_post_lock_switch() calls __new_context() which calls
smp_call_function() which requires IRQs to be enabled.

And, once again, I don't have a nice clean way to differentiate the
RT case that requires IRQs to be enabled.

This also brings up another question: should the RT patch
mm-protect-activate-switch-mm.patch also disable interrupts
around the call to switch_mm() in idle_task_exit()?

>
> If that sounds reasonable, I'll repost my series with activate_mm() fix.
>
> Thanks.
>

-Frank

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/