Re: [PATCH] ARM: kexec: Fix panic after TLB are invalidated
From: Giancarlo Ferrari
Date: Mon Feb 01 2021 - 09:42:41 EST
Hi,
On Mon, Feb 01, 2021 at 12:47:20PM +0000, Mark Rutland wrote:
> On Mon, Feb 01, 2021 at 12:44:56AM +0000, Giancarlo Ferrari wrote:
> > machine_kexec() needs to set rw permission in text and rodata sections
> > to assign some variables (e.g. kexec_start_address). To do that at
> > the end (after flushing pmd in memory, etc.) it needs to invalidate
> > TLB [section] entries.
>
> It'd be worth noting explicitly that set_kernel_text_rw() alters
> current->active_mm...
>
> > If during the TLB invalidation an interrupt occurs, which might cause
> > a context switch, there is the risk to inject invalid TLBs, with ro
> > permissions.
>
> ... which is why if there's a context switch things can go wrong, since
> active_mm isn't stable, and so it's possible that set_kernel_text_rw()
> updates multiple tables, none of which might be the active table at the
> point we try to make an access.
>
Maybe the behaviour causing the issue is not completely clear to me, and I
apologize for that (I also don't have enough debug capabilities).
However, current->active_mm is switched on every context switch, correct?
So, in principle, the invalidation, if interrupted, resumes where it
left off.
I thought the issue was that the page table entry for the section at
0x8010_0000 is global, and thus not tagged with an ASID (Address Space ID).
Since each process has its own copy of that entry, the newly scheduled
process might bring a stale entry (with ro permission) into the TLB, and
because the entry is global it keeps matching after we switch back.
If the entry were not global, it would be tagged with the ASID and the
issue could not happen.
Please note that this behaviour was tested on an ARMv7 board.
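To make my reasoning concrete, here is a toy user-space model of what I
suspect happens. It is only a sketch of my understanding, not kernel code:
the one-entry "TLB", struct mm, tlb_access() and replay_race() are all
made-up names, and the real hardware behaviour may well differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: a one-entry "TLB" caching the kernel text section. */
enum perm { PERM_RO, PERM_RW };

struct tlb_entry {
	bool valid;
	bool global;	/* true: matches any ASID */
	uint8_t asid;	/* only compared when !global */
	enum perm perm;
};

/* Hypothetical: each mm holds its own copy of the section entry. */
struct mm {
	uint8_t asid;
	enum perm text_perm;
};

static bool tlb_hit(const struct tlb_entry *e, uint8_t asid)
{
	return e->valid && (e->global || e->asid == asid);
}

/* Access the section while running on 'mm'; refill from its table on a miss. */
static enum perm tlb_access(struct tlb_entry *e, const struct mm *mm,
			    bool global)
{
	if (!tlb_hit(e, mm->asid)) {
		e->valid = true;
		e->global = global;
		e->asid = mm->asid;
		e->perm = mm->text_perm;
	}
	return e->perm;
}

/*
 * Replay the suspected sequence: A's table is made RW and the TLB is
 * invalidated, then task B (table still RO) runs and touches the section,
 * then A runs again and accesses it.
 */
static enum perm replay_race(bool global)
{
	struct mm a = { .asid = 1, .text_perm = PERM_RO };
	struct mm b = { .asid = 2, .text_perm = PERM_RO };
	struct tlb_entry tlb = { 0 };

	a.text_perm = PERM_RW;			/* A's table goes RW */
	tlb.valid = false;			/* TLB invalidation */
	(void)tlb_access(&tlb, &b, global);	/* interrupt, switch to B */
	return tlb_access(&tlb, &a, global);	/* back on A */
}
```

In this model replay_race(true) leaves A with the ro entry that B brought
in, while replay_race(false) misses on the ASID and refills from A's own
(rw) table.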
> It would be nice to spell that out rather than saying "invalid TLBs".
>
> We could disable preemption to prevent that, which is possibly better
> than disabling interrupts.
>
> Overall, it would be much better to avoid having to mess with the kernel
> page tables. So rather than going:
>
> 1. mark kernel RW
> 2. alter variables in reloc code
> 3. copy reloc code into buffer
> 4. branch to buffer
>
> ... we should be able to go:
>
> 1. copy reloc code into buffer
> 2. alter variables in copy of reloc code
> 3. branch to buffer
>
> ... which would avoid this class of problem too.
>
> Thanks,
> Mark.
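If I understood the suggestion correctly, the second sequence would look
roughly like the user-space sketch below. struct reloc_image,
reloc_template and prepare_reloc() are hypothetical names I made up for
illustration; the real relocation code and its variables are laid out
differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the relocation code plus its variables. */
struct reloc_image {
	uint8_t code[32];		/* placeholder for the reloc instructions */
	uintptr_t start_address;	/* plays the role of kexec_start_address */
};

/* Read-only template, standing in for the code in the kernel text section. */
static const struct reloc_image reloc_template = {
	.code = { 0 },
	.start_address = 0,
};

/*
 * Copy the template into a writable buffer first, then patch the copy.
 * The template itself is never written, so no permission change is needed.
 * 'buf' must be suitably aligned for struct reloc_image.
 */
static struct reloc_image *prepare_reloc(void *buf, size_t len,
					 uintptr_t start)
{
	struct reloc_image *copy = buf;

	assert(len >= sizeof(*copy));
	memcpy(copy, &reloc_template, sizeof(*copy));	/* 1. copy */
	copy->start_address = start;			/* 2. alter the copy */
	return copy;					/* 3. caller branches here */
}
```

Since only the buffer is written, the text section could stay ro and no
TLB invalidation would be needed on this path.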
Thanks,
GF