Re: [PATCH] powerpc/mm: Fix lockup on kernel exec fault

From: Christophe Leroy
Date: Fri Jul 02 2021 - 01:10:11 EST




Le 02/07/2021 à 03:25, Nicholas Piggin a écrit :
Excerpts from Christophe Leroy's message of July 1, 2021 9:17 pm:
The powerpc kernel is not prepared to handle exec faults from kernel.
Especially, the function is_exec_fault() will return 'false' when an
exec fault is taken by kernel, because the check is based on reading
current->thread.regs->trap which contains the trap from user.

For instance, when provoking a LKDTM EXEC_USERSPACE test,
current->thread.regs->trap is set to SYSCALL trap (0xc00), and
the fault taken by the kernel is not seen as an exec fault by
set_access_flags_filter().

Commit d7df2443cd5f ("powerpc/mm: Fix spurrious segfaults on radix
with autonuma") made it clear and handled it properly. But later on
commit d3ca587404b3 ("powerpc/mm: Fix reporting of kernel execute
faults") removed that handling, introducing test based on error_code.
And here is the problem, because on the 603 all upper bits of SRR1
get cleared when the TLB instruction miss handler bails out to ISI.

So the problem is 603 doesn't see the DSISR_NOEXEC_OR_G bit?

I a way yes. But the problem is also that the kernel doesn't see it as an exec fault in set_access_flags_filter() as explained above. If it could see it as an exec fault, it would set PAGE_EXEC and it would work (or maybe not because it seems it also checks for the dirtiness of the page, and here the page is also flagged as dirty).

603 will see DSISR_NOEXEC_OR_G if it's an access to a page which is in a segment flagged NX.


I don't see the problem with this for 64s, I don't think anything sane
can be done for any 0x400 interrupt in the kernel so it's probably
good to catch all here just in case. For 64s,

Acked-by: Nicholas Piggin <npiggin@xxxxxxxxx>

Why is 32s clearing those top bits? And it seems to be setting DSISR
that AFAIKS it does not use. Seems like it would be good to add a
NOEXEC_OR_G bit into SRR1.

Probably for simplicity.

When taking the Instruction TLB Miss interrupt, SRR1 contains CR0 fields in bits 0-3 and some dedicated info in bits 12-15. That doesn't match SRR1 bits for ISI, so before falling back to the ISI handler, ITLB Miss handler error patch clears upper SRR1 bits.

Maybe it could instead try to set the right bits, but it would make it more complicated because the error patch can be taken for the following reasons:
- No page table
- Not PAGE_PRESENT
- Not PAGE_ACCESSED
- Not PAGE_EXEC
- Below TASK_SIZE and not PAGE_USER

At the time being the verification of the flags is done with a single 'andc' operation. If we wanted to set the proper bits, it would mean testing the flags separately, which would impact performance on the no-error path.

Or maybe it would be good enough to set the PROTFAULT bit in all cases but the lack of page table. The 8xx sets PROTFAULT when hitting non-exec pages, so the kernel is prepared for it anyway. Not sure about the lack of PAGE_PRESENT thought. The 8xx sets NOHPTE bit when PAGE_PRESENT is cleared.

But is it really worth doing ?

Christophe