Re: [PATCH] x86/mm: Do not verify W^X at boot up

From: Linus Torvalds
Date: Mon Oct 24 2022 - 17:13:31 EST


On Mon, Oct 24, 2022 at 11:52 AM Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
>
> It's not just speed up at boot up. It's because text_poke doesn't work at
> early boot up when function tracing is enabled. If I remove the
> SYSTEM_BOOTING checks in text_poke() this happens:

Let's just fix that up.

>
> [ 1.013753] BUG: kernel NULL pointer dereference, address: 0000000000000048

This is due to

__get_locked_pte:
   0:   0f 1f 44 00 00       nopl   0x0(%rax,%rax,1)
   5:   48 89 f0             mov    %rsi,%rax
   8:   41 55                push   %r13
   a:   48 c1 e8 24          shr    $0x24,%rax
   e:   41 54                push   %r12
  10:   25 f8 0f 00 00       and    $0xff8,%eax
  15:   55                   push   %rbp
  16:   53                   push   %rbx
  17:*  48 03 47 48          add    0x48(%rdi),%rax   <-- trapping instruction
  1b:   0f 84 c4 00 00 00    je     0xe5
  21:   49 89 fc             mov    %rdi,%r12
  24:   48 8b 38             mov    (%rax),%rdi
  27:   48 89 f3             mov    %rsi,%rbx
  2a:   48 89 c5             mov    %rax,%rbp

and that 'addq' seems to be walk_to_pmd() doing

pgd = pgd_offset(mm, addr);

with a NULL mm (the load from offset 0x48 is 'mm->pgd').
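
To spell out the arithmetic in that listing: the 'shr $0x24' / 'and $0xff8' pair looks like pgd_index(addr) already scaled by the 8-byte entry size, and the 'add 0x48(%rdi)' then adds the table base loaded from mm->pgd. A minimal sketch of that equivalence, assuming 4-level x86-64 paging (PGDIR_SHIFT 39, 512 entries); the toy_* names are made up for illustration and are not kernel functions:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed constants for 4-level x86-64 paging. */
#define TOY_PGDIR_SHIFT   39
#define TOY_PTRS_PER_PGD  512

/* Index of the top-level entry covering addr. */
static inline uint64_t toy_pgd_index(uint64_t addr)
{
	return (addr >> TOY_PGDIR_SHIFT) & (TOY_PTRS_PER_PGD - 1);
}

/* Byte offset into the pgd page: index times the 8-byte entry size. */
static inline uint64_t toy_pgd_byte_offset(uint64_t addr)
{
	return toy_pgd_index(addr) * sizeof(uint64_t);
}

/* The folded form seen in the disassembly: shr $0x24 ; and $0xff8. */
static inline uint64_t toy_pgd_byte_offset_folded(uint64_t addr)
{
	uint64_t off = (addr >> 0x24) & 0xff8;

	/* Both forms must agree for every address. */
	assert(off == toy_pgd_byte_offset(addr));
	return off;
}
```

With mm == NULL, the base is read from address 0x48, which matches the faulting address in the oops.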

And that, in turn, seems to be due to the absolutely disgusting 'poking_mm' hack.

> Interrupts haven't been enabled yet, so things are still rather fragile at
> this point of start up.

I don't think this has anything to do with interrupts. We do need the
page structures etc to be workable, but all the tracing setup needs
that *anyway*.

I suspect it would be fixed by just moving 'poking_init()' earlier. In
many ways I suspect it would make most sense as part of 'mm_init()',
not as a random call fairly late in start_kernel().
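
The ordering argument can be shown with a toy model. This is only a sketch: every toy_* name is hypothetical, nothing here is the kernel's real code, and the real text_poke() does far more than check a pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the mm that text poking uses. */
struct toy_mm { int initialized; };

static struct toy_mm toy_mm_storage;
static struct toy_mm *toy_poking_mm;	/* NULL until toy_poking_init() runs */

static void toy_poking_init(void)
{
	/* Expected to run once per simulated boot. */
	assert(toy_poking_mm == NULL);
	toy_mm_storage.initialized = 1;
	toy_poking_mm = &toy_mm_storage;
}

/* Returns 0 on success, -1 if called before the poking mm exists. */
static int toy_text_poke(void)
{
	if (toy_poking_mm == NULL)
		return -1;	/* the oops above is this case, without the check */
	return 0;
}

/* A text_poke() user runs before the init call. */
static int toy_boot_late_init(void)
{
	int ret;

	toy_poking_mm = NULL;
	ret = toy_text_poke();
	toy_poking_init();
	return ret;
}

/* The init call is moved ahead of any text_poke() user. */
static int toy_boot_early_init(void)
{
	toy_poking_mm = NULL;
	toy_poking_init();
	return toy_text_poke();
}
```

In this model toy_boot_late_init() fails and toy_boot_early_init() succeeds, with no boot-state special case in toy_text_poke() itself.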

In other words, this all smells like "people added special cases
because they didn't want to hunt down the underlying problem".

And then all these special cases beget other special cases.

Linus