Re: [PATCH 4.19 079/276] x86/modules: Avoid breaking W^X while loading modules

From: Nadav Amit
Date: Fri May 31 2019 - 17:51:16 EST


> On May 31, 2019, at 3:37 AM, Pavel Machek <pavel@xxxxxxx> wrote:
>
> Hi!
>
>> [ Upstream commit f2c65fb3221adc6b73b0549fc7ba892022db9797 ]
>>
>> When modules and BPF filters are loaded, there is a time window in
>> which some memory is both writable and executable. An attacker that has
>> already found another vulnerability (e.g., a dangling pointer) might be
>> able to exploit this behavior to overwrite kernel code. Prevent having
>> writable executable PTEs in this stage.
>>
>> In addition, avoiding having W+X mappings can also slightly simplify the
>> patching of modules code on initialization (e.g., by alternatives and
>> static-key), as would be done in the next patch. This was actually the
>> main motivation for this patch.
>>
>> To avoid having W+X mappings, set them initially as RW (NX) and after
>> they are set as RO set them as X as well. Setting them as executable is
>> done as a separate step to avoid one core in which the old PTE is cached
>> (hence writable), and another which sees the updated PTE (executable),
>> which would break the W^X protection.
>
> First, is this stable material? Yes, it changes something.
>
> But if you assume attacker can write into kernel memory during module
> load, what prevents him to change the module as he sees fit while it
> is not executable, simply waiting for system to execute it?
>
> I don't see security benefit here.

I agree that at the moment the benefit is limited. I think the benefit would
come later, if the module signature check is performed after the module has
been write-protected, but before it is actually executed.
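
For illustration only, the ordering described in the quoted commit message
(RW+NX first, then RO, then X) can be sketched in user space. This is not
the kernel code: it assumes POSIX mmap()/mprotect() as a stand-in for the
kernel's page-attribute changes, and load_code_wx_safe() is a hypothetical
name used only for this example.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (not a kernel API): copy `len` bytes into a fresh
 * mapping without the mapping ever being writable and executable at the
 * same time. Returns the mapping on success, NULL on failure.
 */
static void *load_code_wx_safe(const void *code, size_t len)
{
	/* Step 1: map the memory writable but not executable (RW, NX). */
	void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (buf == MAP_FAILED)
		return NULL;

	/* All writes and patching happen while the mapping is non-executable. */
	memcpy(buf, code, len);

	/* Step 2: drop write permission first (RO, still NX). */
	if (mprotect(buf, len, PROT_READ) != 0)
		goto fail;

	/* Step 3: only after the mapping is read-only, make it executable. */
	if (mprotect(buf, len, PROT_READ | PROT_EXEC) != 0)
		goto fail;

	return buf;

fail:
	munmap(buf, len);
	return NULL;
}
```

A caller passes the bytes to load and their length, and gets back either a
read-only, executable mapping or NULL. Doing the RO and X changes as two
separate steps mirrors the reasoning in the commit message: no stale
writable view should coexist with a new executable one.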

>> +++ b/arch/x86/kernel/alternative.c
>> @@ -662,15 +662,29 @@ void __init alternative_instructions(void)
>> * handlers seeing an inconsistent instruction while you patch.
>> */
>> void *__init_or_module text_poke_early(void *addr, const void *opcode,
>> - size_t len)
>> + size_t len)
>> {
>> unsigned long flags;
>> - local_irq_save(flags);
>> - memcpy(addr, opcode, len);
>> - local_irq_restore(flags);
>> - sync_core();
>> - /* Could also do a CLFLUSH here to speed up CPU recovery; but
>> - that causes hangs on some VIA CPUs. */
>> +
>> + if (boot_cpu_has(X86_FEATURE_NX) &&
>> + is_module_text_address((unsigned long)addr)) {
>> + /*
>> + * Modules text is marked initially as non-executable, so the
>> + * code cannot be running and speculative code-fetches are
>> + * prevented. Just change the code.
>> + */
>> + memcpy(addr, opcode, len);
>> + } else {
>> + local_irq_save(flags);
>> + memcpy(addr, opcode, len);
>> + local_irq_restore(flags);
>> + sync_core();
>> +
>> + /*
>> + * Could also do a CLFLUSH here to speed up CPU recovery; but
>> + * that causes hangs on some VIA CPUs.
>> + */
>
> I don't get it. If code can not be running here, it can not be running
> in the !NX case, either, and we are free to just change
> it. Speculative execution should not be a problem, either, as CPUs are
> supposed to mask it, and there are no known bugs in that area. (Plus,
> I'd not be surprised if speculative execution ignored NX... just saying
> :-) )

Yes, the module code should not run, but speculative execution might cause
it to be cached in the instruction cache. That is unlikely, but we need to
consider malicious users that play with branch predictors.

I am unfamiliar with any bug that might cause the CPU to speculatively
ignore the NX bit. Without underestimating Intel's ability to create
terrible bugs, I would assume, for now, that it is safe.