Re: [kernel-hardening] Re: [RFC v2][PATCH 04/11] x86: Implement __arch_rare_write_begin/unmap()
From: Mathias Krause
Date: Fri Apr 07 2017 - 06:51:26 EST
On 7 April 2017 at 11:46, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> On Fri, 7 Apr 2017, Mathias Krause wrote:
>> On 6 April 2017 at 17:59, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
>> > On Wed, Apr 5, 2017 at 5:14 PM, Kees Cook <keescook@xxxxxxxxxxxx> wrote:
>> >> static __always_inline void rare_write_begin(void)
>> >> {
>> >>         preempt_disable();
>> >>         local_irq_disable();
>> >>         barrier();
>> >>         __arch_rare_write_begin();
>> >>         barrier();
>> >> }
>> >
>> > Looks good, except you don't need preempt_disable().
>> > local_irq_disable() also disables preemption. You might need to use
>> > local_irq_save(), though, depending on whether any callers already
>> > have IRQs off.
>>
>> Well, doesn't look good to me. NMIs will still be able to interrupt
>> this code and will run with CR0.WP = 0.
>>
>> Shouldn't you instead question yourself why PaX can do it "just" with
>> preempt_disable() instead?!
>
> That's silly. Just because PaX does it, doesn't mean it's correct. To be
> honest, playing games with the CR0.WP bit is outright stupid to begin with.
Why is that? It allows fast, CPU-local modification of r/o memory.
OTOH, an approach that needs to fiddle with page table entries
requires global synchronization to keep the individual TLB states in
sync. Hmm... not that fast, I'd say.
> Whether protected by preempt_disable or local_irq_disable, to make that
> work it needs CR0 handling in the exception entry/exit at the lowest
> level. And that's just a nightmare maintenance-wise as it's prone to be
> broken over time.
It seems to be working fine for more than a decade now in PaX. So it
can't be such a big maintenance nightmare ;)
> Aside of that it's pointless overhead for the normal case.
>
> The proper solution is:
>
> write_rare(ptr, val)
> {
>         mp = map_shadow_rw(ptr);
>         *mp = val;
>         unmap_shadow_rw(mp);
> }
>
> map_shadow_rw() is essentially the same thing as we do in the highmem case
> where the kernel creates a shadow mapping of the user space pages via
> kmap_atomic().
The "proper solution" seems to be much slower than just toggling
CR0.WP (which is already costly in itself) because of the TLB
invalidation / synchronisation involved.
> It's valid (at least on x86) to have a shadow map with the same page
> attributes but write enabled. That does not require any fixups of CR0 and
> just works.
"Just works", sure -- but it's not as tightly focused as the PaX
solution, which is CPU-local, while your proposed solution is globally
visible.
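
To make the shadow-mapping idea concrete, here is a rough userspace
analogy in C. It is not kernel code and says nothing about the relative
cost of either approach; it only shows two mappings of the same page, one
read-only and one temporarily writable. The names shadow_pair,
shadow_pair_init() and this userspace write_rare() are made up for
illustration, and memfd_create() is Linux-specific.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical userspace stand-in for a page of "rare write" data. */
struct shadow_pair {
	const int *ro;	/* primary mapping, always read-only */
	int fd;		/* memfd backing both mappings */
	size_t len;	/* size of the mapping (one page) */
};

/* Create the backing page and its permanent read-only mapping. */
static int shadow_pair_init(struct shadow_pair *sp)
{
	void *p;

	sp->len = (size_t)sysconf(_SC_PAGESIZE);
	sp->fd = memfd_create("rare_write_demo", 0);
	if (sp->fd < 0)
		return -1;
	if (ftruncate(sp->fd, (off_t)sp->len) < 0) {
		close(sp->fd);
		return -1;
	}
	p = mmap(NULL, sp->len, PROT_READ, MAP_SHARED, sp->fd, 0);
	if (p == MAP_FAILED) {
		close(sp->fd);
		return -1;
	}
	sp->ro = p;
	return 0;
}

/*
 * Analogy to the quoted write_rare(): map a writable alias of the same
 * page, store through it, and drop the alias again. The read-only
 * mapping is never touched.
 */
static int write_rare(struct shadow_pair *sp, size_t idx, int val)
{
	int *mp = mmap(NULL, sp->len, PROT_READ | PROT_WRITE, MAP_SHARED,
		       sp->fd, 0);

	if (mp == MAP_FAILED)
		return -1;
	mp[idx] = val;
	return munmap(mp, sp->len);
}
```

After shadow_pair_init(&sp), a call such as write_rare(&sp, 0, 42) makes
the new value readable through sp.ro[0]. Note that while the writable
alias exists it can be used by every thread in the process, which is the
userspace counterpart of the "globally visible" concern above.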
Cheers,
Mathias