Re: [RFC PATCH v2 1/2] mm/userfaultfd: fix memory corruption due to writeprotect

From: Nadav Amit
Date: Tue Jan 05 2021 - 14:06:19 EST


> On Jan 5, 2021, at 10:45 AM, Andrea Arcangeli <aarcange@xxxxxxxxxx> wrote:
>
> On Mon, Jan 04, 2021 at 09:26:33PM +0000, Nadav Amit wrote:
>> I would feel more comfortable if you provide patches for uffd-wp. If you
>> want, I will do it, but I restate that I do not feel comfortable with this
>> solution (worried as it seems a bit ad-hoc and might leave out a scenario
>> we all missed or cause a TLB shootdown storm).
>>
>> As for soft-dirty, I thought that you said that you do not see a better
>> (backportable) solution for soft-dirty. Correct me if I am wrong.
>
> I think they should use the same technique, since they deal with the
> exact same challenge. I will try to cleanup the patch in the meantime.
>
> I can also try to do the additional cleanups to clear_refs to
> eliminate the tlb_gather completely since it doesn't gather any page
> and it has no point in using it.
>
>> Anyhow, I will add your comments regarding the stale TLB window to make the
>> description clearer.
>
> Having the mmap_write_lock solution as backup won't hurt, but I think
> it's only for planB if planA doesn't work and the only stable tree
> that will have to apply this is v5.9.x. All previous don't need any
> change in this respect. So there's no worry of rejects.
>
> It worked by luck until Aug 2020, but it did so reliably or somebody
> would have noticed already. And it's not exploitable either, it just
> works stable, but it was prone to break if the kernel changed in some
> other way, and it eventually changed in Aug 2020 when an unrelated
> patch happened to the reuse logic.
>
> If you want to maintain the mmap_write_lock patch if you could drop
> the preserved_write and adjust the Fixes to target Aug 2020 it'd be
> ideal. The uffd-wp needs a different optimization that maybe Peter is
> already working on or I can include in the patchset for this, but
> definitely in a separate commit because it's orthogonal.
>
> It's great you noticed the W->RO transition of un-wprotect so we can
> optimize that too (it will have a positive runtime effect, it's not
> just theoretical since it's normal to unwrprotect a huge range once
> the postcopy snapshotting of the virtual machine is complete), I was
> thinking at the previous case discussed in the other thread.

Understood. I will separate it into a different patch and use your version.
I am sorry that I missed Peter Xu's feedback on that. As I understand that
this will not be backported, I will see if I can also get rid of the TLB flush
and the inc_tlb_flush_pending() for the write-unprotect case (which
I think I mentioned before).

>
> I just don't like to slow down a feature required in the future for
> implementing postcopy live snapshotting or other snapshots to userland
> processes (for the non-KVM case, also unprivileged by default if using
> bounce buffers to feed the syscalls) that can be used by open source
> hypervisors to beat proprietary hypervisors like vmware.

Ouch, that’s uncalled for. I am sure that you understand that I have no
hidden agenda and we all have the same goal.

Regards,
Nadav