Re: [PATCH v2] mm/uffd: UFFD_FEATURE_WP_UNPOPULATED

From: Muhammad Usama Anjum
Date: Wed Mar 01 2023 - 12:14:01 EST


On 3/1/23 8:19 PM, Peter Xu wrote:
> On Wed, Mar 01, 2023 at 12:55:51PM +0500, Muhammad Usama Anjum wrote:
>> Hi Peter,
>
> Hi, Muhammad,
>
>> While using WP_UNPOPULATED, we get stuck if newly allocated memory is read
>> without initialization. This can be reproduced by either of the following
>> statements:
>> printf("%c", buffer[0]);
>> buffer[0]++;
>>
>> This bug started to appear with this patch. How are you handling reads of
>> newly allocated memory when WP_UNPOPULATED is enabled?
>
> Yes, it's a bug, thanks for the reproducer. You're right, I missed a trivial
> but important detail. Could you try applying the below on top?
>
> ---8<---
> diff --git a/mm/memory.c b/mm/memory.c
> index 46934133bd0b..2f4b3892948b 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -4062,7 +4062,7 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
>  						vma->vm_page_prot));
>  		vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd,
>  				vmf->address, &vmf->ptl);
> -		if (!pte_none(*vmf->pte)) {
> +		if (vmf_pte_changed(vmf)) {
>  			update_mmu_tlb(vma, vmf->address, vmf->pte);
>  			goto unlock;
>  		}
> ---8<---
This patch works. Thank you so much!
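
My understanding of why this helps (please correct me if I'm wrong): with
WP_UNPOPULATED the faulting PTE can hold a uffd-wp marker instead of being
empty, so the old !pte_none() check always bailed out and the fault was
retried forever. Below is a small userspace model of the two checks. The
toy_* names and the marker encoding are made up for illustration, and
new_check_bails() only reflects my reading of vmf_pte_changed(), not the
kernel source verbatim.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-ins for kernel types; NOT the real definitions. */
typedef uint64_t toy_pte_t;

#define TOY_PTE_NONE            ((toy_pte_t)0)
#define TOY_PTE_MARKER_WP       ((toy_pte_t)1)  /* hypothetical marker encoding */
#define TOY_FLAG_ORIG_PTE_VALID 0x1u

struct toy_fault {
	unsigned int flags;
	toy_pte_t orig_pte;  /* PTE value sampled when the fault was taken */
	toy_pte_t cur_pte;   /* PTE value re-read under the page table lock */
};

/* Old check: any non-empty PTE is treated as "already handled elsewhere". */
static bool old_check_bails(const struct toy_fault *f)
{
	return f->cur_pte != TOY_PTE_NONE;
}

/* New check: bail only if the PTE differs from what the fault started with. */
static bool new_check_bails(const struct toy_fault *f)
{
	if (f->flags & TOY_FLAG_ORIG_PTE_VALID)
		return f->cur_pte != f->orig_pte;
	return f->cur_pte != TOY_PTE_NONE;
}

/* A marker PTE that nobody touched: old check bails, new check proceeds. */
static void toy_selfcheck(void)
{
	struct toy_fault f = {
		TOY_FLAG_ORIG_PTE_VALID, TOY_PTE_MARKER_WP, TOY_PTE_MARKER_WP
	};

	assert(old_check_bails(&f));
	assert(!new_check_bails(&f));
}
```

In this model a concurrent change to the PTE still makes the new check bail,
which is the race the original check was there to catch.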

>
> I can send a new version once you've confirmed it at least works on your
> side. I'll also add some more tests to cover that in the next version.
>
> The current smoke test within this patch is really light; I'm somewhat
> relying on you for the testing side of this patch, and thanks for that.
>
>> Running my pagemap_ioctl selftest as benchmark in a VM:
>> without zeropage / wp_unpopulated (decide from pte_none() if page is dirty
>> or not, buggy and wrong implementation, just for reference)
>> 26.608 seconds
>> with zeropage
>> 39.203 seconds
>> with wp_unpopulated
>> 62.907 seconds
>>
>> 136% worse performance overall
>> 60% worse performance of unpopulated than zeropage
>
> Yes, this is unfortunate. We're protecting more than before with
> WP_ZEROPAGE / WP_UNPOPULATED, but that's what it is for (when we want to
> make sure of accuracy on the holes).
>
> I haven't looked closely at your whole test suite yet, but my own test of
> pure protection above should mean that it's still much better for such a use
> case than either (1) pre-read or (2) MADV_POPULATE_READ.
Ohh... I should stop comparing WP_UNPOPULATED with the buggy implementation
and compare with pre-read instead. I've compared apples with oranges.
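
For reference, the percentages I quoted are plain relative slowdowns computed
from the timings above. A minimal sketch follows; slowdown_pct() is just a
helper name I made up for this mail.

```c
#include <assert.h>
#include <stdio.h>

/* Percentage by which `slow` exceeds `base`, both in seconds. */
static double slowdown_pct(double base, double slow)
{
	return (slow - base) / base * 100.0;
}

/* Print the two comparisons quoted in this thread. */
static void print_slowdowns(void)
{
	const double buggy = 26.608;       /* pte_none()-based, for reference only */
	const double zeropage = 39.203;    /* with zeropage */
	const double unpopulated = 62.907; /* with wp_unpopulated */

	printf("unpopulated vs buggy:    %.1f%%\n",
	       slowdown_pct(buggy, unpopulated));
	printf("unpopulated vs zeropage: %.1f%%\n",
	       slowdown_pct(zeropage, unpopulated));
}
```

The first figure uses the buggy implementation as the baseline, which is why
it says little about the real cost.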

I'll do a better benchmark for comparison's sake. I'll let you know if the
performance becomes an issue. Overall, we need pagemap_ioctl + UFFD to
correctly emulate a Windows syscall. Secondly, we also need good performance
(the more the better).

>
> Again, I hope the performance result is not a concern to you. If it is,
> please let us know.
>
> Thanks,
>

--
BR,
Muhammad Usama Anjum