Re: [PATCH] mm: avoid zeroing user movable page twice with init_on_alloc=1
From: Zi Yan
Date: Wed Dec 04 2024 - 17:05:30 EST
On 4 Dec 2024, at 13:16, Zi Yan wrote:
> On 4 Dec 2024, at 13:13, Zi Yan wrote:
>
>> On 4 Dec 2024, at 12:46, Vlastimil Babka wrote:
>>
>>> On 12/4/24 18:33, Zi Yan wrote:
>>>> On 4 Dec 2024, at 11:29, Matthew Wilcox wrote:
>>>>
>>>>> On Wed, Dec 04, 2024 at 11:16:51AM -0500, Zi Yan wrote:
>>>>>>> So maybe the clearing done as part of page allocator isn't enough here.
>>>>>>>
>>>>>> Basically, mips needs to flush data cache if kmap address is aliased to
>>>>>
>>>>> People use "aliased" in contronym ways. Do you mean "has a
>>>>> non-congruent alias" or "has a congruent alias"?
>>>>>
>>>>>> userspace address. This means when mips has THP on, the patch below
>>>>>> is not enough to fix the issue.
>>>>>>
>>>>>> In post_alloc_hook(), it does not make sense to pass userspace address
>>>>>> in to determine whether to flush dcache or not.
>>>>>>
>>>>>> One way to fix it is to add something like arch_userpage_post_alloc()
>>>>>> to flush dcache if kmap address is aliased to userspace address.
>>>>>> But my questions are that
>>>>>> 1) if kmap address will always be the same for two separate kmap_local() calls,
>>>>>
>>>>> No. It just takes the next address in the stack.
>>>>
>>>> Hmm, if kmap_local() gives different addresses, wouldn’t init_on_alloc be
>>>> causing issues before my patch? In the page allocator, the page is zeroed
>>>> from one kmap address without flush, then clear_user_highpage() clears
>>>> it again with another kmap address with flush. After returning to userspace,
>>>> the user application works on the page but when the cache line used by
>>>> init_on_alloc is written back (with 0s) at eviction, user data is corrupted.
>>>> Am I missing anything? Or all arch with cache aliasing never enables
>>>> init_on_alloc?
>>>
>>> Maybe the arch also defines some hooks like arch_kmap_local_post_unmap() ?
>>
>> But this does not solve the possible init_on_alloc issue, since init_on_alloc
>> is done in mm/page_alloc.c without userspace address and has no knowledge of
>> whether the zeroed page will be used in userspace nor the cache line will
>> be the same as the userspace page cache line. If dcache is flushed
>> unconditionally for kmap_local, that could degrade performance.
>>
>>> As for the fix, could it rely on e.g. __HAVE_ARCH_COPY_USER_HIGHPAGE instead
>>> of CONFIG_MIPS? That affects more arches, I don't know if we broke only mips
>>> or others too.
>>
>> Yes, this is much better, since this issue affects any arch with cache aliasing.
>> Let me update my fix. Thanks.
>
> I notice that arm64 has __HAVE_ARCH_COPY_USER_HIGHPAGE defined, so I will
> need to look for an alternative.
It turns out sh, sparc, arm, xtensa, nios2, m68k, parisc, csky, and powerpc all have cache flush operations in clear_user_page() compared to clear_page() and
arc clears PG_dc_clean bit in addition to clear_page().
So __GFP_ZERO cannot simply replace clear_user_page()/clear_user_highpage().
I can add ARCH_REQUIRE_CLEAR_USER_PAGE for these arch and use it to decide
whether clear_user_page()/clear_user_highpage() needs to be used regardless of
the presence of init_on_alloc.
I also wonder if INIT_ON_ALLOC_DEFAULT_ON works on these arch or not.
Best Regards,
Yan, Zi