Re: [PATCH 3/7] mm/migrate: skip data copy for already-copied folios
From: David Hildenbrand (Arm)
Date: Mon Jun 08 2026 - 13:10:35 EST

On 6/8/26 17:41, Zi Yan wrote:
> On 8 Jun 2026, at 11:18, David Hildenbrand (Arm) wrote:
>
>>> Sashiko Comment:
>>> Does expanding FOLIO_OLD_STATES to include BIT(2) silently corrupt anon_vma
>>> pointers on 32-bit architectures?
>>> The anon_vma pointer is packed with FOLIO_OLD_STATES inside dst->migrate_info.
>>> To separate them, __migrate_folio_extract() uses info & ~FOLIO_OLD_STATES.
>>> Since anon_vma objects are allocated from a kmem_cache with align=0,
>>> ARCH_SLAB_MINALIGN defaults to 4 bytes on many 32-bit architectures. This
>>> means a valid anon_vma pointer can legitimately end in 0x4 or 0xC (meaning
>>> bit 2 is set).
>>> When __migrate_folio_extract() masks the pointer with ~7, it will silently
>>> clear bit 2 from the anon_vma pointer. Any subsequent call to put_anon_vma()
>>> with this corrupted pointer could cause a use-after-free or a kernel panic.
>>> --
>>>
>>> #define ARCH_SLAB_MINALIGN __alignof__(unsigned long long)
>>>
>>> I initially assumed this was always 8, confusing it with the size of
>>> unsigned long long.
>>> But the GCC docs note that the alignment can be smaller than the size:
>>>
>>> https://gcc.gnu.org/onlinedocs/gcc/Alignment.html
>>> "For example, if the target machine requires a double value to be aligned on
>>> an 8-byte boundary, then __alignof__ (double) is 8. This is true on many RISC
>>> machines. On more traditional machine designs, __alignof__ (double) is 4 or
>>> even 2."
>>>
>>> If my understanding is right, Sashiko's concern is valid, and I can't safely
>>> use BIT(2).
>>
>> 32bit makes this tricky indeed. And that's also the reason why
>> FOLIO_MAPPING_FLAGS is currently limited to 2 bits.
>>
>>> I see a few options from here. Either I can gate batch copy on CONFIG_64BIT,
>>
>> That's a bit nasty as we'll have to special case 32bit vs 64bit.
>
> IIRC, multithreaded copy is already gated by CONFIG_HIGHMEM, otherwise
> it needs to perform kmap_local() at each copying CPU, which complicates
> the process. Then, this code will only be used on 32bit without highmem,
> and I assume there will be no page copy DMA on 32bit platforms. Maybe it
> is not too bad to limit this to 64bit.

I'm more concerned about the CONFIG_64BIT handling in the code, but if that can
be avoided easily, fine with me.
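
For illustration, a minimal user-space sketch of the masking problem
discussed above. The mask names and helpers here are hypothetical stand-ins,
not the kernel's FOLIO_OLD_STATES or __migrate_folio_extract(), and the
addresses are made up. It only shows that a 4-byte-aligned pointer survives
a 2-bit mask but not a 3-bit one.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the flag masks discussed in the thread. */
#define OLD_STATES_2BIT ((uintptr_t)0x3) /* BIT(0) | BIT(1) */
#define OLD_STATES_3BIT ((uintptr_t)0x7) /* additionally BIT(2) */

/* Pack flag bits into the low bits of a pointer value. */
uintptr_t pack(uintptr_t ptr, uintptr_t flags, uintptr_t mask)
{
	return ptr | (flags & mask);
}

/* Recover the pointer by clearing every bit covered by the mask. */
uintptr_t extract_ptr(uintptr_t info, uintptr_t mask)
{
	return info & ~mask;
}

/* The round trip is lossless only if the pointer has no bits in the mask. */
int round_trip_ok(uintptr_t ptr, uintptr_t flags, uintptr_t mask)
{
	return extract_ptr(pack(ptr, flags, mask), mask) == ptr;
}

void demo(void)
{
	/* Made-up address with 4-byte alignment only (bit 2 set). */
	uintptr_t p4 = 0x1004;
	/* Made-up address with 8-byte alignment. */
	uintptr_t p8 = 0x1008;

	/* Two flag bits fit below a 4-byte-aligned pointer. */
	assert(round_trip_ok(p4, 0x1, OLD_STATES_2BIT));
	/* Three flag bits need 8-byte alignment ... */
	assert(round_trip_ok(p8, 0x5, OLD_STATES_3BIT));
	/* ... otherwise masking with ~7 silently clears bit 2 of the pointer. */
	assert(!round_trip_ok(p4, 0x1, OLD_STATES_3BIT));
	assert(extract_ptr(pack(p4, 0x1, OLD_STATES_3BIT), OLD_STATES_3BIT) == 0x1000);
}
```

Calling demo() from a test harness exercises all four cases. The failing
round trip is the one where only 4-byte alignment is guaranteed, which is
the 32-bit situation raised in the quoted comment.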
--
Cheers,
David