Re: [PATCH 3/7] mm/migrate: skip data copy for already-copied folios
From: Zi Yan
Date: Mon Jun 08 2026 - 12:24:02 EST
On 8 Jun 2026, at 11:18, David Hildenbrand (Arm) wrote:
>> Sashiko Comment:
>> Does expanding FOLIO_OLD_STATES to include BIT(2) silently corrupt anon_vma
>> pointers on 32-bit architectures?
>> The anon_vma pointer is packed with FOLIO_OLD_STATES inside dst->migrate_info.
>> To separate them, __migrate_folio_extract() uses info & ~FOLIO_OLD_STATES.
>> anon_vma objects are allocated from a kmem_cache created with align=0, so
>> their alignment falls back to ARCH_SLAB_MINALIGN, which is 4 bytes on many
>> 32-bit architectures. This means a valid anon_vma pointer can legitimately
>> end in 0x4 or 0xC (meaning bit 2 is set).
>> When __migrate_folio_extract() masks the pointer with ~7, it will silently
>> clear bit 2 from the anon_vma pointer. Any subsequent call to put_anon_vma()
>> with this corrupted pointer could cause a use-after-free or a kernel panic.
>> --
>>
>> #define ARCH_SLAB_MINALIGN __alignof__(unsigned long long)
>>
>> I initially assumed this is always 8, confusing it with the size of
>> unsigned long long.
>> But the GCC docs note that the alignment can be smaller than the size:
>>
>> https://gcc.gnu.org/onlinedocs/gcc/Alignment.html
>> "For example, if the target machine requires a double value to be aligned on
>> an 8-byte boundary, then __alignof__ (double) is 8. This is true on many RISC
>> machines. On more traditional machine designs, __alignof__ (double) is 4 or
>> even 2."
>>
>> If my understanding is right, Sashiko's concern is valid, and I can't safely
>> use BIT(2).
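To make the concern above concrete, here is a minimal userspace sketch
(not kernel code) of the low-bit packing scheme. The demo_* helpers and
DEMO_STATES_* masks are made-up names for illustration; only the
arithmetic mirrors the info & ~FOLIO_OLD_STATES extraction.

```c
#include <stdint.h>

/* Hypothetical stand-ins, not the kernel's FOLIO_OLD_STATES definitions. */
#define DEMO_STATES_2BIT ((uintptr_t)0x3)	/* bits 0-1 */
#define DEMO_STATES_3BIT ((uintptr_t)0x7)	/* bits 0-2, i.e. adds BIT(2) */

/* Pack state flags into the low bits of a pointer value. */
static inline uintptr_t demo_pack(uintptr_t ptr, uintptr_t states)
{
	return ptr | states;
}

/* Recover the pointer by clearing every bit covered by the mask. */
static inline uintptr_t demo_extract_ptr(uintptr_t info, uintptr_t mask)
{
	return info & ~mask;
}
```

With a value that is only 4-byte aligned, such as 0x1004,
demo_extract_ptr(demo_pack(0x1004, 0x1), DEMO_STATES_3BIT) yields 0x1000,
while the 2-bit mask returns 0x1004 intact.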
>
> 32bit makes this tricky indeed. And that's also the reason why
> FOLIO_MAPPING_FLAGS is currently limited to 2 bits.
>
>> I see a few options from here. Either I can gate batch copy on CONFIG_64BIT,
>
> That's a bit nasty as we'll have to special case 32bit vs 64bit.
IIRC, multithreaded copy is already gated by CONFIG_HIGHMEM, otherwise
it needs to perform kmap_local() at each copying CPU, which complicates
the process. Then, this code will only be used for 32bit without highmem,
and I assume there will be no page copy DMA on 32bit platforms. Maybe it
is not too bad to limit this to 64bit.
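As a rough userspace sketch of what limiting this to 64bit could mean
for the state mask (the DEMO_* and demo_* names are hypothetical, and
DEMO_HAS_64BIT only stands in for something like
IS_ENABLED(CONFIG_64BIT)), assuming 8-byte object alignment on 64bit:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for illustration; none of these exist in the kernel. */
#if UINTPTR_MAX > 0xffffffffu
#define DEMO_HAS_64BIT 1	/* stand-in for IS_ENABLED(CONFIG_64BIT) */
#else
#define DEMO_HAS_64BIT 0
#endif

#define DEMO_STATE_MASK_2BIT ((uintptr_t)0x3)
#define DEMO_STATE_MASK_3BIT ((uintptr_t)0x7)

/* Use the third state bit only where 8-byte alignment is assumed. */
static inline uintptr_t demo_state_mask(void)
{
	return DEMO_HAS_64BIT ? DEMO_STATE_MASK_3BIT : DEMO_STATE_MASK_2BIT;
}

/*
 * A mask is safe for a given power-of-two alignment when it only covers
 * bits that the alignment guarantees to be zero.
 */
static inline int demo_mask_fits_alignment(uintptr_t mask, size_t align)
{
	return (mask & ~(uintptr_t)(align - 1)) == 0;
}
```

Here demo_mask_fits_alignment(DEMO_STATE_MASK_3BIT, 4) is false, which is
the 32bit case being discussed, while the same mask fits an alignment
of 8.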
>
>> or I can force anon_vma_cachep to 8-byte alignment via kmem_cache_create()
>
> That's an option, but I would not do it just for this optimization.
>
>> align arg. Or I can change the migrate_folio() callback to pass already_copied
>> info to change to dst->migrate_info enum.
>
> Can you elaborate on what that would look like?
Best Regards,
Yan, Zi