Re: [PATCH v6 15/20] mm: memfd_luo: allow preserving memfd
From: Pratyush Yadav
Date: Thu Nov 20 2025 - 10:34:55 EST
On Wed, Nov 19 2025, Pasha Tatashin wrote:
> On Mon, Nov 17, 2025 at 6:04 AM Mike Rapoport <rppt@xxxxxxxxxx> wrote:
>>
>> On Sat, Nov 15, 2025 at 06:34:01PM -0500, Pasha Tatashin wrote:
>> > From: Pratyush Yadav <ptyadav@xxxxxxxxx>
>> >
>> > The ability to preserve a memfd allows userspace to use KHO and LUO to
>> > transfer its memory contents to the next kernel. This is useful in many
>> > ways. For one, it can be used with IOMMUFD as the backing store for
>> > IOMMU page tables. Preserving IOMMUFD is essential for performing a
>> > hypervisor live update with passthrough devices. memfd support provides
>> > the first building block for making that possible.
>> >
>> > For another, for applications with a large amount of memory that takes
>> > time to reconstruct, rebooting to consume kernel upgrades can be very
>> > expensive. memfd with LUO gives those applications reboot-persistent
>> > memory that they can use to quickly save and reconstruct that state.
>> >
>> > While memfd is backed by either hugetlbfs or shmem, currently only
>> > support on shmem is added. To be more precise, support for anonymous
>> > shmem files is added.
>> >
>> > The handover to the next kernel is not transparent. Not all properties
>> > of the file are preserved; only its memory contents, position, and
>> > size are. The recreated file gets the UID and GID of the task doing the
>> > restore, and the task's cgroup gets charged with the memory.
>> >
>> > Once preserved, the file cannot grow or shrink, and all its pages are
>> > pinned to avoid migrations and swapping. The file can still be read from
>> > or written to.
>> >
>> > Use vmalloc to get the buffer to hold the folios, and preserve
>> > it using kho_preserve_vmalloc(). This doesn't have the size limit.
>> >
>> > Co-developed-by: Pasha Tatashin <pasha.tatashin@xxxxxxxxxx>
>> > Signed-off-by: Pasha Tatashin <pasha.tatashin@xxxxxxxxxx>
>> > Signed-off-by: Pratyush Yadav <ptyadav@xxxxxxxxx>
[...]
>> > + struct inode *inode = file_inode(file);
>> > + struct memfd_luo_folio_ser *pfolios;
>> > + struct kho_vmalloc *kho_vmalloc;
>> > + unsigned int max_folios;
>> > + long i, size, nr_pinned;
>> > + struct folio **folios;
>>
>> pfolios and folios read like the former is a pointer to latter.
>> I'd s/pfolios/folios_ser/

folios_ser is a tricky name: it is very close to folio_ser (which is
what you might use for one member of the array).

I was bit by this when hacking on some hugetlb preservation code. I
wrote folios_ser instead of folio_ser in a loop, and then had to spend
half an hour trying to figure out why the code wasn't working. It is
kinda hard to differentiate between the two visually.

Not that I have a better name off the top of my head. Just saying that
this naming causes weird readability problems.
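
A minimal sketch of the kind of slip described above, in plain userspace
C. The struct and both functions are hypothetical stand-ins made up for
illustration; they are not the real struct memfd_luo_folio_ser or any
code from this series. Because the array pointer and the element pointer
have the same type, the typo compiles cleanly:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a serialized per-folio record. The real
 * struct memfd_luo_folio_ser layout is not shown in this thread.
 */
struct folio_ser_example {
	uint64_t pfn;
	uint64_t index;
};

/*
 * Buggy variant: "folios_ser" (the whole array) is written where
 * "folio_ser" (the current element) was intended. Both are pointers to
 * the same type, so the compiler accepts it, but the pfn of element 0
 * is overwritten on every iteration and the other pfns stay untouched.
 */
void fill_buggy(struct folio_ser_example *folios_ser, size_t nr)
{
	for (size_t i = 0; i < nr; i++) {
		struct folio_ser_example *folio_ser = &folios_ser[i];

		folio_ser->index = i;
		folios_ser->pfn = 0x1000 + i;	/* typo: meant folio_ser->pfn */
	}
}

/* Intended variant: every field is written through the element pointer. */
void fill_fixed(struct folio_ser_example *folios_ser, size_t nr)
{
	for (size_t i = 0; i < nr; i++) {
		struct folio_ser_example *folio_ser = &folios_ser[i];

		folio_ser->index = i;
		folio_ser->pfn = 0x1000 + i;
	}
}
```

The two functions differ by a single character, which is the
readability problem in a nutshell. Names that differ by more than a
trailing "s" would make the buggy line stand out in review.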
>
> Done
>
[...]
--
Regards,
Pratyush Yadav