Re: [RFC PATCH 1/5] misc: introduce FDBox

From: Jason Gunthorpe
Date: Mon Mar 17 2025 - 13:07:21 EST


On Sun, Mar 09, 2025 at 01:03:31PM +0100, Christian Brauner wrote:

> So either that work is done right from the start or that stashing files
> goes out the window and instead that KHO part is implemented in a way
> where during a KHO dump relevant userspace is notified that they must
> now serialize their state into the serialization stash. And no files are
> actually kept in there at all.

Let's ignore memfd/shmem for a moment..

It is not userspace state that is being serialized, it is *kernel*
state inside device drivers like VFIO/iommufd/kvm/etc that is being
serialized to the KHO.

The file descriptor is simply the handle to the kernel state. It is
not a "file" in any normal filesystem sense, it is just a uAPI handle
for a char dev that is used with IOCTL.

When KHO is triggered, whatever is contained inside the FD is
serialized into the KHO.

So we need:
1) A way to register FDs to be serialized. For instance, not every
VFIO FD should be retained.
2) A way for the kexecing kernel to make callbacks to the char dev
owner (probably via struct file_operations) to perform the
serialization.
3) A way for the new kernel to ask the char dev owner to create a new
struct file out of the serialized data. Probably allowed to happen
only once, i.e. you can't clone these things. This is not the same
as just opening an empty char device, it would also fill the char
device with whatever data was serialized.
4) A way to get the struct file into a process fd number so userspace
can route it to the right place.
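
To make the four steps concrete, here is a rough userspace model of the
flow. It is only a sketch: every name in it (model_kho_ops,
model_register, model_restore, ...) is made up for illustration and
nothing here is existing kernel API. The "fd table" is just an array
standing in for a process's fd numbers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical userspace model of the four steps; not kernel API. */
#define MODEL_MAX_ENTRIES 8
#define MODEL_BLOB_SIZE   64

struct model_file;

/* Steps 2 and 3: callbacks supplied by the char dev owner. */
struct model_kho_ops {
	/* Write driver state into buf; return bytes used, or -1. */
	int (*serialize)(const struct model_file *f, char *buf, size_t len);
	/* Fill a fresh file from serialized data; return 0, or -1. */
	int (*deserialize)(struct model_file *f, const char *buf, size_t len);
};

struct model_file {
	const struct model_kho_ops *ops;
	int state;	/* stands in for the kernel state behind the FD */
};

struct model_entry {
	char name[16];
	struct model_file *file;	/* valid only before serialization */
	const struct model_kho_ops *ops; /* a real design would look this up */
	char blob[MODEL_BLOB_SIZE];
	size_t blob_len;
	int restored;			/* enforces restore-once */
};

struct model_kho {
	struct model_entry entries[MODEL_MAX_ENTRIES];
	size_t count;
};

/* Toy driver: its whole state is one int. */
static int toy_serialize(const struct model_file *f, char *buf, size_t len)
{
	if (len < sizeof(f->state))
		return -1;
	memcpy(buf, &f->state, sizeof(f->state));
	return (int)sizeof(f->state);
}

static int toy_deserialize(struct model_file *f, const char *buf, size_t len)
{
	if (len != sizeof(f->state))
		return -1;
	memcpy(&f->state, buf, sizeof(f->state));
	return 0;
}

static const struct model_kho_ops toy_ops = { toy_serialize, toy_deserialize };

/* Step 1: opt a specific file in; unregistered files are not retained. */
int model_register(struct model_kho *kho, const char *name, struct model_file *f)
{
	struct model_entry *e;

	if (kho->count == MODEL_MAX_ENTRIES || !f->ops)
		return -1;
	e = &kho->entries[kho->count++];
	snprintf(e->name, sizeof(e->name), "%s", name);
	e->file = f;
	e->ops = f->ops;
	return 0;
}

/* Step 2: at kexec time, call back into each owner to serialize. */
int model_serialize_all(struct model_kho *kho)
{
	for (size_t i = 0; i < kho->count; i++) {
		struct model_entry *e = &kho->entries[i];
		int n = e->ops->serialize(e->file, e->blob, sizeof(e->blob));

		if (n < 0)
			return -1;
		e->blob_len = (size_t)n;
		e->file = NULL;	/* the old file does not survive kexec */
	}
	return 0;
}

/* Steps 3 and 4: rebuild the file once and hand back an fd number. */
int model_restore(struct model_kho *kho, const char *name,
		  struct model_file *fdtable, size_t nfds)
{
	for (size_t i = 0; i < kho->count; i++) {
		struct model_entry *e = &kho->entries[i];

		if (strcmp(e->name, name) != 0)
			continue;
		if (e->restored || e->blob_len == 0)
			return -1;	/* no cloning, nothing to restore */
		for (size_t fd = 0; fd < nfds; fd++) {
			if (fdtable[fd].ops)
				continue;	/* slot in use */
			fdtable[fd].ops = e->ops;
			if (e->ops->deserialize(&fdtable[fd], e->blob, e->blob_len)) {
				fdtable[fd].ops = NULL;
				return -1;
			}
			e->restored = 1;
			return (int)fd;
		}
		return -1;	/* fd table full */
	}
	return -1;		/* name not found */
}

/* Walk the whole flow; returns the restored state, or -1 on failure. */
int model_demo(void)
{
	struct model_kho kho = {0};
	struct model_file vfio = { &toy_ops, 42 };
	struct model_file fdtable[4] = {0};
	int fd;

	if (model_register(&kho, "vfio0", &vfio) || model_serialize_all(&kho))
		return -1;
	fd = model_restore(&kho, "vfio0", fdtable, 4);
	if (fd < 0 || model_restore(&kho, "vfio0", fdtable, 4) != -1)
		return -1;	/* a second restore must be refused */
	return fdtable[fd].state;
}
```

The point of the model is that the restored file comes back already
filled with the serialized state, and that a second restore of the same
entry is refused.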

It is not really a stash, it is not keeping files, it is hardwired to
KHO to drive its serialize/deserialize mechanism around char devs in
a very limited way.

If you have that then feeding an anonymous memfd/guestmemfd through
the same machinery is a fairly small and logical step.

Jason