Re: [RFC PATCH 1/5] misc: introduce FDBox

From: Pratyush Yadav
Date: Tue Mar 18 2025 - 19:04:35 EST


On Tue, Mar 18 2025, Jason Gunthorpe wrote:

> On Tue, Mar 18, 2025 at 03:25:25PM +0100, Christian Brauner wrote:
>
>> > It is not really a stash, it is not keeping files, it is hardwired to
>>
>> Right now as written it is keeping references to files in these fdboxes
>> and thus functioning both as a crippled high-privileged fdstore and a
>> serialization mechanism.
>
> I think Pratyush went a bit overboard on that, I can see it is useful
> for testing, but really the kho control FD should be in either
> serializing or deserializing mode and it should not really act as an
> FD store.
>
> However, edge case handling makes this a bit complicated.
>
> Once a FD is submitted to be serialized that FD has to be frozen and
> can't be allowed to change anymore.
>
> If the kexec process aborts then we need to unwind all of this stuff
> and unfreeze all the FDs.

I do think I might have gone a bit overboard, but this was one of the
reasons for doing so. Keeping the struct file around, with the ability
to map it back in, made kexec failure quick and easy to recover from.

I suppose we can serialize all FDs when the box is sealed and get rid of
the struct file. If kexec fails, userspace can unseal the box, and each
FD will be deserialized into a new struct file. This way, the behaviour
from userspace's perspective stays the same regardless of whether kexec
went through or not. This also helps tie FDBox closer to KHO.

The downside is that the recovery time will be slower since the state
has to be deserialized, but I suppose kexec failure should not happen
too often so that is something we can live with.

What do you think about doing it this way?
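
To make the proposal concrete, here is a rough userspace model of the
seal/unseal flow. It is only a sketch, not code from the patch: the
ser_* names, the single "pos" field standing in for the file state, and
the record layout are all made up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct file: one field of state. */
struct ser_file {
	long pos;
};

/* Hypothetical serialized form, as it might be handed over via KHO. */
struct ser_record {
	long pos;
};

struct ser_fdbox {
	struct ser_file *file;		/* live object, NULL while sealed */
	struct ser_record record;	/* valid only while sealed */
	int sealed;
};

/* Seal: serialize the state, then get rid of the live object. */
int ser_fdbox_seal(struct ser_fdbox *box)
{
	if (box->sealed || !box->file)
		return -EINVAL;

	box->record.pos = box->file->pos;
	free(box->file);
	box->file = NULL;
	box->sealed = 1;
	return 0;
}

/*
 * Unseal: build a new object from the record. The same path is taken
 * whether kexec failed or the box is being opened in the new kernel.
 */
int ser_fdbox_unseal(struct ser_fdbox *box)
{
	struct ser_file *f;

	if (!box->sealed)
		return -EINVAL;

	/* A sealed box must not hold a live object. */
	assert(!box->file);

	f = malloc(sizeof(*f));
	if (!f)
		return -ENOMEM;

	f->pos = box->record.pos;
	box->file = f;
	box->sealed = 0;
	return 0;
}
```

The point of the model is that unseal never reuses the old object, so
the recovery path costs a full deserialize but looks identical to the
post-kexec path.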

>
> It sure would be nice if the freezing process could be managed
> generically somehow.
>
> One option for freezing would have the kernel enforce that userspace
> has closed and idled the FD everywhere (eg check the struct file
> refcount == 1). If userspace doesn't have access to the FD then it is
> effectively frozen.

Yes, that is what I want to do in the next revision. FDBox itself will
not close the file descriptor when you put an FD in the box. It will
just grab a reference and let userspace close the FD. Then when the box
is sealed, the operation can be refused if any file's refcount != 1.
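
Roughly, the put/seal rule would look like the userspace model below.
This is a sketch under assumptions, not the next revision: the rc_*
names, the fixed-size array, and the plain long refcount are
hypothetical stand-ins for the real struct file and its refcounting.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct file; only the refcount is modelled. */
struct rc_file {
	long refcount;
};

#define RC_BOX_MAX 8

struct rc_fdbox {
	struct rc_file *files[RC_BOX_MAX];
	size_t nr_files;
	bool sealed;
};

/* Put: the box takes its own reference; the caller's FD stays open. */
int rc_fdbox_put(struct rc_fdbox *box, struct rc_file *f)
{
	if (box->sealed)
		return -EBUSY;
	if (box->nr_files == RC_BOX_MAX)
		return -ENOSPC;

	f->refcount++;
	box->files[box->nr_files++] = f;
	return 0;
}

/* Userspace closing its FD drops one reference. */
void rc_close(struct rc_file *f)
{
	assert(f->refcount > 0);
	f->refcount--;
}

/* Seal: refuse unless the box holds the only reference to each file. */
int rc_fdbox_seal(struct rc_fdbox *box)
{
	for (size_t i = 0; i < box->nr_files; i++) {
		if (box->files[i]->refcount != 1)
			return -EBUSY;
	}

	box->sealed = true;
	return 0;
}
```

With one file that userspace still has open, rc_fdbox_seal() returns
-EBUSY; after rc_close() drops that reference, the seal succeeds.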

>
> In this case the error path would need to bring the FD back out of the
> fdbox.
>
> Jason
>

--
Regards,
Pratyush Yadav