Re: [RFC PATCH 0/9] fuse: API for Checkpoint/Restore

From: Aleksandr Mikhalitsyn
Date: Mon Mar 06 2023 - 17:16:51 EST


On Mon, Mar 6, 2023 at 10:05 PM Bernd Schubert <bschubert@xxxxxxx> wrote:
>
>
>
> On 3/6/23 20:18, Miklos Szeredi wrote:
> > On Mon, 6 Mar 2023 at 17:44, Aleksandr Mikhalitsyn
> > <aleksandr.mikhalitsyn@xxxxxxxxxxxxx> wrote:
> >>
> >> On Mon, Mar 6, 2023 at 5:15 PM Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
> >
> >>> Apparently all of the added mechanisms (REINIT, BM_REVAL, conn_gen)
> >>> are crash recovery related, and not useful for C/R. Why is this being
> >>> advertised as a precursor for CRIU support?
> >>
> >> It's because I'm doing this with CRIU in mind too, I think it's a good
> >> way to make a universal interface
> >> which can address not only the recovery case but also the C/R, cause
> >> in some sense it's a close problem.
> >
> > That's what I'm wondering about...
> >
> > Crash recovery is about restoring (or at least regenerating) state in
> > the userspace server.
> >
> > In CRIU restoring the state of the userspace server is a solved
> > problem, the issue is restoring state in the kernel part of fuse. In
> > a sense it's the exact opposite problem that crash recovery is doing.

I can't argue, you're right. In the "recovery" case we don't care about userspace
state; we just want to forget everything in the kernel and keep only the
mounts (someone may want to keep open FDs too).
In the C/R case we want to recreate both the userspace and the kernel state.

These are different problems, but in some parts they require the same UAPIs.
I think I need to write a detailed motivation for the CRIU part in the
-v2 cover letter, so we can discuss it. What do you think?

> >
> >> But of course, Checkpoint/Restore is way trickier. But before
> >> doing all the work with CRIU PoC,
> >> I wanted to consult with you and folks if there are any serious
> >> objections to this interface/feature or, conversely,
> >> if there is someone else who is interested in it.
> >>
> >> Now about interfaces REINIT, BM_REVAL.
> >>
> >> I think it will be useful for CRIU case, but probably I need to extend
> >> it a little bit, as I mentioned earlier in the cover letter:
> >>>> * "fake" daemon has to reply to FUSE_INIT request from the kernel and initialize fuse connection somehow.
> >>>> This setup can be not consistent with the original daemon (protocol version, daemon capabilities/settings
> >>>> like no_open, no_flush, readahead, and so on).
> >>
> >> So, after the "fake" daemon has done its job during CRIU restore, we
> >> need to replace it with the actual daemon from
> >> the dumpee tree, and performing REINIT looks like a saner way.
> >
> > I don't get it. How does REINIT help with switching to the real daemon?
>
> The way I read the patches, the new daemon sends FUSE_INIT to advertise
> all of its features.

Yes, thanks, Bernd!

Theoretically, we can implement some basic C/R without using REINIT.
That was my first idea and I described it in my LPC 2022 talk,
but this approach is neither fully safe nor universal: the CRIU fake
daemon implements one particular fuse protocol version (and defines a
particular set of fuse ops/features),
while the dumpee fuse daemon may use a different set of fuse ops and a
different fuse protocol version. So swapping the fuse daemon in a way
that is fully transparent to the kernel is not fully safe.
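
To illustrate the concern, here is a minimal sketch of the condition under
which a silent daemon swap would be harmless. It is not code from the patch
set: the struct, its field names and the helper are hypothetical, and it
assumes, conservatively, that the swap is only transparent when both daemons
negotiated exactly the same parameters in FUSE_INIT.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of what a daemon negotiated via FUSE_INIT.
 * Field names are illustrative; they are not taken from the patch set. */
struct init_snapshot {
	uint32_t major;         /* protocol major version */
	uint32_t minor;         /* protocol minor version */
	uint32_t max_readahead; /* readahead limit in bytes */
	uint32_t flags;         /* negotiated feature bits */
};

/* Conservative check (an assumption of this sketch): the kernel can keep
 * its connection state untouched only if the daemon described by 'real'
 * negotiated exactly what the daemon described by 'fake' did. */
static bool swap_is_transparent(const struct init_snapshot *fake,
				const struct init_snapshot *real)
{
	return fake->major == real->major &&
	       fake->minor == real->minor &&
	       fake->max_readahead == real->max_readahead &&
	       fake->flags == real->flags;
}
```

Whenever such a check would fail, the kernel-side connection state no longer
matches what the real daemon expects, and this is the case where letting the
real daemon send a fresh FUSE_INIT looks safer than a silent swap.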

Thank you guys for your attention to this!

Kind regards,
Alex