RE: [Xen-devel] [RFC PATCH v3 06/12] xen-blkfront: add callbacks for PM suspend and hibernation

From: Durrant, Paul
Date: Fri Feb 21 2020 - 04:57:15 EST


> -----Original Message-----
> From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> Sent: 21 February 2020 09:22
> To: Durrant, Paul <pdurrant@xxxxxxxxxxxx>
> Cc: Agarwal, Anchal <anchalag@xxxxxxxxxx>; Valentin, Eduardo
> <eduval@xxxxxxxxxx>; len.brown@xxxxxxxxx; peterz@xxxxxxxxxxxxx;
> benh@xxxxxxxxxxxxxxxxxxx; x86@xxxxxxxxxx; linux-mm@xxxxxxxxx;
> pavel@xxxxxx; hpa@xxxxxxxxx; tglx@xxxxxxxxxxxxx; sstabellini@xxxxxxxxxx;
> fllinden@xxxxxxxxxx; Kamata, Munehisa <kamatam@xxxxxxxxxx>;
> mingo@xxxxxxxxxx; xen-devel@xxxxxxxxxxxxxxxxxxxx; Singh, Balbir
> <sblbir@xxxxxxxxxx>; axboe@xxxxxxxxx; konrad.wilk@xxxxxxxxxx;
> bp@xxxxxxxxx; boris.ostrovsky@xxxxxxxxxx; jgross@xxxxxxxx;
> netdev@xxxxxxxxxxxxxxx; linux-pm@xxxxxxxxxxxxxxx; rjw@xxxxxxxxxxxxx;
> linux-kernel@xxxxxxxxxxxxxxx; vkuznets@xxxxxxxxxx; davem@xxxxxxxxxxxxx;
> Woodhouse, David <dwmw@xxxxxxxxxxxx>
> Subject: Re: [Xen-devel] [RFC PATCH v3 06/12] xen-blkfront: add callbacks
> for PM suspend and hibernation
>
> On Thu, Feb 20, 2020 at 05:01:52PM +0000, Durrant, Paul wrote:
> > > > Hopefully what I said above illustrates why it may not be 100%
> > > > common.
> > >
> > > Yes, that's fine. I don't expect it to be 100% common (as I guess
> > > that the hooks will have different prototypes), but I expect
> > > that routines can be shared, and that the approach taken can be the
> > > same.
> > >
> > > For example one necessary difference will be that xenbus initiated
> > > suspend won't close the PV connection, in case suspension fails. On PM
> > > suspend you seem to always close the connection beforehand, so you
> > > will always have to re-negotiate on resume even if suspension failed.
> > >
> > > What I'm mostly worried about is the different approach to ring
> > > draining. Ie: either xenbus is changed to freeze the queues and drain
> > > the shared rings, or PM uses the already existing logic of not
> > > > flushing the rings and re-issuing in-flight requests on resume.
> > >
> >
> > > Yes, that needs consideration. I don't think the same semantic can be
> > > suitable for both. E.g. in a xen-suspend we need to freeze with as little
> > > processing as possible to avoid dirtying RAM late in the migration cycle,
> > > and we know that in-flight data can wait. But in a transition to S4 we
> > > need to make sure that at least all the in-flight blkif requests get
> > > completed, since they probably contain bits of the guest's memory image
> > > and that's not going to get saved any other way.
>
> Thanks, that makes sense, and something along these lines should be
> added to the commit message IMO.
>
> Wondering about S4, shouldn't we expect the queues to already be
> empty? As any subsystem that wanted to store something to disk should
> make sure requests have been successfully completed before
> suspending.

What about writing the suspend image itself? Normal filesystem I/O will have been flushed of course, but whatever vestigial kernel actually writes out the hibernation file may well expect a final D0->D3 on the storage device to cause a flush. Again, I don't know the specifics for Linux (and Windows actually uses an incarnation of the crash kernel to do the job, which brings with it a whole other set of complexity as far as PV drivers go).
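
To make the two semantics discussed above concrete, a toy user-space model might look like the following. It is only a sketch under stated assumptions: `toy_ring` and every `toy_*` function are hypothetical names invented for illustration and are not part of xen-blkfront, xenbus, or the PM core, and a real driver would deal with shared ring indices, grants and event channels rather than a flag array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SLOTS 8

/* Hypothetical stand-in for a shared ring: a slot is either free or
 * holds a request the backend has not completed yet. */
struct toy_ring {
    bool in_flight[RING_SLOTS];
    bool frozen;                /* when set, no new requests are accepted */
};

/* Count requests that have been submitted but not yet completed. */
static size_t toy_in_flight(const struct toy_ring *r)
{
    size_t n = 0;

    for (size_t i = 0; i < RING_SLOTS; i++)
        if (r->in_flight[i])
            n++;
    return n;
}

/* Queue one request; fails if the ring is frozen or full. */
static bool toy_submit(struct toy_ring *r)
{
    if (r->frozen)
        return false;
    for (size_t i = 0; i < RING_SLOTS; i++) {
        if (!r->in_flight[i]) {
            r->in_flight[i] = true;
            return true;
        }
    }
    return false;
}

/* Model the backend completing one request; false if nothing is pending. */
static bool toy_backend_complete_one(struct toy_ring *r)
{
    for (size_t i = 0; i < RING_SLOTS; i++) {
        if (r->in_flight[i]) {
            r->in_flight[i] = false;
            return true;
        }
    }
    return false;
}

/* Xen-suspend style: stop accepting work and return at once, doing as
 * little as possible. The return value is the number of requests left
 * on the ring that would have to be re-issued on resume. */
static size_t toy_xen_suspend(struct toy_ring *r)
{
    r->frozen = true;
    return toy_in_flight(r);
}

/* S4 style: stop accepting work, then wait until the backend has
 * completed everything, so nothing is left to re-issue. The return
 * value is the number of completions that were waited for. */
static size_t toy_s4_freeze(struct toy_ring *r)
{
    size_t drained = 0;

    r->frozen = true;
    while (toy_backend_complete_one(r))
        drained++;
    assert(toy_in_flight(r) == 0);
    return drained;
}
```

In this model `toy_xen_suspend()` leaves the ring contents untouched and defers the work to resume, whereas `toy_s4_freeze()` only returns once the ring is empty, which is the property the hibernation path would rely on if the image writer expects a final flush.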

Paul

>
> Thanks, Roger.