Re: [RFC v2 2/2] pmem: device flush over VIRTIO

From: Pankaj Gupta
Date: Thu Apr 26 2018 - 12:41:01 EST



>
> On Wed, Apr 25, 2018 at 04:54:14PM +0530, Pankaj Gupta wrote:
> > This patch adds functionality to perform
> > flush from guest to host over VIRTIO
> > when 'ND_REGION_VIRTIO' flag is set on
> > nd_region. Flag is set by 'virtio-pmem'
> > driver.
> >
> > Signed-off-by: Pankaj Gupta <pagupta@xxxxxxxxxx>
> > ---
> > drivers/nvdimm/region_devs.c | 7 +++++++
> > 1 file changed, 7 insertions(+)
> >
> > diff --git a/drivers/nvdimm/region_devs.c b/drivers/nvdimm/region_devs.c
> > index a612be6..6c6454e 100644
> > --- a/drivers/nvdimm/region_devs.c
> > +++ b/drivers/nvdimm/region_devs.c
> > @@ -20,6 +20,7 @@
> > #include <linux/nd.h>
> > #include "nd-core.h"
> > #include "nd.h"
> > +#include <linux/virtio_pmem.h>
> >
> > /*
> > * For readq() and writeq() on 32-bit builds, the hi-lo, lo-hi order is
> > @@ -1074,6 +1075,12 @@ void nvdimm_flush(struct nd_region *nd_region)
> > struct nd_region_data *ndrd = dev_get_drvdata(&nd_region->dev);
> > int i, idx;
> >
> > + /* call PV device flush */
> > + if (test_bit(ND_REGION_VIRTIO, &nd_region->flags)) {
> > + virtio_pmem_flush(&nd_region->dev);
> > + return;
> > + }
>
> How does libnvdimm know when flush has completed?
>
> Callers expect the flush to be finished when nvdimm_flush() returns but
> the virtio driver has only queued the request, it hasn't waited for
> completion!

I tried to mirror what nvdimm does right now: it just writes to the
flush hint address to make sure the data persists.

I did not want to block guest write requests until the host-side
fsync completes.

Operations (write/fsync) on the same file would block on the guest side,
and the wait time could be worse for operations on different guest files,
because all of these operations ultimately happen on the same file on the
host.

I think that with the current approach we get an asynchronous queuing
mechanism, at the cost of not knowing exactly when the fsync will complete,
although it is assured that it will happen. Also, it is an entire block
flush.
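
To make the trade-off concrete, here is a rough user-space sketch of the
two behaviours being discussed. It is not the driver code: struct
flush_req, struct flush_queue, queue_flush(), host_process(),
flush_async() and flush_sync() are all hypothetical names made up for
illustration, and host_process() merely stands in for the host finishing
its fsync. The asynchronous variant returns as soon as the request is
queued; the synchronous variant returns only after the request is marked
done.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one flush request placed on a virtqueue. */
struct flush_req {
	bool done;	/* set by the "host" once its fsync has finished */
};

#define QUEUE_DEPTH 8

/* Hypothetical stand-in for the guest-to-host request queue. */
struct flush_queue {
	struct flush_req *pending[QUEUE_DEPTH];
	size_t count;
};

/* Guest side: queue a request. Returns false if the queue is full. */
static bool queue_flush(struct flush_queue *q, struct flush_req *req)
{
	if (q->count == QUEUE_DEPTH)
		return false;
	req->done = false;
	q->pending[q->count++] = req;
	return true;
}

/* Host side: complete every pending request (stands in for fsync). */
static void host_process(struct flush_queue *q)
{
	for (size_t i = 0; i < q->count; i++)
		q->pending[i]->done = true;
	q->count = 0;
}

/* Asynchronous variant: return as soon as the request is queued. */
static bool flush_async(struct flush_queue *q, struct flush_req *req)
{
	return queue_flush(q, req);
}

/* Synchronous variant: do not return until the host reports completion. */
static bool flush_sync(struct flush_queue *q, struct flush_req *req)
{
	if (!queue_flush(q, req))
		return false;
	/* A real driver would sleep here until a completion notification. */
	while (!req->done)
		host_process(q);
	return true;
}
```

With flush_async() the caller gets control back while req->done is still
false, which is the concern raised above; with flush_sync() the caller
waits, which is the blocking I was trying to avoid.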

I am open to suggestions here; this is my current thinking and implementation.

Thanks,
Pankaj