Re: sync_file_range(SYNC_FILE_RANGE_WRITE) blocks?

From: Rafael J. Wysocki
Date: Sun Jun 01 2008 - 17:59:38 EST


On Sunday, 1 of June 2008, Andrew Morton wrote:
> On Sun, 1 Jun 2008 13:40:09 +0200 Pavel Machek <pavel@xxxxxxx> wrote:
>
> > Hi!
> >
> > > > > > All I can say so far is that I find the same as you do:
> > > > > > SYNC_FILE_RANGE_WRITE (after writing) takes a significant amount of time,
> > > > > > more than half as long as when you add in SYNC_FILE_RANGE_WAIT_AFTER too.
> > > > > >
> > > > > > > Which makes the sync_file_range call pretty pointless: your usage seems
> > > > > > perfectly reasonable to me, but somehow we've broken its behaviour.
> > > > > > I'll be investigating ...
> > > > >
> > > > > It will block on disk queue fullness - sysrq-W will tell.
> > > >
> > > > Ah, thank you. What a disappointment, though it's understandable.
> > > > Doesn't that very severely limit the usefulness of the system call?
> > >
> > > A bit. The request queue size is runtime tunable though.
> >
> > Which /sys is that?
>
> /sys/block/sda/queue/nr_requests
>
> > What happens if I set the queue size to pretty
> > much infinity, will memory management die horribly?
>
> In theory, no - it's always caused problems when the VM/VFS/FS layer
> has relied upon request-queue exhaustion for throttling. Hence all
> that code is supposed to work OK when there is no request-queue
> blocking. Of course, (theory/practice != 1.0).
>
> > > I expect major users of this system call will be applications which do
> > > small-sized overwrites into large files, mainly databases. That is,
> > > once the application developers discover its existence. I'm still
> > > getting expressions of wonder from people who I tell about the
> > > five-year-old fadvise().
> >
> > Hey, you have one user now, it's called s2disk. But for this call to be
> > useful, we'd need an asynchronous variant... is there such a thing?
>
> Well if you're asking the syscall to shove more data into the block
> layer than it can concurrently handle, sure, the block layer will
> block. It's tunable...
>
> It can still block in places, of course - we might need to do
> synchronous reads to get at metadata and we'll need to allocate memory.
>
> > Okay, I can fork and do the call from another process, but...
>
> I sense a strangeness. What are you actually trying to do with all of this?

Pavel is trying to avoid me doing multithreaded s2disk, more or less. ;-)

However, I have some numbers showing that multithreaded image saving actually
helps a lot in case the image is compressed and encrypted.
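
For anyone following along, the call pattern under discussion can be sketched
roughly as below. This is only an illustration, not the s2disk code: the helper
names start_writeback(), wait_for_writeback() and write_chunk() are made up
for this example, and it assumes Linux with glibc's sync_file_range() wrapper.
As noted above, the SYNC_FILE_RANGE_WRITE call may still block once the
request queue fills up.

```c
#define _GNU_SOURCE   /* must precede all includes; exposes sync_file_range() */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: start writeback of dirty pages in
 * [offset, offset + nbytes) without waiting for completion.
 * It can still block if the block layer's request queue is full.
 * Returns 0 on success, -errno on failure.
 */
int start_writeback(int fd, off64_t offset, off64_t nbytes)
{
    assert(offset >= 0 && nbytes >= 0);
    if (sync_file_range(fd, offset, nbytes, SYNC_FILE_RANGE_WRITE) != 0)
        return -errno;
    return 0;
}

/*
 * Hypothetical helper: wait for any in-flight writeback of the range,
 * start writeback of whatever is still dirty, then wait for that too.
 * Note that sync_file_range() does not flush file metadata.
 * Returns 0 on success, -errno on failure.
 */
int wait_for_writeback(int fd, off64_t offset, off64_t nbytes)
{
    unsigned int flags = SYNC_FILE_RANGE_WAIT_BEFORE |
                         SYNC_FILE_RANGE_WRITE |
                         SYNC_FILE_RANGE_WAIT_AFTER;

    assert(offset >= 0 && nbytes >= 0);
    if (sync_file_range(fd, offset, nbytes, flags) != 0)
        return -errno;
    return 0;
}

/*
 * Hypothetical helper: write one chunk at the given offset and
 * immediately kick off writeback for it.
 * Returns 0 on success, -errno on failure.
 */
int write_chunk(int fd, const void *buf, size_t len, off64_t offset)
{
    const char *p = buf;
    size_t done = 0;

    while (done < len) {
        ssize_t n = pwrite(fd, p + done, len - done, offset + (off64_t)done);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted, retry the same range */
            return -errno;
        }
        if (n == 0)
            return -EIO;        /* no progress; avoid spinning forever */
        done += (size_t)n;
    }
    return start_writeback(fd, offset, (off64_t)len);
}
```

The idea would be to call write_chunk() for each piece of the image as it
becomes ready and wait_for_writeback() once at the end, so that the disk stays
busy while the next piece is being prepared.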

Thanks,
Rafael