Re: sync_file_range(SYNC_FILE_RANGE_WRITE) blocks?

From: Andrew Morton
Date: Sun Jun 01 2008 - 18:49:12 EST


On Mon, 2 Jun 2008 00:22:02 +0200 Pavel Machek <pavel@xxxxxxx> wrote:

> Hi!
>
> > > > I expect major users of this system call will be applications which do
> > > > small-sized overwrites into large files, mainly databases. That is,
> > > > once the application developers discover its existence. I'm still
> > > > getting expressions of wonder from people who I tell about the
> > > > five-year-old fadvise().
> > >
> > > Hey, you have one user now, it's called s2disk. But for this call to be
> > > useful, we'd need an asynchronous variant... is there such a thing?
> >
> > Well if you're asking the syscall to shove more data into the block
> > layer than it can concurrently handle, sure, the block layer will
> > block. It's tunable...
>
> No, no, I don't want to overload block layer. All I want is ...
>
> > > Okay, I can fork and do the call from another process, but...
> >
> > I sense a strangeness. What are you actually trying to do with all of this?
>
> Okay, so I have around 400MB of data, I want it compressed, optionally
> encrypted and written to a partition.
>
> Now, if I do it "naturally", I do writes, followed by fsync.
>
> That's bad, because the kernel does not start writeout immediately, and
> we waste time with an idle disk. (If the data compresses really well, or
> encryption is off, this is significant.)
>
> So we improve on this by doing sync_file_range(SYNC_FILE_RANGE_WRITE)
> periodically. That keeps the disk busy, but occasionally blocks the
> CPU... wasting time (which mostly hurts in the compression+encryption
> case).

yep. That's another use of sync_file_range(): to allow smart userspace
to optimise the kernel's IO scheduling decisions.
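
For reference, the pattern Pavel describes (write a chunk, then ask the
kernel to start writeout of that chunk, and fsync once at the end) might
look roughly like the sketch below. The helper name
write_with_early_writeback() and the chunking scheme are made up for
illustration and are not taken from s2disk. The sync_file_range() call is
treated as a hint only, so its return value is ignored; as discussed above,
it can still block when the block layer's queue is full.

```c
#define _GNU_SOURCE             /* sync_file_range() is Linux-specific */
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write len bytes from buf to fd, starting at file
 * offset 0, in pieces of at most chunk bytes. After each piece, ask the
 * kernel to start writeout of just that range. Returns 0 on success,
 * -1 on a write or fsync failure.
 */
int write_with_early_writeback(int fd, const char *buf, size_t len,
                               size_t chunk)
{
    off_t off = 0;

    assert(chunk > 0);
    while (len > 0) {
        size_t want = len < chunk ? len : chunk;
        ssize_t n = pwrite(fd, buf, want, off);

        if (n <= 0)
            return -1;

        /* Hint only: start writeout of the bytes just written.
         * A failure here is not fatal because of the final fsync(). */
        (void)sync_file_range(fd, off, (off_t)n, SYNC_FILE_RANGE_WRITE);

        buf += n;
        off += n;
        len -= (size_t)n;
    }

    /* SYNC_FILE_RANGE_WRITE does not wait for completion, so the data
     * is only known to be on disk once fsync() returns. */
    return fsync(fd);
}
```

A caller would compress (and optionally encrypt) into a buffer and pass it
to this helper with a chunk size of its choosing, for example a few
megabytes.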

> So... how can I keep _both_ cpu and disk busy?

pthread_create() ;)
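
A rough sketch of the threaded approach, assuming a single writer thread
and two buffers that the producer alternates between: the producer fills
one buffer (compressing, encrypting) while the writer thread is blocked in
write() on the other. All names here (struct writer, writer_start(),
writer_submit(), writer_finish()) are hypothetical, and error handling is
kept to a minimum.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Single-slot hand-off between a producer and one writer thread. */
struct writer {
    int fd;
    pthread_t thread;
    pthread_mutex_t lock;
    pthread_cond_t cond;
    const char *buf;    /* pending buffer, NULL when the slot is free */
    size_t len;
    int stop;           /* set by writer_finish() */
    int err;            /* set if any write() failed */
};

static void *writer_main(void *arg)
{
    struct writer *w = arg;

    pthread_mutex_lock(&w->lock);
    for (;;) {
        while (!w->buf && !w->stop)
            pthread_cond_wait(&w->cond, &w->lock);
        if (!w->buf)
            break;      /* stop requested and nothing pending */

        const char *p = w->buf;
        size_t left = w->len;
        int failed = 0;

        /* Do the (possibly blocking) I/O without holding the lock. */
        pthread_mutex_unlock(&w->lock);
        while (left > 0) {
            ssize_t n = write(w->fd, p, left);
            if (n <= 0) {
                failed = 1;
                break;
            }
            p += n;
            left -= (size_t)n;
        }
        pthread_mutex_lock(&w->lock);

        if (failed)
            w->err = 1;
        w->buf = NULL;  /* slot is free only after the write is done */
        pthread_cond_broadcast(&w->cond);
    }
    pthread_mutex_unlock(&w->lock);
    return NULL;
}

int writer_start(struct writer *w, int fd)
{
    w->fd = fd;
    w->buf = NULL;
    w->len = 0;
    w->stop = 0;
    w->err = 0;
    pthread_mutex_init(&w->lock, NULL);
    pthread_cond_init(&w->cond, NULL);
    return pthread_create(&w->thread, NULL, writer_main, w);
}

/* Blocks only while the previously submitted buffer is still being
 * written; buf must stay untouched until the next submit returns. */
void writer_submit(struct writer *w, const char *buf, size_t len)
{
    pthread_mutex_lock(&w->lock);
    while (w->buf)
        pthread_cond_wait(&w->cond, &w->lock);
    w->buf = buf;
    w->len = len;
    pthread_cond_broadcast(&w->cond);
    pthread_mutex_unlock(&w->lock);
}

/* Drains the pending buffer, joins the thread, returns 0 or -1. */
int writer_finish(struct writer *w)
{
    pthread_mutex_lock(&w->lock);
    w->stop = 1;
    pthread_cond_broadcast(&w->cond);
    pthread_mutex_unlock(&w->lock);
    pthread_join(w->thread, NULL);
    pthread_mutex_destroy(&w->lock);
    pthread_cond_destroy(&w->cond);
    return w->err ? -1 : 0;
}
```

The producer would alternate between two buffers, calling writer_submit()
on each as it fills, and call fsync() on the descriptor itself after
writer_finish() returns. Because the slot is released only once the write
has completed, the buffer being refilled is never the one still being
written.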

How about this:

- Add a new SYNC_FILE_RANGE_NON_BLOCKING

- If userspace sets that flag, turn on writeback_control.nonblocking
  in __filemap_fdatawrite_range().

- Test it a lot.

It will be userspace's responsibility to avoid burning huge amounts of
CPU repeatedly calling sync_file_range() and having it not actually write
anything.

