Re: read()/readv() only from page cache

From: Christoph Hellwig
Date: Fri Sep 05 2014 - 12:32:15 EST


On Fri, Sep 05, 2014 at 12:27:21PM -0400, Milosz Tanski wrote:
> I would prefer an interface more like recv() where I can specify the
> flag if I want blocking behavior for this read or not. Let me explain
> why:
>
> In a VLDB like workload this would enable me to lower the latency of
> common fast requests. By fast requests I mean ones that do not
> require much data, the data is cached, or there's a predictable read
> pattern (read-ahead). Obviously it would be at the expense of the
> latency of large/slow requests (they have to make 2 read calls, the
> first one always failing with EWOULDBLOCK) ... but in that case it
> doesn't matter since the time to do the actual I/O would dwarf any
> extra latency.
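The two-call pattern described above can be sketched roughly as follows.
This is only an illustration under stated assumptions: it uses preadv2()
with RWF_NOWAIT, an interface that Linux gained well after this thread
(and that reports EAGAIN where the text says EWOULDBLOCK; the two have
the same value on Linux). The helper names read_cached_only() and
read_fast_then_slow() are hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#ifndef RWF_NOWAIT
#define RWF_NOWAIT 0x00000008 /* value from the Linux UAPI headers */
#endif

/* Fast path: succeeds only if data can be returned without blocking.
 * May return fewer bytes than requested if only part is cached. */
static ssize_t read_cached_only(int fd, void *buf, size_t len, off_t off)
{
    struct iovec iov = { .iov_base = buf, .iov_len = len };
    return preadv2(fd, &iov, 1, off, RWF_NOWAIT);
}

/* Two-call pattern: try the page cache first, then fall back to an
 * ordinary blocking pread(). *was_fast reports which path was taken. */
ssize_t read_fast_then_slow(int fd, void *buf, size_t len, off_t off,
                            bool *was_fast)
{
    assert(buf != NULL && was_fast != NULL);

    ssize_t n = read_cached_only(fd, buf, len, off);
    if (n >= 0) {
        *was_fast = true;
        return n;
    }

    *was_fast = false;
    /* EAGAIN: would block. EOPNOTSUPP/ENOSYS: the filesystem, kernel,
     * or libc does not support the flag. Anything else is a real error. */
    if (errno != EAGAIN && errno != EOPNOTSUPP && errno != ENOSYS)
        return -1;

    return pread(fd, buf, len, off);
}
```

A latency-sensitive caller would normally not run the fallback inline as
this sketch does, but hand the slow read to another thread.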

This is another good suggestion. I've actually heard people asking
for allowing per-I/O flags for other use cases. The one I can
remember is applying O_DSYNC only for FUA writes on a SCSI target,
the other one would be Samba again, as SMB allows per-I/O flags on
the wire as well.
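As a rough sketch of what a per-I/O sync flag looks like from user space:
the example below uses pwritev2() with RWF_DSYNC, which Linux added after
this thread, so it is an assumption here rather than something available
at the time of writing. The helper name write_maybe_fua() and its fua
parameter are hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#ifndef RWF_DSYNC
#define RWF_DSYNC 0x00000002 /* value from the Linux UAPI headers */
#endif

/* Write len bytes at off. If fua is nonzero, request O_DSYNC semantics
 * for this one write only, leaving other writes on the fd buffered. */
ssize_t write_maybe_fua(int fd, const void *buf, size_t len, off_t off,
                        int fua)
{
    assert(buf != NULL);

    struct iovec iov = { .iov_base = (void *)buf, .iov_len = len };
    ssize_t n = pwritev2(fd, &iov, 1, off, fua ? RWF_DSYNC : 0);
    if (n >= 0 || !fua || (errno != EOPNOTSUPP && errno != ENOSYS))
        return n;

    /* Per-I/O flag unavailable: emulate with a plain write followed by
     * fdatasync(), which also flushes unrelated dirty data on the fd. */
    n = pwrite(fd, buf, len, off);
    if (n >= 0 && fdatasync(fd) != 0)
        return -1;
    return n;
}
```

The fallback shows why a per-I/O flag is attractive: without it, a target
has to either open the file with O_DSYNC for every write or pay for a
full fdatasync() after each FUA write.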

> Essentially, it's using the kernel facilities (page cache) to help me
> perform better (in a more predictable fashion). I would implement this
> in our application tomorrow. It's frustrating that there is a similar
> interface (recv* family) that I cannot use.
>
> I know there's been a bunch of attempts at buffered AIO and none of
> them made it into the kernel. It would let me build a buffered AIO
> implementation in user-space using a threadpool. And cached data would
> not end up getting blocked behind other non-cached requests sitting in
> the queue. I know there's other sources of blocking (locking, metadata
> lookups) but direct AIO already suffers from these so I'm fine to
> paper over that for now.
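A minimal, single-threaded sketch of that user-space scheme: requests
that can be served from the page cache complete inline, and only the
rest are queued for workers. It assumes the later preadv2()/RWF_NOWAIT
interface as the non-blocking primitive. All type and function names
(read_req, slow_queue, submit_read, drain_slow_queue, store_result) are
hypothetical, and a real thread pool would need locking around the queue
and would run the drain loop in worker threads.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#ifndef RWF_NOWAIT
#define RWF_NOWAIT 0x00000008 /* value from the Linux UAPI headers */
#endif

#define MAX_PENDING 64

typedef void (*read_done_fn)(ssize_t result, void *ctx);

struct read_req {
    int fd;
    void *buf;
    size_t len;
    off_t off;
    read_done_fn done; /* completion callback */
    void *ctx;         /* opaque pointer handed back to the callback */
};

struct slow_queue {
    struct read_req reqs[MAX_PENDING];
    size_t count;
};

/* Example callback: stores the result in the ssize_t that ctx points to. */
void store_result(ssize_t result, void *ctx)
{
    *(ssize_t *)ctx = result;
}

/* Returns 1 if completed inline, 0 if queued, -1 if the queue is full. */
int submit_read(struct slow_queue *q, struct read_req req)
{
    assert(q != NULL && req.done != NULL);

    struct iovec iov = { .iov_base = req.buf, .iov_len = req.len };
    ssize_t n = preadv2(req.fd, &iov, 1, req.off, RWF_NOWAIT);

    /* Cached data (or a hard error) completes immediately, so it never
     * waits behind slow requests already sitting in the queue. */
    if (n >= 0 ||
        (errno != EAGAIN && errno != EOPNOTSUPP && errno != ENOSYS)) {
        req.done(n, req.ctx);
        return 1;
    }

    if (q->count == MAX_PENDING)
        return -1;
    q->reqs[q->count++] = req;
    return 0;
}

/* Worker body: performs the queued reads with ordinary blocking pread(). */
void drain_slow_queue(struct slow_queue *q)
{
    for (size_t i = 0; i < q->count; i++) {
        struct read_req *r = &q->reqs[i];
        r->done(pread(r->fd, r->buf, r->len, r->off), r->ctx);
    }
    q->count = 0;
}
```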

Although I still think providing useful AIO at the kernel level would be
better than having everyone reimplement it, it would still be useful to
allow people to sanely reimplement it. If only to avoid the discussion
about which API to use: the non-standard and not really that nice Linux
io_submit, or the utterly horrible POSIX aio_ semantics.
