Re: read()/readv() only from page cache

From: Jeff Moyer
Date: Fri Sep 05 2014 - 12:03:04 EST


Christoph Hellwig <hch@xxxxxxxxxxxxx> writes:

> On Fri, Sep 05, 2014 at 12:09:27PM +0100, Mel Gorman wrote:
>> I suggest you look at the recent fincore debate. It did not progress much
>> the last time because the author wanted to push a lot of functionality in
>> there whereas reviewers felt it should start simple. The simple case is
>> likely a good fit for what you want. The primary downside is that it would
>> be race-prone in memory pressure situations as the page could be reclaimed
>> between the fincore check and the read but I expect that your application
>> is already avoiding reclaim activity.
>
> I've actually experimentally hacked up O_NONBLOCK support for regular
> files so that it only returns data from the page cache, and not
> otherwise. Volker promised to test it with Samba, but we never made
> any progress on it, and just last week a customer told me they would
> have liked to use it if it was available.
>
> Note that we might want to also avoid blocking on locks, and I have some
> vague memory that we shouldn't actually implement O_NONBLOCK on regular
> files due to compatibility options but would have to use a new flag
> instead.

FWIW, here's a discussion from an old attempt at O_NONBLOCK for regular
files:
http://www.gossamer-threads.com/lists/linux/kernel/477936?do=post_view_threaded#477936

I recall it blowing up in various situations, so yeah, a new flag would
be a good idea.
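
To make the compatibility worry concrete, here is a minimal sketch of
what existing programs can rely on today: open() accepts O_NONBLOCK on
a regular file, but read() ignores it and returns data as usual. The
helper name read_regular_nonblock() is made up for illustration. One
plausible failure mode, and this is my assumption rather than a summary
of the old thread, is code that sets the flag on whatever descriptor it
is handed and never expects EAGAIN when that descriptor is a file.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: read up to len bytes from the start of a regular
 * file that was opened with O_NONBLOCK. With the semantics kernels have
 * today the flag has no effect on regular files, so this either returns
 * data (possibly after blocking on disk I/O) or fails for an unrelated
 * reason. It does not return -1/EAGAIN just because the data is not in
 * the page cache.
 */
static ssize_t read_regular_nonblock(const char *path, char *buf, size_t len)
{
    assert(path != NULL && buf != NULL);

    int fd = open(path, O_RDONLY | O_NONBLOCK);
    if (fd < 0)
        return -1;

    ssize_t n = read(fd, buf, len);

    /* Preserve errno from read() across close(). */
    int saved = errno;
    close(fd);
    errno = saved;
    return n;
}
```

If O_NONBLOCK started meaning "page cache only", a caller like this
would begin to see short reads or EAGAIN where it never did before,
which is the argument for putting the behaviour behind a new flag.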

> Note that mincore/fincore would not help for the usual use case where you
> have a non-blocking event main loop and want to offload actual blocking
> I/O to helper threads, as it returns information that can be stale
> any time.
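
A rough sketch of the check-then-read pattern under discussion, using
mincore() on a temporary mapping since fincore is not merged. The
helper names range_was_resident() and read_if_cached() are
hypothetical, and the sketch only looks at the start of the file. The
comment marks the window in which the residency answer can go stale.

```c
#define _DEFAULT_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if every page backing bytes [0, len)
 * of fd was reported resident at the moment mincore() ran, 0 if at
 * least one page was not, and -1 on error. The result is a snapshot.
 */
static int range_was_resident(int fd, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || len == 0)
        return -1;

    size_t npages = (len + (size_t)page - 1) / (size_t)page;

    void *map = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED)
        return -1;

    unsigned char *vec = malloc(npages);
    if (vec == NULL) {
        munmap(map, len);
        return -1;
    }

    int resident = -1;
    if (mincore(map, len, vec) == 0) {
        resident = 1;
        for (size_t i = 0; i < npages; i++) {
            /* Bit 0 of each vector entry is set for a resident page. */
            if (!(vec[i] & 1)) {
                resident = 0;
                break;
            }
        }
    }

    free(vec);
    munmap(map, len);
    return resident;
}

/*
 * Hypothetical helper: read only if the range looked cached. Returns
 * the pread() result, or -1 if the caller should hand the request to a
 * helper thread instead.
 */
static ssize_t read_if_cached(int fd, void *buf, size_t len)
{
    if (range_was_resident(fd, len) != 1)
        return -1;

    /*
     * Race window: under memory pressure the pages can be reclaimed
     * right here, so the pread() below may still block on disk I/O.
     */
    return pread(fd, buf, len, 0);
}
```

An event loop calling read_if_cached() therefore gets a hint, not a
guarantee, which is why a read that itself refuses to block is the
more useful primitive.
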
>
> One further consideration would be to finally implement real buffered
> I/O in kernel space by something like the above and offloading to
> workqueues in kernelspace. I think our workqueues now are way better
> than any possible user thread pool, although we'd need to find a way to
> temporarily tie the work threads to a user address space.

Do you mean real buffered AIO?

Cheers,
Jeff