Re: [patch 1/2] splice: dont steal
From: Nick Piggin
Date: Thu Mar 15 2007 - 09:13:30 EST
On Thu, Mar 15, 2007 at 01:54:32PM +0100, Jens Axboe wrote:
> On Thu, Mar 15 2007, Nick Piggin wrote:
> > On Thu, Mar 15, 2007 at 01:27:23PM +0100, Jens Axboe wrote:
> > > On Thu, Mar 15 2007, Nick Piggin wrote:
> > > >
> > > > We should be able to allow for it with the new a_ops API I'm working
> > > > on.
> > >
> > > "Should be" and in progress stuff, is it guarenteed to get there?
> >
> > Well considering that it is needed in order to solve 3 different deadlock
> > scenarios in the core write(2) path without taking a big performance hit,
> > I'd hope so ;)
> >
> > It isn't guaranteed, but I have only had positive feedback so far. Would
> > take a while to actually get merged, though.
>
> It's not that I don't believe you, I'm just a little reluctant to rip
> stuff out with a promise to fix it later when foo and bar are merged,
> since things like that have a tendency not to get done because they are
> forgotten :-)
Fair enough. The API side is trivial, all I need to do is set a single
flag and make splice pass down the page, and set that flag when stealing.
Filesystems might vary from trivial to impossible, but I think most should
be OK. If the flag is there then they at least have the option.
> Do you have a test case for stealing failures? What I'm really asking is
> how critical is this?
I guess you could fill a filesystem completely, and have a sparse file
in it. Then steal a page and splice it in. The prepare_write should fail,
but the page will still be in pagecache, until it gets reclaimed, then
it will go back to zeroes.
(no I don't have a test case ;)).
You could do something like remove the page if prepare_write fails, but
there is still a window where a read can see it. Basically I can't see
a way that it can possibly work within our current prepare_write API,
and it is a data corruption bug, so in my opinion it is a candidate for
2.6.21 + stable.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/