Re: [patch 52/52] fs: icache less I_FREEING time

From: Nick Piggin
Date: Thu Jul 01 2010 - 04:06:39 EST


On Thu, Jul 01, 2010 at 01:33:42PM +1000, Dave Chinner wrote:
> On Wed, Jun 30, 2010 at 10:14:52PM +1000, Nick Piggin wrote:
> > On Wed, Jun 30, 2010 at 08:13:54PM +1000, Dave Chinner wrote:
> > > On Thu, Jun 24, 2010 at 01:03:04PM +1000, npiggin@xxxxxxx wrote:
> > > > Problem with inode reclaim is that it puts inodes into I_FREEING state
> > > > and then continues to gather more, during which it may iput,
> > > > invalidate_mapping_pages, be preempted, etc. Holding these inodes in
> > > > I_FREEING can cause pauses.
> > >
> > > What sort of pauses? I can't see how holding a few inodes in
> > > I_FREEING state would cause any serious sort of holdoff...
> >
> > Well if the inode is accessed again, it has to wait for potentially
> > hundreds of inodes to be found from the LRU, pagecache invalidated,
> > and destroyed.
>
> So it's a theoretical concern you have, not something that's
> actually been demonstrated as a problem?

It is not just theoretical: it does cause holdoffs. The problem became
plainly visible in my parallel git diff workload once I started
experimenting with larger batch sizes.

Even smaller batches could have a problem if the wrong inode is hit.
I didn't add instrumentation there, but I didn't see much benefit from
batching after the lock breakups, so I prefer to go the other way: drop
the batching now, and add it back, along with its downsides, only IF
that proves to be an improvement.

Is this of any consequence for the filesystem?


> As it is, if the inode is accessed immediately after teardown has
> started, then we failed to hold on to the inode at a higher level
> for long enough.

Or simply didn't anticipate the userspace access pattern.


> Changing the I_FREEING behaviour is trying to
> address the issue at the wrong level...

Reclaim should of course be as good as possible. But it is inherently
going to have problems. You need to design everything around reclaim
with the possibility in mind that it will go wrong. Because it will,
in unreproducible ways, on a customer's machine.
