Re: inodes: Support generic defragmentation

From: Dave Chinner
Date: Mon Feb 08 2010 - 17:13:41 EST


On Mon, Feb 08, 2010 at 06:37:53PM +1100, Nick Piggin wrote:
> On Thu, Feb 04, 2010 at 11:13:15AM -0600, Christoph Lameter wrote:
> > On Thu, 4 Feb 2010, Nick Piggin wrote:
> >
> > > Well what I described is to do the slab pinning from the reclaim path
> > > (rather than from slab calling into the subsystem). All slab locking
> > > is basically "innermost", so you can pretty much poke the slab layer as
> > > much as you like from the subsystem.
> >
> > Reclaim/defrag is called from the reclaim path (of the VM). We could
> > enable a call from the fs reclaim code into the slab. But how would this
> > work?
>
> Well the exact details will depend, but I feel that things should
> get easier because you pin the object (and therefore the slab) via
> the normal and well tested reclaim paths.
>
> So for example, for dcache, you will come in and take the normal
> locks: dcache_lock, sb_lock, pin the sb, umount_lock. At which
> point you have pinned dentries without changing any locking. So
> then you can find the first entry on the LRU, and should be able
> to then build a list of dentries on the same slab.
>
> You still have the potential issue of now finding objects that would
> not be visible by searching the LRU alone. However at least the
> locking should be simplified.

Very true, but that leads us to the same problem of fragmented
caches: we empty unused objects off slabs that are still pinned by
hot objects, so the page is not freed. I agree that we can't
totally avoid this problem, but I still think that using an
object-based LRU for reclaim has a fundamental mismatch with
page-based reclaim that makes this problem worse than it could be.

FWIW, if we change the above to keeping a page-based LRU in the slab
cache and the slab picks a page to reclaim, then the problem goes
mostly away, I think. We don't need to pin the slab to select and
prepare a page to reclaim - the cache only needs to be locked before
it starts reclaim. I think this has a much better chance of
reclaiming entire pages in situations where LRU-based reclaim will
leave fragmentation.
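
To make the mismatch concrete, here is a toy simulation (plain
Python, not kernel code). Everything in it is an assumption made for
illustration: the slab geometry, the access times, the function names,
and in particular the rule that a page's age is the most recent use of
any object on it. It reclaims the same number of objects under both
policies and counts how many whole pages come free.

```python
OBJS_PER_PAGE = 4   # assumed slab geometry for the toy model
NUM_PAGES = 4

def build_cache():
    """Return {page: {obj_id: last_used}} with old objects spread across pages."""
    return {
        p: {p * OBJS_PER_PAGE + s: s * NUM_PAGES + p for s in range(OBJS_PER_PAGE)}
        for p in range(NUM_PAGES)
    }

def free_empty_pages(pages):
    """Drop pages that hold no objects; return how many pages were freed."""
    empty = [p for p, objs in pages.items() if not objs]
    for p in empty:
        del pages[p]
    return len(empty)

def reclaim_object_lru(pages, nr_objects):
    """Object-based LRU: reclaim the nr_objects oldest objects, wherever they live."""
    victims = sorted(
        (t, p, o) for p, objs in pages.items() for o, t in objs.items()
    )[:nr_objects]
    for _, p, o in victims:
        del pages[p][o]
    return free_empty_pages(pages)

def reclaim_page_lru(pages, nr_objects):
    """Page-based LRU: pick the oldest pages and reclaim every object on them."""
    reclaimed = 0
    # Assumption: a page is as old as its most recently used object.
    for p in sorted(pages, key=lambda p: max(pages[p].values())):
        if reclaimed + len(pages[p]) > nr_objects:
            break
        reclaimed += len(pages[p])
        pages[p].clear()
    return free_empty_pages(pages)

object_lru_pages = reclaim_object_lru(build_cache(), 4)
page_lru_pages = reclaim_page_lru(build_cache(), 4)
print(object_lru_pages, page_lru_pages)
```

With this layout the four oldest objects sit on four different pages,
so the object-based pass frees no pages, while the page-based pass
frees one page for the same number of reclaimed objects.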

i.e. instead of:

shrink_slab
-> external shrinker
    -> lock cache
    -> find reclaimable object
    -> call into slab w/ object
        -> return longer list of objects
    -> reclaim objects

we do:

shrink_slab
-> internal shrinker
    -> find oldest page and make object list
    -> external shrinker
        -> lock cache
        -> reclaim objects
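
The second call chain could be sketched roughly as follows. This is a
toy model in Python, not a proposed kernel interface: the `Cache` and
`Slab` classes, the `in_use` flag, and the choice to rotate a page that
is still pinned to the young end of the list are all hypothetical. The
point it illustrates is only the ordering: the slab selects the page
and builds the object list without touching the cache lock, and the
subsystem takes its lock only once it starts reclaiming.

```python
import threading
from collections import OrderedDict

class Cache:
    """Stand-in for a subsystem cache such as the dcache (hypothetical)."""
    def __init__(self, in_use):
        self.lock = threading.Lock()
        self.in_use = dict(in_use)      # obj_id -> True if still referenced

    def external_shrinker(self, obj_ids):
        """Lock the cache, reclaim the unreferenced objects, return their ids."""
        with self.lock:
            freed = [o for o in obj_ids if not self.in_use.get(o, True)]
            for o in freed:
                del self.in_use[o]
        return freed

class Slab:
    """Stand-in for the slab layer, keeping its pages in LRU order."""
    def __init__(self, cache, pages):
        self.cache = cache
        # page_id -> set of obj_ids; the first entry is the oldest page
        self.pages = OrderedDict((p, set(objs)) for p, objs in pages)

    def internal_shrinker(self):
        """Pick the oldest page, hand its objects to the cache; True if page freed."""
        if not self.pages:
            return False
        page_id, objs = next(iter(self.pages.items()))
        freed = self.cache.external_shrinker(sorted(objs))
        objs.difference_update(freed)
        if objs:
            # Still pinned by referenced objects: rotate it (assumed policy).
            self.pages.move_to_end(page_id)
            return False
        del self.pages[page_id]
        return True

def shrink_slab(slab, nr_to_scan):
    """Scan up to nr_to_scan pages; return how many whole pages were freed."""
    return sum(slab.internal_shrinker() for _ in range(nr_to_scan))

cache = Cache({0: False, 1: False, 2: True, 3: False})
slab = Slab(cache, [(0, [0, 1]), (1, [2, 3])])
pages_freed = shrink_slab(slab, 2)
```

In this run page 0 is freed outright, while page 1 loses object 3 but
stays allocated because object 2 is still referenced.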

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx