Re: [PATCH 13/17] fs: Implement lazy LRU updates for inodes.

From: Nick Piggin
Date: Sat Oct 16 2010 - 03:55:06 EST


On Wed, Sep 29, 2010 at 10:05:17PM -0400, Christoph Hellwig wrote:
> > @@ -1058,8 +1051,6 @@ static void wait_sb_inodes(struct super_block *sb)
> > */
> > WARN_ON(!rwsem_is_locked(&sb->s_umount));
> >
> > - spin_lock(&sb_inode_list_lock);
> > -
> > /*
> > * Data integrity sync. Must wait for all pages under writeback,
> > * because there may have been pages dirtied before our sync
> > @@ -1067,6 +1058,7 @@ static void wait_sb_inodes(struct super_block *sb)
> > * In which case, the inode may not be on the dirty list, but
> > * we still have to wait for that writeout.
> > */
> > + spin_lock(&sb_inode_list_lock);
>
> I think this should be folded back into the patch introducing
> sb_inode_list_lock.
>
> > @@ -1083,10 +1075,10 @@ static void wait_sb_inodes(struct super_block *sb)
> > spin_unlock(&sb_inode_list_lock);
> > /*
> > * We hold a reference to 'inode' so it couldn't have been
> > - * removed from s_inodes list while we dropped the
> > - * sb_inode_list_lock. We cannot iput the inode now as we can
> > - * be holding the last reference and we cannot iput it under
> > - * spinlock. So we keep the reference and iput it later.
> > + * removed from s_inodes list while we dropped the i_lock. We
> > + * cannot iput the inode now as we can be holding the last
> > + * reference and we cannot iput it under spinlock. So we keep
> > + * the reference and iput it later.
>
> This also looks like a hunk that got in by accident and should be merged
> into an earlier patch.

These two actually came from a patch to do RCU locking (which Dave has
changed a bit, but the stray hunks were originally my fault), so I'll fix
those, thanks.


> > @@ -431,11 +412,12 @@ static int invalidate_list(struct list_head *head, struct list_head *dispose)
> > invalidate_inode_buffers(inode);
> > if (!inode->i_count) {
> > spin_lock(&wb_inode_list_lock);
> > - list_move(&inode->i_list, dispose);
> > + list_del(&inode->i_list);
> > spin_unlock(&wb_inode_list_lock);
> > WARN_ON(inode->i_state & I_NEW);
> > inode->i_state |= I_FREEING;
> > spin_unlock(&inode->i_lock);
> > + list_add(&inode->i_list, dispose);
>
> Moving the list_add out of the lock looks fine, but I can't really
> see how it's related to the rest of the patch.

Just helps show that dispose isn't protected by wb_inode_list_lock,
I guess.
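
To illustrate the point about dispose, here is a rough user-space sketch,
not kernel code. Everything prefixed demo_ is hypothetical: demo_lock only
records whether it is held, and each list carries a pointer to the lock
that guards it (NULL for a caller-private list). The add/del helpers assert
that the guard is held, so the sketch only runs cleanly because the private
dispose list has no guard.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a spinlock: it only tracks "held or not". */
struct demo_lock { bool held; };

static void demo_lock_acquire(struct demo_lock *l) { assert(!l->held); l->held = true; }
static void demo_lock_release(struct demo_lock *l) { assert(l->held); l->held = false; }

struct demo_node { struct demo_node *prev, *next; };

/* A list head plus the lock guarding it; guard == NULL means caller-private. */
struct demo_list {
    struct demo_node head;
    struct demo_lock *guard;
};

static void demo_list_init(struct demo_list *l, struct demo_lock *guard)
{
    l->head.prev = l->head.next = &l->head;
    l->guard = guard;
}

static void demo_add(struct demo_list *to, struct demo_node *n)
{
    assert(!to->guard || to->guard->held);  /* shared lists need their lock */
    n->next = to->head.next;
    n->prev = &to->head;
    to->head.next->prev = n;
    to->head.next = n;
}

static void demo_del(struct demo_list *from, struct demo_node *n)
{
    assert(!from->guard || from->guard->held);
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->prev = n->next = n;
}

static int demo_len(const struct demo_list *l)
{
    int len = 0;
    for (const struct demo_node *p = l->head.next; p != &l->head; p = p->next)
        len++;
    return len;
}

/* Same shape as the patched hunk: unlink under the lock, queue afterwards. */
static void demo_invalidate_one(struct demo_list *shared,
                                struct demo_list *dispose,
                                struct demo_node *n)
{
    demo_lock_acquire(shared->guard);
    demo_del(shared, n);
    demo_lock_release(shared->guard);

    demo_add(dispose, n);   /* dispose is private, so no lock is taken */
}

/* Builds a two-entry shared list, disposes of one entry, returns dispose length. */
static int demo_run(void)
{
    struct demo_lock wb_lock = { false };
    struct demo_list shared, dispose;
    struct demo_node a, b;

    demo_list_init(&shared, &wb_lock);
    demo_list_init(&dispose, NULL);

    demo_lock_acquire(&wb_lock);
    demo_add(&shared, &a);
    demo_add(&shared, &b);
    assert(demo_len(&shared) == 2);
    demo_lock_release(&wb_lock);

    demo_invalidate_one(&shared, &dispose, &a);
    return demo_len(&dispose);
}
```

Calling demo_run() moves one node from the guarded list to the private one
without ever touching dispose under the lock.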

>
> > + if (inode->i_count || (inode->i_state & ~I_REFERENCED)) {
> > + list_del_init(&inode->i_list);
> > + spin_unlock(&inode->i_lock);
> > + atomic_dec(&inodes_stat.nr_unused);
> > + continue;
> > + }
> > + if (inode->i_state) {
>
> Slightly confusing but okay given the only i_state that will get us here
> is I_REFERENCED. Do we really care about the additional cycle or two a
> dumb compiler might generate when writing
>
> if (inode->i_state & I_REFERENCED)

Sure, why not.

>
> ?
>
> > if (inode_has_buffers(inode) || inode->i_data.nrpages) {
> > + list_move(&inode->i_list, &inode_unused);
>
> Why are we now moving the inode to the front of the list?

It was always being moved to the front of the list, but with lazy LRU,
iput_final doesn't move it for us, hence the list_move here.

Without this, the scan busy-spins and locks up badly under heavy reclaim
load when buffers or pagecache can't be invalidated.

Seeing as it wasn't obvious to you, I'll add a comment here.
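
For anyone following along, the effect of that list_move can be sketched in
user space roughly as follows. This is a simplified model, not the patch
itself: struct toy_inode, the toy_ helpers, the has_pages/can_invalidate
fields and the rotate switch are all made up for illustration, nr_unused
accounting and locking are left out, and a successful invalidation simply
falls through to freeing the inode. Only I_REFERENCED and the overall shape
of the checks are taken from the hunk quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define I_REFERENCED 0x1u   /* name from the patch; the value here is arbitrary */

/* Hypothetical toy inode, not the kernel's struct inode. */
struct toy_inode {
    struct toy_inode *prev, *next;  /* stands in for i_list */
    unsigned int i_count;           /* active references */
    unsigned int i_state;
    bool has_pages;                 /* buffers or pagecache attached */
    bool can_invalidate;            /* would invalidation succeed? */
    bool freed;
};

static void toy_list_init(struct toy_inode *head)
{
    head->prev = head->next = head;
}

static void toy_list_del_init(struct toy_inode *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->prev = n->next = n;
}

/* Insert right after the head, i.e. at the front of the LRU. */
static void toy_list_add(struct toy_inode *n, struct toy_inode *head)
{
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
}

static void toy_list_move(struct toy_inode *n, struct toy_inode *head)
{
    toy_list_del_init(n);
    toy_list_add(n, head);
}

/*
 * Scan up to nr_to_scan entries from the tail (oldest end) of the unused
 * list and return how many inodes were freed. With rotate == false the
 * list_move for inodes with pages is skipped, to show the failure mode.
 */
static int toy_prune(struct toy_inode *unused, int nr_to_scan, bool rotate)
{
    int freed = 0;

    while (nr_to_scan-- > 0 && unused->prev != unused) {
        struct toy_inode *inode = unused->prev;

        /* Lazy LRU: in-use or busy inodes leave the list only at scan time. */
        if (inode->i_count || (inode->i_state & ~I_REFERENCED)) {
            toy_list_del_init(inode);
            continue;
        }
        /* Recently referenced: clear the flag and give it another pass. */
        if (inode->i_state & I_REFERENCED) {
            inode->i_state &= ~I_REFERENCED;
            toy_list_move(inode, unused);
            continue;
        }
        if (inode->has_pages) {
            if (rotate)
                toy_list_move(inode, unused);
            if (!inode->can_invalidate)
                continue;   /* without rotate, the tail is this inode again */
            inode->has_pages = false;
        }
        assert(!inode->i_count && !inode->has_pages);
        toy_list_del_init(inode);
        inode->freed = true;
        freed++;
    }
    return freed;
}

/* One stuck inode at the tail, one clean inode in front of it; scan twice. */
static int toy_demo(bool rotate)
{
    struct toy_inode head, stuck = {0}, clean = {0};

    toy_list_init(&head);
    stuck.has_pages = true;         /* can_invalidate stays false */
    toy_list_add(&stuck, &head);    /* added first, so it ends up at the tail */
    toy_list_add(&clean, &head);

    return toy_prune(&head, 2, rotate);
}
```

With rotate enabled the stuck inode is moved out of the way and the second
iteration reaches the clean one; with it disabled both iterations land on
the same stuck inode and nothing is freed.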

I was thinking we should probably have a shortcut to go back to the
tail of the LRU in case of invalidation success, but that's outside the
scope of this patch and I haven't gotten around to testing such a change
yet.
