Re: [PATCH] vfs: Fix lock inversion in drop_pagecache_sb()

From: Trond Myklebust
Date: Tue Mar 25 2008 - 18:02:38 EST



On Tue, 2008-03-25 at 12:53 -0700, Andrew Morton wrote:
> On Tue, 25 Mar 2008 19:12:27 +0100
> Jan Kara <jack@xxxxxxx> wrote:
>
> > Fix longstanding lock inversion in drop_pagecache_sb by dropping inode_lock
> > before calling __invalidate_mapping_pages(). We just have to make sure
> > inode won't go away from under us by keeping reference to it and putting
> > the reference only after we have safely resumed the scan of the inode
> > list. A bit tricky but not too bad...
> >
> > Signed-off-by: Jan Kara <jack@xxxxxxx>
> > CC: Fengguang Wu <wfg@xxxxxxxxxxxxxxxx>
> > CC: David Chinner <dgc@xxxxxxx>
> >
> > ---
> > fs/drop_caches.c | 8 +++++++-
> > 1 files changed, 7 insertions(+), 1 deletions(-)
> >
> > diff --git a/fs/drop_caches.c b/fs/drop_caches.c
> > index 59375ef..f5aae26 100644
> > --- a/fs/drop_caches.c
> > +++ b/fs/drop_caches.c
> > @@ -14,15 +14,21 @@ int sysctl_drop_caches;
> >
> >  static void drop_pagecache_sb(struct super_block *sb)
> >  {
> > -	struct inode *inode;
> > +	struct inode *inode, *toput_inode = NULL;
> >
> >  	spin_lock(&inode_lock);
> >  	list_for_each_entry(inode, &sb->s_inodes, i_sb_list) {
> >  		if (inode->i_state & (I_FREEING|I_WILL_FREE))
> >  			continue;
>
> OT: it might be worth having an `if (mapping->nrpages==0) continue' here.
>
> > +		__iget(inode);
> > +		spin_unlock(&inode_lock);
> >  		__invalidate_mapping_pages(inode->i_mapping, 0, -1, true);
> > +		iput(toput_inode);
> > +		toput_inode = inode;
> > +		spin_lock(&inode_lock);
> >  	}
> >  	spin_unlock(&inode_lock);
> > +	iput(toput_inode);
> >  }
> >
> >  void drop_pagecache(void)
>
> hrm. So we have a random ref on an inode without holding inode_lock. If
> we race with invalidate_list() we end up with an inode stuck on s_inodes
> and "Self-destruct in 5 seconds. Have a nice day...", don't we?

Calling drop_pagecache_sb() without having a reference to 'sb'? Surely
not...

Trond
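
The deferred-put pattern in the quoted patch can be modelled in user space.
What follows is a minimal single-threaded sketch, not kernel code: struct node,
list_lock()/list_unlock(), node_get_locked(), node_put(), slow_work() and
walk_list() are hypothetical stand-ins for the inode, inode_lock, __iget(),
iput(), __invalidate_mapping_pages() and drop_pagecache_sb() respectively. The
"lock" is just a flag with assertions, so the sketch only checks the locking
discipline (slow work and puts happen with the lock dropped, and the previous
node is released only once the current one is pinned); it does not model
concurrency or the invalidate_list() race raised above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an inode on sb->s_inodes. */
struct node {
	struct node *next;
	int refcount;	/* only modified while the list lock is held */
	int dying;	/* analogue of I_FREEING|I_WILL_FREE */
	int visited;	/* set by the work done outside the lock */
};

/* Model of a non-recursive lock: a flag plus assertions on misuse. */
static int list_locked;

static void list_lock(void)   { assert(!list_locked); list_locked = 1; }
static void list_unlock(void) { assert(list_locked);  list_locked = 0; }

/* Take a reference; the caller must already hold the lock. */
static void node_get_locked(struct node *n)
{
	assert(list_locked);
	n->refcount++;
}

/* Drop a reference. Takes the lock itself, so it must be called unlocked.
 * NULL is accepted so the first loop iteration needs no special case. */
static void node_put(struct node *n)
{
	if (!n)
		return;
	list_lock();
	n->refcount--;
	list_unlock();
}

/* Work that must not run under the list lock. */
static void slow_work(struct node *n)
{
	assert(!list_locked);
	n->visited = 1;
}

/* Walk the list, doing slow_work() on each live node with the lock dropped.
 * Returns the number of nodes processed. */
static int walk_list(struct node *head)
{
	struct node *n, *toput = NULL;
	int count = 0;

	list_lock();
	for (n = head; n; n = n->next) {
		if (n->dying)
			continue;
		node_get_locked(n);	/* pin the current node */
		list_unlock();
		slow_work(n);
		count++;
		node_put(toput);	/* release the previous node, lock not held */
		toput = n;		/* current stays pinned until the next pass */
		list_lock();		/* n is still referenced, so n->next is safe */
	}
	list_unlock();
	node_put(toput);		/* release the last node visited */
	return count;
}
```

The point of carrying toput across iterations is that the current node keeps
its reference while the lock is retaken and the list cursor advances; only
then, on the next pass and with the lock dropped again, is that reference
released.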
