Re: [PATCH 09/18] fs: rework icount to be a locked variable

From: Al Viro
Date: Fri Oct 08 2010 - 05:32:28 EST


On Fri, Oct 08, 2010 at 04:21:23PM +1100, Dave Chinner wrote:
> From: Dave Chinner <dchinner@xxxxxxxxxx>
>
> The inode reference count is currently an atomic variable so that it can be
> sampled/modified outside the inode_lock. However, the inode_lock is still
> needed to synchronise the final reference count and checks against the inode
> state.
>
> To avoid needing the protection of the inode lock, protect the inode reference
> count with the per-inode i_lock and convert it to a normal variable. To avoid
> existing out-of-tree code accidentally compiling against the new method, rename
> the i_count field to i_ref. This is relatively straightforward as there
> are limited external references to the i_count field remaining.

You are overdoing the information hiding here; _way_ too many small
functions that don't buy you anything so far, AFAICS. Moreover, why
the hell not make them static inlines and get rid of the exports?

> - if (atomic_add_unless(&inode->i_count, -1, 1))
> + /* XXX: filesystems should not play refcount games like this */
> + spin_lock(&inode->i_lock);
> + if (inode->i_ref > 1) {
> + inode->i_ref--;
> + spin_unlock(&inode->i_lock);
> return;
> + }
> + spin_unlock(&inode->i_lock);

... or, perhaps, they need a helper along the lines of "try to do iput()
if it's known to hit the easy case".
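
A rough userspace sketch of what such a helper might look like, for
illustration only. The names fake_inode and iput_if_not_last() are
hypothetical, and C11 atomics stand in for the kernel's atomic_t; this is
the same "decrement unless we hold the last reference" logic as the
atomic_add_unless(&inode->i_count, -1, 1) call the patch removes.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct inode; only the refcount matters here. */
struct fake_inode {
	atomic_int i_count;
};

/*
 * Hypothetical helper: drop one reference only in the easy case, i.e.
 * when we are not the last holder.  Returns true if the reference was
 * dropped; false means the caller still holds its reference and must
 * go through the full iput() path (or defer it).
 */
static inline bool iput_if_not_last(struct fake_inode *inode)
{
	int old = atomic_load(&inode->i_count);

	while (old > 1) {
		/* On failure, old is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&inode->i_count, &old, old - 1))
			return true;
	}
	return false;
}
```

A filesystem would call the helper and fall back to its delayed-iput
machinery only when it returns false, instead of open-coding the
refcount check.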

I really don't like the look of code around -ENOSPC returns, though.
What exactly is going on there? Can it e.g. interfere with that
delayed iput stuff?

> void iref(struct inode *inode)
> {
> spin_lock(&inode_lock);
> + spin_lock(&inode->i_lock);
> iref_locked(inode);
> + spin_unlock(&inode->i_lock);
> spin_unlock(&inode_lock);
> }

*cringe*

> int iref_read(struct inode *inode)
> {
> - return atomic_read(&inode->i_count);
> + int ref;
> +
> + spin_lock(&inode->i_lock);
> + ref = inode->i_ref;
> + spin_unlock(&inode->i_lock);
> + return ref;

What's the point of locking here?
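
One way to read the question: a value sampled under i_lock can already be
stale by the time the caller looks at it, because the lock is dropped
before the function returns. A minimal userspace sketch of a lockless
read as a static inline follows. fake_inode and iref_read_lockless() are
hypothetical names, and the C11 relaxed load is an assumption standing in
for whatever plain read the kernel code would use.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for struct inode; only the refcount matters here. */
struct fake_inode {
	atomic_int i_ref;
};

/*
 * Lockless snapshot of the reference count.  The result is only a hint:
 * it may change as soon as it is returned, exactly as it could after
 * unlocking in the locked variant.
 */
static inline int iref_read_lockless(struct fake_inode *inode)
{
	return atomic_load_explicit(&inode->i_ref, memory_order_relaxed);
}
```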

> @@ -1324,8 +1359,16 @@ void iput(struct inode *inode)
> if (inode) {
> BUG_ON(inode->i_state & I_CLEAR);
>
> - if (atomic_dec_and_lock(&inode->i_count, &inode_lock))
> + spin_lock(&inode_lock);
> + spin_lock(&inode->i_lock);
> + inode->i_ref--;
> + if (inode->i_ref == 0) {
> + spin_unlock(&inode->i_lock);
> iput_final(inode);
> + return;
> + }

*UGH* So you take inode_lock on every damn iput()?
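
For contrast, the atomic_dec_and_lock() call being removed takes the lock
only when the count is about to reach zero. Below is a userspace
imitation of that pattern, as a sketch only: toy_lock and
toy_dec_and_lock() are hypothetical names, with a C11 atomic_flag spin
lock standing in for spinlock_t.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical toy spin lock; stands in for the kernel's spinlock_t. */
struct toy_lock {
	atomic_flag held;
};

static inline void toy_lock_acquire(struct toy_lock *l)
{
	while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
		; /* spin until the previous holder releases */
}

static inline void toy_lock_release(struct toy_lock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/*
 * Decrement *count.  Returns true with the lock HELD if the count
 * dropped to zero; otherwise returns false with the lock not held.
 */
static bool toy_dec_and_lock(atomic_int *count, struct toy_lock *lock)
{
	int old = atomic_load(count);

	/* Fast path: not the last reference, so no lock is taken. */
	while (old > 1) {
		if (atomic_compare_exchange_weak(count, &old, old - 1))
			return false;
	}

	/* Slow path: we may be the last holder; decide under the lock. */
	toy_lock_acquire(lock);
	if (atomic_fetch_sub(count, 1) == 1)
		return true;
	toy_lock_release(lock);
	return false;
}
```

With this shape, only the final reference drop pays for the global lock;
every other caller finishes on the fast path.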

> state->owner = owner;
> atomic_inc(&owner->so_count);
> list_add(&state->inode_states, &nfsi->open_states);
> - state->inode = igrab(inode);
> spin_unlock(&inode->i_lock);
> + state->inode = igrab(inode);

Why is that safe?

> --- a/fs/notify/inode_mark.c
> +++ b/fs/notify/inode_mark.c
> @@ -257,7 +257,8 @@ void fsnotify_unmount_inodes(struct list_head *list)
> * actually evict all unreferenced inodes from icache which is
> * unnecessarily violent and may in fact be illegal to do.
> */
> - if (!iref_read(inode))
> + spin_lock(&inode->i_lock);
> + if (!inode->i_ref)
> continue;

Really?