Re: processes hung after sys_renameat, and 'missing' processes
From: Linus Torvalds
Date: Wed Jun 06 2012 - 20:35:44 EST
On Wed, Jun 6, 2012 at 4:54 PM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> On Wed, Jun 06, 2012 at 04:31:51PM -0700, Linus Torvalds wrote:
>
>> Al, looking at i_mutex use and rename, the only odd thing I see is how
>> vfs_rename_dir() does the "d_move()" *after* it has dropped the target
>> i_mutex. That looks odd. But I guess it shouldn't matter, because if
>> we're doing cross-directory renames we will always serialize everybody
>> with that rename mutex anyway. Yes/no? But wouldn't it make more sense
>> to do it inside the i_mutex? And before we do the dput() on the
>> new_dentry?
>
> What we need is ->i_mutex on parents.
Yes, but the placement is odd as-is, wouldn't you say? *Why* is it
that way? Especially considering that it isn't that way in the other
non-directory case.
> And I'm much more concerned about
> this: 7732a557b1342c6e6966efb5f07effcf99f56167 and
> 3f50fff4dace23d3cfeb195d5cd4ee813cee68b7.
Hmm. If two directory dentries point to the same inode, we're f*cked
for other reasons: we'd consider them separate entries, and then try
to mutex_lock() them both. Causing the obvious deadlock. But I would
have assumed those two commits would make us *less* likely to have
that case, rather than more.
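To make the aliasing deadlock concrete, here is a minimal userspace sketch in Python. It is a toy model, not kernel code: the Inode and Dentry classes and the lock_both() helper are hypothetical names invented for illustration, threading.Lock stands in for a non-recursive kernel mutex, and a timeout stands in for the hang a real mutex_lock() would produce.

```python
import threading


class Inode:
    """Toy inode: owns one non-recursive lock, standing in for i_mutex."""

    def __init__(self):
        self.i_mutex = threading.Lock()


class Dentry:
    """Toy dentry: a name that points at an inode."""

    def __init__(self, name, inode):
        self.name = name
        self.inode = inode


def lock_both(a, b, timeout=0.1):
    """Try to take the i_mutex behind both dentries, then drop them.

    Returns False in the case where a real, untimed lock would never
    return (the deadlock), and True when both locks were taken.
    """
    if not a.inode.i_mutex.acquire(timeout=timeout):
        return False
    try:
        # If b aliases a's inode, this is the same lock we already hold.
        if not b.inode.i_mutex.acquire(timeout=timeout):
            return False
        b.inode.i_mutex.release()
        return True
    finally:
        a.inode.i_mutex.release()


# Two dentries treated as separate entries, but backed by one inode.
shared = Inode()
aliased_ok = lock_both(Dentry("a", shared), Dentry("b", shared))

# Two dentries backed by distinct inodes.
distinct_ok = lock_both(Dentry("a", Inode()), Dentry("b", Inode()))

print("aliased:", aliased_ok, "distinct:", distinct_ok)
```

The aliased pair fails on the second acquire because the code treats the two dentries as separate objects and never notices they share a lock, while the distinct pair succeeds.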
That said, you're right, that d_move() is scary as hell. No parent
semaphores there.. So we're screwed whether we try to alias them or
not.
So yeah, I agree with the suggestion of trying to revert those two and
seeing if that changes anything.
> Al, in the middle of really messy bisect right now ;-/
> [...] On the "akpm patchbomb" side it was just a linear
> sequence, so doing cherry-pick of all of that stuff to the other side of
> merge has yielded a tree identical to the merge one and that allowed normal
> git bisect, which has located the point where it breaks.
Yeah, we've done that before.
> Can't do that
> trick on the other side - there we have shitloads of merges (including the
> one from tip, and I *really* hope it doesn't end up being the source of
> trouble - topology in that one is horrible). So I'm doing a kinda-sorta
> manual bisect - pick a point with gitk, reset the test branch to it,
> merge the ipc/mqueue commit into it, test, pick the next point, etc.
> Any suggestions re improving that process?
Just do a *real* bisect - not a manual one - but every time you test a
kernel you test it with the merge (or rebase) on top. And then you
just mark the *base* of that merge good/bad, and let bisect sort it
out.
That's effectively how people bisect bugs that are hidden by other
bugs: you have to apply the (known) bugfix on top of the tree you are
bisecting in order to find the unknown one.
Linus