Re: v4.2-rc dcache regression, probably 75a6f82a0d10

From: Dominique Martinet
Date: Sat Aug 01 2015 - 01:59:13 EST


Hugh Dickins wrote on Fri, Jul 31, 2015:
> On Fri, 31 Jul 2015, Linus Torvalds wrote:
>> I'd be more suspicious about other effects. For example, iot's not at
>> all obvious that the commit in question just changes the order of the
>> flags/inode field accesses, there are potentialy bigger changes there.
>> For example, this part (in __d_obtain_alias):
>>
>> - tmp->d_inode = inode;
>> - tmp->d_flags |= add_flags;
>> + __d_set_inode_and_type(tmp, inode, add_flags);
>>
>> looks a bit off, because it *used* to just add those flags, but now,
>> through __d_set_inode_and_type, it does
>>
>> + dentry->d_inode = inode;
>> + smp_wmb();
>> + flags = READ_ONCE(dentry->d_flags);
>> + flags &= ~(DCACHE_ENTRY_TYPE | DCACHE_FALLTHRU);
>> + flags |= type_flags;
>> + WRITE_ONCE(dentry->d_flags, flags);
>>
>> so it clears DCACHE_ENTRY_TYPE | DCACHE_FALLTHRU.
>>
>> Is that correct? Maybe, I haven't checked. And maybe it's a big bad
>> bug. Regardless, it sure as hell isn't just changing the order of the
>> access to those fields. That "DCACHE_ENTRY_TYPE | DCACHE_FALLTHRU"
>> clearing came from __d_instantiate(), but now it hits __d_obtain_alias
>> too.

Yeah, had spotted this and tried to fix just this bit, but it didn't
seem to change much for my problem.
Not saying it isn't a bug, but I have no idea what __d_obtain_alias
does and nobody seemed to care about this bit in my previous thread.

> Yes, the one which grabbed my attention is:
>
> @@ -311,7 +346,7 @@ static void dentry_iput(struct dentry * dentry)
> {
> struct inode *inode = dentry->d_inode;
> if (inode) {
> - dentry->d_inode = NULL;
> + __d_clear_type_and_inode(dentry);
> hlist_del_init(&dentry->d_u.d_alias);
> spin_unlock(&dentry->d_lock);
> spin_unlock(&inode->i_lock);

Oh, missed that one... Running tests with just that over the weekend,
it's a good candidate and the first 10 minutes of tests sound quite
positive!

I think it's wrong to call it a "fix" even if it stops the bug from
reproducing though, the way I understand the intent of the patch, they
want that checking d_flags be enough to take decisions without having to
check d_inode as well - so now things rely on that, it's still going to
be wrong on HEAD... I think?
(I'm running tests at this commit, so I don't have the patches that rely
on that yet)

As I understand things, the fact that lookup managed to get a
being-removed entry from rcu/wherever isn't changed, it's just that it
won't fail as fast and maybe something later will notice the lack of
inode and fallback graciously instead of that ENOTDIR I've been
tracking -- so that commit makes it possible to hit the bug, but there's
another issue about taking the dentry in the first place?


I really need to spend some time to understand vfs/dcache better as a
whole at some point, I've been looking at a small part of it without
context for too long...


Cheers,
--
Dominique
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/