Re: [PATCH] vfs: Fix RCU path walk failures due to uninitialized nameidata seq number for root directory

From: Andi Kleen
Date: Fri Apr 15 2011 - 17:09:22 EST

On 4/15/2011 11:39 AM, Tim Chen wrote:
During RCU walk in path_lookupat and path_openat, the RCU lookup
frequently failed because the seq number in nameidata was not properly
set when the root directory was looked up. We dropped out of RCU walk in
nameidata_drop_rcu due to a mismatch in the directory entry's seq number,
and reverted to the slow path walk, which needs to take references.

Thanks Tim. Adding Andrew, Linus too. IMHO this fix is quite important to
actually make the fabled RCU dcache work -- without it, it's just slower because
it will fall back nearly always.

And it's a correctness fix because with the bogus sequence number you could fail
to detect a race on root's dentry, leading to very subtle malfunction.
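
For anyone not familiar with the mechanism, here is a minimal user-space sketch of why an unsampled sequence number hurts. It is only an illustration under stated assumptions: every toy_* name is hypothetical and not a kernel API, the code is single-threaded, and it leaves out the memory barriers and odd-count spinning that the real seqcount primitives use.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's seqcount_t (no barriers, no spinning). */
struct toy_seqcount {
	unsigned sequence;
};

/* Sample the counter at the start of a read-side section. */
static unsigned toy_read_begin(const struct toy_seqcount *s)
{
	return s->sequence;
}

/* Returns true if the counter moved since 'start', i.e. the read must be redone. */
static bool toy_read_retry(const struct toy_seqcount *s, unsigned start)
{
	return s->sequence != start;
}

/* A complete write: one increment on entry and one on exit. */
static void toy_write(struct toy_seqcount *s)
{
	s->sequence++;
	s->sequence++;
}

/* Hypothetical miniature versions of a dentry and of the walk state. */
struct toy_dentry {
	struct toy_seqcount d_seq;
};

struct toy_walk {
	struct toy_dentry *root;
	unsigned seq;
};

/* Behaviour before the patch: root is recorded, seq is left untouched. */
static void toy_set_root_unfixed(struct toy_walk *w, struct toy_dentry *root)
{
	w->root = root;
}

/* Behaviour after the patch: seq is sampled from the root dentry. */
static void toy_set_root_fixed(struct toy_walk *w, struct toy_dentry *root)
{
	w->root = root;
	w->seq = toy_read_begin(&root->d_seq);
}

/* True if the walk may keep going without falling back to the slow path. */
static bool toy_walk_still_valid(const struct toy_walk *w)
{
	return !toy_read_retry(&w->root->d_seq, w->seq);
}
```

With a root whose counter is already nonzero, a walk set up through toy_set_root_unfixed() fails toy_walk_still_valid() even though nothing changed, which corresponds to the needless fallback. If the stale value happened to match instead, a real modification could go unnoticed, which is the correctness side. After toy_set_root_fixed() the check passes until toy_write() runs.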

Could it be merged ASAP please?
It should also be a stable candidate for .38 (whoever merges it, please
add a Cc: stable@xxxxxxxxxx # .38)

Reviewed-by: Andi Kleen <ak@xxxxxxxxxxxxxxx>


With the following patch, I saw a 50% increase in an exim mail server
benchmark throughput on a 4-socket Nehalem-EX system.



Signed-off-by: Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx>
diff --git a/fs/namei.c b/fs/namei.c
index 3cb616d..e4b27a6 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -697,6 +697,7 @@ static __always_inline void set_root_rcu(struct nameidata *nd)
 		do {
 			seq = read_seqcount_begin(&fs->seq);
 			nd->root = fs->root;
+			nd->seq = __read_seqcount_begin(&nd->root.dentry->d_seq);
 		} while (read_seqcount_retry(&fs->seq, seq));
