ping!!
On 9/27/19 10:12 AM, Ritesh Harjani wrote:
d_is_negative() can race with d_instantiate_new()
-> __d_set_inode_and_type().
For example, Thread-1 is creating a symlink (doing
d_instantiate_new()) while Thread-2 is doing a cat of that
symlink and goes through lookup_fast() in REF-walk mode
(one such case is when ->permission returns -ECHILD).
If, during this race, the stores in __d_set_inode_and_type()
become visible out of order, so that dentry->d_flags is set
before dentry->d_inode, the reader can see a dentry that looks
positive but still has a NULL inode, which results in the
kernel panic below.
This change fixes the issue by checking the inode directly.
Example of the panic, with inode being NULL:
trailing_symlink() -> may_follow_link() -> inode->i_uid.
Issue signature:-
   [NIP  : trailing_symlink+80]
   [LR   : trailing_symlink+1092]
   #4 [c00000198069bb70] trailing_symlink at c0000000004bae60 (unreliable)
   #5 [c00000198069bc00] path_openat at c0000000004bdd14
   #6 [c00000198069bc90] do_filp_open at c0000000004c0274
   #7 [c00000198069bdb0] do_sys_open at c00000000049b248
   #8 [c00000198069be30] system_call at c00000000000b388
Sequence of events:-
Thread-1(Comm: ln)                  Thread-2(Comm: cat)
                                    dentry = __d_lookup() //nonRCU
__d_set_inode_and_type() (Out-of-order execution)
    flags = READ_ONCE(dentry->d_flags);
    flags &= ~(DCACHE_ENTRY_TYPE | DCACHE_FALLTHRU);
    flags |= type_flags;
    WRITE_ONCE(dentry->d_flags, flags);
                                    if (unlikely(d_is_negative(dentry))) // fails
                                           {}
                                    // since d_flags is already updated in
                                    // Thread-1 in parallel but inode
                                    // not yet set.
                                    // d_is_negative returns false
                                    *inode = d_backing_inode(path->dentry);
                                    // means inode is still NULL
    dentry->d_inode = inode;
                                    trailing_symlink()
                                        may_follow_link()
                                            inode = nd->link_inode;
                                            // nd->link_inode = NULL
                                            // Then it crashes while
                                            // doing inode->i_uid
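The window in the sequence above can be replayed in a small userspace
model. This is only a sketch: the model_* names, the struct layouts and
the flag values are hypothetical stand-ins, not the kernel definitions.
It assumes only that d_is_negative() decides from the type bits in
d_flags, while d_really_is_negative() decides from d_inode being NULL.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's dentry type bits. */
#define MODEL_ENTRY_TYPE   0x7u	/* models the DCACHE_ENTRY_TYPE mask */
#define MODEL_MISS_TYPE    0x0u	/* models a negative dentry */
#define MODEL_SYMLINK_TYPE 0x6u	/* models a symlink dentry */

struct model_inode {
	unsigned int i_uid;
};

struct model_dentry {
	unsigned int d_flags;
	struct model_inode *d_inode;
};

/* Flags-based check, modelled on d_is_negative(). */
static inline bool model_is_negative(const struct model_dentry *d)
{
	return (d->d_flags & MODEL_ENTRY_TYPE) == MODEL_MISS_TYPE;
}

/* Pointer-based check, modelled on d_really_is_negative(). */
static inline bool model_really_is_negative(const struct model_dentry *d)
{
	return d->d_inode == NULL;
}

/* First store of the reordered update: the type bits become visible. */
static inline void model_set_type(struct model_dentry *d, unsigned int type)
{
	d->d_flags = (d->d_flags & ~MODEL_ENTRY_TYPE) | type;
}

/* Second store of the reordered update: the inode becomes visible. */
static inline void model_set_inode(struct model_dentry *d,
				   struct model_inode *inode)
{
	d->d_inode = inode;
}

/*
 * Replays the reported interleaving in a single thread and returns true
 * if, between the two stores, the flags-based check already reports a
 * positive dentry while the pointer-based check still reports no inode.
 */
static inline bool model_replay_window(void)
{
	struct model_inode inode = { .i_uid = 0 };
	struct model_dentry d = { .d_flags = MODEL_MISS_TYPE, .d_inode = NULL };
	bool window_seen;

	model_set_type(&d, MODEL_SYMLINK_TYPE);	/* ln side: flags first */
	window_seen = !model_is_negative(&d) &&	/* cat side: looks positive */
		      model_really_is_negative(&d);	/* but inode is NULL */
	model_set_inode(&d, &inode);		/* ln side: inode arrives late */

	return window_seen;
}
```

In this model, a reader that only consults the flags goes on to
dereference a NULL inode inside the window, whereas a reader that
tests the pointer bails out. The model is single-threaded and says
nothing about which orderings real hardware allows.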
Reported-by: Guang Yuan Wu <wugyuan@xxxxxxxxxx>
Tested-by: Guang Yuan Wu <wugyuan@xxxxxxxxxx>
Acked-by: Jeff Layton <jlayton@xxxxxxxxxx>
Signed-off-by: Ritesh Harjani <riteshh@xxxxxxxxxxxxx>
---
 fs/namei.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)
diff --git a/fs/namei.c b/fs/namei.c
index 671c3c1a3425..7c5337cddebd 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1617,7 +1617,21 @@ static int lookup_fast(struct nameidata *nd,
 		dput(dentry);
 		return status;
 	}
-	if (unlikely(d_is_negative(dentry))) {
+
+	/*
+	 * Caution: d_is_negative() can race with
+	 * __d_set_inode_and_type().
+	 * For example, in use cases where Thread-1 is creating a
+	 * symlink (doing d_instantiate_new()) & Thread-2 is doing
+	 * cat of that symlink and falling here (via REF-walk) while
+	 * doing lookup_fast (one such case is when ->permission
+	 * returns -ECHILD).
+	 * Now if __d_set_inode_and_type() does out-of-order execution
+	 * i.e. it first sets dentry->d_flags & then dentry->d_inode,
+	 * then it can result in inode being NULL, causing panic later.
+	 * Hence directly check if inode is NULL here.
+	 */
+	if (unlikely(d_really_is_negative(dentry))) {
 		dput(dentry);
 		return -ENOENT;
 	}