Re: [PATCH v4] smb: client: Fix mount deadlock by avoiding super block iteration in DFS reconnect
From: Wang Zhaolong
Date: Wed Aug 20 2025 - 08:08:53 EST
On Fri, Aug 15, 2025 at 11:16:18AM +0800, Wang Zhaolong wrote:
diff --git a/fs/smb/client/dfs.c b/fs/smb/client/dfs.c
index f65a8a90ba27..37d83aade843 100644
--- a/fs/smb/client/dfs.c
+++ b/fs/smb/client/dfs.c
@@ -429,11 +429,11 @@ int cifs_tree_connect(const unsigned int xid, struct cifs_tcon *tcon)
 					       tcon, tcon->ses->local_nls);
 		goto out;
 	}
 	sb = cifs_get_dfs_tcon_super(tcon);
-	if (!IS_ERR(sb))
+	if (!IS_ERR_OR_NULL(sb))
 		cifs_sb = CIFS_SB(sb);
This is a bad or incomplete fix. When functions return BOTH error
pointers and NULL it MEANS something. The NULL return in this case
is a special kind of success.
For example, if you look up a file, then an error means the lookup
failed: we're not allowed to have '/' in filenames so that's -EINVAL,
or maybe there was an allocation failure so that's -ENOMEM, or maybe
you don't have access to the directory so it's -EPERM. The NULL would
mean that the lookup succeeded fine, but the file was not found.
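A rough userspace sketch of that three-way distinction follows. The
err_ptr()/is_err()/ptr_err() helpers are stand-ins written for this
example (they only mimic the kernel's ERR_PTR()/IS_ERR()/PTR_ERR()),
and lookup_entry(), its table and classify_lookup() are hypothetical
names, not code from the patch:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_ERRNO 4095	/* same range the kernel reserves for error pointers */

/* Userspace stand-in for ERR_PTR(): encode a negative errno as a pointer. */
static inline void *err_ptr(long err)
{
	assert(err < 0 && err >= -MAX_ERRNO);
	return (void *)(intptr_t)err;
}

/* Userspace stand-in for IS_ERR(): true only for encoded errors, not NULL. */
static inline int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Userspace stand-in for PTR_ERR(): recover the negative errno. */
static inline long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* Hypothetical lookup table, for illustration only. */
struct entry {
	const char *name;
	int size;
};

static struct entry table[] = {
	{ "a.txt", 10 },
	{ "b.txt", 20 },
};

/*
 * Returns a valid pointer when the name is found, NULL when the lookup
 * worked but nothing matched, and an error pointer when the lookup
 * itself could not be performed.
 */
struct entry *lookup_entry(const char *name)
{
	size_t i;

	if (!name || strchr(name, '/'))
		return err_ptr(-EINVAL);	/* the lookup failed */

	for (i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
		if (strcmp(table[i].name, name) == 0)
			return &table[i];	/* found */
	}
	return NULL;				/* success, but not found */
}

enum lookup_result { LOOKUP_FAILED, LOOKUP_ABSENT, LOOKUP_FOUND };

/* The caller keeps the three outcomes apart instead of merging two of them. */
enum lookup_result classify_lookup(const char *name)
{
	struct entry *e = lookup_entry(name);

	if (is_err(e))
		return LOOKUP_FAILED;
	if (!e)
		return LOOKUP_ABSENT;
	return LOOKUP_FOUND;
}
```

A caller that collapses the first two branches into one IS_ERR_OR_NULL()
style check can no longer tell "lookup failed" from "nothing there".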
Another common use case is "get the LED functions so I can blink
them". -EPROBE_DEFER means the LED subsystem isn't ready yet, but NULL
means the administrator has deliberately disabled it. It's not an
error, it's deliberate.
It needs to be documented what the NULL return *means*. The documentation
is missing here.
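As a hedged illustration of what such documentation could look like,
here is a userspace sketch of the LED case. Everything prefixed demo_ or
DEMO_ is made up for this example (the macros only imitate the kernel's
error-pointer helpers, and DEMO_EPROBE_DEFER copies the kernel-internal
value 517); none of it is the cifs code:

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_ERRNO		4095
#define DEMO_EPROBE_DEFER	517	/* kernel-internal errno, absent from userspace errno.h */

/* Stand-ins that imitate ERR_PTR(), IS_ERR() and PTR_ERR(). */
#define DEMO_ERR_PTR(err)	((void *)(intptr_t)(err))
#define DEMO_IS_ERR(ptr)	((uintptr_t)(ptr) >= (uintptr_t)-DEMO_MAX_ERRNO)
#define DEMO_PTR_ERR(ptr)	((long)(intptr_t)(ptr))

struct demo_led {
	int blink_count;
};

static struct demo_led the_led;

/**
 * demo_get_led() - get the optional status LED
 * @subsys_ready: nonzero once the (hypothetical) LED subsystem is up
 * @led_disabled: nonzero if the administrator turned the LED off
 *
 * Return: a valid pointer on success. NULL means the LED was
 * deliberately disabled; this is not an error and callers should simply
 * skip LED handling. An error pointer (-EPROBE_DEFER) means the
 * subsystem is not ready yet and the caller should retry later.
 */
struct demo_led *demo_get_led(int subsys_ready, int led_disabled)
{
	if (!subsys_ready)
		return DEMO_ERR_PTR(-DEMO_EPROBE_DEFER);
	if (led_disabled)
		return NULL;
	return &the_led;
}

/* Caller: propagate real errors, treat NULL as "feature off", else use it. */
int demo_probe(int subsys_ready, int led_disabled)
{
	struct demo_led *led = demo_get_led(subsys_ready, led_disabled);

	if (DEMO_IS_ERR(led))
		return (int)DEMO_PTR_ERR(led);
	if (led)
		led->blink_count++;
	return 0;
}
```

With the Return: paragraph in place, a reader of the caller can see that
the NULL branch is intentional and not a missed error check.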
See my blog for more details.
https://staticthinking.wordpress.com/2022/08/01/mixing-error-pointers-and-null/
regards,
dan carpenter
Hi Dan,
Thank you for your valuable feedback and the insightful blog post. You're
absolutely right - mixing error pointers and NULL without clear semantics
is problematic.
I've just posted a v5 patch [1] that takes a completely different approach:
- Removes cifs_get_dfs_tcon_super() entirely (no more ERR_PTR/NULL confusion)
- Directly updates DFS mount prepaths without searching through superblocks
- Eliminates the deadlock by avoiding iterate_supers_type() completely
Thank you again for catching this issue - it led me to a much better
solution.
[1] https://lore.kernel.org/all/20250820113435.2319994-1-wangzhaolong@xxxxxxxxxxxxxxx/
Best regards,
Wang Zhaolong