Re: kernel crash in mknod
From: Al Viro
Date: Sun Mar 24 2024 - 02:31:35 EST
On Sun, Mar 24, 2024 at 05:46:36AM +0000, Al Viro wrote:
> On Sun, Mar 24, 2024 at 12:00:15AM -0500, Steve French wrote:
> > Anyone else seeing this kernel crash in do_mknodat (I see it with a
> > simple "mkfifo" on smb3 mount). I started seeing this in 6.9-rc (did
> > not see it in 6.8). I did not see it with the 3/12/23 mainline
> > (early in the 6.9-rc merge Window) but I do see it in the 3/22 build
> > so it looks like the regression was introduced by:
>
> FWIW, successful ->mknod() is allowed to return 0 and unhash
> dentry, rather than bothering with lookups. So commit in question
> is bogus - lack of error does *NOT* mean that you have struct inode
> existing, let alone attached to dentry. That kind of behaviour
> used to be common for network filesystems more than just for ->mknod(),
> the theory being "if somebody wants to look at it, they can bloody
> well pay the cost of lookup after dcache miss".
>
> Said that, the language in D/f/vfs.rst is vague as hell and is very easy
> to misread in direction of "you must instantiate".
>
> Thankfully, there's no counterpart with mkdir - *there* it's not just
> possible, it's inevitable in some cases for e.g. nfs.
>
> What the hell is that hook doing in non-S_IFREG cases, anyway? Move it
> up and be done with it...
PS: moving the call site up to S_IFREG case deals with the immediate
problem (->create() *IS* required to make dentry uptodate), but the other
side of what had lead to that bug needs to be dealt with separately.
It needs to be documented clearly (for all object-creating methods) and
we need to decide what their behaviours should be. Right now it's
successful ->create() must make positive
successful ->mkdir() may leave negative unhashed (and it might be forced to)
successful ->tmpfile() must make positive
->mknod() and ->symlink() are uncertain. VFS doesn't give a damn;
other users might. FWIW, ecryptfs is fine with either behaviour.
nfsd and overlayfs might or might not break. AF_UNIX bind()
probably *does* break on such ->mknod() behaviour and unless I'm
misreading the history it had been that way since way back.
>From a quick look through ->mknod() instances it looks like
CIFS_MOUNT_UNX_EMUL case in cifs is the odd man out at the moment.
Could you check it AF_UNIX bind() with ->sun_path containing
a pathname that resolves to (inexistent) file on your filesystem
breaks with your setup?
setup?