Re: kernel crash in mknod
From: Al Viro
Date: Mon Mar 25 2024 - 15:55:21 EST
On Mon, Mar 25, 2024 at 11:26:59AM -0500, Steve French wrote:
> A loosely related question. Do I need to change cifs.ko to return the
> pointer to inode on mknod now? dentry->inode is NULL in the case of mknod
> from cifs.ko (and presumably some other fs as Al noted), unlike mkdir and
> create where it is filled in. Is there a perf advantage in filling in the
> dentry->inode in the mknod path in the fs or better to leave it as is? Is
> there a good example to borrow from on this?
AFAICS, that case in CIFS is the only instance of ->mknod() that does this
"skip lookups, just unhash and return 0" at the moment.
What's more, it really had been broken all along for one important case -
AF_UNIX bind(2) with address (== socket pathname) being on the filesystem
in question.
Options:
1) make vfs_mknod() callers aware of the possibility, have the ones
that care do lookup in case when return value is 0 and dentry is unhashed.
That's similar to what we do for vfs_mkdir(). No changes needed for CIFS
or fs/namei.c (i.e. do_mknodat()), unix_bind() definitely needs a change,
ecryptfs can stay as-is, overlayfs just needs to stop complaining when it sees
that situation, nfsd might or might not need a change - hadn't checked yet.
In that case we document ->mknod() as "may unhash and return 0 if it wants
to save a lookup".
2) make vfs_mknod() check for that case and have it call ->lookup()
if it sees that. I don't see any benefits to that, TBH - no performance
benefits anywhere and no real simplification for ->mknod() instances. It
does avoid the need to change anything in CIFS, though.
3) require ->mknod() instances to make dentry positive on success.
CIFS needs a fix, documentation gets updated to explicitly require that.
AFAICS, nothing else would need to be touched, except possibly adding
a warning in vfs_mknod() to catch violation of that rule.
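To make the difference between (1) and (3) concrete, here is a rough
userspace model of the two checks. It is a sketch, not kernel code: the
mock_* types and all function names are made up for illustration, and the
real struct dentry, d_unhashed() and vfs_mknod() are not used here.

```c
/* Userspace model only: none of these types or helpers are the kernel's. */
#include <stdbool.h>
#include <stddef.h>

struct mock_inode {
	int mode;
};

struct mock_dentry {
	struct mock_inode *inode;	/* NULL models a negative dentry */
	bool hashed;			/* false models an unhashed dentry */
};

/* Models an ->mknod() that skips the lookup: unhash and return 0. */
static int mock_mknod_lazy(struct mock_dentry *d, int mode)
{
	(void)mode;
	d->hashed = false;
	return 0;
}

/* Models an ->mknod() that makes the dentry positive on success. */
static int mock_mknod_instantiating(struct mock_dentry *d, int mode)
{
	static struct mock_inode ino;	/* one static inode is enough for a model */

	ino.mode = mode;
	d->inode = &ino;
	d->hashed = true;
	return 0;
}

/* Option 1: a caller that cares treats "0 + unhashed negative" as "do a lookup". */
static bool caller_needs_lookup(int err, const struct mock_dentry *d)
{
	return err == 0 && !d->hashed && d->inode == NULL;
}

/* Option 3: a check in the wrapper; success must leave the dentry positive. */
static bool violates_positive_rule(int err, const struct mock_dentry *d)
{
	return err == 0 && d->inode == NULL;
}
```

With (1) every caller that needs the inode carries the first check; with (3)
the second check lives in one place and only fires for a misbehaving instance.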
Note that cifs_sfu_make_node() is the only case in CIFS where that happens -
other codepaths (both in cifs_make_node() and in smb2_make_node()) will
instantiate. How painful would it be for cifs_sfu_make_node()?
AFAICS, you do open/sync_write/close there; would it be hard to do
an equivalent of fstat and set the inode up? No need to reread the
file contents (as cifs_sfu_type() does), and you do have full path
anyway, so it's less work than for full ->lookup() even if you need
path-based protocol operations...
Does that thing have an equivalent of fstat() that would return the
metadata of opened file?
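For reference, the shape I have in mind, shown as a plain POSIX userspace
analogue rather than anything CIFS-specific. The helper name
create_write_fstat() is hypothetical; whether the protocol offers a matching
handle-based query is exactly the open question above.

```c
/* Userspace analogy using POSIX calls; not CIFS or kernel code. */
#include <fcntl.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create the file at path, write len bytes from buf, then fill *st from the
 * still-open descriptor before closing it.  Returns 0 on success, -1 on error.
 */
static int create_write_fstat(const char *path, const void *buf, size_t len,
			      struct stat *st)
{
	int fd = open(path, O_CREAT | O_EXCL | O_WRONLY, 0600);
	int ret = -1;

	if (fd < 0)
		return -1;

	/* Metadata comes from the open handle, so no second path walk is needed. */
	if (write(fd, buf, len) == (ssize_t)len && fstat(fd, st) == 0)
		ret = 0;

	if (close(fd) != 0)
		ret = -1;
	return ret;
}
```

The point is only that the metadata is fetched while the handle is still
open, so nothing has to be looked up by name or read back afterwards.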