Re: [BUG] KASAN: slab-use-after-free in link_path_walk
From: Al Viro
Date: Thu Apr 23 2026 - 01:15:56 EST
On Thu, Apr 23, 2026 at 05:39:06AM +0100, Al Viro wrote:
> Folks, the rules are simple:
> * anything that might be accessed in RCU mode (inode very much included
> for objects that are visible in the tree) must be freed after RCU delay; that's
> what ->free_inode() is for.
> * anything that can't be freed in such context should either be
> dealt with in ->destroy_inode() (if it isn't needed for RCU-exposed methods)
> or, if it really is needed for those, done via schedule_work() or equivalent
> done by ->destroy_inode().
If you do ->destroy_inode() alone, you must use an explicit call_rcu() in there
(or in ->evict_inode(), for that matter), with everything that must be RCU-delayed
done via that callback; strongly discouraged, though, since it's easier to leave
that to fs/inode.c by turning that callback into ->free_inode().
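
To make the ordering in those rules concrete, here is a minimal userspace
model; it is a sketch, not kernel code. Every name in it (model_inode,
model_sb_ops, model_teardown(), model_grace_period_elapsed(), demo_*) is
hypothetical. The deferred list stands in for call_rcu(), and the early return
for a NULL ->free_inode() is meant to mirror what fs/inode.c does, so check it
against the tree before relying on it.

```c
/* Userspace model only: none of these names are kernel APIs. */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct model_inode;

struct model_sb_ops {
	void (*destroy_inode)(struct model_inode *inode);	/* runs promptly */
	void (*free_inode)(struct model_inode *inode);		/* runs after the delay */
};

struct model_inode {
	const struct model_sb_ops *ops;
	struct model_inode *next;	/* link in the deferred-free list */
	int *events;			/* caller-owned: [0] prompt cleanups, [1] frees */
};

/* Inodes whose final free is waiting for the simulated grace period. */
static struct model_inode *deferred;

/* Stand-in for inode teardown: prompt part now, freeing later. */
static void model_teardown(struct model_inode *inode)
{
	assert(inode && inode->ops);
	if (inode->ops->destroy_inode) {
		inode->ops->destroy_inode(inode);
		/* ->destroy_inode() alone: it owns the freeing, delay included. */
		if (!inode->ops->free_inode)
			return;
	}
	/* A lockless walker may still be looking at the inode: queue the free. */
	inode->next = deferred;
	deferred = inode;
}

/* Stand-in for the end of a grace period; returns the number of inodes freed. */
static int model_grace_period_elapsed(void)
{
	int n = 0;

	while (deferred) {
		struct model_inode *inode = deferred;

		deferred = inode->next;
		inode->ops->free_inode(inode);
		n++;
	}
	return n;
}

/* Prompt step: work that may sleep and is not needed by lockless readers. */
static void demo_destroy_inode(struct model_inode *inode)
{
	inode->events[0]++;
}

/* Delayed step: release the memory lockless readers might still touch. */
static void demo_free_inode(struct model_inode *inode)
{
	inode->events[1]++;
	free(inode);
}

static const struct model_sb_ops demo_ops = {
	.destroy_inode	= demo_destroy_inode,
	.free_inode	= demo_free_inode,
};

static struct model_inode *model_new_inode(int *events)
{
	struct model_inode *inode = calloc(1, sizeof(*inode));

	if (!inode)
		return NULL;
	inode->ops = &demo_ops;
	inode->events = events;
	return inode;
}
```

After model_teardown() the prompt cleanup has run but the inode memory is still
valid; it goes away only once model_grace_period_elapsed() is called. That is
the split the rules above ask for.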
> Seeing that bpffs has the grand total of zero RCU-exposed methods (no ->d_compare(),
> no ->d_hash(), no ->permission(), no ->d_revalidate(), no ->get_link()) I would
> guess that it's the case of "have your bpf_any_put() done promptly, leave freeing
> the inode and cached symlink body RCU-delayed".
Other than bpffs there are only two instances of super_operations that have non-NULL
->destroy_inode() and NULL ->free_inode():
static const struct super_operations pipefs_ops = {
	.destroy_inode = free_inode_nonrcu,
	.statfs = simple_statfs,
};
which is fine, since pipefs inodes are not exposed to RCU pathwalk at all, and
static const struct super_operations btrfs_test_super_ops = {
	.alloc_inode = btrfs_alloc_inode,
	.destroy_inode = btrfs_test_destroy_inode,
};
which is definitely not fine, but since that thing is not exposed to regular
syscalls (only to odd internal selftests, what with not being user-mountable),
presumably it gets away with that. AFAICS, it may end up calling
	cond_resched_rwlock_write(&tree->lock);
from drop_all_extent_maps_fast(), from btrfs_drop_extent_map_range(), called
in btrfs_test_destroy_inode(). That may reschedule, which an RCU callback
must not do, so it probably needs to leave that call of
btrfs_drop_extent_map_range() in ->destroy_inode() and use their
regular btrfs_free_inode() for ->free_inode().