Re: [PATCH] nilfs2: no longer save to shadow map if the num of members is too small

From: Ryusuke Konishi

Date: Tue Mar 17 2026 - 12:52:02 EST


Hi Deepanshu and Edward,

On Wed, Mar 18, 2026 at 12:15 AM Deepanshu Kartikey wrote:
>
> Hi Edward,
>
> On Mon, 17 Mar 2026, Edward Adam Davis wrote:
>
> > The value of argv0.v_nmembs passed from userspace is 0. This prevents
> > nilfs_iget_for_gc() from being called to initialize the gcinode during
> > the execution of nilfs_ioctl_move_blocks(). Consequently, this triggers
> > a null-ptr-deref involving ii->i_assoc_inode within the subsequent call
> > sequence: nilfs_clean_segments()->nilfs_mdt_save_to_shadow_map() [1].
>
> This analysis is incorrect. The null-ptr-deref is not caused by
> nilfs_iget_for_gc() not being called. The real problem is that
> ns_dat->i_assoc_inode (the DAT inode's btree node cache) is never
> initialized at mount time.
>
> > A check for argv[0].v_nmembs has been added to nilfs_clean_segments()
> > to prevent this potential null-ptr-deref of ii->i_assoc_inode.
>
> This fixes the symptom but not the root cause. Also note that in
> the original syzkaller reproducer:
>
> argv[0].v_nmembs = 0xd = 13 > 0
>
> Your check would NOT prevent the crash with the original reproducer.
>
> The correct fix is to initialize the btnode cache eagerly in
> nilfs_dat_read() at mount time, since i_assoc_inode is only
> initialized lazily during btree operations. When
> NILFS_IOCTL_CLEAN_SEGMENTS is called before any btree operation
> has occurred, i_assoc_inode is NULL.
>
> I have already submitted this fix and syzbot confirmed it as fixed:
>
> https://lore.kernel.org/all/20260317090109.878401-1-kartikey406@xxxxxxxxx/T/
>
> Regards,
> Deepanshu Kartikey

Deepanshu's suggestion seems close to the answer, but I think there's
a slight leap in the root cause analysis.

When nilfs_dat_read() is in a b-tree configuration, it normally calls
nilfs_attach_btree_node_cache() via nilfs_read_inode_common() ->
nilfs_bmap_read() -> nilfs_btree_init().

Therefore, the problem seems to be one of the following two:
(1) nilfs_mdt_save_to_shadow_map(), called from a GC ioctl specifying
the dat, calls nilfs_copy_dirty_pages() assuming a b-tree node cache
exists, regardless of whether the DAT is direct mapping or b-tree
mapping.
(The DAT mapping method switching is not considered.)

(2) The DAT is in b-tree mapping mode, but nilfs_btree_init() is not
being called because the i_mode of the DAT inode is corrupt.

Both appear to be potential bugs, but their fixes are different.
Have you determined which of these is causing this bug?

Regards,
Ryusuke Konishi