Re: [PATCH] nilfs2: reject CLEAN_SEGMENTS ioctl with out-of-range segment numbers

From: Deepanshu Kartikey

Date: Thu Apr 30 2026 - 00:09:08 EST


On Wed, Apr 29, 2026 at 6:02 PM Ryusuke Konishi
<konishi.ryusuke@xxxxxxxxx> wrote:
>
> On Wed, Apr 29, 2026 at 10:50 AM Deepanshu Kartikey wrote:
> >
> > On Wed, Apr 29, 2026 at 12:29 AM Viacheslav Dubeyko wrote:
> > >
> > > On Tue, 2026-04-28 at 09:32 +0530, Deepanshu Kartikey wrote:
> > > > Syzbot reported a hung task in nilfs_transaction_begin() where multiple
> > > > tasks performing chmod() on a nilfs2 mount blocked for over 143 seconds
> > > > waiting to acquire ns_segctor_sem for read:
> > > >
> > > > INFO: task syz.0.17:5918 blocked for more than 143 seconds.
> > > > Call Trace:
> > > > schedule+0x164/0x360
> > > > rwsem_down_read_slowpath+0x6d9/0x940
> > > > down_read+0x99/0x2e0
> > > > nilfs_transaction_begin+0x364/0x710 fs/nilfs2/segment.c:221
> > > > nilfs_setattr+0x124/0x2c0 fs/nilfs2/inode.c:921
> > > > notify_change+0xc1a/0xf40
> > > > chmod_common+0x273/0x4a0
> > > > do_fchmodat+0x12d/0x230
> > > >
> > > > The writer holding ns_segctor_sem was a concurrent NILFS_IOCTL_CLEAN_SEGMENTS
> > > > caller, stuck inside printk while emitting per-element warnings from
> > > > nilfs_sufile_updatev():
> > > >
> > > > __nilfs_msg+0x373/0x450 fs/nilfs2/super.c:78
> > > > nilfs_sufile_updatev+0x21c/0x6d0 fs/nilfs2/sufile.c:186
> > > > nilfs_sufile_freev fs/nilfs2/sufile.h:93 [inline]
> > > > nilfs_free_segments fs/nilfs2/segment.c:1140 [inline]
> > > > nilfs_segctor_collect_blocks fs/nilfs2/segment.c:1261 [inline]
> > > > nilfs_segctor_do_construct+0x1f55/0x76c0
> > > > nilfs_clean_segments+0x3bd/0xa50
> > > > nilfs_ioctl_clean_segments fs/nilfs2/ioctl.c:922 [inline]
> > > > nilfs_ioctl+0x261f/0x2780
> > > >
> > > > The root cause is that nilfs_ioctl_clean_segments() does not validate
> > > > the user-supplied segment numbers in kbufs[4] before calling
> > > > nilfs_clean_segments(), which acquires ns_segctor_sem for write. The
> > > > range check on each segnum is performed deep inside the call chain by
> > > > nilfs_sufile_updatev(), which emits a nilfs_warn() per invalid entry
> > > > while still under the segctor lock and the sufile mi_sem. Under load
> > > > (repeated invocations across multiple mounts saturating the global
> > > > printk path), the cumulative printk latency keeps ns_segctor_sem held
> > > > long enough to trip the hung_task watchdog, blocking concurrent
> > > > operations such as chmod() that need ns_segctor_sem for read.
> > > >
> > > > Fix by validating the contents of kbufs[4] in the ioctl entry path,
> > > > before any FS-wide lock is acquired. Out-of-range segment numbers are
> > > > rejected with -EINVAL synchronously, with no work performed under
> > > > ns_segctor_sem.
> > > >
> > > > Reported-by: syzbot+62f0f99d2f2bb8e3bbd7@xxxxxxxxxxxxxxxxxxxxxxxxx
> > > > Closes: https://syzkaller.appspot.com/bug?extid=62f0f99d2f2bb8e3bbd7
> > > > Fixes: 4f6b828837b4 ("nilfs2: fix lock order reversal in nilfs_clean_segments ioctl")
> > > > Tested-by: syzbot+62f0f99d2f2bb8e3bbd7@xxxxxxxxxxxxxxxxxxxxxxxxx
> > > > Signed-off-by: Deepanshu Kartikey <kartikey406@xxxxxxxxx>
> > > > ---
> > > > fs/nilfs2/ioctl.c | 16 ++++++++++++++++
> > > > 1 file changed, 16 insertions(+)
> > > >
> > > > diff --git a/fs/nilfs2/ioctl.c b/fs/nilfs2/ioctl.c
> > > > index e0a606643e87..38822dce1839 100644
> > > > --- a/fs/nilfs2/ioctl.c
> > > > +++ b/fs/nilfs2/ioctl.c
> > > > @@ -846,6 +846,7 @@ static int nilfs_ioctl_clean_segments(struct inode *inode, struct file *filp,
> > > > struct the_nilfs *nilfs;
> > > > size_t len, nsegs;
> > > > int n, ret;
> > > > + size_t i;
> > >
> > > What about re-using the n variable? Does it make sense to introduce new one?
> > >
> > > >
> > > > if (!capable(CAP_SYS_ADMIN))
> > > > return -EPERM;
> > > > @@ -876,6 +877,21 @@ static int nilfs_ioctl_clean_segments(struct inode *inode, struct file *filp,
> > > > }
> > > > nilfs = inode->i_sb->s_fs_info;
> > > >
> > > > + /*
> > > > + * Validate segment numbers against the filesystem's segment count
> > > > + * before entering nilfs_clean_segments(), which acquires
> > > > + * ns_segctor_sem for write. Catching invalid segnums here avoids
> > > > + * holding that lock while emitting per-element diagnostics under
> > > > + * the segment constructor.
> > > > + */
> > > > + for (i = 0; i < nsegs; i++) {
> > > > + if (((__u64 *)kbufs[4])[i] >= nilfs->ns_nsegments) {
> > > > + ret = -EINVAL;
> > > > + kfree(kbufs[4]);
> > > > + goto out;
> > >
> > > Are you sure that you need to free buffer here and go to out? Maybe, we can
> > > introduce another label and to jump to kfree(kbufs[4]) at the end of method?
> > >
> > > Thanks,
> > > Slava.
> > >
> > > > + }
> > > > + }
> > > > +
> > > > for (n = 0; n < 4; n++) {
> > > > ret = -EINVAL;
> > > > if (argv[n].v_size != argsz[n])
> > >
> >
> > Thanks for the feedback. I have sent patch v2.
> >
> > Thanks
> >
> > Deepanshu Kartikey
>
> Thank you, Deepanshu, for the patch proposal.
>
> Because nilfs->ns_nsegments can be modified by nilfs_ioctl_resize(),
> we must avoid race conditions regarding this proposed fix.
>
> Therefore, it's appropriate to insert this check within the write lock
> section of nilfs->ns_segctor_sem, that is, immediately after calling
> nilfs_transaction_lock() within nilfs_clean_segments().
>
> As a coding comment, directly referencing kbufs[4] as an array is not
> very readable, so it's better to declare a variable in the local
> variable declaration section of nilfs_clean_segments() like this:
>
> size_t i, nsegs = argv[4].v_nmembs;
> __u64 *segnumv = kbufs[4];
>
> and then compare by referencing segnumv[i].
> Here, the original variable name "nsegs" is confusingly similar to
> "ns_nsegments", therefore, it would be better to simply change it to
> "n" or rename it to something that more concisely represents the
> number of segments in the array.
>
> Also, to help identify the error's cause, I recommend adding an error
> message like the following within the pre-check loop:
>
> nilfs_err(sb,
> "Segment number %llu to be freed is out of range",
> (unsigned long long)segnumv[i]);
>
> Finally, a minor comment: to avoid scattering the function's exit path
> when making corrections according to the above policy, it's better to
> add a label like the following to nilfs_clean_segments() and jump to
> it, rather than returning separately.
>
> out_unlock:
> ...
>
> bail_unlock:
> nilfs_transaction_unlock(sb);
> return err;
>
> Thanks,
> Ryusuke Konishi

Thanks for the detailed feedback, Ryusuke. I have sent the patch v3

Thanks


Deepanshu Kartikey