Re: [PATCH v3 1/2] landlock: Serialize TSYNC thread restriction

From: Justin Suess

Date: Tue Mar 03 2026 - 16:10:09 EST


On Tue, Mar 03, 2026 at 08:50:50PM +0100, Günther Noack wrote:
> Oof, thanks for catching this, Justin! 🏆
>
> I was convinced I had tried the selftests for that variant,
> but apparently I had not. Mea culpa.
>
> On Tue, Mar 03, 2026 at 11:20:10AM -0500, Justin Suess wrote:
> > On Thu, Feb 26, 2026 at 09:59:02AM +0800, Yihan Ding wrote:
> > > syzbot found a deadlock in landlock_restrict_sibling_threads().
> > > When multiple threads concurrently call landlock_restrict_self() with
> > > sibling thread restriction enabled, they can deadlock by mutually
> > > queueing task_works on each other and then blocking in kernel space
> > > (waiting for the other to finish).
> > >
> > > Fix this by serializing the TSYNC operations within the same process
> > > using the exec_update_lock. This prevents concurrent invocations
> > > from deadlocking.
> > >
> > > We use down_write_trylock() and return -ERESTARTNOINTR if the lock
> > > cannot be acquired immediately. This ensures that if a thread fails
> > > to get the lock, it will return to userspace, allowing it to process
> > > any pending TSYNC task_works from the lock holder, and then
> > > transparently restart the syscall.
> > >
> > > Fixes: 42fc7e6543f6 ("landlock: Multithreading support for landlock_restrict_self()")
> > > Reported-by: syzbot+7ea2f5e9dfd468201817@xxxxxxxxxxxxxxxxxxxxxxxxx
> > > Closes: https://syzkaller.appspot.com/bug?extid=7ea2f5e9dfd468201817
> > > Suggested-by: Günther Noack <gnoack3000@xxxxxxxxx>
> > > Signed-off-by: Yihan Ding <dingyihan@xxxxxxxxxxxxx>
> > > ---
> > > Changes in v3:
> > > - Replaced down_write_killable() with down_write_trylock() and
> > > returned -ERESTARTNOINTR to avoid a secondary deadlock caused by
> > > blocking the execution of task_works. (Caught by Günther Noack).
> > >
> > > Changes in v2:
> > > - Use down_write_killable() instead of down_write().
> > > - Split the interrupt path cleanup into a separate patch.
> > > ---
> > > security/landlock/tsync.c | 8 ++++++++
> > > 1 file changed, 8 insertions(+)
> > >
> > > diff --git a/security/landlock/tsync.c b/security/landlock/tsync.c
> > > index de01aa899751..xxxxxxxxxxxx 100644
> > > --- a/security/landlock/tsync.c
> > > +++ b/security/landlock/tsync.c
> > > @@ -447,6 +447,13 @@ int landlock_restrict_sibling_threads(const struct cred *old_cred,
> > > shared_ctx.new_cred = new_cred;
> > > shared_ctx.set_no_new_privs = task_no_new_privs(current);
> > >
> > > + /*
> > > + * Serialize concurrent TSYNC operations to prevent deadlocks
> > > + * when multiple threads call landlock_restrict_self() simultaneously.
> > > + */
> > > + if (!down_write_trylock(&current->signal->exec_update_lock))
> > > + return -ERESTARTNOINTR;
> > These two lines above introduced a test failure in tsync_test
> > completing_enablement.
> >
> > The commit that introduced the bug is 3d6327c306b3e1356ab868bf27a0854669295a4f
> > (this patch) and is currently in the mic/next branch.
> >
> > I noticed the test failure while testing an unrelated patch.
> >
> > The bug is because this code never actually yields or restarts the syscall.
> >
> > This is the test output I observed:
> >
> > [+] Running tsync_test:
> > TAP version 13
> > 1..4
> > # Starting 4 tests from 1 test cases.
> > # RUN global.single_threaded_success ...
> > # OK global.single_threaded_success
> > ok 1 global.single_threaded_success
> > # RUN global.multi_threaded_success ...
> > # OK global.multi_threaded_success
> > ok 2 global.multi_threaded_success
> > # RUN global.multi_threaded_success_despite_diverging_domains ...
> > # OK global.multi_threaded_success_despite_diverging_domains
> > ok 3 global.multi_threaded_success_despite_diverging_domains
> > # RUN global.competing_enablement ...
> > # tsync_test.c:156:competing_enablement:Expected 0 (0) == d[1].result (-1)
>
> The interesting part here is when you print out the errno that is
> returned from the syscall -- it is 513, the value of ERESTARTNOINTR!
>
> My understanding so far: Poking around in kernel/entry/common.c, it
> seems that __exit_to_user_mode_loop() calls
> arch_do_signal_or_restart() only when there is a pending signal
> (_TIF_SIGPENDING or _TIF_NOTIFY_SIGNAL). So it was possible that the
> system call returns with the (normally internal) error code
> ERESTARTNOINTR, in the case where the trylock fails, but where current
> has not received a signal from the other competing TSYNC thread yet.
>
> So with that in mind, would it work to do this?
>
> while (try-to-acquire-the-lock) {
> if (current-has-task-works-pending)
> return -ERESTARTNOINTR;
>
> cond_resched();
> }
Returning -ERESTARTNOINTR does *work* in my testing.

But really anything that would yield to the scheduler technically would!

But it smells funny to me to restart the whole syscall to wait for a
lock to become free.

Also the documentation for that code ERESTARTNOINTR is:
"System call was interrupted by a signal and will be restarted".

That clearly isn't happening here, we're not being interrupted by a
(literal) signal, we're really waiting on a semaphore.

And this lock is held by a task in this code, so why not run the pending
tasks with task_work_run so you can get out of the lock ASAP instead of
jumping through hoops back down to userspace and back?

>
> Then we could avoid calling task_work_run() directly; (I find it
> difficult to reason about the implications of calling taks_work_run()
> directly, because these task works may make assumptions about the
> context in which they are running.)
Hmm possibly? I really don't know because there aren't many examples or
docs to look at.

My thought process for picking task_work_run was like this:
1. We have to wait for a lock.
2. So how do we do something useful while waiting to avoid deadlock?
3. We can simply help the other task get through doing whatever needs to
be done to get done the lock (doing the jobs we made with task_work_add)

And the way I decided to do that was task_work_run after grepping around
for some examples in io_uring code.
>
> (Apologies for the somewhat draft-state source code; I have not had
> the time to test my theories yet; I'll fully accept it if I am wrong
> about it.)
>
> –Günther
>
>
> > # competing_enablement: Test failed
> > # FAIL global.competing_enablement
> > not ok 4 global.competing_enablement
> > # FAILED: 3 / 4 tests passed.
> >
> >
> > Brief investigation and the additions of these pr_warn lines:
> >
> >
> > diff --git a/security/landlock/syscalls.c b/security/landlock/syscalls.c
> > index 0d66a68677b7..84909232b220 100644
> > --- a/security/landlock/syscalls.c
> > +++ b/security/landlock/syscalls.c
> > @@ -574,6 +574,9 @@ SYSCALL_DEFINE2(landlock_restrict_self, const int, ruleset_fd, const __u32,
> > const int err = landlock_restrict_sibling_threads(
> > current_cred(), new_cred);
> > if (err) {
> > + pr_warn("landlock: restrict_self tsync err pid=%d tgid=%d err=%d flags=0x%x ruleset_fd=%d\n",
> > + task_pid_nr(current), task_tgid_nr(current), err,
> > + flags, ruleset_fd);
> > abort_creds(new_cred);
> > return err;
> > }
> > diff --git a/security/landlock/tsync.c b/security/landlock/tsync.c
> > index 5afc5d639b8f..deb0f0b1f081 100644
> > --- a/security/landlock/tsync.c
> > +++ b/security/landlock/tsync.c
> > @@ -489,8 +489,11 @@ int landlock_restrict_sibling_threads(const struct cred *old_cred,
> > * Serialize concurrent TSYNC operations to prevent deadlocks when multiple
> > * threads call landlock_restrict_self() simultaneously.
> > */
> > - if (!down_write_trylock(&current->signal->exec_update_lock))
> > + if (!down_write_trylock(&current->signal->exec_update_lock)) {
> > + pr_warn("landlock: tsync trylock busy pid=%d tgid=%d\n",
> > + task_pid_nr(current), task_tgid_nr(current));
> > return -ERESTARTNOINTR;
> > + }
> >
> > /*
> > * We schedule a pseudo-signal task_work for each of the calling task's
> > @@ -602,6 +605,10 @@ int landlock_restrict_sibling_threads(const struct cred *old_cred,
> >
> > tsync_works_release(&works);
> > up_write(&current->signal->exec_update_lock);
> > + if (atomic_read(&shared_ctx.preparation_error))
> > + pr_warn("landlock: tsync preparation_error pid=%d tgid=%d err=%d\n",
> > + task_pid_nr(current), task_tgid_nr(current),
> > + atomic_read(&shared_ctx.preparation_error));
> >
> > return atomic_read(&shared_ctx.preparation_error);
> > }
> >
> > Resulted in the following output:
> >
> > landlock: tsync trylock busy pid=1263 tgid=1261
> > landlock: landlock: restrict_self tsync err pid=1263 tgid=1261 err=-513 flags=0x8 ruleset_fd=6
> > # tsync_test.c:156:competing_enablement:Expected 0 (0) == d[1].result (-1)
> > # competing_enablement: Test failed
> > # FAIL global.competing_enablement
> > not ok 4 global.competing_enablement
> >
> > I have a fix that I will send as a patch.
> >
> > Kind Regards,
> > Justin Suess