Re: [PATCH v3 1/2] landlock: Serialize TSYNC thread restriction
From: Ding Yihan
Date: Tue Mar 03 2026 - 21:49:01 EST
Hi all,
Thank you Justin for catching the test failure and the thorough
investigation! And thanks Günther and Tingmao for diving into the
syscall restart mechanics.
I've evaluated both the `while` loop approach with `task_work_run()`
and the `restart_syscall()` approach. I strongly lean towards using
`restart_syscall()` as suggested by Tingmao.
As Günther pointed out earlier, executing `task_work_run()` directly
deep inside the syscall context can be risky. Task works often assume
they are running at the kernel-user boundary with a specific state.
Using `restart_syscall()` safely bounces us to that boundary, processes
the works cleanly, and restarts the syscall via standard mechanisms.
After some selftests,I will prepare the v4 patch series using `restart_syscall()`.
I will also ensure all comments are properly wrapped to 80 columns as requested
by Mickaël, and make sure to include the proper Reported-by and
Suggested-by tags for everyone's excellent input here.
Expect the v4 series shortly. Thanks again for the great collaboration!
Best regards,
Yihan Ding
在 2026/3/4 05:19, Günther Noack 写道:
> On Tue, Mar 03, 2026 at 08:38:13PM +0000, Tingmao Wang wrote:
>> On 3/3/26 19:50, Günther Noack wrote:
>>> [...]
>>> On Tue, Mar 03, 2026 at 11:20:10AM -0500, Justin Suess wrote:
>>>> On Thu, Feb 26, 2026 at 09:59:02AM +0800, Yihan Ding wrote:
>>>>> [...]
>>>>> diff --git a/security/landlock/tsync.c b/security/landlock/tsync.c
>>>>> index de01aa899751..xxxxxxxxxxxx 100644
>>>>> --- a/security/landlock/tsync.c
>>>>> +++ b/security/landlock/tsync.c
>>>>> @@ -447,6 +447,13 @@ int landlock_restrict_sibling_threads(const struct cred *old_cred,
>>>>> shared_ctx.new_cred = new_cred;
>>>>> shared_ctx.set_no_new_privs = task_no_new_privs(current);
>>>>>
>>>>> + /*
>>>>> + * Serialize concurrent TSYNC operations to prevent deadlocks
>>>>> + * when multiple threads call landlock_restrict_self() simultaneously.
>>>>> + */
>>>>> + if (!down_write_trylock(¤t->signal->exec_update_lock))
>>>>> + return -ERESTARTNOINTR;
>>>> These two lines above introduced a test failure in tsync_test
>>>> completing_enablement.
>>>>
>>>> The commit that introduced the bug is 3d6327c306b3e1356ab868bf27a0854669295a4f
>>>> (this patch) and is currently in the mic/next branch.
>>>>
>>>> I noticed the test failure while testing an unrelated patch.
>>>>
>>>> The bug is because this code never actually yields or restarts the syscall.
>>>>
>>>> This is the test output I observed:
>>>>
>>>> [+] Running tsync_test:
>>>> TAP version 13
>>>> 1..4
>>>> # Starting 4 tests from 1 test cases.
>>>> # RUN global.single_threaded_success ...
>>>> # OK global.single_threaded_success
>>>> ok 1 global.single_threaded_success
>>>> # RUN global.multi_threaded_success ...
>>>> # OK global.multi_threaded_success
>>>> ok 2 global.multi_threaded_success
>>>> # RUN global.multi_threaded_success_despite_diverging_domains ...
>>>> # OK global.multi_threaded_success_despite_diverging_domains
>>>> ok 3 global.multi_threaded_success_despite_diverging_domains
>>>> # RUN global.competing_enablement ...
>>>> # tsync_test.c:156:competing_enablement:Expected 0 (0) == d[1].result (-1)
>>>
>>> The interesting part here is when you print out the errno that is
>>> returned from the syscall -- it is 513, the value of ERESTARTNOINTR!
>>>
>>> My understanding so far: Poking around in kernel/entry/common.c, it
>>> seems that __exit_to_user_mode_loop() calls
>>> arch_do_signal_or_restart() only when there is a pending signal
>>> (_TIF_SIGPENDING or _TIF_NOTIFY_SIGNAL). So it was possible that the
>>> system call returns with the (normally internal) error code
>>> ERESTARTNOINTR, in the case where the trylock fails, but where current
>>> has not received a signal from the other competing TSYNC thread yet.
>>>
>>> So with that in mind, would it work to do this?
>>>
>>> while (try-to-acquire-the-lock) {
>>> if (current-has-task-works-pending)
>>> return -ERESTARTNOINTR;
>>>
>>> cond_resched();
>>> }
>>>
>>> Then we could avoid calling task_work_run() directly; (I find it
>>> difficult to reason about the implications of calling taks_work_run()
>>> directly, because these task works may make assumptions about the
>>> context in which they are running.)
>>
>> I've not caught up with the full discussion so might be missing some context on why RESTARTNOINTR was used here,
>> but wouldn't
>>
>> diff --git a/security/landlock/tsync.c b/security/landlock/tsync.c
>> index 950b63d23729..f695fe44e2f1 100644
>> --- a/security/landlock/tsync.c
>> +++ b/security/landlock/tsync.c
>> @@ -490,7 +490,7 @@ int landlock_restrict_sibling_threads(const struct cred *old_cred,
>> * when multiple threads call landlock_restrict_self() simultaneously.
>> */
>> if (!down_write_trylock(¤t->signal->exec_update_lock))
>> - return -ERESTARTNOINTR;
>> + return restart_syscall();
>>
>> /*
>> * We schedule a pseudo-signal task_work for each of the calling task's
>>
>> achieve what the original patch intended?
>
> Thanks, that's an excellent point!
>
> restart_syscall() (a) sets TIF_SIGPENDING and then (b) returns
> -ERESTARTNOINTR. (a) was the part that we have been missing for the
> restart to work (see discussion above). Together, (a) and (b) cause
> __exit_to_user_mode_loop() to restart the syscall. Given that this is
> offered in signal.h, this seems like a clean and more "official" way
> to do this than using the task works APIs.
>
> It also fixes the previously failing selftest (I tried).
>
> Yihan, Justin: Does that seem reasonable to you as well?
>
> –Günther
>