Re: [syzbot] [fs?] [mm?] KCSAN: data-race in bprm_execve / copy_fs (4)

From: Oleg Nesterov
Date: Sun Mar 23 2025 - 14:15:16 EST


On 03/22, Al Viro wrote:
>
> On Sat, Mar 22, 2025 at 04:55:39PM +0100, Oleg Nesterov wrote:
>
> > And this means that we just need to ensure that ->in_exec is cleared
> > before this mutex is dropped, no? Something like below?
>
> Probably should work, but I wonder if it would be cleaner to have
> ->in_exec replaced with pointer to task_struct responsible. Not
> "somebody with that fs_struct for ->fs is trying to do execve(),
> has verified that nothing outside of their threads is using this
> and had been holding ->signal->cred_guard_mutex ever since then",
> but "this is the thread that..."

Perhaps... or something else to make this "not immediately obvious"
fs->in_exec clearer.

But I guess we need something simple for -stable, so will you agree
with this fix for now? Apart from changelog/comments.

	retval = de_thread(me);
+	current->fs->in_exec = 0;
	if (retval)
		goto out;

is correct but looks confusing. See "V2" below, it clears fs->in_exec
after the "if (retval)" check.
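
To make the ordering concrete, here is a minimal userspace sketch, not
kernel code. Every name in it (model_task, model_unlock, model_exec_v2,
model_exec_old) is hypothetical; a plain bool stands in for
->signal->cred_guard_mutex and an int for fs->in_exec. It only counts how
often the guard is dropped while the flag is still set. model_exec_old
reflects my reading of the current success path and is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; these are not kernel types. */
struct model_task {
	bool guard_held;	/* models cred_guard_mutex being held */
	int in_exec;		/* models fs->in_exec */
	int violations;		/* guard dropped while in_exec was still set */
};

enum model_outcome { MODEL_OK, MODEL_FAIL_EARLY, MODEL_FAIL_DE_THREAD };

/* Drop the guard and record whether the flag was still set at that point. */
static void model_unlock(struct model_task *t)
{
	if (t->in_exec)
		t->violations++;
	t->guard_held = false;
}

/* One exec attempt with the V2 placement of the clears. */
static int model_exec_v2(struct model_task *t, enum model_outcome outcome)
{
	t->guard_held = true;	/* guard is taken before the flag is set */
	t->in_exec = 1;

	if (outcome == MODEL_FAIL_EARLY)
		goto fail;	/* exec fails before de_thread() */
	if (outcome == MODEL_FAIL_DE_THREAD)
		goto fail;	/* the "if (retval) goto out;" case */

	t->in_exec = 0;		/* V2: cleared after the retval check */
	model_unlock(t);	/* success path drops the guard afterwards */
	return 0;
fail:
	t->in_exec = 0;		/* free_bprm()-like cleanup, before the unlock */
	model_unlock(t);
	return -1;
}

/* Assumed old ordering on success: flag cleared after the guard is gone. */
static int model_exec_old(struct model_task *t)
{
	t->guard_held = true;
	t->in_exec = 1;
	model_unlock(t);	/* counted as a violation */
	t->in_exec = 0;		/* too late, the window was already open */
	return 0;
}
```

In this model all three V2 paths leave violations at zero, while the
assumed old ordering bumps it once.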

syzbot says:

	Unfortunately, I don't have any reproducer for this issue yet.

so I guess "#syz test: " is pointless right now...

Oleg.
---

diff --git a/fs/exec.c b/fs/exec.c
index 506cd411f4ac..02e8824fc9cd 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1236,6 +1236,7 @@ int begin_new_exec(struct linux_binprm * bprm)
 	if (retval)
 		goto out;

+	current->fs->in_exec = 0;
 	/*
 	 * Cancel any io_uring activity across execve
 	 */
@@ -1497,6 +1498,8 @@ static void free_bprm(struct linux_binprm *bprm)
 	}
 	free_arg_pages(bprm);
 	if (bprm->cred) {
+		// for the case exec fails before de_thread()
+		current->fs->in_exec = 0;
 		mutex_unlock(&current->signal->cred_guard_mutex);
 		abort_creds(bprm->cred);
 	}
@@ -1862,7 +1865,6 @@ static int bprm_execve(struct linux_binprm *bprm)

 	sched_mm_cid_after_execve(current);
 	/* execve succeeded */
-	current->fs->in_exec = 0;
 	current->in_execve = 0;
 	rseq_execve(current);
 	user_events_execve(current);
@@ -1881,7 +1883,6 @@ static int bprm_execve(struct linux_binprm *bprm)
 		force_fatal_sig(SIGSEGV);

 	sched_mm_cid_after_execve(current);
-	current->fs->in_exec = 0;
 	current->in_execve = 0;

 	return retval;