[PATCH RFC v2 17/23] fs: stop sharing fs_struct between init_task and pid 1

From: Christian Brauner

Date: Thu Mar 05 2026 - 18:33:01 EST


Spawn kernel_init (PID 1) via kernel_clone() directly instead of
user_mode_thread(), without CLONE_FS. This gives PID 1 its own private
copy of init_task's fs_struct rather than sharing it.

This is a prerequisite for isolating kthreads in nullfs: when
init_task's fs is later pointed at nullfs, PID 1 must not share it.
Otherwise init_userspace_fs() would modify init_task's fs as well,
defeating the isolation.

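The open-coded flags (CLONE_VM | CLONE_UNTRACED) are the ones
user_mode_thread() sets unconditionally, so dropping CLONE_FS is the
only difference. At this stage PID 1 still gets rootfs (a private copy
rather than a shared reference), so there is no functional change.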

Signed-off-by: Christian Brauner <brauner@xxxxxxxxxx>
---
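Not part of the patch: a minimal userspace sketch of the CLONE_FS
semantics the commit message relies on, using clone(2) from glibc.
The helper name child_chdir_visible() and the use of /tmp as a
starting directory are assumptions made for illustration only; the
kernel-side path in rest_init() is not modelled here. With CLONE_FS
the child shares the parent's fs_struct, so the child's chdir() moves
the parent's cwd as well. Without it the child gets a copy and the
parent's cwd stays put.

```c
#define _GNU_SOURCE
#include <limits.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define CHILD_STACK_SIZE (256 * 1024)

/* Child body: change the working directory, then exit with 0 on success. */
static int child_chdir_root(void *unused)
{
	(void)unused;
	return chdir("/") == 0 ? 0 : 1;
}

/*
 * Hypothetical helper (illustration only).
 *
 * Returns 1 if the child's chdir("/") changed the parent's cwd,
 * 0 if the parent's cwd is unchanged, and -1 on any error.
 * Pass CLONE_FS in extra_flags to share the fs_struct, or 0 to copy it.
 */
static int child_chdir_visible(int extra_flags)
{
	char before[PATH_MAX], after[PATH_MAX];
	char *stack;
	int status, ret = -1;
	pid_t pid;

	/* Start somewhere other than "/" so a change is observable. */
	if (chdir("/tmp") != 0 || !getcwd(before, sizeof(before)))
		return -1;

	stack = malloc(CHILD_STACK_SIZE);
	if (!stack)
		return -1;

	/* The stack grows down on common architectures, so pass its top. */
	pid = clone(child_chdir_root, stack + CHILD_STACK_SIZE,
		    extra_flags | SIGCHLD, NULL);
	if (pid < 0)
		goto out;
	if (waitpid(pid, &status, 0) != pid)
		goto out;
	if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
		goto out;
	if (!getcwd(after, sizeof(after)))
		goto out;

	/* A differing cwd means the child's chdir() hit our fs_struct. */
	ret = strcmp(before, after) != 0;
out:
	free(stack);
	return ret;
}
```

Calling child_chdir_visible(CLONE_FS) and child_chdir_visible(0) from
a small main() shows the two cases side by side. The second case is
the behaviour PID 1 gets relative to init_task after this patch.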
init/main.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/init/main.c b/init/main.c
index 5ccc642a5aa7..6633d4bea52b 100644
--- a/init/main.c
+++ b/init/main.c
@@ -714,6 +714,11 @@ static __initdata DECLARE_COMPLETION(kthreadd_done);

 static noinline void __ref __noreturn rest_init(void)
 {
+	struct kernel_clone_args init_args = {
+		.flags = (CLONE_VM | CLONE_UNTRACED),
+		.fn = kernel_init,
+		.fn_arg = NULL,
+	};
 	struct task_struct *tsk;
 	int pid;

@@ -723,7 +728,7 @@ static noinline void __ref __noreturn rest_init(void)
 	 * the init task will end up wanting to create kthreads, which, if
 	 * we schedule it before we create kthreadd, will OOPS.
 	 */
-	pid = user_mode_thread(kernel_init, NULL, CLONE_FS);
+	pid = kernel_clone(&init_args);
 	/*
 	 * Pin init on the boot CPU. Task migration is not properly working
 	 * until sched_init_smp() has been run. It will set the allowed

--
2.47.3