Re: [PATCH 0/7] fork: Make init and umh ordinary tasks

From: Qian Cai
Date: Mon May 09 2022 - 16:47:21 EST


On Fri, May 06, 2022 at 09:11:36AM -0500, Eric W. Biederman wrote:
>
> Commit 40966e316f86 ("kthread: Ensure struct kthread is present for
> all kthreads") caused init and the user mode helper threads that call
> kernel_execve to have struct kthread allocated for them.
>
> I believe my first patch in this series is enough to fix the bug
> and is simple enough and obvious enough to be backportable.
>
> The rest of the changes pass struct kernel_clone_args to clean things
> up and cause the code to make sense.
>
> There is one rough spot in this change. In the init process, before the
> user space init process is exec'd, there is a lot going on. I have found
> that when async_schedule_domain is low on memory or has more than 32K
> callers, do_populate_rootfs will now run in a user space thread, making
> flush_delayed_fput meaningless and __fput_sync unusable. I solved
> this as I did in usermode_driver.c, with an added explicit task_work_run.
> I point this out as I have seen some talk about making flushing file
> handles more explicit.

Reverting the last 3 commits of the series (listed below) fixed a boot
crash on next-20220509. The KASAN report and the oops follow the list.

1b2552cbdbe0 fork: Stop allowing kthreads to call execve
753550eb0ce1 fork: Explicitly set PF_KTHREAD
68d85f0a33b0 init: Deal with the init process being a user mode process

BUG: KASAN: null-ptr-deref in task_nr_scan_windows.isra.0
arch_atomic_long_read at ./include/linux/atomic/atomic-long.h:29
(inlined by) atomic_long_read at ./include/linux/atomic/atomic-instrumented.h:1266
(inlined by) get_mm_counter at ./include/linux/mm.h:1996
(inlined by) get_mm_rss at ./include/linux/mm.h:2049
(inlined by) task_nr_scan_windows at kernel/sched/fair.c:1123
Read of size 8 at addr 00000000000003d0 by task swapper/0/1

CPU: 72 PID: 1 Comm: swapper/0 Not tainted 5.18.0-rc6-next-20220509-dirty #29
Call trace:
dump_backtrace
show_stack
dump_stack_lvl
print_report
kasan_report
kasan_check_range
__kasan_check_read
task_nr_scan_windows.isra.0
task_scan_start
task_scan_min at /home/user/linux/kernel/sched/fair.c:1144
(inlined by) task_scan_start at /home/user/linux/kernel/sched/fair.c:1150
task_tick_fair
task_tick_numa at /home/user/linux/kernel/sched/fair.c:2944
(inlined by) task_tick_fair at /home/user/linux/kernel/sched/fair.c:11186
scheduler_tick
update_process_times
tick_periodic
tick_handle_periodic
arch_timer_handler_phys
handle_percpu_devid_irq
generic_handle_domain_irq
gic_handle_irq
call_on_irq_stack
do_interrupt_handler
el1_interrupt
el1h_64_irq_handler
el1h_64_irq
split_page
make_alloc_exact
alloc_pages_exact_nid
init_section_page_ext
page_ext_init
kernel_init_freeable
kernel_init
ret_from_fork
==================================================================
Disabling lock debugging due to kernel taint
Unable to handle kernel paging request at virtual address dfff80000000007a
KASAN: null-ptr-deref in range [0x00000000000003d0-0x00000000000003d7]
Mem abort info:
ESR = 0x0000000096000004
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
FSC = 0x04: level 0 translation fault
Data abort info:
ISV = 0, ISS = 0x00000004
CM = 0, WnR = 0
[dfff80000000007a] address between user and kernel address ranges
Internal error: Oops: 96000004 [#1] PREEMPT SMP
Modules linked in:
CPU: 72 PID: 1 Comm: swapper/0 Tainted: G B 5.18.0-rc6-next-20220509-dirty #29
pstate: 404000c9 (nZcv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : task_nr_scan_windows.isra.0
lr : task_nr_scan_windows.isra.0
sp : ffff800008487cb0
x29: ffff800008487cb0 x28: ffff07ff89728040 x27: 000000003bc47ee0
x26: ffff08367f088980 x25: 1fffe0fff12e525f x24: ffff07ff897292f8
x23: ffff07ff89728040 x22: 1fffe0fff12e5262 x21: 0000000000010000
x20: 00000000000003d0 x19: 0000000000000000 x18: ffffdd41783f7d1c
x17: 3d3d3d3d3d3d3d3d x16: 3d3d3d3d3d3d3d3d x15: 3d3d3d3d3d3d3d3d
x14: 3d3d3d3d3d3d3d3d x13: 746e696174206c65 x12: ffff7ba82f3b98b5
x11: 1ffffba82f3b98b4 x10: ffff7ba82f3b98b4 x9 : dfff800000000000
x8 : ffffdd4179dcc5a7 x7 : 0000000000000001 x6 : ffff7ba82f3b98b4
x5 : ffffdd4179dcc5a0 x4 : ffff7ba82f3b98b5 x3 : ffffdd4171de2b14
x2 : 0000000000000001 x1 : 000000000000007a x0 : dfff800000000000
Call trace:
task_nr_scan_windows.isra.0
task_scan_start
task_tick_fair
scheduler_tick
update_process_times
tick_periodic
tick_handle_periodic
arch_timer_handler_phys
handle_percpu_devid_irq
generic_handle_domain_irq
gic_handle_irq
call_on_irq_stack
do_interrupt_handler
el1_interrupt
el1h_64_irq_handler
el1h_64_irq
split_page
make_alloc_exact
alloc_pages_exact_nid
init_section_page_ext
page_ext_init
kernel_init_freeable
kernel_init
ret_from_fork
Code: d343fe81 d2d00000 f2fbffe0 53185eb5 (38e06820)
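
For anyone reading the numbers in the report: the faulting virtual address
dfff80000000007a is not the address the code meant to read. It looks like
the generic KASAN shadow address for 0x3d0, which is consistent with
get_mm_counter() reading a field at a small offset from a NULL mm. The
sketch below redoes that arithmetic and also splits the ESR value into the
fields printed under "Mem abort info". It is a userspace illustration only,
not kernel code. The shadow scale shift of 3 and the shadow offset (taken
from x0/x9 in the register dump) are assumptions about this particular
build, and the function and struct names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed generic KASAN parameters for this arm64 build. */
#define SHADOW_SCALE_SHIFT 3                      /* 1 shadow byte per 8 bytes */
#define SHADOW_OFFSET      0xdfff800000000000ULL  /* value seen in x0 and x9 */

/* Map an address to the shadow byte that KASAN checks before the access. */
uint64_t mem_to_shadow(uint64_t addr)
{
	return (addr >> SHADOW_SCALE_SHIFT) + SHADOW_OFFSET;
}

/* Hypothetical container for the ESR_EL1 fields shown in the oops. */
struct esr_fields {
	unsigned int ec;   /* exception class, bits [31:26] */
	unsigned int il;   /* instruction length bit, bit 25 */
	unsigned int iss;  /* instruction specific syndrome, bits [24:0] */
	unsigned int isv;  /* data abort: syndrome valid, ISS bit 24 */
	unsigned int wnr;  /* data abort: write not read, ISS bit 6 */
	unsigned int fsc;  /* data abort: fault status code, ISS bits [5:0] */
};

/* Split a raw ESR value into the fields above. */
struct esr_fields decode_esr(uint64_t esr)
{
	struct esr_fields f;

	f.ec  = (unsigned int)((esr >> 26) & 0x3f);
	f.il  = (unsigned int)((esr >> 25) & 0x1);
	f.iss = (unsigned int)(esr & 0x1ffffff);
	f.isv = (f.iss >> 24) & 0x1;
	f.wnr = (f.iss >> 6) & 0x1;
	f.fsc = f.iss & 0x3f;
	return f;
}

/* Check the helpers against the values printed in the report. */
void check_against_report(void)
{
	struct esr_fields f = decode_esr(0x96000004ULL);

	assert(mem_to_shadow(0x3d0) == 0xdfff80000000007aULL);
	assert(f.ec == 0x25);   /* DABT (current EL) */
	assert(f.il == 1);      /* IL = 32 bits */
	assert(f.fsc == 0x04);  /* level 0 translation fault */
	assert(f.wnr == 0);     /* the access was a read */
}
```

With those assumptions, 0x3d0 >> 3 is 0x7a (the value in x1), and adding
the offset gives dfff80000000007a. That address has no mapping, so the
shadow load itself takes the level 0 translation fault reported above.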