Re: [PATCH v1] io_uring/register.c: fix NULL pointer dereference in io_register_resize_rings
From: Jens Axboe
Date: Mon Mar 09 2026 - 09:11:44 EST
On 3/9/26 12:27 AM, Hao-Yu Yang wrote:
> During io_register_resize_rings execution, ctx->rings is temporarily
> set to NULL before new ring memory is allocated. If a timer interrupt
> fires during this window, the interrupt handler (via timerfd_tmrproc
> -> io_poll_wake -> __io_req_task_work_add -> io_req_local_work_add)
> attempts to access ctx->rings->sq_flags, causing race condition and
> a NULL pointer dereference.
>
> BUG: kernel NULL pointer dereference, address: 0000000000000024
> PF: supervisor read access in kernel mode
> PF: error_code(0x0000) - not-present page
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
> Call Trace:
> <IRQ>
> __io_poll_execute (io_uring/poll.c:223)
> io_poll_wake (io_uring/poll.c:426)
> __wake_up_common (kernel/sched/wait.c:109)
> __wake_up_locked_key (kernel/sched/wait.c:167)
> timerfd_tmrproc (./include/linux/spinlock.h:407 fs/timerfd.c:71 fs/timerfd.c:78)
> ? __pfx_timerfd_tmrproc (fs/timerfd.c:75)
> __hrtimer_run_queues (kernel/time/hrtimer.c:1785 kernel/time/hrtimer.c:1849)
> hrtimer_interrupt (kernel/time/hrtimer.c:1914)
> __sysvec_apic_timer_interrupt (./arch/x86/include/asm/jump_label.h:37 ./arch/x86/include/asm/trace/irq_vectors.h:40 arch/x86/kernel/apic/apic.c:1063)
> sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1056 arch/x86/kernel/apic/apic.c:1056)
> </IRQ>
> <TASK>
> asm_sysvec_apic_timer_interrupt (./arch/x86/include/asm/idtentry.h:697)
> RIP: 0010:io_register_resize_rings (io_uring/register.c:593)
> ? io_register_resize_rings (io_uring/register.c:580)
> __io_uring_register (io_uring/register.c:898)
> ? fget (fs/file.c:1114)
> __x64_sys_io_uring_register (io_uring/register.c:1026 io_uring/register.c:1001 io_uring/register.c:1001)
> x64_sys_call (arch/x86/entry/syscall_64.c:41)
> do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall)
> entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
> </TASK>
>
> Fix by using spin_lock_irq/spin_unlock_irq instead of spin_lock/spin_unlock
> in io_register_resize_rings. This disables IRQs while ctx->rings is set to
> NULL, preventing interrupt handlers from executing during the window when
> ctx->rings is NULL.
I see the your crash, but this isn't the right fix - you can't just have
that one spot turn the completion lock into an IRQ disabling one,
that'll mess up the rest of the use cases in terms of lockdep. This lock
is never grabbed from IRQ context, hence it'd be misleading too.
You probably want something ala:
mutex_lock(&ctx->mmap_lock);
spin_lock(&ctx->completion_lock();
+ local_irq_disable();
...
+ local_irq_enable();
spin_unlock(&ctx->completion_lock);
mutex_unlock(&ctx->mmap_lock);
which I think should make lockdep happy with it too.
And then I think it wants a comment on top of that local_irq_disable(),
explaining how this is meant to prevent any irq/bh from triggering
task_work additions that touch ctx->rings to set the IORING_SQ_TASKRUN
flag.
Does that makes sense? If so, please do send a v2!
--
Jens Axboe