Re: [PATCH v3] watchdog/hardlockup: Fix UAF in perf event cleanup due to migration race

From: Doug Anderson

Date: Mon Jan 26 2026 - 20:15:19 EST


Hi,

On Sun, Jan 25, 2026 at 7:30 PM Qiliang Yuan <realwujing@xxxxxxxxx> wrote:
>
> Hi Doug,
>
> Thanks for your further questions and for digging into the 4.19 vs ToT
> differences.
>
> On Sat, 24 Jan 2026 15:36:01 Doug Anderson <dianders@xxxxxxxxxxxx> wrote:
> > The part that doesn't make a lot of sense to me, though, is that v4.19
> > also doesn't have commit 930d8f8dbab9 ("watchdog/perf: adapt the
> > watchdog_perf interface for async model"), which is where we are
> > saying the problem was introduced.
> >
> > ...so in v4.19 I think:
> > * hardlockup_detector_perf_init() is only called from watchdog_nmi_probe()
> > * watchdog_nmi_probe() is only called from lockup_detector_init()
> > * lockup_detector_init() is only called from kernel_init_freeable()
> > right before smp_init()
> >
> > Thus I'm super confused about how you could have seen the problem on
> > v4.19. Maybe your v4.19 kernel has some backported patches that makes
> > this possible?
>
> You caught it! Here is the context for the differences:
>
> 1. Mainline (ToT):
> - `lockup_detector_init()` is always called before `smp_init()`
> (pre-SMP phase).
> - Risk source: The asynchronous retry path (`lockup_detector_delay_init`)
> introduced by 930d8f8dbab9, which runs in a workqueue (post-SMP)
> context and triggers the UAF.
>
> 2. openEuler (4.19/5.10):
> - Local `euler inclusion` patches moved `lockup_detector_init()` after
> `do_basic_setup()` (post-SMP phase).
> - Risk source: The initial probe occurs directly in a post-SMP
> environment, exposing the race condition.
>
> For the openEuler (4.19/5.10) kernel, the call stack looks like this:
> kernel_init()
> -> kernel_init_freeable()
> -> lockup_detector_init() <-- Called after smp_init()
> -> watchdog_nmi_probe()
> -> hardlockup_detector_perf_init()
> -> hardlockup_detector_event_create()
>
> In mainline (ToT), the initial probe (safe) call stack is:
> kernel_init()
> -> kernel_init_freeable()
> -> lockup_detector_init() <-- Called before smp_init()
> -> watchdog_hardlockup_probe()
> -> hardlockup_detector_event_create()
>
> However, the asynchronous retry mechanism (commit 930d8f8dbab9) executes the
> probe logic in a post-SMP, preemptible context.
>
> For the mainline (ToT) retry path (at risk), the call stack is:
> kworker thread
> -> process_one_work()
> -> lockup_detector_delay_init()
> -> watchdog_hardlockup_probe()
> -> hardlockup_detector_event_create()
>
> Thus, `930d8f8dbab9` remains the correct "Fixes" target for ToT.
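>
> For reference, the retry hook added by 930d8f8dbab9 looks roughly like
> this (paraphrased from memory of kernel/watchdog.c, not the literal
> source; details such as the pr_info text may differ):
>
>     static void __init lockup_detector_delay_init(struct work_struct *work)
>     {
>             if (watchdog_hardlockup_probe()) {
>                     pr_info("Delayed init of the lockup detector failed\n");
>                     return;
>             }
>             watchdog_hardlockup_available = true;
>             lockup_detector_setup();
>     }
>
>     void __init lockup_detector_retry_init(void)
>     {
>             if (allow_lockup_detector_init_retry)
>                     schedule_work(&detector_work);
>     }
>
> The key point is that schedule_work() defers the probe to a kworker,
> i.e. a post-SMP, schedulable context.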

OK, at least I'm not crazy! That does indeed explain why things seemed
so wonky...


> > OK, fair enough. ...but I'm a bit curious why nobody else saw this
> > WARN_ON(). I'm also curious if you have tested the hardlockup detector
> > on newer kernels, or if all of your work has been done on 4.19. If all
> > your work has been done on 4.19, do we need to find someone to test
> > your patch on a newer kernel and make sure it works OK? If you've
> > tested on a newer kernel, did the hardlockup detector init from the
> > kernel's early-init code, or the retry code?
>
> In newer kernels, when the probe fails initially and falls
> back to the retry workqueue (or even during early init if preemption is
> enabled), the `WARN_ON(!is_percpu_thread())` in
> `hardlockup_detector_event_create()` does indeed trigger because
> `watchdog_hardlockup_probe()` is called from a non-bound context.
>
> I have verified this patch on the openEuler 4.19 kernel. During our stress
> testing, where we start dozens of VMs simultaneously to create high resource
> contention, the UAF was consistently reproducible without this fix and is now
> confirmed resolved.
>
> The v4 patch addresses this by refactoring the creation logic to be stateless
> and adding `cpu_hotplug_disable()` to ensure the probed CPU stays alive.
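>
> In shape, the creation path in v4 becomes something like the following
> (a rough sketch of the idea only, not the literal patch):
>
>     cpu_hotplug_disable();
>     cpu = raw_smp_processor_id();
>     evt = perf_event_create_kernel_counter(attr, cpu, NULL,
>                                            watchdog_overflow_callback,
>                                            NULL);
>     /* ... install or tear down evt for this CPU ... */
>     cpu_hotplug_enable();
>
> With hotplug disabled, the CPU we probed on cannot go offline while
> the event is being created and torn down.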

OK, so I think the answer is: you haven't actually seen the problem
(or the WARN_ON) on a mainline kernel, only on the openEuler 4.19
kernel...

...actually, I looked and now think the problem doesn't exist on a
mainline kernel. Specifically, when we run lockup_detector_retry_init()
we call schedule_work() to do the work. That schedules the work on
"system_percpu_wq". While the work ends up being queued with
"WORK_CPU_UNBOUND", I believe we still end up running on a kworker
thread that's bound to just one CPU in the end. This is presumably why
nobody has reported the "WARN_ON(!is_percpu_thread())" actually firing
on mainline.
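
For context, is_percpu_thread() is roughly this (paraphrasing
include/linux/sched.h; the !CONFIG_SMP variant just returns true):

    static inline bool is_percpu_thread(void)
    {
            return (current->flags & PF_NO_SETAFFINITY) &&
                   (current->nr_cpus_allowed == 1);
    }

A kworker serving a per-CPU pool has PF_NO_SETAFFINITY set and is
bound to a single CPU, so it passes that check even when the work
item itself was queued as WORK_CPU_UNBOUND.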

Given the above, it sounds to me like the problem you're having is
with a downstream kernel and upstream is actually fine. Did I
understand that correctly?

If that's the case, we'd definitely want to at least change the
description and presumably _remove_ the Fixes tag? I actually still
think the code looks nicer after your CL and (maybe?) we could even
remove the whole schedule_work() for running this code? Maybe it was
only added to deal with this exact problem? ...but the CL description
would definitely need to be updated.


> I'll wait for your further thoughts on v4:
> https://lore.kernel.org/all/20260124070814.806828-1-realwujing@xxxxxxxxx/

Sure. At the very least the CL description would need to be updated
(assuming my understanding is correct), but for now let's avoid
forking the conversation and resolve things here?

-Doug