Re: Stalls when starting a VSOCK listening socket: soft lockups, RCU stalls, timeout

From: Matthieu Baerts

Date: Mon Mar 09 2026 - 08:34:00 EST


Hi Thomas,

On 09/03/2026 09:43, Thomas Gleixner wrote:
> On Sun, Mar 08 2026 at 18:23, Matthieu Baerts wrote:
>> 08 Mar 2026 17:58:26 Thomas Gleixner <tglx@xxxxxxxxxx>:
>>> So I'm back to square one. I go and do what I should have done in the
>>> first place. Write a debug patch with trace_printks and let the people
>>> who can actually trigger the problem run with it.
>>
>> Happy to test such debug patches!
>
> See below.
>
> Enable the tracepoints either on the kernel command line:
>
> trace_event=sched_switch,mmcid:*
>
> or before starting the test case:
>
> echo 1 >/sys/kernel/tracing/events/sched/sched_switch/enable
> echo 1 >/sys/kernel/tracing/events/mmcid/enable
>
> I added a 50ms timeout into mm_cid_get() which freezes the trace and
> emits a warning. If you enable panic_on_warn and ftrace_dump_on_oops,
> then it dumps the trace buffer once it hits the warning.
>
> Either kernel command line:
>
> panic_on_warn ftrace_dump_on_oops
>
> or
>
> echo 1 >/proc/sys/kernel/panic_on_warn
> echo 1 >/proc/sys/kernel/ftrace_dump_on_oops
>
> That should provide enough information to decode this mystery.

Thank you for the debug patch and the clear instructions. I managed to
reproduce the issue with the extra debugging. The output is available here:

https://github.com/user-attachments/files/25841808/issue-617-debug.txt.gz

Just in case, here is the kernel config file that was used:

https://github.com/user-attachments/files/25841873/issue-617-debug.config.gz

Please tell me if downloading these files from GitHub is an issue.
Note that the output file has 10k+ lines.
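
In case someone else wants to reproduce this, the runtime settings quoted
above can be applied in one go with something like the sketch below. It
only writes "1" to the four files named in the instructions. The function
name, and the TRACEFS and PROCSYS override variables, are made up for this
example and are not part of the debug patch; they default to the real
paths. The mmcid events are presumably only present with the debug patch
applied, so the function bails out if one of the files is missing.

```shell
#!/bin/sh
# Sketch only: enable the tracepoints and the panic/dump knobs at runtime.
# TRACEFS and PROCSYS are hypothetical overrides for dry runs against a
# scratch directory; by default the real tracefs and procfs paths are used.
enable_mmcid_debug() {
	tracefs="${TRACEFS:-/sys/kernel/tracing}"
	procsys="${PROCSYS:-/proc/sys/kernel}"

	for f in \
		"$tracefs/events/sched/sched_switch/enable" \
		"$tracefs/events/mmcid/enable" \
		"$procsys/panic_on_warn" \
		"$procsys/ftrace_dump_on_oops"
	do
		# A missing file most likely means the kernel lacks the event
		# or the knob, so stop instead of creating a stray file.
		if [ ! -e "$f" ]; then
			echo "missing: $f" >&2
			return 1
		fi
		echo 1 > "$f" || return 1
	done
}
```

Calling enable_mmcid_debug as root before starting the test case should
then be equivalent to the four "echo 1" lines above.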

Cheers,
Matt
--
Sponsored by the NGI0 Core fund.