Re: [BUG almost bisected] Splat in dequeue_rt_stack() and build error
From: Valentin Schneider
Date: Tue Oct 01 2024 - 08:53:43 EST
On 01/10/24 03:10, Paul E. McKenney wrote:
> On Mon, Sep 30, 2024 at 10:44:24PM +0200, Valentin Schneider wrote:
>> On 30/09/24 12:09, Paul E. McKenney wrote:
>> >
>> > And Peter asked that I send along a reproducer, which I am finally getting
>> > around to doing:
>> >
>> > tools/testing/selftests/rcutorture/bin/kvm.sh --allcpus --duration 12h --configs "100*TREE03" --trust-make
>> >
>>
>> FYI Tomas (on Cc) has been working on getting pretty much this to run on
>> our infra, no hit so far.
>>
>> How much of a pain would it be to record an ftrace trace while this runs?
>> I'm thinking sched_switch, sched_wakeup and function-tracing
>> dl_server_start() and dl_server_stop() would be a start.
>>
>> AIUI this is running under QEMU so we'd need to record the trace within
>> that, I'm guessing we can (ab)use --bootargs to feed it tracing arguments,
>> but how do we get the trace out?
>
> Me, I would change those warnings to dump the trace buffer to the
> console when triggered. Let me see if I can come up with something
> better over breakfast. And yes, there is the concern that adding tracing
> will suppress this issue.
>
> So is there some state that I could manually dump upon triggering either
> of these two warnings? That approach would minimize the probability of
> suppressing the problem.
>
Usually enabling panic_on_warn and getting a kdump is ideal, but this is
running under QEMU. I know we can get a vmcore out via dump-guest-memory in
the QEMU monitor, but I don't have an immediate way to trigger that on a
warn/panic.

Also, I'd say we're mostly interested in the sequence of events leading
up to the warn (dl_server_start() called while the DL entity is somehow
still enqueued) rather than the state of things when the warn is hit, and
for that dumping the ftrace buffer to the console sounds good enough to me.
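
As a rough sketch of what that could look like (untested against this
reproducer, so treat it as an assumption rather than a known-good recipe):
the events and functions mentioned earlier can be enabled from the kernel
command line, and ftrace_dump_on_oops together with panic_on_warn=1 should
get the buffer dumped to the console once the warn fires. This assumes the
TREE03 config has CONFIG_FUNCTION_TRACER enabled and that dl_server_start()
and dl_server_stop() are not inlined. The script only prints the command so
it can be reviewed before running.

```shell
#!/bin/sh
# Sketch only: builds the rcutorture command line with tracing boot arguments.
# Assumes CONFIG_FUNCTION_TRACER=y in the guest kernel (not verified for TREE03).

# Trace events of interest: scheduler switches and wakeups.
TRACE_ARGS="trace_event=sched:sched_switch,sched:sched_wakeup"

# Function-trace only the two DL server entry points.
TRACE_ARGS="$TRACE_ARGS ftrace=function ftrace_filter=dl_server_start,dl_server_stop"

# Turn the warning into a panic and dump the ftrace buffer to the console.
TRACE_ARGS="$TRACE_ARGS ftrace_dump_on_oops panic_on_warn=1"

# Same reproducer as above, with the tracing arguments passed via --bootargs.
KVM_CMD="tools/testing/selftests/rcutorture/bin/kvm.sh --allcpus --duration 12h --configs \"100*TREE03\" --trust-make --bootargs \"$TRACE_ARGS\""

# Print rather than execute, so the command can be checked first.
echo "$KVM_CMD"
```

The dump then ends up in each run's console log, so nothing needs to be
copied out of the guest. As Paul noted, the added tracing may well make the
problem harder to hit.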
> Thanx, Paul