Re: [BUG almost bisected] Splat in dequeue_rt_stack() and build error
From: Paul E. McKenney
Date: Wed Oct 02 2024 - 08:07:27 EST
On Wed, Oct 02, 2024 at 11:01:03AM +0200, Tomas Glozar wrote:
> On Tue, Oct 1, 2024 at 18:47, Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
> > Huh, 50MB and growing. I need to limit the buffer size as well.
> > How about "trace_buf_size=2k"? The default is 1,441,792, just
> > over 1m.
> >
> Yeah, limiting the size of the buffer is the way to go, we only need
> the last n entries before the oops.
>
> > Except that I am not getting either dl_server_start() or dl_server_stop(),
> > perhaps because they are not being invoked in this short test run.
> > So try some function that is definitely getting invoked, such as
> > rcu_sched_clock_irq().
> >
> > No joy there, either, so maybe add "ftrace=function"?
> >
> > No: "[ 1.542360] ftrace bootup tracer 'function' not registered."
> >
> Did you enable CONFIG_BOOTTIME_TRACING and CONFIG_FUNCTION_TRACER?
> They are not set in the default configuration for TREE03:
>
> $ grep -E '(FUNCTION_TRACER)|(BOOTTIME_TRACING)' \
>     ./tools/testing/selftests/rcutorture/res/2024.09.26-14.35.03/TREE03/.config
> CONFIG_HAVE_FUNCTION_TRACER=y
> # CONFIG_BOOTTIME_TRACING is not set
> # CONFIG_FUNCTION_TRACER is not set
Ah, thank you! I knew I must be forgetting something. Now a short test
gets me things like this:
[ 304.572701] torture_-190 13d.h2. 302863957us : rcu_is_cpu_rrupt_from_idle <-rcu_sched_clock_irq
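
For anyone repeating the config check quoted above, here is a minimal sketch of a shell helper that reports whether given Kconfig options are enabled in a .config file. The function name check_kconfig is hypothetical and is not part of the rcutorture scripts.

```shell
# Hypothetical helper (not part of rcutorture): report whether each named
# Kconfig option is enabled (=y or =m) in the given .config file.
check_kconfig() {
    config_file=$1
    shift
    for opt in "$@"; do
        # Anchor the pattern so CONFIG_HAVE_FUNCTION_TRACER does not
        # count as a match for FUNCTION_TRACER.
        if grep -q "^CONFIG_${opt}=[ym]\$" "$config_file"; then
            echo "CONFIG_${opt}: enabled"
        else
            echo "CONFIG_${opt}: not set"
        fi
    done
}

# Example use, with the .config path taken from the thread:
#   check_kconfig ./tools/testing/selftests/rcutorture/res/2024.09.26-14.35.03/TREE03/.config \
#       FUNCTION_TRACER BOOTTIME_TRACING
```

Unlike the unanchored grep, this does not report CONFIG_HAVE_FUNCTION_TRACER=y as a hit, since that option only says the architecture supports the tracer.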
> > Especially given that I don't have a QEMU monitor for these 100 runs.
> >
> > But if there is a way to do this programmatically from within the
> > kernel, I would be happy to give it a try.
> >
> > > Also I'd say here we're mostly interested in the sequence of events leading
> > > us to the warn (dl_server_start() when the DL entity is somehow still
> > > enqueued) rather than the state of things when the warn is hit, and for
> > > that dumping the ftrace buffer to the console sounds good enough to me.
> >
> > That I can do!!! Give or take function tracing appearing not to work
> > for me from the kernel command line. :-(
> >
> > Thanx, Paul
> >
>
> Thanks for trying to get details about the bug. See my comment above
> about the config options to enable function tracing.
I will check up on last night's run for heisenbug-evaluation purposes,
and if it did trigger, restart with this added:
--kconfigs "CONFIG_BOOTTIME_TRACING=y CONFIG_FUNCTION_TRACER=y"
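
A rough sketch of how the pieces discussed in this thread might be combined into one kvm.sh invocation. The combination is an assumption, not a tested command line: the ftrace_filter and ftrace_dump_on_oops boot parameters are not mentioned above, and ftrace_filter normally requires dynamic ftrace support in the kernel. The snippet only composes and prints the command; it does not run it.

```shell
# Sketch only: build the command line and print it, without running it.
KVM=tools/testing/selftests/rcutorture/bin/kvm.sh

# Kconfig options named in the thread as missing from TREE03.
KCONFIGS="CONFIG_BOOTTIME_TRACING=y CONFIG_FUNCTION_TRACER=y"

# Boot parameters: ftrace=function and trace_buf_size=2k come from the
# thread; ftrace_filter and ftrace_dump_on_oops are assumptions added here
# to trace only the two dl_server functions and dump the buffer on an oops.
BOOTARGS="ftrace=function ftrace_filter=dl_server_start,dl_server_stop trace_buf_size=2k ftrace_dump_on_oops"

CMD="$KVM --configs TREE03 --kconfigs \"$KCONFIGS\" --bootargs \"$BOOTARGS\""
echo "$CMD"
```

Printing the command first makes it easy to review the quoting before pasting it into a long-running test.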
> FYI I have managed to reproduce the bug on our infrastructure after 21
> hours of 7*TREE03 and I will continue with trying to reproduce it with
> the tracers we want.
Even better!!!
Thanx, Paul