Re: [PATCH v12 0/3] tracing: Centralize preemptirq tracepoints and unify their usage

From: Joel Fernandes
Date: Sat Aug 04 2018 - 00:51:44 EST


Hi Masami,

On Fri, Aug 3, 2018 at 12:23 AM, Masami Hiramatsu <mhiramat@xxxxxxxxxx> wrote:
> Hi Joel,
>
> Thank you for trying to fix that.
>
> On Thu, 2 Aug 2018 19:57:09 -0700
> Joel Fernandes <joelaf@xxxxxxxxxx> wrote:
>
>> Hi Masami,
>>
>> On Thu, Aug 2, 2018 at 7:55 AM, Masami Hiramatsu <mhiramat@xxxxxxxxxx> wrote:
>> > Hi Joel,
>> >
>> > I found this caused several issues when testing ftrace.
>> >
>> > #1) ftrace boottest (FTRACE_STARTUP_TEST) fails
>>
>> This sadly appears to be a real issue. The startup test for
>> "preemptirqsoff" tracer fails, however it passes for only preemptoff
>> or only irqsoff. I tested only the last 2 tracers, not the first one,
>> that's why I didn't catch it. I need to debug this more.
>>
>> > #2) mmiotrace reports "IRQs not enabled as expected" error
>> > #3) lock subsystem event boottest causes "IRQs not disabled as expected" error (sometimes)
>>
>> Could you try the below patch and let me know if you still see the
>> issue? In the v11 I removed the lockdep_recursing() check since it
>> seemed unnecessary. But I'd like to rule out that there's still some
>> issue lurking there. Thanks and I appreciate your help, diff is
>> attached to this email.
>
> I tried the attached patch. But unfortunately I still see #1 and #2.
> (#3 does not always occurred, I'll continue to test)
>

I found that #2 is because of CPU hotplug and tracepoint interaction.
I can reproduce the same issue if I manually offline a CPU by:
echo 0 > /sys/devices/system/cpu/cpuX/online

mmiotrace also does the same thing.

Its because the tracepoints become inactive when we hotplug. This
prevents the lockdep probes that were registered from getting called.

Basically the cond parameter in DECLARE_TRACE is whether the cpu is online:
#define DECLARE_TRACE(name, proto, args) \
__DECLARE_TRACE(name, PARAMS(proto), PARAMS(args), \
cpu_online(raw_smp_processor_id()), \
PARAMS(void *__data, proto), \
PARAMS(__data, args))

which prevents the probes from getting called.

Steve, any ideas on how we can fix this? Basically we need the
tracepoints to work until the CPU is really offline... Could you also
let me know why we need to check if CPU is online or not before
calling the tracepoint probes?

I'll try to look into #1 and #2 tomorrow as well (I'm on vacation +
weekend, so time is a bit short, but this is quite important so I'll
make time). Thanks!

- Joel