Re: [PATCH 1/2] tracepoints: Do not trace when cpu is offline
From: Steven Rostedt
Date: Tue Feb 16 2016 - 15:32:13 EST
On Tue, 16 Feb 2016 20:09:35 +0000 (UTC)
Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx> wrote:
> If I get this right, you are proposing to "hide" events happening
> during CPU hot-unplug on dying CPUs from the tracers to fix an issue
> caused by interaction of RCU-sched (used for Tracepoint synchronization)
> wrt CPU hotplug.
>
> Removing tracing visibility of hot-unplug events seems to be an unwelcome
> side-effect. I don't know how far Thomas Gleixner got in his overhaul of
> CPU hotplug, but he might have something to say about this, as I believe
> he would be the first user concerned.
>
Well, trace_printk() still works. But right now you *can't* have a
tracepoint executed on a CPU that is offline, because it is a bug.
Period. That's because we use RCU sched to protect tracepoints. When
the CPU is offline, there is no protection. It is possible that the
tracepoint structures may get corrupted, or worse, crash the system.
Granted, the race window is quite small, but it is a bug nevertheless.
Now, if you want tracepoints to be visible for CPUs that are offline,
then we need something else to protect it. But until then, this fixes
the issue.
And before this patch, we've been converting tracepoints to conditional
ones that check "if (cpu_online(raw_smp_processor_id()))" whenever a
warning appeared. This patch gets rid of the need to keep adding these
whack-a-mole patches.
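
To illustrate the idea of checking once in the common path rather than
at every tracepoint, here is a minimal userspace sketch. It is not the
kernel code and not the patch itself: every model_* name is
hypothetical, the online mask is a plain array, and model_current_cpu
merely stands in for raw_smp_processor_id().

```c
/*
 * Userspace model only. None of the model_* names are kernel APIs;
 * they are assumptions made up for this illustration.
 */
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define MODEL_NR_CPUS 4

/* Stand-in for the kernel's online CPU mask. */
static bool model_cpu_online_mask[MODEL_NR_CPUS] = { true, true, true, true };

/* Stand-in for raw_smp_processor_id(). */
static int model_current_cpu;

/* Counts how many times a probe actually ran. */
static int model_probe_hits;

typedef void (*model_probe_fn)(const char *event);

bool model_cpu_online(int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	return model_cpu_online_mask[cpu];
}

/* A probe that knows nothing about CPU hotplug. */
void model_probe(const char *event)
{
	model_probe_hits++;
	printf("cpu%d: %s\n", model_current_cpu, event);
}

/*
 * Common dispatch path: the online check lives here, once, so the
 * individual tracepoints do not each need their own condition.
 * Returns 1 if the probe ran, 0 if the event was dropped.
 */
int model_trace(model_probe_fn probe, const char *event)
{
	if (!model_cpu_online(model_current_cpu))
		return 0;

	probe(event);
	return 1;
}
```

With the check in model_trace(), an event raised on a CPU marked
offline is silently dropped, which is the visibility trade-off being
discussed in this thread.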
-- Steve