Re: [tip: core/rcu] rcu/tree: Mark the idle relevant functions noinstr

From: Peter Zijlstra
Date: Tue Sep 29 2020 - 03:25:37 EST


On Mon, Sep 28, 2020 at 05:22:33PM -0500, Kim Phillips wrote:
> On 5/19/20 2:52 PM, tip-bot2 for Thomas Gleixner wrote:
> > The following commit has been merged into the core/rcu branch of tip:
> >
> > Commit-ID: ff5c4f5cad33061b07c3fb9187506783c0f3cb66
> > Gitweb: https://git.kernel.org/tip/ff5c4f5cad33061b07c3fb9187506783c0f3cb66
> > Author: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> > AuthorDate: Fri, 13 Mar 2020 17:32:17 +01:00
> > Committer: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> > CommitterDate: Tue, 19 May 2020 15:51:20 +02:00
> >
> > rcu/tree: Mark the idle relevant functions noinstr
> >
> > These functions are invoked from context tracking and other places in the
> > low level entry code. Move them into the .noinstr.text section to exclude
> > them from instrumentation.
> >
> > Mark the places which are safe to invoke traceable functions with
> > instrumentation_begin/end() so objtool won't complain.
> >
> > Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> > Reviewed-by: Alexandre Chartre <alexandre.chartre@xxxxxxxxxx>
> > Acked-by: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> > Acked-by: Paul E. McKenney <paulmck@xxxxxxxxxx>
> > Link: https://lkml.kernel.org/r/20200505134100.575356107@xxxxxxxxxxxxx
> >
> >
> > ---
>
> I bisected a system hang condition down to this commit.

That's odd, mostly it's the lack of noinstr that causes hangs, I've never
yet seen the presence of it cause problems.

> To reproduce the hang, compile the below code and execute it as root
> on an x86_64 server (AMD or Intel). The code is opening a
> PERF_TYPE_TRACEPOINT event with a non-zero pe.config.

In my experience, it is very relevant which exact tracepoint you end up
using.

PERF_COUNT_HW_INSTRUCTIONS is a very long and tedious way of writing 1
in this case. On my randomly selected test box this morning, trace event
1 is:

$ for file in /debug/tracing/events/*/*/id ; do echo $file -- $(cat $file); done | grep " 1$"
/debug/tracing/events/ftrace/function/id -- 1

> If I revert the commit from Linus' ToT, the system stays up.

> memset(&pe, 0, sizeof(struct perf_event_attr));
> pe.type = PERF_TYPE_TRACEPOINT;
> pe.size = sizeof(struct perf_event_attr);
> pe.config = PERF_COUNT_HW_INSTRUCTIONS;
> pe.disabled = 1;

Doubly funny for not actually enabling the event...

> pe.exclude_kernel = 1;
> pe.exclude_hv = 1;

Still, it seems to make my machine unhappy.. Let's see if I can get
anything useful out of it.