Re: Instrumentation and RCU

From: Thomas Gleixner
Date: Tue Mar 10 2020 - 04:03:12 EST


Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> writes:
> In general I'm sceptical that .text annotations will work. Let's say all of
> idle is a red zone. But a ton of normal functions are called when idle. So
> objtool will go and mark them as red zone too.

No. If you carefully read what I proposed, it's:

noinstr foo()
{
	protected_work();

	instr_begin();

	invoke_otherstuff();

	instr_end();

	moar_protected_work();
}

objtool will not mark anything. It will check that invocations out of
the protected section are marked as safe, i.e. inside an
instr_begin/end() pair.

So if you fail to mark protected_work() as noinstr then it will
complain. If you forget to put instr_begin/end() around the safe area it
will complain about invoke_otherstuff().
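
As a rough illustration of that rule, here is a toy user-space model of
the check. It is only a sketch and not how objtool is implemented:
struct func, struct call_site, check_noinstr_caller() and check_foo()
are hypothetical names made up for this example, and the real tool works
on object code rather than on hand-built tables.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical model of a function: its name and whether it is noinstr. */
struct func {
	const char *name;
	bool noinstr;
};

/* Hypothetical model of one call made from inside a noinstr function. */
struct call_site {
	const struct func *callee;
	bool in_instr_region;	/* between instr_begin() and instr_end() */
};

/*
 * Check every call made by one noinstr caller. A call is accepted when
 * the callee is itself noinstr, or when the call sits inside an
 * instr_begin/end() pair. Nothing gets marked; the function only
 * reports inconsistencies and returns how many it found.
 */
int check_noinstr_caller(const struct func *caller,
			 const struct call_site *calls, size_t n)
{
	int warnings = 0;

	assert(caller != NULL && caller->noinstr);

	for (size_t i = 0; i < n; i++) {
		const struct call_site *cs = &calls[i];

		if (cs->callee->noinstr || cs->in_instr_region)
			continue;

		fprintf(stderr,
			"warning: %s() calls %s() outside instr_begin/end()\n",
			caller->name, cs->callee->name);
		warnings++;
	}

	return warnings;
}

/*
 * Model foo() from the example above. The two parameters select whether
 * protected_work() was annotated noinstr and whether the call to
 * invoke_otherstuff() was wrapped in instr_begin/end().
 */
int check_foo(bool protected_is_noinstr, bool other_is_wrapped)
{
	const struct func foo = { "foo", true };
	const struct func protected_work = { "protected_work", protected_is_noinstr };
	const struct func invoke_otherstuff = { "invoke_otherstuff", false };
	const struct func moar_protected_work = { "moar_protected_work", true };

	const struct call_site calls[] = {
		{ &protected_work, false },
		{ &invoke_otherstuff, other_is_wrapped },
		{ &moar_protected_work, false },
	};

	return check_noinstr_caller(&foo, calls,
				    sizeof(calls) / sizeof(calls[0]));
}
```

In this model check_foo(true, true) reports nothing, while dropping
either the noinstr annotation or the instr_begin/end() pair produces
exactly one warning naming the offending call.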

So it's a very targeted approach. objtool is there to verify that it's
consistent, nothing else.

> This way large percent of the
> kernel will be off limits for tracers. Which is imo not a good trade off. I
> think addressing 1 and 2 with explicit notrace/nokprobe annotations will cover
> all practical cases where people can shot themselves in a foot with a
> tracer.

That's simply wishful thinking. The discussions in the last weeks have
clearly demonstrated that this is not the case. People were truly
convinced that e.g. probing rcu_idle_exit() is safe, but it was
not. Read the thread to see how long this went on.

> I realize that there will be forever whack-a-mole game and these
> annotations will never reach 100%. I think it's a fine trade
> off. Security is never 100% either. Tracing is never going to be 100%
> safe too.

I disagree. Whack-a-mole games are horrible and have a guaranteed
high failure rate. Otherwise we would not discuss this at all.

And no, it's not a fine trade off.

If we can have technical means to prevent the wreckage, then not using
them for handwaving reasons is just violating the only sane engineering
principle:

Correctness first

I spent the last 20 years mopping up the violations of this principle.

We have to stop the "features first, performance first" and "good
enough" mentality if we want to master the ever increasing complexity of
hardware and software in the long run.

From my experience of cleaning up stuff, I can tell you that
correctness first neither hurts performance nor does it prevent
features, except those which are wrong to begin with.

As quite some people do not care about or even willfully ignore
"correctness first", we have to force them to adhere by technical means,
which spares us mopping up the mess they'd create otherwise.

And even for those who deeply care, tooling support is a great help to
prevent the accidental slip-up. I wish I could have spared myself
chasing call chains manually and then figuring out two days later that
I missed something.

Thanks,

tglx