Re: [patch part-II V2 02/13] x86/entry: Mark enter_from_user_mode() notrace and NOKPROBE

From: Thomas Gleixner
Date: Mon Mar 09 2020 - 11:41:09 EST


Frederic Weisbecker <frederic@xxxxxxxxxx> writes:

> On Sun, Mar 08, 2020 at 11:24:01PM +0100, Thomas Gleixner wrote:
>> Both the callers in the low level ASM code and __context_tracking_exit()
>> which is invoked from enter_from_user_mode() via user_exit_irqoff() are
>> marked NOKPROBE. Allowing enter_from_user_mode() to be probed is
>> inconsistent at best.
>>
>> Aside of that while function tracing per se is safe the function trace
>> entry/exit points can be used via BPF as well which is not safe to use
>> before context tracking has reached CONTEXT_KERNEL and adjusted RCU.
>>
>> Mark it notrace and NOKROBE.
>
> Ok for the NOKPROBE, also I remember from the inclusion of kprobes
> that spreading those NOKPROBE couldn't be more than some sort of best
> effort to mitigate the accidents and it's up to the user to keep some
> common sense and try to stay away from the borderline functions. The same
> would apply to breakpoints, steps, etc...
>
> Now for the BPF and function tracer, the latter has been made robust to
> deal with these fragile RCU blind spots. Probably the same requirements should be
> expected from the function tracer users. Perhaps we should have a specific
> version of __register_ftrace_function() which protects the given probes
> inside rcu_nmi_enter()? As it seems the BPF maintainer doesn't want the whole
> BPF execution path to be hammered.

Right. The problem is that as things stand e.g. for tracepoints you need
to invoke trace_foo_rcuidle() which then does the scru/rcu_irq dance
around the invocation, but then the functions attached need to be fixed
that they are not issuing rcu_read_lock() or such.

While that is halfways doable for tracepoints when you place them, the
whole function entry/exit hooks along with kprobes are even more
interesting because functions can be called from arbitrary contexts...

So to make this sane, you'd need to do:

if (!rcu_watching()) {
....
} else {
....
}

and the reverse when leaving the thing. So in the worst case you end up
with a gazillion of scru/rcu_irq pairs which really make crap slow.

So we are way better off to have well defined off limit regions and are
careful about them and then switch over ONCE and be done with it.

Thanks,

tglx