Re: Instrumentation and RCU

From: Daniel Thompson
Date: Tue Mar 10 2020 - 13:05:42 EST


On Tue, Mar 10, 2020 at 05:09:51PM +0900, Masami Hiramatsu wrote:
> Hi,
>
> On Mon, 09 Mar 2020 19:59:18 +0100
> Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>
> > >> #2) Breakpoint utilization
> > >>
> > >> As recent findings have shown, breakpoint utilization needs to be
> > >> extremely careful about not creating infinite breakpoint recursions.
> > >>
> > >> I think that's pretty much obvious, but falls into the overall
> > >> question of how to protect callchains.
> > >
> > > This is rather unique, and I agree that it's best to at least get to a point
> > > where we limit the tracing within breakpoint code. I'm fine with making
> > > rcu_nmi_exit() nokprobe too.
> >
> > Yes, the break point stuff is unique, but it has nicely demonstrated how
> > much of the code is affected by it.
>
> I see. I had followed the callchain several times, and always found new function.
> So I agree with the off-limit section idea. That is a kind of entry code section
> but more generic one. It is natural to split such sensitive code in different
> place.
>
> BTW, what about kdb stuffs? (+Cc Jason)

There is some double breakpoint detection code but IIRC this merely
retrospectively warns the user that they have hurt their system...
and it is relatively unlikely that the system would run long enough
to reach that logic.
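
To illustrate what "retrospectively" means here, below is a minimal
userspace sketch of a re-entry check. It is not the kgdb/kdb code: every
name in it (trap_handler, trap_depth, reentry_warnings,
body_with_breakpoint) is hypothetical, and it only models the idea that
the check can fire no earlier than the second entry, by which point the
nesting has already happened.

```c
#include <stdio.h>

/* Hypothetical names throughout; this is not the kgdb/kdb implementation. */
static int trap_depth;        /* number of trap handler entries still active */
static int reentry_warnings;  /* number of nested entries detected so far */

/*
 * Hypothetical trap handler. 'body' stands in for the debugger code that
 * runs while the system is stopped. If that code hits a breakpoint itself,
 * the handler is entered again before the first entry has returned.
 */
static void trap_handler(void (*body)(void))
{
	if (trap_depth > 0) {
		/* Detection is after the fact: we are already nested. */
		reentry_warnings++;
		fprintf(stderr, "warning: breakpoint hit inside trap handler\n");
		return;
	}

	trap_depth++;
	if (body)
		body();
	trap_depth--;
}

/* Simulates debugger code that contains a breakpoint of its own. */
static void body_with_breakpoint(void)
{
	trap_handler(NULL);
}
```

Calling trap_handler(body_with_breakpoint) in this sketch bumps
reentry_warnings once and returns. A real system has no such guarantee
of getting far enough to print the warning.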

For both kdb and kgdb, the main "defence" is the use case. Neither kdb
nor kgdb faces userspace (except via a SysRq, which can be disabled)
and triggering either already implies the user is not especially
concerned about things like availability guarantees since they are happy
for everything running on the system to be halted indefinitely.

Additionally, breakpoints in kgdb/kdb are not wildcarded so there is no
need to worry about a user selecting a bad pattern! Setting a breakpoint
with kgdb/kdb needs a user (who is assumed to have kernel knowledge) to
have explicitly chosen to study the dynamic behaviour of that particular
bit of code.

I'm not saying kgdb/kdb would not benefit from additional safety
barriers (they would), simply that the problem is less acute.


Daniel.