Re: [RFT PATCH 04/13] kprobes: Make optimizer delay to 1 second

From: Paul E. McKenney
Date: Wed Jan 22 2020 - 11:54:35 EST


On Wed, Jan 22, 2020 at 10:12:40PM +0900, Masami Hiramatsu wrote:
> On Wed, 22 Jan 2020 07:11:15 -0500
> Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
>
> > On Wed, 22 Jan 2020 16:23:17 +0900
> > Masami Hiramatsu <mhiramat@xxxxxxxxxx> wrote:
> >
> > > On Tue, 21 Jan 2020 19:29:05 -0500
> > > Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
> > >
> > > > On Thu, 16 Jan 2020 23:44:52 +0900
> > > > Masami Hiramatsu <mhiramat@xxxxxxxxxx> wrote:
> > > >
> > > > > Since the 5 jiffies delay for the optimizer is too
> > > > > short to wait for other probes, make it longer,
> > > > > like 1 second.
> > > >
> > > > Hi Masami,
> > > >
> > > > Can you explain more *why* 5 jiffies is too short?
> > >
> > > Yes, I had introduced this 5 jiffies delay for multiple probe registration
> > > and unregistration by tools like systemtap, which use the array-based
> > > interface to register/unregister. In that case, 5 jiffies is enough of a
> > > delay to wait for the other kprobe registrations/unregistrations.
> > >
> > > However, since perf and ftrace register/unregister probes one-by-one with
> > > an RCU synchronization interval between them, the optimizer starts before
> > > all probes have been registered/unregistered.
> > > And the optimizer holds kprobe_mutex for a while, across the RCU-tasks
> > > synchronization. Since kprobe_mutex is also needed for disabling kprobes,
> > > this also stalls probe-event disabling.
> > >
> > > Maybe 5 jiffies is enough for adding/removing a few probe events, but
> > > not enough for dozens of probe events.
> > >
> >
> > Perhaps we should have a mechanism that can detect new probes being
> > added, and just continue to delay the optimization, instead of having
> > some arbitrary delay.
>
> Yes, that is what [03/13] does :)
> Anyway, it seems that the RCU-synchronization takes more than 5 jiffies.
> And in that case, [03/13] still doesn't work. That's why I added this patch
> after that.
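
A toy user-space model of the interaction described above, as a rough
sketch only: each registration re-arms the optimizer deadline, and
registrations arrive `gap` ticks apart (standing in for the RCU
synchronization interval). None of these names exist in the kernel;
`struct optimizer_timer`, `kick_optimizer()` and `first_run_tick()` are
made up for illustration, and this is not a claim about how [03/13] is
implemented.

```c
#include <assert.h>

/* Hypothetical model of a re-armable optimizer delay; not kernel code. */
struct optimizer_timer {
	unsigned long deadline;	/* tick at which the optimizer may run */
	int armed;		/* nonzero once at least one probe kicked it */
};

/* Each (un)registration pushes the deadline out to now + delay. */
static void kick_optimizer(struct optimizer_timer *t, unsigned long now,
			   unsigned long delay)
{
	t->deadline = now + delay;
	t->armed = 1;
}

static int optimizer_should_run(const struct optimizer_timer *t,
				unsigned long now)
{
	return t->armed && now >= t->deadline;
}

/*
 * Register n probes, one every `gap` ticks starting at tick 0, and
 * return the tick at which the optimizer first runs.
 */
static unsigned long first_run_tick(unsigned int n, unsigned long gap,
				    unsigned long delay)
{
	struct optimizer_timer t = { 0, 0 };
	unsigned long now = 0;
	unsigned int registered = 0;

	assert(n > 0 && gap > 0);

	for (;;) {
		/* A probe arriving on this tick re-arms the timer first. */
		if (registered < n && now == registered * gap) {
			kick_optimizer(&t, now, delay);
			registered++;
		}
		if (optimizer_should_run(&t, now))
			return now;
		now++;
	}
}
```

In this model, first_run_tick(3, 4, 5) gives 13, so the optimizer waits
for all three probes. first_run_tick(3, 8, 5) gives 5: the optimizer
fires before the second probe arrives at tick 8, which is the case where
postponing alone does not help. With a 250-tick delay,
first_run_tick(3, 8, 250) gives 266.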

If the RCU synchronization is synchronize_rcu_tasks(), then yes, it
will often take way more than 5 jiffies. If it is synchronize_rcu(),
5 jiffies would not be unusual, especially on larger systems.
But in the case of synchronize_rcu(), one option is to instead use
synchronize_rcu_expedited(). It is not clear that the expedited variant is
really justified in this case, but I figured it might be worth mentioning.
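
For a sense of scale, the wall-clock length of 5 jiffies depends on the
tick rate. A minimal sketch, assuming the commonly configured HZ values
of 100, 250, 300 and 1000; `ticks_to_ms()` is a made-up helper for this
example, not the kernel's own conversion routine:

```c
#include <assert.h>

/*
 * Hypothetical helper: convert a tick count to whole milliseconds
 * (truncating) for a given tick rate in Hz.
 */
static unsigned long ticks_to_ms(unsigned long ticks, unsigned int hz)
{
	assert(hz > 0);
	return ticks * 1000UL / hz;
}
```

With this, 5 ticks come to 50 ms at HZ=100, 20 ms at HZ=250, 16 ms
(truncated) at HZ=300 and 5 ms at HZ=1000, whereas a 1 second delay is
HZ ticks regardless of configuration.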

Thanx, Paul