Re: [PATCH] perf/core: fix a possible deadlock scenario

From: Cong Wang
Date: Wed Jul 18 2018 - 16:21:26 EST


On Wed, Jul 18, 2018 at 1:19 AM Jiri Olsa <jolsa@xxxxxxxxxx> wrote:
>
> On Mon, Jul 16, 2018 at 02:51:01PM -0700, Cong Wang wrote:
> > hrtimer_cancel() busy-waits for the hrtimer callback to stop,
> > pretty much like del_timer_sync(). This creates a possible deadlock
> > scenario where we hold a spinlock before calling hrtimer_cancel()
> > while in trying to acquire the same spinlock in the callback.
> >
> > This kind of deadlock is already known and is catchable by lockdep,
> > like for del_timer_sync(), we can add lockdep annotations. However,
> > it is still missing for hrtimer_cancel(). (I have a WIP patch to make
> > it complete for hrtimer_cancel() but it breaks booting.)
> >
> > And there is such a deadlock scenario in kernel/events/core.c too,
> > well actually, it is a simpler version: the hrtimer callback waits
> > for itself to finish on the same CPU! It sounds stupid but it is
> > not obvious at all, it hides very deeply in the perf event code:
> >
> > cpu_clock_event_init():
> > perf_swevent_init_hrtimer():
> > hwc->hrtimer.function = perf_swevent_hrtimer;
> >
> > perf_swevent_hrtimer():
> > __perf_event_overflow():
> > __perf_event_account_interrupt():
> > perf_adjust_period():
> > pmu->stop():
> > cpu_clock_event_stop():
> > perf_swevent_cancel():
> > hrtimer_cancel()
> >
> > Getting stuck in a timer doesn't sound very scary, however, in this
>
> sound scary enough for me ;-) were you able to hit it?

Yeah, we have seen hundreds of soft lockup's in smp function call
which is probably caused by this bug. It dates back to 3.10 kernel,
seriously. :) Unfortunately we don't have a reproducer, so we can't
verify if this patch really fixes those soft lockup's until we deploy
the fix to probably thousands of machines in data center.


>
> > case, its consequences are a disaster:
> >
> > perf_event_overflow() which calls __perf_event_overflow() is called
> > in NMI handler too, so it is racy with hrtimer callback as disabling
> > IRQ can't possibly disable NMI. This means this hrtimer callback
> > once interrupted by an NMI handler could deadlock within NMI!
>
> hum, the swevent pmu does not triger NMI, so that timer
> will never be touched in NMI context

Good to know! So I worry too much here.

But still, given hrtimer interrupt is called with IRQ disabled, getting
stuck in a hrtimer callback could also block IPI handler, therefore
causing soft lockups.

Let me know if you want me to adjust the changelog for this.

Thanks!