Re: [PATCH v9 0/7] arm64: Add debug IPI for backtraces / kgdb; try to use NMI for it

From: Sumit Garg
Date: Mon Aug 07 2023 - 08:47:09 EST


On Mon, 7 Aug 2023 at 16:11, Mark Rutland <mark.rutland@xxxxxxx> wrote:
>
> Hi Doug,
>
> Apologies for the delay.
>
> On Mon, Jul 24, 2023 at 08:55:44AM -0700, Doug Anderson wrote:
> > On Thu, Jun 1, 2023 at 2:37 PM Douglas Anderson <dianders@xxxxxxxxxxxx> wrote:
> > I'm looking for some ideas on what to do to move this patch series
> > forward. Thanks to Daniel, the kgdb patch is now in Linus's tree which
> > hopefully makes this simpler to land. I guess there is still the
> > irqchip dependency that will need to be sorted out, though...
> >
> > Even if folks aren't in agreement about whether this is ready to be
> > enabled in production, I don't think anything here is super
> > objectionable or controversial, is it? Can we land it? If you feel
> > like it needs extra review, would it help if I tried to drum up some
> > extra people to provide review feedback?
>
> Ignoring the soundness issues I mentioned before (which I'm slowly chipping
> away at, and you're likely lucky enough to avoid in practice)...
>
> Having looked over the series, I think the GICv3 bit isn't quite right, but is
> easy enough to fix. I've commented on the patch with what I think we should
> have there.

Thanks for catching this; I agree with your proposed fix.

>
> The only major thing otherwise from my PoV is the structure of the debug IPI
> framework. I'm not keen on that being a separate body of code and I think it
> should live in smp.c along with the other IPIs.

That's a fair point.

> I'd also strongly prefer if we
> could have separate IPI_CPU_BACKTRACE and IPI_CPU_KGDB IPIs,

With the current logic of a single debug IPI, a user does not need to
enable KGDB in order to use that IPI for backtraces. The original
motivation for this logic was that IPIs are a scarce resource on
arm64, as per comments from Marc. So I am fine either way, whether we
keep them separate or unified.

> and I think we can
> do that either by unifying IPI_CPU_STOP && IPI_CPU_CRASH_STOP or by reclaiming
> IPI_WAKEUP by reusing a different IPI for the parking protocol (e.g.
> IPI_RESCHEDULE).

That sounds like a good cleanup.

>
> I think it'd be nice if the series could enable NMIs for backtrace and the
> CPU_{,CRASH_}STOP cases, with KGDB being the bonus atop. That way it'd be
> clearly beneficial for anyone trying to debug lockups even if they're not a
> KGDB user.
>

It's good to see other use-cases of IPIs turned into NMIs.

-Sumit

> > Also: in case it's interesting to anyone, I've been doing benchmarks
> > on sc7180-trogdor devices in preparation for enabling this. On that
> > platform, I did manage to see about 4% reduction in a set of hackbench
> > numbers when fully enabling pseudo-NMI. However, when I instead ran
> > Speedometer 2.1 I saw no difference. See:
> >
> > https://issuetracker.google.com/issues/197061987
>
> Thanks for the pointer!
>
> I know that there are a couple of things that we could do to slightly improve
> local_irq_*() when using pNMIs, though I suspect that the bulk of the cost
> there will come from the necessary synchronization.
>
> Thanks,
> Mark.