Re: [PATCH v3 1/2] nmi_backtrace: Allow excluding an arbitrary CPU

From: Michal Hocko
Date: Fri Aug 04 2023 - 11:02:38 EST


On Fri 04-08-23 06:56:51, Doug Anderson wrote:
> Hi,
>
> On Fri, Aug 4, 2023 at 12:50 AM Michal Hocko <mhocko@xxxxxxxx> wrote:
> >
> > On Thu 03-08-23 16:07:57, Douglas Anderson wrote:
> > > The APIs that allow backtracing across CPUs have always had a way to
> > > exclude the current CPU. This convenience means callers didn't need to
> > > find a place to allocate a CPU mask just to handle the common case.
> > >
> > > Let's extend the API to take a CPU ID to exclude instead of just a
> > > boolean. This isn't any more complex for the API to handle and allows
> > > the hardlockup detector to exclude a different CPU (the one it already
> > > did a trace for) without needing to find space for a CPU mask.
> > >
> > > Arguably, this new API also encourages safer behavior. Specifically if
> > > the caller wants to avoid tracing the current CPU (maybe because they
> > > already traced the current CPU) this makes it more obvious to the
> > > caller that they need to make sure that the current CPU ID can't
> > > change.
> >
> > Yes, this looks like the best way forward.
> >
> > It would have been slightly safer to modify arch_trigger_cpumask_backtrace
> > by switching arguments so that some leftovers are captured easier.
>
> I'm not sure I understand. Oh, you're saying make the prototype of
> arch_trigger_cpumask_backtrace() incompatible so that if someone is
> directly calling it then it'll be a compile-time error?

Exactly. A bool-to-int promotion would be too easy to miss, while passing
a pointer where an int is expected would make the compiler complain loudly.
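
To illustrate, here is a minimal userspace sketch, not the kernel code.
The names trace_mask(), trace_mask_swapped(), NO_CPU and the plain
unsigned long standing in for a CPU mask are all hypothetical; only the
bool-versus-int argument issue is taken from the discussion above.

```c
#include <assert.h>
#include <stdbool.h>

#define NO_CPU (-1)	/* hypothetical "exclude nothing" value */

/* Hypothetical new-style prototype: mask first, then the CPU ID to exclude. */
unsigned long trace_mask(const unsigned long *mask, int exclude_cpu)
{
	unsigned long m = *mask;

	if (exclude_cpu >= 0)
		m &= ~(1UL << exclude_cpu);
	return m;	/* bitmask of CPUs that would be backtraced */
}

/* Same behavior, but with the two arguments swapped. */
unsigned long trace_mask_swapped(int exclude_cpu, const unsigned long *mask)
{
	return trace_mask(mask, exclude_cpu);
}

void demo(void)
{
	unsigned long all = 0xffUL;	/* CPUs 0-7 */
	bool exclude_self = true;	/* leftover from an old bool-style caller */
	int this_cpu = 3;		/* pretend we are running on CPU 3 */

	/* Stale caller: compiles quietly, true becomes 1, so CPU 1 is dropped. */
	assert(trace_mask(&all, exclude_self) == 0xfdUL);

	/* What the caller actually meant: drop the current CPU. */
	assert(trace_mask(&all, this_cpu) == 0xf7UL);

	/*
	 * With swapped arguments the stale call no longer slips through:
	 *     trace_mask_swapped(&all, exclude_self);
	 * passes a pointer where an int is expected, which compilers diagnose.
	 */
	assert(trace_mask_swapped(this_cpu, &all) == 0xf7UL);
	assert(trace_mask_swapped(NO_CPU, &all) == 0xffUL);
}
```

The stale call in the first assert builds without a warning and silently
excludes the wrong CPU, which is the kind of leftover a swapped prototype
would turn into a build-time diagnostic.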

> I guess the
> hope is that nobody is calling that directly and they're calling
> through the trigger_...() functions.

Hope is one thing, being preventive another.

> For now I'm going to leave this alone.

If you are going to send another version, then please consider this. It is
not a hard requirement, but it would be better.


--
Michal Hocko
SUSE Labs