RE: [PATCH v1] platform/x86: intel_pmc_core: promote S0ix failure warn() to WARN()

From: Limonciello, Mario
Date: Thu Oct 27 2022 - 11:47:39 EST


[Public]

+Shyam & Raul

> -----Original Message-----
> From: Hans de Goede <hdegoede@xxxxxxxxxx>
> Sent: Thursday, October 27, 2022 10:41
> To: Sven van Ashbrook <svenva@xxxxxxxxxxxx>; LKML <linux-
> kernel@xxxxxxxxxxxxxxx>
> Cc: platform-driver-x86@xxxxxxxxxxxxxxx; Rajneesh Bhardwaj
> <rajneesh.bhardwaj@xxxxxxxxx>; Rafael J Wysocki <rjw@xxxxxxxxxxxxx>;
> Rajat Jain <rajatja@xxxxxxxxxx>; David E Box <david.e.box@xxxxxxxxx>;
> Mark Gross <markgross@xxxxxxxxxx>; Rajneesh Bhardwaj
> <irenic.rajneesh@xxxxxxxxx>
> Subject: Re: [PATCH v1] platform/x86: intel_pmc_core: promote S0ix failure
> warn() to WARN()
>
> Hi,
>
> On 10/27/22 17:19, Sven van Ashbrook wrote:
> > The "failure to enter S0ix" warning is critically important for monitoring
> > and debugging power regressions, both in the field and in the test lab.
> >
> > Promote from lower-case warn() to upper-case WARN() so that it becomes
> > more prominent, and gets picked up as part of existing monitoring
> > infrastructure, which typically focuses on WARN() and ignores warn()
> > type log messages.
> >
> > Signed-off-by: Sven van Ashbrook <svenva@xxxxxxxxxxxx>
>
> WARN() is really only intended for internal kernel bugs and not for
> hw misbehaving, so I'm not a fan of the change you are suggesting here.
>
> Intel folks, do you have an opinion on this ?

For a reference point; on AMD's implementation of a similar driver (platform/x86/amd/pmc.c)
this "type" of message is also "dev_warn":
https://github.com/torvalds/linux/blob/master/drivers/platform/x86/amd/pmc.c#L365

If we do make changes to this message level so that other infrastructure picks up I suggest
we do it for both drivers.

Are we maybe at the point now it should be dev_err instead?

>
> Regards,
>
> Hans
>
>
> > ---
> > Against v6.1-rc2
> >
> > drivers/platform/x86/intel/pmc/core.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/platform/x86/intel/pmc/core.c
> b/drivers/platform/x86/intel/pmc/core.c
> > index a1fe1e0dcf4a5..834f0352c0edf 100644
> > --- a/drivers/platform/x86/intel/pmc/core.c
> > +++ b/drivers/platform/x86/intel/pmc/core.c
> > @@ -2125,7 +2125,7 @@ static __maybe_unused int
> pmc_core_resume(struct device *dev)
> > }
> >
> > /* The real interesting case - S0ix failed - lets ask PMC why. */
> > - dev_warn(dev, "CPU did not enter SLP_S0!!! (S0ix cnt=%llu)\n",
> > + dev_WARN(dev, "CPU did not enter SLP_S0!!! (S0ix cnt=%llu)\n",
> > pmcdev->s0ix_counter);
> > if (pmcdev->map->slps0_dbg_maps)
> > pmc_core_slps0_display(pmcdev, dev, NULL);