Re: [PATCH 0/4] coretemp: Fix spamming of ring buffer

From: Zhang, Rui
Date: Wed Mar 29 2023 - 10:13:14 EST


On Wed, 2023-03-29 at 05:47 -0700, Guenter Roeck wrote:
> On Wed, Mar 29, 2023 at 03:43:58AM +0000, Zhang, Rui wrote:
> > Hi, Peter,
> >
> > CC the list.
> >
> > On Tue, 2023-03-28 at 22:37 +0200, Peter Ganzhorn wrote:
> > > Dear Mr. Rui,
> > > Dear Mr. Roeck,
> > >
> > > please consider accepting the attached patches or
> > > modifying the coretemp code to stop spamming my syslog.
> > > I would appreciate it very much if you can accept the patches.
> > >
> > > coretemp: Improve dynamic changes of TjMax
> > > After introduction of dynamic TjMax changes in commit
> > > c0c67f8761cec1fe36c21d85b1a5400ea7ac30cd
> > > my syslog gets spammed with "TjMax is ... degrees C"
> > > messages.
> > > If TjMax is subject to change at any time, it won't be
> > > set in tdata anymore and re-read every time from MSR.
> > > This causes quite a lot of dev_dbg() messages to be issued.
> > >
> > > The following patches change the code to read TjMax
> > > from the MSRs into tdata->tjmax (again) but allow for a
> > > dynamic update at any time as well. (Patches 1 and 2)
> > > This way a message will only be issued after actual changes.
> > > Also I replaced the dev_dbg() with dev_notice (Patch 3) and
> > > added a additional dev_notice for the case where TjMax is
> > > set based on assumptions. (Patch 4)
> > >
> > >
> > > If you do not want to accept my patches, removing the
> > > dev_dbg() in get_tjmax() would be the most simple
> > > solution I guess.
> > >
> > Please check if below patch solves your problem or not.
> >
> > From 9370ee5163a85f65230b5222f1f4dece59ce078a Mon Sep 17 00:00:00
> > 2001
> > From: Zhang Rui <rui.zhang@xxxxxxxxx>
> > Date: Wed, 29 Mar 2023 11:35:18 +0800
> > Subject: [PATCH] hwmon: (coretemp) Avoid duplicate debug messages
> >
> > Avoid duplicate dev_dbg messages when tjmax value retrieved from
> > MSR
> > does not change.
> >
> > Signed-off-by: Zhang Rui <rui.zhang@xxxxxxxxx>
> > ---
> > drivers/hwmon/coretemp.c | 6 +++++-
> > 1 file changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
> > index 30d77f451937..809456967b50 100644
> > --- a/drivers/hwmon/coretemp.c
> > +++ b/drivers/hwmon/coretemp.c
> > @@ -267,6 +267,7 @@ static int get_tjmax(struct temp_data *tdata,
> > struct device *dev)
> > int err;
> > u32 eax, edx;
> > u32 val;
> > + static u32 tjmax;
>
> That would apply to every instance of this driver, meaning to every
> CPU core. Is that really appropriate ?
>
> Guenter
>
Good point.

MSR_IA32_TEMPERATURE_TARGET is package scope, and the cached tjmax
should also be package scope, or else this message is shown for each
cpu when tjmax changes in one package.

Previously, the message is printed only once during driver probing time
thus I'm wondering how useful this is.

Maybe we can just delete it?

thanks,
rui