Re: [PATCH 1/2] x86, mce, therm_throt: Optimize logging of thermal throttle messages

From: Srinivas Pandruvada
Date: Tue Oct 15 2019 - 10:01:50 EST


On Tue, 2019-10-15 at 10:46 +0200, Borislav Petkov wrote:
> On Mon, Oct 14, 2019 at 03:41:38PM -0700, Srinivas Pandruvada wrote:
> > So some users who had issues in their systems can try with this
> > patch.
> > We can drop this until it becomes a real issue.
>
> We don't add command line parameters which we maybe can get rid of
> later.
I am saying the same thing.
We will not add a command line parameter until this becomes a problem.

Thanks,
Srinivas

>
> > The temperature is a function of load, time and heat dissipation
> > capacity of the system. I have to think more about this to come up
> > with some heuristics where we still warn users about real thermal
> > issues. Since the value is not persistent, the next boot will again
> > start from the default.
>
> Yes, and the fact that each machine's temperature is influenced by the
> specific *individual* environment and load the machine runs, shows that
> you need to adjust this timeout automatically and dynamically.
>
> With the command line parameter you're basically putting the onus on
> the user to do that which is just silly. And then she'd need to do it
> during runtime too, if the ambient temperature or machine load, etc,
> changes.
>
> The whole thing is crying "dynamic".
>
> For a simple example, see mce_timer_fn() where we switch to polling
> during CMCI storms.
>
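
For illustration, the kind of dynamic adjustment referred to above can be
sketched in user-space C. As I understand it, mce_timer_fn() shortens its
polling interval when an event was logged and lengthens it when nothing was
seen, within fixed bounds. This is only a sketch under assumptions:
adapt_interval(), MIN_INTERVAL_MS and MAX_INTERVAL_MS are hypothetical names
and values, not kernel code, and this is not the proposed thermal throttle
change.

```c
/* Hypothetical bounds in milliseconds; chosen for illustration only. */
#define MIN_INTERVAL_MS 10UL
#define MAX_INTERVAL_MS 300000UL

/*
 * Halve the interval when the last check saw an event, double it when it
 * did not, and clamp the result to [MIN_INTERVAL_MS, MAX_INTERVAL_MS].
 * The caller is assumed to pass an interval that is already within bounds.
 */
static unsigned long adapt_interval(unsigned long iv, int event_seen)
{
	if (event_seen) {
		iv /= 2;
		if (iv < MIN_INTERVAL_MS)
			iv = MIN_INTERVAL_MS;
	} else {
		iv *= 2;
		if (iv > MAX_INTERVAL_MS)
			iv = MAX_INTERVAL_MS;
	}
	return iv;
}
```

Applied to the throttle message timeout, a scheme like this would let the
interval follow the machine's actual behaviour instead of asking the user to
pick a value on the command line.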