Re: [PATCH] x86/mce: Lower throttling MCE messages to warnings

From: Hans de Goede
Date: Wed Oct 09 2019 - 11:57:41 EST


Hi,

On 09-10-2019 17:54, Benjamin Berg wrote:
On modern CPUs it is quite normal that the temperature limits are
reached and the CPU is throttled. In fact, often the thermal design is
not sufficient to cool the CPU at full load and limits can quickly be
reached when a burst in load happens. This will even happen with
technologies like RAPL limitting the long term power consumption of
the package.

So these messages do not usually indicate a hardware issue (e.g.
insufficient cooling). Log them as warnings to avoid confusion about
their severity.

Signed-off-by: Benjamin Berg <bberg@xxxxxxxxxx>
Tested-by: Christian Kellner <ckellner@xxxxxxxxxx>

Ah, yes lets please lower the log-prio of these messages:

Reviewed-by: Hans de Goede <hdegoede@xxxxxxxxxx>

Regards,

Hans



---
arch/x86/kernel/cpu/mce/therm_throt.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/mce/therm_throt.c b/arch/x86/kernel/cpu/mce/therm_throt.c
index 6e2becf547c5..bc441d68d060 100644
--- a/arch/x86/kernel/cpu/mce/therm_throt.c
+++ b/arch/x86/kernel/cpu/mce/therm_throt.c
@@ -188,7 +188,7 @@ static void therm_throt_process(bool new_event, int event, int level)
/* if we just entered the thermal event */
if (new_event) {
if (event == THERMAL_THROTTLING_EVENT)
- pr_crit("CPU%d: %s temperature above threshold, cpu clock throttled (total events = %lu)\n",
+ pr_warn("CPU%d: %s temperature above threshold, cpu clock throttled (total events = %lu)\n",
this_cpu,
level == CORE_LEVEL ? "Core" : "Package",
state->count);