Re: [PATCH v5 3/6] cpufreq: powernv: Register for OCC related opal_message notification

From: Stewart Smith
Date: Sun Aug 09 2015 - 21:41:22 EST


Shilpasri G Bhat <shilpa.bhat@xxxxxxxxxxxxxxxxxx> writes:
> diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
> index d0c18c9..a634199 100644
> --- a/drivers/cpufreq/powernv-cpufreq.c
> +++ b/drivers/cpufreq/powernv-cpufreq.c
> @@ -33,6 +33,7 @@
> #include <asm/firmware.h>
> #include <asm/reg.h>
> #include <asm/smp.h> /* Required for cpu_sibling_mask() in UP configs */
> +#include <asm/opal.h>
>
> #define POWERNV_MAX_PSTATES 256
> #define PMSR_PSAFE_ENABLE (1UL << 30)
> @@ -41,7 +42,7 @@
> #define PMSR_LP(x) ((x >> 48) & 0xFF)
>
> static struct cpufreq_frequency_table powernv_freqs[POWERNV_MAX_PSTATES+1];
> -static bool rebooting, throttled;
> +static bool rebooting, throttled, occ_reset;
>
> static struct chip {
> unsigned int id;
> @@ -414,6 +415,74 @@ static struct notifier_block powernv_cpufreq_reboot_nb = {
> .notifier_call = powernv_cpufreq_reboot_notifier,
> };
>
> +static char throttle_reason[][30] = {
> + "No throttling",
> + "Power Cap",
> + "Processor Over Temperature",
> + "Power Supply Failure",
> + "Over Current",
> + "OCC Reset"
> + };
> +
> +static int powernv_cpufreq_occ_msg(struct notifier_block *nb,
> + unsigned long msg_type, void *_msg)
> +{
> + struct opal_msg *msg = _msg;
> + struct opal_occ_msg omsg;
> +
> + if (msg_type != OPAL_MSG_OCC)
> + return 0;
> +
> + omsg.type = be64_to_cpu(msg->params[0]);
> +
> + switch (omsg.type) {
> + case OCC_RESET:
> + occ_reset = true;
> + /*
> + * powernv_cpufreq_throttle_check() is called in
> + * target() callback which can detect the throttle state
> + * for governors like ondemand.
> + * But static governors will not call target() often thus
> + * report throttling here.
> + */
> + if (!throttled) {
> + throttled = true;
> + pr_crit("CPU Frequency is throttled\n");
> + }
> + pr_info("OCC: Reset\n");
> + break;
> + case OCC_LOAD:
> + pr_info("OCC: Loaded\n");
> + break;

I wonder if we could make the log messages a bit clearer here. Odds are
that unless you're one of the people reading this code, you have no idea
what an OCC is, what on earth "OCC: Loaded" means, or why it *doesn't*
mean that your CPUs are no longer throttled (the throttling being what
keeps your computer from catching fire/breaking/adding 1+1 and getting 4).

Also, do we export this information via sysfs somewhere? It seems like
it belongs alongside the other cpufreq/cpu info there.

It feels like we could do much better at informing users about what is
going on. Maybe something like:

"OCC (On Chip Controller - enforces hard thermal/power limits) Resetting: CPU frequency throttled for duration"
"OCC Loading, CPU frequency throttled until OCC started"
"OCC Active, CPU frequency no longer throttled"
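
To make the suggestion concrete, here is a rough userspace sketch of one way
to keep those strings together in a single lookup, so the notifier would only
have to print whatever it gets back. The enum and the function name are
hypothetical stand-ins for illustration, not the kernel's definitions, and
which OCC message actually corresponds to "Active" is an assumption, since the
quoted hunk only shows the reset and load cases:

```c
#include <stddef.h>

/* Hypothetical stand-ins for the OCC message types; NOT the kernel's enum. */
enum occ_event {
	EV_OCC_RESET,
	EV_OCC_LOAD,
	EV_OCC_ACTIVE	/* assumed name for "OCC is running again" */
};

/*
 * Map an OCC event to a message that still makes sense to someone who
 * has never heard of the OCC. Returns NULL for an unknown event.
 */
static const char *occ_event_text(enum occ_event ev)
{
	switch (ev) {
	case EV_OCC_RESET:
		return "OCC (On Chip Controller - enforces hard thermal/power limits) "
		       "Resetting: CPU frequency throttled for duration";
	case EV_OCC_LOAD:
		return "OCC Loading, CPU frequency throttled until OCC started";
	case EV_OCC_ACTIVE:
		return "OCC Active, CPU frequency no longer throttled";
	}
	return NULL;
}
```

In the driver itself each case would presumably just hand the returned string
to pr_info() or pr_crit(), picking the level per event the way the patch
already does.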
