Re: [PATCH v4] acpi: Issue _OSC call for native thermal interrupt handling

From: Linda Knippers
Date: Thu Mar 17 2016 - 19:45:05 EST




On 3/17/2016 5:12 PM, Srinivas Pandruvada wrote:
<snip>
>>>>> This needs to be done
>>>>> before SMM code path looks for _OSC capabilities. The bit 12 of
>>>>> _OSC in processor scope defines whether OS will handle thermal
>>>>> interrupts.
>>>>> When bit 12 is set to 1, OS will handle thermal interrupts.
>>>>> Refer to this document for details on _OSC
>>>>> http://www.intel.com/content/www/us/en/standards/processor-vend
>>>>> or-
>>>>> specific-acpi-specification.html
>>>> Where is bit 12 documented?
>>>>
>>> In the above document.
>> When I look at that document, I see bit 12 described as
>> "If set, OSPM supports native interrupt handling for Collaborative
>> Processor
>> Performance Control notifications." Is that the same thing or am
>> I looking at the wrong table?
> Yes. If you look at section 14.4 in Intel SDM, you will see that
> "HWP is an implementation of the ACPI-defined Collaborative Processor
> Performance Control (CPPC)". Section 14.4.5 also specifies that HWP
> uses IA32_THERM_STATUS to communicate if there are notifications, which
> is notified via thermal interrupt.

Ok, thanks. That wasn't clear from the commit message. It
sounded like bit 12 directly indicated that the OS will handle
thermal interrupts but it's a bit more indirect than that.

> You asked above if platform can handle these notification in SMM only.
> If you do then the notification will arrive as ACPI notifications. We
> don't have support for such notifications in Linux yet.

What I meant to ask was if the platform can disregard the _OSC information
and handle thermal events on it's own, without OS involvement.
For example, servers typically don't want to rely on the OS to manage
thermal issues.

<snip>

>>>>> This change introduces a new function
>>>>> acpi_early_processor_set_osc(),
>>>>> which walks acpi name space and finds acpi processor object and
>>>>> set capability via _OSC method to take over thermal LVT.
>>>> Does this change just affect Skylake platforms or all platforms?
>>> Any platform which has Intel  Speed Shift Technology (aka HWP)
>>> feature present and enabled.

Could this be an unexpected change in behavior for platforms
with HWP that don't have this bug, assuming they would look at
the _OSC CPPP bit? That's actually my main concern here.

-- ljk

>>>
>>> Thanks,
>>> Srinivas
>>>
>>>>
>>>>
>>>> -- ljk
>>>>>
>>>>>
>>>>>
>>>>> Also this change writes HWP status bits to 0 to clear any HWP
>>>>> status
>>>>> bits in intel_thermal_interrupt().
>>>>>
>>>>> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@xxxxxxx
>>>>> ntel
>>>>> .com>
>>>>> ---
>>>>> v4:
>>>>> Suggestion by Boris for removing use of intermediate variable
>>>>> for
>>>>> clearing HWP status and using boot_cpu_has instead of
>>>>> static_cpu_has
>>>>>
>>>>> v3:
>>>>> - Added CONFIG_X86 around static_cpu_has() to fix compile error
>>>>> on
>>>>> ARCH=ia64, reported by kbuild test robot
>>>>> - return AE_CTRL_TERMINATE to terminate acpi name walk space,
>>>>> when
>>>>> _OSC
>>>>> is set already once.
>>>>> v2:
>>>>> Unnecessary newline was introduced, removed that in
>>>>> acpi_processor.c
>>>>>
>>>>> arch/x86/kernel/cpu/mcheck/therm_throt.c | 3 ++
>>>>> drivers/acpi/acpi_processor.c | 47
>>>>> ++++++++++++++++++++++++++++++++
>>>>> drivers/acpi/bus.c | 3 ++
>>>>> drivers/acpi/internal.h | 2 ++
>>>>> 4 files changed, 55 insertions(+)
>>>>>
>>>>> diff --git a/arch/x86/kernel/cpu/mcheck/therm_throt.c
>>>>> b/arch/x86/kernel/cpu/mcheck/therm_throt.c
>>>>> index 2c5aaf8..0553858 100644
>>>>> --- a/arch/x86/kernel/cpu/mcheck/therm_throt.c
>>>>> +++ b/arch/x86/kernel/cpu/mcheck/therm_throt.c
>>>>> @@ -385,6 +385,9 @@ static void intel_thermal_interrupt(void)
>>>>> {
>>>>> __u64 msr_val;
>>>>>
>>>>> + if (static_cpu_has(X86_FEATURE_HWP))
>>>>> + wrmsrl_safe(MSR_HWP_STATUS, 0);
>>>>> +
>>>>> rdmsrl(MSR_IA32_THERM_STATUS, msr_val);
>>>>>
>>>>> /* Check for violation of core thermal thresholds*/
>>>>> diff --git a/drivers/acpi/acpi_processor.c
>>>>> b/drivers/acpi/acpi_processor.c
>>>>> index 6979186..18da84f 100644
>>>>> --- a/drivers/acpi/acpi_processor.c
>>>>> +++ b/drivers/acpi/acpi_processor.c
>>>>> @@ -491,6 +491,53 @@ static void acpi_processor_remove(struct
>>>>> acpi_device *device)
>>>>> }
>>>>> #endif /* CONFIG_ACPI_HOTPLUG_CPU */
>>>>>
>>>>> +#ifdef CONFIG_X86
>>>>> +static bool acpi_hwp_native_thermal_lvt_set;
>>>>> +static acpi_status
>>>>> acpi_set_hwp_native_thermal_lvt_osc(acpi_handle
>>>>> handle,
>>>>> + u32
>>>>> lvl,
>>>>> void *context,
>>>>> + void
>>>>> **rv)
>>>>> +{
>>>>> + u8 sb_uuid_str[] = "4077A616-290C-47BE-9EBD-
>>>>> D87058713953";
>>>>> + u32 capbuf[2];
>>>>> + struct acpi_osc_context osc_context = {
>>>>> + .uuid_str = sb_uuid_str,
>>>>> + .rev = 1,
>>>>> + .cap.length = 8,
>>>>> + .cap.pointer = capbuf,
>>>>> + };
>>>>> +
>>>>> + if (acpi_hwp_native_thermal_lvt_set)
>>>>> + return AE_CTRL_TERMINATE;
>>>>> +
>>>>> + capbuf[0] = 0x0000;
>>>>> + capbuf[1] = 0x1000; /* set bit 12 */
>>>>> +
>>>>> + if (ACPI_SUCCESS(acpi_run_osc(handle, &osc_context)))
>>>>> {
>>>>> + acpi_hwp_native_thermal_lvt_set = true;
>>>>> + kfree(osc_context.ret.pointer);
>>>>> + }
>>>>> +
>>>>> + return AE_OK;
>>>>> +}
>>>>> +
>>>>> +void acpi_early_processor_set_osc(void)
>>>>> +{
>>>>> + if (boot_cpu_has(X86_FEATURE_HWP)) {
>>>>> + acpi_walk_namespace(ACPI_TYPE_PROCESSOR,
>>>>> ACPI_ROOT_OBJECT,
>>>>> + ACPI_UINT32_MAX,
>>>>> + acpi_set_hwp_native_therma
>>>>> l_lv
>>>>> t_osc,
>>>>> + NULL, NULL, NULL);
>>>>> + acpi_get_devices(ACPI_PROCESSOR_DEVICE_HID,
>>>>> + acpi_set_hwp_native_thermal_l
>>>>> vt_o
>>>>> sc,
>>>>> + NULL, NULL);
>>>>> + }
>>>>> +}
>>>>> +#else
>>>>> +
>>>>> +void acpi_early_processor_set_osc(void) {}
>>>>> +
>>>>> +#endif
>>>>> +
>>>>> /*
>>>>> * The following ACPI IDs are known to be suitable for
>>>>> representing as
>>>>> * processor devices.
>>>>> diff --git a/drivers/acpi/bus.c b/drivers/acpi/bus.c
>>>>> index 891c42d..7e73aea 100644
>>>>> --- a/drivers/acpi/bus.c
>>>>> +++ b/drivers/acpi/bus.c
>>>>> @@ -1005,6 +1005,9 @@ static int __init acpi_bus_init(void)
>>>>> goto error1;
>>>>> }
>>>>>
>>>>> + /* Set capability bits for _OSC under processor scope
>>>>> */
>>>>> + acpi_early_processor_set_osc();
>>>>> +
>>>>> /*
>>>>> * _OSC method may exist in module level code,
>>>>> * so it must be run after ACPI_FULL_INITIALIZATION
>>>>> diff --git a/drivers/acpi/internal.h b/drivers/acpi/internal.h
>>>>> index 1e6833a..5c787ac 100644
>>>>> --- a/drivers/acpi/internal.h
>>>>> +++ b/drivers/acpi/internal.h
>>>>> @@ -138,6 +138,8 @@ void acpi_early_processor_set_pdc(void);
>>>>> static inline void acpi_early_processor_set_pdc(void) {}
>>>>> #endif
>>>>>
>>>>> +void acpi_early_processor_set_osc(void);
>>>>> +
>>>>> /* ---------------------------------------------------------
>>>>> ----
>>>>> -------------
>>>>> Embedded Controller
>>>>> -----------------------------------------------------------
>>>>> ----
>>>>> ----------- */
>>>>>