Re: intel_pstate oopses and lockdep report with Linux v4.5-1822-g63e30271b04c

From: Josh Boyer
Date: Fri Mar 18 2016 - 08:37:23 EST


On Thu, Mar 17, 2016 at 8:20 PM, Rafael J. Wysocki <rjw@xxxxxxxxxxxxx> wrote:
> On Thursday, March 17, 2016 12:44:54 PM Josh Boyer wrote:
>> On Thu, Mar 17, 2016 at 10:07 AM, Rafael J. Wysocki <rjw@xxxxxxxxxxxxx> wrote:
>> > On Thursday, March 17, 2016 09:02:29 AM Josh Boyer wrote:
>> >> Hello,
>> >
>> > Hi,
>> >
>> >> I have an Intel Atom based NUC that is producing the following
>> >> backtraces on boot of Linus' tree as of last evening. This does not
>> >> happen with a tree with top level commit 271ecc5253e2, but does happen
>> >> when using the tree mentioned in the subject with top level commit
>> >> 63e30271b04c.
>> >>
>> >> The first backtrace appears to be a warning because the intel_pstate
>> >> driver is calling wrmsrl_on_cpu when interrupts are disabled? Not
>> >> sure on that one.
>> >>
>> >> The second backtrace is a lockdep report. Both are from the same boot.
>> >
>> > OK, thanks for the report.
>> >
>> > Can you please try the patch below?
>> >
>> > I'm actually unsure if we can do that safely in general for Atom because
>> > of the initialization, but that's what Core does anyway.
>> >
>> > Srinivas, Philippe, why exactly do we need the wrmsrl_on_cpu() in
>> > atom_set_pstate()? core_set_pstate() uses wrmsrl() and seems to be doing fine.
>> >
>> > ---
>> > drivers/cpufreq/intel_pstate.c | 2 +-
>> > 1 file changed, 1 insertion(+), 1 deletion(-)
>> >
>> > Index: linux-pm/drivers/cpufreq/intel_pstate.c
>> > ===================================================================
>> > --- linux-pm.orig/drivers/cpufreq/intel_pstate.c
>> > +++ linux-pm/drivers/cpufreq/intel_pstate.c
>> > @@ -587,7 +587,7 @@ static void atom_set_pstate(struct cpuda
>> >
>> >  	val |= vid;
>> >
>> > -	wrmsrl_on_cpu(cpudata->cpu, MSR_IA32_PERF_CTL, val);
>> > +	wrmsrl(MSR_IA32_PERF_CTL, val);
>> >  }
>> >
>> >  static int silvermont_get_scaling(void)
>> >
>>
>> I applied this on top of commit 09fd671ccb24 and the backtrace and
>> lockdep report both go away. So yes, this seems to clear up the
>> issue. I tested it on a variety of different CPU types and didn't
>> notice anything wrong on them either.
>
> The problems may show up during initialization and cleanup where one CPU
> may be running code trying to configure a different one. In those cases
> wrmsrl_on_cpu() needs to be used.
>
> Let me cut a patch taking that into account.

OK. Happy to test when you have it ready.
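
In the meantime, here is how I read the constraint described above, as a
rough userspace mock-up rather than kernel code. Everything named mock_*
and the set_perf_ctl() helper are hypothetical stand-ins made up for
illustration; they are not the driver's API and not the forthcoming patch.
The idea is only that a write aimed at the CPU we are already running on can
take the local path, while a write aimed at a different CPU (as may happen
during init or cleanup) has to take the cross-CPU path.

```c
#include <stdint.h>

#define MOCK_NR_CPUS       4
#define MSR_IA32_PERF_CTL  0x199

/* Mock state standing in for per-CPU MSR contents (assumption, not kernel code). */
static uint64_t mock_perf_ctl[MOCK_NR_CPUS];
static int mock_current_cpu;       /* CPU the "caller" is running on */
static int mock_cross_cpu_writes;  /* how often the cross-CPU path was taken */

/* Stand-in for wrmsrl(): writes the MSR of the CPU we are running on. */
static void mock_wrmsrl(uint32_t msr, uint64_t val)
{
	(void)msr;
	mock_perf_ctl[mock_current_cpu] = val;
}

/* Stand-in for wrmsrl_on_cpu(): writes the MSR of an arbitrary CPU. */
static void mock_wrmsrl_on_cpu(int cpu, uint32_t msr, uint64_t val)
{
	(void)msr;
	mock_cross_cpu_writes++;
	mock_perf_ctl[cpu] = val;
}

/*
 * Hypothetical helper: use the local write when the target is the current
 * CPU, and the cross-CPU write only when configuring a different one.
 */
static void set_perf_ctl(int target_cpu, uint64_t val)
{
	if (target_cpu == mock_current_cpu)
		mock_wrmsrl(MSR_IA32_PERF_CTL, val);
	else
		mock_wrmsrl_on_cpu(target_cpu, MSR_IA32_PERF_CTL, val);
}
```

With mock_current_cpu set to 1, set_perf_ctl(1, val) takes the local path
and set_perf_ctl(2, val) takes the cross-CPU one. How the real driver should
tell those cases apart is for the actual patch to decide.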

josh