Re: [BUG] oops in cpufreq driver with AMD Kaveri CPU

From: Viresh Kumar
Date: Tue Aug 12 2014 - 02:16:27 EST


On Tue, Aug 12, 2014 at 11:25 AM, Oleksandr Natalenko
<oleksandr@xxxxxxxxxxxxxx> wrote:
> What should I do to debug it? Is that necessary to recompile kernel with full
> debug?

Yes, you will need to recompile the kernel, but not necessarily with debug
support.

Some background:

In the cpufreq framework, we manage a CPU's frequency based on the load
on that CPU. You are using the ondemand governor, which re-evaluates the
load at fixed intervals and raises or lowers the frequency accordingly.

Now, when we change the frequency we *may* need to communicate this to a few
drivers which *may* depend on the CPU's frequency for their functioning.

This is handled via notifications.

Other drivers are required to do this:
cpufreq_register_notifier(&<some-local-struct>, CPUFREQ_TRANSITION_NOTIFIER);

to register themselves for frequency changes, and then a routine of theirs
will be called from the cpufreq core.

This is exactly where it is crashing for you, i.e. while calling the
notifier list.

So, you need to check which notifiers are registered and which one of those
is crashing.

The first parameter of the above register-call should be declared this way:

static struct notifier_block xyz_notifier_block = {
	.notifier_call = xyz_freq_notifier,
};
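
To make the mechanism concrete, here is a minimal userspace model of a
notifier list. It is only a sketch and not the kernel code: the struct is a
simplified stand-in for the kernel's struct notifier_block, and
model_register_notifier(), model_notify_transition() and xyz_last_freq are
hypothetical names made up for this illustration. It shows why a broken
callback ends up crashing inside the core's "call the list" path.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's struct notifier_block. */
struct notifier_block {
	int (*notifier_call)(struct notifier_block *nb,
			     unsigned long event, void *data);
	struct notifier_block *next;
};

/* Head of the model's transition notifier list (hypothetical). */
static struct notifier_block *transition_list;

/* Model of registration: append the block to the end of the list. */
static void model_register_notifier(struct notifier_block *nb)
{
	struct notifier_block **p = &transition_list;

	assert(nb && nb->notifier_call);
	while (*p)
		p = &(*p)->next;
	nb->next = NULL;
	*p = nb;
}

/*
 * Model of a frequency change: call every registered callback in
 * registration order. If any callback dereferences a bad pointer, the
 * crash happens here, inside the walk over the list.
 * Returns the number of callbacks that were called.
 */
static int model_notify_transition(unsigned long event, void *data)
{
	struct notifier_block *nb;
	int calls = 0;

	for (nb = transition_list; nb; nb = nb->next) {
		nb->notifier_call(nb, event, data);
		calls++;
	}
	return calls;
}

/* Hypothetical driver side, reusing the names from the example above. */
static unsigned int xyz_last_freq;

static int xyz_freq_notifier(struct notifier_block *nb,
			     unsigned long event, void *data)
{
	(void)nb;
	(void)event;
	xyz_last_freq = *(unsigned int *)data;	/* remember the new frequency */
	return 0;	/* the real kernel callbacks return NOTIFY_* codes */
}

static struct notifier_block xyz_notifier_block = {
	.notifier_call = xyz_freq_notifier,
};
```

A caller would register &xyz_notifier_block once and then call
model_notify_transition() with a pointer to the new frequency on every
change.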

I would have done it this way:

- Add a print in cpufreq_register_notifier() to print the address of the
routine in .notifier_call for the CPUFREQ_TRANSITION_NOTIFIER case.

- Then add print messages to all the notifier callbacks registered for your
configuration. There shouldn't be many, only 4-5 I believe.

- Then note the order in which they are called normally, when we don't
crash, and compare it with the order when it crashes.

- You will be able to make out which notifier is crashing. And then we can
see why.

- You can add something like this to the notifier routines:
	pr_info("%s\n", __func__);

This will print the function name.
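
The steps above can be sketched in a small userspace model. Again, this is
an illustration under assumptions, not kernel code: printf() stands in for
pr_info(), the three callbacks (alpha/beta/gamma) are hypothetical, and a
negative return value stands in for the oops so that the example can keep
running. The point is only that the last function name logged before the
failure identifies the culprit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Simplified userspace stand-in for the kernel's struct notifier_block. */
struct notifier_block {
	int (*notifier_call)(struct notifier_block *nb,
			     unsigned long event, void *data);
	struct notifier_block *next;
};

static struct notifier_block *chain;
static const char *last_entered;	/* name logged by the latest callback */

/* Step 1: log the callback address when a notifier is registered. */
static void model_register(struct notifier_block *nb)
{
	struct notifier_block **p = &chain;

	assert(nb && nb->notifier_call);
	printf("registering notifier_call=%#lx\n",
	       (unsigned long)(uintptr_t)nb->notifier_call);
	while (*p)
		p = &(*p)->next;
	nb->next = NULL;
	*p = nb;
}

/* Step 2: every callback logs its own name on entry. */
#define LOG_ENTRY() do { last_entered = __func__; \
			 printf("%s\n", __func__); } while (0)

static int alpha_freq_notifier(struct notifier_block *nb,
			       unsigned long event, void *data)
{
	(void)nb; (void)event; (void)data;
	LOG_ENTRY();
	return 0;
}

static int beta_freq_notifier(struct notifier_block *nb,
			      unsigned long event, void *data)
{
	(void)nb; (void)event; (void)data;
	LOG_ENTRY();
	return -1;	/* modeling assumption: stands in for the crash */
}

static int gamma_freq_notifier(struct notifier_block *nb,
			       unsigned long event, void *data)
{
	(void)nb; (void)event; (void)data;
	LOG_ENTRY();
	return 0;
}

static struct notifier_block alpha_nb = { .notifier_call = alpha_freq_notifier };
static struct notifier_block beta_nb = { .notifier_call = beta_freq_notifier };
static struct notifier_block gamma_nb = { .notifier_call = gamma_freq_notifier };

/*
 * Steps 3 and 4: walk the chain in order and stop at the simulated crash.
 * Returns how many callbacks were entered; last_entered names the culprit.
 */
static int model_call_chain(unsigned long event, void *data)
{
	struct notifier_block *nb;
	int entered = 0;

	for (nb = chain; nb; nb = nb->next) {
		entered++;
		if (nb->notifier_call(nb, event, data) < 0)
			break;
	}
	return entered;
}
```

After registering the three blocks in the order alpha, beta, gamma and
calling model_call_chain(), the log ends at beta_freq_notifier and gamma is
never reached, which is the same pattern you would look for in dmesg.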

>> Which patch are you talking about here?
>
> I thought about this one [1], but I guess that's not my case.
>
> [1] https://lkml.org/lkml/2014/7/16/815

I guessed so, and I don't think it will help, as the crashes reported in this
bug log are something different.