Re: [PATCH v2] cpufreq: Catch double invocations of cpufreq_freq_transition_begin/end

From: Viresh Kumar
Date: Tue Apr 29 2014 - 09:23:46 EST


On 29 April 2014 18:36, Srivatsa S. Bhat
<srivatsa.bhat@xxxxxxxxxxxxxxxxxx> wrote:
> Some cpufreq drivers were redundantly invoking the _begin() and _end()
> APIs around frequency transitions, and this double invocation (one from
> the cpufreq core and the other from the cpufreq driver) used to result
> in a self-deadlock, leading to system hangs during boot. (The _begin()
> API makes contending callers wait until the previous invocation is
> complete. Hence, the cpufreq driver would end up waiting on itself!).
>
> Now all such drivers have been fixed, but debugging this issue was not
> very straight-forward (even lockdep didn't catch this). So let us add a
> debug infrastructure to the cpufreq core to catch such issues more easily
> in the future.
>
> We add a new field called 'transition_task' to the policy structure, to keep
> track of the task which is performing the frequency transition. Using this
> field, we make note of this task during _begin() and print a warning if we
> find a case where the same task is calling _begin() again, before completing
> the previous frequency transition using the corresponding _end().
>
> We have left out ASYNC_NOTIFICATION drivers from this debug infrastructure
> for 2 reasons:
>
> 1. At the moment, we have no way to avoid a particular scenario where this
> debug infrastructure can emit false-positive warnings for such drivers.
> The scenario is depicted below:
>
>
> Task A Task B
>
> /* 1st freq transition */
> Invoke _begin() {
> ...
> ...
> }
>
> Change the frequency
>
> /* 2nd freq transition */
> Invoke _begin() {
> ... //waiting for B to
> ... //finish _end() for
> ... //the 1st transition
> ... | Got interrupt for successful
> ... | change of frequency (1st one).
> ... |
> ... | /* 1st freq transition */
> ... | Invoke _end() {
> ... | ...
> ... V }
> ...
> ...
> }
>
> This scenario is actually deadlock-free because, once Task A changes the
> frequency, it is Task B's responsibility to invoke the corresponding
> _end() for the 1st frequency transition. Hence it is perfectly legal for
> Task A to go ahead and attempt another frequency transition in the meantime.
> (Of course it won't be able to proceed until Task B finishes the 1st _end(),
> but this doesn't cause a deadlock or a hang).
>
> The debug infrastructure cannot handle this scenario and will treat it as
> a deadlock and print a warning. To avoid this, we exclude such drivers
> from the purview of this code.
>
> 2. Luckily, we don't _need_ this infrastructure for ASYNC_NOTIFICATION drivers
> at all! The cpufreq core does not automatically invoke the _begin() and
> _end() APIs during frequency transitions in such drivers. Thus, the driver
> alone is responsible for invoking _begin()/_end() and hence there shouldn't
> be any conflicts which lead to double invocations. So, we can skip these
> drivers, since the probability that such drivers will hit this problem is
> extremely low, as outlined above.
>
> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@xxxxxxxxxxxxxxxxxx>
> ---
>
> v2: Removed the coverage of ASYNC_NOTIFICATION drivers, in order to avoid
> false-positives.
>
> drivers/cpufreq/cpufreq.c | 7 +++++++
> include/linux/cpufreq.h | 1 +
> 2 files changed, 8 insertions(+)

Acked-by: Viresh Kumar <viresh.kumar@xxxxxxxxxx>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/