Re: [PATCH] cpufreq: Fix NULL reference crash while accessing policy->governor_data

From: Viresh Kumar
Date: Tue Jan 26 2016 - 22:10:32 EST


On 26-01-16, 09:57, Juri Lelli wrote:
> This patch fixes the crash I was seeing.
>
> Tested-by: Juri Lelli <juri.lelli@xxxxxxx>

Thanks.

> However, it exposes another problem (running the concurrent lockdep test

It exposes? How can this patch expose the crash below? AFAIR, you
reported that you were getting the crash below on plain mainline on TC2,
i.e. for drivers with the governor-per-policy flag set.

The reason is obvious: the governor's sysfs directory is present in
cpus/cpuX/cpufreq/ instead of cpus/cpufreq/, which used to be the case
without the flag. This forces the show()/store() routines present in
cpufreq.c to be called, and they also take policy->rwsem.

> that you merged in your tests). After the test is finished there is
> always at least one task spinning. Do you think it might be related to
> the race we are already discussing in the thread related to my cleanups
> patches? This is what I see:

So this is what you reported earlier, right?

> [ 38.843648] other info that might help us debug this:
> [ 38.843648]
> [ 38.867627] Chain exists of:
> s_active#41 --> &policy->rwsem --> od_dbs_cdata.mutex
>
> [ 38.891693] Possible unsafe locking scenario:
> [ 38.891693]

Let me elaborate a bit here:
- CPU0 is calling the governor's EXIT()
- CPU1 is reading a governor file from sysfs

> [ 38.909419]        CPU0                    CPU1
> [ 38.922978]        ----                    ----

The following needs to be added here:

                EXIT-governor                read/write governor file

                                             lock(s_active#41);

> [ 38.936535]   lock(od_dbs_cdata.mutex);
> [ 38.948146]                                lock(&policy->rwsem);
> [ 38.966168]                                lock(od_dbs_cdata.mutex);
> [ 38.985219]   lock(s_active#41);
> [ 38.994923]
> [ 38.994923] *** DEADLOCK ***
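
To make the ordering problem concrete, here is a rough Python model of
the scenario above. It is only an illustration, not kernel code and not
how lockdep is implemented: the two acquisition orders are copied from
the CPU0/CPU1 columns as annotated above, and the helper names
(order_edges, find_cycle) are made up for this sketch.

```python
# Illustrative model of the lock-order report; not kernel code.
from collections import defaultdict


def order_edges(paths):
    """Record an edge A -> B whenever a path takes B while holding A."""
    edges = defaultdict(set)
    for held_in_order in paths:
        for i, held in enumerate(held_in_order):
            for taken_later in held_in_order[i + 1:]:
                edges[held].add(taken_later)
    return edges


def find_cycle(edges):
    """Return one lock-order cycle as a list of lock names, or None."""
    WHITE, GREY, BLACK = 0, 1, 2
    state = defaultdict(int)  # every lock starts WHITE (unvisited)
    stack = []

    def visit(node):
        state[node] = GREY
        stack.append(node)
        for nxt in sorted(edges.get(node, ())):
            if state[nxt] == GREY:
                # Back edge: nxt is still on the DFS stack, so we have a cycle.
                return stack[stack.index(nxt):] + [nxt]
            if state[nxt] == WHITE:
                found = visit(nxt)
                if found:
                    return found
        stack.pop()
        state[node] = BLACK
        return None

    for node in sorted(edges):
        if state[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None


# CPU1 column: read/write of a governor file through sysfs.
sysfs_path = ["s_active#41", "&policy->rwsem", "od_dbs_cdata.mutex"]
# CPU0 column: governor EXIT.
exit_path = ["od_dbs_cdata.mutex", "s_active#41"]

cycle = find_cycle(order_edges([sysfs_path, exit_path]))
print(" --> ".join(cycle))
```

With only one of the two paths in the list, find_cycle() returns None;
the cycle exists only because EXIT takes od_dbs_cdata.mutex before
s_active#41 while the sysfs path takes them in the opposite order.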

> Now, you already pointed me at a possible fix. I'm going to test that
> (even if I have questions about that patch :)) and see if it makes this
> go away.

@Rafael: Juri is talking about this patch:

http://www.linux-arm.org/git?p=linux-jl.git;a=commit;h=d3eb02ed23732de2c8671377316a190c38b8fe93

Juri, I thought it would fix this when I wrote it, but it never did on
x86 (with the rwsem-drop code around EXIT dropped as well).

I never came back to it, and so never sent it upstream.

--
viresh