Re: [PATCH] cpufreq: fix a NULL pointer dereference triggered by _PPC changed notification

From: Viresh Kumar
Date: Wed Dec 17 2014 - 23:26:25 EST


On 18 December 2014 at 06:38, Ethan Zhao <ethan.zhao@xxxxxxxxxx> wrote:
> If _PPC changed notification happens before governor was initiated while kernel
> is booting, a NULL pointer dereference will be triggered:
>
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000030
> IP: [<ffffffff81470453>] __cpufreq_governor+0x23/0x1e0
> PGD 0
> Oops: 0000 [#1] SMP
> ... ...
> RIP: 0010:[<ffffffff81470453>] [<ffffffff81470453>] __cpufreq_governor+0x23/0x1e0
> RSP: 0018:ffff881fcfbcfbb8 EFLAGS: 00010286
> RAX: 0000000000000000 RBX: ffff881fd11b3980 RCX: ffff88407fc20000
> RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff881fd11b3980
> RBP: ffff881fcfbcfbd8 R08: 0000000000000000 R09: 000000000000000f
> R10: ffffffff818068d0 R11: 0000000000000043 R12: 0000000000000004
> R13: 0000000000000000 R14: ffffffff8196cae0 R15: 0000000000000000
> FS: 0000000000000000(0000) GS:ffff881fffc00000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000030 CR3: 00000000018ae000 CR4: 00000000000407f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process kworker/0:3 (pid: 750, threadinfo ffff881fcfbce000, task ffff881fcf556400)
> Stack:
> ffff881fffc17d00 ffff881fcfbcfc18 ffff881fd11b3980 0000000000000000
> ffff881fcfbcfc08 ffffffff81470d08 ffff881fd11b3980 0000000000000007
> ffff881fcfbcfc18 ffff881fffc17d00 ffff881fcfbcfd28 ffffffff81472e9a
> Call Trace:
> [<ffffffff81470d08>] __cpufreq_set_policy+0x1b8/0x2e0
> [<ffffffff81472e9a>] cpufreq_update_policy+0xca/0x150
> [<ffffffff81472f20>] ? cpufreq_update_policy+0x150/0x150
> [<ffffffff81324a96>] acpi_processor_ppc_has_changed+0x71/0x7b
> [<ffffffff81320bcd>] acpi_processor_notify+0x55/0x115
> [<ffffffff812f9c29>] acpi_device_notify+0x19/0x1b
> [<ffffffff813084ca>] acpi_ev_notify_dispatch+0x41/0x5f
> [<ffffffff812f64a4>] acpi_os_execute_deferred+0x27/0x34
>
> The root cause is a race condition -- the cpufreq core and the acpi-cpufreq
> driver were initialized, but the cpufreq governor wasn't, and a _PPC changed
> notification arrived, so __cpufreq_governor() was called within the
> acpi_os_execute_deferred kernel thread context.
>
> To fix this panic issue, add pointer checking code in __cpufreq_governor()
> before pointer policy->governor is to be dereferenced.
>
> Signed-off-by: Ethan Zhao <ethan.zhao@xxxxxxxxxx>
> ---
> drivers/cpufreq/cpufreq.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 4473eba..b75735c 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -2021,6 +2021,11 @@ static int __cpufreq_governor(struct cpufreq_policy *policy,
> /* Don't start any governor operations if we are entering suspend */
> if (cpufreq_suspended)
> return 0;
> + /* Governor might not be initiated here if _PPC changed notification
> + happened, check it.
> + */

Please use the correct multi-line comment style here.
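
For reference, a minimal userspace sketch of the same check written with
the kernel's preferred multi-line comment layout (opening and closing
markers on their own lines, each text line starting with " * "). The
names demo_policy, demo_governor, demo_suspended and demo_governor_op()
are hypothetical stand-ins used only for illustration; they are not the
actual cpufreq structures or functions.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct demo_governor {
	const char *name;
};

struct demo_policy {
	struct demo_governor *governor;	/* NULL until a governor is attached */
};

static int demo_suspended;	/* stand-in for cpufreq_suspended */

static int demo_governor_op(struct demo_policy *policy)
{
	/* Don't start any governor operations if we are entering suspend */
	if (demo_suspended)
		return 0;

	/*
	 * The governor might not be initialized yet if a _PPC changed
	 * notification arrives early, so check the pointer first.
	 */
	if (!policy->governor)
		return -EINVAL;

	return 0;
}
```

With no governor attached the sketch returns -EINVAL instead of
dereferencing a NULL pointer; once one is attached it returns 0.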

> + if (!policy->governor)
> + return -EINVAL;

And yet another band-aid to get things going...

We really need to sort things out here; it's not getting us anywhere.
The cpufreq core's state machine is in really bad shape right now.

Okay, let me find some time at a higher priority and get things
fixed here. There are unattended bugs floating around because
band-aids aren't working anymore.

Until then, you can get this one pushed for the current rc.

After the comment fix, Acked-by: Viresh Kumar <viresh.kumar@xxxxxxxxxx>