Re: [PATCH] clk: Mark fwnodes when their clock provider is added

From: Geert Uytterhoeven
Date: Fri Mar 26 2021 - 14:31:07 EST


Hi Stephen,

On Fri, Mar 26, 2021 at 7:13 PM Stephen Boyd <sboyd@xxxxxxxxxx> wrote:
> Quoting Nicolas Saenz Julienne (2021-03-25 11:25:24)
> > On Thu, 2021-03-25 at 14:31 +0100, Marek Szyprowski wrote:
> > > On 10.02.2021 12:44, Tudor Ambarus wrote:
> > > > This is a follow-up for:
> > > > commit 3c9ea42802a1 ("clk: Mark fwnodes when their clock provider is added/removed")
> > > >
> > > > The above commit updated the deprecated of_clk_add_provider(),
> > > > but missed to update the preferred of_clk_add_hw_provider().
> > > > Update it now.
> > > >
> > > > Signed-off-by: Tudor Ambarus <tudor.ambarus@xxxxxxxxxxxxx>
> > >
> > > This patch, which landed in linux-next as commit 6579c8d97ad7 ("clk:
> > > Mark fwnodes when their clock provider is added") causes the following
> > > NULL pointer dereference on Raspberry Pi 3b+ boards:
> > >
> > > --->8---
> > >
> > > raspberrypi-firmware soc:firmware: Attached to firmware from 2020-01-06T13:05:25
> > > Unable to handle kernel NULL pointer dereference at virtual address 0000000000000050
> > > Mem abort info:
> > > ESR = 0x96000004
> > > EC = 0x25: DABT (current EL), IL = 32 bits
> > > SET = 0, FnV = 0
> > > EA = 0, S1PTW = 0
> > > Data abort info:
> > > ISV = 0, ISS = 0x00000004
> > > CM = 0, WnR = 0
> > > [0000000000000050] user address but active_mm is swapper
> > > Internal error: Oops: 96000004 [#1] PREEMPT SMP
> > > Modules linked in:
> > > CPU: 0 PID: 10 Comm: kworker/0:1 Not tainted 5.12.0-rc4+ #2764
> > > Hardware name: Raspberry Pi 3 Model B (DT)
> > > Workqueue: events deferred_probe_work_func
> > > pstate: 00000005 (nzcv daif -PAN -UAO -TCO BTYPE=--)
> > > pc : of_clk_add_hw_provider+0xac/0xe8
> > > lr : of_clk_add_hw_provider+0x94/0xe8
> > > sp : ffff8000130936b0
> > > x29: ffff8000130936b0 x28: ffff800012494e04
> > > x27: ffff00003b18cb05 x26: ffff00003aa5c010
> > > x25: 0000000000000000 x24: 0000000000000000
> > > x23: ffff00003aa1e380 x22: ffff8000106830d0
> > > x21: ffff80001233f180 x20: 0000000000000018
> > > x19: 0000000000000000 x18: ffff8000124d38b0
> > > x17: 0000000000000013 x16: 0000000000000014
> > > x15: ffff8000125758b0 x14: 00000000000184e0
> > > x13: 000000000000292e x12: ffff80001258dd98
> > > x11: 0000000000000001 x10: 0101010101010101
> > > x9 : ffff80001233f288 x8 : 7f7f7f7f7f7f7f7f
> > > x7 : fefefefeff6c626f x6 : 5d636d8080808080
> > > x5 : 00000000006d635d x4 : 0000000000000000
> > > x3 : 0000000000000000 x2 : 540eb5edae191600
> > > x1 : 0000000000000000 x0 : 0000000000000000
> > > Call trace:
> > > of_clk_add_hw_provider+0xac/0xe8
> > > devm_of_clk_add_hw_provider+0x5c/0xb8
> > > raspberrypi_clk_probe+0x110/0x210
> > > platform_probe+0x90/0xd8
> > > really_probe+0x108/0x3c0
> > > driver_probe_device+0x60/0xc0
> > > __device_attach_driver+0x9c/0xd0
> > > bus_for_each_drv+0x70/0xc8
> > > __device_attach+0xec/0x150
> > > device_initial_probe+0x10/0x18
> > > bus_probe_device+0x94/0xa0
> > > device_add+0x47c/0x780
> > > platform_device_add+0x110/0x248
> > > platform_device_register_full+0x120/0x150
> > > rpi_firmware_probe+0x158/0x1f8
> > > platform_probe+0x90/0xd8
> > > really_probe+0x108/0x3c0
> > > driver_probe_device+0x60/0xc0
> > > __device_attach_driver+0x9c/0xd0
> > > bus_for_each_drv+0x70/0xc8
> > > __device_attach+0xec/0x150
> > > device_initial_probe+0x10/0x18
> > > bus_probe_device+0x94/0xa0
> > > deferred_probe_work_func+0x70/0xa8
> > > process_one_work+0x2a8/0x718
> > > worker_thread+0x48/0x460
> > > kthread+0x134/0x160
> > > ret_from_fork+0x10/0x18
> > > Code: b1006294 540000c0 b140069f 54000088 (3940e280)
> > > ---[ end trace 7ead5ec2f0c51cfe ]---
> > >
> > > This patch mainly revealed that clk/bcm/clk-raspberrypi.c driver calls
> > > devm_of_clk_add_hw_provider(), with a device pointer, which has a NULL
> > > dev->of_node. I'm not sure if adding a check for a NULL np in
> > > of_clk_add_hw_provider() is a right fix, though.
> >
> > I believe the right fix is not to call 'devm_of_clk_add_hw_provider()' if
> > 'pdev->dev.of_node == NULL'. In such case, which is RPi3's, only the CPU clock
> > is used, and it's defined and queried later through
> > devm_clk_hw_register_clkdev().
> >
> > @Marek, I don't mind taking care of it if it's OK with you.
> >
>
> Ah I see this is related to the patch I just reviewed. Can you reference
> this in the commit text? And instead of putting the change into the clk
> provider let's check for NULL 'np' in of_clk_add_hw_provider() instead
> and return 0 if there's nothing to do. That way we don't visit this
> problem over and over again.

I'm not sure the latter is what we really want: shouldn't calling
*of*_clk_add_hw_provider() with a NULL np be a bug in the provider?
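
To make the two options concrete, here is a rough userspace sketch, not
the real clk code. struct node_sketch, add_provider_lenient() and
add_provider_strict() are made-up stand-ins for struct device_node and
of_clk_add_hw_provider(), and the -EINVAL return in the strict variant
is only an assumption about what "treat it as a bug" could look like:

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct device_node; not the kernel type. */
struct node_sketch {
	int provider_marked;	/* models the "provider added" mark on the fwnode */
};

/* Option A: a NULL node means "nothing to do", so report success. */
int add_provider_lenient(struct node_sketch *np)
{
	if (!np)
		return 0;

	np->provider_marked = 1;
	return 0;
}

/* Option B: a NULL node is a caller bug, so report an error (assumed -EINVAL). */
int add_provider_strict(struct node_sketch *np)
{
	if (!np)
		return -EINVAL;

	np->provider_marked = 1;
	return 0;
}
```

With option A, a provider that has no of_node keeps probing unchanged.
With option B, such a provider has to skip the call itself, which is
where the NULL check would then live.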

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds