Re: Crashing 'kzm' target in next-20160913 due to 'gpio: mxc: shift gpio_mxc_init() to subsys_initcall level'

From: Uwe Kleine-König
Date: Thu Sep 15 2016 - 10:23:55 EST


On Thu, Sep 15, 2016 at 04:35:04PM +0300, Vladimir Zapolskiy wrote:
> Hi Guenter,
>
> On 09/14/2016 06:20 AM, Guenter Roeck wrote:
> > Hi Vladimir,
> >
> > your commit e188cbf7564f ("gpio: mxc: shift gpio_mxc_init() to subsys_initcall level")
> > in -next causes the following crash when running the 'kzm' target (and most likely
> > the real thing) with qemu.
> >
> > [ 1.211426] Unable to handle kernel NULL pointer dereference at virtual address 0000000c
> > [ 1.211600] pgd = c0004000
> > [ 1.211680] [0000000c] *pgd=00000000
> > [ 1.212067] Internal error: Oops: 5 [#1] SMP ARM
> > [ 1.212245] Modules linked in:
> > [ 1.212542] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.8.0-rc6-next-20160913 #1
> > [ 1.212671] Hardware name: Kyoto Microcomputer Co., Ltd. KZM-ARM11-01
> > [ 1.212825] task: c6848000 task.stack: c683e000
> > [ 1.213231] PC is at platform_get_irq+0xc0/0xe8
> >
> > See http://kerneltests.org/builders/qemu-arm-next/builds/525/steps/qemubuildcommand/logs/stdio
> > for a complete log.
> >
> > Problem is quite subtle. The change causes the gpio driver to be installed later.
> > As a result, kzm_init_smsc9118() fails to initialize the gpio pins correctly.
> > gpio_request() in that function returns -EPROBE_DEFER, which is ignored,
> > gpio_to_irq() then returns -22 which is unconditionally assigned as interrupt number.
> > platform_get_irq(), as called from the smsc driver, gets this negative interrupt
> > number, and passes it unconditionally to irq_get_irq_data(), which returns NULL.
> > The NULL pointer is then passed to irqd_set_trigger_type() which, not entirely
> > surprisingly, crashes.
> >
> > So, in other words, lots of bugs here. Nevertheless, I would suggest to keep using
> > postcore_initcall(), at least until it is sure that all gpio clients handle -EPROBE_DEFER
> > correctly.
>
> I'm inviting Shawn and Uwe to the discussion.
>
> The proper fix in this particular case should be like this one:
>
> diff --git a/arch/arm/mach-imx/mach-kzm_arm11_01.c b/arch/arm/mach-imx/mach-kzm_arm11_01.c
> index 31df4361996f..8288acfe7221 100644
> --- a/arch/arm/mach-imx/mach-kzm_arm11_01.c
> +++ b/arch/arm/mach-imx/mach-kzm_arm11_01.c
> @@ -245,13 +245,17 @@ static void __init kzm_board_init(void)
> mxc_iomux_setup_multiple_pins(kzm_pins,
> ARRAY_SIZE(kzm_pins), "kzm");
> - kzm_init_ext_uart();
> - kzm_init_smsc9118();
> kzm_init_imx_uart();
> pr_info("Clock input source is 26MHz\n");
> }
> +static void __init kzm_late_init(void)
> +{
> + kzm_init_ext_uart();
> + kzm_init_smsc9118();
> +}
> +
> /*
> * This structure defines static mappings for the kzm-arm11-01 board.
> */
> @@ -291,5 +295,6 @@ MACHINE_START(KZM_ARM11_01, "Kyoto Microcomputer Co., Ltd. KZM-ARM11-01")
> .init_irq = mx31_init_irq,
> .init_time = kzm_timer_init,
> .init_machine = kzm_board_init,
> + .init_late = kzm_late_init,
> .restart = mxc_restart,
> MACHINE_END

That + checking the return code of gpio_request and the other calls.
Or better, convert the machine to dt.

> But I agree that there might be more legacy boards (i.MX31 only IMHO),
> which may attempt to manipulate GPIO lines before subsys_initcall()
> level.

I wouldn't revert anything for legacy boards. That's the chance to say
in the near future: They stopped working in September 2016, obviously
nobody cares, let's rip them. :-)

Best regards
Uwe

--
Pengutronix e.K. | Uwe Kleine-König |
Industrial Linux Solutions | http://www.pengutronix.de/ |