Re: [RFC PATCH] gpiolib: remove extra_checks
From: Bartosz Golaszewski
Date: Tue Jan 16 2024 - 17:32:55 EST

On Tue, Jan 16, 2024 at 7:23 PM Guenter Roeck <linux@xxxxxxxxxxxx> wrote:
>
> Hi,
>
> On Tue, Dec 19, 2023 at 09:11:02PM +0100, Bartosz Golaszewski wrote:
> > From: Bartosz Golaszewski <bartosz.golaszewski@xxxxxxxxxx>
> >
> > extra_checks is only used in a few places. It also depends on
> > a non-standard DEBUG define one needs to add to the source file. The
> > overhead of removing it should be minimal (we already use pure
> > might_sleep() in the code anyway) so drop it.
> >
> > Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@xxxxxxxxxx>
>
> This patch triggers (exposes) the following backtrace.
>
> BUG: sleeping function called from invalid context at drivers/gpio/gpiolib.c:3738
> in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 7, name: kworker/0:0
> preempt_count: 1, expected: 0
> RCU nest depth: 0, expected: 0
> 3 locks held by kworker/0:0/7:
> #0: c181b3a4 ((wq_completion)events_freezable){+.+.}-{0:0}, at: process_scheduled_works+0x23c/0x644
> #1: c883df28 ((work_completion)(&(&host->detect)->work)){+.+.}-{0:0}, at: process_scheduled_works+0x23c/0x644
> #2: c24e1720 (&host->lock){-...}-{2:2}, at: sdhci_check_ro+0x14/0xd4
> irq event stamp: 2916
> hardirqs last enabled at (2915): [<c0b18838>] _raw_spin_unlock_irqrestore+0x70/0x84
> hardirqs last disabled at (2916): [<c0b1853c>] _raw_spin_lock_irqsave+0x74/0x78
> softirqs last enabled at (2360): [<c00098a4>] __do_softirq+0x28c/0x4b0
> softirqs last disabled at (2347): [<c0022774>] __irq_exit_rcu+0x15c/0x1a4
> CPU: 0 PID: 7 Comm: kworker/0:0 Tainted: G N 6.7.0-09928-g052d534373b7 #1
> Hardware name: Freescale i.MX25 (Device Tree Support)
> Workqueue: events_freezable mmc_rescan
> unwind_backtrace from show_stack+0x10/0x18
> show_stack from dump_stack_lvl+0x34/0x54
> dump_stack_lvl from __might_resched+0x188/0x274
> __might_resched from gpiod_get_value_cansleep+0x14/0x60
> gpiod_get_value_cansleep from mmc_gpio_get_ro+0x20/0x30

When reading a GPIO value with a spinlock held, the driver *must* use
the non-sleeping variant of this function: gpiod_get_value(). If the
underlying GPIO driver can sleep, then the developer seriously borked. The
API contract has always been this way, so I wouldn't treat it as a
regression.

I'd start by checking whether replacing this with gpiod_get_value()
helps. Possibly even do:

    if (in_atomic())
        gpiod_get_value();
    else
        gpiod_get_value_cansleep();
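
To make the contract concrete, here is a rough userspace model of it.
This is only a sketch, not kernel code: every model_* name is
hypothetical, the "lock" is a plain counter standing in for a held
spinlock, and the violation counter stands in for the warning or BUG
splat the kernel would print.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the host state; not a kernel structure. */
struct model_host {
	int lock_depth;       /* > 0 models "spinlock held" (atomic context) */
	int ro_line;          /* value the modeled write-protect GPIO returns */
	bool chip_can_sleep;  /* models a GPIO chip whose accessors may sleep */
	int violations;       /* counts breaches of the API contract */
};

static void model_lock(struct model_host *h)
{
	h->lock_depth++;
}

static void model_unlock(struct model_host *h)
{
	assert(h->lock_depth > 0);
	h->lock_depth--;
}

static bool model_in_atomic(const struct model_host *h)
{
	return h->lock_depth > 0;
}

/* Models gpiod_get_value(): fine under a lock, wrong for sleeping chips. */
static int model_get_value(struct model_host *h)
{
	if (h->chip_can_sleep)
		h->violations++;
	return h->ro_line;
}

/* Models gpiod_get_value_cansleep(): must not run in atomic context. */
static int model_get_value_cansleep(struct model_host *h)
{
	if (model_in_atomic(h))
		h->violations++;
	return h->ro_line;
}

/* The dispatch suggested above: pick the variant based on context. */
static int model_read_ro(struct model_host *h)
{
	if (model_in_atomic(h))
		return model_get_value(h);
	return model_get_value_cansleep(h);
}
```

In this model, calling model_get_value_cansleep() between model_lock()
and model_unlock() bumps the violation counter, which corresponds to
the splat in the report. model_read_ro() stays clean under the lock as
long as chip_can_sleep is false.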

Bartosz

> mmc_gpio_get_ro from esdhc_pltfm_get_ro+0x20/0x48
> esdhc_pltfm_get_ro from sdhci_check_ro+0x44/0xd4
> sdhci_check_ro from mmc_sd_setup_card+0x2a8/0x47c
> mmc_sd_setup_card from mmc_sd_init_card+0xfc/0x93c
> mmc_sd_init_card from mmc_attach_sd+0xd8/0x180
> mmc_attach_sd from mmc_rescan+0x2ac/0x30c
> mmc_rescan from process_scheduled_works+0x2e4/0x644
> process_scheduled_works from worker_thread+0x188/0x418
> worker_thread from kthread+0x11c/0x144
> kthread from ret_from_fork+0x14/0x38
>
> This is with the imx25-pdk qemu emulation when booting from mmc/sd card.
> It isn't really surprising since sdhci_check_ro() calls the gpio code under
> spin_lock_irqsave(). No idea how to fix that, so I won't even try.
>
> Bisect log attached for reference.
>
> Guenter
>
> ---
> # bad: [052d534373b7ed33712a63d5e17b2b6cdbce84fd] Merge tag 'exfat-for-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat
> # good: [70d201a40823acba23899342d62bc2644051ad2e] Merge tag 'f2fs-for-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
> git bisect start 'HEAD' '70d201a40823'
> # good: [b6e1b708176846248c87318786d22465ac96dd2c] drm/xe: Remove uninitialized variable from warning
> git bisect good b6e1b708176846248c87318786d22465ac96dd2c
> # good: [7912a6391f3ee7eb9f9a69227a209d502679bc0c] Merge tag 'sound-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
> git bisect good 7912a6391f3ee7eb9f9a69227a209d502679bc0c
> # bad: [a3cc31e75185f9b1ad8dc45eac77f8de788dc410] Merge tag 'libnvdimm-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
> git bisect bad a3cc31e75185f9b1ad8dc45eac77f8de788dc410
> # bad: [576db73424305036a6aa9e40daf7109742fbb1df] Merge tag 'gpio-updates-for-v6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
> git bisect bad 576db73424305036a6aa9e40daf7109742fbb1df
> # good: [61f4c3e6711477b8a347ca5fe89e5e6613e0a147] Merge tag 'linux-watchdog-6.8-rc1' of git://www.linux-watchdog.org/linux-watchdog
> git bisect good 61f4c3e6711477b8a347ca5fe89e5e6613e0a147
> # good: [12b7f4ddfcb66dafed432cf4a987f5b40179c0f1] Merge tag 'device_is_big_endian-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core into gpio/for-next
> git bisect good 12b7f4ddfcb66dafed432cf4a987f5b40179c0f1
> # good: [ede7511e7c22c9542a699ddff9f32de74e0bb972] gpiolib: cdev: include overflow.h
> git bisect good ede7511e7c22c9542a699ddff9f32de74e0bb972
> # bad: [f34fd6ee1be84c6e64574e9eb58f89d32c7f98a4] gpio: dwapb: Use generic request, free and set_config
> git bisect bad f34fd6ee1be84c6e64574e9eb58f89d32c7f98a4
> # good: [7dd1871e5049bbd40ee78ac94b1678ba5caf2486] gpio: tps65219: don't use CONFIG_DEBUG_GPIO
> git bisect good 7dd1871e5049bbd40ee78ac94b1678ba5caf2486
> # bad: [0338f6a6fb659f083eca7dd5967bb668d14707f8] gpiolib: drop tabs from local variable declarations
> git bisect bad 0338f6a6fb659f083eca7dd5967bb668d14707f8
> # bad: [5d5dfc50e5689d5b09de4a323f84c28a6700d156] gpiolib: remove extra_checks
> git bisect bad 5d5dfc50e5689d5b09de4a323f84c28a6700d156
> # first bad commit: [5d5dfc50e5689d5b09de4a323f84c28a6700d156] gpiolib: remove extra_checks