Re: regression: gpiolib: switch the line state notifier to atomic unexpected impact on performance
From: Bartosz Golaszewski
Date: Tue Mar 11 2025 - 06:21:48 EST
On Tue, Mar 11, 2025 at 11:01 AM David Jander <david@xxxxxxxxxxx> wrote:
>
>
> Dear Bartosz,
>
> I noticed this because after updating the kernel from 6.11 to 6.14 a
> user-space application that uses GPIOs heavily started getting extremely slow,
> to the point that I will need to heavily modify this application in order to
> be usable again.
> I traced the problem down to the following patch that went into 6.13:
>
> fcc8b637c542 gpiolib: switch the line state notifier to atomic
>
> What happens here, is that gpio_chrdev_release() now calls
> atomic_notifier_chain_unregister(), which uses RCU, and as such must call
> synchronize_rcu(). synchronize_rcu() waits for the RCU grace period to expire
> before returning and according to the documentation can cause a delay of up to
> several milliseconds. In fact it seems to take between 8-10ms on my system (an
> STM32MP153C single-core Cortex-A7).
>
> This has the effect that calling close() on a /dev/gpiochipX now takes ~10ms
> each time. If I git-revert this commit, close() takes less than 1ms.
>
Thanks for the detailed report!

> 10ms doesn't sound like much, but it is more than 10x the time it took before,
> and unfortunately libgpiod code calls this function very often in some places,
> especially in find_line() if your board has many gpiochips (mine has 16
> chardevs).

Yeah, I imagine it can affect the speed of execution of gpiofind,
gpiodetect and any other program that iterates over all character
devices.
>
> The effect can easily be reproduced with the gpiofind tool:
>
> Running on kernel 6.12:
>
> $ time gpiofind LPOUT0
> gpiochip7 9
> real 0m 0.02s
> user 0m 0.00s
> sys 0m 0.01s
>
> Running on kernel 6.13:
>
> $ time gpiofind LPOUT0
> gpiochip7 9
> real 0m 0.19s
> user 0m 0.00s
> sys 0m 0.01s
>
> That is almost a 10x increase in execution time of the whole program!!
>
> On kernel 6.13, after git revert -n fcc8b637c542 time is back to what it was
> on 6.12.
>
> Unfortunately I can't come up with an easy solution to this problem, that's
> why I don't have a patch to propose. Sorry for that.
>
> I still think it is a bit alarming this change has such a huge impact. IMHO it
> really shouldn't. What can be done about this? Is it maybe possible to defer
> unregistering and freeing to a kthread and return from the release function
> earlier?
>
This was my first idea too. Alternatively we can switch to using a raw
notifier and provide a spinlock ourselves.
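
To illustrate the second option, here is a userspace model of a notifier
chain guarded by a caller-provided spinlock. It is a sketch of the idea
only, not gpiolib code: all names are hypothetical, and a C11 atomic_flag
stands in for the kernel spinlock. In the kernel this would presumably
map to raw_notifier_chain_register() and raw_notifier_chain_unregister()
called with a spinlock held.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Userspace model only; hypothetical names, not the gpiolib API. */
struct notifier {
	int (*call)(struct notifier *nb, unsigned long action, void *data);
	struct notifier *next;
};

struct line_state_chain {
	atomic_flag lock;	/* stands in for a kernel spinlock */
	struct notifier *head;	/* stands in for the raw notifier list */
};

static void chain_lock(struct line_state_chain *c)
{
	while (atomic_flag_test_and_set_explicit(&c->lock,
						 memory_order_acquire))
		; /* spin until the flag was previously clear */
}

static void chain_unlock(struct line_state_chain *c)
{
	atomic_flag_clear_explicit(&c->lock, memory_order_release);
}

static void chain_register(struct line_state_chain *c, struct notifier *nb)
{
	chain_lock(c);
	nb->next = c->head;
	c->head = nb;
	chain_unlock(c);
}

/* Returns as soon as the lock is dropped: no grace period to wait for. */
static void chain_unregister(struct line_state_chain *c, struct notifier *nb)
{
	struct notifier **p;

	chain_lock(c);
	for (p = &c->head; *p; p = &(*p)->next) {
		if (*p == nb) {
			*p = nb->next;
			break;
		}
	}
	chain_unlock(c);
}

/*
 * Calls every registered callback and returns how many were called.
 * Callbacks run under the lock, so they must not block or touch the chain.
 */
static int chain_notify(struct line_state_chain *c, unsigned long action,
			void *data)
{
	struct notifier *nb;
	int calls = 0;

	chain_lock(c);
	for (nb = c->head; nb; nb = nb->next) {
		nb->call(nb, action, data);
		calls++;
	}
	chain_unlock(c);

	return calls;
}

/* Example callback: counts events through the int that `data` points to. */
static int count_event(struct notifier *nb, unsigned long action, void *data)
{
	(void)nb;
	(void)action;
	(*(int *)data)++;
	return 0;
}
```

The trade-off in this model is that notification now contends on the
lock with register/unregister, whereas unregistering no longer has to
wait for readers to drain.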
Bartosz