Re: [PATCH 1/4] gpio: Remove VLA from gpiolib
From: Lukas Wunner
Date: Mon Mar 19 2018 - 03:00:56 EST
On Sun, Mar 18, 2018 at 09:34:12PM +0100, Rasmus Villemoes wrote:
> On 2018-03-18 15:23, Lukas Wunner wrote:
> >>> Other random thoughts: maybe two allocations for each loop iteration is
> >>> a bit much. Maybe do a first pass over the array and collect the maximal
> >>> chip->ngpio, do the memory allocation and freeing outside the loop (then
> >>> you'd of course need to preserve the memset() with appropriate length
> >>> computed). And maybe even just do one allocation, making bits point at
> >>> the second half.
> >>
> >> I think those are great ideas because the function is kind of a hotpath
> >> and usage of VLAs was motivated by the desire to make it fast.
> >>
> >> I'd go one step further and store the maximum ngpio of all registered
> >> chips in a global variable (and update it in gpiochip_add_data_with_key()),
> >> then allocate 2 * max_ngpio once before entering the loop (as you've
> >> suggested). That would avoid the first pass to determine the maximum
> >> chip->ngpio. In most systems max_ngpio will be < 64, so one or two
> >> unsigned longs depending on the arch's bitness.
> >
> > Actually, scratch that. If ngpio is usually smallish, we can just
> > allocate reasonably sized space for mask and bits on the stack,
>
> Yes.
>
> > and fall back to the kcalloc slowpath only if chip->ngpio exceeds
> > that limit.
>
> Well, I'd suggest not adding that fallback code now, but simply add a
> check in gpiochip_add_data_with_key to ensure ngpio is sane (and refuse
> to register the chip otherwise), at least if we know that every
> currently supported/known chip is covered by the 256 (?).

The number 256 was chosen arbitrarily. I wouldn't be surprised if
gpiochips with more pins exist. They're probably rare, so taking the
slowpath for them seems fine, whereas dropping support for them
completely would be a regression.

E.g. many serially attached chips such as MAX3191X are daisy-chainable
and the driver deliberately doesn't impose an upper limit on the number
of chips because the spec doesn't contain one either. To the OS a
daisy-chain of such chips appears as a single gpiochip with many pins.
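
To illustrate the fastpath/slowpath split, here is a minimal userspace
sketch, not the actual gpiolib patch. It assumes calloc()/free() in place
of kcalloc()/kfree(), and struct chip_sketch, set_multiple_sketch() and
FASTPATH_NGPIO are hypothetical names made up for this example. Chips
within the limit use a fixed-size stack buffer. Larger ones get a single
heap allocation, with bits pointing at its second half.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define FASTPATH_NGPIO   256	/* arbitrary limit, as discussed above */
#define BITS_PER_LONG    (sizeof(unsigned long) * CHAR_BIT)
#define BITS_TO_LONGS(n) (((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Hypothetical stand-in for struct gpio_chip. */
struct chip_sketch {
	unsigned int ngpio;
};

/*
 * Returns 0 on success, -1 if the slowpath allocation fails.
 * *used_slowpath reports which path was taken (illustration only).
 */
int set_multiple_sketch(const struct chip_sketch *chip, bool *used_slowpath)
{
	/* Fixed size known at compile time, so this is not a VLA. */
	unsigned long fastpath[2 * BITS_TO_LONGS(FASTPATH_NGPIO)];
	unsigned long *mask, *bits;
	size_t nlongs = BITS_TO_LONGS(chip->ngpio);

	if (chip->ngpio <= FASTPATH_NGPIO) {
		/* Fastpath: stack buffer, clear only the part we use. */
		mask = fastpath;
		memset(mask, 0, 2 * nlongs * sizeof(*mask));
		*used_slowpath = false;
	} else {
		/* Slowpath: one zeroed allocation for both bitmaps. */
		mask = calloc(2 * nlongs, sizeof(*mask));
		if (!mask)
			return -1;
		*used_slowpath = true;
	}
	bits = mask + nlongs;	/* second half of the same buffer */

	/* Placeholder for the real work: touch the chip's last line. */
	if (chip->ngpio > 0) {
		unsigned int last = chip->ngpio - 1;

		mask[last / BITS_PER_LONG] |= 1UL << (last % BITS_PER_LONG);
		bits[last / BITS_PER_LONG] |= 1UL << (last % BITS_PER_LONG);
	}

	if (mask != fastpath)
		free(mask);
	return 0;
}
```

With this shape, a chip with ngpio = 32 never touches the allocator,
while a hypothetical daisy-chain exposing ngpio = 1024 still works and
only pays for one allocation per call.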

Thanks,
Lukas