Re: [PATCH v12 4/4] gpio: xilinx: Utilize generic bitmap_get_value and _set_value

From: Syed Nayyar Waris
Date: Mon Nov 09 2020 - 09:49:05 EST


On Mon, Nov 9, 2020 at 8:09 PM William Breathitt Gray
<vilhelm.gray@xxxxxxxxx> wrote:
>
> On Mon, Nov 09, 2020 at 08:41:28AM -0500, William Breathitt Gray wrote:
> > On Mon, Nov 09, 2020 at 06:04:11PM +0530, Syed Nayyar Waris wrote:
> > > On Sun, Nov 01, 2020 at 09:08:29PM +0100, Arnd Bergmann wrote:
> > > > On Sun, Nov 1, 2020 at 4:00 PM William Breathitt Gray
> > > > <vilhelm.gray@xxxxxxxxx> wrote:
> > > > >
> > > > > On Thu, Oct 29, 2020 at 11:44:47PM +0100, Arnd Bergmann wrote:
> > > > > > On Sun, Oct 18, 2020 at 11:44 PM Syed Nayyar Waris <syednwaris@xxxxxxxxx> wrote:
> > > > > > >
> > > > > > > This patch reimplements the xgpio_set_multiple() function in
> > > > > > > drivers/gpio/gpio-xilinx.c to use the new generic functions:
> > > > > > > bitmap_get_value() and bitmap_set_value(). The code is now simpler
> > > > > > > to read and understand. Moreover, instead of looping for each bit
> > > > > > > in xgpio_set_multiple() function, now we can check each channel at
> > > > > > > a time and save cycles.
> > > > > >
> > > > > > This now causes -Wtype-limits warnings in linux-next with gcc-10:
> > > > >
> > > > > Hi Arnd,
> > > > >
> > > > > What version of gcc-10 are you running? I'm having trouble generating
> > > > > these warnings so I suspect I'm using a different version than you.
> > > >
> > > > I originally saw it with the binaries from
> > > > https://mirrors.edge.kernel.org/pub/tools/crosstool/, but I have
> > > > also been able to reproduce it with a minimal test case on the
> > > > binaries from godbolt.org, see https://godbolt.org/z/Wq8q4n
> > > >
> > > > > Let me first verify that I understand the problem correctly. The issue
> > > > > is the possibility of a stack smash in bitmap_set_value() when the value
> > > > > of start + nbits is larger than the length of the map bitmap memory
> > > > > region. This is because index (or index + 1) could be outside the range
> > > > > of the bitmap memory region passed in as map. Is my understanding
> > > > > correct here?
> > > >
> > > > Yes, that seems to be the case here.
> > > >
> > > > > In xgpio_set_multiple(), the variables width[0] and width[1] serve as
> > > > > possible start and nbits values for the bitmap_set_value() calls.
> > > > > Because width[0] and width[1] are unsigned int variables, GCC considers
> > > > > the possibility that the value of width[0]/width[1] might exceed the
> > > > > length of the bitmap memory region named old and thus result in a stack
> > > > > smash.
> > > > >
> > > > > I don't know if invalid width values are actually possible for the
> > > > > Xilinx gpio device, but let's err on the side of safety and assume this
> > > > > is actually a possibility. We should verify that the combined value of
> > > > > gpio_width[0] + gpio_width[1] does not exceed 64 bits; we can add a
> > > > > check for this in xgpio_probe() when we grab the gpio_width values.
> > > > >
> > > > > However, we're still left with the GCC warnings because GCC is not smart
> > > > > enough to know that we've already checked the boundary and width[0] and
> > > > > > width[1] are valid values. I suspect we can avoid this warning if we
> > > > > > refactor bitmap_set_value() to increment map separately and then set it:
> > > >
> > > > As I understand it, part of the problem is that gcc sees the possible
> > > > range as being constrained by the operations on 'start' and 'nbits',
> > > > in particular the shift in BIT_WORD() that put an upper bound on
> > > > the index, but then it sees that the upper bound is higher than the
> > > > upper bound of the array, i.e. element zero.
> > > >
> > > > I added a check
> > > >
> > > > if (start >= 64 || start + size >= 64) return;
> > > >
> > > > in the godbolt.org testcase, which does help limit the start
> > > > index appropriately, but it is not sufficient to let the compiler
> > > > see that the 'if (space >= nbits) ' condition is guaranteed to
> > > > be true for all values here.
> > > >
> > > > > > static inline void bitmap_set_value(unsigned long *map,
> > > > > >                                     unsigned long value,
> > > > > >                                     unsigned long start, unsigned long nbits)
> > > > > > {
> > > > > >         const unsigned long offset = start % BITS_PER_LONG;
> > > > > >         const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
> > > > > >         const unsigned long space = ceiling - start;
> > > > > >
> > > > > >         map += BIT_WORD(start);
> > > > > >         value &= GENMASK(nbits - 1, 0);
> > > > > >
> > > > > >         if (space >= nbits) {
> > > > > >                 *map &= ~(GENMASK(nbits - 1, 0) << offset);
> > > > > >                 *map |= value << offset;
> > > > > >         } else {
> > > > > >                 *map &= ~BITMAP_FIRST_WORD_MASK(start);
> > > > > >                 *map |= value << offset;
> > > > > >                 map++;
> > > > > >                 *map &= ~BITMAP_LAST_WORD_MASK(start + nbits);
> > > > > >                 *map |= value >> space;
> > > > > >         }
> > > > > > }
> > > > >
> > > > > This avoids adding a costly conditional check inside bitmap_set_value()
> > > > > when almost all bitmap_set_value() calls will have static arguments with
> > > > > well-defined and obvious boundaries.
> > > > >
> > > > > Do you think this would be an acceptable solution to resolve your GCC
> > > > > warnings?
> > > >
> > > > Unfortunately, it does not seem to make a difference, as gcc still
> > > > knows that this compiles to the same result, and it produces the same
> > > > warning as before (see https://godbolt.org/z/rjx34r)
> > > >
> > > > Arnd
> > >
> > > Hi Arnd,
> > >
> > > Sharing a different version of the bitmap_set_value() function. See below.
> > >
> > > Let me know if the below solution looks good to you and if it resolves
> > > the above compiler warning.
> > >
> > >
> > > @@ -1,5 +1,5 @@
> > >  static inline void bitmap_set_value(unsigned long *map,
> > > -                                    unsigned long value,
> > > +                                    unsigned long value, const size_t length,
> > >                                      unsigned long start, unsigned long nbits)
> > >  {
> > >          const size_t index = BIT_WORD(start);
> > > @@ -7,6 +7,9 @@ static inline void bitmap_set_value(unsigned long *map,
> > >          const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
> > >          const unsigned long space = ceiling - start;
> > >
> > > +        if (index >= length)
> > > +                return;
> > > +
> > >          value &= GENMASK(nbits - 1, 0);
> > >
> > >          if (space >= nbits) {
> > > @@ -15,6 +18,10 @@ static inline void bitmap_set_value(unsigned long *map,
> > >          } else {
> > >                  map[index + 0] &= ~BITMAP_FIRST_WORD_MASK(start);
> > >                  map[index + 0] |= value << offset;
> > > +
> > > +                if (index + 1 >= length)
> > > +                        return;
> > > +
> > >                  map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
> > >                  map[index + 1] |= value >> space;
> > >          }
> >
> > One of my concerns is that we're incurring the latency of two additional
> > conditional checks just to suppress a compiler warning about a case that
> > wouldn't occur in the actual use of bitmap_set_value(). I'm hoping
> > there's a way for us to suppress these warnings without adding onto the
> > latency of this function; given that bitmap_set_value() is intended to
> > be used in loops, conditionals here could significantly increase latency
> > in drivers.
> >
> > I wonder if array_index_nospec() might have the side effect of
> > suppressing these warnings for us. For example, would this work:
> >
> > static inline void bitmap_set_value(unsigned long *map,
> >                                     unsigned long value,
> >                                     unsigned long start, unsigned long nbits)
> > {
> >         const unsigned long offset = start % BITS_PER_LONG;
> >         const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
> >         const unsigned long space = ceiling - start;
> >         size_t index = BIT_WORD(start);
> >
> >         value &= GENMASK(nbits - 1, 0);
> >
> >         if (space >= nbits) {
> >                 index = array_index_nospec(index, index + 1);
> >
> >                 map[index] &= ~(GENMASK(nbits - 1, 0) << offset);
> >                 map[index] |= value << offset;
> >         } else {
> >                 index = array_index_nospec(index, index + 2);
> >
> >                 map[index + 0] &= ~BITMAP_FIRST_WORD_MASK(start);
> >                 map[index + 0] |= value << offset;
> >                 map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
> >                 map[index + 1] |= value >> space;
> >         }
> > }
> >
> > Or is this going to produce the same warning because we're not using an
> > explicit check against the map array size?
> >
> > William Breathitt Gray
>
> After testing my suggestion, it looks like the warnings are still
> present. :-(
>
> Something else I've also considered is perhaps using the GCC built-in
> function __builtin_unreachable() instead of returning. So in Syed's code
> we would have the following instead:
>
>         if (index + 1 >= length)
>                 __builtin_unreachable();
>
> This might allow GCC to optimize better and avoid the conditional check
> altogether, thus avoiding latency while also hinting enough context to
> the compiler to suppress the warnings.
>
> William Breathitt Gray

I also thought of another optimization. Arnd, William, let me know
what you think about it.

Since exceeding the array limit is a rare event, we can wrap the
boundary checks in the kernel's unlikely() macro (a wrapper around
GCC's __builtin_expect()).
We can use it at the two places where 'index' and 'index + 1' are
checked against the boundary limit.

It might help optimize the code. Wouldn't it?
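
To make the idea concrete, here is a rough, untested-in-kernel sketch of
what I have in mind. It is the version from my earlier diff with the two
checks wrapped in unlikely(). The macro definitions at the top are
userspace stand-ins I wrote only so the snippet compiles on its own; they
are assumptions, not copies of the kernel headers. I am also assuming that
'length' counts the number of longs in 'map' and that nbits is at least 1.

```c
#include <limits.h>
#include <stddef.h>

/* Userspace stand-ins for kernel helpers (assumptions for illustration). */
#define BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define BIT_WORD(nr)	((nr) / BITS_PER_LONG)
#define GENMASK(h, l)	((~0UL << (l)) & (~0UL >> (BITS_PER_LONG - 1 - (h))))
#define round_up(x, y)	((((x) - 1) | ((y) - 1)) + 1)
#define BITMAP_FIRST_WORD_MASK(start) (~0UL << ((start) & (BITS_PER_LONG - 1)))
#define BITMAP_LAST_WORD_MASK(nbits) (~0UL >> (-(nbits) & (BITS_PER_LONG - 1)))
#define unlikely(x)	__builtin_expect(!!(x), 0)

/* 'length' is assumed to be the size of 'map' in longs; nbits >= 1. */
static inline void bitmap_set_value(unsigned long *map,
				    unsigned long value, const size_t length,
				    unsigned long start, unsigned long nbits)
{
	const size_t index = BIT_WORD(start);
	const unsigned long offset = start % BITS_PER_LONG;
	const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
	const unsigned long space = ceiling - start;

	/* First word out of range: hint that this is the cold path. */
	if (unlikely(index >= length))
		return;

	value &= GENMASK(nbits - 1, 0);

	if (space >= nbits) {
		/* Value fits entirely in one word. */
		map[index] &= ~(GENMASK(nbits - 1, 0) << offset);
		map[index] |= value << offset;
	} else {
		/* Value straddles two words: low part first. */
		map[index + 0] &= ~BITMAP_FIRST_WORD_MASK(start);
		map[index + 0] |= value << offset;

		/* Second word out of range: also marked as cold. */
		if (unlikely(index + 1 >= length))
			return;

		map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
		map[index + 1] |= value >> space;
	}
}
```

Note that unlikely() is only a branch-layout hint: the comparisons are
still there, so this would not remove the checks the way William's
__builtin_unreachable() idea might, and I have not verified whether it
changes the -Wtype-limits warning.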

Syed Nayyar Waris