Re: [PATCH v17 01/14] bitops: Introduce the for_each_set_clump8 macro

From: William Breathitt Gray
Date: Thu Oct 10 2019 - 09:13:01 EST


On Thu, Oct 10, 2019 at 10:21:45AM +0200, Geert Uytterhoeven wrote:
> Hi Andy,
>
> On Thu, Oct 10, 2019 at 10:08 AM Andy Shevchenko
> <andy.shevchenko@xxxxxxxxx> wrote:
> > On Thu, Oct 10, 2019 at 09:49:51AM +0200, Geert Uytterhoeven wrote:
> > > On Thu, Oct 10, 2019 at 9:42 AM Andy Shevchenko
> > > <andy.shevchenko@xxxxxxxxx> wrote:
> > > > On Thu, Oct 10, 2019 at 9:29 AM Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> wrote:
> > > > > On Thu, Oct 10, 2019 at 7:49 AM Andy Shevchenko
> > > > > <andy.shevchenko@xxxxxxxxx> wrote:
> > > > > > On Thu, Oct 10, 2019 at 5:31 AM Masahiro Yamada
> > > > > > <yamada.masahiro@xxxxxxxxxxxxx> wrote:
> > > > > > > On Thu, Oct 10, 2019 at 3:54 AM Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> wrote:
> > > > > > > > On Wed, Oct 9, 2019 at 7:09 PM Andy Shevchenko
> > > > > > > > <andriy.shevchenko@xxxxxxxxxxxxxxx> wrote:
> > > > > > > > > On Thu, Oct 10, 2019 at 01:28:08AM +0900, Masahiro Yamada wrote:
> > > > > > > > > > On Thu, Oct 10, 2019 at 12:27 AM William Breathitt Gray
> > > > > > > > > > <vilhelm.gray@xxxxxxxxx> wrote:
> >
> > > > > > > > > > > Why is the return type "unsigned long" when you know
> > > > > > > > > > > it returns an 8-bit value?
> > > > > > > > >
> > > > > > > > > Because bitmap API operates on unsigned long type. This is not only
> > > > > > > > > consistency, but for sake of flexibility in case we would like to introduce
> > > > > > > > > more calls like clump16 or so.
> > > > > > > >
> > > > > > > > TBH, that doesn't convince me: those functions explicitly take/return an
> > > > > > > > 8-bit value, and have "8" in their name. The 8-bit value is never
> > > > > > > > really related to, retrieved from, or stored in a full "unsigned long"
> > > > > > > > element of a bitmap, only to/from/in a part (byte) of it.
> > > > > > > >
> > > > > > > > Following your rationale, all of iowrite{8,16,32,64}*() should take an
> > > > > > > > "unsigned long" value, too.
> > > > > > >
> > > > > > > Using u8/u16/u32/u64 looks more consistent with other bitmap helpers.
> > > > > > >
> > > > > > > void bitmap_from_arr32(unsigned long *bitmap, const u32 *buf, unsigned
> > > > > > > int nbits);
> > > > > > > void bitmap_to_arr32(u32 *buf, const unsigned long *bitmap, unsigned int nbits);
> > > > > > > static inline void bitmap_from_u64(unsigned long *dst, u64 mask);
> > > > > > >
> > > > > > > If you want to see more examples from other parts,
> > > > > >
> > > > > > > Geert's and your examples are both unrelated. They are about
> > > > > > > fixed-width properties where we know the width is part of a protocol.
> > > > > > > Here we have no protocol which restricts us to the mentioned fixed-width types.
> > > > >
> > > > > Yes you have: they are functions to store/retrieve an 8-bit value from
> > > > > the middle of the bitmap, which is reflected in their names ("clump8",
> > > > > "value8").
> > > > > The input/output value is clearly separated from the actual bitmap,
> > > > > which is referenced by the "unsigned long *".
> > > > >
> > > > > If you add new "value16" functions, they will be intended to store/retrieve
> > > > > 16-bit values.
> > > >
> > > > And if I add 4-bit, 12-bit or 24-bit values, what should I use?
> > >
> > > Whatever is needed to store that?
> > > I agree "unsigned long" is appropriate for a generic function to extract a
> > > bit field of 1 to BITS_PER_LONG bits.
> > >
> > > > > Besides, if retrieving an 8-bit value requires passing an
> > > > > "unsigned long *", the caller needs two variables: one unsigned long to
> > > > > pass the address of, and one u8 to copy the returned value into.
> > > >
> > > > Why do you need a temporary variable? In some cases it might make
> > > > sense, but in general simple cases I don't see what you may achieve
> > > > with it.
> > >
> > > Because find_next_clump8() takes a pointer to store the output value.
> >
> > So does regmap_read().
>
> I believe that one is different, as it is a generic function, and the
> width of the returned value depends on the regmap config.
>
> > The 8 appeared there during review, when it was proposed to optimize for 8-bit
> > clumps since most of the current users utilize them. The initial idea was to be
> > bit-width agnostic. And with the current API it's possible to easily convert to
> > other formats later if we need to.
>
> "optimized for 8-bit clumps" and "out-of-line function that takes an
> unsigned long pointer for an output parameter" don't match well, IMHO.
>
> Gr{oetje,eeting}s,
>
> Geert
>
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx
>
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
> -- Linus Torvalds

"Optimize" may not be the best way of describing it. I conceded to
introducing a restricted implementation (i.e. for_each_set_clump8) since
there were disagreements on the best approach for implementing a
generic for_each_set_clump macro that could support any bit size. So I
settled for introducing just for_each_set_clump8 since it has an
implementation everyone could agree on and I didn't want to stall the
patchset over this introduction.

I'm hoping to propose the generic for_each_set_clump macro again in the
future after for_each_set_clump8 has had time to be utilized. There are
some files that I think might benefit from such a generic implementation
(e.g. gpio-thunderx with 64-bit ports and gpio-xilinx with variable size
channels). In that case, for_each_set_clump8 would likely be
reimplemented as a macro passing a hardcoded 8 to for_each_set_clump --
or perhaps just eliminated and replaced with for_each_set_clump directly
-- so keeping clump as an unsigned long pointer is useful since we
won't need to worry about redeclaring variables to match the datatype.
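
To make the idea concrete, here is a minimal userspace sketch of how an
8-bit iterator could become a thin wrapper around a generic one. Every
name prefixed with sketch_ is hypothetical and is not part of this
patchset or of the kernel API. The sketch assumes clump_size is between
1 and BITS_PER_LONG and that size is a multiple of clump_size.

```c
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical generic helper: read nbits (1..BITS_PER_LONG) starting at
 * bit position start. The field may straddle two words of the bitmap.
 */
static unsigned long sketch_get_value(const unsigned long *bitmap,
				      size_t start, size_t nbits)
{
	size_t index = start / BITS_PER_LONG;
	size_t offset = start % BITS_PER_LONG;
	size_t space = BITS_PER_LONG - offset;	/* bits left in this word */
	unsigned long mask = nbits < BITS_PER_LONG ? (1UL << nbits) - 1 : ~0UL;
	unsigned long value = bitmap[index] >> offset;

	/* Pull the remaining high bits from the next word when needed. */
	if (nbits > space)
		value |= bitmap[index + 1] << space;

	return value & mask;
}

/*
 * Hypothetical: find the next non-zero clump at or after bit offset,
 * rounded down to a clump boundary. Returns size when none is left.
 */
static size_t sketch_find_next_clump(unsigned long *clump,
				     const unsigned long *bitmap,
				     size_t size, size_t offset,
				     size_t clump_size)
{
	size_t start = (offset / clump_size) * clump_size;

	for (; start + clump_size <= size; start += clump_size) {
		*clump = sketch_get_value(bitmap, start, clump_size);
		if (*clump)
			return start;
	}
	return size;
}

#define sketch_for_each_set_clump(start, clump, bits, size, clump_size)	\
	for ((start) = sketch_find_next_clump(&(clump), (bits), (size), 0, \
					      (clump_size));		\
	     (start) < (size);						\
	     (start) = sketch_find_next_clump(&(clump), (bits), (size),	\
					      (start) + (clump_size),	\
					      (clump_size)))

/* The 8-bit iterator reduces to the generic one with a hardcoded 8. */
#define sketch_for_each_set_clump8(start, clump, bits, size)		\
	sketch_for_each_set_clump(start, clump, bits, size, 8)
```

Since clump stays an unsigned long in both macros, a caller switching
from the 8-bit wrapper to the generic form only has to add the size
argument; no variable needs to be redeclared.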

Though I admit that there are advantages in specifying the datatype as
u8 (or u16, u32, etc.). If we know the size then it's reasonable to
expect that the implementation can be optimized to not worry about
variable sizes and boundaries -- as exemplified by the simplicity of the
for_each_set_clump8 implementation. So that may be an argument for
keeping the for_each_set_clump8 implementation separate from the generic
for_each_set_clump implementation.
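
For comparison, a fixed-size helper pair typed as u8 could look roughly
like the userspace sketch below. The sketch_ names are hypothetical and
this is not the code from the patch; uint8_t stands in for the kernel's
u8, and start is assumed to be a multiple of 8.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical u8 getter; start must be a multiple of 8. */
static uint8_t sketch_get_value8(const unsigned long *bitmap, size_t start)
{
	size_t index = start / BITS_PER_LONG;
	size_t offset = start % BITS_PER_LONG;

	/*
	 * BITS_PER_LONG is a multiple of 8, so an aligned byte never
	 * straddles two words: one shift and a truncation are enough.
	 */
	return (uint8_t)(bitmap[index] >> offset);
}

/* Hypothetical u8 setter; start must be a multiple of 8. */
static void sketch_set_value8(unsigned long *bitmap, uint8_t value,
			      size_t start)
{
	size_t index = start / BITS_PER_LONG;
	size_t offset = start % BITS_PER_LONG;

	bitmap[index] &= ~(0xFFUL << offset);		/* clear old byte */
	bitmap[index] |= (unsigned long)value << offset;	/* store new one */
}
```

Knowing the width up front removes the word-boundary branch and the mask
computation that a variable-size helper has to carry.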

William Breathitt Gray