Re: [RFC PATCH 1/2] bits: introduce ffs_val()
From: David Laight
Date: Fri Jan 09 2026 - 18:54:34 EST
On Fri, 09 Jan 2026 18:01:02 +0100
"Arnd Bergmann" <arnd@xxxxxxxx> wrote:
> On Fri, Jan 9, 2026, at 17:37, Petr Tesarik wrote:
>
> > + * Returns:
> > + * least significant non-zero bit, 0 if all bits are zero
> > + */
> > +#define ffs_val(x) \
> > +({ \
> > + const typeof(x) val__ = (x); \
> > + val__ & -val__; \
> > +})
>
> This looks good to me, but I'd suggest using 'const auto val__'
> instead of typeof(), to reduce expanding complex arguments twice.
It is more usual to just use a single _ prefix and the same name.
I wouldn't bother with 'const' either, maybe:
#define ffs_val(val) ({ \
auto _val = val; \
_val & -_val; \
})
However it isn't necessarily better than using __ffs().
FIELD_PREP(mask, val) (for non-constant mask) can be (mask & -mask) * val
or val << __ffs(mask).
So the 'ffs' version is fewer instructions (assuming non-zero mask).
The timings for bsf/bsr are similar to those for imul on intel cpu,
mul wins on zen3 and bsf/bsr on zen4.
On balance the __ffs() version is actually likely to be faster.
Other architectures may fair better or worse.
Clearly you don't want to use __ffs() unless it is a single instruction.
Of course, for FIELD_GET(reg, mask) you'd need reg/(mask & -mask)
and the cost of the integer division is far more than __ffs().
(But that will give you a compile-time error if mask is a constant zero
and the compiler will convert the divide to a shift.)
And for 64bit calculations on 32bit 'all bets are off'.
Even the shift left might be problematic.
David
>
> Arnd
>