Daniel Phillips <dphillips@sistina.com> writes:
> Here's a faster implementation of generic_fls, that I discovered
> accidently, by not noticing 2.5 already had a generic_fls, and so I
> rolled my own. Like the incumbent, it's O log2(bits), but there's
> not a lot of resemblance beyond that. I think the new algorithm is
> inherently more parallelizable than the traditional approach. A
> processor that can speculatively evaluate both sides of a
> conditional would benefit even more than the PIII I tested on.
gcc 3.4 will have a __builtin_ctz function which can be used for this.
It will emit special instructions on CPUs that support it (i386, Alpha
EV67), and use a lookup table on others, which is very boring, but
also faster.
-- Falk - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
This archive was generated by hypermail 2b29 : Wed Apr 30 2003 - 22:00:34 EST