Re: [RFC][PATCH] Faster generic_fls

From: Falk Hueffner (falk.hueffner@student.uni-tuebingen.de)
Date: Wed Apr 30 2003 - 06:14:43 EST


Daniel Phillips <dphillips@sistina.com> writes:

> Here's a faster implementation of generic_fls, that I discovered
> accidently, by not noticing 2.5 already had a generic_fls, and so I
> rolled my own. Like the incumbent, it's O log2(bits), but there's
> not a lot of resemblance beyond that. I think the new algorithm is
> inherently more parallelizable than the traditional approach. A
> processor that can speculatively evaluate both sides of a
> conditional would benefit even more than the PIII I tested on.

gcc 3.4 will have a __builtin_ctz function which can be used for this.
It will emit special instructions on CPUs that support it (i386, Alpha
EV67), and use a lookup table on others, which is very boring, but
also faster.

-- 
	Falk
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Wed Apr 30 2003 - 22:00:34 EST