Re: Alternative implementation of the generic __ffs

From: Alexander van Heukelum
Date: Sat Apr 19 2008 - 08:10:50 EST


On Fri, 18 Apr 2008 21:13:47 -0700 (PDT), "dean gaudet"
<dean@xxxxxxxxxx> said:
> On Fri, 18 Apr 2008, Joe Perches wrote:
> > On Fri, 2008-04-18 at 18:11 -0700, dean gaudet wrote:
> > > have you benchmarked it?
> >
> > I modified Alexander's benchmark:
> > http://lkml.org/lkml/2008/4/18/267
> > to include 32 and 64 bit variants called smallest.
> >
> > On an old ARM:
>
> i'm guessing the 32-bit constants suck :(
>
> the code could be modified to use 16-bit constants only -- it would add
> some dependent operations though (to move the hot bit into the low
> 16-bits).
>
> -dean

That would look like this (although I chose to keep the constants below
128 rather than just below 65536, due to completely irrelevant x86
considerations ;) ).

static ATTR int __ffs32_smallconstant(unsigned int value)
{
	int x0, x1, x2, x3, x4;
	unsigned int t2, t4;

	value &= -value;		/* isolate the lowest set bit */
	t2 = value | (value >> 16);	/* fold it into the low 16 bits */
	t4 = t2 | (t2 >> 8);		/* fold it into the low 8 bits */
	x4 = (value << 16) ? 0 : 16;
	x3 = (t2 << 24) ? 0 : 8;
	x2 = (t4 & 0x0f) ? 0 : 4;
	x1 = (t4 & 0x33) ? 0 : 2;
	x0 = (t4 & 0x55) ? 0 : 1;

	return x4 | x3 | x2 | x1 | x0;
}
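
For anyone who wants to check the folding steps without running the
whole benchmark, here is a minimal sketch of a sanity check. It is not
part of the benchmark: ATTR is stubbed out as empty (the benchmark has
its own definition), ffs32_naive and count_mismatches are names made up
for this example, and unsigned int is assumed to be 32 bits wide.

```c
#include <assert.h>

/* ATTR comes from the benchmark source; an empty definition is assumed here. */
#define ATTR

/* The function from above, repeated so this sketch compiles on its own. */
static ATTR int __ffs32_smallconstant(unsigned int value)
{
	int x0, x1, x2, x3, x4;
	unsigned int t2, t4;

	value &= -value;
	t2 = value | (value >> 16);
	t4 = t2 | (t2 >> 8);
	x4 = (value << 16) ? 0 : 16;
	x3 = (t2 << 24) ? 0 : 8;
	x2 = (t4 & 0x0f) ? 0 : 4;
	x1 = (t4 & 0x33) ? 0 : 2;
	x0 = (t4 & 0x55) ? 0 : 1;

	return x4 | x3 | x2 | x1 | x0;
}

/* Hypothetical reference: shift right until the lowest bit is set. */
static int ffs32_naive(unsigned int value)
{
	int n = 0;

	assert(value != 0);	/* like __ffs, the result is undefined for 0 */
	while (!(value & 1)) {
		value >>= 1;
		n++;
	}
	return n;
}

/* Returns the number of test inputs on which the two versions disagree. */
static int count_mismatches(void)
{
	int bit, bad = 0;

	for (bit = 0; bit < 32; bit++) {
		unsigned int lone = 1u << bit;	/* only this bit set */
		unsigned int high = ~0u << bit;	/* this bit and all above it */

		if (__ffs32_smallconstant(lone) != ffs32_naive(lone))
			bad++;
		if (__ffs32_smallconstant(high) != ffs32_naive(high))
			bad++;
	}
	return bad;
}
```

Calling count_mismatches() from a test program should give 0. As a
hand-worked case, value = 0x00100000 gives t2 = 0x00100010 and
t4 = 0x00101010, so x4 = 16, x2 = 4 and the rest are 0, i.e. bit 20.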

I've added that to the benchmark, which you can now find here:
http://heukelum.fastmail.fm/ffs/. Testing the same with
"return x4 + x3 + x2 + x1 + x0;" as the last line would be
interesting too.
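
To make the "+" suggestion concrete, a sketch of that variant follows.
The name __ffs32_smallconstant_add is made up here, ATTR is again
stubbed out as empty, unsigned int is assumed to be 32 bits, and the
comparison relies on gcc's __builtin_ctz. The results must be identical
to the "|" version, since each term is either 0 or its own distinct
power of two; only the generated code can differ.

```c
/* ATTR comes from the benchmark source; an empty definition is assumed here. */
#define ATTR

/* Hypothetical name: same steps as __ffs32_smallconstant, summed at the end. */
static ATTR int __ffs32_smallconstant_add(unsigned int value)
{
	int x0, x1, x2, x3, x4;
	unsigned int t2, t4;

	value &= -value;
	t2 = value | (value >> 16);
	t4 = t2 | (t2 >> 8);
	x4 = (value << 16) ? 0 : 16;
	x3 = (t2 << 24) ? 0 : 8;
	x2 = (t4 & 0x0f) ? 0 : 4;
	x1 = (t4 & 0x33) ? 0 : 2;
	x0 = (t4 & 0x55) ? 0 : 1;

	/* Each term is 0 or its own power of two, so no carries can occur. */
	return x4 + x3 + x2 + x1 + x0;
}

/* Returns 1 if the sum variant matches __builtin_ctz for every single-bit input. */
static int add_variant_agrees(void)
{
	int bit;

	for (bit = 0; bit < 32; bit++)
		if (__ffs32_smallconstant_add(1u << bit) != __builtin_ctz(1u << bit))
			return 0;
	return 1;
}
```

Whether the compiler does anything better with the additions than with
the ors is exactly what would need measuring.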

Greetings,
Alexander

> > $ gcc --version
> > gcc (GCC) 3.4.6
> >
> > $ cat /proc/cpuinfo
> > Processor : Intel StrongARM-110 rev 4 (v4l)
> > BogoMIPS : 262.14
> > Hardware : Rebel-NetWinder
> > Revision : 57ff
> > Serial : 000000000000185c
> >
> > $ gcc -Os -fomit-frame-pointer ffs.c
> > $ ./a.out
> > Original: 3180 tics, 8379 tics
> > New: 4280 tics, 8890 tics
> > Smallest: 4027 tics, 7835 tics
> > Empty loop: 1543 tics, 2260 tics
> >
> > $ gcc -O2 -fomit-frame-pointer ffs.c
> > $ ./a.out
> > Original: 3161 tics, 7843 tics
> > New: 4778 tics, 8783 tics
> > Smallest: 4408 tics, 7149 tics
> > Empty loop: 1515 tics, 2140 tics
> >
> > $ gcc -O3 -fomit-frame-pointer ffs.c
> > $ ./a.out
> > Original: 3078 tics, 7692 tics
> > New: 4714 tics, 8671 tics
> > Smallest: 4344 tics, 7117 tics
> > Empty loop: 1444 tics, 2024 tics
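
One way to read these numbers is to subtract the empty-loop time from
each row. The sketch below does that for the -Os run only. It assumes
the empty loop is pure overhead that can simply be subtracted, and it
leaves the two columns unlabelled because the output does not say what
they are; struct tics, net_tics and print_net are names invented for
this example.

```c
#include <stdio.h>

struct tics {
	const char *name;
	int first, second;	/* the two columns of the benchmark output */
};

/* -Os numbers copied from the quoted run. */
static const struct tics os_run[] = {
	{ "Original", 3180, 8379 },
	{ "New",      4280, 8890 },
	{ "Smallest", 4027, 7835 },
};
static const struct tics os_empty = { "Empty loop", 1543, 2260 };

/* Assumption: the empty loop is pure overhead and can simply be subtracted. */
static int net_tics(int measured, int empty)
{
	return measured - empty;
}

/* Prints one line per variant with the empty-loop time removed. */
static void print_net(void)
{
	size_t i;

	for (i = 0; i < sizeof(os_run) / sizeof(os_run[0]); i++)
		printf("%-9s %5d tics, %5d tics\n", os_run[i].name,
		       net_tics(os_run[i].first, os_empty.first),
		       net_tics(os_run[i].second, os_empty.second));
}
```

For -Os that leaves Original at 1637 and 6119, New at 2737 and 6630,
and Smallest at 2484 and 5575 tics.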

Thanks for testing, Joe!
--
Alexander van Heukelum
heukelum@xxxxxxxxxxx
