RE: [PATCH 4/6] perf: Optimize get_recursion_context()

From: David Laight
Date: Sat Oct 31 2020 - 09:19:38 EST


From: David Laight
> Sent: 31 October 2020 12:12
>
...
> The gcc 7.5.0 I have handy probably generates the best code for:
>
> unsigned char q_2(unsigned int pc)
> {
> 	unsigned char rctx = 0;
>
> 	rctx += !!(pc & (NMI_MASK));
> 	rctx += !!(pc & (NMI_MASK | HARDIRQ_MASK));
> 	rctx += !!(pc & (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_OFFSET));
>
> 	return rctx;
> }
>
> 0000000000000000 <q_2>:
>    0:   f7 c7 00 00 f0 00    test   $0xf00000,%edi    # clock 0
>    6:   0f 95 c0             setne  %al               # clock 1
>    9:   f7 c7 00 00 ff 00    test   $0xff0000,%edi    # clock 0
>    f:   0f 95 c2             setne  %dl               # clock 1
>   12:   01 c2                add    %eax,%edx         # clock 2
>   14:   81 e7 00 01 ff 00    and    $0xff0100,%edi
>   1a:   0f 95 c0             setne  %al
>   1d:   01 d0                add    %edx,%eax         # clock 3
>   1f:   c3                   retq
>
> I doubt that is beatable.

I lied, you should be able to get:
	test   $0xff0000,%edi      # clock 0
	setne  %al                 # clock 1
	test   $0xff0100,%edi      # clock 0
	setne  %dl                 # clock 1
	add    $0xfff00000,%edi    # carry set when pc >= 0x100000, i.e. an NMI bit is set
	adc    %dl,%al             # clock 2

But I suspect getting it from the compiler might be hard!
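
For anyone wanting to check the carry trick, here is a rough C model of it, offered as a sketch rather than a proposal. The mask values are read off the immediates in the disassembly (0xf00000, 0xff0000, 0xff0100). The names q_adc() and same_for_all_24bit() are made up for illustration, and the equivalence assumes pc has no bits set above bit 23.

```c
#include <stdint.h>

/* Mask values taken from the immediates in the quoted disassembly. */
#define NMI_MASK        0x00f00000u
#define HARDIRQ_MASK    0x000f0000u
#define SOFTIRQ_OFFSET  0x00000100u

/* The quoted q_2(), behaviour unchanged. */
static unsigned char q_2(uint32_t pc)
{
	unsigned char rctx = 0;

	rctx += !!(pc & NMI_MASK);
	rctx += !!(pc & (NMI_MASK | HARDIRQ_MASK));
	rctx += !!(pc & (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_OFFSET));

	return rctx;
}

/* Hypothetical model of the test/setne/add/adc sequence. */
static unsigned char q_adc(uint32_t pc)
{
	unsigned char al = !!(pc & (NMI_MASK | HARDIRQ_MASK));
	unsigned char dl = !!(pc & (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_OFFSET));
	/* Carry out of the 32-bit add: set exactly when pc >= 0x100000. */
	unsigned char cf = (uint32_t)(pc + 0xfff00000u) < pc;

	/* adc %dl,%al computes al + dl + carry. */
	return al + dl + cf;
}

/* Returns 1 if both versions agree for every pc below 1 << 24. */
static int same_for_all_24bit(void)
{
	uint32_t pc;

	for (pc = 0; pc < (1u << 24); pc++)
		if (q_2(pc) != q_adc(pc))
			return 0;
	return 1;
}
```

If pc could carry anything above bit 23, the add would set carry for those bits too, so the caller would have to mask first.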

David
