Re: [PATCH v4 3/6] perf: Shrink the size of the recursion counter.
From: Sebastian Andrzej Siewior
Date: Mon Jul 01 2024 - 08:56:55 EST
On 2024-07-01 14:31:37 [+0200], Peter Zijlstra wrote:
> On Mon, Jun 24, 2024 at 05:15:16PM +0200, Sebastian Andrzej Siewior wrote:
> > There are four recursion counters, one for each context. The type of the
> > counter is `int' but the counter is used as `bool' since it is only
> > incremented if zero.
> >
> > Reduce the type of the recursion counter to an unsigned char, keep the
> > increment/decrement operation.
>
> Does this actually matter? Aren't u8 memops encoded by longer
> instructions etc..
The goal here isn't to shrink the opcodes but to be able to add the
counter to task_struct without making the struct larger, by placing it
in an existing hole.
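To illustrate the hole-filling argument, here is a minimal sketch. The
struct names and members are hypothetical stand-ins, not the real
task_struct layout, and the sizes assume a typical LP64 ABI such as
x86-64 (8-byte pointers with 8-byte alignment). Four `unsigned char'
counters fit into the padding after a one-byte member, whereas four
`int' counters need 16 bytes and grow the struct wherever they go.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct with a 7-byte padding hole after 'flag' on LP64. */
struct demo_before {
	unsigned char flag;	/* offset 0 */
	void *ptr;		/* offset 8, so bytes 1..7 are padding */
};

/* Four int counters: 16 bytes cannot fit in the hole, the struct grows. */
struct demo_int {
	unsigned char flag;
	void *ptr;
	int recursion[4];
};

/* Four unsigned char counters: 4 bytes land inside the existing hole. */
struct demo_u8 {
	unsigned char flag;
	unsigned char recursion[4];	/* offsets 1..4 */
	void *ptr;			/* still at offset 8 on LP64 */
};

/* Checks the layout claims above; assumes an LP64 target. */
static void demo_check_layout(void)
{
	assert(sizeof(struct demo_u8) == sizeof(struct demo_before));
	assert(sizeof(struct demo_int) > sizeof(struct demo_before));
}
```

On a real kernel build, pahole on the object file is the usual way to
find such holes. The sketch only shows why the member size matters.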
But since you made me look at assembly:
old:
316b: 65 48 8b 15 00 00 00 mov %gs:0x0(%rip),%rdx # 3173 <perf_swevent_get_recursion_context+0x33>
3173: 1c ff sbb $0xff,%al
3175: 48 0f be c8 movsbq %al,%rcx
3179: 48 8d 94 8a 00 00 00 lea 0x0(%rdx,%rcx,4),%rdx
3180: 00
317d: R_X86_64_32S .data..percpu+0x4c
3181: 8b 0a mov (%rdx),%ecx
3183: 85 c9 test %ecx,%ecx
3185: 75 0e jne 3195 <perf_swevent_get_recursion_context+0x55>
3187: c7 02 01 00 00 00 movl $0x1,(%rdx)
^^^
318d: 0f be c0 movsbl %al,%eax
new:
2ff8: 1c ff sbb $0xff,%al
2ffa: 81 e2 00 01 ff 00 and $0xff0100,%edx
3000: 83 fa 01 cmp $0x1,%edx
3003: 1c ff sbb $0xff,%al
3005: 48 0f be d0 movsbq %al,%rdx
3009: 48 8d 94 11 00 00 00 lea 0x0(%rcx,%rdx,1),%rdx
3010: 00
300d: R_X86_64_32S .data..percpu+0x4c
3011: 80 3a 00 cmpb $0x0,(%rdx)
3014: 75 0b jne 3021 <perf_swevent_get_recursion_context+0x51>
3016: c6 02 01 movb $0x1,(%rdx)
^^^
3019: 0f be c0 movsbl %al,%eax
301c: e9 00 00 00 00 jmp 3021 <perf_swevent_get_recursion_context+0x51>
So we even save a few bytes. We could avoid the "movsbl" at 3019 by
making the return type `unsigned char' ;)
Sebastian
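
For reference, the counter pattern discussed in this thread (one counter
per context, incremented only when it is zero, so it effectively acts as
a bool) can be sketched roughly as below. This is a simplified
user-space illustration under stated assumptions: the demo_* names are
hypothetical, the context index is passed in by the caller rather than
derived from the preemption state, and this is not the kernel's code.

```c
#include <assert.h>

#define DEMO_NR_CONTEXTS 4	/* one counter per context, as in the patch description */

/* Hypothetical counter array; unsigned char is enough since values are 0 or 1. */
static unsigned char demo_recursion[DEMO_NR_CONTEXTS];

/*
 * Returns the context index on success, or -1 if this context is
 * already active (i.e. we would recurse).
 */
static int demo_get_recursion_context(unsigned char ctx)
{
	assert(ctx < DEMO_NR_CONTEXTS);

	if (demo_recursion[ctx])
		return -1;

	demo_recursion[ctx]++;	/* only ever goes from 0 to 1 */
	return ctx;
}

/* Releases a context previously obtained with demo_get_recursion_context(). */
static void demo_put_recursion_context(unsigned char ctx)
{
	assert(ctx < DEMO_NR_CONTEXTS);
	demo_recursion[ctx]--;	/* back from 1 to 0 */
}
```

Because the increment only happens when the counter is zero, a compiler
is free to emit a plain store of 1, which matches the movl/movb stores
marked in the disassembly above.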