Re:
From: Yury Norov
Date: Sat Dec 05 2020 - 13:33:00 EST
On Sat, Dec 5, 2020 at 3:10 AM Rasmus Villemoes
<linux@xxxxxxxxxxxxxxxxxx> wrote:
>
> On 03/12/2020 19.46, Yury Norov wrote:
>
> > I would prefer to avoid changing the find*bit() semantics. As for now,
> > if any of find_*_bit()
> > finds nothing, it returns the size of the bitmap it was passed.
>
> Yeah, we should actually try to fix that, it causes bad code generation.
> It's hard, because callers of course do that "if ret == size" check. But
> it's really silly that something like find_first_bit needs to do that
> "min(i*BPL + __ffs(word), size)" - the caller does a comparison anyway,
> that comparison might as well be "ret >= size" rather than "ret ==
> size", and then we could get rid of that branch (which min() necessarily
> becomes) at the end of find_next_bit.
We didn't do that 5 years ago because it's too invasive and the improvement
is barely measurable, the difference is 2 instructions (on arm64).e.
Has something
changed since that?
20000000000000000 <find_first_bit_better>:
0: aa0003e3 mov x3, x0
4: aa0103e0 mov x0, x1
8: b4000181 cbz x1, 38 <find_first_bit_better+0x38>
c: f9400064 ldr x4, [x3]
10: d2800802 mov x2, #0x40 // #64
14: 91002063 add x3, x3, #0x8
18: b40000c4 cbz x4, 30 <find_first_bit_better+0x30>
1c: 14000008 b 3c <find_first_bit_better+0x3c>
20: f8408464 ldr x4, [x3], #8
24: 91010045 add x5, x2, #0x40
28: b50000c4 cbnz x4, 40 <find_first_bit_better+0x40>
2c: aa0503e2 mov x2, x5
30: eb00005f cmp x2, x0
34: 54ffff63 b.cc 20 <find_first_bit_better+0x20> //
b.lo, b.ul, b.last
38: d65f03c0 ret
3c: d2800002 mov x2, #0x0 // #0
40: dac00084 rbit x4, x4
44: dac01084 clz x4, x4
48: 8b020080 add x0, x4, x2
4c: d65f03c0 ret
0000000000000050 <find_first_bit_worse>:
50: aa0003e4 mov x4, x0
54: aa0103e0 mov x0, x1
58: b4000181 cbz x1, 88 <find_first_bit_worse+0x38>
5c: f9400083 ldr x3, [x4]
60: d2800802 mov x2, #0x40 // #64
64: 91002084 add x4, x4, #0x8
68: b40000c3 cbz x3, 80 <find_first_bit_worse+0x30>
6c: 14000008 b 8c <find_first_bit_worse+0x3c>
70: f8408483 ldr x3, [x4], #8
74: 91010045 add x5, x2, #0x40
78: b50000c3 cbnz x3, 90 <find_first_bit_worse+0x40>
7c: aa0503e2 mov x2, x5
80: eb02001f cmp x0, x2
84: 54ffff68 b.hi 70 <find_first_bit_worse+0x20> // b.pmore
88: d65f03c0 ret
8c: d2800002 mov x2, #0x0 // #0
90: dac00063 rbit x3, x3
94: dac01063 clz x3, x3
98: 8b020062 add x2, x3, x2
9c: eb02001f cmp x0, x2
a0: 9a829000 csel x0, x0, x2, ls // ls = plast
a4: d65f03c0 ret
> I haven't dug very deep into this, but I could also imagine the
> arch-specific parts of this might become a little easier to do if the
> semantics were just "if no such bit, return an indeterminate value >=
> the size".
>
> > Changing this for
> > a single function would break the consistency, and may cause problems
> > for those who
> > rely on existing behaviour.
>
> True. But I think it should be possible - I suppose most users are via
> the iterator macros, which could all be updated at once. Changing ret ==
> size to ret >= size will still work even if the implementations have not
> been switched over, so it should be doable.
Since there's no assembler users for it, we can do just:
#define find_first_bit(bitmap, size)
min(better_find_first_bit((bitmap), (size)), (size))
... and deprecate find_first_bit.
> > Passing non-positive size to find_*_bit() should produce undefined
> > behaviour, because we cannot dereference a pointer to the bitmap in
> > this case; this is most probably a sign of a problem on a caller side
> > anyways.
>
> No, the out-of-line bitmap functions should all handle the case of a
> zero-size bitmap sensibly.
I could be more specific, the behaviour is defined: don't dereference
the address and return undefined value (which now is always 0).
> Is bitmap full? Yes (all the 0 bits are set).
> Is bitmap empty? Yes, (none of the 0 bits are set).
> Find the first bit set (returns 0, there's no such bit)
I can't answer because this object is not a map of bits - there's no room for
bits inside.
> Etc. The static inlines for small_const_nbits do assume that the pointer
> can be dereferenced, which is why small_const_nbits was updated to mean
> 1<=bits<=BITS_PER_LONG rather than just bits<=BITS_PER_LONG.
I don't want to do something like
if (size == 0)
return -1;
... because it legitimizes this kind of usage and hides problems on
callers' side.
Instead, I'd add WARN_ON(size == 0), but I don't think it's so
critical to bother with it.
Yury
> Rasmus