Re: [PATCH v4 1/3] powerpc/bitops: Use immediate operand when possible

From: Segher Boessenkool
Date: Mon Sep 20 2021 - 17:30:39 EST


Hi!

On Mon, Sep 20, 2021 at 10:31:17AM +0200, Christophe Leroy wrote:
> Today we get the following code generation for bitops like
> set or clear bit:
>
> c0009fe0: 39 40 08 00 li r10,2048
> c0009fe4: 7c e0 40 28 lwarx r7,0,r8
> c0009fe8: 7c e7 53 78 or r7,r7,r10
> c0009fec: 7c e0 41 2d stwcx. r7,0,r8
>
> c000d568: 39 00 18 00 li r8,6144
> c000d56c: 7c c0 38 28 lwarx r6,0,r7
> c000d570: 7c c6 40 78 andc r6,r6,r8
> c000d574: 7c c0 39 2d stwcx. r6,0,r7
>
> Most set bits are constant on lower 16 bits, so it can easily
> be replaced by the "immediate" version of the operation. Allow
> GCC to choose between the normal or immediate form.

You can also handle the second sixteen bits (the "shifted" half), by
using oris etc. The "%eN" output modifier prints an "s" for this:
/* If the low 16 bits are 0, but some other bit is set, write 's'. */
But this doesn't handle non-constant arguments, so you're likely better
off using what you have now.
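
To make the low/shifted distinction concrete, here is a rough sketch in
plain C of which constants fit which form. The enum and the function
name are made up for illustration; this is not what GCC or the patch
does internally, just the rule stated above applied to a 32-bit value.

```c
#include <stdint.h>

/* Hypothetical classification, for illustration only. */
enum imm_form {
	IMM_LOW,	/* only the low 16 bits are used: ori and friends */
	IMM_SHIFTED,	/* low 16 bits are 0, some bit in 16..31 is set: oris and friends */
	IMM_NONE	/* bits in both halves: the constant needs a register */
};

static enum imm_form classify_imm(uint32_t val)
{
	if ((val & 0xffff0000u) == 0)
		return IMM_LOW;
	if ((val & 0x0000ffffu) == 0)
		return IMM_SHIFTED;
	return IMM_NONE;
}
```

So 0x800 would go with ori, 0x08000000 with oris, and something like
0x00018000 fits neither immediate form.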

> For clear bits, on 32 bits 'rlwinm' can be used instead of 'andc' for
> when all bits to be cleared are consecutive.

Or when all you want to keep are consecutive (you do handle that now :-) )

> On 64 bits we don't have any equivalent single operation for clearing,
> single bits or a few bits, we'd need two 'rldicl' so it is not
> worth it, the li/andc sequence is doing the same.

You can use rlwinm whenever you want to clear all top 32 bits.

A sometimes nice idiom is ori x,x,N ; xori x,x,N to clear the bits N
(or oris/xoris). But it's two insns no matter what (but no spare
register is needed).
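
A minimal model of that idiom in portable C, only to show why it works:
the OR forces the bits in N to 1, and the XOR then flips exactly those
bits to 0, which is the same as x & ~N. The function name is
hypothetical, and N is taken as 16 bits because that is what the
immediate field holds.

```c
#include <stdint.h>

/* Models "ori x,x,N ; xori x,x,N" on a 32-bit value. */
static uint32_t clear_bits_ori_xori(uint32_t x, uint16_t n)
{
	x |= n;		/* ori: force the bits in N to 1 */
	x ^= n;		/* xori: flip those same bits to 0 */
	return x;	/* equal to x & ~N */
}
```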

> Signed-off-by: Christophe Leroy <christophe.leroy@xxxxxxxxxx>

> +static inline unsigned long test_and_clear_bits(unsigned long mask, volatile unsigned long *_p)
> +{
> + unsigned long old, t;
> + unsigned long *p = (unsigned long *)_p;
> +
> + if (IS_ENABLED(CONFIG_PPC32) &&
> + __builtin_constant_p(mask) && is_rlwinm_mask_valid(mask)) {

is_rlwinm_mask_valid(~mask)? So that test_and_clear_bits(0, ...) will
work with rlwinm, and test_and_clear_bits(0xffffffff, ...) will not make
gas scream bloody murder ("illegal bitmask"). The mask you pass to the
instruction is ~mask after all.
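
To illustrate the two corner cases, here is a sketch in portable C. The
helper below is a hypothetical stand-in, not the is_rlwinm_mask_valid()
from the patch; it assumes a mask is encodable when it is a single run
of 1 bits (possibly wrapping around) and not encodable when it is 0.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check: true if x is one run of 1 bits, wrapping allowed. */
static bool mask_is_single_run(uint32_t x)
{
	if (!x)
		return false;		/* an empty mask is not encodable */
	if (x & 1)
		x = ~x;			/* make a wrapping run non-wrapping */
	x += x & -x;			/* adding the lowest set bit collapses one run */
	return (x & (x - 1)) == 0;	/* at most one bit may remain */
}

/* The mask given to rlwinm is the set of bits to keep, so test ~mask. */
static bool can_clear_with_rlwinm(uint32_t mask)
{
	return mask_is_single_run(~mask);
}
```

With this, clearing no bits (mask 0) passes the check because ~mask is
all ones, and clearing everything (mask 0xffffffff) is rejected because
~mask is 0. Testing mask itself would get both cases backwards.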

Looks great except that one nit. Thanks :-)

Reviewed-by: Segher Boessenkool <segher@xxxxxxxxxxxxxxxxxxx>


Segher