Re: [PATCH] arch/riscv: Add bitrev.h file to support rev8 and brev8
From: cp0613
Date: Mon Apr 13 2026 - 08:32:04 EST
On Sat, 11 Apr 2026 10:48:17 +0800, ruanjinjie@xxxxxxxxxx wrote:
> >> +#include <linux/types.h>
> >> +#include <asm/cpufeature-macros.h>
> >> +#include <asm/hwcap.h>
> >> +
> >> +extern u8 const byte_rev_table[256];
> >> +static __always_inline __attribute_const__ u32 __arch_bitrev32(u32 x)
> >> +{
> >> + if (IS_ENABLED(CONFIG_RISCV_ISA_ZBKB) &&
> >> + riscv_has_extension_likely(RISCV_ISA_EXT_ZBKB)) {
> >> + unsigned long result = x;
> >> +
> >> + asm volatile(
> >> + ".option push\n"
> >> + ".option arch,+zbkb\n"
> >> + "rev8 %0, %0\n"
> >> + "brev8 %0, %0\n"
> >> + ".option pop"
> >> + : "+r" (result)
> >> + );
> >> +
> >> + if (__riscv_xlen == 64)
> >> + return (u32)(result >> 32);
> >> +
> >> + return (u32)result;
> >> + }
> >> +
> >> + return (u32)byte_rev_table[x & 0xff] << 24 |
> >> + (u32)byte_rev_table[(x >> 8) & 0xff] << 16 |
> >> + (u32)byte_rev_table[(x >> 16) & 0xff] << 8 |
> >> + (u32)byte_rev_table[x >> 24];
> >> +}
> >
> > Hi Jinjie,
> >
> > Thanks for your patch. I have two suggestions.
> > 1. When ZBKB is not supported, would it be simpler to use the generic
> > implementation __bitrev32 in <linux/bitrev.h> directly?
>
> Actually, you can't simply use the default implementation from
> linux/bitrev.h. It includes asm/bitrev.h (the architecture-specific
> implementation), which would lead to compilation issues. Furthermore,
> when ZBKB is not supported, the current implementation is identical to
> the default one.
Understood. So, have you considered renaming the function to `generic_xxx`,
like `generic___ffs` in bitops?
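
To illustrate the naming I have in mind, here is a rough userspace sketch,
not a tested kernel patch. The names generic_bitrev8, generic_bitrev32 and
arch_bitrev32 are hypothetical, and the byte reversal is done with shifts
and masks instead of byte_rev_table only to keep the example self-contained:

```c
#include <stdint.h>

/* Hypothetical portable fallback: reverse the bits of one byte. */
static inline uint8_t generic_bitrev8(uint8_t b)
{
	b = (uint8_t)(((b & 0xf0u) >> 4) | ((b & 0x0fu) << 4)); /* swap nibbles */
	b = (uint8_t)(((b & 0xccu) >> 2) | ((b & 0x33u) << 2)); /* swap bit pairs */
	b = (uint8_t)(((b & 0xaau) >> 1) | ((b & 0x55u) << 1)); /* swap adjacent bits */
	return b;
}

/* Hypothetical generic_ variant, same byte layout as the fallback in the patch. */
static inline uint32_t generic_bitrev32(uint32_t x)
{
	return (uint32_t)generic_bitrev8(x & 0xff) << 24 |
	       (uint32_t)generic_bitrev8((x >> 8) & 0xff) << 16 |
	       (uint32_t)generic_bitrev8((x >> 16) & 0xff) << 8 |
	       (uint32_t)generic_bitrev8(x >> 24);
}

/*
 * Hypothetical arch entry point. A real version would take the
 * rev8 + brev8 path when Zbkb is available; this sketch always
 * falls back to the generic_ helper.
 */
static inline uint32_t arch_bitrev32(uint32_t x)
{
	return generic_bitrev32(x);
}
```

With this split, the arch header only names the fast path, and the fallback
has a name that makes clear it is the portable version.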
> > 2. Could you please provide a benchmark test case to illustrate the
> > performance comparison with and without this extension (refer to
> > test_bitops.c), and also provide bloat-o-meter results?
>
> I don't have access to RISC-V hardware at the moment, so I've only
> performed basic functional testing on QEMU, which completed without
> issues. Could you please help run some benchmarks to verify the performance?
I don't currently have hardware that supports this extension either, but I
can test it in an FPGA environment when I get the opportunity (it will take
some time).
> Thanks,
> Jinjie
>
> >
> > Thanks,
> > Pei
Thanks,
Pei