Re: [PATCH] MIPS: UAPI: Fix unrecognized opcode WSBH/DSBH/DSHD when using MIPS16.

From: Yousong Zhou
Date: Fri Sep 04 2015 - 23:30:49 EST


On 5 September 2015 at 02:52, Maciej W. Rozycki <macro@xxxxxxxxxxxxxx> wrote:
> On Thu, 3 Sep 2015, Yousong Zhou wrote:
>
>> The nomips16 has to be added both as function attribute and assembler
>> directive.
>>
>> When only function attribute is specified, the compiler will inline the
>> function with -Os optimization. The generated assembly code cannot be
>> correctly assembled because ISA mode switch has to be done through jump
>> instruction.
>
> This can't be true. The compiler does not intepret the contents of an
> inline asm and therefore cannot decide whether to inline a function
> containing one or not based on the lone presence or the absence of an
> assembly directive within.
>

Most of the time I trust my compiler and never meddle with the
toolchain. Anyway I made a patch because it really did not work for
me. No big deal. It's not the end of world. It started with a
comment from OpenWrt packages feeds [1]. Actually this "unrecognized
opcode" problem have occurred within OpenWrt quite a few times before.

[1] https://github.com/openwrt/packages/commit/1e29676a8ac74f797f8ca799364681cec575ae6f#commitcomment-12901931

>> When only ".set nomips16" directive is used, the generated assembly code
>> will use MIPS32 code for the inline assembly template and MIPS16 for the
>> function return. The compiled binary is invalid:
>>
>> 00403100 <__arch_swab16>:
>> 403100: 7c0410a0 wsbh v0,a0
>> 403104: e820ea31 swc2 $0,-5583(at)
>>
>> while correct code should be:
>>
>> 00402650 <__arch_swab16>:
>> 402650: 7c0410a0 wsbh v0,a0
>> 402654: 03e00008 jr ra
>> 402658: 3042ffff andi v0,v0,0xffff
>
> It looks to me you're papering something over here -- when you use a
> `.set nomips16' directive the assembler will happily switch your
> instruction set in the middle of an instruction stream. Consequently if
> this function is (for whatever reason; it should not really) inlined in
> MIPS16 code, then you'll get a MIPS32 instruction in the middle, which
> will obviously be interpreted differently in the MIPS16 execution mode and
> is therefore surely a recipe for disaster.

If by "papering" you mean "made up", then whatever. Yeah, it's
disaster, an "invalid instruction" abort.

>
> How did you test your change and made the conclusion quoted with your
> patch description?
>

Compile the following program with a MIPS 24kc big endian variant compiler with
flag "-mips32r2 -mips16 -Os".

#include <stdio.h>
#include <stdint.h>

uint16_t __attribute__((noinline)) f(uint16_t v)
{
v = __cpu_to_le16(v);
return v;
}

int main()
{
printf("%x\n", f(0xbeef));

return 0;
}

When only ".set nomips16" was specified in __arch_swab16(), this was output
from objdump.

242 004003e0 <f>:
243 4003e0: 7c0410a0 wsbh v0,a0
244 4003e4: e820ea31 swc2 $0,-5583(at)
245 4003e8: 65006500 0x65006500
246 4003ec: 65006500 0x65006500

__arch_swab16() was indeed inlined. That swc2 instruction can be got from
assembler with the following code (it's from the "-S" result of GCC).

.set mips16
.set noreorder
.set nomacro
j $31
zeh $2

When only nomips16 function attribute was specified, this time GCC failed with
unrecognized opcode

/tmp/ccaGCouL.s: Assembler messages:
/tmp/ccaGCouL.s:21: Error: unrecognized opcode `wsbh $2,$4'

The generated assembly was something in the following form. Looks like the
assembler did not automatically switch to MIPS32 mode when ".set arch=mip32r2"

.set mips16

.ent f
.type f, @function
f
...
.set push
.set arch=mips32r2
wsbh $2,$4
.pop
j $31
zeh $2
.end f
...

The patch was run tested on QEMU Malta and an router with Atheros
AR9331 SoC. I didn't test __arch_swab64() though. I have done many
other trial-and-error tests while preparing this patch. It was a mess
when I was sure I should expect some sensible behaviour from the
compiler while it actually just did not behave that way.

>> Signed-off-by: Yousong Zhou <yszhou4tech@xxxxxxxxx>
>> ---
>> arch/mips/include/uapi/asm/swab.h | 12 +++++++++---
>> 1 file changed, 9 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/mips/include/uapi/asm/swab.h b/arch/mips/include/uapi/asm/swab.h
>> index 8f2d184..c4ddc4f 100644
>> --- a/arch/mips/include/uapi/asm/swab.h
>> +++ b/arch/mips/include/uapi/asm/swab.h
>> @@ -16,11 +16,13 @@
>> #if (defined(__mips_isa_rev) && (__mips_isa_rev >= 2)) || \
>> defined(_MIPS_ARCH_LOONGSON3A)
>>
>> -static inline __attribute_const__ __u16 __arch_swab16(__u16 x)
>> +static inline __attribute__((nomips16)) __attribute_const__
>> + __u16 __arch_swab16(__u16 x)
>> {
>> __asm__(
>> " .set push \n"
>> " .set arch=mips32r2 \n"
>> + " .set nomips16 \n"
>> " wsbh %0, %1 \n"
>> " .set pop \n"
>> : "=r" (x)
>
> So setting aside the correctness issues discussed above, for MIPS16 code
> this has to be put out of line by the compiler, with all the usual
> overhead of a function call, causing a performance loss rather than a gain
> expected here. Especially as switching the ISA mode requires draining the
> whole pipeline so that subsequent instructions are interpreted in the new
> execution mode. This is expensive in performance terms.
>
> I'm fairly sure simply disabling these asms (#ifndef __mips16) and
> letting the compiler generate the right mask and shift operations from
> plain C code will be cheaper in performance terms and possibly cheaper in
> code size as well, especially in otherwise leaf functions where an extra
> function call will according to the ABI clobber temporaries. So I suggest
> going in that direction instead.

I agree. Then you will provide the fix right? I am just curious
where that __mips16 should be placed or is it from compiler and
assembler?

>
> So this is a NAK really.

okay.

Cheers

yousong
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/