Re: [PATCH] ARM: use assembly mnemonics for VFP register access

From: Ard Biesheuvel
Date: Tue Feb 25 2020 - 14:34:04 EST


On Tue, 25 Feb 2020 at 20:10, Nick Desaulniers <ndesaulniers@xxxxxxxxxx> wrote:
>
> On Mon, Feb 24, 2020 at 9:22 PM Stefan Agner <stefan@xxxxxxxx> wrote:
> >
> > Clang's integrated assembler does not allow the mcr instruction to be
> > used to access floating point co-processor registers:
> > arch/arm/vfp/vfpmodule.c:342:2: error: invalid operand for instruction
> > fmxr(FPEXC, fpexc & ~(FPEXC_EX|FPEXC_DEX|FPEXC_FP2V|FPEXC_VV|FPEXC_TRAP_MASK));
> > ^
> > arch/arm/vfp/vfpinstr.h:79:6: note: expanded from macro 'fmxr'
> > asm("mcr p10, 7, %0, " vfpreg(_vfp_) ", cr0, 0 @ fmxr " #_vfp_ ", %0" \
> > ^
> > <inline asm>:1:6: note: instantiated into assembly here
> > mcr p10, 7, r0, cr8, cr0, 0 @ fmxr FPEXC, r0
> > ^
> >
> > The GNU assembler has supported the .fpu directive since at least 2.17
> > (when it was documented). Since Linux requires binutils 2.21, it is
> > safe to use the .fpu directive. Use the .fpu directive and mnemonics
> > for VFP register access.
> >
> > This allows vfpmodule.c to be built with Clang and its integrated assembler.
> >
> > Link: https://github.com/ClangBuiltLinux/linux/issues/905
> > Signed-off-by: Stefan Agner <stefan@xxxxxxxx>
> > ---
> > arch/arm/vfp/vfpinstr.h | 12 ++++--------
> > 1 file changed, 4 insertions(+), 8 deletions(-)
> >
> > diff --git a/arch/arm/vfp/vfpinstr.h b/arch/arm/vfp/vfpinstr.h
> > index 38dc154e39ff..799ccf065406 100644
> > --- a/arch/arm/vfp/vfpinstr.h
> > +++ b/arch/arm/vfp/vfpinstr.h
> > @@ -62,21 +62,17 @@
> > #define FPSCR_C (1 << 29)
> > #define FPSCR_V (1 << 28)
> >
> > -/*
> > - * Since we aren't building with -mfpu=vfp, we need to code
> > - * these instructions using their MRC/MCR equivalents.
> > - */
> > -#define vfpreg(_vfp_) #_vfp_
> > -
> > #define fmrx(_vfp_) ({ \
> > u32 __v; \
> > - asm("mrc p10, 7, %0, " vfpreg(_vfp_) ", cr0, 0 @ fmrx %0, " #_vfp_ \
> > + asm(".fpu vfpv2\n" \
> > + "vmrs %0, " #_vfp_ \
> > : "=r" (__v) : : "cc"); \
> > __v; \
> > })
> >
> > #define fmxr(_vfp_,_var_) \
> > - asm("mcr p10, 7, %0, " vfpreg(_vfp_) ", cr0, 0 @ fmxr " #_vfp_ ", %0" \
> > + asm(".fpu vfpv2\n" \
> > + "vmsr " #_vfp_ ", %0" \
> > : : "r" (_var_) : "cc")
> >
> > u32 vfp_single_cpdo(u32 inst, u32 fpscr);
> > --
>
> Hi Stefan,
> Thanks for the patch. Reading through:
> - FMRX, FMXR, and FMSTAT:
> http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0068b/Bcfbdihi.html
> - VMRS and VMSR:
> http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0204h/Bcfbdihi.html
>
> Should a macro called `fmrx` that had a comment about `fmrx` be using
> `vmrs` in place of `fmrx`?
>
> It looks like Clang treats them the same, but GCC keeps them separate:
> https://godbolt.org/z/YKmSAs
> Ah, this is only when streaming to assembly. Looks like they have the
> same encoding, and produce the same disassembly. (Godbolt emits
> assembly by default, and has the option to compile, then disassemble).
> If I take my case from godbolt above:
>
> ➜ /tmp arm-linux-gnueabihf-gcc -O2 -c x.c
> ➜ /tmp llvm-objdump -dr x.o
>
> x.o: file format elf32-arm-little
>
>
> Disassembly of section .text:
>
> 00000000 bar:
> 0: f1 ee 10 0a vmrs r0, fpscr
> 4: 70 47 bx lr
> 6: 00 bf nop
>
> 00000008 baz:
> 8: f1 ee 10 0a vmrs r0, fpscr
> c: 70 47 bx lr
> e: 00 bf nop
>
> So indeed a similar encoding exists for the two different assembler
> instructions.

Does that hold for ARM (A32) instructions as well?
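
One way to reason about this without an A32 build at hand is to assemble
the MRC/MCR bit pattern by hand and compare it with the bytes in the
disassembly above. The snippet below is only a sketch: the helper names
are made up for illustration, FPEXC = cr8 is taken from the error message
quoted above, and FPSCR = cr1 is an assumption based on the kernel's VFP
register definitions. It encodes "mrc/mcr p10, 7, rT, crN, cr0, 0" with
the AL condition, which is the form I would expect an A32 assembler to
emit for an unconditional vmrs/vmsr. That expectation has not been
checked against an actual A32 object file here.

```c
#include <assert.h>
#include <stdint.h>

/* FPEXC = cr8 comes from the quoted error message; FPSCR = cr1 is an
 * assumption taken from the kernel's VFP register definitions. */
enum { VFP_FPSCR = 1, VFP_FPEXC = 8 };

/*
 * Hypothetical helper: A32 encoding of "mrc/mcr p10, 7, rT, crN, cr0, 0"
 * with cond = AL. A non-zero is_read selects MRC (coprocessor to core
 * register), zero selects MCR.
 */
static inline uint32_t vfp_sysreg_a32(int is_read, unsigned rt, unsigned crn)
{
	assert(rt < 16 && crn < 16);
	return (0xeu << 28)                     /* cond = AL */
	     | (0xeu << 24)                     /* MRC/MCR opcode class */
	     | (7u << 21)                       /* opc1 = 7 */
	     | ((is_read ? 1u : 0u) << 20)      /* 1 = MRC, 0 = MCR */
	     | (crn << 16)                      /* VFP system register */
	     | (rt << 12)                       /* core register */
	     | (10u << 8)                       /* coprocessor p10 */
	     | (1u << 4);                       /* opc2 = 0, bit 4 set, CRm = cr0 */
}

/* Thumb-2 carries the same 32 bits as two halfwords, high half first. */
static inline uint16_t t32_first(uint32_t insn)  { return (uint16_t)(insn >> 16); }
static inline uint16_t t32_second(uint32_t insn) { return (uint16_t)(insn & 0xffff); }
```

For the FPSCR read into r0, vfp_sysreg_a32(1, 0, VFP_FPSCR) gives
0xeef10a10. Split into halfwords that is 0xeef1 and 0x0a10, which stored
little-endian are the "f1 ee 10 0a" bytes shown for vmrs r0, fpscr in the
objdump output above. The same helper gives 0xeee80a10 for the
"mcr p10, 7, r0, cr8, cr0, 0" from the error message.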