On 2023-09-20 09:39, Mark-PK Tsai wrote:
Add vudot opcode to the VFP undef hook to fix the
potentially undefined instruction error when the
user space executes vudot instruction.
Did the kernel expose a hwcap to say that the dot product extension is
supported? I'm pretty sure it didn't, so why would userspace expect this
to work? ;)
The hwcap for dotprod has been exported since commit:
62ea0d873af3 ARM: 9269/1: vfp: Add hwcap for FEAT_DotProd
IIRC Amit was looking at defining the hwcaps to align with arm64 compat,
but I believe that series faltered since most of them weren't actually
needed (and I think at that point it was still missing the VFP support
code parts). It would be nice if someone could pick up and combine both
Were the mentioned series related to this commit?
62ea0d873af3 ARM: 9269/1: vfp: Add hwcap for FEAT_DotProd
efforts and get this done properly; fill in *all* the hwcaps and
relevant handling for extensions which Cortex-A55 supports (since
there's definitely more than just VUDOT), and then hopefully we're done
for good.
Agree.
Before this commit, kernel didn't handle the undef exception
caused by vudot and didn't enable VFP in lazy VFP context
switch code like other NEON instructions.
This led to the occurrence of the undefined instruction
error as following:
[ 250.741238 ] 0904 (26902): undefined instruction: pc=004014ec
...
[ 250.741287 ] PC is at 0x4014ec
[ 250.741298 ] LR is at 0xb677874f
[ 250.741303 ] pc : [<004014ec>] lr : [<b677874f>] psr: 80070010
[ 250.741309 ] sp : beffedb0 ip : b67d7864 fp : beffee58
[ 250.741314 ] r10: 00000000 r9 : 00000000 r8 : 00000000
[ 250.741319 ] r7 : 00000001 r6 : 00000001 r5 : beffee90 r4 : 00401470
[ 250.741324 ] r3 : beffee20 r2 : beffee30 r1 : beffee40 r0 : 004003a8
[ 250.741331 ] Flags: Nzcv IRQs on FIQs on Mode USER_32 ISA ARM Segment user
[ 250.741339 ] Control: 10c5383d Table: 32d0406a DAC: 00000055
[ 250.741348 ] Code: f4434aef f4610aef f4622aef f4634aef (fc620df4)
Below is the assembly of the user program:
0x4014dc <+108>: vst1.64 {d20, d21}, [r3:128]
0x4014e0 <+112>: vld1.64 {d16, d17}, [r1:128]
0x4014e4 <+116>: vld1.64 {d18, d19}, [r2:128]
0x4014e8 <+120>: vld1.64 {d20, d21}, [r3:128] --> switch out
0x4014ec <+124>: vudot.u8 q8, q9, q10 <-- switch in, and FPEXC.EN = 0
SIGILL(illegal instruction)
Link: https://services.arm.com/support/s/case/5004L00000XsOjP
Linking to your private support case is not useful to upstream. Even I
can't open that link.
I thought that maybe someone in arm need this.
But it seems a bit noisy so I will remove the link from v2.
Signed-off-by: Mark-PK Tsai <mark-pk.tsai@xxxxxxxxxxxx>
---
arch/arm/vfp/vfpmodule.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/arch/arm/vfp/vfpmodule.c b/arch/arm/vfp/vfpmodule.c
index 7e8773a2d99d..7eab8d1019d2 100644
--- a/arch/arm/vfp/vfpmodule.c
+++ b/arch/arm/vfp/vfpmodule.c
@@ -788,6 +788,12 @@ static struct undef_hook neon_support_hook[] = {{
.cpsr_mask = PSR_T_BIT,
.cpsr_val = 0,
.fn = vfp_support_entry,
+}, {
+ .instr_mask = 0xffb00000,
+ .instr_val = 0xfc200000,
+ .cpsr_mask = PSR_T_BIT,
+ .cpsr_val = 0,
+ .fn = vfp_support_entry,
}, {
.instr_mask = 0xef000000,
.instr_val = 0xef000000,
@@ -800,6 +806,12 @@ static struct undef_hook neon_support_hook[] = {{
.cpsr_mask = PSR_T_BIT,
.cpsr_val = PSR_T_BIT,
.fn = vfp_support_entry,
+}, {
+ .instr_mask = 0xffb00000,
+ .instr_val = 0xfc200000,
+ .cpsr_mask = PSR_T_BIT,
+ .cpsr_val = PSR_T_BIT,
+ .fn = vfp_support_entry,
Why have two entries conditional on each possible value of one bit for
otherwise identical encodings? Surely it suffices to set both cpsr_mask
and cpsr_val to 0?
You're right.
I will set both cpsr_mask and cpsr_val to 0 and use single entry,
as you suggested, in the v2 patch.
Thanks.
Thanks,
Robin.
}};
static struct undef_hook vfp_support_hook = {