Re: [PATCH] ARM: vfp: Add vudot opcode to VFP undef hook

From: Xuewen Yan
Date: Thu Sep 21 2023 - 15:28:01 EST


Hi Mark-PK, Robin

We also meets the scene, and has the following stack:

Thread-2 (5361): undefined instruction: pc=d05ae08c
CPU: 5 PID: 5361 Comm: Thread-2 Tainted: G W O
5.4.210-android12-9-04458-g56c7c43d3298-ab000045 #98
Hardware name: Generic DT based system
PC is at 0x7d1aa068
LR is at 0x7c22ae50
pc : [<7d1aa068>] lr : [<7c22ae50>] psr: 800b0010
sp : 7c78ee20 ip : 7c22ae40 fp : 7c78eea0
r10: 7c22ae60 r9 : 7c22ae70 r8 : 00000008
r7 : 7c0fee80 r6 : 7c0fee70 r5 : 7c0fee60 r4 : 00000010
r3 : 7c0fee50 r2 : 00000000 r1 : 00000010 r0 : a698aee0
Flags: Nzcv IRQs on FIQs on Mode USER_32 ISA ARM Segment user
Control: 10c5383d Table: 8649c06a DAC: 00000055
Code: edddcb1e fe22edd4 f4600a0d fc67cd98 (fe20edf4) <<<

So, we also need add the 0xfe000000:

Could you please help add the following patch into the patch-v2?

Thanks!

---
diff --git a/arch/arm/vfp/vfpmodule.c b/arch/arm/vfp/vfpmodule.c
index 7e8773a2d99d..1078c0f169d2 100644
--- a/arch/arm/vfp/vfpmodule.c
+++ b/arch/arm/vfp/vfpmodule.c
@@ -788,6 +788,18 @@ static struct undef_hook neon_support_hook[] = {{
.cpsr_mask = PSR_T_BIT,
.cpsr_val = 0,
.fn = vfp_support_entry,
+}, {
+ .instr_mask = 0xfc000000,
+ .instr_val = 0xfc000000,
+ .cpsr_mask = PSR_T_BIT,
+ .cpsr_val = 0,
+ .fn = vfp_support_entry,
+}, {
+ .instr_mask = 0xfe000000,
+ .instr_val = 0xfe000000,
+ .cpsr_mask = PSR_T_BIT,
+ .cpsr_val = 0,
+ .fn = vfp_support_entry,
}, {
.instr_mask = 0xef000000,
.instr_val = 0xef000000,
@@ -800,6 +812,18 @@ static struct undef_hook neon_support_hook[] = {{
.cpsr_mask = PSR_T_BIT,
.cpsr_val = PSR_T_BIT,
.fn = vfp_support_entry,
+}, {
+ .instr_mask = 0xfc000000,
+ .instr_val = 0xfc000000,
+ .cpsr_mask = PSR_T_BIT,
+ .cpsr_val = PSR_T_BIT,
+ .fn = vfp_support_entry,
+}, {
+ .instr_mask = 0xfe000000,
+ .instr_val = 0xfe000000,
+ .cpsr_mask = PSR_T_BIT,
+ .cpsr_val = PSR_T_BIT,
+ .fn = vfp_support_entry,
}};

static struct undef_hook vfp_support_hook = {

On Thu, Sep 21, 2023 at 12:40 PM Mark-PK Tsai <mark-pk.tsai@xxxxxxxxxxxx> wrote:
>
> > On 2023-09-20 09:39, Mark-PK Tsai wrote:
> > > Add vudot opcode to the VFP undef hook to fix the
> > > potentially undefined instruction error when the
> > > user space executes vudot instruction.
> >
> > Did the kernel expose a hwcap to say that the dot product extension is
> > supported? I'm pretty sure it didn't, so why would userspace expect this
> > to work? ;)
>
> The hwcap for dotprod has been exported since commit:
>
> 62ea0d873af3 ARM: 9269/1: vfp: Add hwcap for FEAT_DotProd
>
> >
> > IIRC Amit was looking at defining the hwcaps to align with arm64 compat,
> > but I believe that series faltered since most of them weren't actually
> > needed (and I think at that point it was still missing the VFP support
> > code parts). It would be nice if someone could pick up and combine both
>
> Were the mentioned series related to this commit?
>
> 62ea0d873af3 ARM: 9269/1: vfp: Add hwcap for FEAT_DotProd
>
> > efforts and get this done properly; fill in *all* the hwcaps and
> > relevant handling for extensions which Cortex-A55 supports (since
> > there's definitely more than just VUDOT), and then hopefully we're done
> > for good.
>
> Agree.
>
> >
> > > Before this commit, kernel didn't handle the undef exception
> > > caused by vudot and didn't enable VFP in lazy VFP context
> > > switch code like other NEON instructions.
> > > This led to the occurrence of the undefined instruction
> > > error as following:
> > >
> > > [ 250.741238 ] 0904 (26902): undefined instruction: pc=004014ec
> > > ...
> > > [ 250.741287 ] PC is at 0x4014ec
> > > [ 250.741298 ] LR is at 0xb677874f
> > > [ 250.741303 ] pc : [<004014ec>] lr : [<b677874f>] psr: 80070010
> > > [ 250.741309 ] sp : beffedb0 ip : b67d7864 fp : beffee58
> > > [ 250.741314 ] r10: 00000000 r9 : 00000000 r8 : 00000000
> > > [ 250.741319 ] r7 : 00000001 r6 : 00000001 r5 : beffee90 r4 : 00401470
> > > [ 250.741324 ] r3 : beffee20 r2 : beffee30 r1 : beffee40 r0 : 004003a8
> > > [ 250.741331 ] Flags: Nzcv IRQs on FIQs on Mode USER_32 ISA ARM Segment user
> > > [ 250.741339 ] Control: 10c5383d Table: 32d0406a DAC: 00000055
> > > [ 250.741348 ] Code: f4434aef f4610aef f4622aef f4634aef (fc620df4)
> > >
> > > Below is the assembly of the user program:
> > >
> > > 0x4014dc <+108>: vst1.64 {d20, d21}, [r3:128]
> > > 0x4014e0 <+112>: vld1.64 {d16, d17}, [r1:128]
> > > 0x4014e4 <+116>: vld1.64 {d18, d19}, [r2:128]
> > > 0x4014e8 <+120>: vld1.64 {d20, d21}, [r3:128] --> switch out
> > > 0x4014ec <+124>: vudot.u8 q8, q9, q10 <-- switch in, and FPEXC.EN = 0
> > > SIGILL(illegal instruction)
> > >
> > > Link: https://services.arm.com/support/s/case/5004L00000XsOjP
> >
> > Linking to your private support case is not useful to upstream. Even I
> > can't open that link.
>
> I thought that maybe someone in arm need this.
> But it seems a bit noisy so I will remove the link from v2.
>
> >
> > > Signed-off-by: Mark-PK Tsai <mark-pk.tsai@xxxxxxxxxxxx>
> > > ---
> > > arch/arm/vfp/vfpmodule.c | 12 ++++++++++++
> > > 1 file changed, 12 insertions(+)
> > >
> > > diff --git a/arch/arm/vfp/vfpmodule.c b/arch/arm/vfp/vfpmodule.c
> > > index 7e8773a2d99d..7eab8d1019d2 100644
> > > --- a/arch/arm/vfp/vfpmodule.c
> > > +++ b/arch/arm/vfp/vfpmodule.c
> > > @@ -788,6 +788,12 @@ static struct undef_hook neon_support_hook[] = {{
> > > .cpsr_mask = PSR_T_BIT,
> > > .cpsr_val = 0,
> > > .fn = vfp_support_entry,
> > > +}, {
> > > + .instr_mask = 0xffb00000,
> > > + .instr_val = 0xfc200000,
> > > + .cpsr_mask = PSR_T_BIT,
> > > + .cpsr_val = 0,
> > > + .fn = vfp_support_entry,
> > > }, {
> > > .instr_mask = 0xef000000,
> > > .instr_val = 0xef000000,
> > > @@ -800,6 +806,12 @@ static struct undef_hook neon_support_hook[] = {{
> > > .cpsr_mask = PSR_T_BIT,
> > > .cpsr_val = PSR_T_BIT,
> > > .fn = vfp_support_entry,
> > > +}, {
> > > + .instr_mask = 0xffb00000,
> > > + .instr_val = 0xfc200000,
> > > + .cpsr_mask = PSR_T_BIT,
> > > + .cpsr_val = PSR_T_BIT,
> > > + .fn = vfp_support_entry,
> >
> > Why have two entries conditional on each possible value of one bit for
> > otherwise identical encodings? Surely it suffices to set both cpsr_mask
> > and cpsr_val to 0?
>
> You're right.
> I will set both cpsr_mask and cpsr_val to 0 and use single entry,
> as you suggested, in the v2 patch.
>
> Thanks.
>
> >
> > Thanks,
> > Robin.
> >
> > > }};
> > >
> > > static struct undef_hook vfp_support_hook = {