Re: [PATCH v7 3/3] x86/bugs: Use code segment selector for VERW operand
From: Pawan Gupta
Date: Wed Sep 25 2024 - 21:04:58 EST
On Thu, Sep 26, 2024 at 01:32:19AM +0100, Andrew Cooper wrote:
> On 26/09/2024 1:17 am, Pawan Gupta wrote:
> > On Wed, Sep 25, 2024 at 04:46:23PM -0700, Pawan Gupta wrote:
> >> On Thu, Sep 26, 2024 at 12:29:00AM +0100, Andrew Cooper wrote:
> >>> On 25/09/2024 11:25 pm, Pawan Gupta wrote:
> >>>> Robert Gill reported the #GP below in 32-bit mode when the dosemu software
> >>>> was executing the vm86() system call:
> >>>>
> >>>> general protection fault: 0000 [#1] PREEMPT SMP
> >>>> CPU: 4 PID: 4610 Comm: dosemu.bin Not tainted 6.6.21-gentoo-x86 #1
> >>>> Hardware name: Dell Inc. PowerEdge 1950/0H723K, BIOS 2.7.0 10/30/2010
> >>>> EIP: restore_all_switch_stack+0xbe/0xcf
> >>>> EAX: 00000000 EBX: 00000000 ECX: 00000000 EDX: 00000000
> >>>> ESI: 00000000 EDI: 00000000 EBP: 00000000 ESP: ff8affdc
> >>>> DS: 0000 ES: 0000 FS: 0000 GS: 0033 SS: 0068 EFLAGS: 00010046
> >>>> CR0: 80050033 CR2: 00c2101c CR3: 04b6d000 CR4: 000406d0
> >>>> Call Trace:
> >>>> show_regs+0x70/0x78
> >>>> die_addr+0x29/0x70
> >>>> exc_general_protection+0x13c/0x348
> >>>> exc_bounds+0x98/0x98
> >>>> handle_exception+0x14d/0x14d
> >>>> exc_bounds+0x98/0x98
> >>>> restore_all_switch_stack+0xbe/0xcf
> >>>> exc_bounds+0x98/0x98
> >>>> restore_all_switch_stack+0xbe/0xcf
> >>>>
> >>>> This only happens in 32-bit mode when VERW-based mitigations like MDS/RFDS
> >>>> are enabled. This is because segment registers with an arbitrary user value
> >>>> can result in #GP when executing VERW. Intel SDM vol. 2C documents the
> >>>> following behavior for the VERW instruction:
> >>>>
> >>>> #GP(0) - If a memory operand effective address is outside the CS, DS, ES,
> >>>> FS, or GS segment limit.
> >>>>
> >>>> The CLEAR_CPU_BUFFERS macro executes the VERW instruction before returning
> >>>> to user space. Use the %cs selector to reference the VERW operand. This
> >>>> ensures VERW will not #GP for an arbitrary user %ds.
> >>>>
> >>>> Fixes: a0e2dab44d22 ("x86/entry_32: Add VERW just before userspace transition")
> >>>> Cc: stable@xxxxxxxxxxxxxxx # 5.10+
> >>>> Reported-by: Robert Gill <rtgill82@xxxxxxxxx>
> >>>> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218707
> >>>> Closes: https://lore.kernel.org/all/8c77ccfd-d561-45a1-8ed5-6b75212c7a58@xxxxxxxxxxxxx/
> >>>> Suggested-by: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
> >>>> Suggested-by: Brian Gerst <brgerst@xxxxxxxxx>
> >>>> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@xxxxxxxxxxxxxxx>
> >>>> ---
> >>>> arch/x86/include/asm/nospec-branch.h | 6 ++++--
> >>>> 1 file changed, 4 insertions(+), 2 deletions(-)
> >>>>
> >>>> diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
> >>>> index ff5f1ecc7d1e..e18a6aaf414c 100644
> >>>> --- a/arch/x86/include/asm/nospec-branch.h
> >>>> +++ b/arch/x86/include/asm/nospec-branch.h
> >>>> @@ -318,12 +318,14 @@
> >>>> /*
> >>>> * Macro to execute VERW instruction that mitigate transient data sampling
> >>>> * attacks such as MDS. On affected systems a microcode update overloaded VERW
> >>>> - * instruction to also clear the CPU buffers. VERW clobbers CFLAGS.ZF.
> >>>> + * instruction to also clear the CPU buffers. VERW clobbers CFLAGS.ZF. Using %cs
> >>>> + * to reference VERW operand avoids a #GP fault for an arbitrary user %ds in
> >>>> + * 32-bit mode.
> >>>> *
> >>>> * Note: Only the memory operand variant of VERW clears the CPU buffers.
> >>>> */
> >>>> .macro CLEAR_CPU_BUFFERS
> >>>> - ALTERNATIVE "", __stringify(verw _ASM_RIP(mds_verw_sel)), X86_FEATURE_CLEAR_CPU_BUF
> >>>> + ALTERNATIVE "", __stringify(verw %cs:_ASM_RIP(mds_verw_sel)), X86_FEATURE_CLEAR_CPU_BUF
> >>>> .endm
> >>> People ought rightly to double-take at this using %cs and not %ss.
> >>> There is a good reason, but it needs describing explicitly. May I
> >>> suggest the following:
> >>>
> >>> *...
> >>> * In 32bit mode, the memory operand must be a %cs reference. The data
> >>> segments may not be usable (vm86 mode), and the stack segment may not be
> >>> flat (espfix32).
> >>> *...
> >> Thanks for the suggestion. I will include this.
> >>
> >>> .macro CLEAR_CPU_BUFFERS
> >>> #ifdef __x86_64__
> >>> ALTERNATIVE "", "verw mds_verw_sel(%rip)", X86_FEATURE_CLEAR_CPU_BUF
> >>> #else
> >>> ALTERNATIVE "", "verw %cs:mds_verw_sel", X86_FEATURE_CLEAR_CPU_BUF
> >>> #endif
> >>> .endm
> >>>
> >>> This also lets you drop _ASM_RIP(). It's a cute idea, but is more
> >>> confusion than it's worth, because there's no such thing in 32bit mode.
> >>>
> >>> "%cs:_ASM_RIP(mds_verw_sel)" reads as if it does nothing, because it
> >>> really doesn't in 64bit mode.
> >> Right, will drop _ASM_RIP() in 32-bit mode and %cs in 64-bit mode.
> > It's probably too soon for the next version, so pasting the patch here:
> >
> > ---8<---
> > diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
> > index e18a6aaf414c..4228a1fd2c2e 100644
> > --- a/arch/x86/include/asm/nospec-branch.h
> > +++ b/arch/x86/include/asm/nospec-branch.h
> > @@ -318,14 +318,21 @@
> > /*
> > * Macro to execute VERW instruction that mitigate transient data sampling
> > * attacks such as MDS. On affected systems a microcode update overloaded VERW
> > - * instruction to also clear the CPU buffers. VERW clobbers CFLAGS.ZF. Using %cs
> > - * to reference VERW operand avoids a #GP fault for an arbitrary user %ds in
> > - * 32-bit mode.
> > + * instruction to also clear the CPU buffers. VERW clobbers CFLAGS.ZF.
> > *
> > * Note: Only the memory operand variant of VERW clears the CPU buffers.
> > */
> > .macro CLEAR_CPU_BUFFERS
> > - ALTERNATIVE "", __stringify(verw %cs:_ASM_RIP(mds_verw_sel)), X86_FEATURE_CLEAR_CPU_BUF
> > +#ifdef CONFIG_X86_64
> > + ALTERNATIVE "", __stringify(verw _ASM_RIP(mds_verw_sel)), X86_FEATURE_CLEAR_CPU_BUF
> > +#else
> > + /*
> > + * In 32bit mode, the memory operand must be a %cs reference. The data
> > + * segments may not be usable (vm86 mode), and the stack segment may not
> > + * be flat (ESPFIX32).
> > + */
>
> I was intending for this to replace the "Using %cs" sentence, as a new
> paragraph in that main comment block.
The reason I added the comment to the 32-bit leg is that most readers will
not care about 32-bit mode, so for the majority the comment would mostly be
a distraction. People who do care about 32-bit mode will read the comment in
the 32-bit leg. I can move the comment to the main block if you still want.
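
For reference, a rough and untested sketch of how the macro could look with
the comment moved into the main block. The paragraph wording and the 32-bit
operand form are taken from your suggestion above, and the 64-bit line is the
one from the patch I pasted; treat the exact layout as an assumption, not as
the next version of the patch.

---8<---
/*
 * Macro to execute VERW instruction that mitigate transient data sampling
 * attacks such as MDS. On affected systems a microcode update overloaded VERW
 * instruction to also clear the CPU buffers. VERW clobbers CFLAGS.ZF.
 *
 * In 32bit mode, the memory operand must be a %cs reference. The data
 * segments may not be usable (vm86 mode), and the stack segment may not
 * be flat (ESPFIX32).
 *
 * Note: Only the memory operand variant of VERW clears the CPU buffers.
 */
.macro CLEAR_CPU_BUFFERS
#ifdef CONFIG_X86_64
	/* 64-bit: RIP-relative operand, no segment override needed */
	ALTERNATIVE "", __stringify(verw _ASM_RIP(mds_verw_sel)), X86_FEATURE_CLEAR_CPU_BUF
#else
	/* 32-bit: absolute operand, explicitly referenced through %cs */
	ALTERNATIVE "", "verw %cs:mds_verw_sel", X86_FEATURE_CLEAR_CPU_BUF
#endif
.endm
---8<---

The only difference from the version I pasted is where the 32-bit paragraph
lives; the emitted instructions would be the same either way.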