Re: [tip: x86/cleanups] x86/segment: Use MOVL when reading segment registers

From: Uros Bizjak

Date: Thu Mar 12 2026 - 15:39:58 EST


On Thu, Mar 12, 2026 at 7:55 PM H. Peter Anvin <hpa@xxxxxxxxx> wrote:
>
> On 2026-03-12 02:30, Uros Bizjak wrote:
> >
> > We would like to always use MOVL to avoid 0x66 operand size override
> > prefix when reading to a register, but MOVL does not support memory
> > operands. MOVW is required in this case.
> >
>
> Just use "mov" without any suffix.
>
> GAS LISTING movseg.s page 1
>
>
> 1 .text
> 2
> 3 0000 8CD8 mov %ds,%eax
> 4 0002 668CD8 mov %ds,%ax
> 5 0005 8C1B mov %ds,(%rbx)

True, but with a register operand we would prefer this (note %k modifier):

short foo (void)
{
	short r;
	asm ("mov %%ds, %k0" : "=r"(r));
	return r;
}

to avoid the 0x66 operand-size prefix even for 16-bit output registers.
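As a rough user-space sketch of the difference (the function names read_ds_k(), read_ds_w() and check_same_selector() are made up for illustration, and the non-x86 fallback is only there so the snippet builds elsewhere), the two forms can be put side by side. Both return the same selector; only the encoding of the instruction differs:

```c
#include <assert.h>

#if defined(__x86_64__) || defined(__i386__)
/* %k0 prints the 32-bit register name (e.g. %eax): no 0x66 prefix. */
unsigned short read_ds_k(void)
{
	unsigned short r;

	__asm__ volatile("mov %%ds, %k0" : "=r" (r));
	return r;
}

/* Plain %0 prints the 16-bit name (e.g. %ax): the 0x66 prefix is emitted. */
unsigned short read_ds_w(void)
{
	unsigned short r;

	__asm__ volatile("mov %%ds, %0" : "=r" (r));
	return r;
}
#else
/* Assumed fallback so the sketch still builds on non-x86 targets. */
unsigned short read_ds_k(void) { return 0; }
unsigned short read_ds_w(void) { return 0; }
#endif

/* Both variants read the same selector value. */
void check_same_selector(void)
{
	assert(read_ds_k() == read_ds_w());
}
```

Disassembling the two helpers (e.g. with objdump -d) should show the prefix on the second one only.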

Also, "mov %ds,%eax" zero-extends the result in the output register all
the way to 64-bit width. When savesegment() is defined as:

+#define savesegment(seg, var) \
+ asm volatile("movl %%" #seg ",%k0" : "=r" (var))

it allows the following patch:

--cut here--
diff --git a/arch/x86/include/asm/elf.h b/arch/x86/include/asm/elf.h
index 2ba5f166e58f..c7f98977663c 100644
--- a/arch/x86/include/asm/elf.h
+++ b/arch/x86/include/asm/elf.h
@@ -187,7 +187,6 @@ void set_personality_ia32(bool);

#define ELF_CORE_COPY_REGS(pr_reg, regs) \
do { \
- unsigned v; \
(pr_reg)[0] = (regs)->r15; \
(pr_reg)[1] = (regs)->r14; \
(pr_reg)[2] = (regs)->r13; \
@@ -211,10 +210,10 @@ do { \
(pr_reg)[20] = (regs)->ss; \
(pr_reg)[21] = x86_fsbase_read_cpu(); \
(pr_reg)[22] = x86_gsbase_read_cpu_inactive(); \
- asm("movl %%ds,%0" : "=r" (v)); (pr_reg)[23] = v; \
- asm("movl %%es,%0" : "=r" (v)); (pr_reg)[24] = v; \
- asm("movl %%fs,%0" : "=r" (v)); (pr_reg)[25] = v; \
- asm("movl %%gs,%0" : "=r" (v)); (pr_reg)[26] = v; \
+ savesegment(ds, (pr_reg)[23]); \
+ savesegment(es, (pr_reg)[24]); \
+ savesegment(fs, (pr_reg)[25]); \
+ savesegment(gs, (pr_reg)[26]); \
} while (0);

--cut here--

that results in:

mov %ds, %eax
mov %rax, <mem>

without intermediate uint32 -> uint64 zext instruction.
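A minimal user-space sketch of this effect, assuming an x86-64 build with the default AT&T dialect (read_cs_zext() and check_zero_extended() are hypothetical names, and the fallback branch is only there so the snippet builds on other targets). The macro is the one quoted above, used with a 64-bit destination:

```c
#include <assert.h>
#include <stdint.h>

#if defined(__x86_64__)
/* Same shape as the savesegment() definition quoted in this message. */
#define savesegment(seg, var) \
	__asm__ volatile("movl %%" #seg ",%k0" : "=r" (var))

/* The 32-bit register write clears the upper half of the 64-bit register. */
uint64_t read_cs_zext(void)
{
	uint64_t v;

	savesegment(cs, v);
	return v;
}
#else
/* Assumed fallback so the sketch still builds on other targets. */
uint64_t read_cs_zext(void) { return 0; }
#endif

/* A selector is 16 bits wide, so nothing above bit 15 should be set. */
void check_zero_extended(void)
{
	assert(read_cs_zext() <= 0xffffu);
}
```

Because the destination is already a 64-bit lvalue, the compiler has no 32-bit temporary to widen, which is where the separate zext would otherwise come from.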

Unfortunately, it is not possible to use %k on memory output operands
(so "=rm" is not allowed with %k). Although the modifier is a no-op with
-masm=att, it breaks compilation with -masm=intel, where the operand is
printed as:

mov %ds, DWORD PTR mb[rip]

Also, we want to store to a memory location that is exactly 2 bytes wide:

+#define __savesegment(seg, loc) \
+do { \
+ BUILD_BUG_ON(sizeof(loc) != 2); \
+ asm volatile("movw %%" #seg ",%0" : "=m" (loc)); \
+} while (0)

This macro will allow a direct segment register store in e.g.:

--cut here--
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 4c718f8adc59..84c8d7a047d6 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -197,8 +197,8 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
p->thread.gsindex = current->thread.gsindex;
p->thread.gsbase = current->thread.gsbase;

- savesegment(es, p->thread.es);
- savesegment(ds, p->thread.ds);
+ __savesegment(es, p->thread.es);
+ __savesegment(ds, p->thread.ds);

if (p->mm && (clone_flags & (CLONE_VM | CLONE_VFORK)) == CLONE_VM)
set_bit(MM_CONTEXT_LOCK_LAM, &p->mm->context.flags);
--cut here--
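For experimenting outside the kernel, the memory-only variant can be approximated as below. This is only a sketch: savesegment_mem(), struct thread_sel, save_data_segments() and check_stable_selectors() are hypothetical names, _Static_assert stands in for BUILD_BUG_ON(), and the non-x86 branch exists only so the snippet builds elsewhere.

```c
#include <assert.h>

/* Hypothetical stand-in for the two 16-bit selector fields being saved. */
struct thread_sel {
	unsigned short es;
	unsigned short ds;
};

#if defined(__x86_64__) || defined(__i386__)
/* Memory-only store; rejects any destination that is not 2 bytes wide. */
#define savesegment_mem(seg, loc) \
do { \
	_Static_assert(sizeof(loc) == 2, "destination must be 2 bytes"); \
	__asm__ volatile("movw %%" #seg ",%0" : "=m" (loc)); \
} while (0)

void save_data_segments(struct thread_sel *t)
{
	savesegment_mem(es, t->es);
	savesegment_mem(ds, t->ds);
}
#else
/* Assumed fallback so the sketch still builds on other targets. */
void save_data_segments(struct thread_sel *t)
{
	t->es = 0;
	t->ds = 0;
}
#endif

/* Two saves must agree regardless of what the fields held before. */
void check_stable_selectors(void)
{
	struct thread_sel a = { 0xffff, 0xffff };
	struct thread_sel b = { 0, 0 };

	save_data_segments(&a);
	save_data_segments(&b);
	assert(a.es == b.es && a.ds == b.ds);
}
```

Passing an int or long lvalue to savesegment_mem() fails at compile time, which is the property the size check is meant to enforce.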

Uros.