Re: [PATCH v3 1/4] x86/clear_page: extend clear_page*() for multi-page clearing
From: Peter Zijlstra
Date: Mon Apr 14 2025 - 07:03:26 EST
On Mon, Apr 14, 2025 at 08:32:29AM +0200, Ingo Molnar wrote:
> > static inline void clear_page(void *page)
> > {
> > + unsigned int length = PAGE_SIZE;
> > /*
> > - * Clean up KMSAN metadata for the page being cleared. The assembly call
> > + * Clean up KMSAN metadata for the pages being cleared. The assembly call
> > * below clobbers @page, so we perform unpoisoning before it.
>
> > */
> > - kmsan_unpoison_memory(page, PAGE_SIZE);
> > - alternative_call_2(clear_page_orig,
> > - clear_page_rep, X86_FEATURE_REP_GOOD,
> > - clear_page_erms, X86_FEATURE_ERMS,
> > + kmsan_unpoison_memory(page, length);
> > +
> > + alternative_call_2(clear_pages_orig,
> > + clear_pages_rep, X86_FEATURE_REP_GOOD,
> > + clear_pages_erms, X86_FEATURE_ERMS,
> > "=D" (page),
> > - "D" (page),
> > + ASM_INPUT("D" (page), "S" (length)),
> > "cc", "memory", "rax", "rcx");
> > }
> >
> > diff --git a/arch/x86/lib/clear_page_64.S b/arch/x86/lib/clear_page_64.S
> > index a508e4a8c66a..bce516263b69 100644
> > --- a/arch/x86/lib/clear_page_64.S
> > +++ b/arch/x86/lib/clear_page_64.S
> > @@ -13,20 +13,35 @@
> > */
> >
> > /*
> > - * Zero a page.
> > - * %rdi - page
> > + * Zero kernel page aligned region.
> > + *
> > + * Input:
> > + * %rdi - destination
> > + * %esi - length
> > + *
> > + * Clobbers: %rax, %rcx
> > */
> > -SYM_TYPED_FUNC_START(clear_page_rep)
> > - movl $4096/8,%ecx
> > +SYM_TYPED_FUNC_START(clear_pages_rep)
> > + movl %esi, %ecx
> > xorl %eax,%eax
> > + shrl $3,%ecx
> > rep stosq
> > RET
> > -SYM_FUNC_END(clear_page_rep)
> > -EXPORT_SYMBOL_GPL(clear_page_rep)
> > +SYM_FUNC_END(clear_pages_rep)
> > +EXPORT_SYMBOL_GPL(clear_pages_rep)
> >
> > -SYM_TYPED_FUNC_START(clear_page_orig)
> > +/*
> > + * Original page zeroing loop.
> > + * Input:
> > + * %rdi - destination
> > + * %esi - length
> > + *
> > + * Clobbers: %rax, %rcx, %rflags
> > + */
> > +SYM_TYPED_FUNC_START(clear_pages_orig)
> > + movl %esi, %ecx
> > xorl %eax,%eax
> > - movl $4096/64,%ecx
> > + shrl $6,%ecx
>
> So if the natural input parameter is RCX, why is this function using
> RSI as the input 'length' parameter? Causes unnecessary register
> shuffling.
This symbol is written as a C function with C calling convention, even
though it is only meant to be called from that clear_page() alternative.
If we want to go change all this, then we should go do the same we do
for __clear_user() and write it thusly:
asm volatile(ALTERNATIVE("rep stosb",
"call rep_stos_alternative", ALT_NOT(X86_FEATURE_FSRS)
: "+c" (size), "+D" (addr), ASM_CALL_CONSTRAINT
: "a" (0))
And forget about all those clear_page_*() thingies.