Re: [PATCH v6 2/5] x86/kexec: do unconditional WBINVD for bare-metal in relocate_kernel()

From: Huang, Kai
Date: Tue Sep 10 2024 - 05:53:10 EST


On Tue, 2024-09-10 at 02:46 +0000, Kaplan, David wrote:
> > -----Original Message-----
> > From: Huang, Kai <kai.huang@xxxxxxxxx>
> > Sent: Monday, September 9, 2024 9:42 PM
> > To: Kaplan, David <David.Kaplan@xxxxxxx>; Hansen, Dave
> > <dave.hansen@xxxxxxxxx>; bp@xxxxxxxxx; tglx@xxxxxxxxxxxxx;
> > peterz@xxxxxxxxxxxxx; mingo@xxxxxxxxxx; hpa@xxxxxxxxx;
> > kirill.shutemov@xxxxxxxxxxxxxxx
> > Cc: x86@xxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; pbonzini@xxxxxxxxxx;
> > seanjc@xxxxxxxxxx; Williams, Dan J <dan.j.williams@xxxxxxxxx>; Lendacky,
> > Thomas <Thomas.Lendacky@xxxxxxx>; Edgecombe, Rick P
> > <rick.p.edgecombe@xxxxxxxxx>; Yamahata, Isaku
> > <isaku.yamahata@xxxxxxxxx>; Kalra, Ashish <Ashish.Kalra@xxxxxxx>;
> > bhe@xxxxxxxxxx; nik.borisov@xxxxxxxx; sagis@xxxxxxxxxx; Dave Young
> > <dyoung@xxxxxxxxxx>
> > Subject: Re: [PATCH v6 2/5] x86/kexec: do unconditional WBINVD for bare-
> > metal in relocate_kernel()
> >
> > > > --- a/arch/x86/kernel/machine_kexec_64.c
> > > > +++ b/arch/x86/kernel/machine_kexec_64.c
> > > > @@ -322,16 +322,9 @@ void machine_kexec_cleanup(struct kimage
> > *image)
> > > > void machine_kexec(struct kimage *image) {
> > > > unsigned long page_list[PAGES_NR];
> > > > - unsigned int host_mem_enc_active;
> > > > int save_ftrace_enabled;
> > > > void *control_page;
> > > >
> > > > - /*
> > > > - * This must be done before load_segments() since if call depth
> > tracking
> > > > - * is used then GS must be valid to make any function calls.
> > > > - */
> > > > - host_mem_enc_active =
> > > > cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT);
> > > > -
> > >
> > > Functionally the patch looks fine. I would suggest keeping some form of
> > this comment though, because the limitation about not being able to make
> > function calls after load_segments() is arguably non-obvious and this
> > comment served as a warning for future modifications in this area.
> >
> > Yeah this makes sense. Thanks.
> >
> > I think we can add some text to the existing comment of load_segments() to
> > call out this. Allow me to dig into more about call depth tracking to
> > understand it better -- relocate_kernel() after load_segments() seems to be a
> > real function call and I want to know how does it interact with call depth
> > tracking.
>
> That one is explicitly ignored, see skip_addr() in arch/x86/kernel/callthunks.c
>

That was what I thought too. Thanks for pointing it out.

How about the below?

--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -351,6 +351,11 @@ void machine_kexec(struct kimage *image)
*
* I take advantage of this here by force loading the
* segments, before I zap the gdt with an invalid value.
+ *
+ * Note that this resets GS to 0. Don't make any function calls
+ * after this point, since call depth tracking uses per-cpu
+ * variables to operate (relocate_kernel() is explicitly ignored
+ * by call depth tracking).
*/

Btw, it would be very helpful if you could verify that this patch doesn't break
call depth tracking in your environment. Thanks!
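
To make the constraint in the new comment concrete, here is a rough userspace
model of it (not kernel code). All names in it (percpu_area, gs_base,
model_load_segments, model_call_thunk) are made up for illustration; the only
assumption carried over from the discussion above is that call depth accounting
reaches its per-cpu state through the GS base, and that load_segments() leaves
that base unusable.

```c
#include <stddef.h>

/* Hypothetical stand-in for the per-cpu data that call depth tracking uses. */
struct percpu_area {
	long call_depth;
};

static struct percpu_area cpu0_area;

/* Models the GS base: per-cpu accesses go through this pointer. */
static struct percpu_area *gs_base = &cpu0_area;

/* Models load_segments(): afterwards the per-cpu area is unreachable. */
static void model_load_segments(void)
{
	gs_base = NULL;
}

/*
 * Models the accounting done on a tracked function call.
 * Returns 0 on success, or -1 where the real access would go
 * through an invalid GS base.
 */
static int model_call_thunk(void)
{
	if (gs_base == NULL)
		return -1;
	gs_base->call_depth++;
	return 0;
}
```

In this model, model_call_thunk() succeeds before model_load_segments() and
fails after it, which is the ordering problem the comment warns about. A callee
that is skipped by the tracking, as relocate_kernel() is, never reaches the
per-cpu access at all.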