Re: [PATCH v3 4/7] x86/hyper-v: allocate and use Virtual Processor Assist Pages

From: Peter Zijlstra
Date: Thu Mar 15 2018 - 09:49:03 EST


On Thu, Mar 15, 2018 at 12:45:03PM +0100, Thomas Gleixner wrote:
> On Thu, 15 Mar 2018, Vitaly Kuznetsov wrote:
> > Thomas Gleixner <tglx@xxxxxxxxxxxxx> writes:
> > > On Fri, 9 Mar 2018, Vitaly Kuznetsov wrote:
> > >> @@ -198,6 +218,12 @@ static int hv_cpu_die(unsigned int cpu)
> > >>  	struct hv_reenlightenment_control re_ctrl;
> > >>  	unsigned int new_cpu;
> > >>
> > >> +	if (hv_vp_assist_page && hv_vp_assist_page[cpu]) {
> > >> +		wrmsrl(HV_X64_MSR_VP_ASSIST_PAGE, 0);
> > >> +		vfree(hv_vp_assist_page[cpu]);
> > >> +		hv_vp_assist_page[cpu] = NULL;
> > >
> > > So this is freed before the CPU is actually dead. And this runs in
> > > preemptible context. Is the wrmsrl(HV_X64_MSR_VP_ASSIST_PAGE, 0); enough to
> > > prevent eventual users of the assist page on the outgoing CPU from
> > > accessing it?
> > >
> >
> > After we do wrmsrl() the page is no longer 'magic' so in case eventual
> > users try using it they'll most likely misbehave -- so changing the
> > shutdown order won't help.
> >
> > The only user of these pages is currently KVM. Can we still have vCPUs
> > running on the outgoing CPU at this point? If we can, we're in
> > trouble and we need to somehow kick them out first.
>
> The first thing we do in unplug is to mark the CPU inactive, but I'm not
> sure whether that prevents something which was on the CPU before and
> perhaps preempted or is affine to that CPU to be scheduled in
> again. Peter????

I think we can still have tasks running at this point.

AP_ACTIVE (sched_cpu_deactivate) simply takes the CPU out of the active
mask, which guarantees no new tasks will land on the CPU.

We'll then proceed all the way to TEARDOWN_CPU as 'normal', at which
point we'll call stop_machine() which does the old DYING callbacks.

It sounds like we want the assist page torn down at that point, although
possibly we can't do vfree() from that context, in which case the DYING
callback needs to store the pointer and the vfree() has to happen later
from a BP callback (what used to be the OFFLINE callbacks or something).
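
A rough userspace model of that split, in case it helps the discussion. It
is only a sketch: the model_* functions, pending_free[] and msr_enabled[]
are made-up names, malloc()/free() stand in for vmalloc()/vfree(), and a
plain flag stands in for the wrmsrl() on HV_X64_MSR_VP_ASSIST_PAGE. It
assumes the dying step cannot free memory, which is the "possibly" above.

```c
/* Userspace model only, not kernel code. All names are hypothetical. */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NR_CPUS 4

static void *assist_page[NR_CPUS];   /* stands in for hv_vp_assist_page[] */
static void *pending_free[NR_CPUS];  /* pointers stashed by the dying step */
static int msr_enabled[NR_CPUS];     /* models the per-CPU assist page MSR */

/* Online step: allocate the page and mark it enabled. */
static int model_cpu_init(unsigned int cpu)
{
	assist_page[cpu] = malloc(4096);
	if (!assist_page[cpu])
		return -1;
	msr_enabled[cpu] = 1;
	return 0;
}

/*
 * Dying step: runs on the outgoing CPU, in a context assumed to be
 * unable to free memory. Disable the page and stash the pointer only.
 */
static void model_cpu_dying(unsigned int cpu)
{
	if (!assist_page[cpu])
		return;
	msr_enabled[cpu] = 0;                 /* models wrmsrl(..., 0) */
	pending_free[cpu] = assist_page[cpu]; /* defer the free */
	assist_page[cpu] = NULL;
}

/* Later step on a surviving CPU: freeing is allowed here. */
static void model_cpu_dead(unsigned int cpu)
{
	assert(!msr_enabled[cpu]);            /* must already be disabled */
	free(pending_free[cpu]);              /* free(NULL) is a no-op */
	pending_free[cpu] = NULL;
}
```

The point of the two-step shape is that nothing on the outgoing CPU can
observe an enabled-but-freed page: the disable happens while the CPU is
still the one running the callback, and the free only happens afterwards.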