RE: [PATCH v1 2/2] x86/hyperv: make HvNotifyLongSpinWait hypercall

From: Michael Kelley
Date: Wed Oct 24 2018 - 12:53:55 EST


From: Yi Sun <yi.y.sun@xxxxxxxxxxxxxxx> Sent: Friday, October 19, 2018 6:14 AM
>
> The HvNotifyLongSpinWait hypercall (HVCALL_NOTIFY_LONG_SPIN_WAIT)
> is used by a guest OS to notify the hypervisor that the calling
> virtual processor is attempting to acquire a resource that is
> potentially held by another virtual processor within the same
> Virtual Machine. This scheduling hint improves the scalability of
> VMs with more than one virtual processor on Hyper-V.
>
> Per MSFT TLFS, the retry number (SpinWaitInfo) is sent to hypervisor
> only when the retry number exceeds the recommended number. If
> recommended number is 0xFFFFFFFF, never retry.

The HvNotifyLongSpinWait hypercall should be understood to be
advisory only. As you noted, it is a scheduling hint to the
hypervisor that some virtual CPU in the VM holds a spin lock. Even
though Linux knows which vCPU holds the spin lock, the hypercall
does not provide a way to give that information to Hyper-V. The
hypercall always returns immediately.

The "retry number" is a bit mis-named in the Hyper-V Top Level
Functional Spec (TLFS). It is essentially a threshold value. Hyper-V is
saying "don't bother to advise me about the spin lock until you have
done a certain number of spins." This threshold prevents
over-notifying Hyper-V such that the notification becomes somewhat
meaningless. It's not immediately clear to me why the hypercall passes
that value as an input, but maybe it lets the Hyper-V scheduler prioritize
among vCPUs based on how many times they have spun for a lock. I
think we were told that current Hyper-V implementations ignore this
input value anyway.

I believe the description of the sentinel value 0xFFFFFFFF in the
Hyper-V TLFS is incorrect. Because it is the max possible threshold
value, that value in the EBX register just means to not ever bother to
notify. The description should be "0xFFFFFFFF indicates never to notify."
The value does *not* indicate anything about retrying to obtain the
spin lock.

> static bool __initdata hv_pvspin = true;
>
> +bool hv_notify_long_spin_wait(int retry_num)

retry_num should probably be declared as unsigned int. You
don't want it to be treated as a negative number if the high
order bit is set.

> +{
> + /*
> + * Per MSFT TLFS, the SpinWaitInfo is sent to hypervisor only when
> + * the retry number exceeds the recommended number.
> + *
> + * If recommended number is 0xFFFFFFFF, never retry.
> + */
> + if (ms_hyperv.num_spin_retry == HYPERV_SPINLOCK_RETRY_NEVER)
> + return false;
> +
> + if ((0 == retry_num % ms_hyperv.num_spin_retry) && retry_num)

I don't know if the "%" function is right here. Your implementation will
notify Hyper-V on every multiple of num_spin_retry. The alternative is to
notify once when the threshold is exceeded, and never again for this
particular attempt to obtain a spin lock. We should check with the Hyper-V
team for which approach they expect to be used.

> + hv_do_fast_hypercall8(HVCALL_NOTIFY_LONG_SPIN_WAIT,
> + retry_num);

The Hyper-V TLFS seems to be inconsistent on whether the input parameter
is 32-bits or 64-bits. In one place it is typed as UINT64, but in another place
it is shown as only 4 bytes. Need to clear this up with the Hyper-V team as
well.

> +
> + return true;

I don't see a need for this function to return true vs. false. Any calling code
should not change its behavior based on num_spin_retry. This function will
either notify Hyper-V or not notify Hyper-V, depending on whether the number
of attempts to obtain the spinlock meets the threshold. But calling code will
do the same thing regardless of whether such a notification is made.

Michael