Re: Getting rid of inside_vm in intel8x0

From: George Dunlap
Date: Mon Apr 04 2016 - 05:07:50 EST


On 02/04/16 13:57, Andy Lutomirski wrote:
> On Fri, Apr 1, 2016 at 10:33 PM, Takashi Iwai <tiwai@xxxxxxx> wrote:
>> On Sat, 02 Apr 2016 00:28:31 +0200,
>> Luis R. Rodriguez wrote:
>>> If the former, could we somehow detect an emulated device other than through
>>> this type of check ? Or could we *add* a capability of some sort to detect it
>>> on the driver ? This would not address the removal, but it could mean finding a
>>> way to address emulation issues.
>>>
>>> If it's an IO issue -- exactly what is causing the delays in IO ?
>>
>> Luis, there is no problem about emulation itself. It's rather an
>> optimization to lighten the host side load, as I/O access on a VM is
>> much heavier.
>>
>>>>>> This is satisfied mostly only on VM, and can't
>>>>>> be measured easily unlike the IO read speed.
>>>>>
>>>>> Interesting, note the original patch claimed it was for KVM and
>>>>> Parallels hypervisor only, but since the code uses:
>>>>>
>>>>> +#if defined(__i386__) || defined(__x86_64__)
>>>>> + inside_vm = inside_vm || boot_cpu_has(X86_FEATURE_HYPERVISOR);
>>>>> +#endif
>>>>>
>>>>> This makes it apply to Xen as well, which makes this hack
>>>>> broader, but is it only applicable when an emulated device is
>>>>> used ? What about if a hypervisor is used and PCI passthrough is
>>>>> used ?
>>>>
>>>> A good question. Xen was added there at the time from positive
>>>> results by quick tests, but it might show an issue if it's running on
>>>> a very old chip with PCI passthrough. But I'm not sure whether PCI
>>>> passthrough would work on such old chipsets at all.
>>>
>>> If it did have an issue then that would have to be special cased, that
>>> is the module parameter would not need to be enabled for such type of
>>> systems, and heuristics would be needed. As you note, fortunately this
>>> may not be common though...
>>
>> Actually this *is* a module parameter. If set to a boolean value, it
>> can be applied / skipped forcibly. So, if there had been a problem on
>> Xen, it should have been reported. That's why I wrote it's not a
>> common case. This comes from real experience.
>>
>>> but if this type of work around may be
>>> taken as a precedent to enable other types of hacks in other drivers
>>> I'm very fearful of more hacks later needing these considerations as
>>> well.
>>>
>>>>>>> There are a pile of nonsensical "are we in a VM" checks of various
>>>>>>> sorts scattered throughout the kernel, they're all a mess to maintain
>>>>>>> (there are lots of kinds of VMs in the world, and Linux may not even
>>>>>>> know it's a guest), and, in most cases, it appears that the correct
>>>>>>> solution is to delete the checks. I just removed a nasty one in the
>>>>>>> x86_32 entry asm, and this one is written in C so it should be a piece
>>>>>>> of cake :)
>>>>>>
>>>>>> This cake looks sweet, but a worm is hidden behind the cream.
>>>>>> The loop in the code itself is already a kludge for the buggy hardware
>>>>>> where the inconsistent read happens not so often (only at the boundary
>>>>>> and in a racy way). It would be nice if we can have a more reliable
>>>>>> way to know the hardware bugginess, but it's difficult,
>>>>>> unsurprisingly.
>>>>>
>>>>> The concern here is setting precedents for VM cases sprinkled in the kernel.
>>>>> The assumption here is such special cases are really papering over another
>>>>> type of issue, so it's best to ultimately try to root cause the issue in
>>>>> a more generalized fashion.
>>>>
>>>> Well, it's rather bare metal that shows the buggy behavior, thus we
>>>> need to paper over it. In that sense, it's the other way round; we don't
>>>> tune for VM. The VM check we're discussing is rather for skipping the
>>>> strange workaround.
>>>
>>> What is it exactly about a VM that enables this work around to be skipped?
>>> I don't quite get it yet.
>>
>> VM -- at least the full one with the sound hardware emulation --
>> doesn't have the hardware bug. So, the check isn't needed.
>
> Here's the issue, though: asking "am I in a VM" is not a good way to
> learn properties of hardware. Just off the top of my head, here are
> some types of VM and what they might imply about hardware:
>
> Intel Kernel Guard: your sound card is passed through from real hardware.
>
> Xen: could go either way. In dom0, it's likely passed through. In
> domU, it could be passed through or emulated, and I believe this is
> the case for all of the Xen variants.
>
> KVM: Probably emulated, but could be passed through.

I'm not sure exactly why I was CC'd into this thread, but this is an
important point -- even if you're running in a VM, you may actually have
direct un-emulated IO access to a real (buggy) piece of hardware; in
which case it sounds like you still need the work-around. So
boot_cpu_has(X86_FEATURE_HYPERVISOR) is probably not the right check.
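
To make the distinction concrete, here is a rough sketch (plain userspace
C, not the actual driver code) of a decision that keys off the device
rather than the CPU's hypervisor flag. Everything named here is an
assumption for illustration: struct fake_pci_dev, skip_workaround(), the
-1/0/1 parameter convention, and the EMU_* subsystem IDs are hypothetical
and not taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the few PCI fields this sketch needs. */
struct fake_pci_dev {
    unsigned short subsystem_vendor;
    unsigned short subsystem_device;
};

/* Assumed IDs identifying an emulated sound device; illustrative only. */
#define EMU_SUBVENDOR 0x1af4
#define EMU_SUBDEVICE 0x1100

/*
 * Decide whether the workaround for the buggy hardware can be skipped.
 * param: -1 = auto-detect, 0 = always keep the workaround, 1 = always skip.
 * cpu_hypervisor_flag is accepted but deliberately ignored: running under
 * a hypervisor says nothing about whether the device is emulated or
 * passed through.
 */
static bool skip_workaround(int param, bool cpu_hypervisor_flag,
                            const struct fake_pci_dev *dev)
{
    assert(dev != NULL);
    (void)cpu_hypervisor_flag;

    if (param >= 0)            /* explicit override from the user */
        return param != 0;

    /* Auto mode: skip only when the device itself looks emulated. */
    return dev->subsystem_vendor == EMU_SUBVENDOR &&
           dev->subsystem_device == EMU_SUBDEVICE;
}
```

With this shape, a passed-through real device inside a guest keeps the
work-around in auto mode even though the hypervisor flag is set, and the
explicit parameter values still force either behaviour.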

-George