RE: [RFC PATCH v2 1/1] platform-msi: Add platform check for subdevice irq domain

From: Tian, Kevin
Date: Thu Jan 07 2021 - 01:56:12 EST


> From: Leon Romanovsky <leon@xxxxxxxxxx>
> Sent: Thursday, January 7, 2021 2:09 PM
>
> On Thu, Jan 07, 2021 at 02:04:29AM +0000, Tian, Kevin wrote:
> > > From: Leon Romanovsky <leon@xxxxxxxxxx>
> > > Sent: Thursday, January 7, 2021 12:02 AM
> > >
> > > On Wed, Jan 06, 2021 at 11:23:39AM -0400, Jason Gunthorpe wrote:
> > > > On Wed, Jan 06, 2021 at 12:40:17PM +0200, Leon Romanovsky wrote:
> > > >
> > > > > I asked what will you do when QEMU will gain needed functionality?
> > > > > Will you remove QEMU from this list? If yes, how such "new" kernel
> will
> > > > > work on old QEMU versions?
> > > >
> > > > The needed functionality is some VMM hypercall, so presumably new
> > > > kernels that support calling this hypercall will be able to discover
> > > > if the VMM hypercall exists and if so superceed this entire check.
> > >
> > > Let's not speculate, do we have well-known path?
> > > Will such patch be taken to stable@/distros?
> > >
> >
> > There are two functions introduced in this patch. One is to detect whether
> > running on bare metal or in a virtual machine. The other is for deciding
> > whether the platform supports ims. Currently the two are identical because
> > ims is supported only on bare metal at current stage. In the future it will
> look
> > like below when ims can be enabled in a VM:
> >
> > bool arch_support_pci_device_ims(struct pci_dev *pdev)
> > {
> > return on_bare_metal() || hypercall_irq_domain_supported();
> > }
> >
> > The VMM vendor list is for on_bare_metal, and suppose a vendor will
> > never be removed once being added to the list since the fact of running
> > in a VM never changes, regardless of whether this hypervisor supports
> > extra VMM hypercalls.
>
> This is what I imagined, this list will be forever, and this worries me.
>
> I don't know if it is true or not, but guess that at least Oracle and
> Microsoft bare metal devices and VMs will have same DMI_SYS_VENDOR.

It's true. David Woodhouse also said it's the case for Amazon EC2 instances.

>
> It means that this on_bare_metal() function won't work reliably in many
> cases. Also being part of include/linux/msi.h, at some point of time,
> this function will be picked by the users outside for the non-IMS cases.
>
> I didn't even mention custom forks of QEMU which are prohibited to change
> DMI_SYS_VENDOR and private clouds with custom solutions.

In this case the private QEMU forks are encouraged to set CPUID (X86_
FEATURE_HYPERVISOR) if they do plan to adopt a different vendor name.

>
> The current array makes DMI_SYS_VENDOR interface as some sort of ABI. If
> in the future,
> the QEMU will decide to use more hipster name, for example "qEmU", this
> function
> won't work.
>
> I'm aware that DMI_SYS_VENDOR is used heavily in the kernel code and
> various names for the same company are good example how not reliable it.
>
> The most hilarious example is "Dell/Dell Inc./Dell Inc/Dell Computer
> Corporation/Dell Computer",
> but other companies are not far from them.
>
> Luckily enough, this identification is used for hardware product that
> was released to the market and their name will be stable for that
> specific model. It is not the case here where we need to ensure future
> compatibility too (old kernel on new VM emulator).
>
> I'm not in position to say yes or no to this patch and don't have plans to do it.
> Just expressing my feeling that this solution is too hacky for my taste.
>

I agree with your worries and solely relying on DMI_SYS_VENDOR is
definitely too hacky. In previous discussions with Thomas there is no
elegant way to handle this situation. It has to be a heuristic approach.
First we hope the CPUID bit is set properly in most cases thus is checked
first. Then other heuristics can be made for the remaining cases. DMI_
SYS_VENDOR is the first hint and more can be added later. For example,
when IOMMU is present there is vendor specific way to detect whether
it's real or virtual. Dave also mentioned some BIOS flag to indicate a
virtual machine. Now probably the real question here is whether people
are OK with CPUID+DMI_SYS_VENDOR combo check for now (and grow
it later) or prefer to having all identified heuristics so far in-place together...

Thanks
Kevin