Re: [PATCH 0/9] xen/x86: PVH Dom0 fixes and fallout adjustments

From: Roger Pau Monné
Date: Tue Sep 14 2021 - 07:22:36 EST


On Tue, Sep 14, 2021 at 11:03:23AM +0200, Jan Beulich wrote:
> On 14.09.2021 10:32, Roger Pau Monné wrote:
> > On Tue, Sep 07, 2021 at 12:04:34PM +0200, Jan Beulich wrote:
> >> In order to try to debug hypervisor side breakage from XSA-378 I found
> >> myself urged to finally give PVH Dom0 a try. Sadly things didn't work
> >> quite as expected. In the course of investigating these issues I actually
> >> spotted one piece of PV Dom0 breakage as well, a fix for which is also
> >> included here.
> >>
> >> There are two immediate remaining issues (also mentioned in affected
> >> patches):
> >>
> >> 1) It is not clear to me how PCI device reporting is to work. PV Dom0
> >> reports devices as they're discovered, including ones the hypervisor
> >> may not have been able to discover itself (ones on segments other
> >> than 0 or hotplugged ones). The respective hypercall, however, is
> >> inaccessible to PVH Dom0. Depending on the answer to this, either
> >> the hypervisor will need changing (to permit the call) or patch 2
> >> here will need further refinement.
> >
> > I would prefer if we could limit the hypercall usage to only
> > report hotplugged segments to Xen. Then Xen would have to scan the
> > segment when reported and add any devices found.
> >
> > Such hypercall must be used before dom0 tries to access any device, as
> > otherwise the BARs won't be mapped in the second stage translation and
> > the traps for the MCFG area won't be set up either.
>
> This might work if hotplugging would only ever be of segments, and not
> of individual devices. Yet the latter is, I think, a common case (as
> far as hotplugging itself is "common").

Right, I agree with using hypercalls to report either hotplugged
segments or devices. However, I would like to avoid mandating the
hypercall for non-hotplug devices, so that OSes without hotplug
support don't need to care about making use of those hypercalls
at all.
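
To make the hotplug-only idea concrete, here is a minimal sketch of what
the dom0 side could look like. Everything in it is an assumption for
illustration: `report_req`, `xen_report_stub()` and
`dom0_device_appeared()` are hypothetical names, and the stub merely
records the request instead of issuing a real hypercall.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical request layout: segment/bus/devfn of the new device. */
struct report_req {
    uint16_t seg;
    uint8_t bus;
    uint8_t devfn;
};

/* Stand-in for the hypercall: remembers the last request, counts calls. */
static struct report_req last_req;
static unsigned int nr_reports;

static int xen_report_stub(const struct report_req *req)
{
    last_req = *req;
    nr_reports++;
    return 0;
}

/*
 * Called by the (hypothetical) dom0 PCI core when a device shows up.
 * Devices present at boot are assumed to be found by Xen's own scan,
 * so only hotplugged ones are reported.
 */
static int dom0_device_appeared(uint16_t seg, uint8_t bus, uint8_t devfn,
                                bool hotplugged)
{
    struct report_req req = { .seg = seg, .bus = bus, .devfn = devfn };

    if (!hotplugged)
        return 0; /* nothing to do: no hypercall required */

    /* Must happen before dom0 touches the device's BARs or config space. */
    return xen_report_stub(&req);
}
```

An OS without hotplug support would simply never reach the reporting
path.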

> Also don't forget about SR-IOV VFs - they would typically not be there
> when booting. They would materialize when the PF driver initializes
> the device. This is, I think, something that can be dealt with by
> intercepting writes to the SR-IOV capability.

My plan was to indeed trap SR-IOV capability accesses, see:

https://lore.kernel.org/xen-devel/20180717094830.54806-1-roger.pau@xxxxxxxxxx/

I just don't have time ATM to continue this work.
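
For reference, the trap on the SR-IOV capability would roughly need to
do the following once the PF driver sets VF Enable. The register offsets
and the routing ID formula are from the PCIe SR-IOV capability layout;
the `sriov_regs` structure and the helper names are hypothetical and not
taken from the series linked above.

```c
#include <stdbool.h>
#include <stdint.h>

#define SRIOV_CTRL       0x08   /* SR-IOV Control register offset */
#define SRIOV_CTRL_VFE   0x0001 /* VF Enable bit */

/* Hypothetical snapshot of the capability fields the handler needs. */
struct sriov_regs {
    uint16_t num_vfs;   /* NumVFs, offset 0x10 */
    uint16_t vf_offset; /* First VF Offset, offset 0x14 */
    uint16_t vf_stride; /* VF Stride, offset 0x16 */
};

/* True if a write to the control register turns VF Enable from 0 to 1. */
static bool sriov_vfs_being_enabled(uint16_t old_ctrl, uint16_t new_ctrl)
{
    return !(old_ctrl & SRIOV_CTRL_VFE) && (new_ctrl & SRIOV_CTRL_VFE);
}

/*
 * Routing ID (bus << 8 | devfn) of the n-th VF, n starting at 1:
 * PF routing ID + First VF Offset + (n - 1) * VF Stride.
 */
static uint16_t sriov_vf_rid(uint16_t pf_rid, const struct sriov_regs *regs,
                             unsigned int n)
{
    return pf_rid + regs->vf_offset + (n - 1) * regs->vf_stride;
}
```

On a 0 -> 1 transition of VF Enable the handler would then walk n from 1
to NumVFs and register each resulting device, the same way it would for
devices found by a scan.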

> But I wonder whether
> there might be other cases where devices become "visible" only while
> the Dom0 kernel is already running.

I would consider those a kind of hotplug device, and hence they would
require the use of the hypercall in order to notify Xen about them.

> >> 2) Dom0, unlike in the PV case, cannot access the screen (to use as a
> >> console) when in a non-default mode (i.e. not 80x25 text), as the
> >> necessary information (in particular about VESA-based LFB modes) is
> >> not communicated. On the hypervisor side this looks like deliberate
> >> behavior, but it is unclear to me what the intentions were towards
> >> an alternative model. (X may be able to access the screen depending
> >> on whether it has a suitable driver besides the presently unusable
> >> /dev/fb<N> based one.)
> >
> > I have to admit most of my boxes are headless servers, albeit I have
> > one NUC I can use to test gfx stuff, so I don't really use gfx output
> > with Xen.
> >
> > As I understand it, such information is fetched from the BIOS and passed
> > into Xen, which should then hand it over to the dom0 kernel?
>
> That's how PV Dom0 learns of the information, yes. See
> fill_console_start_info(). (I'm in the process of eliminating the
> need for some of the "fetch from BIOS" in Xen right now, but that's
> not going to get us as far as being able to delete that code, no
> matter how much in particular Andrew would like that to happen.)
>
> > I guess the only way for Linux dom0 kernel to fetch that information
> > would be to emulate the BIOS or drop into realmode and issue the BIOS
> > calls?
>
> Native Linux gets this information passed from the boot loader, I think
> (except in the EFI case, as per below).
>
> > Is that an issue on UEFI also, or can dom0 there fetch the framebuffer
> > info using the PV EFI interface?
>
> There it's EFI boot services functions which can be invoked before
> leaving boot services (in the native case). Aiui the PVH entry point
> lives logically past any EFI boot services interaction, and hence
> using them is not an option (if there was EFI firmware present in Dom0
> in the first place, which I consider difficult all by itself - this
> can't be the physical system's firmware, but I also don't see where
> virtual firmware would be taken from).
>
> There is no PV EFI interface to obtain video information. With the
> needed information getting passed via start_info, PV has no need for
> such, and I would be hesitant to add a fundamentally redundant
> interface for PVH. The more that the information needed isn't EFI-
> specific at all.

I think our only option is to expand the HVM start info to convey that
data from Xen to dom0.
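
Something along these lines, purely as a sketch: the `video_paddr`
field, the version number 2 and the `fb_info` layout are made up for
illustration and are not part of the current hvm_start_info ABI. The
point is only that a consumer checks the version before touching the new
field.

```c
#include <stdint.h>

#define START_INFO_MAGIC 0x336ec578u /* XEN_HVM_START_MAGIC_VALUE */

/* Hypothetical framebuffer description handed from Xen to dom0. */
struct fb_info {
    uint64_t lfb_base;
    uint32_t width, height;
    uint32_t bytes_per_line;
    uint32_t bits_per_pixel;
};

/* Cut-down, hypothetical start info: only the fields used here. */
struct start_info_sketch {
    uint32_t magic;
    uint32_t version;
    uint64_t video_paddr; /* assumed new field, valid if version >= 2 */
};

/* Return the physical address of the video info, or 0 if not provided. */
static uint64_t start_info_video(const struct start_info_sketch *si)
{
    if (si->magic != START_INFO_MAGIC || si->version < 2)
        return 0;
    return si->video_paddr;
}
```

A kernel seeing an older version would fall back to whatever it does
today, so existing guests would not be affected.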

Thanks, Roger.