Re: [PATCH] xen/pci: try to reserve MCFG areas earlier

From: Igor Druzhinin
Date: Fri Sep 06 2019 - 19:04:38 EST




On 06/09/2019 23:30, Boris Ostrovsky wrote:
> On 9/3/19 8:20 PM, Igor Druzhinin wrote:
>> If MCFG area is not reserved in E820, Xen by default will defer its usage
>> until Dom0 registers it explicitly after ACPI parser recognizes it as
>> a reserved resource in DSDT. Having it reserved in E820 is not
>> mandatory according to "PCI Firmware Specification, rev 3.2" (par. 4.1.2)
>> and firmware is free to keep a hole in E820 in that place. Xen doesn't know
>> what exactly is inside this hole since it lacks full ACPI view of the
>> platform therefore it's potentially harmful to access MCFG region
>> without additional checks as some machines are known to provide
>> inconsistent information on the size of the region.
>>
>> Now xen_mcfg_late() runs after acpi_init() which is too late as some basic
>> PCI enumeration starts exactly there. Trying to register a device prior
>> to MCFG reservation causes multiple problems with PCIe extended
>> capability initializations in Xen (e.g. SR-IOV VF BAR sizing). There are
>> no convenient hooks for us to subscribe to so try to register MCFG
>> areas earlier upon the first invocation of xen_add_device().
>
>
> Where is MCFG parsed? pci_arch_init()?

It is parsed twice:
1) the early pass is in pci_arch_init(), which is an arch_initcall. On
this pass pci_mmcfg_list is freed immediately because the MCFG area is
not reserved in E820;
2) the late pass is in acpi_init(), which is a subsys_initcall, right
before PCI enumeration starts. This time the ACPI tables are checked for
a reserved resource and pci_mmcfg_list is finally populated.

The problem is that on a system that doesn't have the MCFG area reserved
in E820, pci_mmcfg_list is empty before acpi_init(), and our PCI hooks are
called from that same place. So MCFG is still not in use by Xen at that
point, since we haven't reached xen_mcfg_late() yet.
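
To make the ordering concrete, here is a toy user-space model in C. It is
only a sketch and not kernel code: struct boot_model, its fields and all
model_* functions are made up for illustration, and the boot sequence is
reduced to the steps described above. It counts how many devices would be
registered with Xen before Xen knows about MCFG, with and without
registering MCFG on the first xen_add_device() call.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the boot ordering discussed above; all names are hypothetical. */
struct boot_model {
	bool mcfg_in_e820;     /* firmware reserved the MCFG area in E820 */
	bool mmcfg_list_ready; /* stands in for a populated pci_mmcfg_list */
	bool xen_knows_mcfg;   /* Xen is allowed to use the MCFG area */
	int devs_before_mcfg;  /* devices added while Xen did not know MCFG */
};

/* Early pass (arch_initcall): the list survives only with an E820 reservation. */
static void model_pci_arch_init(struct boot_model *m)
{
	m->mmcfg_list_ready = m->mcfg_in_e820;
}

/* Late pass (start of acpi_init()): ACPI resources are checked, list is populated. */
static void model_mmcfg_late_pass(struct boot_model *m)
{
	m->mmcfg_list_ready = true;
}

/* Stands in for telling Xen about the MCFG areas found in the list. */
static void model_register_mcfg(struct boot_model *m)
{
	if (m->mmcfg_list_ready)
		m->xen_knows_mcfg = true;
}

/* Device hook; with early_fix set, MCFG is registered on the first call. */
static void model_xen_add_device(struct boot_model *m, bool early_fix)
{
	if (early_fix && !m->xen_knows_mcfg)
		model_register_mcfg(m);
	if (!m->xen_knows_mcfg)
		m->devs_before_mcfg++;
}

/* Runs the modelled boot and returns the number of devices added too early. */
int model_boot(bool mcfg_in_e820, bool early_fix, int ndevs)
{
	struct boot_model m = { .mcfg_in_e820 = mcfg_in_e820 };

	assert(ndevs >= 0);

	/* With an E820 reservation Xen does not defer its use of MCFG. */
	m.xen_knows_mcfg = mcfg_in_e820;

	model_pci_arch_init(&m);

	/* acpi_init(): late MCFG pass first, then PCI enumeration. */
	model_mmcfg_late_pass(&m);
	for (int i = 0; i < ndevs; i++)
		model_xen_add_device(&m, early_fix);

	/* xen_mcfg_late() runs only after acpi_init() has finished. */
	model_register_mcfg(&m);

	return m.devs_before_mcfg;
}
```

In this model, model_boot(false, false, 3) reports all three devices as
added too early, while model_boot(false, true, 3) and
model_boot(true, false, 3) report none.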

Igor