Re: [EXTERNAL] Re: [PATCH v3 1/3] PCI: Introduce pcibios_ignore_alignment_request

From: Shawn Anastasio
Date: Thu May 30 2019 - 18:36:48 EST




On 5/30/19 1:55 AM, Sam Bobroff wrote:
On Tue, May 28, 2019 at 03:36:34PM +1000, Oliver wrote:
On Tue, May 28, 2019 at 2:03 PM Shawn Anastasio <shawn@xxxxxxxxxx> wrote:

Introduce a new pcibios function pcibios_ignore_alignment_request
which allows the PCI core to defer to platform-specific code to
determine whether or not to ignore alignment requests for PCI resources.

The existing behavior is to simply ignore alignment requests when
PCI_PROBE_ONLY is set. This is behavior is maintained by the
default implementation of pcibios_ignore_alignment_request.

Signed-off-by: Shawn Anastasio <shawn@xxxxxxxxxx>
---
drivers/pci/pci.c | 9 +++++++--
include/linux/pci.h | 1 +
2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 8abc843b1615..8207a09085d1 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5882,6 +5882,11 @@ resource_size_t __weak pcibios_default_alignment(void)
return 0;
}

+int __weak pcibios_ignore_alignment_request(void)
+{
+ return pci_has_flag(PCI_PROBE_ONLY);
+}
+
#define RESOURCE_ALIGNMENT_PARAM_SIZE COMMAND_LINE_SIZE
static char resource_alignment_param[RESOURCE_ALIGNMENT_PARAM_SIZE] = {0};
static DEFINE_SPINLOCK(resource_alignment_lock);
@@ -5906,9 +5911,9 @@ static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev,
p = resource_alignment_param;
if (!*p && !align)
goto out;
- if (pci_has_flag(PCI_PROBE_ONLY)) {
+ if (pcibios_ignore_alignment_request()) {
align = 0;
- pr_info_once("PCI: Ignoring requested alignments (PCI_PROBE_ONLY)\n");
+ pr_info_once("PCI: Ignoring requested alignments\n");
goto out;
}

I think the logic here is questionable to begin with. If the user has
explicitly requested re-aligning a resource via the command line then
we should probably do it even if PCI_PROBE_ONLY is set. When it breaks
they get to keep the pieces.

That said, the real issue here is that PCI_PROBE_ONLY probably
shouldn't be set under qemu/kvm. Under the other hypervisor (PowerVM)
hotplugged devices are configured by firmware before it's passed to
the guest and we need to keep the FW assignments otherwise things
break. QEMU however doesn't do any BAR assignments and relies on that
being handled by the guest. At boot time this is done by SLOF, but
Linux only keeps SLOF around until it's extracted the device-tree.
Once that's done SLOF gets blown away and the kernel needs to do it's
own BAR assignments. I'm guessing there's a hack in there to make it
work today, but it's a little surprising that it works at all...

IIRC Sam Bobroff was looking at hotplug under pseries recently so he
might have something to add. He's sick at the moment, but I'll ask him
to take a look at this once he's back among the living

There seems to be some code already in the kernel that will disable
PCI_PROBE_ONLY based on a device tree property, so I did a quick test
today and it seems to work. Only a trivial tweak is needed in QEMU to
do it (have spapr_dt_chosen() add a node called "linux,pci-probe-only"
with a value of 0), and that would allow us to set it only for QEMU (and
not PowerVM) if that's what we want to do. Is that useful?

(I haven't done any real testing yet but the guest booted up OK.)

It was my understanding that PCI_PROBE_ONLY should actually be set
initially so that Linux uses SLOF's BAR assignments. The issue here
is that PCI_PROBE_ONLY shouldn't be honored after initial bringup
on KVM so that hotplugged PCI devices can have custom BAR alignments.

Of course, if there's no need to honor SLOF's initial assignments,
I assume disabling PCI_PROBE_ONLY would work fine. In fact, I'm
not entirely sure why it's done in the first place. Does anybody
know?

If there is actually a valid reason for preserving SLOF's initial
assignments, then it seems like the correct solution is to disable
PCI_PROBE_ONLY after initial PCI bringup or ignore it in
pci_specified_resource_alignment() like I do in this patch set.

Bjorn Helgaas also suggested marking individual resources provided
by SLOF/PHYP with IORESOURCE_PCI_FIXED which would remove the need
to use PCI_PROBE_ONLY altogether.

Any thoughts?

- Shawn