Re: [PATCH] pci: Disable slot presence detection around bus reset

From: Alex Williamson
Date: Wed Apr 24 2013 - 17:33:55 EST


On Thu, 2013-02-14 at 20:53 -0700, Alex Williamson wrote:
> On Thu, 2013-02-14 at 16:47 -0700, Bjorn Helgaas wrote:
> > On Thu, Feb 14, 2013 at 11:37 AM, Alex Williamson
> > <alex.williamson@xxxxxxxxxx> wrote:
> > > A bus reset can trigger a presence detection change and result in a
> > > suprise hotplug. This is generally not what we want to happen when
> > > trying to reset a device. Disable the presence detection control on
> > > on bridges around bus reset.
> >
> > This is a really interesting situation, and I'm not quite ready to
> > sign up to the idea that this is really a problem and that if it is,
> > this is the way we want to fix it.
> >
> > What would happen if we *did* handle this as a hotplug event, with a
> > removal followed by an add?
> >
> > The scheme where pci_reset_function() does "pci_save_state(dev);
> > pci_dev_reset(dev); pci_restore_state(dev);" makes me nervous.
> >
> > We're saving and restoring some of PCI config space around the reset,
> > but there's no guarantee that we're preserving *all* the important
> > state in config space because I think devices can have non-architected
> > device-specific things in config space that we don't know how to
> > save/restore.
> >
> > Devices also have internal state not exposed via config space. That
> > state is lost during the reset but can't be restored by
> > pci_restore_state(). So it seems like pci_reset_function() is
> > pretending to do something it can't really do reliably.
> >
> > If we make it so a reset is always handled as a remove+add, then we'll
> > use a more generic path, and we'll get all the stuff you expect when
> > initializing a new device -- resource assignment, IRQ setup, quirks,
> > etc. Quirks in particular seem like something we want, but don't
> > currently get with pci_reset_function().
> >
> > Oh, and the "disable presence detect" approach below only works for
> > things below a PCIe bridge with native hotplug, right? I wonder what
> > happens if we reset devices below a bridge using SHPC or acpiphp.
>
> Triggering a remove+add is not useful for the way we use it today. The
> users I'm aware of are KVM device assignment and VFIO, where we trigger
> it in an attempt to get the device to a known state so that we have some
> hope of repeatability. In those scenarios the reset is initiated by the
> driver. The interface isn't meant to guarantee the device is returned
> to an identical state as it was before reset. If it did, why would we
> call it? We want to get to a state as near to power on, but still with
> config programming, as we can.
>
> Being driver directed, having the reset initiate a remove is pretty near
> the last thing we want. That limits the scope of calling it to only
> when the driver can readily release the device. If we have the device
> attached to a guest or userspace driver, that's potentially a lot of
> setup and teardown and effectively extending a surprise removal all the
> way up the stack.
>
> Obviously a bus reset is a big hammer and we do exhaust all the little
> hammers of flr and pm reset before we try it, but in this case, we know
> the device that's going away and with all likelihood, it's coming right
> back at the same location. If we take the path of forcing a remove+add,
> let's just remove it from the reset_function call path and we'll do
> without reset for those devices. Thanks,

Time to revisit this bug. Clearly when a driver or userspace calls
pci_reset_function the intention is not to have the device be
hot-unplugged and re-plugged. So I think we either need to prevent that
from happening or politely decline the reset.

I don't really know how to do this on acpiphp or shpc or whatever other
hotplug controllers we support. So, what if we add a reset_slot
callback to hotplug_slot_ops? We could then make pci_parent_bus_reset
do something like:

if (dev->slot && dev->slot->hotplug_slot) {
if (!dev->slot->hotplug_slot->reset_slot)
return -ENOTTY;

return dev->slot->hotplug_slot->reset_slot(dev->slot->hotplug_slot);
} else {
... standard secondary bus reset...
}

I'd actually also like to add a pci_reset_bus interface because we do
have cases where the pci_reset_function is not sufficient (device
doesn't do any useful reset of it's own and pci_parent_bus_reset won't
because there are other devices on the bus). Graphics cards in
particular are biting us here. When all of the devices on the bus are
owned by a driver, this would provide a less device dependent reset. It
would use same logic and code as enabled with reset_slot. Thoughts?
Thanks,

Alex

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/