Re: Special handling of display/VGA devices in hotplug drivers

From: Bjorn Helgaas
Date: Thu Dec 11 2014 - 17:08:02 EST


[+cc Praveen]

On Thu, Dec 11, 2014 at 12:32 PM, Jesse Barnes <jbarnes@xxxxxxxxxxxxxxxx> wrote:
> On Thu, 11 Dec 2014 13:11:36 -0500
> Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
>
>> On Thu, Dec 11, 2014 at 10:34:30AM -0700, Bjorn Helgaas wrote:
>> > It looks like you added the initial pciehp driver [1], which
>> > includes the following code in pciehp_disable_slot():
>> >
>> > + if (class_code == PCI_BASE_CLASS_DISPLAY) {
>> > + /* Display/Video adapter (not supported) */
>> > + rc = REMOVE_NOT_SUPPORTED;
>> >
>> > + /* If it's a bridge, check the VGA Enable bit */
>> > + if ((header_type & 0x7F) == PCI_HEADER_TYPE_BRIDGE) {
>> > + rc = pci_bus_read_config_byte (pci_bus, devfn,
>> > PCI_BRIDGE_CONTROL, &BCR);
>> > + if (rc)
>> > + return rc;
>> > +
>> > + /* If the VGA Enable bit is set, remove isn't supported */
>> > + if (BCR & PCI_BRIDGE_CTL_VGA) {
>> > + rc = REMOVE_NOT_SUPPORTED;
>> >
>> > I'm trying to figure out why VGA devices are handled specially. I
>> > can't find anything in the PCI specs that mentions this. Most of
>> > the other PCI hotplug drivers have similar code. Do you remember
>> > anything about this?
>>
>> The PCI spec said that you were not allowed to hotplug VGA drivers.
>> The big issue is that POST usually needs to run on those things, and
>> there is no way to POST a PCI hotplugged device.

I don't think this is a problem any more, is it? I think X can
execute option ROMs, and if we assign the device to a VM, the guest
BIOS can also do it.

> Yeah, the legacy I/O regions get routed through the bridge with the VGA
> bit set, and most legacy code probably can't handle that (whether POST,
> VBIOS, or VGA drivers).
>
> There is some code for moving the VGA routing around, so that might be
> an option if you wanted to remove such a bridge. You'd have to find a
> VGA device under another bridge, and enable routing to that first, then
> you could do the remove.

The legacy code thing does seem like an issue. Since ac81860ea073, we
don't actually fail when removing a VGA device; we only fail when
removing a bridge with VGA routing enabled. Maybe that should be
tweaked so we fail when removing either a bridge or a VGA device to
which the legacy ranges are currently routed. Then we could still
remove secondary VGA devices like the computational GPUs that
motivated ac81860ea073.
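
As a rough sketch of that tweak (standalone C, not kernel code): the
register constants below match the kernel's pci_regs.h and pci_ids.h,
but struct dev_snapshot, its upstream_vga_routed field, and
removal_blocked() are hypothetical names made up for illustration.
It assumes "legacy ranges are routed to a VGA device" means every
bridge above it has VGA Enable set and the device decodes I/O or
memory.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as defined in the kernel's pci_regs.h / pci_ids.h */
#define PCI_COMMAND_IO          0x1   /* device responds to I/O space */
#define PCI_COMMAND_MEMORY      0x2   /* device responds to memory space */
#define PCI_BRIDGE_CTL_VGA      0x08  /* VGA Enable in Bridge Control */
#define PCI_BASE_CLASS_DISPLAY  0x03
#define PCI_HEADER_TYPE_BRIDGE  1

/* Hypothetical snapshot of config-space state; not a kernel structure. */
struct dev_snapshot {
	uint8_t  base_class;          /* upper byte of the class code */
	uint8_t  header_type;
	uint16_t command;             /* PCI_COMMAND register */
	uint16_t bridge_control;      /* only meaningful for bridges */
	bool     upstream_vga_routed; /* assumption: all bridges above have VGA Enable set */
};

/* Returns true if removal should be refused under the proposed rule. */
static bool removal_blocked(const struct dev_snapshot *d)
{
	/* A bridge is blocked only while it forwards the legacy VGA ranges. */
	if ((d->header_type & 0x7F) == PCI_HEADER_TYPE_BRIDGE)
		return (d->bridge_control & PCI_BRIDGE_CTL_VGA) != 0;

	/* A display device is blocked only if the legacy ranges reach it. */
	if (d->base_class == PCI_BASE_CLASS_DISPLAY)
		return d->upstream_vga_routed &&
		       (d->command & (PCI_COMMAND_IO | PCI_COMMAND_MEMORY)) != 0;

	return false;
}
```

With this rule a secondary computational GPU (display class, but no
legacy routing toward it) would still be removable, while the primary
VGA device and any bridge with VGA Enable set would not.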

Failing in pciehp_unconfigure_device() makes sense if the user pressed
the attention button to request removal. But if the bridge was
surprise-removed, e.g., user pulled out an ExpressCard, the bridge is
already gone, and failing just means we can't clean things up, so
we're going to leave drivers bound to the bridge and downstream
devices.
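
To make the distinction concrete, here is a minimal standalone
sketch. enum removal_kind and unconfigure_result() are hypothetical
and do not exist in pciehp, and -EINVAL is only a stand-in for
whatever error the driver would actually return. It assumes the
caller knows whether the removal was requested or a surprise.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical: how the removal was initiated. */
enum removal_kind {
	REMOVAL_REQUESTED,  /* e.g., attention button pressed */
	REMOVAL_SURPRISE,   /* e.g., ExpressCard pulled out */
};

/*
 * Returns 0 if cleanup should proceed, or a negative errno if the
 * removal should be refused.  vga_routed means the legacy VGA ranges
 * are currently routed through the bridge being removed.
 */
static int unconfigure_result(enum removal_kind kind, bool vga_routed)
{
	/* The hardware is already gone; refusing would only skip cleanup. */
	if (kind == REMOVAL_SURPRISE)
		return 0;

	/* A requested removal can still be refused before anything is torn down. */
	return vga_routed ? -EINVAL : 0;
}
```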

Bjorn