Re: [PATCH 2.6.26] PCI: refuse to re-add a device to a bus upon pci_scan_child_bus()

From: Eran Liberty
Date: Wed Jul 23 2008 - 14:35:29 EST


Matthew Wilcox wrote:

Look at the consequences of calling fixup_resource() twice on the same
resource:

res->start = (res->start + offset) & mask;
res->end = (res->end + offset) & mask;

You'll add 'offset' to the other resources twice.
Tring to find heads and tails in the code, I dug in. This is what I understand from the overall flow... and my points of interests:

1. pci_scan_child_bus() starts with iterating over all the devfn and scanning for a device with that devfn

drivers/pci/probe.c
1056 unsigned int __devinit pci_scan_child_bus(struct pci_bus *bus)
1057 {
1063 /* Go find them, Rover! */
1064 for (devfn = 0; devfn < 0x100; devfn += 8)
1065 pci_scan_slot(bus, devfn);

2. pci_scan_slot() will continue to scan all the functions that devfn might have to over
drivers/pci/probe.c
1019 int pci_scan_slot(struct pci_bus *bus, int devfn)
1020 {
1026 for (func = 0; func < 8; func++, devfn++) {
1029 dev = pci_scan_single_device(bus, devfn);

3. pci_scan_single_device() will scan / test if this dev is valid.
drivers/pci/probe.c
996 struct pci_dev *__ref pci_scan_single_device(struct pci_bus *bus, int devfn)
997 {
998 struct pci_dev *dev;
999 1000 dev = pci_scan_device(bus, devfn);

and add it. REGARDLESS if it is already on the pci bus list or not.
1001 if (!dev)
1002 return NULL;
1003 1004 pci_device_add(dev, bus);
1005 1006 return dev;
1007 }

This is the first thing that needs to be fixed IMHO.

4. pci_scan_child_bus() will go on to fixup the bus whether it needs fixing or not.
drivers/pci/probe.c
1072 pcibios_fixup_bus(bus);

This might be the second thing that needs amending.

5. pcibios_fixup_bus(bus) calls _pcibios_fixup_bus which will call fixup_resource() once per a bus resource (in contrast of per device on that bus) and fixes the resources of that bus.
arch/powerpc/kernel/pci-common.c
784 static void __devinit __pcibios_fixup_bus(struct pci_bus *bus)
785 {
787 struct pci_dev *dev = bus->self;
798 for (i = 0; i < PCI_BUS_NUM_RESOURCES; ++i) {
833 fixup_resource(res, dev);
834 }
835 }
850 }

As you have correctly observed without other changes this would cause trouble on a bus that has resources.
Since i am not pulling the plug on the whole bus just on selected devices I can defensively skip this part

6. Now I should be able to call pci_bus_assign_resources(bus) which will go over all the newly added device and assign resource to them.

7. Last step I should be able to pci_bus_add_devices(bus) the, now, resource fixed devices.

Steps 6 and 7 are the one I miss most over which my device wont work after being re-born.

Soooo ...

If I really wanted to make it, working my way around the current code, I would have done something like this.

bus = null;
while ((bus = pci_find_next_bus(bus)) != NULL) {
for (devfn = 0; devfn < 0x100; devfn += 8) {
for (func = 0; func < 8; func++) {
struct pci_dev *dev = <http://liberty/lxr/ident?v=e500-linux-2.6.26-rc4;i=pci_get_slot>pci_get_slot(struct pci_bus *bus, unsigned int devfn);
if (dev) continue;
dev = pci_scan_single_device(bus, devfn);
if (!dev)
continue;
pci_device_add(dev, bus);
}
}
pci_bus_assign_resources(bus);
pci_bus_add_devices(bus);
}

WOW.... first time for me to code in Mozilla Thunderbird.

Had I was this smart to begin with I would have done exactly that and go home to sleep like a decent person. But since we are here, I feel there should be a either correction in the existing code to allow the:
pci_scan_child_bus(bus) -> pci_bus_assign_resources(bus) -> pci_bus_add_devices(bus) sequence.
Or a new helper function to the PCI that does more or less what I have just described.

If you want I can code it, patch it, and rip the glory :)

That was a long one :)

Liberty

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/