Re: [RFC/PATCH] Add pci_walk_bus function to PCI core

From: Kyle Moffett
Date: Wed Aug 10 2005 - 01:49:41 EST


On Aug 10, 2005, at 02:10:49, Arjan van de Ven wrote:
> On Wed, 2005-08-10 at 11:36 +1000, Paul Mackerras wrote:
>
> > Greg,
> >
> > Any comments on this patch? Would you be amenable to it going in post
> > 2.6.13?
> >
> > The PCI error recovery infrastructure needs to be able to contact all
> > the drivers affected by a PCI error event, which may mean traversing
> > all the devices under a given PCI-PCI bridge. This patch adds a
> > function to the PCI core that traverses all the PCI devices on a PCI
> > bus and under any PCI-PCI bridges on that bus (recursively), calling a
> > given function for each device.
>
> is there a way to avoid the recursion somehow? Recursion is "not fun"
> stack usage wise, esp if you have really deep hierarchies....

Hmm, it looks like PCI error recovery wants a breadth-first traversal, so
you should be able to replace the recursion with an explicit work list. If
only one error-recovery action can be in progress on a given subtree at a
time, you should be able to add an "error_recovery" list_head to the device
structure and do something like this:

void recover(...) {
	/* Work list of devices still waiting for recovery */
	struct list_head recovery_list = LIST_HEAD_INIT(recovery_list);
	list_add(&dev->error_recovery, &recovery_list);

	while (!list_empty(&recovery_list)) {
		struct some_device_type *cur =
			list_entry(recovery_list.next, struct some_device_type, error_recovery);

		/* May queue cur's children, so cur must still be on the list */
		cur->some_recovery_function(cur, [...]);

		list_del(&cur->error_recovery);
	}
}

Then each PCI-PCI bridge's some_recovery_function could do this:

void some_recovery_function(struct some_device_type *dev, [...]) {
	struct some_device_type *child;

	actually_do_my_recovery();

	/* Queue each affected child instead of recursing into it */
	list_for_each_entry(child, &dev->some_pci_subdev_list, some_pci_list) {
		if (needs_recovery(child))
			/* Inserts child just ahead of dev on the work list */
			list_add_tail(&child->error_recovery, &dev->error_recovery);
	}
}

With such an arrangement, the callstack is as shallow as possible:

recover
	some_recovery_function
		actually_do_my_recovery
		needs_recovery
	childs_recovery_function
		[...]
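
To make the stack-depth argument concrete, here is a minimal userspace
sketch of the same work-list idea. Everything in it is an assumption made
for illustration: list_node and its helpers are a hand-rolled stand-in for
the kernel's list_head, demo_dev is a hypothetical device type, and
recording the device id stands in for the real recovery work. Unlike the
snippets above, children are appended to the tail of the queue, which
gives a strictly breadth-first order.

```c
#include <assert.h>
#include <stddef.h>

/* Hand-rolled stand-in for the kernel's struct list_head (illustrative). */
struct list_node {
	struct list_node *prev, *next;
};

static void list_init(struct list_node *head)
{
	head->prev = head->next = head;
}

static int list_is_empty(const struct list_node *head)
{
	return head->next == head;
}

/* Insert n just before head, i.e. at the tail of the queue. */
static void list_push_tail(struct list_node *n, struct list_node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Unlink n and leave it pointing at itself. */
static void list_remove(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->prev = n->next = n;
}

#define MAX_CHILDREN 4

/* Hypothetical device: a bridge has children, a leaf has none. */
struct demo_dev {
	int id;
	struct demo_dev *child[MAX_CHILDREN];
	int nr_children;
	struct list_node error_recovery;	/* link in the pending queue */
};

#define dev_from_node(n) \
	((struct demo_dev *)((char *)(n) - offsetof(struct demo_dev, error_recovery)))

/*
 * Visit root and every device below it without recursion.
 * Stores up to max_out device ids in out[] in visit order and
 * returns how many were stored.
 */
static int recover(struct demo_dev *root, int *out, int max_out)
{
	struct list_node pending;
	int n = 0;

	assert(root != NULL);
	list_init(&pending);
	list_push_tail(&root->error_recovery, &pending);

	while (!list_is_empty(&pending)) {
		struct demo_dev *cur = dev_from_node(pending.next);
		int i;

		list_remove(&cur->error_recovery);
		if (n < max_out)
			out[n++] = cur->id;	/* stands in for the recovery work */

		/* Queue the children instead of recursing into them. */
		for (i = 0; i < cur->nr_children; i++)
			list_push_tail(&cur->child[i]->error_recovery, &pending);
	}
	return n;
}
```

The only storage that grows with the size of the tree is the queue, and
that lives in the devices themselves, so the call depth stays constant no
matter how deep the bridge hierarchy is.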

If you can have multiple simultaneous error-recovery actions per subtree,
that wouldn't work properly unless they were exclusive-blocking, i.e.
an error-recovery action triggers an error on a subtree, which must then
recover itself. In that case, with some extra state saved in the recover
function and passed to "some_recovery_function", you could allow the
other recovery to continue before resuming.

If you can have two CPUs recovering the same device tree, I'd be inclined
to wonder what kind of strange errors you're causing on the PCI bus :-D,
and I'd be interested in an example of how that could work in any sane way.
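
The scheme above leans on the assumption that only one recovery pass
touches a subtree at a time, since each device has a single
"error_recovery" link. One way that assumption could be enforced is
sketched below in portable C11, as a guess at the shape rather than
anything from the patch: demo_subtree, recovery_begin and recovery_end are
hypothetical names, and real kernel code would use its own locking
primitives instead of <stdatomic.h>.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-subtree state guarding the shared error_recovery links. */
struct demo_subtree {
	atomic_flag recovering;	/* set while a recovery pass owns the subtree */
	int passes;		/* completed passes, kept only for illustration */
};

#define DEMO_SUBTREE_INIT { ATOMIC_FLAG_INIT, 0 }

/* Returns true if the caller now owns the subtree, false if it is busy. */
static bool recovery_begin(struct demo_subtree *t)
{
	assert(t != NULL);
	return !atomic_flag_test_and_set(&t->recovering);
}

/* Must only be called by the owner that got true from recovery_begin(). */
static void recovery_end(struct demo_subtree *t)
{
	assert(t != NULL);
	t->passes++;
	atomic_flag_clear(&t->recovering);
}
```

A second caller that gets false from recovery_begin() would have to back
off or wait; what it should actually do is exactly the open question
raised above.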

Cheers,
Kyle Moffett

--
Premature optimization is the root of all evil in programming
-- C.A.R. Hoare


