Re: [PATCH] PPC64: EEH Recovery

From: Paul Mackerras
Date: Wed Jan 19 2005 - 01:02:05 EST


Linas Vepstas writes:

> p.s. It was not clear to me if the EEH patch previously sent
> (6 January 2005, same subject line) will be wending its way into
> the main Torvalds kernel tree, or not. I hadn't really gotten
> confirmation one way or another.

I'm not really totally happy with it yet, on a number of fronts:

1. You're adding more PCI-specific stuff to the device_node struct,
which I don't like. I would prefer that the device_node tree
contains basically just what we get from OF, and that we have a
separate struct for storing ppc64-specific information for each PCI
device. Fixing that is outside the scope of your patch, though.

2. I don't see why the device nodes for the PCI subtree being reset
would go away, and thus I don't see the need for your eeh_cfg_tree
struct.

3. Is there a good reason why we can't use the assigned-addresses
property on the relevant device tree nodes to tell us what to set
the BARs to?

4. I think the 5 second sleep is quite bogus, and shows that we have
the flow of control wrong. In particular I think it should be a
userland write to a sysfs file that kicks off the restart process
rather than it just happening after 5 seconds. Anyway, what
process or thread is executing that 5 second sleep? Is it keventd
or something?

5. AFAICS userland will get an unplug notification for the device, but
nothing to indicate that is due to an EEH slot isolation event. I
think userland should be told about EEH events.

Regards,
Paul.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/