Re: [PATCH] x86: sysctl to allow panic on IOCK NMI error

From: Maciej W. Rozycki
Date: Wed Jul 01 2009 - 13:32:22 EST


On Wed, 1 Jul 2009, Ingo Molnar wrote:

> > ENOTIME, sorry. Next year perhaps. Or a homework project for
> > one of the newbies. ;)
>
> You know that this project would kill a newbie, right? :)

 Well, that's just a fast track to becoming a veteran, isn't it? ;)

> We have no real southbridge drivers on x86 - but we should certainly
> add some. Also, walking the PCI device tree from NMI context is
> tricky as the lists there are not NMI safe - we could crash if we
> happen to get a #IOCK while loading/unloading drivers (which is rare
> but could happen).

 That shouldn't be a problem if we were about to panic().  For a more
sophisticated attempt at recovery -- yes, that would have to be addressed.

 Possibly the simplest approach would be to keep a local list of the PCI
configuration space addresses of the status register (and the secondary
status register, where applicable) of every device currently present.  That
would still require some care, as many northbridges provide no means for
atomic PCI config accesses, and an NMI could be triggered by the activity
of a bus master other than the CPU in the middle of a regular config access
being done by the CPU.  So e.g. the config address register would have to
be saved on entry and restored before the NMI handler returns, so that the
interrupted config data access still goes to the right location.  But that
does not sound as complicated as traversing the regular structures.
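
 To illustrate the idea, here is a rough userspace sketch, not kernel code.
It assumes PCI configuration mechanism #1 (address port 0xCF8, data port
0xCFC) and replaces real port I/O with a small mock so that it can be run
anywhere.  All the names in it (mock_outl(), mock_inl(), track_status_reg(),
nmi_scan_pci_status()) are made up for this example, and so is the choice of
which status bits to treat as errors.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PCI_CONF_ADDR	0xCF8u	/* mechanism #1 address port */
#define PCI_CONF_DATA	0xCFCu	/* mechanism #1 data port */

/* Status bits 15..11 and 8: parity, SERR#, master/target abort. */
#define PCI_STATUS_ERR_MASK	0xF900u

/* --- Mock hardware (assumption): bus 0, four devices, function 0. --- */
static uint32_t mock_addr_latch;
static uint32_t mock_space[4][64];	/* 64 dwords of config space each */

static void mock_outl(uint32_t val, uint16_t port)
{
	if (port == PCI_CONF_ADDR)
		mock_addr_latch = val;
}

static uint32_t mock_inl(uint16_t port)
{
	uint32_t a = mock_addr_latch;
	unsigned int bus = (a >> 16) & 0xFF, dev = (a >> 11) & 0x1F;
	unsigned int fn = (a >> 8) & 0x7, dword = (a >> 2) & 0x3F;

	if (port == PCI_CONF_ADDR)
		return mock_addr_latch;
	if (port != PCI_CONF_DATA || !(a & 0x80000000u) ||
	    bus != 0 || fn != 0 || dev >= 4)
		return 0xFFFFFFFFu;	/* nothing responds */
	return mock_space[dev][dword];
}

/* --- Local list of config addresses, kept outside the PCI core. --- */
#define MAX_STATUS_REGS	16
static uint32_t status_addr[MAX_STATUS_REGS];
static size_t status_count;

/* Build the 0xCF8 value for the dword containing register 'reg'. */
static uint32_t conf_addr(unsigned int bus, unsigned int dev,
			  unsigned int fn, unsigned int reg)
{
	return 0x80000000u | (bus << 16) | (dev << 11) | (fn << 8) |
	       (reg & 0xFCu);
}

/* Called from normal context when a device appears.  Pass reg 0x04 for
 * the status register, or 0x1C for a bridge's secondary status; both
 * live in the upper 16 bits of their dword. */
static void track_status_reg(uint32_t addr)
{
	assert(status_count < MAX_STATUS_REGS);
	status_addr[status_count++] = addr;
}

/* NMI-side scan: returns the number of devices with an error bit set and
 * stores the config address of the first one in *culprit. */
static int nmi_scan_pci_status(uint32_t *culprit)
{
	/* The NMI may have hit between the interrupted code's write to
	 * the address port and its access to the data port. */
	uint32_t saved = mock_inl(PCI_CONF_ADDR);
	int found = 0;
	size_t i;

	for (i = 0; i < status_count; i++) {
		uint16_t status;

		mock_outl(status_addr[i], PCI_CONF_ADDR);
		status = (uint16_t)(mock_inl(PCI_CONF_DATA) >> 16);
		if (status == 0xFFFFu)	/* all ones: device is gone */
			continue;
		if (status & PCI_STATUS_ERR_MASK) {
			if (!found)
				*culprit = status_addr[i];
			found++;
		}
	}

	/* Put the address register back for the interrupted access. */
	mock_outl(saved, PCI_CONF_ADDR);
	return found;
}
```

 The scan only touches a flat array and the two ports, so nothing in it
depends on the regular device lists.  The sketch does not cover updating
the array safely while devices come and go, and it does not clear the
status bits, both of which a real handler would have to deal with.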

> IMHO it's all very much desired functionality, but highly
> non-trivial.

Memory ECC error handlers would benefit from some southbridge
infrastructure too.

Maciej