Re: [PATCH/RFC] I/O-check interface for driver's error handling

From: Benjamin Herrenschmidt
Date: Fri Mar 04 2005 - 22:32:15 EST


On Fri, 2005-03-04 at 14:54 +0100, Pavel Machek wrote:
> Hi!
>
> > > If there's no ->error method, at least call ->remove so one device only
> > > takes itself down.
> > >
> > > Does this make sense?
> >
> > This was my thought too last time we had this discussion. A completely
> > asynchronous call is probably needed in addition to Hidetoshi's proposed API,
> > since as you point out, the driver may not be running when an error occurs
> > (e.g. in the case of a DMA error or more general bus problem). The async
>
> Hmm, before we go async way (nasty locking, no?) could driver simply
> ask "did something bad happen while I was sleeping?" at beginning of each
> function?
>
> For DMA problems, driver probably has its own, timer-based,
> "something is wrong" timer, anyway, no?

No, there is no nasty locking. When the callback happens, pretty much
all I/O has already stopped because of the errors, and we aren't on a
critical code path.

Polling for errors might be possible, but async notification is the way
to go, because whatever does the error management needs to be able to
do each of the following separately:

- notify all drivers on the affected bus segment
- once the above is done, and based on system/driver capabilities (API
  to be defined), possibly re-enable I/O access and do a new round of
  notifications
- based on system/driver capabilities, possibly reset the slot and
  notify drivers to re-initialize themselves.
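
To make the three steps above concrete, here is a rough user-space
sketch of the staged notification. Everything in it is hypothetical and
made up for illustration (the sim_* types, the stage and reply enums,
the capability flags); it is not an existing kernel interface, and the
real API is still to be defined. It also assumes the suggestion quoted
earlier in the thread: a driver without an ->error method just gets
->remove called.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* All names below are hypothetical, for illustration only. */

enum err_stage { ERR_DETECTED, ERR_IO_REENABLED, ERR_SLOT_RESET };

/* Ordered from best to worst so the worst reply can be tracked. */
enum err_reply {
	REPLY_RECOVERED,	/* driver is operational again */
	REPLY_CAN_RETRY,	/* driver can recover once I/O is re-enabled */
	REPLY_NEED_RESET,	/* driver needs the slot to be reset */
	REPLY_GIVE_UP		/* driver cannot recover at all */
};

struct sim_driver {
	const char *name;
	/* NULL means the driver has no error callback. */
	enum err_reply (*error)(struct sim_driver *drv, enum err_stage stage);
	void (*remove)(struct sim_driver *drv);
	int removed;
};

struct sim_segment {
	struct sim_driver **drivers;
	size_t ndrivers;
	int can_reenable_io;	/* platform capability (assumed flag) */
	int can_reset_slot;	/* platform capability (assumed flag) */
};

static void take_down(struct sim_driver *drv)
{
	if (drv->remove)
		drv->remove(drv);
	drv->removed = 1;
}

/* Notify every live driver on the segment; return the worst reply. */
static enum err_reply notify_all(struct sim_segment *seg, enum err_stage stage)
{
	enum err_reply worst = REPLY_RECOVERED;
	size_t i;

	for (i = 0; i < seg->ndrivers; i++) {
		struct sim_driver *drv = seg->drivers[i];
		enum err_reply r;

		if (drv->removed)
			continue;
		if (!drv->error) {
			/* No ->error method: only this device goes down. */
			take_down(drv);
			continue;
		}
		r = drv->error(drv, stage);
		if (r == REPLY_GIVE_UP) {
			take_down(drv);
			continue;
		}
		if (r > worst)
			worst = r;
	}
	return worst;
}

/* Returns 0 if all remaining drivers recovered, -1 otherwise. */
static int recover_segment(struct sim_segment *seg)
{
	enum err_reply r;

	assert(seg != NULL);

	/* Step 1: notify all drivers on the affected segment. */
	r = notify_all(seg, ERR_DETECTED);

	/* Step 2: re-enable I/O if possible, then notify again. */
	if (r == REPLY_CAN_RETRY && seg->can_reenable_io)
		r = notify_all(seg, ERR_IO_REENABLED);

	/* Step 3: reset the slot if possible, drivers re-initialize. */
	if (r != REPLY_RECOVERED && seg->can_reset_slot)
		r = notify_all(seg, ERR_SLOT_RESET);

	return r == REPLY_RECOVERED ? 0 : -1;
}

/* Example driver: wants I/O back first, then asks for a reset. */
static enum err_reply nic_error(struct sim_driver *drv, enum err_stage stage)
{
	printf("%s: error callback, stage %d\n", drv->name, (int)stage);
	if (stage == ERR_DETECTED)
		return REPLY_CAN_RETRY;
	if (stage == ERR_IO_REENABLED)
		return REPLY_NEED_RESET;
	return REPLY_RECOVERED;
}

static void sim_remove(struct sim_driver *drv)
{
	printf("%s: removed\n", drv->name);
}
```

The point of the sketch is that each round is a separate pass over the
drivers, driven by whatever does the error management, and none of it
depends on a driver happening to be running when the error hits.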

Ben.

