Re: [linux-pm] [RFC][PATCH][1/8] PM: Rework handling of interrupts during suspend-resume (rev. 5)

From: Rafael J. Wysocki
Date: Sun Mar 08 2009 - 17:37:18 EST


On Sunday 08 March 2009, Alan Stern wrote:
> On Sun, 8 Mar 2009, Linus Torvalds wrote:
>
> > On Sat, 7 Mar 2009, Alan Stern wrote:
> > >
> > > You didn't answer my question. Why bother to distinguish between
> > > "wake-up" interrupts and non-"wake-up" interrupts?
> > >
> > > In other words, why not simply abort the suspend if IRQ_PENDING is set
> > > for _any_ interrupt during sysdev_suspend()?
> >
> > .. because some drivers might not actually shut down the hardware until
> > they get to "suspend_late"? If even then, for that matter - a driver may
> > simply not care, knowing that the hardware will be powered off, and will
> > be re-initialized at resume.
> >
> > The thinking that you have to shut your hardware down at "->suspend()"
> > time is a _disease_. There are literally classes of hardware out there
> > where that would be an outright _bug_, like for a PCI bridge device. For
> > many devices, "suspend()" has to be the phase where you shut down the
> > _external_ stuff (e.g. for a disk controller, it's when you'd flush and stop
> > your disks), but the controller itself may well be alive until later.
>
> Yes, certainly. I agree completely.
>
> But there is a difference between shutting down the hardware and merely
> preventing it from generating interrupt requests. If a device remains
> capable of generating IRQs after its driver's suspend method has run,
> the driver runs the risk of having its handler called at a time when it
> isn't prepared to cope correctly. Of course, this will depend on the
> details of how the driver is written.
>
> There have been examples in the past of devices that, for one reason or
> another, _did_ generate IRQs at inconvenient times. The hardware or
> the BIOS may have done improper initialization, for example. On a
> shared IRQ this led to interrupt storms.

Well, we're now trying to fix exactly this problem. :-)

> IIRC, the solution was to add a PCI quirk routine to disable IRQ generation
> at an early stage. Didn't e100 have this problem?

I don't remember, sorry.

Thanks,
Rafael
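
The two policies debated in this thread can be compared with a small
user-space model. This is only a sketch and not the code from the patch:
the struct, the TOY_* flag values, and both function names are hypothetical
and exist only for illustration. The first function aborts the suspend only
when a line marked as a wake-up source has an interrupt pending, which is
the behavior Alan's question implies the patch has. The second implements
his proposed alternative of aborting on any pending interrupt.

```c
#include <errno.h>
#include <stddef.h>

/* Toy status bits: the names echo the thread, the values are made up. */
#define TOY_IRQ_PENDING 0x1u	/* an interrupt is pending on this line */
#define TOY_IRQ_WAKEUP  0x2u	/* the line is set up as a wake-up source */

/* Hypothetical stand-in for a per-IRQ descriptor. */
struct toy_irq_desc {
	unsigned int status;
};

/* Policy 1: abort only if a wake-up line has an interrupt pending. */
static int toy_check_wakeup_irqs(const struct toy_irq_desc *descs, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		unsigned int s = descs[i].status;

		if ((s & TOY_IRQ_WAKEUP) && (s & TOY_IRQ_PENDING))
			return -EBUSY;
	}
	return 0;
}

/* Policy 2 (the alternative asked about): abort on any pending interrupt. */
static int toy_check_any_pending(const struct toy_irq_desc *descs, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (descs[i].status & TOY_IRQ_PENDING)
			return -EBUSY;
	return 0;
}
```

In this model, a device whose driver deliberately leaves it alive until
"suspend_late" (or never quiesces it at all) can leave a non-wake-up line
pending. Policy 2 would then abort the suspend, while policy 1 would let it
proceed, which is the concern raised in Linus's reply.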