Re: [RFC][PATCH] PM: Introduce new top level suspend and hibernationcallbacks (rev. 2)

From: Alan Stern
Date: Thu Mar 20 2008 - 21:21:46 EST

Next message: Alan Cox: "Re: 2.6.23-rc9-git4: pata_pcmcia, disabling IRQ #9 and Re: libataand legacy ide pcmcia failure"
Previous message: Oleg Nesterov: "Re: [PATCH 5/5] ptrace: it is fun to strace /sbin/init"
In reply to: Rafael J. Wysocki: "Re: [linux-pm] [RFC][PATCH] PM: Introduce new top level suspend and hibernation callbacks (rev. 2)"
Next in thread: Rafael J. Wysocki: "Re: [RFC][PATCH] PM: Introduce new top level suspend and hibernation callbacks (rev. 2)"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Fri, 21 Mar 2008, Rafael J. Wysocki wrote:

> Hi,
>
> Below is an updated version of the $subject patch. It has only been
> compilation tested, but I'd like to get as much feedback as reasonably possible
> at this stage to avoid redesigning things later in the process.
>
> The most important changes from the previous one are the following:
> * Callbacks to be executed with interrupts disabled are now separated from
> the "regular" ones. There still are six of them, but IMHO this is what may
> be necessary in the most general case (eg. a driver may have to carry out
> some operations with interrupts disabled, which are different for SUSPEND,
> FREEZE and HIBERNATE; the complementary "resume" callbacks are needed for
> error recovery, for example)
> * It occured to me that the ->freeze() and ->quiesce() callbacks were
> essentially the same, so I removed the latter
> * The ->recover() callback was also dropped, as ->thaw() may well be used instead
> just fine (it seems)
> * ->prepare() and ->complete() are now called in separate loops which causes
> some funny complications with error recovery. Namely, we must remember which
> devices have already been ->prepare()d, so that we call ->complete() for them
> in the error path. Moreover, there may be some ->prepare()d and some
> ->suspend()ed devices at the same time, so if there's an error we should
> ->resume() the ->suspend()ed ones and ->complete() everything etc.
> This may be handled in many different ways, but I decided to introduce a new
> list on which the ->complete()d devices are stored. Accordingly, the
> registration of new children of a device is blocked between ->prepare() and
> ->complete() (it was only blocked between ->prepare() and ->resume() in the
> previous version).
>
> Please have a look.

I don't have time to go through this in detail now, but a few things
stand out.

One trivial problem is that your dpm_prepare and dpm_complete routines
go through the list in the wrong order.

More importantly, the situation with prepare and complete is getting
rather complicated. Now that you're adding dev->power.state, why not
go all the way? Get rid of all the different lists, keeping just
dpm_active and possibly dpm_destroy. Instead of moving devices between
different lists, just store in dev->power.state an identification for
which list the device is supposed to be on or is supposed to be moving
to. (Or else have a bunch of bitfields indicating which methods have
been called.) This has the advantage that the entries will never get
out of order, even if devices are registered at the wrong time.

There's some question about when we want the PM core to start
preventing new children from being registered. Should this start right
after prepare() returns, or should it start right before suspend() is
called? The first alternative sounds better to me. Not only does it
agree with the purpose of the prepare method, it also makes race
situations easier to handle.

Since dpm_prepare is supposed to go through the list in the forward
direction, logically the "root" of the device tree should be the first
thing "prepared". This means you should not allow parentless devices
to be registered any time after dpm_prepare has started. Is that
liable to cause problems?

There may be similar problems with complete(). A number of drivers
check during their resume method for the presence of new children
plugged in while the system was asleep. All these drivers would have
to be converted over to the new scheme if they weren't permitted to
register the new children before complete() was called. Of course,
this is easily fixed by permitting new children starting from when
resume() is called rather than when complete() is called.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Next message: Alan Cox: "Re: 2.6.23-rc9-git4: pata_pcmcia, disabling IRQ #9 and Re: libataand legacy ide pcmcia failure"
Previous message: Oleg Nesterov: "Re: [PATCH 5/5] ptrace: it is fun to strace /sbin/init"
In reply to: Rafael J. Wysocki: "Re: [linux-pm] [RFC][PATCH] PM: Introduce new top level suspend and hibernation callbacks (rev. 2)"
Next in thread: Rafael J. Wysocki: "Re: [RFC][PATCH] PM: Introduce new top level suspend and hibernation callbacks (rev. 2)"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]