Re: [GIT PULL] PM updates for 2.6.33

From: Linus Torvalds
Date: Mon Dec 07 2009 - 15:49:05 EST




On Mon, 7 Dec 2009, Alan Stern wrote:
>
> Yes, I like this approach better and better.
>
> There is still a problem. In your code outlines, you have presented a
> classic depth-first (suspend) or depth-last (resume) tree algorithm.

Yes, I did that because that clarifies the locking rules (ie "we traverse
parents nodes last/first"), not because it was actually relevant to
anything else.

And the whole pre-order vs post-order is important, and really only shows
up when you show the pseudo-code as a tree walk.

> But that's not how the PM core works. Instead it maintains dpm_list, a
> list of all devices in order of registration.

Right. I did mention that in a couple of the asides, I'm well aware that
we don't actually traverse the tree as a tree.

But the "traverse list forward" is logically the same thing as doing
a pre-order DFS, while going backwards is equivalent to doing a post-order
DFS, since all we really care about is the whole "parent first" or
"children first" part of the ordering.

So I wanted to show the logic in pseudo-code using the tree walk (because
that explains the logic _conceptually_ much better), but the actual code
would just do the list traversal.

> The consequence is that there's no way to hand off an entire subtree to
> an async thread. And as a result, your single-pass algorithm runs into
> the kind of "stall" problem I described before.

No, look again. There's no stall in the thing, because all it really
depends on is (for the suspend path) is that it sees all children before
the parent (because the child will do a "down_read()" on the parent node
and that should not stall), and for the resume path it depends on seeing
the parent node before any children (because the parent node does that
"down_write()" on its own node).

Everything else is _entirely_ asynchronous, including all the other locks
it takes. So there are no stalls (except, of course, if we then hit limits
on numbers of outstanding async work and refuse to create too many
outstanding async things, but that's a separate issue, and intentional, of
course).

You're right that my first one (two-phase suspend) had a stall situation.

Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/