Re: [PATCH] [patch 4a/4] ipc: sem optimise simple operations

From: Manfred Spraul
Date: Sat Aug 15 2009 - 12:32:17 EST


On 08/15/2009 04:49 PM, Nick Piggin wrote:

> I don't see how you've argued that yours is better.

Lower number of new code lines.
Lower total code size increase.
Lower number of separate codepaths.
Lower runtime memory consumption.
Two separate patches for the two algorithm improvements.

The main advantage of your version is that you optimize more cases.

> If you are worried about memory consumption, we can add _rcu variants
> to hlists and use them.
There is no need for _rcu, the whole code runs under a spinlock.
Thus the wait_for_zero queue could be converted to a hlist immediately.
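
To illustrate the memory argument, here is a minimal userspace sketch that models the kernel's list_head and hlist_head layouts. The struct definitions mirror the kernel's, but zero_queue_saving() is a hypothetical helper added only for this comparison, and the add/delete functions are plain (non-_rcu) variants, on the assumption stated above that the caller already holds the spinlock.

```c
#include <stddef.h>

/* Userspace model of the kernel's doubly linked list head: two pointers. */
struct list_head {
	struct list_head *next, *prev;
};

/* Userspace model of the kernel's hlist: the head is a single pointer. */
struct hlist_node {
	struct hlist_node *next, **pprev;
};

struct hlist_head {
	struct hlist_node *first;
};

/* Hypothetical helper: bytes saved per queue head by switching to hlist. */
static size_t zero_queue_saving(void)
{
	return sizeof(struct list_head) - sizeof(struct hlist_head);
}

/* Plain insert at the head; assumes the caller holds the lock. */
static void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
{
	struct hlist_node *first = h->first;

	n->next = first;
	if (first)
		first->pprev = &n->next;
	h->first = n;
	n->pprev = &h->first;
}

/* Plain removal; works without knowing the head, thanks to pprev. */
static void hlist_del(struct hlist_node *n)
{
	struct hlist_node *next = n->next;
	struct hlist_node **pprev = n->pprev;

	*pprev = next;
	if (next)
		next->pprev = pprev;
}
```

On a typical build the saving is one pointer per queue head, at the cost of losing direct access to the tail of the queue.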

Hmm: Did you track my proposals for your version?

- exit_sem() is not a hot path.
I would propose to treat every exit_sem() as a single update_queue() call, not as one update_queue_simple() call for every individual UNDO.

- create an unlink_queue() helper that contains the updates to q->lists and sma->complex_count.
Three copies invite errors.

- now: use a hlist for the zero queue.
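
The unlink_queue() proposal above could look roughly like the following userspace sketch. The struct layouts and the field names pending, list, simple_list and nsops are assumptions made for illustration; only the helper name and sma->complex_count come from the proposal itself.

```c
#include <stddef.h>

/* Minimal userspace stand-in for the kernel's circular list. */
struct list_head {
	struct list_head *next, *prev;
};

static void list_init(struct list_head *h)
{
	h->next = h->prev = h;
}

static void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

static void list_del(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
}

/* Hypothetical layouts: field names are assumptions for this sketch. */
struct sem_array {
	struct list_head pending;	/* all pending operations */
	int complex_count;		/* pending multi-semaphore operations */
};

struct sem_queue {
	struct list_head list;		/* entry in sma->pending */
	struct list_head simple_list;	/* entry in a per-semaphore queue */
	int nsops;			/* number of operations in the request */
};

/*
 * One place that removes a queue entry from every list it is on and
 * keeps complex_count in sync, instead of three open-coded copies.
 */
static void unlink_queue(struct sem_array *sma, struct sem_queue *q)
{
	list_del(&q->list);
	if (q->nsops == 1)
		list_del(&q->simple_list);
	else
		sma->complex_count--;
}
```

With a single helper, a later change to the bookkeeping (for example, moving the zero queue to a hlist) only has to be made once.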

> And if you are worried about text size, then
> I would bet my version actually uses less icache in the case of
> simple ops being used.
It depends. After disabling inlining, including all helper functions that differ:

My proposal: 301 bytes for update_queue.

"simple", only negv: 226 bytes
"simple", negv+zero: 354 bytes
simple+complex: 526 bytes

Thus with only +-1 simple ops, your version uses less icache. If both +-1 and 0 ops are used, your version uses more icache.
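
The comparison can be written out as a small sketch. The byte counts are the ones measured above; the enum and function names are hypothetical, and the sketch assumes the 301-byte update_queue figure applies to every workload.

```c
/* Which kinds of operations the workload actually executes. */
enum sem_workload {
	OPS_NEGV_ONLY,		/* only +-1 simple ops */
	OPS_NEGV_AND_ZERO,	/* +-1 and wait-for-zero simple ops */
	OPS_SIMPLE_AND_COMPLEX	/* simple and complex ops */
};

/* Measured size of the single update_queue path, in bytes. */
enum { SINGLE_PATH_BYTES = 301 };

/* Measured size of the split simple/complex paths that get touched. */
static int split_path_bytes(enum sem_workload w)
{
	switch (w) {
	case OPS_NEGV_ONLY:
		return 226;
	case OPS_NEGV_AND_ZERO:
		return 354;
	case OPS_SIMPLE_AND_COMPLEX:
		return 526;
	}
	return -1;
}

/* Negative: the split version touches less code; positive: more. */
static int split_path_delta(enum sem_workload w)
{
	return split_path_bytes(w) - SINGLE_PATH_BYTES;
}
```

The deltas come out as -75, +53 and +225 bytes for the three workloads.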

Could you please send me your benchmark app?

--
Manfred