Re: [RFC PATCH v2 01/10] CPU hotplug: Provide APIs for "light" atomic readers to prevent CPU offline

From: Srivatsa S. Bhat
Date: Wed Dec 05 2012 - 13:49:15 EST

Replaying what Tejun wrote:

(cc'ing Oleg)

Hello, Srivatsa.

On 12/06/2012 12:13 AM, Srivatsa S. Bhat wrote:
> Also, since we don't use per-cpu locks (because rwlocks themselves are quite
> scalable for readers), we don't end up in any lock ordering problems that can
> occur if we try to use per-cpu locks.

Read-lock really isn't that scalable when you compare it to
preempt_disable/enable(). When used on hot paths, it's gonna generate
a lot of cacheline pingpongs. This patch is essentially creating a
new big lock which has potential for being very hot.

preempt_disable/enable() + stop_machine() essentially works as percpu
rwlock with very heavy penalty on the writer side. Because the reader
side doesn't even implement spinning while writer is in progress, the
writer side has to preempt the readers before entering critical
section and that's what the "stopping machine" is about.

Note that the resolution on the reader side is very low. Any section
w/ preemption disabled is protected against stop_machine(). Also, the
stop_machine() itself is extremely heavy involving essentially locking
up the machine until all CPUs can reach the same condition via
scheduling the stop_machine tasks. So, I *think* all you need to do
here is make cpu online locking finer grained (separated from
preemption) and lighten the writer side a bit. I'm quite doubtful
that you would need to go hunting down all get_online_cpus(). They
aren't used that often anyway.

Anyways, so, separating out cpu hotplug locking from preemption is the
right thing to do but I think rwlock is likely to be too heavy on the
reader side. I think percpu reader accounting + reader spinning while
writer in progress should be a good combination. It's a bit heavier
than preempt_disable() - it'll have an extra conditional jump on the
hot path, but there won't be any cacheline bouncing. The writer side
would need to synchronize against all CPUs but only against the ones
actually read locking cpu hotplug. As long as reader side critical
sections don't go crazy, it should be okay.
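
To make the "percpu reader accounting + reader spinning" idea concrete,
here is a rough userspace sketch using C11 atomics. It is not kernel
code and not what any patch in this series does: the pcpu_* names,
NR_SLOTS and CACHELINE are made up for illustration, a "slot" stands in
for a CPU, and writers are assumed to be serialized by some outer lock.

```c
/* Userspace sketch only; all names here are hypothetical, not kernel APIs. */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_SLOTS  4	/* stand-in for the number of CPUs */
#define CACHELINE 64	/* assumed cacheline size */

struct reader_slot {
	/* One counter per cacheline, so readers do not bounce lines. */
	_Alignas(CACHELINE) atomic_int count;
};

struct pcpu_rwlock {
	struct reader_slot readers[NR_SLOTS];
	atomic_bool writer_active;
};

/* Hot path: bump our own counter, then one conditional on the writer flag. */
static bool pcpu_read_trylock(struct pcpu_rwlock *l, int slot)
{
	atomic_fetch_add(&l->readers[slot].count, 1);
	if (!atomic_load(&l->writer_active))
		return true;
	/* A writer is in progress: back out so it can make progress. */
	atomic_fetch_sub(&l->readers[slot].count, 1);
	return false;
}

static void pcpu_read_lock(struct pcpu_rwlock *l, int slot)
{
	while (!pcpu_read_trylock(l, slot)) {
		/* Reader spins while the writer is in progress. */
		while (atomic_load(&l->writer_active))
			;
	}
}

static void pcpu_read_unlock(struct pcpu_rwlock *l, int slot)
{
	int old = atomic_fetch_sub(&l->readers[slot].count, 1);

	assert(old > 0);
	(void)old;
}

/* True if no slot currently holds a read lock. */
static bool pcpu_readers_idle(struct pcpu_rwlock *l)
{
	for (int i = 0; i < NR_SLOTS; i++)
		if (atomic_load(&l->readers[i].count) != 0)
			return false;
	return true;
}

/* Assumes writers are serialized externally (e.g. by a mutex). */
static void pcpu_write_lock(struct pcpu_rwlock *l)
{
	atomic_store(&l->writer_active, true);
	/* Wait only for slots that actually hold a read lock. */
	while (!pcpu_readers_idle(l))
		;
}

static void pcpu_write_unlock(struct pcpu_rwlock *l)
{
	atomic_store(&l->writer_active, false);
}
```

The sketch relies on the default sequentially consistent ordering of
the C11 atomic operations: a reader either sees writer_active and backs
off, or the writer sees that reader's count and waits for it. A real
version would want something cheaper than a full barrier on the reader
side, which is where the interesting work lies.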

So, we basically need percpu_rwlock. We already have percpu_rwsem.
We used to have some different variants of writer-heavy locks. Dunno
what happened to them. Maybe we still have it somewhere. Oleg has
been working on the area lately and should know more. Oleg, it seems
CPU hotplug needs big-reader rwlock, ideas on how to proceed?


-- tejun
