Re: [RFC][PATCH 0/5] Signal scalability series

From: Thomas Gleixner
Date: Mon Oct 03 2011 - 09:56:37 EST

On Sun, 2 Oct 2011, Tejun Heo wrote:
> On Sat, Oct 01, 2011 at 03:03:29PM +0200, Peter Zijlstra wrote:
> > On Sat, 2011-10-01 at 11:16 +0100, Matt Fleming wrote:
> > > I also think Thomas/Peter mentioned something about latency in
> > > delivering timer signals because of contention on the per-process
> > > siglock. They might have some more details on that.
> >
> > Right, so signal delivery is O(nr_threads), which precludes being able
> > to deliver signals from hardirq context, leading to lots of ugly in -rt.
> Signal delivery is O(#threads)? Delivery of fatal signal is of course
> but where do we walk all threads during non-fatal signal deliveries?
> What am I missing?

Delivery of any process-wide signal can result in an O(nr_threads) walk
to find a valid target. That's true for both user-space-originated and
kernel-originated (e.g. posix timers) signals.
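
To make the cost concrete, here is a minimal user-space sketch of such a
target search. It is only a model, loosely following the idea of the walk
done in complete_signal() in kernel/signal.c, and not the kernel code
itself. struct model_thread, model_wants_signal() and model_pick_target()
are hypothetical names made up for illustration, and signals are assumed
to be numbered 1..32.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for a thread; NOT the kernel's task_struct. */
struct model_thread {
	unsigned long blocked;	/* bit (sig - 1) set means sig is blocked */
	bool exiting;		/* an exiting thread cannot take new signals */
};

/* Hypothetical helper: can this thread take the signal right now? */
static bool model_wants_signal(const struct model_thread *t, int sig)
{
	return !t->exiting && !(t->blocked & (1UL << (sig - 1)));
}

/*
 * Walk the thread group starting at 'start' and return the index of the
 * first thread that can take 'sig', or -1 if no thread can.
 * '*visited' is set to the number of threads inspected, which in the
 * worst case equals nr_threads.
 */
static int model_pick_target(const struct model_thread *threads,
			     size_t nr_threads, size_t start, int sig,
			     size_t *visited)
{
	assert(nr_threads > 0 && start < nr_threads);
	assert(sig >= 1 && sig <= 32);

	*visited = 0;
	for (size_t i = 0; i < nr_threads; i++) {
		size_t idx = (start + i) % nr_threads;

		(*visited)++;
		if (model_wants_signal(&threads[idx], sig))
			return (int)idx;
	}
	return -1;
}
```

If all threads but the last one block the signal, or all of them do, the
loop inspects every thread. In the kernel that walk happens with
sighand->siglock held, which is why it hurts in hardirq context.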

> > Breaking up the multitude of uses of siglock certainly seems worthwhile
> > esp. if it also allows for a cleanup of the horrid mess called
> > signal_struct (which really should be called process_struct or so).
> >
> > And yes, aside from that the siglock can be quite contended because it's
> > pretty much the one lock serializing all of the process wide state.
> Hmmm... can you please be a bit more specific? I personally have never
> seen a case where siglock becomes a problem and IIUC Matt also doesn't

Signal-heavy applications suffer massively from sighand->siglock
contention. sighand->siglock protects the world and some more, and Matt
has explained it quite properly. And we have rather large code paths
covered by it (posix-cpu-timers are the worst of all).

> have actual use case at hand. Given the fragile nature of this part
> of kernel, it would be nice to know what the return is.

The return is finer-grained locking and, in the end, a faster signal
delivery path. That benefits everyone: we no longer burden a random
interrupted task with the convoluted signal delivery, because we want
to burden the task that uses signals with it instead.


