Re: [rfc] "fair" rw spinlocks

From: Paul E. McKenney
Date: Mon Dec 07 2009 - 20:39:11 EST


On Mon, Dec 07, 2009 at 03:19:59PM -0800, Eric W. Biederman wrote:
> Andi Kleen <andi@xxxxxxxxxxxxxx> writes:
>
> > ebiederm@xxxxxxxxxxxx (Eric W. Biederman) writes:
> >
> >> "Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx> writes:
> >>>
> >>> Is it required that all of the processes see the signal before the
> >>> corresponding interrupt handler returns? (My guess is "no", which
> >>> enables a trick or two, but thought I should ask.)
> >>
> >> Not that I recall. I think it is just an I/O completed signal.
> >
> > Wasn't there the sysrq SAK too? That one definitely would need
> > to be careful about synchronicity.
>
> SAK from sysrq is done through schedule work, I seem to recall the
> locking being impossible otherwise. There is also send_sig_all and a
> few others from sysrq. I expect we could legitimately make them
> schedule_work as well if needed.

OK, I will chance it... Here is one possible trick:

o	Maintain a list of ongoing group-signal operations, protected
	by some suitable lock.  These could be in a per-chain-locked
	hash table, hashed by the signal target (e.g., pgrp).

o	When a task is created, it scans the above list, committing
	suicide (or doing whatever the signal requires) if appropriate.

o	When creating a child task, the parent holds an SRCU read-side
	critical section across creation.  It enters the critical
	section before starting creation, and exits it when it knows
	that the child has completed scanning the above list.

o	The updater does the following:

	o	Add its request to the above list.

	o	Wait for an SRCU grace period to elapse.

	o	Kill off everything currently in the task list,
		and then wait for each such task to get to a point
		where it can be guaranteed not to spawn additional
		tasks.  (This might be mediated via a reference
		count in the corresponding list element, or by
		rescanning the task list, or any of a number of
		similar tricks.)

		Of course, if the signal is non-fatal, then it is
		necessary only to wait until the child has taken
		the signal.

	o	If it is possible for a given task's children to
		outlive it, despite the fact that the children must
		commit suicide upon finding themselves indicated by the
		list, wait for another SRCU grace period to elapse.
		(This additional SRCU grace period would be required
		for a non-fatal pgrp signal, for example.)

	o	Remove the element from the list.
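
To make the ordering in the steps above a bit more concrete, here is a
rough single-threaded userspace toy model.  It is only a sketch under
heavy assumptions: every toy_* name is made up for illustration, the
"grace period" is just a check that no creation is in flight (more
conservative than real SRCU, which waits only for pre-existing readers),
and none of the real locking is shown.  In the kernel the read side
would presumably be srcu_read_lock()/srcu_read_unlock() and the wait
would be synchronize_srcu().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_TASKS 16
#define MAX_OPS    4

struct toy_task { int pgrp; bool alive; };	/* hypothetical task record */
struct toy_op   { int pgrp; bool active; };	/* pending group-signal request */

static struct toy_task tasks[MAX_TASKS];
static size_t ntasks;
static struct toy_op ops[MAX_OPS];
static int srcu_readers;	/* stand-in for SRCU read-side critical sections */

/* Add a task directly to the task list; returns its index. */
int toy_spawn(int pgrp)
{
	assert(ntasks < MAX_TASKS);
	tasks[ntasks].pgrp = pgrp;
	tasks[ntasks].alive = true;
	return (int)ntasks++;
}

/* Parent enters the read-side critical section before creation. */
void toy_fork_begin(void)
{
	srcu_readers++;
}

/* Child becomes visible and scans the pending list; parent then exits. */
int toy_fork_finish(int parent)
{
	int child = toy_spawn(tasks[parent].pgrp);

	for (size_t i = 0; i < MAX_OPS; i++)
		if (ops[i].active && ops[i].pgrp == tasks[child].pgrp)
			tasks[child].alive = false;	/* child acts on the signal itself */
	assert(srcu_readers > 0);
	srcu_readers--;
	return child;
}

/* Updater: publish the request.  Returns a slot index, or -1 if full. */
int toy_signal_begin(int pgrp)
{
	for (int i = 0; i < MAX_OPS; i++) {
		if (!ops[i].active) {
			ops[i].pgrp = pgrp;
			ops[i].active = true;
			return i;
		}
	}
	return -1;
}

/* Updater: toy grace period, true once no creation is in flight. */
bool toy_grace_period_elapsed(void)
{
	return srcu_readers == 0;
}

/* Updater: sweep the task list, killing every member of the group. */
void toy_kill_pgrp(int pgrp)
{
	for (size_t i = 0; i < ntasks; i++)
		if (tasks[i].pgrp == pgrp)
			tasks[i].alive = false;
}

/* Updater: retire the request once the sweep is done. */
void toy_signal_end(int slot)
{
	ops[slot].active = false;
}

/* Count the surviving members of a process group. */
int toy_survivors(int pgrp)
{
	int n = 0;

	for (size_t i = 0; i < ntasks; i++)
		if (tasks[i].alive && tasks[i].pgrp == pgrp)
			n++;
	return n;
}
```

The interesting case is a creation that is already in flight when the
request is published: the toy grace period cannot complete until the
child has scanned the list, so the child either sees the request itself
or is already in the task list when the updater sweeps it.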

Does this approach make sense, or am I misunderstanding the problem?

Either way, one additional question... It seems to me that non-fatal
signals really don't require the above mechanism, because if a task
handles the signal, and then spawns a child, one can argue that the
child came after the signal and should thus be unaffected. Right?
Or more confusion on my part?

Thanx, Paul