Re: rwlock_t unfairness and tasklist_lock

From: Oleg Nesterov
Date: Sat Jan 12 2013 - 12:32:07 EST


On 01/09, Michel Lespinasse wrote:
>
> On Wed, Jan 9, 2013 at 9:49 AM, Oleg Nesterov <oleg@xxxxxxxxxx> wrote:
> > On 01/08, Michel Lespinasse wrote:
> >> Like others before me, I have discovered how easy it is to DOS a
> >> system by abusing the rwlock_t unfairness and causing the
> >> tasklist_lock read side to be continuously held
> >
> > Yes. Plus it has performance problems.
> >
> > It should die. We still need the global lock to protect, say, the
> > init_task.tasks list, but otherwise we need per-process locking.
>
> To be clear: I'm not trying to defend tasklist_lock here.

I understand,

> However,
> given how long this has been a known issue, I think we should consider
> attacking the problem from the lock fairness perspective first and
> stop waiting for an eventual tasklist_lock death.

And probably you are right,

> >> - Would there be any fundamental objection to implementing a fair
> >> rwlock_t and dealing with the reentrancy issues in tasklist_lock ? My
> >> proposal there would be along the lines of:
> >
> > I don't really understand your proposal in detail, but until we kill
> > tasklist_lock, perhaps it makes sense to implement something simple, say,
> > write-biased rwlock and add "int task_struct->tasklist_read_lock_counter"
> > to avoid the read-write-read deadlock.
>
> Right. But one complexity that has to be dealt with, is how to handle
> reentrant uses of the tasklist_lock read side,
> ...
>
> there is still the
> possibility of an irq coming in before the counter is incremented.

Sure, I didn't mean to say that it is trivial to implement
read_lock_tasklist(); we would have to prevent this race.
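
To make the idea concrete, here is a rough userspace sketch of a
write-biased lock plus a per-task read counter. It is not kernel code:
struct wb_rwlock, struct task and all the wb_*() helpers are made-up names
for illustration, and only tasklist_read_lock_counter comes from the
suggestion quoted above. It uses trylock-style helpers so the behaviour
can be stepped through in one thread, and it deliberately leaves the irq
window open (see the NOTE comment).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical write-biased lock; NOT the kernel's rwlock_t. */
struct wb_rwlock {
	atomic_int readers;	/* tasks currently holding the read side */
	atomic_int writers;	/* writers waiting for or holding the lock */
	atomic_bool writing;	/* true while one writer holds the lock */
};

/* Stand-in for task_struct with the proposed per-task counter. */
struct task {
	int tasklist_read_lock_counter;
};

static inline bool wb_read_trylock(struct wb_rwlock *l, struct task *t)
{
	/* Nested read: we already hold the lock, so ignore the write bias. */
	if (t->tasklist_read_lock_counter > 0) {
		t->tasklist_read_lock_counter++;
		return true;
	}
	/* Write bias: a new reader backs off while any writer is waiting. */
	if (atomic_load(&l->writers) > 0)
		return false;
	atomic_fetch_add(&l->readers, 1);
	if (atomic_load(&l->writers) > 0) {
		/* A writer announced itself meanwhile; undo and back off. */
		atomic_fetch_sub(&l->readers, 1);
		return false;
	}
	/*
	 * NOTE: an interrupt arriving here still sees the counter at 0.
	 * This is the race discussed in the thread; the sketch does not
	 * solve it.
	 */
	t->tasklist_read_lock_counter = 1;
	return true;
}

static inline void wb_read_unlock(struct wb_rwlock *l, struct task *t)
{
	assert(t->tasklist_read_lock_counter > 0);
	/* Only the outermost unlock releases the shared read side. */
	if (--t->tasklist_read_lock_counter == 0)
		atomic_fetch_sub(&l->readers, 1);
}

/* A writer must announce itself once before trying to take the lock. */
static inline void wb_write_announce(struct wb_rwlock *l)
{
	atomic_fetch_add(&l->writers, 1);
}

static inline bool wb_write_trylock(struct wb_rwlock *l)
{
	bool expected = false;

	assert(atomic_load(&l->writers) > 0);
	if (atomic_load(&l->readers) != 0)
		return false;
	return atomic_compare_exchange_strong(&l->writing, &expected, true);
}

static inline void wb_write_unlock(struct wb_rwlock *l)
{
	atomic_store(&l->writing, false);
	atomic_fetch_sub(&l->writers, 1);
}
```

With one task holding the read side and a writer announced, a second task's
wb_read_trylock() fails (the write bias), while the first task can still
nest another read and then drop both, which lets the writer in. Without
the counter, that nested read would wait behind the writer, which in turn
waits for the first read to be dropped: the read-write-read deadlock.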

> So to deal with that, I think we have to explicitly detect the
> tasklist_lock uses that are in irq/softirq context and deal with these
> differently from those in process context

I disagree. In the long term, I think that tasklist (or whatever we use
instead) should never be used in irq/atomic context. And probably the
per-process lock should be an rw_semaphore (although it is not recursive).

But until then, if we try to improve things somehow, we should not
complicate the code; we need something simple.

But actually I am not sure; you may be right.

Oleg.
