Re: [PATCH] locking/mutexes: Revert "locking/mutexes: Add extra reschedule point"

From: Ilya Dryomov
Date: Fri Aug 01 2014 - 09:50:13 EST


On Fri, Aug 1, 2014 at 5:27 PM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> On Fri, Aug 01, 2014 at 04:56:27PM +0400, Ilya Dryomov wrote:
>> I'm going to fix up rbd_request_fn(), but I want to make sure
>> I understand this in full.
>>
>> - Previously the danger of calling blocking primitives on the way to
>> schedule(), i.e. with task->state != TASK_RUNNING, was that if the
>> blocking primitive was indeed to block the task state would be set
>> back to TASK_RUNNING and the schedule() that that task was on the way
>> to wouldn't have any effect. Your "Add extra reschedule point" patch
>> essentially made calling mutex_lock() and probably others much more
>> wrong that it used to be, because mutex_lock() may now reschedule
>> when the task is not on the mutex wait queue.
>
> Right and in general we cannot allow spurious wakeups (although we try
> very hard to deal with them in generic code, which is why things more or
> less worked).
>
> But if you do a patch that 'randomly' ignores ->state on schedule (I did
> one) stuff comes apart _real_ quick.
>
> Therefore you should very much not destroy ->state on the way to
> schedule.
>
>> - There is nothing wrong with releasing queue_lock and reenabling IRQs
>> in rbd_request_fn() as long as it doesn't block and I remember to
>> disable IRQs and take queue_lock back on return.
>
> Releasing queue_lock might be ok, dunno about the blk locking, however
> reenabling IRQs it is actually wrong as per blk_flush_plug_list() since
> that uses local_irq_save()/restore() which means it can be called from
> contexts which cannot deal with enabling IRQs, and then your
> ->request_fn() goes and does that.
>
> Now maybe blk_flush_plug_list() is overly paranoid and it could use
> local_irq_disable()/enable() instead, I don't know. But until it does, a
> request_fn() should never reenable IRQs.
>
>> I'm asking because rbd_request_fn() is probably not the only broken in
>> this way code path. I poked around and found read_events() in aio.c,
>> it seems to have been written with the "danger" assumption that
>> I outlined above and there is even a comment to it.
>
> I'm fairly sure there's more broken stuff, I didn't dare looking.
>
>> Does that above make sense or am I missing something?
>
> I think that's about it.

Thanks for clarifying things. CC'ing Kent to draw attention to
read_events().

Ilya
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/