Re: [OOPS] 2.6.11 - NMI lockup with CFQ scheduler

From: Jens Axboe
Date: Wed Apr 06 2005 - 13:00:37 EST


On Wed, Apr 06 2005, James Bottomley wrote:
> On Tue, 2005-03-29 at 14:03 +0200, Jens Axboe wrote:
> > It is quite a serious problem, not just for CFQ. SCSI referencing is
> > badly broken there.
>
> OK ... I accept that with regard to the queue lock.

It is much deeper than that. The recent hack to kill requests is yet
another example of that. At least this work-around makes it a little
better, but the mid layer assumption that sdev going to zero implies the
queue going away at the same time is inherently broken.

> However, rather than trying to work out a way to tie all the refcounted
> objects together, what about the simpler solution of making the lock
> bound to the lifetime of the queue?

That's essentially what the work-around does.

> As far as SCSI is concerned, we could simply move the lock into the
> request_queue structure and everything would work since the device holds
> a reference to the queue. The way it would work is that we'd simply
> have a lock in the request_queue structure, but it would be up to the
> device to pass it in in blk_init_queue. Then we'd alter the scsi_device
> sdev_lock to be a pointer to the queue lock? This scheme would also
> work for the current users who have a global lock (they simply wouldn't
> use the lock int the request_queue).
>
> The only could on the horizon with this scheme is that there may
> genuinely be places where we want multiple queues to share a non-global
> lock: situations where we have shared issue queues (like IDE), or
> shared tag resources are a possibility. To cope with those, we'd
> probably have to have a separately allocated, reference counted lock.
>
> However, I'm happy to implement the simpler solution (lock in
> requuest_queue) if you agree.

I rather like the queue lock being a pointer, so you can share at
whatever level you want. Lets not grow the request_queue a full lock
just to work around a bug elsewhere.

--
Jens Axboe

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/