Re: [OOPS] 2.6.11 - NMI lockup with CFQ scheduler

From: Jens Axboe
Date: Wed Apr 06 2005 - 13:03:21 EST


On Wed, Apr 06 2005, Tejun Heo wrote:
> Jens Axboe wrote:
> >On Wed, Apr 06 2005, Arjan van de Ven wrote:
> >
> >>>@@ -324,6 +334,7 @@
> >>> issue_flush_fn *issue_flush_fn;
> >>> prepare_flush_fn *prepare_flush_fn;
> >>> end_flush_fn *end_flush_fn;
> >>>+ release_queue_data_fn *release_queue_data_fn;
> >>>
> >>> /*
> >>> * Auto-unplugging state
> >>
> >>where does this function method actually get called?
> >
> >
> >I missed the hunk in ll_rw_blk.c, rmk pointed the same thing out not 5
> >minutes ago :-)
> >
> >The patch would not work anyways, as scsi_sysfs.c clears queuedata
> >unconditionally. This is a better work-around, it just makes the queue
> >hold a reference to the device as well only killing it when the queue is
> >torn down.
> >
> >Still not super happy with it, but I don't see how to solve the circular
> >dependency problem otherwise.
> >
>
> Hello, Jens.
>
> I've been thinking about it for a while. The problem is that we're
> reference counting two different objects to track lifetime of one
> entity. This happens in both SCSI upper and mid layers. In the upper
> layer, genhd and scsi_disk (or scsi_cd, ...) are ref'ed separately while
> they share their destiny together (not really different entity) and in
> the middle layer scsi_device and request_queue does the same thing.
> Circular dependency is occuring because we separate one entity into two
> and reference counting them separately. Two are actually one and
> necessarily want each other. (until death aparts. Wow, serious. :-)

That's precisely correct.

> IMHO, what we need to do is consolidate ref counting such that in each
> layer only one object is reference counted, and the other object is
> freed when the ref counted object is released. The object of choice
> would be genhd in upper layer and request_queue in mid layer. All
> ref-counting should be updated to only ref those objects. We'll need to
> add a release callback to genhd and make request_queue properly
> reference counted.
>
> Conceptually, scsi_disk extends genhd and scsi_device extends
> request_queue. So, to go one step further, as what UL represents is
> genhd (disk device) and ML request_queue (request-based device),
> embedding scsi_disk into genhd and scsi_device into request_queue will
> make the architecture clearer. To do this, we'll need something like
> alloc_disk_with_udata(int minors, size_t udata_len) and the equivalent
> for request_queue.
>
> I've done this half-way and then doing it without fixing the SCSI
> model seemed silly so got into working on the state model. (BTW, the
> state model is almost done, I'm about to run tests.)
>
> What do you think? Jens?

It is of course the optimal solution to really solve the hierarchy of
references, but more involved. If you have time / desire to do it, I'd
be happy to review it :-)

For now I'm happy with the work-around. It's not too ugly, and at least
it makes it possible to kill the worse work-around of
scsi_kill_requests().

--
Jens Axboe

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/