Re: [OOPS] 2.6.11 - NMI lockup with CFQ scheduler
From: Tejun Heo
Date: Wed Apr 06 2005 - 08:40:54 EST
Jens Axboe wrote:
On Wed, Apr 06 2005, Arjan van de Ven wrote:
@@ -324,6 +334,7 @@
issue_flush_fn *issue_flush_fn;
prepare_flush_fn *prepare_flush_fn;
end_flush_fn *end_flush_fn;
+ release_queue_data_fn *release_queue_data_fn;
/*
* Auto-unplugging state
where does this function method actually get called?
I missed the hunk in ll_rw_blk.c, rmk pointed the same thing out not 5
minutes ago :-)
The patch would not work anyways, as scsi_sysfs.c clears queuedata
unconditionally. This is a better work-around, it just makes the queue
hold a reference to the device as well only killing it when the queue is
torn down.
Still not super happy with it, but I don't see how to solve the circular
dependency problem otherwise.
Hello, Jens.
I've been thinking about it for a while. The problem is that we're
reference counting two different objects to track lifetime of one
entity. This happens in both SCSI upper and mid layers. In the upper
layer, genhd and scsi_disk (or scsi_cd, ...) are ref'ed separately while
they share their destiny together (not really different entity) and in
the middle layer scsi_device and request_queue does the same thing.
Circular dependency is occuring because we separate one entity into two
and reference counting them separately. Two are actually one and
necessarily want each other. (until death aparts. Wow, serious. :-)
IMHO, what we need to do is consolidate ref counting such that in each
layer only one object is reference counted, and the other object is
freed when the ref counted object is released. The object of choice
would be genhd in upper layer and request_queue in mid layer. All
ref-counting should be updated to only ref those objects. We'll need to
add a release callback to genhd and make request_queue properly
reference counted.
Conceptually, scsi_disk extends genhd and scsi_device extends
request_queue. So, to go one step further, as what UL represents is
genhd (disk device) and ML request_queue (request-based device),
embedding scsi_disk into genhd and scsi_device into request_queue will
make the architecture clearer. To do this, we'll need something like
alloc_disk_with_udata(int minors, size_t udata_len) and the equivalent
for request_queue.
I've done this half-way and then doing it without fixing the SCSI
model seemed silly so got into working on the state model. (BTW, the
state model is almost done, I'm about to run tests.)
What do you think? Jens?
--
tejun
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/