Re: blk_congestion_wait racy?

From: Nick Piggin
Date: Wed Mar 10 2004 - 01:43:54 EST




Andrew Morton wrote:

Nick Piggin <piggin@xxxxxxxxxxxxxxx> wrote:

But I'm guessing that you have no requests in flight by the time
blk_congestion_wait gets called, so nothing ever gets kicked.


That's why blk_congestion_wait() in -mm propagates the schedule_timeout()
return value. You can do:

if (blk_congestion_wait(...))
printk("ouch\n");

If your kernel says ouch much, we have a problem.



Martin, have you tried adding this printk?

Andrew, could you take the following patch (even though it didn't fix
the problem).

I think the smp_mb isn't needed because the rl waitqueue stuff is
serialised by the queue spinlocks.

The addition of the smp_mb and the other change is to try to close the
window for races a bit. Obviously they can still happen, it's a racy
interface and it doesn't matter much.

linux-2.6-npiggin/drivers/block/ll_rw_blk.c | 6 ++++--
1 files changed, 4 insertions(+), 2 deletions(-)

diff -puN drivers/block/ll_rw_blk.c~blk-congestion-races drivers/block/ll_rw_blk.c
--- linux-2.6/drivers/block/ll_rw_blk.c~blk-congestion-races 2004-03-10 16:38:33.000000000 +1100
+++ linux-2.6-npiggin/drivers/block/ll_rw_blk.c 2004-03-10 16:41:29.000000000 +1100
@@ -110,6 +110,9 @@ static void clear_queue_congested(reques

bit = (rw == WRITE) ? BDI_write_congested : BDI_read_congested;
clear_bit(bit, &q->backing_dev_info.state);
+
+ smp_mb(); /* congestion_wqh is not synchronised. This is still racy,
+ * but better. It isn't a big deal */
if (waitqueue_active(wqh))
wake_up(wqh);
}
@@ -1543,7 +1546,6 @@ static void freed_request(request_queue_
if (rl->count[rw] < queue_congestion_off_threshold(q))
clear_queue_congested(q, rw);
if (rl->count[rw]+1 <= q->nr_requests) {
- smp_mb();
if (waitqueue_active(&rl->wait[rw]))
wake_up(&rl->wait[rw]);
if (!waitqueue_active(&rl->wait[rw]))
@@ -2036,8 +2038,8 @@ long blk_congestion_wait(int rw, long ti
DEFINE_WAIT(wait);
wait_queue_head_t *wqh = &congestion_wqh[rw];

- blk_run_queues();
prepare_to_wait(wqh, &wait, TASK_UNINTERRUPTIBLE);
+ blk_run_queues();
ret = io_schedule_timeout(timeout);
finish_wait(wqh, &wait);
return ret;

_