Re: 2.6.4-rc1-mm1: queue-congestion-dm-implementation patch

From: Andrew Morton
Date: Tue Mar 02 2004 - 17:21:10 EST


Kevin Corry <kevcorry@xxxxxxxxxx> wrote:
>
> > Changing down_read() in dm_any_congested to down_read_trylock() would
> > probably fix it for bdi_*_congested(). If you can tell me how to
> > reproduce it I can try a few things..
>
> Switching to down_read_trylock() would certainly eliminate this problem, as
> long as you don't *need* to check the congestion of the underlying devices
> each time dm_any_congested() is called.

It's clear from the trace: we're doing down_read() inside
sync_sb_inodes()'s inode_lock.

Yes, a trylock would fix it up, but it's a bit sleazy.

And we might have another problem here: we're thnking about removing the
global plugging queue and switching disk plugging over to being a
per-address_space thing. So blk_run_queues() goes away and is replaced with

mapping->backing_dev_info.unplug(&mapping->backing_dev_info);

so we only unplug the queues which back the relevant address_space.

The top-level code will take the top-level queue's lock, and the
devicemapper's implementation of backing_dev_info.unplug() will be called
under the top-level queue lock and will need to perform the same
devicemapper table walk. A trylock here will not be acceptable: if it
fails, a process hangs up for seconds or even minutes.

So for two reasons now, it's looking like that semaphore which protects the
devicemapper tables needs to become a spinlock. One which has interesting
ranking properties.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/