Re: loop device deadlocks

From: Steve Dodd (steved@loth.demon.co.uk)
Date: Tue Jun 13 2000 - 15:41:54 EST


On Mon, Jun 05, 2000 at 07:07:57AM +0200, Mike Galbraith wrote:

> All I had to do to trigger the get_request problem was to do an iozone.

Curiouser and curiouser..

I played with iozone tonight on my victim box; booting with mem=8m seems to
lead to loop device disaster pretty rapidly. But this didn't seem to be
a request issue. I had plugging disabled (I assume no other loop changes have
gone into test1-ac; I've not caught up yet), and saw the following stack
traces:

iozone process:

schedule
wakeup_bdflush
refill_freelist
getblk
block_getblk
ext2_get_block
__block_prepare_write
block_prepare_write
ext2_prepare_write
lo_send
do_lo_request
generic_make_request
__ll_rw_block
ll_rw_block
sync_page_buffers
try_to_free_buffers
shrink_mmap
do_try_to_free_pages
try_to_free_pages
__alloc_pages
generic_file_write
sys_write

kswapd:

schedule
wakeup_bdflush
refill_freelist
getblk
block_getblk
ext2_get_block
__block_prepare_write
block_prepare_write
ext2_prepare_write
lo_send
do_lo_request
generic_make_request
__ll_rw_block
ll_rw_block
sync_page_buffers
try_to_free_buffers
shrink_mmap
do_try_to_free_pages
kswapd

kflushd:

schedule
__find_lock_page
grab_cache_page
lo_send
do_lo_request
generic_make_request
__ll_rw_block
ll_rw_block
flush_dirty_buffers
bdflush

In other words, ext2 had a page locked but ran out of buffers while trying
to do I/O on it, and bdflush's attempts to free some buffers end up waiting
for that same page to be unlocked.
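
To make the circular wait explicit, here is a toy wait-for model of the three
traces above. It is only a sketch, not kernel code: the resource labels and
the find_cycle helper are made up for illustration, and it assumes the locked
page is held by the iozone context (the traces don't say whether it is iozone
or kswapd).

```python
# Toy model of the wait-for relationships read off the three stack traces.
# Resource names are illustrative labels, not kernel identifiers.
WAITS_FOR = {
    "iozone": "free buffers",   # asleep in refill_freelist after wakeup_bdflush
    "kswapd": "free buffers",   # same path, entered from shrink_mmap
    "kflushd": "page lock",     # asleep in __find_lock_page, reached via lo_send
}

# Which task has to make progress before each resource becomes available.
# ASSUMPTION: the page is locked by the iozone context; it could equally
# be kswapd, since both came through lo_send.
PROVIDED_BY = {
    "free buffers": "kflushd",
    "page lock": "iozone",
}


def find_cycle(start):
    """Follow task -> resource -> provider until a task repeats or the chain ends.

    Returns the list of tasks forming the cycle, or None if the chain ends.
    """
    seen = []
    task = start
    while task is not None and task not in seen:
        seen.append(task)
        resource = WAITS_FOR.get(task)
        task = PROVIDED_BY.get(resource)
    if task is None:
        return None
    return seen[seen.index(task):]
```

Calling find_cycle("iozone") walks iozone -> kflushd -> iozone, so under
these assumptions neither task can ever be woken. kswapd is not in the cycle
itself, but it is stuck behind it.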

I guess the next thing I'm going to try is Jens' [k]loopd patch, but even
with that I'm not convinced there aren't a whole load of resource (buffer,
buffer head, request) starvation problems.

Cheers,
Steve



