Re: arca-vm-8 [Re: [patch] arca-vm-6, killed kswapd [Re: [patch] new-vm , improvement , [Re: 2.2.0 B

Linus Torvalds (torvalds@transmeta.com)
Thu, 7 Jan 1999 14:57:34 -0800 (PST)


On Thu, 7 Jan 1999, Linus Torvalds wrote:
>
> The deadlock I suspect is:
>  - we're low on memory
>  - we allocate or look up a new block on the filesystem. This involves
>    getting the ext2 superblock lock, and doing a "bread()" of the free
>    block bitmap block.
>  - this causes us to try to allocate a new buffer, and we are so low on
>    memory that we go into try_to_free_pages() to find some more memory.
>  - try_to_free_pages() finds a shared memory file to page out.
>  - trying to page that out, it looks up the buffers on the filesystem it
>    needs, but deadlocks on the superblock lock.

Confirmed. Hpa was good enough to reproduce this, and my debugging code
caught the (fairly deep) deadlock:

	system_call ->
	sys_write ->
	ext2_file_write ->
	ext2_getblk ->
	ext2_alloc_block ->		** gets superblock lock **
	ext2_new_block ->
	getblk ->
	refill_freelist ->
	grow_buffers ->
	__get_free_pages ->
	try_to_free_pages ->
	swap_out ->
	swap_out_process ->
	swap_out_vma ->
	try_to_swap_out ->
	filemap_swapout ->
	filemap_write_page ->
	ext2_file_write ->
	ext2_getblk ->
	ext2_alloc_block ->
	__wait_on_super			** BOOM - we want the superblock lock again **

and I suspect the fix is fairly simple: I'll just add back the __GFP_IO
bit (we kind of used to have one that did something similar) which will
make the swap-out code not write out shared pages when it allocates
buffers.
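
For illustration only, the idea of an I/O permission bit in the allocation
flags might be sketched in user-space C as below. The flag values, the
SKETCH_ names, and the struct are all hypothetical and are not the kernel's
actual definitions; the sketch only shows how a reclaim path could refuse
to write out a shared file page when the allocating caller has not set the
bit.

```c
#include <stdbool.h>

/* Hypothetical flag values for this sketch; not the kernel's real ones. */
#define SKETCH_GFP_WAIT 0x01u	/* caller may sleep */
#define SKETCH_GFP_IO   0x02u	/* reclaim may call into the filesystem */

/* Buffer allocations may sleep but must not trigger filesystem writes. */
#define SKETCH_GFP_BUFFER (SKETCH_GFP_WAIT)
/* Ordinary allocations may do both. */
#define SKETCH_GFP_USER   (SKETCH_GFP_WAIT | SKETCH_GFP_IO)

struct sketch_page {
	/* true for a dirty shared file page: freeing it means a file write */
	bool needs_fs_write;
};

/*
 * Decide whether reclaim may swap this page out on behalf of an
 * allocation made with gfp_mask. Pages that need a filesystem write are
 * skipped unless the caller allowed I/O.
 */
static bool sketch_may_swap_out(const struct sketch_page *page,
				unsigned int gfp_mask)
{
	if (page->needs_fs_write && !(gfp_mask & SKETCH_GFP_IO))
		return false;
	return true;
}
```

With these assumed definitions, an allocation made with SKETCH_GFP_BUFFER
leaves dirty shared pages alone, so reclaim never re-enters the filesystem
write path while the buffer code's caller holds the superblock lock.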

The better fix would actually be to make sure that filesystems do not hold
locks around these kinds of blocking operations, but that is harder to do
at this late stage.
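
As a rough user-space model of why holding the lock across a blocking
allocation is the problem, consider the sketch below. Everything in it
(the toy_ names, the task id, the result codes) is hypothetical; a real
kernel would sleep in __wait_on_super() rather than return a code. It
assumes a non-recursive lock with a single owner, and compares allocating
while the lock is held against allocating before taking it.

```c
#include <stdbool.h>

/* Toy stand-in for the superblock lock: not recursive, one owner. */
struct toy_super {
	bool locked;
	int owner;	/* task id of the holder, valid only while locked */
};

enum toy_lock_result { TOY_OK, TOY_WOULD_BLOCK, TOY_SELF_DEADLOCK };

/* Report what taking the lock would do instead of actually sleeping. */
static enum toy_lock_result toy_lock_super(struct toy_super *sb, int task)
{
	if (!sb->locked) {
		sb->locked = true;
		sb->owner = task;
		return TOY_OK;
	}
	/* Sleeping on a lock we hold ourselves can never be woken up. */
	return sb->owner == task ? TOY_SELF_DEADLOCK : TOY_WOULD_BLOCK;
}

static void toy_unlock_super(struct toy_super *sb)
{
	sb->locked = false;
}

/* Models reclaim writing out a shared file page: it needs the lock too. */
static enum toy_lock_result toy_reclaim(struct toy_super *sb, int task)
{
	enum toy_lock_result r = toy_lock_super(sb, task);

	if (r == TOY_OK)
		toy_unlock_super(sb);
	return r;
}

/* Write path that allocates (and so may reclaim) with the lock held. */
static enum toy_lock_result toy_write_alloc_under_lock(struct toy_super *sb,
							int task, bool low_memory)
{
	enum toy_lock_result r = toy_lock_super(sb, task);

	if (r != TOY_OK)
		return r;
	if (low_memory)
		r = toy_reclaim(sb, task);	/* re-enters the lock */
	toy_unlock_super(sb);
	return r;
}

/* Write path that does its blocking allocation before taking the lock. */
static enum toy_lock_result toy_write_alloc_first(struct toy_super *sb,
						   int task, bool low_memory)
{
	enum toy_lock_result r;

	if (low_memory) {
		r = toy_reclaim(sb, task);	/* lock is free here */
		if (r != TOY_OK)
			return r;
	}
	r = toy_lock_super(sb, task);
	if (r == TOY_OK)
		toy_unlock_super(sb);
	return r;
}
```

In this model the first write path reports TOY_SELF_DEADLOCK whenever
memory is low, while the second completes, because reclaim runs while the
lock is free.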

Linus
