Re: More 2.2.17pre9 VM issues

From: Andrea Arcangeli (andrea@suse.de)
Date: Mon Jul 03 2000 - 09:40:37 EST


On Mon, 3 Jul 2000, Andrea Arcangeli wrote:

>Stephen, are we really sure we still need kpiod? Isn't GFP_IO meant to be
>clear if anybody is helding any filesystem lock (like superblock lock)?
>
>The fact is that we don't do any write-throttling in swap_out due kpiod
>async nature. If we would do_write_page then we would do the write
>throttling in the balance_dirty code while generating the dirty buffers in
>write(2) and things would be probably go much better with respect of
>memory allocations. While now swap_out returns succes in a hurry without
>doing any progress. This is of course ok for the places that can't swapout
>at all but when we have GFP_IO set we should be able to write to the fs
>(otherwise what is GFP_IO meant for? if writing to the fs isn't safe when
>we have GFP_IO set then swapfile (as opposed to swap device) wouldn't be
>safe either). So it looks like to me I can drop kpiod but please tell me
>if I'm missing something.
>
>I see only one place that is using GFP_IO even while it grabs the
>superblock lock and that looks a bug in 2.2.17pre9:
>
>--- 2.2.17pre9/fs/ext2/super.c Mon Jan 17 16:44:42 2000
>+++ /tmp/p Mon Jul 3 13:20:12 2000
>@@ -589,7 +589,7 @@
> EXT2_BLOCKS_PER_GROUP(sb);
> db_count = (sb->u.ext2_sb.s_groups_count + EXT2_DESC_PER_BLOCK(sb) - 1) /
> EXT2_DESC_PER_BLOCK(sb);
>- sb->u.ext2_sb.s_group_desc = kmalloc (db_count * sizeof (struct buffer_head *), GFP_KERNEL);
>+ sb->u.ext2_sb.s_group_desc = kmalloc (db_count * sizeof (struct buffer_head *), GFP_BUFFER);
> if (sb->u.ext2_sb.s_group_desc == NULL) {
> printk ("EXT2-fs: not enough memory\n");
> goto failed_mount;

I dropped kpiod and I applied the above patch and as I expected mmap002
stopped running out of memory and it seems to work. I also had a (cleaner
and more robust) backdoor for doing write throttling in shrink_mmap since
I understand if the swap space gets filled we may need to wait on cache
I/O completation if all the cache is dirty (but such write throttling was
not enough to fix mmap002 on 8mbyte of ram and after all it was rasonable
that mmap002 was failing given that we was assuming a success the mere
queueing of a request in kpiod). I feel like dropping kpiod it's thinko
and we really need kpiod but I can't find anymore a good reason for it
and dropping it partly fixes the MAP_SHARED problem so...

I also changed sync_page_buffers to be based on per-buffer information
(BH_Wait_IO) to know if we have to block to wait the I/O to complete or
if we can just lazily try a WRITEA.

The patch includes also all the other VM/cache patches that was in my
2.2.17pre6aa2 tree plus the recent GFP-race-fix-3.

        ftp://ftp.us.kernel.org/pub/linux/kernel/people/andrea/patches/v2.2/2.2.17pre9/VM-global-patch-1

Don't try the above on a production box, but I'd like if people could try
to break it and tell me how it behaves in responsiveness during heavy I/O
and during swap. I only checked that mmap002 is rock solid with 8mbyte of
ram (really I tested setting 8mbyte of ram in the memtest lib include and
booting with mem=8m and that means I only had 6mbyte and half usable, plus
apache and all the other usual daemons in background). Also mmap001 runs
solid on 8mbyte of ram with it.

Also the above have the no-swapout included so it may make performance
difference in swap scenarios and I'd like to know about it and I can
provide a patch to backout the no-swapout part if there's interest to
test only the rest.

WARNING DETAIL: in the patch I'm potentially wait_on_buffer and doing
ll_rw_block(WRITE) even if GFP_IO is _not_ set. This looks safe because
ll_rw_block won't acquire any fs/VFS lock and wait_on_buffer will block
only if the buffer is locked and if the buffer is locked the blockdevice
driver will be sure able to do I/O on it without need of any fs/VFS lock.
And then I'll get the wakeup from an irq handler that won't need locks
either. Thus I think it's safe but if you see a flaw in the logic let me
know. Also 2.4.x isn't doing I/O if GFP_IO is not set and I think that is
because the GFP_IO semantics got misunderstood. (GFP_IO only means we
can't do I/O operations that involve filesystem/vfs locks like the
superblock lock and the inode semaphores)

I probably won't have time to stress it in other ram scenario in the next
new few days but by reading the code I'm quite ok with it. Incremental
patches and/or hints are welcome of course.

Thanks.

Andrea

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Fri Jul 07 2000 - 21:00:12 EST