Re: Linux-2.2.0 bad VM behaviour "dd if=/dev/zero of=/dev/hdc bs=256k"

Andrea Arcangeli (andrea@e-mind.com)
Wed, 27 Jan 1999 20:11:02 +0100 (CET)


On Wed, 27 Jan 1999, Stephen C. Tweedie wrote:

> Hi,
>
> On 26 Jan 1999 23:00:13 +0100, Andi Kleen <ak@muc.de> said:
>
> > I have similar problems, although 2.2.0(release) seems to be better
> > than 2.2.0pre9. Instead of freezing the machine completely the logins
> > only take a long time - still looks like a brown paper bag bug to me.
>
> On what size of memory? I can't reproduce any such problems on my test
> boxes here.

Stephen the problem is that currently when we write to a block device we
only:

1. get the buffer uptodate
2. write to the buffer the new data in the right place
3. mark the buffer dirty and uptodate
4. release the buffer

but we don't start I/O at all, so in a msec we'll be in a state with
128mbyte full of dirty and so not freeable buffers.

kflushd, update, sync try try try to sync these buffer to disk but it's
far too much for them and the first time an application reclaim memory it
will hog.

Having such many dirty buffers sleeping in the buffer cache without
starting I/O is like a bomb that it's waiting to explode.

So I have a fix for that. make_request NR_REQUEST*2/3 (write case) should
also take care that there aren't too much buffers dirty in the buffer
cache, I think this should be enough to allow good performances
avoiding any risk to hang due not freeable memory.

Here my fix against 2.2.0:

Index: block_dev.c
===================================================================
RCS file: /var/cvs/linux/fs/block_dev.c,v
retrieving revision 1.1.2.1
diff -u -r1.1.2.1 block_dev.c
--- block_dev.c 1999/01/18 01:32:24 1.1.2.1
+++ linux/fs/block_dev.c 1999/01/27 19:07:18
@@ -16,6 +16,39 @@
#define MAX_BUF_PER_PAGE (PAGE_SIZE / 512)
#define NBUF 64

+/*
+ * Start I/O every time we get a write request. Not doing this could lead
+ * to massive amount of not freeable buffers that sleep in the buffer cache,
+ * and that could stall the system far too much if for some reason we'll need
+ * to reclaim buffer memory at some time or if we request a sync.
+ *
+ * Andrea Arcangeli 27 Jen 1999
+ */
+static int FASTCALL(do_block_write(struct buffer_head **, ssize_t, int));
+static int do_block_write(struct buffer_head ** bufferlist,
+ ssize_t buffercount, int wait)
+{
+ int i;
+
+ ll_rw_block(WRITE, buffercount, bufferlist);
+ if (!wait)
+ {
+ for(i=0; i<buffercount; i++)
+ brelse(bufferlist[i]);
+ return 0;
+ } else {
+ int error = 0;
+ for(i=0; i<buffercount; i++)
+ {
+ wait_on_buffer(bufferlist[i]);
+ if (!buffer_uptodate(bufferlist[i]))
+ error = 1;
+ brelse(bufferlist[i]);
+ }
+ return error;
+ }
+}
+
ssize_t block_write(struct file * filp, const char * buf,
size_t count, loff_t *ppos)
{
@@ -110,32 +143,19 @@
buf += chars;
mark_buffer_uptodate(bh, 1);
mark_buffer_dirty(bh, 0);
- if (filp->f_flags & O_SYNC)
- bufferlist[buffercount++] = bh;
- else
- brelse(bh);
- if (buffercount == NBUF){
- ll_rw_block(WRITE, buffercount, bufferlist);
- for(i=0; i<buffercount; i++){
- wait_on_buffer(bufferlist[i]);
- if (!buffer_uptodate(bufferlist[i]))
- write_error=1;
- brelse(bufferlist[i]);
- }
+ bufferlist[buffercount++] = bh;
+ if (buffercount == NBUF)
+ {
+ write_error = do_block_write(bufferlist, buffercount,
+ filp->f_flags & O_SYNC);
buffercount=0;
}
if(write_error)
break;
}
- if ( buffercount ){
- ll_rw_block(WRITE, buffercount, bufferlist);
- for(i=0; i<buffercount; i++){
- wait_on_buffer(bufferlist[i]);
- if (!buffer_uptodate(bufferlist[i]))
- write_error=1;
- brelse(bufferlist[i]);
- }
- }
+ if ( buffercount )
+ write_error = do_block_write(bufferlist, buffercount,
+ filp->f_flags & O_SYNC);
filp->f_reada = 1;
if(write_error)
return -EIO;

People that is getting problems with `dd` writing to a block device could
try this my patch and feedback?

Andrea Arcangeli

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/