Re: [BUG] Unable to handle kernel NULL pointer dereference...as_move_to_dispatch+0x11/0x135
From: Andrew Morton
Date: Fri Feb 02 2007 - 17:12:48 EST
On Fri, 2 Feb 2007 12:56:30 -0800
Andrew Vasquez <andrew.vasquez@xxxxxxxxxx> wrote:
> On Thu, 01 Feb 2007, Andrew Morton wrote:
>
> > On Mon, 22 Jan 2007 10:35:10 -0800 Andrew Vasquez <andrew.vasquez@xxxxxxxxxx> wrote:
> > > Basically what is happening from the FC side is the initiator executes
> > > a simple dt test:
> > >
> > > dt of=/dev/raw/raw1 procs=8 oncerr=abort bs=16k disable=stats limit=2m passes=1000000 pattern=iot dlimit=2048
> > >
> > > against a single lun (a very basic Windows target mode driver).
> > > During the test a port-enable, port-disable script is running against
> > > the switch's port that is connected to the target (this occurs every
> > > sixty seconds, with a disabled duration of 2 seconds). Additionally,
> > > the target itself is set to LOGO (logout) or drop off the topology
> > > every 30 seconds.
> >
> > I don't understand what effect the port-enable/port-disable has upon the
> > system. Will it cause I/O errors, or what?
>
> No I/O errors should make their way to the upper layers (block/FS).
> The system *should* be shielded from the fibre-channel fabric events.
> I just wanted to explain what the (basic sanity) test did.
>
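(The shielding here is presumably scsi_transport_fc's dev_loss_tmo
machinery: on port loss the LLD deletes the remote port, the transport
blocks outstanding I/O rather than failing it, and errors only surface
if the port stays gone past dev_loss_tmo. Roughly this pattern -- an
illustrative sketch, not qla2xxx's actual code:

	#include <scsi/scsi_transport_fc.h>

	/* Port dropped off the fabric: block I/O; no errors reach
	 * the block layer or filesystem yet. */
	static void example_port_gone(struct fc_rport *rport)
	{
		fc_remote_port_delete(rport);
	}

	/* Port returned within dev_loss_tmo seconds: the transport
	 * unblocks the target and the queued I/O resumes. */
	static void example_port_back(struct Scsi_Host *shost,
				      struct fc_rport_identifiers *ids)
	{
		fc_remote_port_add(shost, 0, ids);
	}

So a 2-second port bounce every minute should indeed stay invisible to
the upper layers, as long as dev_loss_tmo is comfortably above 2
seconds.)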
> > > This test runs fine up to 2.6.19.
> >
> > One thing we did in there was to give direct-io-against-blockdevs some
> > special-case bio-preparation code. Perhaps this is tickling a bug somehow.
> >
> > We can revert that change like this:
> >
> >
> > diff -puN fs/block_dev.c~a fs/block_dev.c
> > --- a/fs/block_dev.c~a
> > +++ a/fs/block_dev.c
> > @@ -196,8 +196,47 @@ static void blk_unget_page(struct page *
> > pvec->page[--pvec->idx] = page;
> > }
> >
> > +static int
> > +blkdev_get_blocks(struct inode *inode, sector_t iblock,
> > + struct buffer_head *bh, int create)
> ...
>
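For reference, the get_blocks half of that revert is essentially
2.6.19's blkdev_get_blocks: map each request straight onto the block
device, clamped at end-of-device. A from-memory sketch, not the patch
verbatim (max_block() is block_dev.c's end-of-device helper):

	static int
	blkdev_get_blocks(struct inode *inode, sector_t iblock,
			struct buffer_head *bh, int create)
	{
		sector_t end_block = max_block(I_BDEV(inode));
		unsigned long max_blocks = bh->b_size >> inode->i_blkbits;

		if ((iblock + max_blocks) > end_block) {
			max_blocks = end_block - iblock;
			if ((long)max_blocks <= 0) {
				if (create)
					return -EIO;	/* write fully beyond EOF */
				/* Read fully beyond EOF: return an
				 * unmapped buffer. */
				max_blocks = 0;
			}
		}

		bh->b_bdev = I_BDEV(inode);
		bh->b_blocknr = iblock;
		bh->b_size = max_blocks << inode->i_blkbits;
		if (max_blocks)
			set_buffer_mapped(bh);
		return 0;
	}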
> Hmm, with this patch we've noted two main differences:
>
> 1) I/O throughput with the basic 'dt' command used (above) is back to
> 60 MB/s, rather than the appalling 20-22 MB/s we were seeing with
> 2.6.20-rcX.
>
> 2) No panics -- so far, with 2+ hours of testing. With our vanilla
> 2.6.20-rc7 system, the test could trigger the panic within 15 to
> 20 minutes.
>
> We'll let this run over the weekend -- I'll certainly let you know if
> anything changes (any failures).
Oh crap, I didn't realise we had a performance regression as well.
direct-io against a blockdev is kinda important for some people, so this is
a must-fix for 2.6.20. I'll prepare a minimal patch to switch 2.6.20 back
to the 2.6.19 codepaths.
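The minimal fix will look something like the below: route blockdev
O_DIRECT back through the generic direct-io machinery, with a
blkdev_get_blocks callback as in 2.6.19. An untested sketch of the
shape of it, not the final patch:

	/* 2.6.19-style blockdev direct-io entry point (sketch). */
	static ssize_t
	blkdev_direct_IO(int rw, struct kiocb *iocb, const struct iovec *iov,
				loff_t offset, unsigned long nr_segs)
	{
		struct file *file = iocb->ki_filp;
		struct inode *inode = file->f_mapping->host;

		/* No i_mutex games needed for a blockdev; use the
		 * no-locking flavour of the generic direct-io path. */
		return blockdev_direct_IO_no_locking(rw, iocb, inode,
					I_BDEV(inode), iov, offset, nr_segs,
					blkdev_get_blocks, NULL);
	}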