Re: [BUG] 2.6.23-git18 Kernel oops in sg helpers

From: Jens Axboe
Date: Wed Oct 24 2007 - 08:26:44 EST


On Wed, Oct 24 2007, Andy Whitcroft wrote:
> On Tue, Oct 23, 2007 at 08:44:20PM +0200, Jens Axboe wrote:
> > On Tue, Oct 23 2007, Kamalesh Babulal wrote:
> > > Hi,
> > >
> > > Kernel oops is triggered while running fsx-linux test, followed by cpu softlock
> > > over the AMD box
> > >
> > > Unable to handle kernel NULL pointer dereference at 0000000000000018 RIP:
> > > [<ffffffff8021f2f6>] gart_map_sg+0x26c/0x406
> > > PGD 10185b067 PUD 10075b067 PMD 0
> > > Oops: 0002 [1] SMP
> > > CPU 3
> > > Modules linked in:
> > > Pid: 18676, comm: fsx-linux Not tainted 2.6.23-git18-autokern1 #1
> > > RIP: 0010:[<ffffffff8021f2f6>] [<ffffffff8021f2f6>] gart_map_sg+0x26c/0x406
> > > RSP: 0000:ffff810181edf948 EFLAGS: 00010002
> >
> > Can you check where gart_map_sg+0x26c is at? Make sure you have
> > CONFIG_DEBUG_INFO defined, then do:
> >
> > $ gdb vmlinux
> > $ l *gart_map_sg+0x26c
>
> Ok, this problem still seems to be about in 2.6.24-rc1. Here is the gdb
> output from that version, the panic (also below) seems the same:
>
> (gdb) l *gart_map_sg+0x26c
> 0xffffffff8022011e is in gart_map_sg (arch/x86/kernel/pci-gart_64.c:433).
> 428 goto error;
> 429 out++;
> 430 flush_gart();
> 431 if (out < nents) {
> 432 sgmap = sg_next(sgmap);
> 433 sgmap->dma_length = 0;
> 434 }
> 435 return out;
> 436
> 437 error:
>
> So it seems sg_next has returned 0.

Interesting. Can you add a

printk("mapped %d of %d\n", out, nents);

prior to that sg_next() call and reproduce?

--
Jens Axboe

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/