Re: lockdep question (was Re: IPoIB caused a kernel: BUG: softlockup detected on CPU#0!)

From: Michael S. Tsirkin
Date: Mon Mar 12 2007 - 10:19:53 EST


> Quoting Ingo Molnar <mingo@xxxxxxx>:
> Subject: Re: lockdep question (was Re: IPoIB caused a kernel: BUG: softlockup detected on CPU#0!)
>
>
> * Michael S. Tsirkin <mst@xxxxxxxxxxxxxx> wrote:
>
> > > could you turn on CONFIG_SLAB_DEBUG as well?
> > >
> > > that should catch certain types of use-after-free accesses, and
> > > lockdep will also warn if a still locked object is freed.
> >
> > Hmm, no, this does not look like use-after-free. I enabled
> > CONFIG_SLAB_DEBUG, and I still see the same message, so the memory was
> > not overwritten by slab debugger.
>
> that's still not conclusive - the memory might not have been allocated
> by slab again to detect it. Your magic-number check definitely shows
> some sort of corruption going on, right?

Not necessarily in such a direct way.

I currently think we are somehow getting neighbours where
neigh->dev points to a loopback device: the device type I see is 772
(ARPHRD_LOOPBACK), which is consistent with that.
I printed out the device name and, sure enough, it is "lo".

Is it true that sticking the following

static int ipoib_neigh_setup_dev(struct net_device *dev,
				 struct neigh_parms *parms)
{
	parms->neigh_destructor = ipoib_neigh_destructor;

	return 0;
}

in dev->neigh_setup, as ipoib does, guarantees that neighbour->dev will point to
the current device for any neighbour which ipoib_neigh_destructor gets?

That's the assumption IPoIB makes, and it seems broken in this instance.

How could that be?
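
To make the assumption concrete, here is a minimal userspace sketch of the
kind of sanity check a destructor could do before trusting neigh->dev. The
structs are simplified stand-ins, not the kernel definitions, and
neigh_dev_is_ipoib() and checked_destructor() are hypothetical names used
only for illustration; the two ARPHRD_* values match <linux/if_arp.h>.

```c
#include <stdio.h>

#define ARPHRD_INFINIBAND 32   /* as in <linux/if_arp.h> */
#define ARPHRD_LOOPBACK   772  /* as in <linux/if_arp.h> */

/* Simplified stand-ins for illustration; NOT the kernel structures. */
struct net_device {
	const char *name;
	unsigned short type;
};

struct neighbour {
	struct net_device *dev;
};

/* Hypothetical helper: does this neighbour sit on an IPoIB device? */
static int neigh_dev_is_ipoib(const struct neighbour *n)
{
	return n->dev && n->dev->type == ARPHRD_INFINIBAND;
}

/*
 * Hypothetical destructor wrapper: returns 1 if the device-specific
 * cleanup may run, 0 if the neighbour belongs to some other device
 * (for example "lo") and the private data must not be touched.
 */
static int checked_destructor(struct neighbour *n)
{
	if (!neigh_dev_is_ipoib(n)) {
		fprintf(stderr, "unexpected neighbour: dev=%s type=%d\n",
			n->dev ? n->dev->name : "(none)",
			n->dev ? (int)n->dev->type : -1);
		return 0;
	}
	/* Here it would be safe to use the device's private data. */
	return 1;
}
```

With a neighbour whose dev is "lo" (type 772) the check returns 0, and with
one on an InfiniBand device it returns 1. This only detects the situation;
it does not explain how such a neighbour reaches the destructor.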

--
MST