Re: Kernel crash after using new Intel NIC (igb)

From: Ben Hutchings
Date: Thu May 26 2011 - 13:05:15 EST


On Wed, 2011-05-25 at 08:35 +0200, Eric Dumazet wrote:
> On Tuesday, 24 May 2011 at 23:06 -0700, Arun Sharma wrote:
> > On Wed, May 25, 2011 at 04:44:29AM +0200, Eric Dumazet wrote:
> > >
> > > Hmm, thanks for the report. Are you running x86 or another arch ?
> > >
> >
> > This was on x86.
> >
> > > We probably need some sort of memory barrier.
> > >
> > > However, locking this central lock makes the thing too slow, I'll try to
> > > use an atomic_inc_return on p->refcnt instead. (and then lock
> > > unused_peers.lock if we got a 0->1 transition)
> >
> > Another possibility is to do the list_empty() check twice. Once without
> > taking the lock and again with the spinlock held.
> >
>
> Why ?
>
> list_del_init(&p->unused); (done under lock of course) is safe, you can
> call it twice, no problem.
>
> No, the real problem is the !list_empty(&p->unused) test: it seems to
> not always tell the truth if not done under lock.

Of course not; list modification operations are not atomic.
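
To illustrate, here is a minimal userspace sketch, not the actual inetpeer
code. The list helpers only mimic the kernel's struct list_head, and the
names unused_peers, unused_lock, peer_mark_unused() and
peer_unlink_unused() are made up for the example. A C11 atomic_flag stands
in for the spinlock. The point is that both the list_empty() test and the
unlink happen while the lock is held, so the test can never observe a
half-updated pair of pointers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace stand-in for the kernel's struct list_head. */
struct list_head { struct list_head *next, *prev; };

static void init_list_head(struct list_head *h) { h->next = h; h->prev = h; }

static bool list_empty(const struct list_head *h) { return h->next == h; }

/* Four separate pointer stores: an unlocked reader can see any subset. */
static void list_add_tail(struct list_head *n, struct list_head *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Unlink and re-initialise; harmless on an entry that is already unlinked. */
static void list_del_init(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	init_list_head(e);
}

/* Hypothetical peer object and global unused list (illustrative names). */
struct peer { struct list_head unused; };

static struct list_head unused_peers = { &unused_peers, &unused_peers };
static atomic_flag unused_lock = ATOMIC_FLAG_INIT; /* toy spinlock */

static void lock_unused(void)
{
	while (atomic_flag_test_and_set_explicit(&unused_lock,
						 memory_order_acquire))
		; /* spin until the holder clears the flag */
}

static void unlock_unused(void)
{
	atomic_flag_clear_explicit(&unused_lock, memory_order_release);
}

/* Queue the peer as unused, unless it is already queued. */
static void peer_mark_unused(struct peer *p)
{
	lock_unused();
	if (list_empty(&p->unused))
		list_add_tail(&p->unused, &unused_peers);
	unlock_unused();
}

/* Test and unlink under the lock; returns true if the peer was queued. */
static bool peer_unlink_unused(struct peer *p)
{
	bool was_queued;

	lock_unused();
	was_queued = !list_empty(&p->unused);
	list_del_init(&p->unused);
	assert(list_empty(&p->unused)); /* entry is self-linked again */
	unlock_unused();
	return was_queued;
}
```

Calling peer_unlink_unused() twice in a row is fine: the second call finds
the entry self-linked and reports that it was not queued. An unlocked
list_empty() could at best serve as a hint that is rechecked under the lock.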

Ben.

--
Ben Hutchings, Senior Software Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
