Re: IGMP and rwlock: Dead occurred again on TILEPro

From: Eric Dumazet
Date: Thu Feb 17 2011 - 01:39:23 EST


On Wednesday, 16 February 2011 at 21:46 -0800, David Miller wrote:
> From: Américo Wang <xiyou.wangcong@xxxxxxxxx>
> Date: Thu, 17 Feb 2011 13:42:37 +0800
>
> > On Thu, Feb 17, 2011 at 01:04:14PM +0800, Cypher Wu wrote:
> >>>
> >>> Have you turned CONFIG_LOCKDEP on?
> >>>
> >>> I think Eric already converted that rwlock into RCU lock, thus
> >>> this problem should disappear. Could you try a new kernel?
> >>>
> >>> Thanks.
> >>>
> >>
> >>I haven't turned CONFIG_LOCKDEP on for this test, since I didn't get
> >>much information from it when we tried to figure out the former deadlock.
> >>
> >>IGMP uses read_lock() instead of read_lock_bh() since read_lock() can
> >>usually be called recursively. Today I read the MIPS implementation,
> >>and it should also work fine in that situation. The TILEPro
> >>implementation causes a problem: after it uses TNS to set lock-val to 1
> >>while holding the original value, and before it resets lock-val to a
> >>new value, there is a race-condition window.
> >>
> >
> > I see no reason why you can't call read_lock_bh() recursively,
> > read_lock_bh() is roughly equivalent to local_bh_disable() + read_lock(),
> > both can be recursive.
> >
> > But I may be missing something here. :-/
>
> IGMP is doing this so that taking the read lock does not stop packet
> processing.
>
> TILEPro's rwlock implementation is simply buggy and needs to be fixed.

Yep. Finding all recursive read locks in the kernel and converting them
to another locking model is probably more expensive than fixing the
TILEPro rwlock implementation.
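
For illustration only, here is a minimal user-space sketch of the window
Cypher describes. It is not the actual TILEPro code: the lock-word
encoding, the tns() helper, the irq_hook pointer and the spin limit are
all assumptions made up for this model, and writers are not modeled at
all. It only shows why a nested read_lock() taken from a softirq on the
same CPU can spin forever if it lands between the test-and-set and the
write-back.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical lock-word encoding, chosen only for this sketch. */
#define TNS_BUSY   1u    /* transient marker left behind by test-and-set */
#define READER_INC 2u    /* one reader; the count lives above bit 0 */
#define SPIN_LIMIT 1000  /* bounded spin so the model terminates */

static unsigned int lock_val;   /* 0 means free */
static void (*irq_hook)(void);  /* simulated softirq fired inside the window */
static int nested_result = -1;  /* outcome of the nested lock attempt */

/* Models a test-and-set: store the busy marker, return the old value. */
static unsigned int tns(unsigned int *p)
{
    unsigned int old = *p;
    *p = TNS_BUSY;
    return old;
}

/* Returns 1 on success, 0 if the spin limit is hit (a hang in a real kernel). */
static int model_read_lock(void)
{
    for (int spins = 0; spins < SPIN_LIMIT; spins++) {
        unsigned int old = tns(&lock_val);
        if (old & TNS_BUSY)
            continue;                /* another update is in flight: retry */

        /* Window: lock_val holds TNS_BUSY until the write-back below. */
        if (irq_hook) {
            void (*hook)(void) = irq_hook;
            irq_hook = NULL;         /* fire only once */
            hook();
        }

        lock_val = old + READER_INC; /* publish the new reader count */
        return 1;
    }
    return 0;
}

static void model_read_unlock(void)
{
    assert(lock_val >= READER_INC && !(lock_val & TNS_BUSY));
    lock_val -= READER_INC;
}

/* Stands in for a softirq handler that takes the same read lock. */
static void softirq_takes_read_lock(void)
{
    nested_result = model_read_lock();
}
```

With irq_hook left NULL, two back-to-back model_read_lock() calls both
succeed, which is the recursion IGMP relies on. With irq_hook set to
softirq_takes_read_lock, the outer call still succeeds but the nested
one gives up after SPIN_LIMIT tries; in a real kernel there is no limit
and the CPU would simply hang.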


