Re: rcu: BUG on exit_group

From: Paul E. McKenney
Date: Fri May 04 2012 - 01:33:54 EST


On Fri, May 04, 2012 at 06:08:34AM +0200, Sasha Levin wrote:
> On Thu, May 3, 2012 at 7:01 PM, Paul E. McKenney
> <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
> > On Thu, May 03, 2012 at 05:55:14PM +0200, Sasha Levin wrote:
> >> On Thu, May 3, 2012 at 5:41 PM, Paul E. McKenney
> >> <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
> >> > On Thu, May 03, 2012 at 10:57:19AM +0200, Sasha Levin wrote:
> >> >> Hi Paul,
> >> >>
> >> >> I've hit a BUG similar to the schedule_tail() one. It happened
> >> >> when I started fuzzing exit_group() syscalls, and all of the traces
> >> >> start with exit_group() (there's a flood of them).
> >> >>
> >> >> I've verified that it indeed BUGs due to the rcu preempt count.
> >> >
> >> > Hello, Sasha,
> >> >
> >> > Which version of -next are you using?  I did some surgery on this
> >> > yesterday based on some bugs Hugh Dickins tracked down, so if you
> >> > are using something older, please move to the current -next.
> >>
> >> I'm using -next from today (3.4.0-rc5-next-20120503-sasha-00002-g09f55ae-dirty).
> >
> > Hmmm...  Looking at this more closely, it looks like there really is
> > an attempt to acquire a mutex within an RCU read-side critical section,
> > which is illegal.  Could you please bisect this?
>
> Right, the issue is as you described, taking a mutex inside rcu_read_lock().
>
> The offending commit is (I've cc'ed all parties from it):
>
> commit adf79cc03092ee4aec70da10e91b05fb8116ac7b
> Author: Ying Han <yinghan@xxxxxxxxxx>
> Date: Thu May 3 15:44:01 2012 +1000
>
> memcg: add mlock statistic in memory.stat
>
> The issue is that munlock_vma_page() now calls
> mem_cgroup_begin_update_page_stat(), which takes rcu_read_lock(),
> so when the older code that was already there tries to take a
> mutex, you get a BUG.

Hmmm... One approach would be to switch from rcu_read_lock() to
srcu_read_lock(), though this means carrying the index returned from
the srcu_read_lock() to the matching srcu_read_unlock() -- and making
the update side use synchronize_srcu() rather than synchronize_rcu().
Alternatively, it might be possible to defer acquiring the lock until
after exiting the RCU read-side critical section, but I don't know enough
about mm to even guess whether this might be possible.
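
A rough sketch of the SRCU variant, purely illustrative: the names
page_stat_srcu, example_mutex, and example_reader() are hypothetical, and
the srcu/mutex functions below are simplified userspace stand-ins so the
fragment compiles outside the kernel. Only the calling pattern (the index
returned by srcu_read_lock() is handed to the matching srcu_read_unlock())
reflects the real API.

```c
#include <assert.h>

/*
 * Userspace stand-ins so this sketch compiles outside the kernel.
 * In the kernel these come from <linux/srcu.h> and <linux/mutex.h>;
 * the bodies here are simplified models, not the real implementations.
 */
struct srcu_struct { int completed; int nreaders[2]; };
struct mutex { int locked; };

static int srcu_read_lock(struct srcu_struct *sp)
{
	int idx = sp->completed & 0x1;	/* pick the current reader slot */

	sp->nreaders[idx]++;
	return idx;			/* caller must keep this for unlock */
}

static void srcu_read_unlock(struct srcu_struct *sp, int idx)
{
	sp->nreaders[idx]--;
}

static void mutex_lock(struct mutex *m)   { assert(!m->locked); m->locked = 1; }
static void mutex_unlock(struct mutex *m) { assert(m->locked);  m->locked = 0; }

/* Hypothetical names, for illustration only. */
static struct srcu_struct page_stat_srcu;
static struct mutex example_mutex;

/* Reader: returns the number of readers still registered on exit. */
static int example_reader(void)
{
	int idx;

	idx = srcu_read_lock(&page_stat_srcu);
	mutex_lock(&example_mutex);	/* sleeping is legal under SRCU */
	/* ... work that needs both protections ... */
	mutex_unlock(&example_mutex);
	srcu_read_unlock(&page_stat_srcu, idx);	/* same idx as the lock */

	return page_stat_srcu.nreaders[0] + page_stat_srcu.nreaders[1];
}
```

In real kernel code the srcu_struct would also need init_srcu_struct()
before first use, and the update side would wait for readers with
synchronize_srcu(&page_stat_srcu) instead of synchronize_rcu().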

There are probably other approaches as well...

Thanx, Paul
