Re: linux-next: slab shrinkers: BUG at mm/list_lru.c:92

From: Glauber Costa
Date: Sun Jun 23 2013 - 07:51:39 EST

On Fri, Jun 21, 2013 at 11:00:21AM +0200, Michal Hocko wrote:
> On Thu 20-06-13 17:12:01, Michal Hocko wrote:
> > I am bisecting it again. It is quite tedious, though, because good case
> > is hard to be sure about.
> OK, so now I converged to 2d4fc052 (inode: convert inode lru list to generic lru
> list code.) in my tree and I have double checked it matches what is in
> the linux-next. This doesn't help much to pin point the issue I am
> afraid :/
Can you revert this patch (easiest way ATM is to rewind your tree to a point
right before it) and apply the following patch?

As Dave has mentioned, it is very likely that this bug was already there; we
were just never checking for imbalances. The attached patch would at least tell
us whether the imbalance was there before. If it was, I would suggest turning
the BUG condition into a WARN_ON_ONCE, since we would then officially not be
introducing a regression. It is no less of a bug, though, and we should keep
looking for it.
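
To illustrate the difference between a fatal check and a warn-once check, here
is a toy userspace sketch. It is not the kernel's WARN_ON_ONCE macro and not
the attached patch; the names toy_warn_on_once, toy_check_balance and
toy_warnings_printed are hypothetical. Unlike the kernel macro, which keeps its
"already warned" state per call site, this sketch uses one flag for all callers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Counts how many warning messages were actually printed. */
static int toy_warnings_printed;

/*
 * Userspace stand-in for a warn-once check (hypothetical helper).
 * Prints the message the first time cond is true, then stays quiet.
 * Always returns cond so the caller can still react to it.
 */
static bool toy_warn_on_once(bool cond, const char *what)
{
    static bool warned;

    assert(what != NULL);
    if (cond && !warned) {
        warned = true;
        toy_warnings_printed++;
        fprintf(stderr, "WARNING: %s\n", what);
    }
    return cond;
}

/* Report a negative item count once and keep running instead of aborting. */
static bool toy_check_balance(long nr_items)
{
    return toy_warn_on_once(nr_items < 0, "negative LRU item count");
}
```

With a check like this, the imbalance is still reported the first time it is
seen, but the machine keeps running, so we can keep looking for the cause.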

The main change between before and after the patch is that we are now keeping
things per node. One way to trigger this BUG would be for an inode to be
inserted into one node's LRU and removed from another. I cannot see how that
could happen, because kernel pages are stable in memory and are not moved from
node to node. We could still have some sort of weird bug in the node
calculation function. In any case, would it be possible for you to artificially
restrict your setup to a single node? I have no idea how to do that, though; we
seem to have no parameter to disable NUMA. Maybe booting with less memory, just
enough to fit in a single node?
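
To make the cross-node scenario concrete, here is a toy userspace sketch of
per-node accounting. It is not the code in mm/list_lru.c; struct toy_lru,
toy_lru_add, toy_lru_del and MAX_NODES are hypothetical names, and the only
assumption is that each node keeps its own item counter.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_NODES 2 /* hypothetical: a two-node machine */

/* Simplified stand-in for one node's LRU state. */
struct toy_lru_node {
    long nr_items;
};

/* One counter per node, as in a per-node LRU. */
struct toy_lru {
    struct toy_lru_node node[MAX_NODES];
};

/* Account an item on the node it is believed to live on. */
static void toy_lru_add(struct toy_lru *lru, int nid)
{
    assert(nid >= 0 && nid < MAX_NODES);
    lru->node[nid].nr_items++;
}

/*
 * Remove an item from node nid. Returns 0 on success, or -1 if that
 * node's counter is already zero, i.e. the removal would unbalance it.
 */
static int toy_lru_del(struct toy_lru *lru, int nid)
{
    assert(nid >= 0 && nid < MAX_NODES);
    if (lru->node[nid].nr_items == 0)
        return -1;
    lru->node[nid].nr_items--;
    return 0;
}
```

If an item is added with nid 0 but the node calculation returns 1 at removal
time, toy_lru_del() fails on node 1 while node 0 keeps a stale count. With a
single node, add and delete can only ever hit the same counter, which is why
restricting the setup to one node would tell us something.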

