Re: bad rss-counter message in 3.14rc5

From: Sasha Levin
Date: Tue Mar 18 2014 - 22:42:35 EST


On 03/18/2014 10:12 PM, Hugh Dickins wrote:
On Tue, 18 Mar 2014, Sasha Levin wrote:
On 03/18/2014 08:38 PM, Hugh Dickins wrote:
On Tue, 11 Mar 2014, Dave Jones wrote:
On Tue, Mar 11, 2014 at 09:36:03PM +0400, Cyrill Gorcunov wrote:
> On Tue, Mar 11, 2014 at 01:10:45PM -0400, Dave Jones wrote:
> > >
> > > Dave, iirc trinity can write log file pointing which exactly syscall sequence
> > > was passed, right? Share it too please.
> >
> > Hm, I may have been mistaken, and the damage was done by a previous run.
> > I went from being able to reproduce it almost instantly to now not being able
> > to reproduce it at all. Will keep trying.
>
> Sasha already gave a link to the syscalls sequence, so no rush.

It'd be nice to get a more concise reproducer, his list had a little of
everything in there.

I've so far failed to find any explanation for your swapops.h BUG;
but believe I have identified one cause for "Bad rss-counter"s.

My hunch is that the swapops.h BUG is "nearby", but I just cannot
fit it together (the swapops.h BUG comes when rmap cannot find all
the migration entries it inserted earlier: it's a very useful
BUG for validating rmap).

Untested patch below: I can't quite say Reported-by, because it may
not even be one that you and Sasha have been seeing; but I'm hopeful,
remap_file_pages is in the list.

Please give this a try, preferably on 3.14-rc or earlier: I've never
seen "Bad rss-counter"s there myself (trinity uses remap_file_pages
a lot more than most of us); but have seen them on mmotm/next, so
some other trigger is coming up there, I'll worry about that once
it reaches 3.15-rc.

The patch fixed the "Bad rss-counter" errors I've been seeing both in
3.14-rc7 and -next.

Great, thanks a lot, Sasha. I was afraid that you'd hit those swapops
BUGs, which seemed perhaps to be paired with these; but glad to hear
a positive. Let's see how Dave fares. (I've not forgotten shmem
fallocate, by the way, but those probably aren't as high on my agenda
as you'd like.)

I do hit the swapops issue a lot; I didn't think that your patch was
supposed to fix that, so I didn't mention it.

Thanks for keeping shmem in mind, I've removed shmem from testing for now
but I agree, it's not one of the more important issues to be taken care of.


Thanks,
Sasha
