Re: [PATCH 3/3] mm: slub: Default slub_max_order to 0

From: James Bottomley
Date: Fri May 13 2011 - 00:12:28 EST


On Thu, 2011-05-12 at 19:47 -0500, James Bottomley wrote:
> On Fri, 2011-05-13 at 00:15 +0200, Johannes Weiner wrote:
> > On Thu, May 12, 2011 at 05:04:41PM -0500, James Bottomley wrote:
> > > On Thu, 2011-05-12 at 15:04 -0500, James Bottomley wrote:
> > > > Confirmed, I'm afraid ... I can trigger the problem with all three
> > > > patches under PREEMPT. It's not a hang this time, it's just kswapd
> > > > taking 100% system time on 1 CPU and it won't calm down after I unload
> > > > the system.
> > >
> > > Just on a "if you don't know what's wrong poke about and see" basis, I
> > > sliced out all the complex logic in sleeping_prematurely() and, as far
> > > as I can tell, it cures the problem behaviour. I've loaded up the
> > > system, and taken the tar load generator through three runs without
> > > producing a spinning kswapd (this is PREEMPT). I'll try with a
> > > non-PREEMPT kernel shortly.
> > >
> > > What this seems to say is that there's a problem with the complex logic
> > > in sleeping_prematurely(). I'm pretty sure hacking up
> > > sleeping_prematurely() just to dump all the calculations is the wrong
> > > thing to do, but perhaps someone can see what the right thing is ...
> >
> > I think I see the problem: the boolean logic of sleeping_prematurely()
> > is odd. If it returns true, kswapd will keep running. So if
> > pgdat_balanced() returns true, kswapd should go to sleep.
> >
> > This?
>
> I was going to say this was a winner, but on the third untar run on
> non-PREEMPT, I hit the kswapd livelock. It's got much farther than
> previous attempts, which all hung on the first run, but I think the
> essential problem is still (at least on this machine) that
> sleeping_prematurely() is doing too much work for the wakeup storm that
> allocators are causing.
>
> Something that ratelimits the amount of time we spend in the watermark
> calculations, like the below (which incorporates your pgdat fix) seems
> to be much more stable (I've not run it for three full runs yet, but
> kswapd CPU time is way lower so far).

I've hammered it for several hours now with multiple loads; I can't seem
to break it (famous last words, of course).
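
Since neither patch is reproduced in this message, here is a small userspace
C sketch of the two ideas discussed above. It is not the patch from this
thread. It assumes the "true means keep running" sense Johannes describes,
and it assumes that ratelimiting is done by caching the last answer for a
fixed interval, which is only one possible approach. Every name in it
(node_state, node_balanced, sleep_is_premature, CHECK_INTERVAL_MS) is
hypothetical and does not exist in the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for per-node state; NOT a kernel structure. */
struct node_state {
    uint64_t free_pages;     /* pages currently free */
    uint64_t high_wmark;     /* level reclaim is trying to reach */
    uint64_t last_check_ms;  /* time of the last full evaluation */
    bool     last_result;    /* cached answer from that evaluation */
    bool     has_result;     /* false until the first evaluation */
    unsigned full_checks;    /* how many expensive evaluations ran */
};

/* Arbitrary interval chosen for illustration only. */
#define CHECK_INTERVAL_MS 100

/* Stand-in for the expensive watermark calculation. */
static bool node_balanced(struct node_state *n)
{
    n->full_checks++;
    return n->free_pages >= n->high_wmark;
}

/*
 * Returns true when going to sleep would be premature, i.e. the reclaim
 * thread should keep running. A balanced node therefore yields false.
 * The expensive check runs at most once per CHECK_INTERVAL_MS; callers
 * arriving in between get the cached (possibly stale) answer.
 */
static bool sleep_is_premature(struct node_state *n, uint64_t now_ms)
{
    if (n->has_result && now_ms - n->last_check_ms < CHECK_INTERVAL_MS)
        return n->last_result;

    n->last_result = !node_balanced(n);
    n->last_check_ms = now_ms;
    n->has_result = true;
    return n->last_result;
}
```

The cost of this approach is that a cached answer can be stale for up to one
interval, so a storm of wakeups no longer forces a recalculation each time,
but the thread may also run or sleep slightly longer than strictly needed.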

James

