Re: 2.6.39-rc4+: Kernel leaking memory during FS scanning,regression?

From: Bruno Prémont
Date: Mon Apr 25 2011 - 13:01:10 EST


On Mon, 25 April 2011 Linus Torvalds wrote:
> 2011/4/25 Bruno Prémont <bonbons@xxxxxxxxxxxxxxxxx>:
> >
> > kmemleak reports 86681 new leaks between shortly after boot and -2 state.
> > (and 2348 additional ones between -2 and -4).
>
> I wouldn't necessarily trust kmemleak with the whole RCU-freeing
> thing. In your slubinfo reports, the kmemleak data itself also tends
> to overwhelm everything else - none of it looks unreasonable per se.
>
> That said, you clearly have a *lot* of filp entries. I wouldn't
> consider it unreasonable, though, because depending on load those may
> well be fine. Perhaps you really do have some application(s) that hold
> thousands of files open. The default file limit is 1024 (I think), but
> you can raise it, and some programs do end up opening tens of
> thousands of files for filesystem scanning purposes.
>
> That said, I would suggest simply trying a saner kernel configuration,
> and seeing if that makes a difference:
>
> > Yes, it's uni-processor system, so SMP=n.
> > TINY_RCU=y, PREEMPT_VOLUNTARY=y (whole /proc/config.gz attached keeping
> > compression)
>
> I'm not at all certain that TINY_RCU is appropriate for
> general-purpose loads. I'd call it more of a "embedded low-performance
> option".

Well, TINY_RCU is the only option when doing PREEMPT_VOLUNTARY on
SMP=n...

> The _real_ RCU implementation ("tree rcu") forces quiescent states
> every few jiffies and has logic to handle "I've got tons of RCU
> events, I really need to start handling them now". All of which I
> think tiny-rcu lacks.

I'm going to try it out (it will take some time to compile), with
kmemleak disabled.

> So right now I suspect that you have a situation where you just have a
> simple load that just ends up never triggering any RCU cleanup, and
> the tiny-rcu thing just keeps on gathering events and delays freeing
> stuff almost arbitrarily long.

I hope tiny-rcu is not that broken... as it would mean driving any
PREEMPT_NONE or PREEMPT_VOLUNTARY system out of memory when compiling
packages (and probably also just unpacking larger tarballs or running
things like du).

And with the system doing nothing (except monitoring itself), memory
usage keeps increasing until it starves (well, it seems to keep ~20M
free, pushing whatever processes it can to swap). The config is just the
result of running make oldconfig on the config of a working 2.6.38
kernel (accepting the defaults for new options).

The memory usage evolution graph is in the first message of this thread:
http://thread.gmane.org/gmane.linux.kernel.mm/61909/focus=1130480

The attached graph matches the numbers from my previous mail (caches were
dropped at 17:55; the system has been idle since then).

Bruno


> So try CONFIG_PREEMPT and CONFIG_TREE_PREEMPT_RCU to see if the
> behavior goes away. That would confirm the "it's just tinyrcu being
> too dang stupid" hypothesis.
>
> Linus

Attachment: jupiter.png
Description: PNG image