Re: get_vmalloc_info() and /proc/meminfo insanely expensive

From: Linus Torvalds
Date: Thu Aug 13 2015 - 16:04:24 EST

On Thu, Aug 13, 2015 at 12:50 PM, David Rientjes <rientjes@xxxxxxxxxx> wrote:
> Rather than a time based approach, why not invalidate when it is known to
> change, either on the next call to get_vmalloc_info() or as part of the
> vmalloc operation itself?

I started doing that, looking at all the users of vmap_area_lock, and
decided that it's just too messy for me. I wanted something minimal
and obvious. The vmap_area handling is not obvious, especially since
the whole vmalloc statistics don't just depend on the vmap area list,
but also take the individual flags into account (ie it counts
VM_LAZY_FREE[ING] entries as having already been removed etc).

So I started looking at actually turning the vmap_area_lock into a
seqlock or even a rwlock (because some of the users are pure readers)
just to clarify things, and that wasn't hard per se, but I threw up my
hands in disgust. The code that modifies the thing is full of "take
lock, look up,. drop lock, do something, take lock again") etc.

Yes, we could just sprinkle "invalidate_cache = 1" statements around
in all the places that can cause this, but that would be pretty ad-hoc
and ugly too. And since the reader side is actually entirely lockless
(currently using rcu, with my patch it has the additional lockless and
racy cache reading), to do it *right* you actually need to use at a
minimum memory ordering rules. So it would be fairly subtle too.

In contrast, my "just base it on time" sure as hell isn't subtle. It's
not great, but at least it's obvious and the crazy logic is

So I'd be all for somebody actually taking the time to do it right.
I'm pretty sure nobody really cares enough, though.

Willing to prove me wrong?

To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at
Please read the FAQ at