Re: get_vmalloc_info() and /proc/meminfo insanely expensive

From: Linus Torvalds
Date: Thu Aug 13 2015 - 14:33:12 EST


On Wed, Aug 12, 2015 at 10:52 PM, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> I get the feeling this file should be rewritten. But that's not going
> to happen. The "let's just cache the last value for one jiffy" seemed
> to be the minimal fixup to it.

Here's a totally untested patch (I'll reboot and test soon - it does
at least compile for me).

Notice the total lack of locking, which means that it's fundamentally
racy. I really can't find it inside myself to care. Introducing those
vmalloc fields was a mistake to begin with, any races here are "ok, we
get values that were valid at some point, but it might have been a
second ago".

And I also can't find it in myself to care about the "on 32-bit,
jiffies wraps in 49 days if HZ is 1000". If somebody carefully avoids
ever reading /proc/meminfo for 49 days, and then reads it in _just_
the right second, and gets a really stale value, I'm just going to do
my honey badger impression.

Because we really shouldn't have added the vmalloc information to
/proc/meminfo to begin with, and nobody ever cares about those values
anyway.

Comments?

Linus
mm/vmalloc.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 48 insertions(+), 2 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 2faaa2976447..0d0b96ed8948 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2688,7 +2688,7 @@ static int __init proc_vmalloc_init(void)
}
module_init(proc_vmalloc_init);

-void get_vmalloc_info(struct vmalloc_info *vmi)
+static void calc_vmalloc_info(struct vmalloc_info *vmi)
{
struct vmap_area *va;
unsigned long free_area_size;
@@ -2735,5 +2735,51 @@ void get_vmalloc_info(struct vmalloc_info *vmi)
out:
rcu_read_unlock();
}
-#endif

+/*
+ * Calculating the vmalloc information is expensive, and nobody really
+ * deeply cares about it anyway. Yet, some versions of glibc end up
+ * reading /proc/meminfo a lot, not because they care about the vmalloc
+ * fields, but because they care about the total memory info.
+ *
+ * So to alleviate that expense, we cache the vmalloc information for
+ * a second. NOTE! This is fundamentally racy, since the accesses to
+ * the two fields in "struct vmalloc_info" and the cache timeout are
+ * all entirely unsynchronized. We just don't care.
+ */
+void get_vmalloc_info(struct vmalloc_info *vmi)
+{
+ static unsigned long cache_timeout = INITIAL_JIFFIES;
+ static struct vmalloc_info cached_info;
+ unsigned long now = jiffies, last = READ_ONCE(cache_timeout);
+
+ if (now - last < HZ) {
+ *vmi = cached_info;
+ return;
+ }
+
+ /*
+ * We update the cache timeout early, because we (again) do
+ * not care if somebody else comes in and sees slightly stale
+ * information. We'd rather return more stale information
+ * than waste time with multiple CPU's all calculating the
+ * new state.
+ *
+ * Note: the barriers are here not to fix any races, but to
+ * avoid the compiling spreading out the updates to these
+ * variables any more than necessary.
+ *
+ * Also note that we calculate the new state into the 'vmi'
+ * buffer that is passed in, and private to the caller. That
+ * is intentional: we do not want to update the cached info
+ * incrementally during the calculations.
+ */
+ WRITE_ONCE(cache_timeout, now);
+ barrier();
+
+ calc_vmalloc_info(vmi);
+
+ barrier();
+ cached_info = *vmi;
+}
+#endif