[PATCH 3/3 v3] mm/vmalloc: Cache the vmalloc memory info

From: Ingo Molnar
Date: Sun Aug 23 2015 - 04:18:04 EST



* George Spelvin <linux@xxxxxxxxxxx> wrote:

> Ingo Molnar <mingo@xxxxxxxxxx> wrote:
> > I think this is too complex.
> >
> > How about something simple like the patch below (on top of the third patch)?
>
> > It makes the vmalloc info transactional - /proc/meminfo will always print a
> > consistent set of numbers. (Not that we really care about races there, but it
> > looks really simple to solve so why not.)
>
> Looks like a huge simplification!
>
> It needs a comment about the approximate nature of the locking and
> the obvious race conditions:
> 1) The first caller to get_vmalloc_info() clears vmap_info_changed
> before updating vmap_info_cache, so a second caller is likely to
> get stale data for the duration of a calc_vmalloc_info call.
> 2) Although unlikely, it's possible for two threads to race calling
> calc_vmalloc_info, and the one that computes fresher data updates
> the cache first, so the later write leaves stale data.
>
> Other issues:
> 3) Me, I'd make vmap_info_changed a bool, for documentation more than
> any space saving.
> 4) I wish there were a trylock version of write_seqlock, so we could
> avoid blocking entirely. (You *could* hand-roll it, but that eats
> into the simplicity.)
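
(A side note on #4: a trylock flavor can be hand-rolled, roughly like the sketch
below. It pokes at the seqlock_t internals directly, so treat it as an
illustration only, not a proposed API:

static inline bool write_tryseqlock(seqlock_t *sl)
{
	/* Opportunistically take the spinlock inside the seqlock: */
	if (!spin_trylock(&sl->lock))
		return false;

	/* We got the lock - open the write side of the seqcount: */
	write_seqcount_begin(&sl->seqcount);
	return true;
}

The unlock side would just be the regular write_sequnlock(). But as you say, it
eats into the simplicity.)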

Ok, fair enough - so how about the attached approach instead, which uses a 64-bit
generation counter to track changes to the vmalloc state.

This is still very simple, but should not suffer from stale data being returned
indefinitely in /proc/meminfo. We might still race - that was true before as well,
due to the lock-less RCU list walk - but we'll always return a correct and
consistent version of the information.
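
The shape of it is roughly the following - this is only an illustrative sketch,
not the patch itself (which follows below), and the vmap_info_gen /
vmap_info_cache_gen names are just for illustration:

/* All of this lives under CONFIG_PROC_FS: */
static DEFINE_SEQLOCK(vmap_info_lock);
static u64 vmap_info_gen;		/* bumped on every vmalloc state change */
static u64 vmap_info_cache_gen;		/* generation the cache was computed against */
static struct vmalloc_info vmap_info_cache;

/* Called from the vmap area add/remove paths: */
static void vmap_info_note_change(void)
{
	write_seqlock(&vmap_info_lock);
	vmap_info_gen++;
	write_sequnlock(&vmap_info_lock);
}

void get_vmalloc_info(struct vmalloc_info *vmi)
{
	u64 gen, cache_gen;
	unsigned int seq;

	/* Fast path: hand out the cached copy if it is still current: */
	do {
		seq = read_seqbegin(&vmap_info_lock);
		gen = vmap_info_gen;
		cache_gen = vmap_info_cache_gen;
		*vmi = vmap_info_cache;
	} while (read_seqretry(&vmap_info_lock, seq));

	if (cache_gen == gen)
		return;

	/* Slow path: recompute, then publish tagged with the generation we saw: */
	calc_vmalloc_info(vmi);

	write_seqlock(&vmap_info_lock);
	/* Only move the cache forward - a racing, fresher update wins: */
	if ((s64)(gen - vmap_info_cache_gen) > 0) {
		vmap_info_cache = *vmi;
		vmap_info_cache_gen = gen;
	}
	write_sequnlock(&vmap_info_lock);
}

Every reader that finds the cache out of date recomputes it (so concurrent
/proc/meminfo readers may duplicate some work), but none of them ever hands out
a half-updated or indefinitely stale set of numbers.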

Lightly tested. This is a replacement patch to make it easier to read via email.

I also made sure there's no extra overhead in the !CONFIG_PROC_FS case.

Note that there's an even simpler variant possible, I think: we could use just the
two generation counters plus barriers, and remove the seqlock entirely.

Thanks,

Ingo

==============================>