Re: [PATCH v2 1/3] mm: Shuffle initial free memory

From: Michal Hocko
Date: Thu Oct 04 2018 - 03:48:44 EST


On Wed 03-10-18 19:15:24, Dan Williams wrote:
> Some data exfiltration and return-oriented-programming attacks rely on
> the ability to infer the location of sensitive data objects. The kernel
> page allocator, especially early in system boot, has predictable
> first-in-first-out behavior for physical pages. Pages are freed in
> physical address order when first onlined.
>
> Introduce shuffle_free_memory(), and its helper shuffle_zone(), to
> perform a Fisher-Yates shuffle of the page allocator 'free_area' lists
> when they are initially populated with free memory at boot and at
> hotplug time.
>
> Quoting Kees:
> "While we already have a base-address randomization
> (CONFIG_RANDOMIZE_MEMORY), attacks against the same hardware and
> memory layouts would certainly be using the predictability of
> allocation ordering (i.e. for attacks where the base address isn't
> important: only the relative positions between allocated memory).
> This is common in lots of heap-style attacks. They try to gain
> control over ordering by spraying allocations, etc.
>
> I'd really like to see this because it gives us something similar
> to CONFIG_SLAB_FREELIST_RANDOM but for the page allocator."
>
> Another motivation for this change is performance in the presence of a
> memory-side cache. In the future, memory-side-cache technology will be
> available on generally available server platforms. The proposed
> randomization approach has been measured to improve the cache conflict
> rate by a factor of 2.5X on a well-known Java benchmark. It avoids
> performance peaks and valleys to provide more predictable performance.
>
> While SLAB_FREELIST_RANDOM reduces the predictability of some local slab
> caches, it leaves the vast bulk of memory to be allocated in a
> predictable order. That ordering can be detected by a memory-side cache.
>
> The shuffling is done in terms of 'shuffle_page_order' sized free pages
> where the default shuffle_page_order is MAX_ORDER-1, i.e. 10 (4MB). This
> trades off randomization granularity for time spent shuffling.
> MAX_ORDER-1 was chosen to be minimally invasive to the page allocator
> while still showing memory-side cache behavior improvements.
>
> The performance impact of the shuffling appears to be in the noise
> compared to other memory initialization work. Also the bulk of the work
> is done in the background as a part of deferred_init_memmap().
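
For readers following the thread, the shuffle described in the quoted
changelog can be sketched in user-space C. This is only an illustration of
Fisher-Yates at order-10 granularity, not the code from the patch: the
4KB page size, the names block_bytes(), next_random() and shuffle_blocks(),
and the xorshift generator standing in for the kernel's random source are
all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4KB base pages (PAGE_SHIFT of 12), as on x86-64. */
#define PAGE_SHIFT 12
/* Shuffle granularity quoted in the changelog: MAX_ORDER-1, i.e. 10. */
#define SHUFFLE_ORDER 10

/* Bytes covered by one free block of the given order (2^order pages). */
static size_t block_bytes(unsigned int order)
{
	return (size_t)1 << (order + PAGE_SHIFT);
}

/*
 * Hypothetical stand-in for a kernel random source: a plain xorshift64
 * step. It is not cryptographically secure and is used here only so the
 * example is self-contained.
 */
static uint64_t next_random(uint64_t *state)
{
	uint64_t x = *state;

	x ^= x << 13;
	x ^= x >> 7;
	x ^= x << 17;
	*state = x;
	return x;
}

/*
 * Fisher-Yates shuffle over an array of block indices: walk from the
 * last slot down and swap each slot with a randomly chosen slot at or
 * below it. The modulo introduces a small bias, which is ignored in
 * this sketch.
 */
static void shuffle_blocks(size_t *blocks, size_t n, uint64_t *rng_state)
{
	assert(*rng_state != 0); /* xorshift never leaves the zero state */

	for (size_t i = n; i > 1; i--) {
		size_t j = (size_t)(next_random(rng_state) % i);
		size_t tmp = blocks[i - 1];

		blocks[i - 1] = blocks[j];
		blocks[j] = tmp;
	}
}
```

Each array entry stands for one MAX_ORDER-1 block, so shuffling the array
models reordering 4MB chunks rather than individual pages. That is the
granularity-versus-time trade-off the changelog mentions.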

This is the biggest portion of the series and I am wondering why we
need it at all. Why isn't it sufficient to rely on patch 3 here?
Pages freed from the bootmem allocator go via the same path, so they
could be shuffled at that time. Or is there any problem with that?
Is there not enough entropy at the time this is called, or is the final
result not randomized enough? (Some numbers would be helpful.)
--
Michal Hocko
SUSE Labs