Re: [PATCH 6/7] mm: parallelize deferred_init_memmap()

From: Josh Triplett
Date: Mon May 04 2020 - 19:44:39 EST


On May 4, 2020 3:33:58 PM PDT, Alexander Duyck <alexander.duyck@xxxxxxxxx> wrote:
>On Thu, Apr 30, 2020 at 1:12 PM Daniel Jordan
><daniel.m.jordan@xxxxxxxxxx> wrote:
>> /*
>> - * Initialize and free pages in MAX_ORDER sized increments so
>> - * that we can avoid introducing any issues with the buddy
>> - * allocator.
>> + * More CPUs always led to greater speedups on tested systems, up to
>> + * all the nodes' CPUs. Use all since the system is otherwise idle now.
>> */
>
>I would be curious about your data. That isn't what I have seen in the
>past. Typically only up to about 8 or 10 CPUs gives you any benefit,
>beyond that I was usually cache/memory bandwidth bound.

I've found pretty much linear scaling up to the memory bandwidth limit, and on the systems I was testing, I didn't saturate memory bandwidth until I was using about the full number of physical cores. From the number of physical cores up to the number of hardware threads, performance stayed about flat; it didn't get any better or worse.
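
The shape of that scaling can be sketched as a toy model in C. This is an illustration only, not measurement data: the function name and the saturation_threads parameter are hypothetical, and the model assumes perfectly linear scaling below the point where memory bandwidth saturates and a flat plateau above it.

```c
/*
 * Toy model (hypothetical, not from the patch): estimated speedup when
 * nr_threads workers initialize memory in parallel. Speedup is assumed
 * to grow linearly with the thread count until memory bandwidth
 * saturates at saturation_threads, and to stay flat beyond that.
 */
unsigned int modeled_speedup(unsigned int nr_threads,
			     unsigned int saturation_threads)
{
	/* No workers, or a degenerate saturation point: no speedup. */
	if (nr_threads == 0 || saturation_threads == 0)
		return 0;

	/* Linear region below saturation, flat plateau above it. */
	return nr_threads < saturation_threads ? nr_threads
					       : saturation_threads;
}
```

In this model, a system that saturates at 16 physical cores gives the same result for 16 threads as for 32, which corresponds to the flat region between the core count and the hardware thread count.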

- Josh