Re: [patch 1/2] mm, zone: track number of pages in free area by migratetype

From: Vlastimil Babka
Date: Fri Nov 18 2016 - 15:59:00 EST

On 11/17/2016 11:11 PM, David Rientjes wrote:
> On Thu, 17 Nov 2016, Vlastimil Babka wrote:
>>> The total number of free pages is still tracked, however, to not make
>>> zone_watermark_ok() more expensive. Reading /proc/pagetypeinfo, however,
>>> is faster.
>> Yeah I've already seen a case with /proc/pagetypeinfo causing soft
>> lockups due to high number of iterations...
> Thanks for taking a look at the patchset!
> Wow, I haven't seen /proc/pagetypeinfo soft lockups yet, I thought this
> was a relatively minor point :)

Well to be honest, it was a system misconfigured with numa=off which
made the lists both longer and more numa-distant. But nevertheless, we
might get there. It's not nice when userspace can so easily trigger long
iterations under the zone/node lock...
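
To illustrate the difference, here is a minimal userspace sketch, not kernel code. All the sketch_* names are made up for illustration; the real free lists are per-order, per-migratetype list_heads protected by the zone lock. Counting by walking the list costs one step per free page, while a counter maintained on every add is a single read:

```c
#include <stddef.h>

/* Hypothetical stand-in for a page linked on a free list. */
struct sketch_page {
	struct sketch_page *next;
};

/* Hypothetical free list that also maintains a running count. */
struct sketch_free_list {
	struct sketch_page *head;
	unsigned long nr_free;	/* updated on every add */
};

static void sketch_add(struct sketch_free_list *list, struct sketch_page *page)
{
	page->next = list->head;
	list->head = page;
	list->nr_free++;
}

/* O(n): the kind of walk a reader has to do when no counter exists. */
static unsigned long sketch_count_by_walk(const struct sketch_free_list *list)
{
	unsigned long n = 0;

	for (const struct sketch_page *p = list->head; p; p = p->next)
		n++;
	return n;
}

/* O(1): just read the maintained counter. */
static unsigned long sketch_count_by_counter(const struct sketch_free_list *list)
{
	return list->nr_free;
}
```

The price of the O(1) read is the extra increment (and the matching decrement on removal) in the paths that modify the list.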

> But it looks like we need some
> improvement in this behavior independent of memory compaction anyway.


>>> This patch introduces no functional change and increases the amount of
>>> per-zone metadata at worst by 48 bytes per memory zone (when CONFIG_CMA
>>> and CONFIG_MEMORY_ISOLATION are enabled).
>> Isn't it 48 bytes per zone and order?
> Yes, sorry, I'll fix that in v2. I think less than half a kilobyte for
> each memory zone is satisfactory for extra tracking, compaction
> improvements, and optimized /proc/pagetypeinfo, though.

I'm not worried about memory usage, but perhaps cache usage.
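
For reference, a rough sketch of where the per-order figure comes from. The values below are assumptions on my side (six migratetypes with CONFIG_CMA and CONFIG_MEMORY_ISOLATION enabled, 11 orders, one unsigned long per counter), and the struct is a hypothetical stand-in, not the layout from the patch:

```c
#include <stddef.h>

/* Assumed values, not taken from the patch itself. */
#define SKETCH_MAX_ORDER	11
#define SKETCH_MIGRATE_TYPES	6

/* Hypothetical stand-in for the extra counters added to each free area. */
struct sketch_free_area_counts {
	unsigned long nr_free[SKETCH_MIGRATE_TYPES];
};

/* Extra bytes for one order: one counter per migratetype. */
static size_t sketch_extra_bytes_per_order(void)
{
	return sizeof(struct sketch_free_area_counts);
}

/* Extra bytes for a whole zone: the per-order cost times the number of orders. */
static size_t sketch_extra_bytes_per_zone(void)
{
	return SKETCH_MAX_ORDER * sketch_extra_bytes_per_order();
}
```

With an 8-byte unsigned long the per-order cost is 6 * 8 = 48 bytes, and the per-zone cost scales with the number of orders. The counters touched in the allocation and free paths are the part that matters for cache footprint.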

>>> Signed-off-by: David Rientjes <rientjes@xxxxxxxxxx>
>> I'd be for this if there are no performance regressions. It affects hot
>> paths and increases cache footprint. I think at least some allocator
>> intensive microbenchmark should be used.
> I can easily implement a test to stress movable page allocations from
> fallback MIGRATE_UNMOVABLE pageblocks and freeing back to the same
> pageblocks. I assume we're not interested in memory offline benchmarks.

I meant just allocation benchmarks to see how much the extra operations
and cache footprint matter.

> What do you think about the logic presented in patch 2/2? Are you
> comfortable with a hard-coded ratio such as 1/64th of free memory or would
> you prefer to look at the zone's watermark with the number of free pages
> from MIGRATE_MOVABLE pageblocks rather than NR_FREE_PAGES? I was split
> between the two options.

The second option makes more sense to me intuitively as it resembles
what we've been doing until now. Maybe just don't require such a large
gap as compaction_suitable does?
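
Roughly, the two options could be sketched like this. This is only my reading of them, not the code from patch 2/2: the struct, its fields and all sketch_* names are hypothetical, and the 2UL << order gap is an assumption about what compaction_suitable() requires:

```c
#include <stdbool.h>

/* Hypothetical snapshot of the zone counters the check would look at. */
struct sketch_zone {
	unsigned long free_pages;		/* all free pages (NR_FREE_PAGES) */
	unsigned long free_movable_pages;	/* free pages in MIGRATE_MOVABLE pageblocks */
	unsigned long low_wmark_pages;		/* the zone's low watermark */
};

/* Option 1: hard-coded ratio, movable free pages vs. 1/64th of free memory. */
static bool sketch_ratio_check(const struct sketch_zone *z)
{
	return z->free_movable_pages >= z->free_pages / 64;
}

/* Option 2: watermark check using movable free pages instead of NR_FREE_PAGES. */
static bool sketch_wmark_check(const struct sketch_zone *z, unsigned long gap)
{
	return z->free_movable_pages >= z->low_wmark_pages + gap;
}

/* Assumed gap on top of the watermark; a smaller one could be passed instead. */
static unsigned long sketch_large_gap(unsigned int order)
{
	return 2UL << order;
}
```

Passing the gap as a parameter makes it easy to experiment with something smaller than sketch_large_gap(order) for this particular check.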
