Re: [PATCH] mm/vmstat: Reduce zone lock hold time when reading /proc/pagetypeinfo

From: Waiman Long
Date: Tue Oct 22 2019 - 14:40:21 EST


On 10/22/19 2:00 PM, Waiman Long wrote:
> On 10/22/19 12:57 PM, Michal Hocko wrote:
>
>>> and used nr_free to compute the missing count. Since MIGRATE_MOVABLE
>>> is usually the largest one on large memory systems, this is the one
>>> to be skipped. Since the printing order is migration-type => order, we
>>> will have to store the counts in an internal 2D array before printing
>>> them out.
>>>
>>> Even by skipping the MIGRATE_MOVABLE pages, we may still be holding the
>>> zone lock for too long blocking out other zone lock waiters from being
>>> run. This can be problematic for systems with large amount of memory.
>>> So a check is added to temporarily release the lock and reschedule if
>>> more than 64k of list entries have been iterated for each order. With
>>> a MAX_ORDER of 11, the worst case will be iterating about 700k of list
>>> entries before releasing the lock.
>> But you are still iterating through the whole free_list at once so if it
>> gets really large then this is still possible. I think it would be
>> preferable to use per migratetype nr_free if it doesn't cause any
>> regressions.
>>
> Yes, it is still theoretically possible. I will take a further look at
> having per-migrate type nr_free. BTW, there is one more place where the
> free lists are being iterated with zone lock held - mark_free_pages().

Looking deeper into the code, the exact migration type is not stored in
the page itself. A page that starts out as movable can be stolen and put
on the free list of another migration type. So when we delete or move a
page from a free_area, we don't know exactly which migration type's list
it is coming from. IOW, it is hard to get accurate counts of the number
of entries in each list.
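
To make the problem concrete, here is a rough userspace sketch, not
kernel code. All names in it (struct node, struct area, area_del() and
so on) are made up for illustration, and nr_free_mt[] is the
hypothetical per-type counter under discussion. The list linkage does
not record which list a node is on, so the caller has to guess the
source type on a delete or move:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins; every name here is hypothetical, not kernel API. */
enum { MT_UNMOVABLE, MT_MOVABLE, MT_RECLAIMABLE, MT_TYPES };

/* Stand-in for a page's list linkage: it does not say which list it is on. */
struct node { struct node *prev, *next; };

struct area {
	struct node head[MT_TYPES];	/* one free list per migration type */
	long nr_free;			/* total count, always exact */
	long nr_free_mt[MT_TYPES];	/* hypothetical per-type counts */
};

static void area_init(struct area *a)
{
	a->nr_free = 0;
	for (int mt = 0; mt < MT_TYPES; mt++) {
		a->head[mt].prev = a->head[mt].next = &a->head[mt];
		a->nr_free_mt[mt] = 0;
	}
}

/* Adding is easy: the destination type is always known. */
static void area_add(struct area *a, struct node *n, int mt)
{
	n->next = a->head[mt].next;
	n->prev = &a->head[mt];
	a->head[mt].next->prev = n;
	a->head[mt].next = n;
	a->nr_free++;
	a->nr_free_mt[mt]++;
}

/* Deleting is not: the source type is only the caller's guess. */
static void area_del(struct area *a, struct node *n, int guessed_mt)
{
	assert(a->nr_free > 0);
	n->prev->next = n->next;
	n->next->prev = n->prev;
	a->nr_free--;
	a->nr_free_mt[guessed_mt]--;	/* wrong whenever the guess is stale */
}

static void area_move(struct area *a, struct node *n, int guessed_from, int to)
{
	area_del(a, n, guessed_from);
	area_add(a, n, to);
}

/* Ground truth: walk the list, which is what the patch has to do today. */
static long area_walk(const struct area *a, int mt)
{
	long count = 0;

	for (const struct node *n = a->head[mt].next; n != &a->head[mt]; n = n->next)
		count++;
	return count;
}
```

If a node is added as MT_MOVABLE, moved to MT_UNMOVABLE, and then
deleted with a stale MT_MOVABLE guess, nr_free stays correct but the
per-type counters drift apart from what area_walk() reports.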

I am not saying this is impossible, but doing it may require stealing
some bits from the page structure to store this information, and that
cost is probably not worth the benefit we would get from it. So if you
have any good suggestions on how to do it without too much cost, please
let me know. Otherwise, I will probably stay with the current patch.
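
For reference, the counting scheme in the current patch (walk every
list except the skipped type, derive the skipped type from nr_free, and
buffer everything in a 2D array so it can be printed in
migration-type => order) boils down to something like the userspace
sketch below. This is not the patch itself: the reduced type set and
the names fill_skipped() and print_counts() are made up for
illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Reduced, hypothetical set of types; the kernel has more. */
enum { MT_UNMOVABLE, MT_MOVABLE, MT_RECLAIMABLE, MT_TYPES };
#define NR_ORDERS 11	/* MAX_ORDER of 11, as in the patch description */

/*
 * counts[mt][order] already holds the walked list lengths for every type
 * except 'skipped'; nr_free[order] is the per-order total. Derive the
 * skipped row by subtraction instead of walking its (largest) lists.
 */
static void fill_skipped(unsigned long counts[MT_TYPES][NR_ORDERS],
			 const unsigned long nr_free[NR_ORDERS], int skipped)
{
	for (int order = 0; order < NR_ORDERS; order++) {
		unsigned long walked = 0;

		for (int mt = 0; mt < MT_TYPES; mt++)
			if (mt != skipped)
				walked += counts[mt][order];
		assert(walked <= nr_free[order]);
		counts[skipped][order] = nr_free[order] - walked;
	}
}

/* Output is type-major, which is why the counts are buffered first. */
static void print_counts(FILE *out, unsigned long counts[MT_TYPES][NR_ORDERS])
{
	for (int mt = 0; mt < MT_TYPES; mt++) {
		fprintf(out, "type %d:", mt);
		for (int order = 0; order < NR_ORDERS; order++)
			fprintf(out, " %6lu", counts[mt][order]);
		fputc('\n', out);
	}
}
```

The subtraction only works because nr_free is kept per order, so the
walk has to be order-major while the output is type-major.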

Cheers,
Longman