Re: [PATCH] mm: reuse the unshared swapcache page in do_wp_page

From: David Hildenbrand
Date: Thu Jan 20 2022 - 15:37:54 EST


On 20.01.22 21:09, David Hildenbrand wrote:
> On 20.01.22 21:07, Matthew Wilcox wrote:
>> On Thu, Jan 20, 2022 at 08:55:12PM +0100, David Hildenbrand wrote:
>>>>>> David, does any of it regard the lru_cache_add() reference issue that I
>>>>>> mentioned? [1]
>>
>>> +++ b/mm/memory.c
>>> @@ -3291,19 +3291,28 @@ static vm_fault_t do_wp_page(struct vm_fault *vmf)
>>>  	if (PageAnon(vmf->page)) {
>>>  		struct page *page = vmf->page;
>>>  
>>> -		/* PageKsm() doesn't necessarily raise the page refcount */
>>> -		if (PageKsm(page) || page_count(page) != 1)
>>> +		/*
>>> +		 * PageKsm() doesn't necessarily raise the page refcount.
>>> +		 *
>>> +		 * These checks are racy as long as we haven't locked the page;
>>> +		 * they are a pure optimization to avoid trying to lock the page
>>> +		 * and trying to free the swap cache when there is little hope
>>> +		 * it will actually result in a refcount of 1.
>>> +		 */
>>> +		if (PageKsm(page) || page_count(page) > 1 + PageSwapCache(page))
>>>  			goto copy;
>>>  		if (!trylock_page(page))
>>>  			goto copy;
>>> -		if (PageKsm(page) || page_mapcount(page) != 1 || page_count(page) != 1) {
>>> +		if (PageSwapCache(page))
>>> +			try_to_free_swap(page);
>>> +		if (PageKsm(page) || page_count(page) != 1) {
>>>  			unlock_page(page);
>>>  			goto copy;
>>>  		}
>>>  		/*
>>> -		 * Ok, we've got the only map reference, and the only
>>> -		 * page count reference, and the page is locked,
>>> -		 * it's dark out, and we're wearing sunglasses. Hit it.
>>> +		 * Ok, we've got the only page reference from our mapping
>>> +		 * and the page is locked, it's dark out, and we're wearing
>>> +		 * sunglasses. Hit it.
>>>  		 */
>>>  		unlock_page(page);
>>>  		wp_page_reuse(vmf);
>>>
>>>
>>> I added some vmstats that monitor various paths. After one run of
>>> ./forceswap 2 1000000 1
>>> I'm left with a rough delta (including some noise) of
>>> anon_wp_copy_count 1799
>>> anon_wp_copy_count_early 1
>>> anon_wp_copy_lock 983396
>>> anon_wp_reuse 0
>>>
>>> The relevant part of your reproducer is
>>>
>>> for (i = 0; i < nops; i++) {
>>> 	if (madvise((void *)p, PAGE_SIZE * npages, MADV_PAGEOUT)) {
>>> 		perror("madvise");
>>> 		exit(-1);
>>> 	}
>>>
>>> 	for (j = 0; j < npages; j++) {
>>> 		c = p[j * PAGE_SIZE];
>>> 		c++;
>>> 		time -= rdtscp();
>>> 		p[j * PAGE_SIZE] = c;
>>> 		time += rdtscp();
>>> 	}
>>> }
>>>
>>> For this specific reproducer at least, the page lock seems to be the thingy that prohibits
>>> reuse if I interpret the numbers correctly. We pass the initial page_count() check.
>>>
>>> Haven't looked into the details, and I would be curious how that performs with actual
>>> workloads, if we can reproduce similar behavior.
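
(FWIW, the vmstats above are just local debug counters, conceptually hooked
up along these lines -- illustrative only, not part of the proposed patch:)

/* include/linux/vm_event_item.h: debug-only additions to enum vm_event_item */
	ANON_WP_REUSE,			/* reused via wp_page_reuse() */
	ANON_WP_COPY_COUNT,		/* copied: page_count() check failed under the page lock */
	ANON_WP_COPY_COUNT_EARLY,	/* copied: early racy page_count() check failed */
	ANON_WP_COPY_LOCK,		/* copied: trylock_page() failed */

/* mm/memory.c, do_wp_page(): e.g. on the trylock failure path */
	if (!trylock_page(page)) {
		count_vm_event(ANON_WP_COPY_LOCK);
		goto copy;
	}

/* plus matching "anon_wp_*" strings in vmstat_text[] in mm/vmstat.c */
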
>>
>> I don't see how that patch addresses the lru issue. Wouldn't we need
>> something like ...
>>
>> 	if (!PageLRU(page))
>> 		lru_add_drain_all();
>>

lru_add_drain_all() takes a mutex ... the best we can do, I guess, is to
drain the local CPU using lru_add_drain(). I'll go play with it and see what
breaks :)
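
Something like the following on top of the above is what I have in mind
(completely untested sketch, line numbers elided, exact placement of the
drain still to be figured out):

--- a/mm/memory.c
+++ b/mm/memory.c
@@ ... @@ static vm_fault_t do_wp_page(struct vm_fault *vmf)
 	if (PageAnon(vmf->page)) {
 		struct page *page = vmf->page;
 
+		/*
+		 * A page that was just added to the LRU cache (e.g., after
+		 * swapin) might still sit in a local pagevec, which holds an
+		 * additional reference and makes the page_count() checks
+		 * fail. Drain only this CPU; unlike lru_add_drain_all(),
+		 * lru_add_drain() doesn't take a mutex.
+		 */
+		if (!PageLRU(page))
+			lru_add_drain();
+
 		/*
 		 * PageKsm() doesn't necessarily raise the page refcount.

This won't help if the extra pagevec reference lives on a remote CPU, but it
avoids the lru_add_drain_all() mutex entirely.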

--
Thanks,

David / dhildenb