Re: [PATCH v1 08/12] powerpc/pseries: CMM: Implement balloon compaction

From: Michael Ellerman
Date: Wed Nov 13 2019 - 06:08:57 EST


David Hildenbrand <david@xxxxxxxxxx> writes:
> On 12.11.19 11:46, Michael Ellerman wrote:
>> David Hildenbrand <david@xxxxxxxxxx> writes:
>>> On 31.10.19 15:29, David Hildenbrand wrote:
>>>> We can now get rid of the cmm_lock and completely rely on the balloon
>>>> compaction internals, which now also manage the page list and the lock.
>> ...
>>>> +
>>>> +static int cmm_migratepage(struct balloon_dev_info *b_dev_info,
>>>> + struct page *newpage, struct page *page,
>>>> + enum migrate_mode mode)
>>>> +{
>>>> + unsigned long flags;
>>>> +
>>>> + /*
>>>> + * loan/"inflate" the newpage first.
>>>> + *
>>>> + * We might race against the cmm_thread who might discover after our
>>>> + * loan request that another page is to be unloaned. However, once
>>>> + * the cmm_thread runs again later, this error will automatically
>>>> + * be corrected.
>>>> + */
>>>> + if (plpar_page_set_loaned(newpage)) {
>>>> + /* Unlikely, but possible. Tell the caller not to retry now. */
>>>> + pr_err_ratelimited("%s: Cannot set page to loaned.", __func__);
>>>> + return -EBUSY;
>>>> + }
>>>> +
>>>> + /* balloon page list reference */
>>>> + get_page(newpage);
>>>> +
>>>> + spin_lock_irqsave(&b_dev_info->pages_lock, flags);
>>>> + balloon_page_insert(b_dev_info, newpage);
>>>> + balloon_page_delete(page);
>>>
>>> I think I am missing a b_dev_info->isolated_pages-- here.
>>
>> I don't know this code at all, but looking at other balloon drivers they
>> do seem to do that in roughly the same spot.
>>
>> I'll add it, how can we test that it's correct?
>
> It's certainly correct. We increment when we isolate
> (balloon_page_isolate()) and decrement when we un-isolate.
>
> Un-isolate happens when we putback a isolated page
> (balloon_page_putback() - migration aborted) or when we successfully
> migrate it (via balloon_page_migrate()).
>
> The issue is that we cannot decrement in balloon_page_migrate(), as we
> have to hold the b_dev_info->pages_lock. That's why we have to do it in
> the registered callback under lock.

OK, I get it now.

> Please note that b_dev_info->isolated_pages is only needed for a sanity
> check in balloon_page_dequeue(). That's why I didn't notice during
> testing. I wonder if we should at some point rip out that sanity check ...

OK. Sanity checks can be good, though checks that call BUG() are less
nice :) But I'm not an mm expert so I'll defer to you folks on the
sanity check.

For now I've merged this series with the decrement added to
cmm_migratepage().

cheers