Re: [PATCH v1 08/12] powerpc/pseries: CMM: Implement balloon compaction

From: David Hildenbrand
Date: Tue Nov 12 2019 - 06:12:49 EST


On 12.11.19 11:46, Michael Ellerman wrote:
David Hildenbrand <david@xxxxxxxxxx> writes:
On 31.10.19 15:29, David Hildenbrand wrote:
We can now get rid of the cmm_lock and completely rely on the balloon
compaction internals, which now also manage the page list and the lock.
...
+
+static int cmm_migratepage(struct balloon_dev_info *b_dev_info,
+ struct page *newpage, struct page *page,
+ enum migrate_mode mode)
+{
+ unsigned long flags;
+
+ /*
+ * loan/"inflate" the newpage first.
+ *
+ * We might race against the cmm_thread who might discover after our
+ * loan request that another page is to be unloaned. However, once
+ * the cmm_thread runs again later, this error will automatically
+ * be corrected.
+ */
+ if (plpar_page_set_loaned(newpage)) {
+ /* Unlikely, but possible. Tell the caller not to retry now. */
+ pr_err_ratelimited("%s: Cannot set page to loaned.", __func__);
+ return -EBUSY;
+ }
+
+ /* balloon page list reference */
+ get_page(newpage);
+
+ spin_lock_irqsave(&b_dev_info->pages_lock, flags);
+ balloon_page_insert(b_dev_info, newpage);
+ balloon_page_delete(page);

I think I am missing a b_dev_info->isolated_pages-- here.

I don't know this code at all, but looking at other balloon drivers they
do seem to do that in roughly the same spot.

I'll add it, how can we test that it's correct?

It's certainly correct. We increment when we isolate (balloon_page_isolate()) and decrement when we un-isolate.

Un-isolate happens when we putback a isolated page (balloon_page_putback() - migration aborted) or when we successfully migrate it (via balloon_page_migrate()).

The issue is that we cannot decrement in balloon_page_migrate(), as we have to hold the b_dev_info->pages_lock. That's why we have to do it in the registered callback under lock.

Please note that b_dev_info->isolated_pages is only needed for a sanity check in balloon_page_dequeue(). That's why I didn't notice during testing. I wonder if we should at some point rip out that sanity check ...

Thanks and cheers!


cheers



--

Thanks,

David / dhildenb