Re: [PATCH 2/8] xen/balloon: Move common memory reservation routines to a module
From: Boris Ostrovsky
Date: Wed May 30 2018 - 11:51:18 EST
On 05/30/2018 04:29 AM, Oleksandr Andrushchenko wrote:
> On 05/29/2018 11:03 PM, Boris Ostrovsky wrote:
>> On 05/29/2018 02:22 PM, Oleksandr Andrushchenko wrote:
>>> On 05/29/2018 09:04 PM, Boris Ostrovsky wrote:
>>>> On 05/25/2018 11:33 AM, Oleksandr Andrushchenko wrote:
>>>> @@ -463,11 +457,6 @@ static enum bp_state increase_reservation(unsigned long nr_pages)
>>>>      int rc;
>>>>      unsigned long i;
>>>>      struct page  *page;
>>>> -    struct xen_memory_reservation reservation = {
>>>> -        .address_bits = 0,
>>>> -        .extent_order = EXTENT_ORDER,
>>>> -        .domid        = DOMID_SELF
>>>> -    };
>>>>
>>>>      if (nr_pages > ARRAY_SIZE(frame_list))
>>>>          nr_pages = ARRAY_SIZE(frame_list);
>>>> @@ -486,9 +475,7 @@ static enum bp_state increase_reservation(unsigned long nr_pages)
>>>>          page = balloon_next_page(page);
>>>>      }
>>>>
>>>> -    set_xen_guest_handle(reservation.extent_start, frame_list);
>>>> -    reservation.nr_extents = nr_pages;
>>>> -    rc = HYPERVISOR_memory_op(XENMEM_populate_physmap, &reservation);
>>>> +    rc = xenmem_reservation_increase(nr_pages, frame_list);
>>>>
>>>>      if (rc <= 0)
>>>>          return BP_EAGAIN;
>>>> @@ -496,29 +483,7 @@ static enum bp_state increase_reservation(unsigned long nr_pages)
>>>>          page = balloon_retrieve(false);
>>>>          BUG_ON(page == NULL);
>>>>
>>>> -#ifdef CONFIG_XEN_HAVE_PVMMU
>>>> -        /*
>>>> -         * We don't support PV MMU when Linux and Xen is using
>>>> -         * different page granularity.
>>>> -         */
>>>> -        BUILD_BUG_ON(XEN_PAGE_SIZE != PAGE_SIZE);
>>>> -
>>>> -        if (!xen_feature(XENFEAT_auto_translated_physmap)) {
>>>> -            unsigned long pfn = page_to_pfn(page);
>>>> -
>>>> -            set_phys_to_machine(pfn, frame_list[i]);
>>>> -
>>>> -            /* Link back into the page tables if not highmem. */
>>>> -            if (!PageHighMem(page)) {
>>>> -                int ret;
>>>> -                ret = HYPERVISOR_update_va_mapping(
>>>> -                        (unsigned long)__va(pfn << PAGE_SHIFT),
>>>> -                        mfn_pte(frame_list[i], PAGE_KERNEL),
>>>> -                        0);
>>>> -                BUG_ON(ret);
>>>> -            }
>>>> -        }
>>>> -#endif
>>>> +        xenmem_reservation_va_mapping_update(1, &page, &frame_list[i]);
>>>>
>>>> Can you make a single call to xenmem_reservation_va_mapping_update(rc,
>>>> ...)? You need to keep track of pages but presumably they can be put
>>>> into an array (or a list). In fact, perhaps we can have
>>>> balloon_retrieve() return a set of pages.
>>> This is actually how it is used later on for dma-buf; I just didn't
>>> want to alter the original balloon code too much. But this can be done.
>>> In order of simplicity:
>>>
>>> 1. Similar to frame_list, a static array of struct page* of size
>>> ARRAY_SIZE(frame_list): more static memory is used, but no allocations
>>>
>>> 2. Allocate at run-time with kcalloc: allocation can fail
>>
>> If this is called in the DMA buffer freeing code path or in an error
>> path then we shouldn't do it.
>>
>>
>>> 3. Make balloon_retrieve() return a set of pages: will require
>>> list/array allocation and handling, allocation may fail, and
>>> balloon_retrieve's prototype changes
>>
>> balloon pages are strung on the lru list. Can we have
>> balloon_retrieve return a list of pages on that list?
> First of all, before we go deep into details, let me highlight the goal
> of the requested change: in the balloon driver we call
> xenmem_reservation_va_mapping_update(*1*, &page, &frame_list[i]);
> from increase_reservation and
> xenmem_reservation_va_mapping_reset(*1*, &page);
> from decrease_reservation, which is not elegant because only one
> page/frame is passed at a time while we might pass multiple pages/frames
> at once.
>
> In the balloon driver the producer of pages for increase_reservation
> is balloon_retrieve(false) and for decrease_reservation it is
> alloc_page(gfp).
> In case of decrease_reservation the page is added to a list:
> LIST_HEAD(pages);
> [...]
> list_add(&page->lru, &pages);
>
> and in case of increase_reservation pages are retrieved one by one
> and can be put on a list as well with the same code as in
> decrease_reservation, e.g.
> LIST_HEAD(pages);
> [...]
> list_add(&page->lru, &pages);
>
> Thus, both decrease_reservation and increase_reservation may hold
> their pages on a list before calling
> xenmem_reservation_va_mapping_{update|reset}.
>
> For that we need a prototype change:
> xenmem_reservation_va_mapping_reset(<nr_pages>, <list of pages>);
> But for xenmem_reservation_va_mapping_update it will look like:
> xenmem_reservation_va_mapping_update(<nr_pages>, <list of pages>,
> <array of frames>)
> which seems inconsistent. Converting entries of the static frame_list
> array into a corresponding list doesn't seem elegant either.
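>
> Just to make the inconsistency concrete, the list-based prototypes would
> presumably end up as something like this (only a sketch):
>
>     void xenmem_reservation_va_mapping_reset(unsigned long count,
>                                              struct list_head *pages);
>
>     void xenmem_reservation_va_mapping_update(unsigned long count,
>                                               struct list_head *pages,
>                                               xen_pfn_t *frames);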
>
> For the dma-buf use-case arrays are preferable, as dma-buf constructs
> scatter-gather tables from arrays of pages etc., so if a page list were
> passed it would need to be converted into a page array anyway.
>
> So, we can:
> 1. Keep the prototypes as is, i.e. accept an array of pages and use
> nr_pages == 1 in the case of the balloon driver (existing code)
> 2. Statically allocate a struct page* array in the balloon driver and
> fill it with pages as those pages are retrieved (see the sketch after
> this list):
> static struct page *page_list[ARRAY_SIZE(frame_list)];
> which will take an additional 8KiB of space on a 64-bit platform, but
> simplify things a lot.
> 3. Allocate struct page *page_list[ARRAY_SIZE(frame_list)] dynamically
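>
> For illustration, option 2 in increase_reservation could look roughly
> like this (just a sketch; page_list is a made-up name for a file-scope
> static array next to frame_list):
>
>     static struct page *page_list[ARRAY_SIZE(frame_list)];
> [...]
>     rc = xenmem_reservation_increase(nr_pages, frame_list);
>     if (rc <= 0)
>         return BP_EAGAIN;
>
>     for (i = 0; i < rc; i++) {
>         page = balloon_retrieve(false);
>         BUG_ON(page == NULL);
>         /* remember the page so all of them can be passed in one go */
>         page_list[i] = page;
>     }
>
>     /* one call covering all rc pages instead of one call per page */
>     xenmem_reservation_va_mapping_update(rc, page_list, frame_list);
>
>     for (i = 0; i < rc; i++) {
>         /* relinquish each page back to the allocator as before */
> [...]
>     }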
>
> As to Boris' suggestion "balloon pages are strung on the lru list. Can
> we have balloon_retrieve return a list of pages on that list?":
> because of alloc_xenballooned_pages' retry logic for page retrieval, e.g.
>     while (pgno < nr_pages) {
>         page = balloon_retrieve(true);
>         if (page) {
> [...]
>         } else {
>             ret = add_ballooned_pages(nr_pages - pgno);
> [...]
>     }
> I wouldn't change things that much.
>
> IMO, we can keep the 1-page based API, with the only overhead for the
> balloon driver being a call to
> xenmem_reservation_va_mapping_{update|reset} for each page.
I still think what I suggested is doable but we can come back to it
later and keep your per-page implementation for now.
BTW, I also think you can further simplify
xenmem_reservation_va_mapping_* routines by bailing out right away if
xen_feature(XENFEAT_auto_translated_physmap). In fact, you might even
make them inlines, along the lines of
inline void xenmem_reservation_va_mapping_reset(unsigned long count,
                                                struct page **pages)
{
#ifdef CONFIG_XEN_HAVE_PVMMU
        if (!xen_feature(XENFEAT_auto_translated_physmap))
                __xenmem_reservation_va_mapping_reset(...)
#endif
}
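and presumably the same kind of wrapper for the update one, e.g.

inline void xenmem_reservation_va_mapping_update(unsigned long count,
                                                 struct page **pages,
                                                 xen_pfn_t *frames)
{
#ifdef CONFIG_XEN_HAVE_PVMMU
        if (!xen_feature(XENFEAT_auto_translated_physmap))
                /* name of the PV-MMU-only helper is just for illustration */
                __xenmem_reservation_va_mapping_update(count, pages, frames);
#endif
}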
Or some such.
-boris