Re: [Xen-devel] [PATCH] xen/p2m: fix extra memory regions accounting

From: Juergen Gross
Date: Fri Sep 04 2015 - 04:07:55 EST


On 09/04/2015 09:57 AM, Roger Pau Monnà wrote:
El 04/09/15 a les 9.47, Juergen Gross ha escrit:
On 09/04/2015 09:37 AM, Roger Pau Monnà wrote:
Hello,

El 04/09/15 a les 7.07, Juergen Gross ha escrit:
Could you try the attached patch? It should do the job. It is booting
fine on my laptop, but I think you should try it on the machine with
the memory ranges not at page boundary.


Juergen


extramem.patch


commit 3d0f8aa4d1b4c9c16a81902a197b5d6a77e182a0
Author: Juergen Gross <jgross@xxxxxxxx>
Date: Thu Sep 3 17:44:04 2015 +0200

xen: switch extra memory accounting to use pfns

Instead of using physical addresses for accounting of extra memory
areas available for ballooning switch to pfns as this is much less
error prone regarding partial pages.

Thanks for taking care of this! I've tested it on the box that has
non-page aligned memory ranges and it works fine, only a couple of
comments below.

Reported-by: Roger Pau MonnÃÂ <roger.pau@xxxxxxxxxx>
Signed-off-by: Juergen Gross <jgross@xxxxxxxx>

diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
index 7a5d566..aa58bc4 100644
--- a/arch/x86/xen/setup.c
+++ b/arch/x86/xen/setup.c
@@ -90,62 +90,65 @@ static void __init xen_parse_512gb(void)
xen_512gb_limit = val;
}

-static void __init xen_add_extra_mem(phys_addr_t start, phys_addr_t
size)
+static void __init xen_add_extra_mem(unsigned long start_pfn,
+ unsigned long n_pfns)

Not very important, but for type consistency this should probably be
xen_pfn_t instead of unsigned long here and below.

All of the p2m code is using unsigned long for pfns. I wouldn't mind
changing this to use xen_pfn_t instead, but this should be done in a
separate patch. I'll put it on my list.


{
int i;

for (i = 0; i < XEN_EXTRA_MEM_MAX_REGIONS; i++) {
/* Add new region. */
- if (xen_extra_mem[i].size == 0) {
- xen_extra_mem[i].start = start;
- xen_extra_mem[i].size = size;
+ if (xen_extra_mem[i].n_pfns == 0) {
+ xen_extra_mem[i].start_pfn = start_pfn;
+ xen_extra_mem[i].n_pfns = n_pfns;
break;
}
/* Append to existing region. */
- if (xen_extra_mem[i].start + xen_extra_mem[i].size == start) {
- xen_extra_mem[i].size += size;
+ if (xen_extra_mem[i].start_pfn + xen_extra_mem[i].n_pfns ==
+ start_pfn) {
+ xen_extra_mem[i].n_pfns += n_pfns;
break;
}

I also noticed this with the original code, why isn't there a case to
prepend to an existing region:

if (start_pfn + n_pfns == xen_extra_mem[i].start_pfn) {
xen_extra_mem[i].n_pfns += n_pfns;
xen_extra_mem[i].start_pfn = start_pfn;
}

Processing of memory is done from low to high addresses. This case
should never happen. And even if it does, the only downside from
not handling this scenario is wasting an additional table entry.

Right, this case would only be useful for xen_del_extra_mem, and as you
say, the worst that can happen is that we end up using an extra slot.

Even this case should be impossible. It would require two adjacent
regions to be present in xen_extra_mem before calling xen_del_extra_mem
which is impossible as stated above.



}
if (i == XEN_EXTRA_MEM_MAX_REGIONS)
printk(KERN_WARNING "Warning: not enough extra memory
regions\n");

- memblock_reserve(start, size);
+ memblock_reserve(PFN_PHYS(start_pfn), PFN_PHYS(n_pfns));
}

[...]

@@ -831,9 +833,11 @@ char * __init xen_memory_setup(void)
chunk_size = min(size, mem_end - addr);
} else if (extra_pages) {
chunk_size = min(size, PFN_PHYS(extra_pages));
- extra_pages -= PFN_DOWN(chunk_size);
- xen_add_extra_mem(addr, chunk_size);
- xen_max_p2m_pfn = PFN_DOWN(addr + chunk_size);
+ pfn_s = PFN_UP(addr);
+ n_pfns = PFN_DOWN(addr + chunk_size) - pfn_s;

Should xen_add_extra_mem check for empty ranges and bail out early, or
should the caller make sure it doesn't try to add empty ranges?

IMHO it's easier and cleaner to add the check to xen_add_extra_mem.

This isn't critical at all. Adding an empty range is a nop, as a table
entry is regarded to be not used when n_pfns is 0.

Yes, it's just so we can bail out earlier instead of iterating over the
whole xen_extra_mem for empty ranges. IMHO, I would add a check to
xen_add_extra_mem so we don't waste cycles.

The question is whether the additional check for each call wouldn't add
more cycles than the early bail out for the rare case would win.

A comment seems to be an appropriate measure. :-)


Juergen

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/