On Tue, 9 Nov 2021, Oleksandr wrote:
On 28.10.21 19:37, Stefano Stabellini wrote:But elsewhere where?
Hi Stefano
I am sorry for the late response.
On Tue, 26 Oct 2021, Oleksandr Tyshchenko wrote:
From: Oleksandr Tyshchenko <oleksandr_tyshchenko@xxxxxxxx>I am a bit confused.. why do we need to do this? Who could be
The main reason of this change is that unpopulated-alloc
code cannot be used in its current form on Arm, but there
is a desire to reuse it to avoid wasting real RAM pages
for the grant/foreign mappings.
The problem is that system "iomem_resource" is used for
the address space allocation, but the really unallocated
space can't be figured out precisely by the domain on Arm
without hypervisor involvement. For example, not all device
I/O regions are known by the time domain starts creating
grant/foreign mappings. And following the advise from
"iomem_resource" we might end up reusing these regions by
a mistake. So, the hypervisor which maintains the P2M for
the domain is in the best position to provide unused regions
of guest physical address space which could be safely used
to create grant/foreign mappings.
Introduce new helper arch_xen_unpopulated_init() which purpose
is to create specific Xen resource based on the memory regions
provided by the hypervisor to be used as unused space for Xen
scratch pages.
If arch doesn't implement arch_xen_unpopulated_init() to
initialize Xen resource the default "iomem_resource" will be used.
So the behavior on x86 won't be changed.
Also fall back to allocate xenballooned pages (steal real RAM
pages) if we do not have any suitable resource to work with and
as the result we won't be able to provide unpopulated pages.
Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@xxxxxxxx>
---
Changes RFC -> V2:
- new patch, instead of
"[RFC PATCH 2/2] xen/unpopulated-alloc: Query hypervisor to provide
unallocated space"
---
drivers/xen/unpopulated-alloc.c | 89
+++++++++++++++++++++++++++++++++++++++--
include/xen/xen.h | 2 +
2 files changed, 88 insertions(+), 3 deletions(-)
diff --git a/drivers/xen/unpopulated-alloc.c
b/drivers/xen/unpopulated-alloc.c
index a03dc5b..1f1d8d8 100644
--- a/drivers/xen/unpopulated-alloc.c
+++ b/drivers/xen/unpopulated-alloc.c
@@ -8,6 +8,7 @@
#include <asm/page.h>
+#include <xen/balloon.h>
#include <xen/page.h>
#include <xen/xen.h>
@@ -15,13 +16,29 @@ static DEFINE_MUTEX(list_lock);
static struct page *page_list;
static unsigned int list_count;
+static struct resource *target_resource;
+static struct resource xen_resource = {
+ .name = "Xen unused space",
+};
+
+/*
+ * If arch is not happy with system "iomem_resource" being used for
+ * the region allocation it can provide it's own view by initializing
+ * "xen_resource" with unused regions of guest physical address space
+ * provided by the hypervisor.
+ */
+int __weak arch_xen_unpopulated_init(struct resource *res)
+{
+ return -ENOSYS;
+}
+
static int fill_list(unsigned int nr_pages)
{
struct dev_pagemap *pgmap;
- struct resource *res;
+ struct resource *res, *tmp_res = NULL;
void *vaddr;
unsigned int i, alloc_pages = round_up(nr_pages, PAGES_PER_SECTION);
- int ret = -ENOMEM;
+ int ret;
res = kzalloc(sizeof(*res), GFP_KERNEL);
if (!res)
@@ -30,7 +47,7 @@ static int fill_list(unsigned int nr_pages)
res->name = "Xen scratch";
res->flags = IORESOURCE_MEM | IORESOURCE_BUSY;
- ret = allocate_resource(&iomem_resource, res,
+ ret = allocate_resource(target_resource, res,
alloc_pages * PAGE_SIZE, 0, -1,
PAGES_PER_SECTION * PAGE_SIZE, NULL, NULL);
if (ret < 0) {
@@ -38,6 +55,31 @@ static int fill_list(unsigned int nr_pages)
goto err_resource;
}
+ /*
+ * Reserve the region previously allocated from Xen resource to avoid
+ * re-using it by someone else.
+ */
+ if (target_resource != &iomem_resource) {
+ tmp_res = kzalloc(sizeof(*tmp_res), GFP_KERNEL);
+ if (!res) {
+ ret = -ENOMEM;
+ goto err_insert;
+ }
+
+ tmp_res->name = res->name;
+ tmp_res->start = res->start;
+ tmp_res->end = res->end;
+ tmp_res->flags = res->flags;
+
+ ret = insert_resource(&iomem_resource, tmp_res);
+ if (ret < 0) {
+ pr_err("Cannot insert IOMEM resource [%llx - %llx]\n",
+ tmp_res->start, tmp_res->end);
+ kfree(tmp_res);
+ goto err_insert;
+ }
+ }
erroneously re-using the region? Are you saying that the next time
allocate_resource is called it could find the same region again? It
doesn't seem possible?
No, as I understand the allocate_resource() being called for the same root
resource won't provide the same region... We only need to do this (insert the
region into "iomem_resource") if we allocated it from our *internal*
"xen_resource", as *global* "iomem_resource" (which is used everywhere) is not
aware of that region has been already allocated. So inserting a region here we
reserving it, otherwise it could be reused elsewhere.
Let's say that allocate_resource allocates a range from xen_resource.
From reading the code, it doesn't look like iomem_resource would have
that range because the extended regions described under /hypervisor are
not added automatically to iomem_resource.
So what if we don't call insert_resource? Nothing could allocate the
same range because iomem_resource doesn't have it at all and
xen_resource is not used anywhere if not here.
What am I missing?
Or maybe it is the other way around: core Linux code assumes everything
is described in iomem_resource so something under kernel/ or mm/ would
crash if we start using a page pointing to an address missing from
iomem_resource?
If target_resource is always valid, then we don't need this specialok.pgmap = kzalloc(sizeof(*pgmap), GFP_KERNEL);initialized = false
if (!pgmap) {
ret = -ENOMEM;
@@ -95,12 +137,40 @@ static int fill_list(unsigned int nr_pages)
err_memremap:
kfree(pgmap);
err_pgmap:
+ if (tmp_res) {
+ release_resource(tmp_res);
+ kfree(tmp_res);
+ }
+err_insert:
release_resource(res);
err_resource:
kfree(res);
return ret;
}
+static void unpopulated_init(void)
+{
+ static bool inited = false;
Good point, thank you. Will do. To be honest, I also don't like the current
+ int ret;Would it make sense to call unpopulated_init from an init function,
+
+ if (inited)
+ return;
+
+ /*
+ * Try to initialize Xen resource the first and fall back to default
+ * resource if arch doesn't offer one.
+ */
+ ret = arch_xen_unpopulated_init(&xen_resource);
+ if (!ret)
+ target_resource = &xen_resource;
+ else if (ret == -ENOSYS)
+ target_resource = &iomem_resource;
+ else
+ pr_err("Cannot initialize Xen resource\n");
+
+ inited = true;
+}
rather than every time xen_alloc_unpopulated_pages is called?
approach much.
I don't think, however I didn't tested on x86 and might be wrong, but
/**The commit message says that the behavior on x86 doesn't change but this
* xen_alloc_unpopulated_pages - alloc unpopulated pages
* @nr_pages: Number of pages
@@ -112,6 +182,16 @@ int xen_alloc_unpopulated_pages(unsigned int
nr_pages, struct page **pages)
unsigned int i;
int ret = 0;
+ unpopulated_init();
+
+ /*
+ * Fall back to default behavior if we do not have any suitable
resource
+ * to allocate required region from and as the result we won't be able
to
+ * construct pages.
+ */
+ if (!target_resource)
+ return alloc_xenballooned_pages(nr_pages, pages);
seems to be a change that could impact x86?
according to the current patch, on x86 the "target_resource" is always valid
and points to the "iomem_resource" as arch_xen_unpopulated_init() is not
implemented. So there won't be any fallback to use
alloc_(free)_xenballooned_pages() here and fill_list() will behave as usual.
check. In fact, the condition should never be true.
You raised a really good question, on Arm we need a fallback to balloon outI think it is unnecessary as we already assume today that
RAM pages again if hypervisor doesn't provide extended regions (we run on old
version, no unused regions with reasonable size, etc), so I decided to put a
fallback code here, an indicator of the failure is invalid "target_resource".
&iomem_resource is always available.
I noticed the patch which is about to be upstreamed that removesI would keep it simple and remove the fallback from this patch. So:
alloc_(free)xenballooned_pages API [1]. Right now I have no idea how/where
this fallback could be implemented as this is under build option control
(CONFIG_XEN_UNPOPULATED_ALLOC). So the API with the same name is either used
for unpopulated pages (if set) or ballooned pages (if not set). I would
appreciate suggestions regarding that. I am wondering would it be possible and
correctly to have both mechanisms (unpopulated and ballooned) enabled by
default and some init code to decide which one to use at runtime or some sort?
- if not CONFIG_XEN_UNPOPULATED_ALLOC, then balloon
- if CONFIG_XEN_UNPOPULATED_ALLOC, then
- xen_resource if present
- otherwise iomem_resource
The xen_resource/iomem_resource config can be done at init time using
target_resource. At runtime, target_resource is always != NULL so we
just go ahead and use it.