Re: [RFC PATCH 2/2] xen/grant-table: Use unpopulated DMAable pages instead of real RAM ones

From: Stefano Stabellini
Date: Fri Jun 03 2022 - 17:19:45 EST


On Tue, 17 May 2022, Oleksandr Tyshchenko wrote:
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@xxxxxxxx>
>
> Depends on CONFIG_XEN_UNPOPULATED_ALLOC. If enabled then unpopulated
> DMAable (contiguous) pages will be allocated for grant mapping into
> instead of ballooning out real RAM pages.
>
> TODO: Fallback to real RAM pages if xen_alloc_unpopulated_dma_pages()
> fails.
>
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@xxxxxxxx>
> ---
> drivers/xen/grant-table.c | 27 +++++++++++++++++++++++++++
> 1 file changed, 27 insertions(+)
>
> diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
> index 8ccccac..2bb4392 100644
> --- a/drivers/xen/grant-table.c
> +++ b/drivers/xen/grant-table.c
> @@ -864,6 +864,25 @@ EXPORT_SYMBOL_GPL(gnttab_free_pages);
> */
> int gnttab_dma_alloc_pages(struct gnttab_dma_alloc_args *args)
> {
> +#ifdef CONFIG_XEN_UNPOPULATED_ALLOC
> + int ret;

This is an alternative implementation of the same function. If we are
going to use #ifdef, then I would #ifdef the entire function, rather
than just the body. Otherwise within the function body we can use
IS_ENABLED.
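
To illustrate the second option, here is a minimal userspace sketch, not
the real grant-table code. The IS_ENABLED macro below is a simplified
stand-in for the kernel's macro from include/linux/kconfig.h, and
demo_choose_alloc_path() and the enum are hypothetical names used only
for this example. The point is that both branches are compiled and
type-checked, and the compiler discards the dead one.

```c
#include <assert.h>

/* Remove this define to model a build with the option disabled. */
#define CONFIG_XEN_UNPOPULATED_ALLOC 1

/*
 * Simplified stand-in for the kernel's IS_ENABLED(): evaluates to 1 when
 * the option is defined to 1, and to 0 when it is not defined at all.
 */
#define DEMO_ARG_PLACEHOLDER_1 0,
#define DEMO_TAKE_SECOND_ARG(ignored, val, ...) val
#define DEMO_IS_DEFINED_3(arg1_or_junk) \
	DEMO_TAKE_SECOND_ARG(arg1_or_junk 1, 0, 0)
#define DEMO_IS_DEFINED_2(val) DEMO_IS_DEFINED_3(DEMO_ARG_PLACEHOLDER_##val)
#define IS_ENABLED(option) DEMO_IS_DEFINED_2(option)

/* Hypothetical labels for the two allocation strategies. */
enum demo_alloc_path {
	DEMO_PATH_UNPOPULATED,	/* unpopulated DMAable pages */
	DEMO_PATH_BALLOON,	/* balloon out real RAM pages */
};

/* One function body, with the choice made by an ordinary C condition. */
static enum demo_alloc_path demo_choose_alloc_path(void)
{
	if (IS_ENABLED(CONFIG_XEN_UNPOPULATED_ALLOC))
		return DEMO_PATH_UNPOPULATED;

	return DEMO_PATH_BALLOON;
}
```

Note that this form needs every function called in either branch to be
declared in both configurations (for example via a stub), which is
something to check before converting the patch.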


> + ret = xen_alloc_unpopulated_dma_pages(args->dev, args->nr_pages,
> + args->pages);
> + if (ret < 0)
> + return ret;
> +
> + ret = gnttab_pages_set_private(args->nr_pages, args->pages);
> + if (ret < 0) {
> + gnttab_dma_free_pages(args);

Shouldn't this be xen_free_unpopulated_dma_pages?


> + return ret;
> + }
> +
> + args->vaddr = page_to_virt(args->pages[0]);
> + args->dev_bus_addr = page_to_phys(args->pages[0]);

There are two things to note here.

The first thing to note is that normally we would call pfn_to_bfn to
retrieve the dev_bus_addr of a page because pfn_to_bfn takes into
account foreign mappings. However, these are freshly allocated pages
without foreign mappings, so page_to_phys/dma should be sufficient.


The second has to do with physical addresses and DMA addresses. The
functions are called gnttab_dma_alloc_pages and
xen_alloc_unpopulated_dma_pages, which suggests that we are retrieving a
DMA address here. However, to get a DMA address we need to call
page_to_dma rather than page_to_phys.

page_to_dma takes into account special offsets that some devices have
when accessing memory. There are real cases on ARM where the physical
address != DMA address, e.g. RPi4.
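
As a rough illustration of why the two addresses can differ, here is a
small userspace sketch. struct demo_device, demo_phys_to_dma() and the
address values are all hypothetical and chosen only for the example;
they are not the kernel's translation helpers and not real RPi4 numbers.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical per-device description of one memory window: where it
 * starts from the CPU's point of view and where the same memory starts
 * from the device's point of view.
 */
struct demo_device {
	uint64_t cpu_base;	/* window start as a CPU physical address */
	uint64_t dma_base;	/* same window as seen by the device */
};

/* Translate a CPU physical address into the address this device uses. */
static uint64_t demo_phys_to_dma(const struct demo_device *dev,
				 uint64_t phys)
{
	return phys - dev->cpu_base + dev->dma_base;
}
```

For a device with cpu_base == dma_base the result equals the physical
address, which is the case page_to_phys silently assumes. For any other
device the result depends on which device is passed in, which is why the
translation needs the actual DMA-capable device as an argument.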

However, to call page_to_dma you need to specify as the first argument
the DMA-capable device that is expected to use those pages for DMA (e.g.
an ethernet device or an MMC controller), whereas the args->dev we have
in gnttab_dma_alloc_pages is the gntdev_miscdev.

So this interface cannot actually be used to allocate memory that is
supposed to be DMA-able by a DMA-capable device, such as an ethernet
device.

But I think that should be fine because the memory is meant to be used
by a userspace PV backend for grant mappings. If any of those mappings
end up being used for actual DMA in the kernel, they should go through
drivers/xen/swiotlb-xen.c, where xen_phys_to_dma is called, which ends
up calling page_to_dma as appropriate.

It would be good to double-check that the above is correct and, if so,
maybe add a short in-code comment about it:

/*
* These are not actually DMA addresses but regular physical addresses.
* If these pages end up being used in a DMA operation then the
* swiotlb-xen functions are called and xen_phys_to_dma takes care of
* the address translations:
*
* - from gfn to bfn in case of foreign mappings
* - from physical to DMA addresses in case the two are different for a
* given DMA-mastering device
*/



> + return ret;
> +#else
> unsigned long pfn, start_pfn;
> size_t size;
> int i, ret;
> @@ -910,6 +929,7 @@ int gnttab_dma_alloc_pages(struct gnttab_dma_alloc_args *args)
> fail:
> gnttab_dma_free_pages(args);
> return ret;
> +#endif
> }
> EXPORT_SYMBOL_GPL(gnttab_dma_alloc_pages);
>
> @@ -919,6 +939,12 @@ EXPORT_SYMBOL_GPL(gnttab_dma_alloc_pages);
> */
> int gnttab_dma_free_pages(struct gnttab_dma_alloc_args *args)
> {
> +#ifdef CONFIG_XEN_UNPOPULATED_ALLOC
> + gnttab_pages_clear_private(args->nr_pages, args->pages);
> + xen_free_unpopulated_dma_pages(args->dev, args->nr_pages, args->pages);
> +
> + return 0;
> +#else
> size_t size;
> int i, ret;
>
> @@ -946,6 +972,7 @@ int gnttab_dma_free_pages(struct gnttab_dma_alloc_args *args)
> dma_free_wc(args->dev, size,
> args->vaddr, args->dev_bus_addr);
> return ret;
> +#endif
> }
> EXPORT_SYMBOL_GPL(gnttab_dma_free_pages);
> #endif
> --
> 2.7.4
>