Re: Regression when booting 5.13 as dom0 on arm64 (WAS: Re: [linux-linus test] 161829: regressions - FAIL)

From: Stefano Stabellini
Date: Mon May 10 2021 - 21:46:37 EST


On Mon, 10 May 2021, Christoph Hellwig wrote:
> On Sat, May 08, 2021 at 12:32:37AM +0100, Julien Grall wrote:
> > The pointer dereferenced seems to suggest that the swiotlb hasn't been
> > allocated. From what I can tell, this may be because swiotlb_force is set
> > to SWIOTLB_NO_FORCE, but we will still enable the swiotlb when running on top
> > of Xen.
> >
> > I am not entirely sure what would be the correct fix. Any opinions?
>
> Can you try something like the patch below (not even compile tested, but
> the intent should be obvious)?
>
>
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 16a2b2b1c54d..7671bc153fb1 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -44,6 +44,8 @@
>  #include <asm/tlb.h>
>  #include <asm/alternative.h>
>  
> +#include <xen/arm/swiotlb-xen.h>
> +
>  /*
>   * We need to be able to catch inadvertent references to memstart_addr
>   * that occur (potentially in generic code) before arm64_memblock_init()
> @@ -482,7 +484,7 @@ void __init mem_init(void)
>  	if (swiotlb_force == SWIOTLB_FORCE ||
>  	    max_pfn > PFN_DOWN(arm64_dma_phys_limit))
>  		swiotlb_init(1);
> -	else
> +	else if (!IS_ENABLED(CONFIG_XEN) || !xen_swiotlb_detect())
>  		swiotlb_force = SWIOTLB_NO_FORCE;
>  
>  	set_max_mapnr(max_pfn - PHYS_PFN_OFFSET);

The "IS_ENABLED(CONFIG_XEN)" check is not needed, as it is already part
of xen_swiotlb_detect().
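
For reference, xen_swiotlb_detect() already compiles away to 0 when
CONFIG_XEN is off, because xen_domain() is a compile-time constant 0 in
that case. Roughly, quoting include/xen/arm/swiotlb-xen.h from memory
(the exact comments may differ):

static inline int xen_swiotlb_detect(void)
{
	if (!xen_domain())	/* constant 0 when !CONFIG_XEN */
		return 0;
	if (xen_feature(XENFEAT_direct_mapped))
		return 1;
	/* legacy case: no XENFEAT_*_direct_mapped, only enable for dom0 */
	if (!xen_feature(XENFEAT_not_direct_mapped) && xen_initial_domain())
		return 1;
	return 0;
}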


But let me ask another question first. Do you think it makes sense to have:

	if (swiotlb_force == SWIOTLB_NO_FORCE)
		return 0;

at the beginning of swiotlb_late_init_with_tbl? I am asking because
swiotlb_late_init_with_tbl is meant for special late initializations,
right? The presence or absence of SWIOTLB_NO_FORCE shouldn't really
matter for swiotlb_late_init_with_tbl. Also, the
commit message for "swiotlb: Make SWIOTLB_NO_FORCE perform no
allocation" says that "If a platform was somehow setting
swiotlb_no_force and a later call to swiotlb_init() was to be made we
would still be proceeding with allocating the default SWIOTLB size
(64MB)." Our case here is very similar, right? So the allocation should
proceed?
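
To make it concrete, this is roughly what happens on the Xen path with
that early return in place (a simplified sketch, not the literal code
from drivers/xen/swiotlb-xen.c; "start" and "nslabs" stand in for the
actual locals):

/* xen_mm_init() -> xen_swiotlb_init() -> swiotlb_late_init_with_tbl() */
int __ref xen_swiotlb_init(void)
{
	...
	rc = swiotlb_late_init_with_tbl(start, nslabs);
	/*
	 * With swiotlb_force == SWIOTLB_NO_FORCE the early return gives
	 * rc == 0 without io_tlb_default_mem ever being allocated, so we
	 * report success and later dereference a NULL swiotlb buffer.
	 */
	...
}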


Which brings me to a separate issue, also affecting the
xen_swiotlb_init -> swiotlb_late_init_with_tbl path. If swiotlb_init(1)
is called by mem_init, then swiotlb_late_init_with_tbl will fail due to
this check:

	/* protect against double initialization */
	if (WARN_ON_ONCE(io_tlb_default_mem))
		return -ENOMEM;

xen_swiotlb_init is meant to ask Xen to make a bunch of pages physically
contiguous. Then, it initializes the swiotlb buffer based on those
pages. So it is a problem that swiotlb_late_init_with_tbl refuses to
continue. However, in practice it is not a problem today because on ARM
we don't actually make any special requests to Xen to make the pages
physically contiguous (yet). See the empty implementation of
arch/arm/xen/mm.c:xen_create_contiguous_region. I don't know about x86.
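
Concretely, the Arm version boils down to something like this
(paraphrasing from memory, the exact body may differ slightly): no
hypercall is made, we just rely on dom0 being mapped 1:1.

int xen_create_contiguous_region(phys_addr_t pstart, unsigned int order,
				 unsigned int address_bits,
				 dma_addr_t *dma_handle)
{
	/* dom0 is assumed to be mapped 1:1, so there is nothing to do */
	*dma_handle = pstart;
	return 0;
}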

So maybe we should instead do something like the appended?


diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c
index f8f07469d259..f5a3638d1dee 100644
--- a/arch/arm/xen/mm.c
+++ b/arch/arm/xen/mm.c
@@ -152,6 +152,7 @@ static int __init xen_mm_init(void)
 	struct gnttab_cache_flush cflush;
 	if (!xen_swiotlb_detect())
 		return 0;
+	swiotlb_exit();
 	xen_swiotlb_init();
 
 	cflush.op = 0;
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 8ca7d505d61c..f17be37298a7 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -285,9 +285,6 @@ swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs)
 	unsigned long bytes = nslabs << IO_TLB_SHIFT, i;
 	struct io_tlb_mem *mem;
 
-	if (swiotlb_force == SWIOTLB_NO_FORCE)
-		return 0;
-
 	/* protect against double initialization */
 	if (WARN_ON_ONCE(io_tlb_default_mem))
 		return -ENOMEM;