Re: Regression after commit 19809c2da28a ("mm, vmalloc: use __GFP_HIGHMEM implicitly")

From: Michal Hocko
Date: Sun Feb 11 2018 - 04:27:15 EST


On Thu 08-02-18 15:20:04, Matthew Wilcox wrote:
> On Thu, Feb 08, 2018 at 05:06:49AM -0800, Matthew Wilcox wrote:
> > On Thu, Feb 08, 2018 at 02:29:57PM +0800, Kai Heng Feng wrote:
> > > A user with i386 instead of AMD64 machine reports [1] that commit 19809c2da28a ("mm, vmalloc: use __GFP_HIGHMEM implicitlyâ) causes a regression.
> > > BUG_ON(PageHighMem(pg)) in drivers/media/common/saa7146/saa7146_core.c always gets triggered after that commit.
> >
> > Well, the BUG_ON is wrong. You can absolutely have pages which are both
> > HighMem and under the 4GB boundary. Only the first 896MB (iirc) are LowMem,
> > and the next 3GB of pages are available to vmalloc_32().
>
> ... nevertheless, 19809c2da28a does in fact break vmalloc_32 on 32-bit. Look:
>
> #if defined(CONFIG_64BIT) && defined(CONFIG_ZONE_DMA32)
> #define GFP_VMALLOC32 GFP_DMA32 | GFP_KERNEL
> #elif defined(CONFIG_64BIT) && defined(CONFIG_ZONE_DMA)
> #define GFP_VMALLOC32 GFP_DMA | GFP_KERNEL
> #else
> #define GFP_VMALLOC32 GFP_KERNEL
> #endif
>
> So we pass in GFP_KERNEL to __vmalloc_node, which calls __vmalloc_node_range
> which calls __vmalloc_area_node, which ORs in __GFP_HIGHMEM.

Dohh. I have missed this. I was convinced that we always add GFP_DMA32
when doing vmalloc_32. Sorry about that. The above definition looks
quite weird to be honest. First of all do we have any 64b system without
both DMA and DMA32 zones? If yes, what is the actual semantic of
vmalloc_32? Or is there any magic forcing GFP_KERNEL into low 32b?

Also I would expect that __GFP_DMA32 should do the right thing on 32b
systems. So something like the below should do the trick
---
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 673942094328..2eab5d1ef548 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1947,7 +1947,8 @@ void *vmalloc_exec(unsigned long size)
#elif defined(CONFIG_64BIT) && defined(CONFIG_ZONE_DMA)
#define GFP_VMALLOC32 GFP_DMA | GFP_KERNEL
#else
-#define GFP_VMALLOC32 GFP_KERNEL
+/* This should be only 32b systems */
+#define GFP_VMALLOC32 GFP_DMA32 | GFP_KERNEL
#endif

/**

--
Michal Hocko
SUSE Labs