Re: [PATCH v2] slab,slub: ignore __GFP_WAIT if we're booting or suspending
From: Andrew Morton
Date: Fri Jun 12 2009 - 11:23:42 EST
On Fri, 12 Jun 2009 12:45:21 +0300 (EEST) Pekka J Enberg <penberg@xxxxxxxxxxxxxx> wrote:
> From: Benjamin Herrenschmidt <benh@xxxxxxxxxxxxxxxxxxx>
> Date: Fri, 12 Jun 2009 12:39:58 +0300
> Subject: [PATCH] Sanitize "gfp" flags during boot
>
> With the recent shuffle of initialization order to move memory related
> inits earlier, various subtle breakage was introduced in archs like
> powerpc due to code somewhat assuming that GFP_KERNEL can be used as
> soon as the allocators are up. This is not true because any __GFP_WAIT
> allocation will cause interrupts to be enabled, which can be fatal if
> it happens too early.
>
> This isn't trivial to fix on every call site. For example, powerpc's
> ioremap implementation needs to be called early. For that, it uses two
> different mechanisms to carve out virtual space. Before memory init,
> by moving down VMALLOC_END, and then, by calling get_vm_area().
> Unfortunately, the later does GFK_KERNEL allocations. But we can't do
> anything else because once vmalloc's been initialized, we can no longer
> safely move VMALLOC_END to carve out space.
>
> There are other examples, wehere can can be called either very early
> or later on when devices are hot-plugged. It would be a major pain for
> such code to have to "know" whether it's in a context where it should
> use GFP_KERNEL or GFP_NOWAIT.
>
> Finally, by having the ability to silently removed __GFP_WAIT from
> allocations, we pave the way for suspend-to-RAM to use that feature
> to also remove __GFP_IO from allocations done after suspending devices
> has started. This is important because such allocations may hang if
> devices on the swap-out path have been suspended, but not-yet suspended
> drivers don't know about it, and may deadlock themselves by being hung
> into a kmalloc somewhere while holding a mutex for example.
>
> ...
>
> +/*
> + * We set up the page allocator and the slab allocator early on with interrupts
> + * disabled. Therefore, make sure that we sanitize GFP flags accordingly before
> + * everything is up and running.
> + */
> +gfp_t gfp_allowed_bits = ~(__GFP_WAIT|__GFP_FS | __GFP_IO);
__read_mostly
> +void mm_late_init(void)
> +{
> + /*
> + * Interrupts are enabled now so all GFP allocations are safe.
> + */
> + gfp_allowed_bits = __GFP_BITS_MASK;
> +}
Using plain old -1 here would be a more obviously-correct change.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/