Re: Fwd: vmalloc error: btrfs-delalloc btrfs_work_helper [btrfs] in kernel 6.3.x

From: David Sterba
Date: Mon May 22 2023 - 15:16:23 EST


On Mon, May 22, 2023 at 06:00:42PM +0200, Uladzislau Rezki wrote:
> > Hi,
> >
> > I notice a regression report on Bugzilla [1]. Quoting from it:
> >
> > > after updating from 6.2.x to 6.3.x, vmalloc error messages started to appear in the dmesg
> > >
> > >
> > >
> > > # free
> > >                total        used        free      shared  buff/cache   available
> > > Mem:        16183724     1473068      205664       33472    14504992    14335700
> > > Swap:       16777212      703596    16073616
> > >
> > >
> > > (zswap enabled)
> >
> > See bugzilla for the full thread and attached dmesg.
> >
> > On the report, the reporter can't perform the required bisection,
> > unfortunately.
> >
> > Anyway, I'm adding it to regzbot:
> >
> > #regzbot introduced: v6.2..v6.3 https://bugzilla.kernel.org/show_bug.cgi?id=217466
> > #regzbot title: btrfs_work_helper dealloc error in v6.3.x
> >
> > Thanks.
> >
> > [1]: https://bugzilla.kernel.org/show_bug.cgi?id=217466
> >
> According to the dmesg output from the bugzilla, vmalloc tries to
> allocate high-order pages: 1 << 9. Since it fails to get an order-9 page,
> you get the warning:

That we want order 9 is intentional; it's for a compression workspace
(bugzilla comment 5). It's allocated with kvzalloc, i.e. with a fallback
to vmalloc in case the initial kmalloc attempt fails.
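
For reference, a minimal userspace sketch of the arithmetic behind
"order 9", assuming 4 KiB base pages (e.g. x86-64). The helper names are
made up for illustration and are not kernel API:

```c
/* Assumption: 4 KiB base pages; other architectures/configs differ. */
#define SKETCH_PAGE_SIZE 4096UL

/* Number of base pages covered by one allocation of the given order. */
static unsigned long sketch_order_to_pages(unsigned int order)
{
	return 1UL << order;
}

/* Size in bytes of one allocation of the given order. */
static unsigned long sketch_order_to_bytes(unsigned int order)
{
	return sketch_order_to_pages(order) * SKETCH_PAGE_SIZE;
}
```

Under that assumption, sketch_order_to_pages(9) is 512 pages and
sketch_order_to_bytes(9) is 2097152 bytes, i.e. the 2M size seen in the
warnings.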

> <snip>
> 	if (area->nr_pages != nr_small_pages) {
> 		/* vm_area_alloc_pages() can also fail due to a fatal signal */
> 		if (!fatal_signal_pending(current))
> 			warn_alloc(gfp_mask, NULL,
> 				"vmalloc error: size %lu, page order %u, failed to allocate pages",
> 				area->nr_pages * PAGE_SIZE, page_order);
> 		goto fail;
> 	}
> <snip>
>
> and it fails.
>
> If the __GFP_NOFAIL is passed, the vm_area_alloc_pages() function switches
> to allocate 0-order pages instead. I think the fix is to call the
> kvmalloc_node() with __GFP_NOFAIL flag.

__GFP_NOFAIL does not make sense here and we've tried hard not to use
it anywhere because of the deadlocky effects. Did you mean __GFP_NOWARN?
That's a patch I sent today, but there's another comment in the bugzilla
that we got more allocation warnings for huge (2M) allocations, this
time for a deduplication ioctl.

This seems to be a noticeable change in 6.3; before we disable the
warning in our code, I think the MM guys could have a look. So far it
seems we're about to paper over a problem.