Re: [PATCH] maple_tree: use GFP_KERNEL on mas_node_count

From: Liam R. Howlett
Date: Thu Sep 14 2023 - 11:32:14 EST


* Jaeseon Sim <jason.sim@xxxxxxxxxxx> [230913 22:49]:
> > * Jaeseon Sim <jason.sim@xxxxxxxxxxx> [230907 00:41]:
> > > > On Thu, Sep 07, 2023 at 12:02:02PM +0800, Peng Zhang wrote:
> > > > >
> > > > >
> > > > > On 2023/9/7 11:49, Matthew Wilcox wrote:
> > > > > > On Thu, Sep 07, 2023 at 12:39:14PM +0900, 심재선 wrote:
> > > > > > > Use GFP_KERNEL on mas_node_count instead of GFP_NOWAIT | __GFP_NOWARN
> > > > > > > in order to allow memory reclaim.
> > > > > There are many paths that call maple tree's mas_node_count(). Some paths
> > > > > cannot reclaim memory.
> > > >
> > > > Right ... but we should be handling the ENOMEM inside the maple tree and
> > > > allocating some nodes with GFP_KERNEL instead of failing fork().
> > > >
> > > > > > What testing did you do of this patch? In particular, did you try it
> > > > > > with lockdep enabled?
> > > I ran a power on/off test with this patch.
> > > I did not try it with lockdep enabled.
> >
> > To accomplish the same result, but with a much smaller scope that will
> > work with lockdep, I would suggest changing mas_expected_entries() to
> > use mas_node_count_gfp() (which already exists) and pass in GFP_KERNEL.
> >
> > Since fork is the only current user of mas_expected_entries(), this
> > won't break other users and we can deal with changing it for others if
> > it is needed.
> >
> > If we do go this route, please add a note in the documentation about
> > using GFP_KERNEL.
> >
> > Willy, does that work for you?
> >
> > Thanks,
> > Liam
>
> Dear Liam,
> Can I ask you the reason why mas_node_count is using GFP_NOWAIT | __GFP_NOWARN?

Most users in the VMA space have complicated locking schemes that
forbid sleeping during a store operation. When the internal lock is
used, most operations will drop the lock and retry with GFP_KERNEL
(see mas_nomem()).
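
The drop-the-lock-and-retry pattern described above can be sketched in
plain userspace C. This is only an illustration, not the kernel code:
every sketch_* name is hypothetical, the lock is a simple flag, and
malloc() stands in for the node allocator. The fast path tries a
non-blocking allocation under the lock; on failure the slow path
releases the lock, allocates with sleeping allowed, and retakes the
lock before the store is retried.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for GFP_NOWAIT and GFP_KERNEL. */
enum sketch_gfp { SKETCH_NOWAIT, SKETCH_MAY_SLEEP };

struct sketch_tree {
	bool locked;		/* models the tree's internal lock */
	bool nowait_fails;	/* knob: pretend memory is tight */
	void *node;		/* one spare node, or NULL */
	int value;		/* the "stored" entry */
	int retries;		/* how often the slow path ran */
};

static void sketch_lock(struct sketch_tree *t)
{
	assert(!t->locked);
	t->locked = true;
}

static void sketch_unlock(struct sketch_tree *t)
{
	assert(t->locked);
	t->locked = false;
}

static void *sketch_alloc(struct sketch_tree *t, enum sketch_gfp gfp)
{
	if (gfp == SKETCH_MAY_SLEEP) {
		/* A sleeping allocation must not run under the lock. */
		assert(!t->locked);
	} else if (t->nowait_fails) {
		/* The non-blocking attempt gives up at once. */
		return NULL;
	}
	return malloc(64);
}

/* Slow path: drop the lock, allocate with sleeping allowed, retake it. */
static bool sketch_nomem(struct sketch_tree *t)
{
	sketch_unlock(t);
	t->node = sketch_alloc(t, SKETCH_MAY_SLEEP);
	sketch_lock(t);
	t->retries++;
	return t->node != NULL;
}

/* Returns 0 on success, -1 if even the sleeping allocation failed. */
static int sketch_store(struct sketch_tree *t, int value)
{
	int ret = -1;

	sketch_lock(t);
	do {
		if (!t->node)
			t->node = sketch_alloc(t, SKETCH_NOWAIT);
		if (t->node) {
			t->value = value;	/* the store consumes the node */
			free(t->node);
			t->node = NULL;
			ret = 0;
			break;
		}
	} while (sketch_nomem(t));
	sketch_unlock(t);
	return ret;
}
```

With nowait_fails set, a call such as sketch_store(&t, 42) takes the
slow path once and then succeeds. A caller that holds an external lock
cannot drop it this way, which is why it has to obtain its nodes before
taking its locks.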

> I wonder if other callers of mas_node_count might have a similar issue.

The external callers who need GFP_KERNEL are either using
mas_store_gfp() or mas_preallocate() to set up a store prior to taking
a series of other locks.

During a mas_preallocate() or mas_expected_entries() call, we set the
MA_STATE_PREALLOC flag to indicate that there are nodes preallocated.
This is to catch users who call mas_node_count() and require more
nodes than were preallocated at a point where no further allocations
should be made. You can see this flag directly below the line you
modified.
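
As a rough userspace illustration of that guard (again with
hypothetical sketch_* names, not the kernel implementation): once the
preallocation flag is set, a later request for more nodes than are
already held is recorded instead of being satisfied. How the kernel
actually reports such a case is not shown in this thread, so the
warnings counter is an assumption made for the sketch.

```c
#include <assert.h>

#define SKETCH_PREALLOC 0x1	/* hypothetical analogue of MA_STATE_PREALLOC */

struct sketch_state {
	unsigned int flags;
	int allocated;	/* nodes currently held */
	int warnings;	/* requests that exceeded the preallocation */
};

/* Make sure @count nodes are held, unless preallocation forbids growth. */
static void sketch_node_count(struct sketch_state *s, int count)
{
	assert(count >= 0);

	if (s->allocated >= count)
		return;			/* enough nodes already */
	if (s->flags & SKETCH_PREALLOC) {
		s->warnings++;		/* the preallocation was too small */
		return;
	}
	s->allocated = count;		/* pretend the allocation succeeds */
}

/* Preallocate once, then mark the state so later growth is caught. */
static void sketch_expected_entries(struct sketch_state *s, int nr_nodes)
{
	sketch_node_count(s, nr_nodes);
	s->flags |= SKETCH_PREALLOC;
}
```

After sketch_expected_entries(&s, 5), a request for 3 nodes passes
silently, while a request for 8 bumps the counter and leaves the
allocation at 5.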

>
> Following your suggestion, I'll post a v2 patch as follows:

Thanks. Please test with lockdep, but I don't see a nesting lock issue
with fork and this change.

>
> diff --git a/lib/maple_tree.c b/lib/maple_tree.c
> index ee1ff0c59fd7..b0229271c24e 100644
> --- a/lib/maple_tree.c
> +++ b/lib/maple_tree.c
> @@ -5574,7 +5574,7 @@ int mas_expected_entries(struct ma_state *mas, unsigned long nr_entries)
>  	/* Internal nodes */
>  	nr_nodes += DIV_ROUND_UP(nr_nodes, nonleaf_cap);
>  	/* Add working room for split (2 nodes) + new parents */
> -	mas_node_count(mas, nr_nodes + 3);
> +	mas_node_count_gfp(mas, nr_nodes + 3, GFP_KERNEL);
>
>  	/* Detect if allocations run out */
>  	mas->mas_flags |= MA_STATE_PREALLOC;
> --
> 2.17.1
>
> Thanks
> Jaeseon