Re: [PATCH] maple_tree: Remove GFP_ZERO from kmem_cache_alloc() and kmem_cache_alloc_bulk()
From: Liam R. Howlett
Date: Tue Jan 24 2023 - 10:23:46 EST
* David Hildenbrand <david@xxxxxxxxxx> [230124 08:18]:
> On 06.01.23 19:36, Liam Howlett wrote:
> > * Mike Rapoport <rppt@xxxxxxxxxx> [230106 02:28]:
> > > On Thu, Jan 05, 2023 at 04:05:34PM +0000, Liam Howlett wrote:
> > > > Preallocations are common in the VMA code to avoid allocating under
> > > > certain locking conditions. The preallocations must also cover the
> > > > worst-case scenario. Removing the GFP_ZERO flag from the
> > > > kmem_cache_alloc() (and bulk variant) calls will reduce the amount of
> > > > time spent zeroing memory that may not be used. Only zero out the
> > > > necessary area to keep track of the allocations in the maple state.
> > > > Zero the entire node prior to using it in the tree.
> > > >
> > > > This required internal changes to node counting on allocation, so the
> > > > test code is also updated.
> > > >
> > > > This restores some micro-benchmark performance:
> > > > up to +9% in mmtests mmap1 by my testing
> > > > +10% to +20% in mmap, mmapaddr, mmapmany tests reported by Red Hat
> > > >
> > > > Link: https://bugzilla.redhat.com/show_bug.cgi?id=2149636
> > > > Reported-by: Jirka Hladky <jhladky@xxxxxxxxxx>
> > > > Suggested-by: Matthew Wilcox (Oracle) <willy@xxxxxxxxxxxxx>
> > > > Signed-off-by: Liam Howlett <Liam.Howlett@xxxxxxxxxx>
> > > > ---
> > > > lib/maple_tree.c | 80 +++++++++++++++++---------------
> > > > tools/testing/radix-tree/maple.c | 18 +++----
> > > > 2 files changed, 52 insertions(+), 46 deletions(-)
> > > >
> > > > diff --git a/lib/maple_tree.c b/lib/maple_tree.c
> > > > index 26e2045d3cda..82a8121fe49b 100644
> > > > --- a/lib/maple_tree.c
> > > > +++ b/lib/maple_tree.c
> > > > @@ -149,13 +149,12 @@ struct maple_subtree_state {
> > > > /* Functions */
> > > > static inline struct maple_node *mt_alloc_one(gfp_t gfp)
> > > > {
> > > > - return kmem_cache_alloc(maple_node_cache, gfp | __GFP_ZERO);
> > > > + return kmem_cache_alloc(maple_node_cache, gfp);
> > > > }
> > > > static inline int mt_alloc_bulk(gfp_t gfp, size_t size, void **nodes)
> > > > {
> > > > - return kmem_cache_alloc_bulk(maple_node_cache, gfp | __GFP_ZERO, size,
> > > > - nodes);
> > > > + return kmem_cache_alloc_bulk(maple_node_cache, gfp, size, nodes);
> > > > }
> > > > static inline void mt_free_bulk(size_t size, void __rcu **nodes)
> > > > @@ -1127,9 +1126,10 @@ static inline struct maple_node *mas_pop_node(struct ma_state *mas)
> > > > {
> > > > struct maple_alloc *ret, *node = mas->alloc;
> > > > unsigned long total = mas_allocated(mas);
> > > > + unsigned int req = mas_alloc_req(mas);
> > > > /* nothing or a request pending. */
> > > > - if (unlikely(!total))
> > > > + if (WARN_ON(!total))
> > >
> > > Hmm, isn't WARN_ON() here too much?
> >
> > I don't think so. If we get to the point of asking for a node while we
> > don't have any to give, then it's too late. It means we (I) have
> > calculated the necessary nodes incorrectly and we won't have enough
> > memory to fit things into the tree. It should never happen.
>
> Either way, the suggestion is to use WARN_ON_ONCE() instead of WARN_ON().
>
> ... now documented in Documentation/process/coding-style.rst
>
Thanks, this is good information. I was actually debating a BUG_ON()
but it has been documented to that WARN_ON() is better, but
WARN_ON_ONCE() is best.
I would like the stack trace for any code path that happens to cause it
even if there was a previous code path that has already triggered the
backtrace.
The memory should have all been obtained or entered a previous error
path so an OOM issue will not cause a cascade of flooding backtraces.
This would also result in a null pointer dereference today, so I think
it's safe to say it's rare.
Thanks,
Liam