Re: [PATCH v3 3/3] iova: defer maple tree erase on GFP_ATOMIC failure

From: Rik van Riel

Date: Fri Jun 12 2026 - 12:04:05 EST


On Tue, 2026-06-09 at 10:04 -0300, Jason Gunthorpe wrote:
> On Tue, Jun 02, 2026 at 11:35:48PM -0400, Rik van Riel wrote:
> > +/*
> > + * Remove an IOVA entry from the maple tree. Returns true on success.
> > + * On failure (maple tree node allocation under GFP_ATOMIC failed),
> > + * returns false: the entry remains in the tree and the caller must
> > + * not free the struct iova.
> > + */
> > +static bool remove_iova(struct iova_domain *iovad, struct iova *iova)
> >  {
> >  	MA_STATE(mas, &iovad->mtree, iova->pfn_lo, iova->pfn_hi);
> >  
> > @@ -165,7 +175,36 @@ static void remove_iova(struct iova_domain *iovad, struct iova *iova)
> >  	if (iova->pfn_lo < iovad->dma_32bit_pfn)
> >  		iovad->max32_alloc_size = iovad->dma_32bit_pfn;
> >  
> > -	mas_store_gfp(&mas, NULL, GFP_ATOMIC);
> > +	if (mas_store_gfp(&mas, NULL, GFP_ATOMIC))
> > +		return false;
>
> But why does it use mas_store(NULL) instead of mas_erase()? I thought
> the iova alloc/free has to be pair wise, we don't split allocations?
>
I just looked into this some more, and I was
confused earlier this week.

The mas_erase() function calls mas_nomem(mas, GFP_KERNEL),
which can sleep while allocating memory, so it is not
safe to call while holding a spinlock.

The remove_iova() function runs with a spinlock held and
interrupts disabled, and it has to, because it can be
called from places like IO completion handlers.

That leaves the option of either having slightly
uglier maple tree code, or going back to the
augmented rbtree (but cleaning that up a little).

Just let me know what you prefer, I'm happy to do
either.
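
In case it helps the discussion, here is a rough userspace
sketch of the "try a non-sleeping erase, defer on failure"
shape. It is not the code from the patch. Every name in it
(struct entry, struct domain, atomic_budget, erase_atomic,
remove_entry, flush_deferred) is made up for illustration,
and the allocation failure is simulated with a counter
rather than a real GFP_ATOMIC allocation.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a tree entry; not the kernel's struct iova. */
struct entry {
	unsigned long lo, hi;
	bool in_tree;
	struct entry *next_deferred;
};

/* Hypothetical stand-in for the domain that owns the tree. */
struct domain {
	int atomic_budget;	/* simulated count of non-sleeping allocations that succeed */
	struct entry *deferred;	/* entries whose erase failed and must be retried */
	size_t nr_deferred;
};

/*
 * Models a non-sleeping erase: returns 0 on success and a negative
 * errno when the simulated allocation fails.
 */
static int erase_atomic(struct domain *d, struct entry *e)
{
	if (d->atomic_budget <= 0)
		return -ENOMEM;
	d->atomic_budget--;
	e->in_tree = false;
	return 0;
}

/*
 * Models the locked, non-sleeping path. Returns true if the entry was
 * erased. On failure the entry stays in the tree, is queued for a
 * later retry, and the caller must not free it.
 */
static bool remove_entry(struct domain *d, struct entry *e)
{
	if (erase_atomic(d, e)) {
		e->next_deferred = d->deferred;
		d->deferred = e;
		d->nr_deferred++;
		return false;
	}
	return true;
}

/*
 * Models a later pass from a context that is allowed to sleep, where
 * the erase is assumed to succeed. Returns the number of entries erased.
 */
static size_t flush_deferred(struct domain *d)
{
	struct entry *e = d->deferred;
	size_t done = 0;

	d->deferred = NULL;
	while (e) {
		struct entry *next = e->next_deferred;

		e->in_tree = false;
		e->next_deferred = NULL;
		d->nr_deferred--;
		done++;
		e = next;
	}
	return done;
}
```

With atomic_budget set to 1, the first remove_entry() call
succeeds and the second returns false and leaves its entry
in the tree until flush_deferred() runs. The extra list and
flush pass are the "slightly uglier" part.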

--
All Rights Reversed.