Re: [PATCH v2 2/5] binder: Make shrinker rely solely on per-VMA lock
From: Suren Baghdasaryan
Date: Fri Jun 12 2026 - 12:45:58 EST
On Fri, Jun 12, 2026 at 9:05 AM Dave Hansen <dave.hansen@xxxxxxxxx> wrote:
>
> On 6/12/26 08:41, Vlastimil Babka (SUSE) wrote:
> >>> If the vma lookup fails because the mmap write lock is held, but the vma
> >>> actually exists (has not been unmapped), then this code might "successfully"
> >>> remove the page without invoking zap_vma_range(). This means that the
> >>> page does not actually get freed and will just hang around forever until
> >>> the process owning the vma exits or Binder needs this page and maps a
> >>> new page on top of the page.
> >> Yeah, I think if lock_vma_under_rcu() returns NULL you just need to
> >> jump to err_mmap_read_lock_failed, like we currently do if
> >> mmap_read_trylock() fails.
> > I don't think that will be enough as well, as the current code AFAICS does
> > something meaninfgul when mmap_read_trylock() suceeds but vma_lookup returns
> > NULL because there's no vma at that address. Now we would just assume the
> > trylock failed even if the reason was that vma lookup found nothing for the
> > address. The problem is that lock_vma_under_rcu() can't distinguish those
> > two outcomes, so we would need something that does?
>
> I spent way too much time staring at this yesterday.
>
> I think the key to distinguishing between:
>
> vma==NULL because there's no VMA
> and
> vma==NULL because of a trylock failure
>
> is binder_alloc_is_mapped(). It won't return false until vm_ops->close()
> finishes. vm_ops->close() shouldn't be able to happen while
> lock_vma_under_rcu() is held. So if you've got a non-NULL VMA, you've
> also got a stable is binder_alloc_is_mapped().
By "stable binder_alloc_is_mapped()" do you mean it would always be
true? Asking because in your patch you removed this condition:
- if (vma && !binder_alloc_is_mapped(alloc))
- goto err_invalid_vma;
So, previously if we found the VMA but binder_alloc_is_mapped()==false
we would bail out and now we don't. Are you reasoning that this
combination is impossible?
>
> So, if you've got a vma!=NULL *and* binder_alloc_is_mapped()==true, I
> think you can be pretty sure you've got the right VMA.
>
> If you have vma==NULL and binder_alloc_is_mapped()==true, you can be
> pretty sure that you hit some kind of transient lock_vma_under_rcu()
> failure.
>
> I came up with the attached patch. More eyeballs would be welcome.
> There's a _lot_ going on here.