Re: [PATCH v2 2/5] binder: Make shrinker rely solely on per-VMA lock
From: Suren Baghdasaryan
Date: Fri Jun 12 2026 - 12:02:47 EST
On Fri, Jun 12, 2026 at 8:41 AM Vlastimil Babka (SUSE)
<vbabka@xxxxxxxxxx> wrote:
>
> On 6/11/26 21:59, Suren Baghdasaryan wrote:
> > On Thu, Jun 11, 2026 at 12:53 AM Alice Ryhl <aliceryhl@xxxxxxxxxx> wrote:
> >>
> >> > b/drivers/android/binder_alloc.c | 26 +++++++++-----------------
> >> > 1 file changed, 9 insertions(+), 17 deletions(-)
> >> >
> >> > diff -puN drivers/android/binder_alloc.c~binder-try-vma-lock drivers/android/binder_alloc.c
> >> > --- a/drivers/android/binder_alloc.c~binder-try-vma-lock 2026-06-10 15:57:55.274412018 -0700
> >> > +++ b/drivers/android/binder_alloc.c 2026-06-10 15:57:55.277412124 -0700
> >> > @@ -1142,7 +1142,6 @@ enum lru_status binder_alloc_free_page(s
> >> > struct vm_area_struct *vma;
> >> > struct page *page_to_free;
> >> > unsigned long page_addr;
> >> > - int mm_locked = 0;
> >> > size_t index;
> >> >
> >> > if (!mmget_not_zero(mm))
> >> > @@ -1151,15 +1150,12 @@ enum lru_status binder_alloc_free_page(s
> >> > index = mdata->page_index;
> >> > page_addr = alloc->vm_start + index * PAGE_SIZE;
> >> >
> >> > - /* attempt per-vma lock first */
> >> > + /*
> >> > + * Attempt per-vma lock. This is essentially a
> >> > + * "trylock". It can fail even if the VMA exists
> >> > + * for 'page_addr'.
> >> > + */
> >> > vma = lock_vma_under_rcu(mm, page_addr);
> >> > - if (!vma) {
> >> > - /* fall back to mmap_lock */
> >> > - if (!mmap_read_trylock(mm))
> >> > - goto err_mmap_read_lock_failed;
> >> > - mm_locked = 1;
> >> > - vma = vma_lookup(mm, page_addr);
> >> > - }
> >> >
> >> > if (!mutex_trylock(&alloc->mutex))
> >> > goto err_get_alloc_mutex_failed;
> >> > @@ -1188,13 +1184,11 @@ enum lru_status binder_alloc_free_page(s
> >> > zap_vma_range(vma, page_addr, PAGE_SIZE);
> >> >
> >> > trace_binder_unmap_user_end(alloc, index);
> >> > +
> >> > + vma_end_read(vma);
> >> > }
> >> >
> >> > mutex_unlock(&alloc->mutex);
> >> > - if (mm_locked)
> >> > - mmap_read_unlock(mm);
> >> > - else
> >> > - vma_end_read(vma);
> >> > mmput_async(mm);
> >> > binder_free_page(page_to_free);
> >> >
> >> > @@ -1203,11 +1197,9 @@ enum lru_status binder_alloc_free_page(s
> >> > err_invalid_vma:
> >> > mutex_unlock(&alloc->mutex);
> >> > err_get_alloc_mutex_failed:
> >> > - if (mm_locked)
> >> > - mmap_read_unlock(mm);
> >> > - else
> >> > + if (vma)
> >> > vma_end_read(vma);
> >> > -err_mmap_read_lock_failed:
> >> > +err_vma_lock_failed:
>
> This label is unused btw, which is related to Alice's point.
>
> >> > mmput_async(mm);
> >>
> >> If the vma lookup fails because the mmap write lock is held, but the vma
> >> actually exists (has not been unmapped), then this code might "successfully"
> >> remove the page without invoking zap_vma_range(). This means that the
> >> page does not actually get freed and will just hang around forever until
> >> the process owning the vma exits or Binder needs this page and maps a
> >> new page on top of the page.
> >
> > Yeah, I think if lock_vma_under_rcu() returns NULL you just need to
> > jump to err_mmap_read_lock_failed, like we currently do if
> > mmap_read_trylock() fails.
>
> I don't think that will be enough as well, as the current code AFAICS does
> something meaninfgul when mmap_read_trylock() suceeds but vma_lookup returns
> NULL because there's no vma at that address. Now we would just assume the
> trylock failed even if the reason was that vma lookup found nothing for the
> address. The problem is that lock_vma_under_rcu() can't distinguish those
> two outcomes, so we would need something that does?
Ah, you are right. I misread the condition in that code.
Changing lock_vma_under_rcu() to distinguish between lock failure and
no-VMA case would require some refactoring. vma_start_read() should
return -ESRCH if VMA was not found and then we handle that in
lock_vma_under_rcu(). That part is not hard because vma_start_read()
has only 2 users but then we have to propagate that error code to
lock_vma_under_rcu() users and there are many of them. Though
converting these users is straght forward. Instead of checking if
(vma) they would have to check if (IS_ERR_OR_NULL(vma)).