Re: [PATCH v6] ANDROID: binder: change down_write to down_read

From: Martijn Coenen
Date: Tue May 15 2018 - 03:46:19 EST

Next message: Christoph Hellwig: "Re: [PATCH 3/5] nvme: call nvmf_create_ctrl before checking for duplicate assignment"
Previous message: Andy Shevchenko: "Re: [PATCH v5 2/2] mtd: rawnand: use bit-wise majority to recover the contents of ONFI parameter"
In reply to: Minchan Kim: "Re: [PATCH v6] ANDROID: binder: change down_write to down_read"
Next in thread: Minchan Kim: "Re: [PATCH v6] ANDROID: binder: change down_write to down_read"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Tue, May 15, 2018 at 8:27 AM, Minchan Kim <minchan.kernel@xxxxxxxxx> wrote:
> Hi Joel,
>
> Sorry for the late response. I was off.
>
> On Wed, May 09, 2018 at 04:19:41PM -0700, Joel Fernandes wrote:
>> On Wed, May 09, 2018 at 03:40:23PM +0900, Minchan Kim wrote:
>> > On Tue, May 08, 2018 at 04:08:13PM -0700, Joel Fernandes wrote:
>> > > On Tue, May 08, 2018 at 07:51:01PM +0900, Minchan Kim wrote:
>> > > > On Mon, May 07, 2018 at 10:28:29AM -0700, Joel Fernandes wrote:
>> > > > > On Mon, May 07, 2018 at 11:15:37PM +0900, Minchan Kim wrote:
>> > > > > > binder_update_page_range needs down_write of mmap_sem because
>> > > > > > vm_insert_page need to change vma->vm_flags to VM_MIXEDMAP unless
>> > > > > > it is set. However, when I profile binder working, it seems
>> > > > > > every binder buffers should be mapped in advance by binder_mmap.
>> > > > > > It means we could set VM_MIXEDMAP in binder_mmap time which is
>> > > > > > already hold a mmap_sem as down_write so binder_update_page_range
>> > > > > > doesn't need to hold a mmap_sem as down_write.
>> > > > > > Please use proper API down_read. It would help mmap_sem contention
>> > > > > > problem as well as fixing down_write abuse.
>> > > > > >
>> > > > > > Ganesh Mahendran tested app launching and binder throughput test
>> > > > > > and he said he couldn't find any problem and I did binder latency
>> > > > > > test per Greg KH request(Thanks Martijn to teach me how I can do)
>> > > > > > I cannot find any problem, too.
>> > > > > >
>> > > > > > Cc: Ganesh Mahendran <opensource.ganesh@xxxxxxxxx>
>> > > > > > Cc: Joe Perches <joe@xxxxxxxxxxx>
>> > > > > > Cc: Arve HjÃnnevÃg <arve@xxxxxxxxxxx>
>> > > > > > Cc: Todd Kjos <tkjos@xxxxxxxxxx>
>> > > > > > Cc: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
>> > > > > > Reviewed-by: Martijn Coenen <maco@xxxxxxxxxxx>
>> > > > > > Signed-off-by: Minchan Kim <minchan@xxxxxxxxxx>
>> > > > > > ---
>> > > > > > drivers/android/binder.c | 4 +++-
>> > > > > > drivers/android/binder_alloc.c | 6 +++---
>> > > > > > 2 files changed, 6 insertions(+), 4 deletions(-)
>> > > > > >
>> > > > > > diff --git a/drivers/android/binder.c b/drivers/android/binder.c
>> > > > > > index 4eab5be3d00f..7b8e96f60719 100644
>> > > > > > --- a/drivers/android/binder.c
>> > > > > > +++ b/drivers/android/binder.c
>> > > > > > @@ -4730,7 +4730,9 @@ static int binder_mmap(struct file *filp, struct vm_area_struct *vma)
>> > > > > > failure_string = "bad vm_flags";
>> > > > > > goto err_bad_arg;
>> > > > > > }
>> > > > > > - vma->vm_flags = (vma->vm_flags | VM_DONTCOPY) & ~VM_MAYWRITE;
>> > > > > > + vma->vm_flags |= VM_DONTCOPY | VM_MIXEDMAP;
>> > > > > > + vma->vm_flags &= ~VM_MAYWRITE;
>> > > > > > +
>> > > > > > vma->vm_ops = &binder_vm_ops;
>> > > > > > vma->vm_private_data = proc;
>> > > > > >
>> > > > > > diff --git a/drivers/android/binder_alloc.c b/drivers/android/binder_alloc.c
>> > > > > > index 5a426c877dfb..4f382d51def1 100644
>> > > > > > --- a/drivers/android/binder_alloc.c
>> > > > > > +++ b/drivers/android/binder_alloc.c
>> > > > > > @@ -219,7 +219,7 @@ static int binder_update_page_range(struct binder_alloc *alloc, int allocate,
>> > > > > > mm = alloc->vma_vm_mm;
>> > > > > >
>> > > > > > if (mm) {
>> > > > > > - down_write(&mm->mmap_sem);
>> > > > > > + down_read(&mm->mmap_sem);
>> > > > >
>> > > > >
>> > > > > Nice. Is there a need to hold the reader-lock at all here? Just curious what
>> > > > > else is it protecting (here or in vm_insert_page).
>> > > >
>> > > > It should protect vm_area_struct. IOW, when we try insert page into virtual address area,
>> > > > vma shouldn't be changed(ie, unmap/collapse/split).
>> > >
>> > > When you say unmap, are you talking about pages being unmapped from the VMA
>> > > or something else?
>> >
>> > I mean to destroy vm_area_struct itself as well as unmap pages.
>> >
>> > >
>> > > For the collapse/split part, the binder VMA (vm_area_struct) itself isn't
>> > > changed after the initial mmap (AFAIK) so I don't see a need for protection
>> > > there.
>> >
>> > There is no way to unmap in runtime? What happens if some buggy applications
>> > do unmap by mistake?
>> > Cannot we access those B's vma from A context?
>> > If B process is exiting, the VMA would be gone while A is accessing.
>>
>> If the VMA is gone, then why is it a good idea to map pages into it anyway?
>
> Hmm, sorry about that. I couldn't understand your point.

I think Joel may have meant that if the VMA is gone (forex because the
process unmaps or dies), we shouldn't allocate pages into that VMA
anymore. I don't think we do that currently - we check if the vma is
still valid before allocating.

>
>>
>> About the unmap at runtime part, your commit message was a bit confusing. You
>> said "every binder buffers should be mapped in advance by binder_mmap." but I
>> think the new binder shrinker mechanism doesn't make that true anymore.
>
> It's good point. I didn't know know that.
> When I see binder_vm_fault, it emits SIGBUS. That means shrinker cannot zap pages
> process is using, I think. IOW, every pages for binder are mapped at mmap time
> and is never mapped in runtime by page fault. Right?

Right - the address range is allocated once, and an initial amount of
pages is mapped into it. For every transaction into that process, we
will see if there's enough pages, and if not allocate so that we have
enough of them - so this is not done by page fault. The shrinker won't
touch pages for which a transaction is in progress. Of course a
process itself could still try to read from an unallocated address,
but in that case returning SIGBUS and having that process crash seems
fine.

I'm also not sure the read lock is needed, but I would need to read a
whole lot more code to convince myself it's not.

>
> Thanks.
>
>>
>> The unmap by mistake point is a valid one I guess.
>>
>> thanks,
>>
>> - Joel
>>

Next message: Christoph Hellwig: "Re: [PATCH 3/5] nvme: call nvmf_create_ctrl before checking for duplicate assignment"
Previous message: Andy Shevchenko: "Re: [PATCH v5 2/2] mtd: rawnand: use bit-wise majority to recover the contents of ONFI parameter"
In reply to: Minchan Kim: "Re: [PATCH v6] ANDROID: binder: change down_write to down_read"
Next in thread: Minchan Kim: "Re: [PATCH v6] ANDROID: binder: change down_write to down_read"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]