Re: [PATCH v5] ANDROID: binder: change down_write to down_read

From: Ganesh Mahendran
Date: Mon Apr 02 2018 - 06:02:07 EST


2018-04-02 15:11 GMT+08:00 Minchan Kim <minchan@xxxxxxxxxx>:
> On Mon, Apr 02, 2018 at 02:46:14PM +0800, Ganesh Mahendran wrote:
>> 2018-04-02 14:34 GMT+08:00 Minchan Kim <minchan@xxxxxxxxxx>:
>> > On Fri, Mar 30, 2018 at 12:04:07PM +0200, Greg Kroah-Hartman wrote:
>> >> On Fri, Mar 30, 2018 at 10:29:21AM +0900, Minchan Kim wrote:
>> >> > Hi Ganesh,
>> >> >
>> >> > On Fri, Mar 30, 2018 at 09:21:55AM +0800, Ganesh Mahendran wrote:
>> >> > > 2018-03-29 14:54 GMT+08:00 Minchan Kim <minchan@xxxxxxxxxx>:
>> >> > > > binder_update_page_range needs down_write of mmap_sem because
>> >> > > > vm_insert_page need to change vma->vm_flags to VM_MIXEDMAP unless
>> >> > > > it is set. However, when I profile binder working, it seems
>> >> > > > every binder buffers should be mapped in advance by binder_mmap.
>> >> > > > It means we could set VM_MIXEDMAP in binder_mmap time which is
>> >> > > > already hold a mmap_sem as down_write so binder_update_page_range
>> >> > > > doesn't need to hold a mmap_sem as down_write.
>> >> > > >
>> >> > > > Android suffers from mmap_sem contention so let's reduce mmap_sem
>> >> > > > down_write.
>> >> > >
>> >> > > Hi, Minchan:
>> >> > >
>> >> > > It seems there is performance regression of this patch.
>> >> >
>> >> > You mean "This patch aims for solving performance regression" not "This patch
>> >> > makes performance regression"?
>> >> >
>> >> > >
>> >> > > Do you have some test result of android app launch time or binderThroughput?
>> >> >
>> >> > Unfortunately, I don't have any number. The goal is to reduce the number of
>> >> > call mmap_sem as write-side lock because it makes priority inversion of threads
>> >> > easily and that's one of clear part I spot that we don't need write-side lock.
>> >>
>> >> Please always run the binderThroughput tests when making binder changes
>> >> (there is a binder test suite in the CTS Android tests), as that ensures
>> >> that you are not causing performance regressions as well as just normal
>> >> bug regressions :)
>> >
>> > Thanks for the information. I didn't notice that such kinds of tests for
>> > binder. I will keep it in mind.
>> >
>> > Today, I have setup the testing for my phone and found testing was very
>> > fluctuating even without my patch. It might be not good with my test
>> > skill. I emulated user's behavior with various touch event. With it, I open
>> > various apps and play with them several times. Before starting the test,
>> > I did "adb shell stop && adb shell start && echo 3 > /proc/sys/vm/drop_caches"
>> >
>> > Such 15% noise was very easy to make it.
>> >
>> > Ganesh, How did you measure? What's the stddev?
>>
>> Hi, Minchan:
>>
>> Sorry for the late response, a little busy these days. :)
>>
>> We have our own test tools to measure app launch time, or you can use
>> android systrace to get the app launch time. We tested your V1 patch:
>> https://patchwork.kernel.org/patch/10312057/
>> and found app lunch time regression.
>
> V1 had a bug with VM_MAYWRITE. Could you confirm it with v5?

I have finished binder Throughput test. The test result is stable,
there is no performance
regression found both in v1 and v5.

base patch_v1 patch_v5
-----------------------------------------------------------
91223.4 90560.2 89644.5
90520.3 89583.1 89048.2
89833.2 90247.6 90091.3
90740.2 90276.7 90994.2
89703.5 90112.4 89994.6
89945.1 89122.8 88937.7
89872.8 90357.3 89307.4
89913.2 90355.4 89563.8
88979 90393.4 90182.8
89577.3 90946.8 90441.4
AVG 90030.8 90195.57 89820.59

Before the test, I stop the android framework by:
adb shell stop

>
> Please tell me more detail. What apps are slower compared to old?
> Every apps are slowed with avg 15%? Then, what's the stddev?

Not all of the apps slowed 15%, The app *avg* launch time slowed 15%.
And We will re-launch the test tomorrow: base, v1,v5. We will get the
test result in two days later. Then I will post all the app launch time details.

>
> The reason I'm asking is as I mentioned, it would be caused by rw_semaphore
> implementation and priority of threads which calls binder operation so I
> guess it would be not deterministic.
>
> When I had an simple experiment, it was very fluctuating as I expected.
> (the testing enviroment might be not good in my side).
> If it's real problem on real practice, better fix is not using write_lock
> of mmap_sem(it's abusing the write-side lock) but should adjust priority,
> I think. What do you think?

If you want to narrow the range of the problem. We can disable binder priority
inherit, and do not set the priority(currently it is nice -10 or fifo)
of top app in Android AMS.
I think we need to wait for the test result to see whether it really
has performance
regression.

>
> Anyway, before the further discussion, we should confirm the root cause.
>
>>
>> I will use binderThroghput tool to test the patch today or tomorrow.
>>
>
> Thanks. I will do.