Re: [PATCH v1 0/2] mremap refactor: check src address for vma boundaries first.
From: Oliver Sang
Date: Thu Aug 15 2024 - 22:39:39 EST
hi, Jeff,
On Thu, Aug 15, 2024 at 01:19:06PM -0700, Jeff Xu wrote:
> Hi Oliver,
>
> On Thu, Aug 15, 2024 at 11:16 AM Jeff Xu <jeffxu@xxxxxxxxxxxx> wrote:
> >
> > On Wed, Aug 14, 2024 at 12:14 AM <jeffxu@xxxxxxxxxxxx> wrote:
> > >
> > > From: Jeff Xu <jeffxu@xxxxxxxxxxxx>
> > >
> > > mremap doesn't allow relocate, expand, shrink across VMA boundaries,
> > > refactor the code to check src address range before doing anything on
> > > the destination, i.e. destination won't be unmapped, if src address
> > > failed the boundaries check.
> > >
> > > This also allows us to remove can_modify_mm from mremap.c, since
> > > the src address must be single VMA, can_modify_vma is used.
> > >
> > > It is likely this will improve the performance on mremap, previously
> > > the code does sealing check using can_modify_mm for the src address range,
> > > and the new code removed the loop (used by can_modify_mm).
> > >
> > > In order to verify this patch doesn't regress on mremap, I added tests in
> > > mseal_test, the test patch can be applied before mremap refactor patch or
> > > checkin independently.
> > >
> > > Also this patch doesn't change mseal's existing schematic: if sealing fail,
> > > user can expect the src/dst address isn't updated. So this patch can be
> > > applied regardless if we decided to go with current out-of-loop approach
> > > or in-loop approach currently in discussion.
> > >
> > > Regarding the perf test report by stress-ng [1] title:
> > > 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression
> > >
> > > The test is using below for testing:
> > > stress-ng --timeout 60 --times --verify --metrics --no-rand-seed --pagemove 64
> > >
> > > I can't repro this using ChromeOS, the pagemove test shows large value
> > > of stddev and stderr, and can't reasonably refect the performance impact.
> > >
> > > For example: I write a c program [2] to run the above pagemove test 10 times
> > > and calculate the stddev, stderr, for 3 commits:
> > >
> > > 1> before mseal feature is added:
> > > Ops/sec:
> > > Mean : 3564.40
> > > Std Dev : 2737.35 (76.80% of Mean)
> > > Std Err : 865.63 (24.29% of Mean)
> > >
> > > 2> after mseal feature is added:
> > > Ops/sec:
> > > Mean : 2703.84
> > > Std Dev : 2085.13 (77.12% of Mean)
> > > Std Err : 659.38 (24.39% of Mean)
> > >
> > > 3> after current patch (mremap refactor)
> > > Ops/sec:
> > > Mean : 3603.67
> > > Std Dev : 2422.22 (67.22% of Mean)
> > > Std Err : 765.97 (21.26% of Mean)
> > >
> > > The result shows 21%-24% stderr, this means whatever perf improvment/impact
> > > there might be won't be measured correctly by this test.
> > >
> > > This test machine has 32G memory, Intel(R) Celeron(R) 7305, 5 CPU.
> > > And I reboot the machine before each test, and take the first 10 runs with
> > > run_stress_ng 10
> > >
> > > (I will run longer duration to see if test still shows large stdDev,StdErr)
> > >
> > I took more samples (100 run ), the stddev/stderr is smaller, however
> > still not at a range that can reasonably measure the perf improvement
> > here.
> >
> > The tests were taken using the same machine as (10 times run above)
> > and exact the same steps: i.e. change to certain kernel commit, reboot
> > test device, take the first test result.
> >
> > 1> Before mseal feature is added:
> > Statistics:
> > Ops/sec:
> > Mean : 1733.26
> > Std Dev : 842.13 (48.59% of Mean)
> > Std Err : 84.21 (4.86% of Mean)
> >
> > 2> After mseal feature is added
> > Statistics:
> > Ops/sec:
> > Mean : 1701.53
> > Std Dev : 1017.29 (59.79% of Mean)
> > Std Err : 101.73 (5.98% of Mean)
> >
> > 3> After mremap refactor (this patch)
> > Statistics:
> > Ops/sec:
> > Mean : 1097.04
> > Std Dev : 860.67 (78.45% of Mean)
> > Std Err : 86.07 (7.85% of Mean)
> >
> > Summary: even when the stderr is down to 4%-%8 percentage range, the
> > stddev is still too big.
> >
> > Hence, there are other unknown, random variables that impact this test.
> >
> I could not repro the 4% degradation with my test machine
> (Chromebook), this can be entirely due to the specific test and this
> test machine.
>
> Do you think it is possible to do a few more tests ? This time I like
> to have a larger sample size (100 run)
>
> stress-ng --timeout 60 --times --verify --metrics --no-rand-seed --pagemove 64
>
> Please run the test for each commit following the exact steps, e.g.
> reboot the machine, run the test, get the first 100 results for
> sample. Please don't select or drop any unstable report because then
> the data will be biased. If possible, please includes stddiv and
> stderr for the data (or raw data if not possible, and I will do
> post-processing)
>
> for 3 commits:
> -> this patch.
what's the base of it? could I directly apply this patch upon the commit
what you said "after mseal feature" as below?
> -> after mseal feature
> -> before mseal feature
could you exlictly point to two commit-id?
>
> Thank you for your time and assistance in helping me on understanding
> this issue.
due to resource constraint, please expect that we need several days to finish
this test request.
>
> Best regards,
> -Jeff
>
> > -Jeff
> >
> > > [1] https://lore.kernel.org/lkml/202408041602.caa0372-oliver.sang@xxxxxxxxx/
> > > [2] https://github.com/peaktocreek/mmperf/blob/main/run_stress_ng.c
> > >
> > >
> > > Jeff Xu (2):
> > > mseal:selftest mremap across VMA boundaries.
> > > mseal: refactor mremap to remove can_modify_mm
> > >
> > > mm/internal.h | 24 ++
> > > mm/mremap.c | 77 +++----
> > > mm/mseal.c | 17 --
> > > tools/testing/selftests/mm/mseal_test.c | 293 +++++++++++++++++++++++-
> > > 4 files changed, 353 insertions(+), 58 deletions(-)
> > >
> > > --
> > > 2.46.0.76.ge559c4bf1a-goog
> > >