Re: [PATCH v2 0/2] Try to release mmap_lock temporarily in smaps_rollup
From: Chinwen Chang
Date: Thu Aug 13 2020 - 12:11:36 EST
On Thu, 2020-08-13 at 02:53 -0700, Michel Lespinasse wrote:
> On Wed, Aug 12, 2020 at 7:14 PM Chinwen Chang
> <chinwen.chang@xxxxxxxxxxxx> wrote:
> > Recently, we have observed some janky issues caused by unpleasantly long
> > contention on mmap_lock which is held by smaps_rollup when probing large
> > processes. To address the problem, we let smaps_rollup detect if anyone
> > wants to acquire mmap_lock for write attempts. If yes, just release the
> > lock temporarily to ease the contention.
> >
> > smaps_rollup is a procfs interface which allows users to summarize the
> > process's memory usage without the overhead of seq_* calls. Android uses it
> > to sample the memory usage of various processes to balance its memory pool
> > sizes. If no one wants to take the lock for write requests, smaps_rollup
> > with this patch will behave like the original one.
> >
> > Although there are on-going mmap_lock optimizations like range-based locks,
> > the lock applied to smaps_rollup would be the coarse one, which is hard to
> > avoid the occurrence of aforementioned issues. So the detection and
> > temporary release for write attempts on mmap_lock in smaps_rollup is still
> > necessary.
>
> I do not mind extending the mmap lock API as needed. However, in the
> past I have tried adding rwsem_is_contended to mlock(), and later to
> mm_populate() paths, and IIRC gotten pushback on it both times. I
> don't feel strongly on this, but would prefer if someone else approved
> the rwsem_is_contended() use case.
>
Hi Michel,
Thank you for your kind feedback.
In my opinion, the difference between the case in smaps_rollup and the
one in your example is that, for the former, the interference comes from
the outside of the affected process, for the latter, it doesn't.
In other words, anyone may use smaps_rollup to probe the affected
process without the information about what step the affected one is
executing.
> Couple related questions, how many VMAs are we looking at here ? Would
> need_resched() be workable too ?
>
It depends on the types of applications. The number of VMAs we are
looking at is around 3000 ~ 4000.
As far as I know, need_resched() is used by the caller to check if it
should release the CPU resource for others. But in the case of
smaps_rollup, the affected process is contended on the mmap_lock, not
waiting for CPU. So need_resched() may not be workable here.
Thanks again for your comment.
Chinwen