Re: [PATCH mmotm 4/5] huge tmpfs: avoid premature exposure of new pagetable revert

From: Stephen Rothwell
Date: Wed Apr 20 2016 - 19:56:03 EST

Hi Hugh,

On Sat, 16 Apr 2016 16:38:15 -0700 (PDT) Hugh Dickins <hughd@xxxxxxxxxx> wrote:
>
> This patch reverts all of my 09/31, your
> huge-tmpfs-avoid-premature-exposure-of-new-pagetable.patch
> and also the mm/memory.c changes from the patch after it,
> huge-tmpfs-map-shmem-by-huge-page-pmd-or-by-page-team-ptes.patch
>
> I've diffed this against the top of the tree, but it may be better to
> throw this and huge-tmpfs-avoid-premature-exposure-of-new-pagetable.patch
> away, and just delete the mm/memory.c part of the patch after it.
>
> This is in preparation for 5/5, which replaces what was done here.
> Why? Numerous reasons. Kirill was concerned that my movement of
> map_pages from before to after fault would show performance regression.
> Robot reported vm-scalability.throughput -5.5% regression, bisected to
> the avoid premature exposure patch. Andrew was concerned about bloat
> in mm/memory.o. Google had seen (on an earlier kernel) an OOM deadlock
> from pagetable allocations being done while holding pagecache pagelock.
>
> I thought I could deal with those later on, but the clincher came from
> Xiong Zhou's report that it had broken binary execution from DAX mount.
> Silly little oversight, but not as easily fixed as first appears, because
> DAX now uses the i_mmap_rwsem to guard an extent from truncation: which
> would be open to deadlock if pagetable allocation goes down to reclaim
> (both are using only the read lock, but in danger of an rwr sandwich).
>
> I've considered various alternative approaches, and what can be done
> to get both DAX and huge tmpfs working again quickly. Eventually
> arrived at the obvious: shmem should use the new pmd_fault().
>
> Reported-by: kernel test robot <xiaolong.ye@xxxxxxxxx>
> Reported-by: Xiong Zhou <jencce.kernel@xxxxxxxxx>
> Signed-off-by: Hugh Dickins <hughd@xxxxxxxxxx>
> ---
> mm/filemap.c | 10 --
> mm/memory.c | 225 +++++++++++++++++++++----------------------------
> 2 files changed, 101 insertions(+), 134 deletions(-)

I added this at the end of mmotm in linux-next today. I will leave
Andrew to sort it out later.

--
Cheers,
Stephen Rothwell