Re: [PATCH -mm -v9 2/3] mm, THP, swap: Check whether THP can be split firstly

From: Johannes Weiner
Date: Wed Apr 19 2017 - 12:13:35 EST


On Wed, Apr 19, 2017 at 03:06:24PM +0800, Huang, Ying wrote:
> From: Huang Ying <ying.huang@xxxxxxxxx>
>
> To swap out THP (Transparent Huge Page), before splitting the THP,
> the swap cluster will be allocated and the THP will be added into the
> swap cache. But it is possible that the THP cannot be split, so that
> we must delete the THP from the swap cache and free the swap cluster.
> To avoid that, this patch first checks whether the THP can be split.
> The check is inherently racy, but it is good enough for most cases.
>
> With the patchset, the swap out throughput improves 3.6% (from about
> 4.16GB/s to about 4.31GB/s) in the vm-scalability swap-w-seq test case
> with 8 processes. The test is done on a Xeon E5 v3 system. The swap
> device used is a RAM simulated PMEM (persistent memory) device. To
> test the sequential swapping out, the test case creates 8 processes,
> which sequentially allocate and write to the anonymous pages until the
> RAM and part of the swap device are used up.
>
> Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
> Signed-off-by: "Huang, Ying" <ying.huang@xxxxxxxxx>
> Acked-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx> [for can_split_huge_page()]

How often does this actually happen in practice? Because all that this
protects us from is trying to allocate a swap cluster - which with the
si->free_clusters list really isn't all that expensive - and return it
again. Unless this happens all the time in practice, this optimization
seems misplaced.
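
To make the cost argument concrete, here is a minimal user-space sketch
of the "allocate, then roll back on failure" path. It is not the kernel
code: struct cluster_list, cluster_alloc(), cluster_free() and
swap_out_rollback() are hypothetical stand-ins, and the can_split flag
merely simulates the outcome of the split attempt. The point is only
that taking a cluster from a free list and handing it back are each a
constant-time pointer update.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CLUSTERS 4
#define NO_CLUSTER  ((size_t)-1)

/*
 * Hypothetical stand-in for a free-cluster list: a singly linked list
 * threaded through an index array. Not the kernel's data structure.
 */
struct cluster_list {
	size_t next[NR_CLUSTERS];
	size_t head;
	size_t nr_free;
};

static void cluster_list_init(struct cluster_list *cl)
{
	for (size_t i = 0; i < NR_CLUSTERS; i++)
		cl->next[i] = (i + 1 < NR_CLUSTERS) ? i + 1 : NO_CLUSTER;
	cl->head = 0;
	cl->nr_free = NR_CLUSTERS;
}

/* O(1): pop the head of the free list. */
static size_t cluster_alloc(struct cluster_list *cl)
{
	size_t idx = cl->head;

	if (idx == NO_CLUSTER)
		return NO_CLUSTER;
	cl->head = cl->next[idx];
	cl->nr_free--;
	return idx;
}

/* O(1): push the cluster back onto the free list. */
static void cluster_free(struct cluster_list *cl, size_t idx)
{
	assert(idx < NR_CLUSTERS);
	cl->next[idx] = cl->head;
	cl->head = idx;
	cl->nr_free++;
}

/*
 * Allocate first, roll back if the (simulated) split fails.
 * Returns true if the cluster stays allocated.
 */
static bool swap_out_rollback(struct cluster_list *cl, bool can_split)
{
	size_t idx = cluster_alloc(cl);

	if (idx == NO_CLUSTER)
		return false;
	if (!can_split) {
		cluster_free(cl, idx);	/* undo: one list push */
		return false;
	}
	return true;
}
```

In this sketch a failed split costs one pop and one push, and the list
ends up in the same state as before the attempt. Whether the real path
is comparably cheap is exactly what measurements would need to show.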

It's especially a little strange because in the other email I asked
about the need for unlikely() annotations, yet this patch is adding
branches and checks for what seems to be an unlikely condition into
the THP hot path.

I'd suggest you drop both these optimization attempts unless there is
real data proving that they have a measurable impact.