Re: [PATCH v7 2/2] mm: support large folios swap-in for sync io devices
From: Barry Song
Date: Wed Aug 21 2024 - 17:13:28 EST
On Thu, Aug 22, 2024 at 1:31 AM Shakeel Butt <shakeel.butt@xxxxxxxxx> wrote:
>
> On Wed, Aug 21, 2024 at 03:45:40PM GMT, hanchuanhua@xxxxxxxx wrote:
> > From: Chuanhua Han <hanchuanhua@xxxxxxxx>
> >
> >
> > 3. With both mTHP swap-out and swap-in supported, we offer the option to enable
> > zsmalloc compression/decompression with larger granularity[2]. The upcoming
> > optimization in zsmalloc will significantly increase swap speed and improve
> > compression efficiency. Tested by running 100 iterations of swapping 100MiB
> > of anon memory, the swap speed improved dramatically:
> >              time consumption of swapin(ms)   time consumption of swapout(ms)
> > lz4 4k       45274                            90540
> > lz4 64k      22942                            55667
> > zstdn 4k     85035                            186585
> > zstdn 64k    46558                            118533
>
> Are the above numbers with the patch series at [2] or without? Also, can
> you explain your experiment setup, or how someone can reproduce these?
Hi Shakeel,

The data was recorded after applying both this patch (swap-in mTHP) and
patch [2] (compressing/decompressing mTHP instead of single pages). However,
without the swap-in series, patch [2] becomes useless, for the following
reason.

If zsmalloc holds a large object, for example one covering 16 pages,
do_swap_page() will run 16 times, once per page:
1. decompress the whole large object and copy one page;
2. decompress the whole large object and copy one page;
3. decompress the whole large object and copy one page;
....
16. decompress the whole large object and copy one page;

So patchset [2] will actually degrade performance rather than
enhance it if we don't have this swap-in series. This swap-in
series is a prerequisite for the zsmalloc/zram series.
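
To put rough numbers on the list above, here is a toy cost model, not
kernel code. It assumes the decompression cost scales with the size of the
compressed object and that nothing caches the decompressed data between
faults. Both function names are made up for illustration.

```c
#include <stddef.h>

/*
 * Toy model (assumption): cost is counted in "pages decompressed".
 * With per-page swap-in, each of the nr_pages faults decompresses the
 * whole nr_pages object just to copy a single page out of it.
 */
static size_t pages_decompressed_per_page_swapin(size_t nr_pages)
{
    return nr_pages * nr_pages;
}

/*
 * With large folio swap-in, a single fault decompresses the object once
 * and maps all nr_pages pages.
 */
static size_t pages_decompressed_large_folio_swapin(size_t nr_pages)
{
    return nr_pages;
}
```

For the 16-page object in the example, this model gives 256 pages of
decompression work with per-page swap-in versus 16 with large folio
swap-in.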

We collected the data with the following simple steps:
1. Collected anonymous pages from a running phone and saved them to a file.
2. Used a small program to open the file and read it into anonymous mapped
memory.
3. Ran the following sequence in the small program:
    swapout_start_time
    madv_pageout()
    swapout_end_time
    swapin_start_time
    read_data()
    swapin_end_time

We calculate the throughput of swapout and swapin using the difference between
end_time and start_time. Additionally, we record the memory usage of zram after
the swapout is complete.
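
For anyone who wants to try something similar, a minimal sketch of the
timing part could look like the code below. This is not the actual test
program. It assumes madv_pageout() above means madvise() with MADV_PAGEOUT
(Linux 5.4 or later) and that read_data() simply touches every byte of the
buffer. The names time_swap_cycle(), mib_per_sec() and struct swap_timing
are made up for illustration.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <time.h>

#ifndef MADV_PAGEOUT
#define MADV_PAGEOUT 21 /* value from the Linux uapi headers */
#endif

/* Hypothetical result record for one swap-out/swap-in cycle. */
struct swap_timing {
    double swapout_ms;
    double swapin_ms;
    uint64_t checksum; /* keeps the read loop from being optimized away */
};

static double now_ms(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1000.0 + ts.tv_nsec / 1e6;
}

/*
 * Ask the kernel to reclaim [buf, buf + len), then read every byte back.
 * Returns 0 on success, -1 if madvise() failed (errno is set).
 * Assumption: buf is page-aligned anonymous memory already filled with data.
 */
static int time_swap_cycle(unsigned char *buf, size_t len,
                           struct swap_timing *t)
{
    double start = now_ms();
    int ret = madvise(buf, len, MADV_PAGEOUT);

    t->swapout_ms = now_ms() - start;
    if (ret != 0)
        return -1;

    start = now_ms();
    uint64_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += buf[i]; /* touching a swapped-out page triggers swap-in */
    t->swapin_ms = now_ms() - start;
    t->checksum = sum;
    return 0;
}

/* Throughput in MiB/s from a byte count and an elapsed time in ms. */
static double mib_per_sec(size_t bytes, double ms)
{
    return ms > 0.0 ? (bytes / (1024.0 * 1024.0)) / (ms / 1000.0) : 0.0;
}
```

A caller would mmap() an anonymous region, fill it from the saved file,
call time_swap_cycle() on it, and pass the two elapsed times to
mib_per_sec(). Note that MADV_PAGEOUT is only a hint: without a configured
swap device the pages stay resident and the swap-in time measures nothing
useful.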
>
> > [2] https://lore.kernel.org/all/20240327214816.31191-1-21cnbao@xxxxxxxxx/
>
Thanks
Barry