Re: [PATCH 2/2] kcompressd: Add Kcompressd for accelerated zram compression

From: Barry Song
Date: Tue Mar 11 2025 - 03:25:36 EST


On Tue, Mar 11, 2025 at 8:05 PM Barry Song <21cnbao@xxxxxxxxx> wrote:
>
> On Tue, Mar 11, 2025 at 2:26 AM Qun-wei Lin (林群崴)
> <Qun-wei.Lin@xxxxxxxxxxxx> wrote:
> >
> > On Sat, 2025-03-08 at 08:41 +1300, Barry Song wrote:
> > >
> > > External email : Please do not click links or open attachments until
> > > you have verified the sender or the content.
> > >
> > >
> > > On Sat, Mar 8, 2025 at 1:02 AM Qun-Wei Lin <qun-wei.lin@xxxxxxxxxxxx>
> > > wrote:
> > > >
> > > > Introduced Kcompressd to offload zram page compression, improving
> > > > system efficiency by handling compression separately from memory
> > > > reclaiming. Added necessary configurations and dependencies.
> > > >
> > > > Signed-off-by: Qun-Wei Lin <qun-wei.lin@xxxxxxxxxxxx>
> > > > ---
> > > > drivers/block/zram/Kconfig | 11 ++
> > > > drivers/block/zram/Makefile | 3 +-
> > > > drivers/block/zram/kcompressd.c | 340
> > > > ++++++++++++++++++++++++++++++++
> > > > drivers/block/zram/kcompressd.h | 25 +++
> > > > drivers/block/zram/zram_drv.c | 22 ++-
> > > > 5 files changed, 397 insertions(+), 4 deletions(-)
> > > > create mode 100644 drivers/block/zram/kcompressd.c
> > > > create mode 100644 drivers/block/zram/kcompressd.h
> > > >
> > > > diff --git a/drivers/block/zram/Kconfig
> > > > b/drivers/block/zram/Kconfig
> > > > index 402b7b175863..f0a1b574f770 100644
> > > > --- a/drivers/block/zram/Kconfig
> > > > +++ b/drivers/block/zram/Kconfig
> > > > @@ -145,3 +145,14 @@ config ZRAM_MULTI_COMP
> > > > re-compress pages using a potentially slower but more
> > > > effective
> > > > compression algorithm. Note, that IDLE page recompression
> > > > requires ZRAM_TRACK_ENTRY_ACTIME.
> > > > +
> > > > +config KCOMPRESSD
> > > > + tristate "Kcompressd: Accelerated zram compression"
> > > > + depends on ZRAM
> > > > + help
> > > > + Kcompressd creates multiple daemons to accelerate the
> > > > compression of pages
> > > > + in zram, offloading this time-consuming task from the
> > > > zram driver.
> > > > +
> > > > + This approach improves system efficiency by handling page
> > > > compression separately,
> > > > + which was originally done by kswapd or direct reclaim.
> > >
> > > For direct reclaim, we were previously able to compress using
> > > multiple CPUs
> > > with multi-threading.
> > > After your patch, it seems that only a single thread/CPU is used for
> > > compression
> > > so it won't necessarily improve direct reclaim performance?
> > >
> >
> > Our patch only splits the context of kswapd. When direct reclaim is
> > occurred, it is bypassed, so direct reclaim remains unchanged, with
> > each thread that needs memory directly reclaiming it.
>
> Qun-wei, I’m getting a bit confused. Looking at the code in page_io.c,
> you always call swap_writepage_bdev_async() no matter if it is kswapd
> or direct reclaim:
>
> - else if (data_race(sis->flags & SWP_SYNCHRONOUS_IO))
> + else if (data_race(sis->flags & SWP_WRITE_SYNCHRONOUS_IO))
> swap_writepage_bdev_sync(folio, wbc, sis);
> else
> swap_writepage_bdev_async(folio, wbc, sis);
>
> In zram, I notice you are bypassing kcompressd by:
>
> + if (!nr_kcompressd || !current_is_kswapd())
> + return -EBUSY;
>
> How will this work if no one is calling __end_swap_bio_write(&bio),
> which is present in swap_writepage_bdev_sync()?
> Am I missing something? Or is it done by zram_bio_write() ?
>
> On the other hand, zram is a generic block device, and coupling its
> code with kswapd/direct reclaim somehow violates layering
> principles :-)
>
> >
> > > Even for kswapd, we used to have multiple threads like [kswapd0],
> > > [kswapd1],
> > > and [kswapd2] for different nodes. Now, are we also limited to just
> > > one thread?
> >
> > We only considered a single kswapd here and didn't account for multiple
> > instances. Since I use kfifo to collect the bios, if there are multiple
> > kswapds, we need to add a lock to protect the bio queue. I can revise
> > this in the 2nd version, or do you have any other suggested approaches?
>
> I'm wondering if we can move the code to vmscan/page_io instead
> of zram. If we're using a sync I/O swap device or have enabled zswap,
> we could run reclamation in this separate thread, which should also be

Sorry for the typo:
s/reclamation/__swap_writepage/g

> NUMA-aware.
>
> I would definitely be interested in prototyping it when I have the time.
>

Thanks
Barry