Re: [PATCH] zram: use copy_page for full page copy

From: Sergey Senozhatsky
Date: Fri Jun 14 2024 - 01:31:30 EST


On (24/06/13 22:25), Christoph Hellwig wrote:
> On Thu, Jun 13, 2024 at 08:04:22AM +0800, Jisheng Zhang wrote:
> > commit 42e99bd975fd ("zram: optimize memory operations with
> > clear_page()/copy_page()") optimized page copy/clear operations, but
> > then commit d72e9a7a93e4 ("zram: do not use copy_page with non-page
> > aligned address") removed the optimization because it caused memory
> > corruption at the time; the reason was well explained there. But after
> > commit 1f7319c74275 ("zram: partial IO refactoring"), partial IO uses
> > alloc_page() instead of kmalloc() to allocate a page, so we can bring
> > back the optimization.
> >
> > commit 80ba4caf8ba9 ("zram: use copy_page for full page copy") brought
> > back part of the optimization but missed one spot in zram_write_page().
> > Optimize the full page copy in zram_write_page() with copy_page().
> >
> > Signed-off-by: Jisheng Zhang <jszhang@xxxxxxxxxx>
> > ---
> > drivers/block/zram/zram_drv.c | 8 +++++---
> > 1 file changed, 5 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> > index 3acd7006ad2c..4b2b5098062f 100644
> > --- a/drivers/block/zram/zram_drv.c
> > +++ b/drivers/block/zram/zram_drv.c
> > @@ -1478,11 +1478,13 @@ static int zram_write_page(struct zram *zram, struct page *page, u32 index)
> > dst = zs_map_object(zram->mem_pool, handle, ZS_MM_WO);
> >
> > src = zstrm->buffer;
> > - if (comp_len == PAGE_SIZE)
> > + if (comp_len == PAGE_SIZE) {
> > src = kmap_local_page(page);
> > - memcpy(dst, src, comp_len);
> > - if (comp_len == PAGE_SIZE)
> > + copy_page(dst, src);
> > kunmap_local(src);
> > + } else {
> > + memcpy(dst, src, comp_len);
> > + }
>
> I know this is pre-existing code, but why do we need to kmap
> for comp_len == PAGE_SIZE and not for the other cases? Something
> feels really obfuscated here.

It is a little tricky.

If we managed to compress the page (compressed size < zsmalloc
incompressible watermark), then src is the per-CPU buffer with the
compressed data. Otherwise src is the original page (with uncompressed
data).
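
To make the two cases concrete, here is a rough user-space sketch of the
source selection. It is not the driver code: SKETCH_PAGE_SIZE,
sketch_pick_src(), sketch_store() and the 'watermark' parameter are
hypothetical names made up for illustration, and plain byte arrays stand in
for the page and the per-CPU buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size, illustration only */

/*
 * Hypothetical helper: choose the copy source as described above.
 * 'watermark' stands in for the zsmalloc threshold at or above which
 * the compressed result is treated as incompressible.
 */
const unsigned char *sketch_pick_src(const unsigned char *page,
				     const unsigned char *stream_buf,
				     size_t *comp_len, size_t watermark)
{
	assert(watermark <= SKETCH_PAGE_SIZE);

	if (*comp_len >= watermark) {
		/* Not worth storing compressed: keep the whole page. */
		*comp_len = SKETCH_PAGE_SIZE;
		return page;
	}

	/* Compression paid off: the data lives in the stream buffer. */
	return stream_buf;
}

/* Hypothetical helper: copy from the chosen source, return stored size. */
size_t sketch_store(unsigned char *dst, const unsigned char *page,
		    const unsigned char *stream_buf, size_t comp_len,
		    size_t watermark)
{
	const unsigned char *src;

	src = sketch_pick_src(page, stream_buf, &comp_len, watermark);
	memcpy(dst, src, comp_len);
	return comp_len;
}
```

In the driver the two sources differ in kind, not only in content: the
per-CPU buffer is already a kernel address, while the original page is a
struct page and has to be mapped with kmap_local_page() before it can be
read. The sketch hides that difference because both sources are ordinary
arrays.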