Re: [PATCH 1/2] zram: remove BD_CAP_SYNCHRONOUS_IO with writeback feature
From: Minchan Kim
Date: Thu Aug 02 2018 - 22:51:52 EST
Hi Andrew,
On Thu, Aug 02, 2018 at 02:13:04PM -0700, Andrew Morton wrote:
> On Thu, 2 Aug 2018 14:11:12 +0900 Minchan Kim <minchan@xxxxxxxxxx> wrote:
>
> > If zram supports writeback feature, it's no more syncrhonous
> > device beause zram does synchronous IO opeation for
> > incompressible page.
> >
> > Do not pretend to be syncrhonous IO device. It makes system
> > very sluggish as waiting IO completion from upper layer.
> >
> > Furthermore, it makes user-after-free problem because swap
> > think the opearion is done when the IO functions returns so
> > it could free page by will(e.g., lock_page_or_retry and
> > goto out_release in do_swap_page) but in fact, IO is
> > asynchrnous so driver could access just freed page afterward.
> >
> > This patch fixes the problem.
I fixed my faults from original description.
Otherwise, ones you corrected looks good to me.
>
> That changelog is rather hard to follow. Please review my edits:
>
> : If zram supports writeback feature, it's no longer a BD_CAP_SYNCHRONOUS_IO
> : device beause zram does synchronous IO operations for incompressible pages.
asynchronous
> :
> : Do not pretend to be synchronous IO device. It makes the system very
> : sluggish due to waiting for IO completion from upper layers.
> :
> : Furthermore, it causes a user-after-free problem because swap thinks the
> : opearion is done when the IO functions returns so it can free the page
> : (e.g., lock_page_or_retry and goto out_release in do_swap_page) but in
> : fact, IO is asynchrnous so the driver could access a just freed page
asynchronous
> : afterward.
> :
> : This patch fixes the problem.
>
> > diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> > index 7436b2d27fa3..0b6eda1bd77a 100644
> > --- a/drivers/block/zram/zram_drv.c
> > +++ b/drivers/block/zram/zram_drv.c
> > @@ -298,7 +298,8 @@ static void reset_bdev(struct zram *zram)
> > zram->backing_dev = NULL;
> > zram->old_block_size = 0;
> > zram->bdev = NULL;
> > -
> > + zram->disk->queue->backing_dev_info->capabilities |=
> > + BDI_CAP_SYNCHRONOUS_IO;
> > kvfree(zram->bitmap);
> > zram->bitmap = NULL;
> > }
> > @@ -400,6 +401,8 @@ static ssize_t backing_dev_store(struct device *dev,
> > zram->backing_dev = backing_dev;
> > zram->bitmap = bitmap;
> > zram->nr_pages = nr_pages;
> > + zram->disk->queue->backing_dev_info->capabilities &=
> > + ~BDI_CAP_SYNCHRONOUS_IO;
> > up_write(&zram->init_lock);
> >
> > pr_info("setup backing device %s\n", file_name);
>
> A reader looking at this would wonder "why the heck are we doing that".
> Adding a code comment would help them.
I will add
/*
* With writeback feature, zram does a asynchronous IO so it's no longer
* synchronous device. If it pretend to be, upper layer could wait IO
* completion rather than (submit and return), which will cause system
* sluggish.
* Furthermore, when the IO function returns(e.g., swap_readpage),
* uppler lay could guess IO was done so it could deallocate the page
* freely but in fact, IO is going on and it finally could cause
* use-after-free when the IO is really done.
*/
>
> Is it legitimate to be altering the bdi capabilities at this level? Or
> is this hacky?
Most of device's bdi capability seems to be static but there are few drivers
which can change capability. For example, BDI_CAP_STABLE_WRITES
drivers/scsi/iscsi_tcp.c
drivers/md/raid5.c
I believe it's driver itself advertisement stuff so I hope it's not hack.
>
> If "yes" then should reset_bdev() be unconditionally setting
> BDI_CAP_SYNCHRONOUS_IO? Shouldn't it be restoring that flag to its
> previous setting?
>
Yu, reset_bdev should set it unconditionally. Because zram's default
mode is synchronous and it changed only if user set backing device.
I will respin the patch with revised comment and description.
Thanks.