Re: [PATCH 1/2] zram: remove BD_CAP_SYNCHRONOUS_IO with writeback feature

From: Andrew Morton
Date: Thu Aug 02 2018 - 17:13:09 EST


On Thu, 2 Aug 2018 14:11:12 +0900 Minchan Kim <minchan@xxxxxxxxxx> wrote:

> If zram supports writeback feature, it's no more syncrhonous
> device beause zram does synchronous IO opeation for
> incompressible page.
>
> Do not pretend to be syncrhonous IO device. It makes system
> very sluggish as waiting IO completion from upper layer.
>
> Furthermore, it makes user-after-free problem because swap
> think the opearion is done when the IO functions returns so
> it could free page by will(e.g., lock_page_or_retry and
> goto out_release in do_swap_page) but in fact, IO is
> asynchrnous so driver could access just freed page afterward.
>
> This patch fixes the problem.

That changelog is rather hard to follow. Please review my edits:

: If zram supports writeback feature, it's no longer a BD_CAP_SYNCHRONOUS_IO
: device beause zram does synchronous IO operations for incompressible pages.
:
: Do not pretend to be synchronous IO device. It makes the system very
: sluggish due to waiting for IO completion from upper layers.
:
: Furthermore, it causes a user-after-free problem because swap thinks the
: opearion is done when the IO functions returns so it can free the page
: (e.g., lock_page_or_retry and goto out_release in do_swap_page) but in
: fact, IO is asynchrnous so the driver could access a just freed page
: afterward.
:
: This patch fixes the problem.

> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> index 7436b2d27fa3..0b6eda1bd77a 100644
> --- a/drivers/block/zram/zram_drv.c
> +++ b/drivers/block/zram/zram_drv.c
> @@ -298,7 +298,8 @@ static void reset_bdev(struct zram *zram)
> zram->backing_dev = NULL;
> zram->old_block_size = 0;
> zram->bdev = NULL;
> -
> + zram->disk->queue->backing_dev_info->capabilities |=
> + BDI_CAP_SYNCHRONOUS_IO;
> kvfree(zram->bitmap);
> zram->bitmap = NULL;
> }
> @@ -400,6 +401,8 @@ static ssize_t backing_dev_store(struct device *dev,
> zram->backing_dev = backing_dev;
> zram->bitmap = bitmap;
> zram->nr_pages = nr_pages;
> + zram->disk->queue->backing_dev_info->capabilities &=
> + ~BDI_CAP_SYNCHRONOUS_IO;
> up_write(&zram->init_lock);
>
> pr_info("setup backing device %s\n", file_name);

A reader looking at this would wonder "why the heck are we doing that".
Adding a code comment would help them.

Is it legitimate to be altering the bdi capabilities at this level? Or
is this hacky?

If "yes" then should reset_bdev() be unconditionally setting
BDI_CAP_SYNCHRONOUS_IO? Shouldn't it be restoring that flag to its
previous setting?