Re: [RFC PATCH 1/2] mtd: core: protect access to mtd devices while in suspend

From: Boris Brezillon
Date: Fri Oct 08 2021 - 11:30:59 EST


Hi Sean,

Can you please submit that as a separate thread, ideally with an
incremented version number, a changelog and a reference to all your
previous attempts.

On Fri, 8 Oct 2021 16:38:24 +0200
Sean Nyekjaer <sean@xxxxxxxxxx> wrote:

> This will prevent reading/writing/erasing to a suspended mtd device.
> It will force mtd_write()/mtd_read()/mtd_erase() to wait for
> mtd_resume() to unlock access to mtd devices.

I think this has to be done for all the hooks except ->_reboot(),
->_get_device() and ->_put_device().

>
> Exec_op[0] speed things up, so we see this race when rawnand devices going

Mention the commit directly:

Commit ef347c0cfd61 ("mtd: rawnand: gpmi: Implement exec_op") speed
things up, so we see this race when rawnand devices going ...

> into suspend. But it's actually "mtd: rawnand: Simplify the locking" that

But it's actually commit 013e6292aaf5 ("mtd: rawnand: Simplify the
locking") that ...

> allows it to return errors rather than locking, before that commit it would
> have waited for the rawnand device to resume.
>
> Tested on a iMX6ULL.
>
> [0]:
> ef347c0cfd61 ("mtd: rawnand: gpmi: Implement exec_op")
>
> Fixes: 013e6292aaf5 ("mtd: rawnand: Simplify the locking")
> Signed-off-by: Sean Nyekjaer <sean@xxxxxxxxxx>

You flagged yourself as the author even though you didn't really write
that code. I guess I'm fine with that, but I'd appreciate a

Suggested-by: Boris Brezillon <boris.brezillon@xxxxxxxxxxxxx>

here, at least.

> ---
>
> Hope I got it all :)
>
> drivers/mtd/mtdcore.c | 57 ++++++++++++++++++++++++++++++++++++++++-
> include/linux/mtd/mtd.h | 36 ++++++++++++++++++--------
> 2 files changed, 81 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/mtd/mtdcore.c b/drivers/mtd/mtdcore.c
> index c8fd7f758938..3c93202e6cbb 100644
> --- a/drivers/mtd/mtdcore.c
> +++ b/drivers/mtd/mtdcore.c
> @@ -36,6 +36,44 @@
>
> struct backing_dev_info *mtd_bdi;
>
> +static void mtd_start_access(struct mtd_info *mtd)
> +{
> + struct mtd_info *master = mtd_get_master(mtd);
> +
> + /*
> + * Don't take the suspend_lock on devices that don't
> + * implement the suspend hook. Otherwise, lockdep will
> + * complain about nested locks when trying to suspend MTD
> + * partitions or MTD devices created by gluebi which are
> + * backed by real devices.
> + */
> + if (!master->_suspend)
> + return;
> +

You need to remove the ->_suspend()/->_resume() implementation in
mtd_concat.c, otherwise you'll hit the case described in the comment.

BTW, did you test this series with lockdep enabled to make sure we
don't introduce a deadlock?

> + /*
> + * Wait until the device is resumed. Should we have a
> + * non-blocking mode here?
> + */
> + while (1) {
> + down_read(&master->master.suspend_lock);
> + if (!master->master.suspended)
> + return;
> +
> + up_read(&master->master.suspend_lock);
> + wait_event(master->master.resume_wq, master->master.suspended == 0);
> + }
> +}
> +
> +static void mtd_end_access(struct mtd_info *mtd)
> +{
> + struct mtd_info *master = mtd_get_master(mtd);
> +
> + if (!master->_suspend)
> + return;
> +
> + up_read(&master->master.suspend_lock);
> +}
> +