Re: INFO: task hung in wb_shutdown (2)

From: Jan Kara
Date: Thu May 03 2018 - 11:13:52 EST


On Wed 02-05-18 07:14:51, Tetsuo Handa wrote:
> From 1b90d7f71d60e743c69cdff3ba41edd1f9f86f93 Mon Sep 17 00:00:00 2001
> From: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
> Date: Wed, 2 May 2018 07:07:55 +0900
> Subject: [PATCH v2] bdi: wake up concurrent wb_shutdown() callers.
>
> syzbot is reporting hung tasks at wait_on_bit(WB_shutting_down) in
> wb_shutdown() [1]. This seems to be because commit 5318ce7d46866e1d ("bdi:
> Shutdown writeback on all cgwbs in cgwb_bdi_destroy()") forgot to call
> wake_up_bit(WB_shutting_down) after clear_bit(WB_shutting_down).
>
> Introduce a helper function clear_and_wake_up_bit() and use it, in order
> to avoid similar errors in future.
>
> [1] https://syzkaller.appspot.com/bug?id=b297474817af98d5796bc544e1bb806fc3da0e5e
>
> Signed-off-by: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
> Reported-by: syzbot <syzbot+c0cf869505e03bdf1a24@xxxxxxxxxxxxxxxxxxxxxxxxx>
> Fixes: 5318ce7d46866e1d ("bdi: Shutdown writeback on all cgwbs in cgwb_bdi_destroy()")
> Cc: Tejun Heo <tj@xxxxxxxxxx>
> Cc: Jan Kara <jack@xxxxxxx>
> Cc: Jens Axboe <axboe@xxxxxx>
> Suggested-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>

Thanks for debugging this and for the fix, Tetsuo! The patch looks good to
me. You can add:

Reviewed-by: Jan Kara <jack@xxxxxxx>

Honza

> ---
> include/linux/wait_bit.h | 17 +++++++++++++++++
> mm/backing-dev.c | 2 +-
> 2 files changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/wait_bit.h b/include/linux/wait_bit.h
> index 9318b21..2b0072f 100644
> --- a/include/linux/wait_bit.h
> +++ b/include/linux/wait_bit.h
> @@ -305,4 +305,21 @@ struct wait_bit_queue_entry {
> __ret; \
> })
>
> +/**
> + * clear_and_wake_up_bit - clear a bit and wake up anyone waiting on that bit
> + *
> + * @bit: the bit of the word being waited on
> + * @word: the word being waited on, a kernel virtual address
> + *
> + * You can use this helper if bitflags are manipulated atomically rather than
> + * non-atomically under a lock.
> + */
> +static inline void clear_and_wake_up_bit(int bit, void *word)
> +{
> +	clear_bit_unlock(bit, word);
> +	/* See wake_up_bit() for which memory barrier you need to use. */
> +	smp_mb__after_atomic();
> +	wake_up_bit(word, bit);
> +}
> +
> #endif /* _LINUX_WAIT_BIT_H */
> diff --git a/mm/backing-dev.c b/mm/backing-dev.c
> index 023190c..fa5e6d7 100644
> --- a/mm/backing-dev.c
> +++ b/mm/backing-dev.c
> @@ -383,7 +383,7 @@ static void wb_shutdown(struct bdi_writeback *wb)
> 	 * the barrier provided by test_and_clear_bit() above.
> 	 */
> 	smp_wmb();
> -	clear_bit(WB_shutting_down, &wb->state);
> +	clear_and_wake_up_bit(WB_shutting_down, &wb->state);
> }
>
> static void wb_exit(struct bdi_writeback *wb)
> --
> 1.8.3.1
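
For readers less familiar with the bit-wait API, here is a minimal userspace
sketch of why the wakeup has to follow the clear. It is only an analogy under
stated assumptions, not the kernel implementation: struct bit_waitq,
demo_wait_on_bit(), demo_clear_and_wake_up_bit() and demo_run() are
hypothetical names, and a pthread mutex/condvar pair stands in for the
kernel's hashed bit waitqueue and its memory barriers.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
#define DEMO_SHUTTING_DOWN 0
#define DEMO_MAX_WAITERS   8

struct bit_waitq {
	atomic_ulong word;      /* flag word, analogous to wb->state */
	pthread_mutex_t lock;   /* protects sleeping on cond */
	pthread_cond_t cond;    /* stands in for the bit waitqueue */
	atomic_int woken;       /* number of waiters that returned */
};

/* Sleep until @bit is clear in q->word. */
static void demo_wait_on_bit(struct bit_waitq *q, int bit)
{
	pthread_mutex_lock(&q->lock);
	while (atomic_load(&q->word) & (1UL << bit))
		pthread_cond_wait(&q->cond, &q->lock);
	pthread_mutex_unlock(&q->lock);
}

/* Clear @bit, then wake every sleeper so it can re-check the bit. */
static void demo_clear_and_wake_up_bit(struct bit_waitq *q, int bit)
{
	atomic_fetch_and(&q->word, ~(1UL << bit));
	/*
	 * Taking the lock orders the broadcast after any waiter that has
	 * already tested the bit but has not gone to sleep yet.
	 */
	pthread_mutex_lock(&q->lock);
	pthread_cond_broadcast(&q->cond);
	pthread_mutex_unlock(&q->lock);
}

static void *demo_waiter(void *arg)
{
	struct bit_waitq *q = arg;

	demo_wait_on_bit(q, DEMO_SHUTTING_DOWN);
	atomic_fetch_add(&q->woken, 1);
	return NULL;
}

/* Start @nr waiters, clear the bit, and return how many waiters woke up. */
int demo_run(int nr)
{
	struct bit_waitq q;
	pthread_t tids[DEMO_MAX_WAITERS];
	int i, started = 0;

	if (nr < 0 || nr > DEMO_MAX_WAITERS)
		return -1;

	atomic_init(&q.word, 1UL << DEMO_SHUTTING_DOWN);
	atomic_init(&q.woken, 0);
	pthread_mutex_init(&q.lock, NULL);
	pthread_cond_init(&q.cond, NULL);

	for (i = 0; i < nr; i++) {
		if (pthread_create(&tids[started], NULL, demo_waiter, &q) == 0)
			started++;
	}

	demo_clear_and_wake_up_bit(&q, DEMO_SHUTTING_DOWN);

	for (i = 0; i < started; i++)
		pthread_join(tids[i], NULL);

	pthread_cond_destroy(&q.cond);
	pthread_mutex_destroy(&q.lock);
	return atomic_load(&q.woken);
}
```

Calling demo_run(4) should return the number of waiter threads that were
started. If the broadcast in demo_clear_and_wake_up_bit() were dropped, which
is the analogue of the clear_bit()-only code, any waiter already asleep would
never return, and that is roughly the hang reported here.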
--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR