Re: [PATCH v2 1/7] zram: fix lockdep warning of free block handling

From: Andrew Morton
Date: Mon Nov 26 2018 - 15:49:40 EST


On Mon, 26 Nov 2018 17:28:07 +0900 Minchan Kim <minchan@xxxxxxxxxx> wrote:

>
> ...
>
> With writeback feature, zram_slot_free_notify could be called
> in softirq context by end_swap_bio_read. However, bitmap_lock
> is not aware of that so lockdep yell out. Thanks.
>
> The problem is not only bitmap_lock but it is also zram_slot_lock
> so straightforward solution would disable irq on zram_slot_lock
> which covers every bitmap_lock, too.
> Although duration disabling the irq is short in many places
> zram_slot_lock is used, a place(ie, decompress) is not fast
> enough to hold irqlock on relying on compression algorithm
> so it's not a option.
>
> The approach in this patch is just "best effort", not guarantee
> "freeing orphan zpage". If the zram_slot_lock contention may happen,
> kernel couldn't free the zpage until it recycles the block. However,
> such contention between zram_slot_free_notify and other places to
> hold zram_slot_lock should be very rare in real practice.
> To see how often it happens, this patch adds new debug stat
> "miss_free".
>
> It also adds irq lock in get/put_block_bdev to prevent deadlock
> lockdep reported. The reason I used irq disable rather than bottom
> half is swap_slot_free_notify could be called with irq disabled
> so it breaks local_bh_enable's rule. The irqlock works on only
> writebacked zram slot entry so it should be not frequent lock.
>
> Cc: stable@xxxxxxxxxxxxxxx # 4.14+
> Signed-off-by: Minchan Kim <minchan@xxxxxxxxxx>
> ---
> drivers/block/zram/zram_drv.c | 56 +++++++++++++++++++++++++----------
> drivers/block/zram/zram_drv.h | 1 +
> 2 files changed, 42 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> index 4879595200e1..472027eaed60 100644
> --- a/drivers/block/zram/zram_drv.c
> +++ b/drivers/block/zram/zram_drv.c
> @@ -53,6 +53,11 @@ static size_t huge_class_size;
>
> static void zram_free_page(struct zram *zram, size_t index);
>
> +static int zram_slot_trylock(struct zram *zram, u32 index)
> +{
> + return bit_spin_trylock(ZRAM_LOCK, &zram->table[index].value);
> +}
> +
> static void zram_slot_lock(struct zram *zram, u32 index)
> {
> bit_spin_lock(ZRAM_LOCK, &zram->table[index].value);
> @@ -443,29 +448,45 @@ static ssize_t backing_dev_store(struct device *dev,
>
> static unsigned long get_entry_bdev(struct zram *zram)
> {
> - unsigned long entry;
> + unsigned long blk_idx;
> + unsigned long ret = 0;
>
> - spin_lock(&zram->bitmap_lock);
> /* skip 0 bit to confuse zram.handle = 0 */
> - entry = find_next_zero_bit(zram->bitmap, zram->nr_pages, 1);
> - if (entry == zram->nr_pages) {
> - spin_unlock(&zram->bitmap_lock);
> - return 0;
> + blk_idx = find_next_zero_bit(zram->bitmap, zram->nr_pages, 1);
> + if (blk_idx == zram->nr_pages)
> + goto retry;
> +
> + spin_lock_irq(&zram->bitmap_lock);
> + if (test_bit(blk_idx, zram->bitmap)) {
> + spin_unlock_irq(&zram->bitmap_lock);
> + goto retry;
> }
>
> - set_bit(entry, zram->bitmap);
> - spin_unlock(&zram->bitmap_lock);
> + set_bit(blk_idx, zram->bitmap);

Here we could do

if (test_and_set_bit(...)) {
spin_unlock(...);
goto retry;

But it's weird to take the spinlock on behalf of bitops which are
already atomic!

It seems rather suspicious to me. Why are we doing this?

> + ret = blk_idx;
> + goto out;
> +retry:
> + spin_lock_irq(&zram->bitmap_lock);
> + blk_idx = find_next_zero_bit(zram->bitmap, zram->nr_pages, 1);
> + if (blk_idx == zram->nr_pages)
> + goto out;
> +
> + set_bit(blk_idx, zram->bitmap);
> + ret = blk_idx;
> +out:
> + spin_unlock_irq(&zram->bitmap_lock);
>
> - return entry;
> + return ret;
> }
>
> static void put_entry_bdev(struct zram *zram, unsigned long entry)
> {
> int was_set;
> + unsigned long flags;
>
> - spin_lock(&zram->bitmap_lock);
> + spin_lock_irqsave(&zram->bitmap_lock, flags);
> was_set = test_and_clear_bit(entry, zram->bitmap);
> - spin_unlock(&zram->bitmap_lock);
> + spin_unlock_irqrestore(&zram->bitmap_lock, flags);

Here's another one. Surely that locking is unnecessary.

> WARN_ON_ONCE(!was_set);
> }
>