Re: [PATCHv4 01/17] zram: switch to non-atomic entry locking
From: Sergey Senozhatsky
Date: Thu Feb 06 2025 - 02:47:18 EST
On (25/02/06 08:38), Sebastian Andrzej Siewior wrote:
> On 2025-02-06 16:01:12 [+0900], Sergey Senozhatsky wrote:
> > So I completely reworked this bit and we don't have that problem
> > anymore, nor the problem of "inventing locking schemes in 2025".
> >
> > In short - we are returning back to bit-locks, which zram had been using
> > until commit 9518e5bfaae19 ("zram: Replace bit spinlocks with a spinlock_t"),
> > but not bit-spinlocks this time around, as those won't work with linux-rt;
> > instead we use wait_on_bit and friends. Entry lock is exclusive, just like before,
> > but lock owner can sleep now, any task wishing to lock that same entry
> > will wait to be woken up by the current lock owner once it unlocks the
> > entry. For cases when lock is taken from atomic context (e.g. slot-free
> > notification from softirq) we continue using TRY lock, which has been
> > introduced by commit 3c9959e025472 ("zram: fix lockdep warning of free block
> > handling"), so there's nothing new here. In addition I added some lockdep
> > annotations, just to be safe.
> >
> > There shouldn't be too many tasks competing for the same entry. I can
> > only think of cases when read/write (or slot-free notification if zram
> > is used as a swap device) vs. writeback or recompression (we cannot have
> > writeback and recompression simultaneously).
>
> So if I understand, you want to get back to bit spinlocks but sleeping
> instead of polling. But why? Do you intend to have more locks per entry
> so that you use the additional bits with the "lock"?
zram is atomic right now, e.g.

zram_read()
  lock entry by index    # disables preemption
  map zsmalloc entry     # possibly memcpy
  decompress
  unmap zsmalloc entry
  unlock entry           # enables preemption
That's a pretty long time to keep preemption disabled (e.g. when using a
slow algorithm like zstd or deflate configured with high compression
levels). Apart from that, it's difficult to use async algorithms, which
can e.g. wait for H/W to become available, or algorithms that might want
to allocate memory internally during compression/decompression (e.g.
zstd).
Entry lock is not the only lock in zram that currently makes it
atomic, just one of them.
> > It currently looks like this:
> >
> …
> > static __must_check bool zram_slot_try_lock(struct zram *zram, u32 index)
> > {
> > unsigned long *lock = &zram->table[index].flags;
> >
> > if (!test_and_set_bit_lock(ZRAM_ENTRY_LOCK, lock)) {
> > #ifdef CONFIG_DEBUG_LOCK_ALLOC
> > mutex_acquire(&zram->table[index].lockdep_map, 0, 0, _RET_IP_);
> > #endif
> > return true;
> > }
> > return false;
> > }
>
> I hope the caller does not poll.
Yeah, we don't do that in the code.
> > static void zram_slot_lock(struct zram *zram, u32 index)
> > {
> > unsigned long *lock = &zram->table[index].flags;
> >
> > WARN_ON_ONCE(!preemptible());
>
> you want might_sleep() here instead. preemptible() works only on
> preemptible kernels. And might_sleep() is already provided by
> wait_on_bit_lock(). So this can go.
Right, wait_on_bit_lock() already has might_sleep(), so the
WARN_ON_ONCE(!preemptible()) can go.
> > wait_on_bit_lock(lock, ZRAM_ENTRY_LOCK, TASK_UNINTERRUPTIBLE);
> > #ifdef CONFIG_DEBUG_LOCK_ALLOC
> > mutex_acquire(&zram->table[index].lockdep_map, 0, 0, _RET_IP_);
> > #endif
>
> I would argue that you want this before the wait_on_bit_lock() simply
> because you want to see a possible deadlock before it happens.
Ack.
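
So something like this, I assume (untested sketch, just the annotation
moved before the blocking call):

```c
static void zram_slot_lock(struct zram *zram, u32 index)
{
	unsigned long *lock = &zram->table[index].flags;

	/*
	 * Tell lockdep about the acquisition before we can block, so a
	 * potential deadlock is reported before we actually sleep on it.
	 */
#ifdef CONFIG_DEBUG_LOCK_ALLOC
	mutex_acquire(&zram->table[index].lockdep_map, 0, 0, _RET_IP_);
#endif
	wait_on_bit_lock(lock, ZRAM_ENTRY_LOCK, TASK_UNINTERRUPTIBLE);
}
```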
> > static void zram_slot_unlock(struct zram *zram, u32 index)
> > {
> > unsigned long *lock = &zram->table[index].flags;
> >
> > clear_and_wake_up_bit(ZRAM_ENTRY_LOCK, lock);
> > #ifdef CONFIG_DEBUG_LOCK_ALLOC
> > mutex_release(&zram->table[index].lockdep_map, _RET_IP_);
> > #endif
> Also before. So it complains about release a not locked lock before it
> happens.
OK.
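
I.e. (untested sketch, release annotation moved before the actual
unlock):

```c
static void zram_slot_unlock(struct zram *zram, u32 index)
{
	unsigned long *lock = &zram->table[index].flags;

	/*
	 * Report a release of a not-held lock before the bit is cleared
	 * and waiters are woken.
	 */
#ifdef CONFIG_DEBUG_LOCK_ALLOC
	mutex_release(&zram->table[index].lockdep_map, _RET_IP_);
#endif
	clear_and_wake_up_bit(ZRAM_ENTRY_LOCK, lock);
}
```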