Re: [PATCHv4 01/17] zram: switch to non-atomic entry locking

From: Sergey Senozhatsky
Date: Mon Feb 03 2025 - 02:12:16 EST


On (25/02/03 12:26), Sergey Senozhatsky wrote:
> On (25/01/31 14:55), Andrew Morton wrote:
> > > +static void zram_slot_write_lock(struct zram *zram, u32 index)
> > > +{
> > > + atomic_t *lock = &zram->table[index].lock;
> > > + int old = atomic_read(lock);
> > > +
> > > + do {
> > > + if (old != ZRAM_ENTRY_UNLOCKED) {
> > > + cond_resched();
> > > + old = atomic_read(lock);
> > > + continue;
> > > + }
> > > + } while (!atomic_try_cmpxchg(lock, &old, ZRAM_ENTRY_WRLOCKED));
> > > +}
> >
> > I expect that if the calling userspace process has realtime policy (eg
> > SCHED_FIFO) then the cond_resched() won't schedule SCHED_NORMAL tasks
> > and this becomes a busy loop. And if the machine is single-CPU, the
> > loop is infinite.
>
> So for that scenario to happen zram needs to see two writes() to the same
> index (page) simultaneously? Or read() and write() on the same index (page)
> concurrently?

Just to put more details:

1) zram always works with only one particular zram entry index, which
is provided by an upper layer (e.g. bi_sector)

2) for read

read()
zram_read(page, index)
rlock entry[index]
decompress entry zshandle page
runlock entry[index]

for write

write()
zram_write(page, index)
len = compress page obj
handle = zsmalloc len
wlock entry[index]
entry.handle = handle
entry.len = len
wunlock entry[index]

3) at no point zram locks more than entry index
a) there is no entry cross-locking (entries are not hierarchical)
b) there is no entry lock nesting (including recursion)


I guess where we actually need zram entry lock is writeback and
recompression. Writeback moves object from zsmalloc pool to actual
physical storage, freeing zsmalloc memory after that and setting
zram entry[index] handle to the backikng device's block idx, which
needs synchronization. Recompression does a similar thing, it frees
the old zsmalloc handle and stores recompressed objects under new
zsmalloc handle, it thus updates zram entry[index] handle to point
to the new location, which needs to be synchronized.