Re: [PATCHv4 14/17] zsmalloc: make zspage lock preemptible

From: Yosry Ahmed
Date: Mon Feb 03 2025 - 16:11:40 EST


On Mon, Feb 03, 2025 at 12:13:49PM +0900, Sergey Senozhatsky wrote:
> On (25/01/31 15:51), Yosry Ahmed wrote:
> > > +static void zspage_read_lock(struct zspage *zspage)
> > > +{
> > > + atomic_t *lock = &zspage->lock;
> > > + int old = atomic_read(lock);
> > > +
> > > + do {
> > > + if (old == ZS_PAGE_WRLOCKED) {
> > > + cpu_relax();
> > > + old = atomic_read(lock);
> > > + continue;
> > > + }
> > > + } while (!atomic_try_cmpxchg(lock, &old, old + 1));
> > > +}
> > > +
> > > +static void zspage_read_unlock(struct zspage *zspage)
> > > +{
> > > + atomic_dec(&zspage->lock);
> > > +}
> > > +
> > > +static bool zspage_try_write_lock(struct zspage *zspage)
> > > +{
> > > + atomic_t *lock = &zspage->lock;
> > > + int old = ZS_PAGE_UNLOCKED;
> > > +
> > > + preempt_disable();
> > > + if (atomic_try_cmpxchg(lock, &old, ZS_PAGE_WRLOCKED))
> >
> > FWIW, I am usually afraid to manually implement locking like this. For
> > example, queued_spin_trylock() uses atomic_try_cmpxchg_acquire() not
> > atomic_try_cmpxchg(), and I am not quite sure what could happen without
> > ACQUIRE semantics here on some architectures.
>
> I looked into it a bit, wasn't sure either. Perhaps we can switch
> to acquire/release semantics, I'm not an expert on this, would highly
> appreciate help.
>
> > We also lose some debugging capabilities as Hilf pointed out in another
> > patch.
>
> So that zspage lock should have not been a lock, I think, it's a ref-counter
> and it's being used as one
>
> map()
> {
> page->users++;
> }
>
> unmap()
> {
> page->users--;
> }
>
> migrate()
> {
> if (!page->users)
> migrate_page();
> }

Hmm, but in this case we want migration to block new map/unmap
operations. So a vanilla refcount won't work.

>
> > Just my 2c.
>
> Perhaps we can sprinkle some lockdep on it. For instance:

Honestly this looks like more reason to use existing lock primitives to
me. What are the candidates? I assume rw_semaphore, anything else?

I guess the main reason you didn't use a rw_semaphore is the extra
memory usage. Seems like it uses ~32 bytes more than rwlock_t on x86_64.
That's per zspage. Depending on how many compressed pages we have
per-zspage this may not be too bad.

For example, if a zspage has a chain length of 4, and the average
compression ratio of 1/3, that's 12 compressed pages so the extra
overhead is <3 bytes per compressed page.

Given that the chain length is a function of the class size, I think we
can calculate the exact extra memory overhead per-compressed page for
each class and get a mean/median over all classes.

If the memory overhead is insignificant I'd rather use exisitng lock
primitives tbh.