Re: [PATCH 5/5] locking/percpu-rwsem: Remove the embedded rwsem

From: Peter Zijlstra
Date: Tue Dec 17 2019 - 05:27:05 EST


On Tue, Nov 19, 2019 at 04:58:26PM +0100, Oleg Nesterov wrote:
> On 11/19, Waiman Long wrote:
> >
> > On 11/13/19 5:21 AM, Peter Zijlstra wrote:
> > > +static int percpu_rwsem_wake_function(struct wait_queue_entry *wq_entry,
> > > +				      unsigned int mode, int wake_flags,
> > > +				      void *key)
> > > +{
> > > +	struct task_struct *p = get_task_struct(wq_entry->private);
> > > +	bool reader = wq_entry->flags & WQ_FLAG_CUSTOM;
> > > +	struct percpu_rw_semaphore *sem = key;
> > > +
> > > +	/* concurrent against percpu_down_write(), can get stolen */
> > > +	if (!__percpu_rwsem_trylock(sem, reader))
> > > +		return 1;
> > > +
> > > +	list_del_init(&wq_entry->entry);
> > > +	smp_store_release(&wq_entry->private, NULL);
> > > +
> > > +	wake_up_process(p);
> > > +	put_task_struct(p);
> > > +
> > > +	return !reader; /* wake 'all' readers and 1 writer */
> > > +}
> > > +
> >
> > If I read the function correctly, you are setting the WQ_FLAG_EXCLUSIVE
> > for both readers and writers and __wake_up() is called with an exclusive
> > count of one. So only one reader or writer is woken up each time.
>
> This depends on what percpu_rwsem_wake_function() returns. If it returns 1,
> __wake_up_common() stops, exactly because all waiters have WQ_FLAG_EXCLUSIVE.

Indeed, let me see if I can clarify that somehow.
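
Something like the below userspace model might help make the stop
condition concrete. It is only a sketch, not kernel code: the model_*
names are made up for illustration, and the trylock is assumed to be a
single 'block' flag that a writer sets and that makes everybody else
fail. The walk mimics what __wake_up_common() does when every waiter
is exclusive and nr_exclusive is 1: a non-zero return from the wake
function ends the walk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; none of these types are the kernel's. */
struct model_sem {
	bool block;		/* assumed: set while a writer owns the lock */
};

struct model_waiter {
	bool reader;		/* stands in for WQ_FLAG_CUSTOM */
	bool woken;		/* expected to be false when queued */
};

/*
 * Assumed trylock semantics: everybody fails while 'block' is set,
 * and a writer that succeeds sets it.
 */
static bool model_trylock(struct model_sem *sem, bool reader)
{
	if (sem->block)
		return false;
	if (!reader)
		sem->block = true;
	return true;
}

/* Same return convention as the wake function in the patch. */
static int model_wake_function(struct model_sem *sem, struct model_waiter *w)
{
	if (!model_trylock(sem, w->reader))
		return 1;		/* lock got stolen, stop the walk */

	w->woken = true;
	return !w->reader;		/* 0: keep walking, 1: writer, stop */
}

/*
 * Models the walk over a queue where every entry is exclusive, so each
 * non-zero return consumes one unit of nr_exclusive.
 * Returns the number of waiters woken.
 */
static size_t model_wake_up(struct model_sem *sem, struct model_waiter *q,
			    size_t n, int nr_exclusive)
{
	size_t i, woken = 0;

	assert(nr_exclusive > 0);
	for (i = 0; i < n; i++) {
		int ret = model_wake_function(sem, &q[i]);

		if (q[i].woken)
			woken++;
		if (ret && !--nr_exclusive)
			break;
	}
	return woken;
}
```

With a queue of R R W R and the lock free, this model wakes the two
leading readers and the writer and leaves the trailing reader queued.
If 'block' is already set when the walk starts, the first trylock
fails and nobody is woken.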

> > However, the comment above said we wake 'all' readers and 1 writer. That
> > doesn't match the actual code, IMO.
>
> Well, "'all' readers" probably means "all readers before writer",

Correct.

> > To match the comments, you should
> > have set WQ_FLAG_EXCLUSIVE flag only on writer. In this case, you
> > probably don't need WQ_FLAG_CUSTOM to differentiate between readers and
> > writers.
>
> See above...
>
> note also the
>
> 	if (!__percpu_rwsem_trylock(sem, reader))
> 		return 1;
>
> at the start of percpu_rwsem_wake_function(). We want to stop __wake_up_common()
> as soon as __percpu_rwsem_trylock() fails. Because we know that if it fails once
> it can't succeed later. Although iiuc this can only happen if another (new)
> writer races with __wake_up(&sem->waiters).

Yes, writer-writer stealing can cause that. I even put a comment in
there :-)

> I guess WQ_FLAG_CUSTOM can be avoided, percpu_rwsem_wait() could do
>
> 	if (read)
> 		__add_wait_queue_entry_tail(...);
> 	else {
> 		wq_entry.flags |= WQ_FLAG_EXCLUSIVE;
> 		__add_wait_queue(...);
> 	}
>
> but this is "unfair".

Yes, I could not make it fair without that extra bit, and I figured we
have plenty of bits there to play with, so why not.
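
To illustrate the fairness point, here is a small userspace sketch
(again not kernel code, all model_* names are hypothetical) that only
tracks queue order. It compares queueing everybody at the tail, which
needs the extra bit to tell readers from writers, against the
flag-free variant quoted above where writers go to the head of the
queue.

```c
#include <stddef.h>
#include <string.h>

#define MODEL_MAX 16

struct model_queue {
	char order[MODEL_MAX + 1];	/* 'R' or 'W' per waiter, head first */
	size_t len;
};

static void model_add_tail(struct model_queue *q, char who)
{
	if (q->len >= MODEL_MAX)
		return;
	q->order[q->len++] = who;
	q->order[q->len] = '\0';
}

static void model_add_head(struct model_queue *q, char who)
{
	if (q->len >= MODEL_MAX)
		return;
	/* shift the existing entries, including the trailing NUL */
	memmove(q->order + 1, q->order, q->len + 1);
	q->order[0] = who;
	q->len++;
}

/* Everybody at the tail; queue order equals arrival order. */
static void model_enqueue_fifo(struct model_queue *q, const char *arrivals)
{
	for (; *arrivals; arrivals++)
		model_add_tail(q, *arrivals);
}

/* Flag-free variant: readers at the tail, writers at the head. */
static void model_enqueue_head_writers(struct model_queue *q,
				       const char *arrivals)
{
	for (; *arrivals; arrivals++) {
		if (*arrivals == 'R')
			model_add_tail(q, *arrivals);
		else
			model_add_head(q, *arrivals);
	}
}
```

For arrivals in the order R R W R, the tail-only queue keeps that
order, while the head-writers variant ends up as W R R R: the writer
jumps ahead of the two readers that were already waiting.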