Re: [GIT PULL] Please pull proc and exec work for 5.7-rc1

From: Waiman Long
Date: Sat Apr 04 2020 - 22:51:46 EST


On 4/3/20 10:28 PM, Linus Torvalds wrote:
> On Fri, Apr 3, 2020 at 7:02 PM Waiman Long <longman@xxxxxxxxxx> wrote:
>> So in terms of priority, my current thinking is
>>
>> upgrading unfair reader > unfair reader > reader/writer
>>
>> A higher priority locker will block other lockers from acquiring the lock.
> An alternative option might be to have readers normally be 100% normal
> (ie with fairness wrt writers), and not really introduce any special
> "unfair reader" lock.

A regular down_read() caller will be handled normally.

> Instead, all the unfairness would come into play only when the special
> case - execve() - does its special "lock for reading with intent to
> upgrade".
>
> But when it enters that kind of "intent to upgrade" lock state, it
> would not only block all subsequent writers, it would also guarantee
> that all other readers can continue to go.

Yes, that shouldn't be hard to do. If that is what is required, we may
only need a special upgrade function to drain the OSQ and then wake up
all the readers in the wait queue. I will add a flags argument to that
special upgrade function so that we may be able to select different
behavior in the future.

The regular down_read_interruptible() can be used, unless we want to
allow upgrades only for readers that took the lock through a special
down_read() variant.

>
> So then the new rwsem operations would be
>
> - read_with_write_intent_lock_interruptible()
>
> This is the beginning of "execve()", and waits for all writers to
> exit, and puts the lock into "all readers can go" mode.
>
> You could think of it as a "I'm queuing myself for a write lock,
> but I'm allowing readers to go ahead" state.
>
> - read_lock_to_write_upgrade()
>
> This is the "now this turns into a regular write lock". It needs to
> wait for all other readers to exit, of course.
>
> - read_with_write_intent_unlock()
>
> This is the "I'm unqueuing myself, I aborted and will not become a
> write lock after all" operation.
>
> NOTE! In this model, there may be multiple threads that do that
> initial queuing thing. We only guarantee that only one of them will
> get to the actual write lock stage, and the others will abort before
> that happens.
>
> If that is a more natural state machine, then that should work fine
> too. And it has some advantages, in that it keeps the readers normally
> fair, and only turns them unfair when we get to that special
> read-for-write stage.
>
> But whatever is most natural for the rwsem code. Entirely up to you.
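
For illustration only, the three operations quoted above can be modelled
as a small single-threaded state machine. This is a sketch under stated
assumptions, not the rwsem implementation: every toy_* name is
hypothetical, blocking is replaced by try-style calls that return an
error code, the upgrade is split into a "claim" and a "finish" step so
the transitions are visible, and new readers are assumed to be refused
once an upgrade has been claimed. Fairness between plain readers and
writers is not modelled at all.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy model only: every name here is hypothetical, not the kernel rwsem API. */
struct toy_rwsem {
	int  readers;    /* active readers, intent holders included */
	int  intents;    /* readers currently holding write intent */
	bool writer;     /* a writer owns the lock */
	bool upgrading;  /* one intent holder has claimed the upgrade */
};

/* Plain reader: refused while a writer owns the lock or an upgrade is draining. */
static int toy_read_trylock(struct toy_rwsem *s)
{
	if (s->writer || s->upgrading)
		return -EBUSY;
	s->readers++;
	return 0;
}

static void toy_read_unlock(struct toy_rwsem *s)
{
	assert(s->readers > 0);
	s->readers--;
}

/* Plain writer: refused while any reader (and so any intent holder) is active. */
static int toy_write_trylock(struct toy_rwsem *s)
{
	if (s->writer || s->readers > 0)
		return -EBUSY;
	s->writer = true;
	return 0;
}

/* Read lock with write intent: several callers may hold this at once. */
static int toy_intent_trylock(struct toy_rwsem *s)
{
	if (s->writer || s->upgrading)
		return -EBUSY;
	s->readers++;
	s->intents++;
	return 0;
}

/* Abort path: drop the intent without ever becoming a writer. */
static void toy_intent_unlock(struct toy_rwsem *s)
{
	assert(s->intents > 0 && s->readers > 0);
	s->intents--;
	s->readers--;
}

/* Only the first intent holder to ask wins; the rest get -EBUSY and must abort. */
static int toy_upgrade_claim(struct toy_rwsem *s)
{
	if (s->intents == 0 || s->upgrading)
		return -EBUSY;
	s->upgrading = true;
	return 0;
}

/* The winner becomes the writer once it is the only reader left. */
static int toy_upgrade_finish(struct toy_rwsem *s)
{
	if (!s->upgrading)
		return -EINVAL;
	if (s->readers > 1)
		return -EAGAIN;   /* other readers have not exited yet */
	s->readers--;
	s->intents--;
	s->upgrading = false;
	s->writer = true;
	return 0;
}
```

In this model a losing intent holder sees -EBUSY from toy_upgrade_claim()
and is expected to call toy_intent_unlock(), which is what lets the
winner's toy_upgrade_finish() eventually succeed.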

To be symmetric with the existing downgrade_write() function, I will
choose the name upgrade_read() for the upgrade function.

Will that work for you?

Cheers,
Longman