Re: [locks] 6d390e4b5d: will-it-scale.per_process_ops -96.6% regression

From: NeilBrown
Date: Fri Mar 13 2020 - 22:31:17 EST


On Fri, Mar 13 2020, Jeff Layton wrote:

> On Thu, 2020-03-12 at 09:07 -0700, Linus Torvalds wrote:
>> On Wed, Mar 11, 2020 at 9:42 PM NeilBrown <neilb@xxxxxxx> wrote:
>> > It seems that test_and_set_bit_lock() is the preferred way to handle
>> > flags when memory ordering is important
>>
>> That looks better.
>>
>> The _preferred_ way is actually the one I already posted: do a
>> "smp_store_release()" to store the flag (like a NULL pointer), and a
>> smp_load_acquire() to load it.
>>
>> That's basically optimal on most architectures (all modern ones -
>> there are bad architectures from before people figured out that
>> release/acquire is better than separate memory barriers), not needing
>> any atomics and only minimal memory ordering.
>>
>> I wonder if a special flags value (keeping it "unsigned int" to avoid
>> the issue Jeff pointed out) might be acceptable?
>>
>> IOW, could we do just
>>
>> smp_store_release(&waiter->fl_flags, FL_RELEASED);
>>
>> to say that we're done with the lock? Or do people still look at and
>> depend on the flag values at that point?
>
> I think nlmsvc_grant_block does. We could probably work around it
> there, but we'd need to couple this change with some clear
> documentation to make it clear that you can't rely on fl_flags after
> locks_delete_block returns.
>
> If avoiding new locks is preferred here (and I'm fine with that), then
> maybe we should just go with the patch you sent originally (along with
> changing the waiters to wait on fl_blocked_member going empty instead
> of the fl_blocker going NULL)?

I agree. I've poked at this for a while and concluded that I can't
come up with anything structurally better than your patch.

The idea of list_del_init_release() and list_empty_acquire() is growing
on me though. See below.
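
To spell out the pairing I'm after (a sketch with illustrative layout
only; the helpers themselves are in the patch below): the release in
list_del_init_release() pairs with the acquire in list_empty_acquire(),
so a lockless reader that sees the entry unhashed also sees everything
the writer did to it first:

    CPU0 - waker                          CPU1 - blocked requester
    (__locks_wake_up_blocks)              (locks_delete_block fast path)

    ...wake up and unlink @waiter...
    list_del_init_release(
            &waiter->fl_blocked_member);
                                          if (list_empty_acquire(
                                                  &waiter->fl_blocked_member))
                                                  /* all of CPU0's earlier
                                                     accesses to @waiter are
                                                     visible - safe to free */

If CPU1 observes the list_head empty, the release/acquire pair
guarantees that it cannot then see any stale state in @waiter, so
returning and letting @waiter be freed is safe.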

list_empty_acquire() might also be appropriate for waitqueue_active(),
which is documented as requiring a memory barrier but in practice often
seems to be used without one.

But I'm happy for you to go with your patch that changes all the wait
calls.
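
For completeness: the ordering the lockless exit relies on is just the
classic message-passing pattern. In LKMM terms it is the
MP+pooncerelease+poacquireonce litmus test from tools/memory-model,
with 'flag' standing in for the emptiness of fl_blocked_member and
'buf' for the waiter's other fields (that mapping is mine, not part of
the kernel test):

C MP+pooncerelease+poacquireonce

(* Result: Never *)

{}

P0(int *buf, int *flag)
{
	WRITE_ONCE(*buf, 1);
	smp_store_release(flag, 1);
}

P1(int *buf, int *flag)
{
	int r0;
	int r1;

	r0 = smp_load_acquire(flag);
	r1 = READ_ONCE(*buf);
}

exists (1:r0=1 /\ 1:r1=0)

herd7 reports the "exists" clause as never satisfied, i.e. a reader
that sees the flag set (the list empty) cannot miss the writes that
preceded the release.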

NeilBrown



diff --git a/fs/locks.c b/fs/locks.c
index 426b55d333d5..2e5eb677c324 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -174,6 +174,20 @@

#include <linux/uaccess.h>

+/* Should go in list.h */
+static inline int list_empty_acquire(const struct list_head *head)
+{
+ return smp_load_acquire(&head->next) == head;
+}
+
+static inline void list_del_init_release(struct list_head *entry)
+{
+ __list_del_entry(entry);
+ entry->prev = entry;
+ smp_store_release(&entry->next, entry);
+}
+
+
#define IS_POSIX(fl) (fl->fl_flags & FL_POSIX)
#define IS_FLOCK(fl) (fl->fl_flags & FL_FLOCK)
#define IS_LEASE(fl) (fl->fl_flags & (FL_LEASE|FL_DELEG|FL_LAYOUT))
@@ -724,7 +738,6 @@ static void locks_delete_global_blocked(struct file_lock *waiter)
static void __locks_delete_block(struct file_lock *waiter)
{
locks_delete_global_blocked(waiter);
- list_del_init(&waiter->fl_blocked_member);
waiter->fl_blocker = NULL;
}

@@ -740,6 +753,11 @@ static void __locks_wake_up_blocks(struct file_lock *blocker)
waiter->fl_lmops->lm_notify(waiter);
else
wake_up(&waiter->fl_wait);
+ /*
+ * Tell the world that we're done with it - see comment at
+ * top of locks_delete_block().
+ */
+ list_del_init_release(&waiter->fl_blocked_member);
}
}

@@ -753,6 +771,25 @@ int locks_delete_block(struct file_lock *waiter)
{
int status = -ENOENT;

+ /*
+ * If fl_blocker is NULL, it won't be set again as this thread
+ * "owns" the lock and is the only one that might try to claim
+ * the lock. So it is safe to test fl_blocker locklessly.
+ * Also if fl_blocker is NULL, this waiter is not listed on
+ * fl_blocked_requests for some lock, so no other request can
+ * be added to the list of fl_blocked_requests for this
+ * request. So if fl_blocker is NULL, it is safe to
+ * locklessly check if fl_blocked_requests is empty. If both
+ * of these checks succeed, there is no need to take the lock.
+ * However, some other thread could still be in __locks_wake_up_blocks()
+ * and may yet access 'waiter', so we cannot return and possibly
+ * free the 'waiter' unless we check that __locks_wake_up_blocks()
+ * is done. For that we carefully test fl_blocked_member.
+ */
+ if (waiter->fl_blocker == NULL &&
+ list_empty(&waiter->fl_blocked_requests) &&
+ list_empty_acquire(&waiter->fl_blocked_member))
+ return status;
spin_lock(&blocked_lock_lock);
if (waiter->fl_blocker)
status = 0;
