Re: [PATCH 2/2] locking/rwsem: Use a return variable in rwsem_spin_on_owner()

From: Jason Low
Date: Thu Apr 09 2015 - 16:36:55 EST


On Thu, 2015-04-09 at 11:16 -0700, Linus Torvalds wrote:
> On Thu, Apr 9, 2015 at 11:08 AM, Linus Torvalds
> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > The pointer is a known-safe kernel pointer - it's just that it was
> > "known safe" a few instructions ago, and might be rcu-free'd at any
> > time.
>
> Actually, we could even do something like this:
>
> static inline int sem_owner_on_cpu(struct semaphore *sem, struct task_struct *owner)
> {
> 	int on_cpu;
>
> #ifdef CONFIG_DEBUG_PAGEALLOC
> 	rcu_read_lock();
> #endif
> 	on_cpu = sem->owner == owner && owner->on_cpu;
> #ifdef CONFIG_DEBUG_PAGEALLOC
> 	rcu_read_unlock();
> #endif
> 	return on_cpu;
> }
>
> because we really don't need to hold the RCU lock over the whole loop,
> we just need to validate that the semaphore owner still matches, and
> if so, check that it's on_cpu.
>
> And if CONFIG_DEBUG_PAGEALLOC is set, we don't care about performance
> *at*all*. We will have worse performance problems than doing some RCU
> read-locking inside the loop.
>
> And if CONFIG_DEBUG_PAGEALLOC isn't set, we don't really care about
> locking, since at worst we just access stale memory for one iteration.
>
> Hmm. It's not pretty, but neither is the current "let's just take a
> rcu lock that we don't really need over a loop that doesn't have very
> strict bounding".

So then something like the following (for rwsem)?

We can also run some tests to see how the worst case of "accessing stale
memory for one iteration" in the heuristic affects performance, though
that probably wouldn't be much of an issue in practice.
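
As a rough, hypothetical sketch (not part of this patch), a userspace
stress test along the following lines could generate heavy writer
contention on a real rwsem (mm->mmap_sem) and exercise the spinning
path; the thread and iteration counts are arbitrary:

/*
 * Hypothetical userspace sketch: each thread repeatedly maps and unmaps
 * an anonymous page, so every iteration takes mm->mmap_sem for writing
 * and the rwsem optimistic spinning path gets exercised under contention.
 */
#include <pthread.h>
#include <stdio.h>
#include <sys/mman.h>

#define NTHREADS	8	/* arbitrary */
#define ITERS		100000	/* arbitrary */

static void *hammer(void *unused)
{
	int i;

	for (i = 0; i < ITERS; i++) {
		void *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
			       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

		if (p == MAP_FAILED) {
			perror("mmap");
			break;
		}
		munmap(p, 4096);
	}
	return NULL;
}

int main(void)
{
	pthread_t threads[NTHREADS];
	int i;

	for (i = 0; i < NTHREADS; i++)
		pthread_create(&threads[i], NULL, hammer, NULL);
	for (i = 0; i < NTHREADS; i++)
		pthread_join(threads[i], NULL);
	return 0;
}

Timing a run of something like that before and after the change, with
CONFIG_DEBUG_PAGEALLOC both on and off, would give a rough idea of
whether the per-iteration check matters.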

---
kernel/locking/rwsem-xadd.c | 43 ++++++++++++++++++++++++++++---------------
1 files changed, 28 insertions(+), 15 deletions(-)

diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
index 3417d01..870c574 100644
--- a/kernel/locking/rwsem-xadd.c
+++ b/kernel/locking/rwsem-xadd.c
@@ -295,6 +295,31 @@ static inline bool rwsem_try_write_lock_unqueued(struct rw_semaphore *sem)
 	}
 }
 
+static inline bool owner_running(struct rw_semaphore *sem, struct task_struct *owner)
+{
+	bool ret;
+
+#ifdef CONFIG_DEBUG_PAGEALLOC
+	rcu_read_lock();
+#endif
+	if (READ_ONCE(sem->owner) == owner) {
+		/*
+		 * Ensure we emit the owner->on_cpu dereference
+		 * after checking sem->owner still matches owner.
+		 */
+		barrier();
+		ret = owner->on_cpu;
+	} else {
+		/* Owner changed. */
+		ret = false;
+	}
+
+#ifdef CONFIG_DEBUG_PAGEALLOC
+	rcu_read_unlock();
+#endif
+	return ret;
+}
+
 static inline bool rwsem_can_spin_on_owner(struct rw_semaphore *sem)
 {
 	struct task_struct *owner;
@@ -329,25 +354,13 @@ bool rwsem_spin_on_owner(struct rw_semaphore *sem, struct task_struct *owner)
 {
 	long count;
 
-	rcu_read_lock();
-	while (sem->owner == owner) {
-		/*
-		 * Ensure we emit the owner->on_cpu, dereference _after_
-		 * checking sem->owner still matches owner, if that fails,
-		 * owner might point to free()d memory, if it still matches,
-		 * the rcu_read_lock() ensures the memory stays valid.
-		 */
-		barrier();
-
-		/* abort spinning when need_resched or owner is not running */
-		if (!owner->on_cpu || need_resched()) {
-			rcu_read_unlock();
+	while (owner_running(sem, owner)) {
+		/* Abort spinning when need resched. */
+		if (need_resched())
 			return false;
-		}
 
 		cpu_relax_lowlatency();
 	}
-	rcu_read_unlock();
 
 	if (READ_ONCE(sem->owner))
 		return true; /* new owner, continue spinning */
--
1.7.2.5


