Re: [PATCH RFC tip/core/rcu 13/20] rcu: increase synchronize_sched_expedited()batching

From: Tejun Heo
Date: Sat Dec 18 2010 - 11:13:20 EST


Hello,

On 12/17/2010 09:54 PM, Paul E. McKenney wrote:
> From: Tejun Heo <tj@xxxxxxxxxx>
>
> The fix in commit #6a0cc49 requires more than three concurrent instances
> of synchronize_sched_expedited() before batching is possible. This
> patch uses a ticket-counter-like approach that is also not unrelated to
> Lai Jiangshan's Ring RCU to allow sharing of expedited grace periods even
> when there are only two concurrent instances of synchronize_sched_expedited().
>
> This commit builds on Tejun's original posting, which may be found at
> http://lkml.org/lkml/2010/11/9/204, adding memory barriers, avoiding
> overflow of signed integers (other than via atomic_t), and fixing the
> detection of batching.
>
> Signed-off-by: Tejun Heo <tj@xxxxxxxxxx>
> Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>

Acked-by: Tejun Heo <tj@xxxxxxxxxx>

Some comments on the sequence testing tho.

> diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> index 49e8e16..af56148 100644
> --- a/include/linux/rcupdate.h
> +++ b/include/linux/rcupdate.h
> @@ -47,6 +47,8 @@
> extern int rcutorture_runnable; /* for sysctl */
> #endif /* #ifdef CONFIG_RCU_TORTURE_TEST */
>
> +#define UINT_CMP_GE(a, b) (UINT_MAX / 2 >= (a) - (b))
> +#define UINT_CMP_LT(a, b) (UINT_MAX / 2 < (a) - (b))
> #define ULONG_CMP_GE(a, b) (ULONG_MAX / 2 >= (a) - (b))
> #define ULONG_CMP_LT(a, b) (ULONG_MAX / 2 < (a) - (b))

I don't think the original comparison had overflow problem. (a) < (b)
gives the wrong result on overflow but (int)((a) - (b)) < 0 is
correct.

I find the latter approach cleaner and that way the constant in the
instruction can be avoided too although if the compiler might generate
the same code regardless.

Also, I think the names are misleading. They aren't testing whether
one is greater or less than the other. They're testing whether one is
before or after the other where the counters are used as monotonically
incrementing (with wrapping) sequence, so wouldn't something like the
following be better?

#define SEQ_TEST(a, b, test_op) ({ \
typeof(a) __seq_a = (a); \
typeof(b) __seq_b = (b); \
bool __ret; \
(void)(&__seq_a == &__seq_b); \
switch (sizeof(__seq_a)) { \
case sizeof(char): \
__ret = (char)(__seq_a - __seq_b) test_op 0; \
break; \
case sizeof(int): \
__ret = (int)(__seq_a - __seq_b) test_op 0; \
break; \
case sizeof(long): \
__ret = (long)(__seq_a - __seq_b) test_op 0; \
break; \
case sizeof(long long): \
__ret = (long long)(__seq_a - __seq_b) test_op 0; \
break; \
default: \
__make_build_fail; \
} \
__ret; \
})

#define SEQ_BEFORE(a, b) SEQ_TEST((a), (b), <)
and so on...

Please note that the above macro is completely untested.

Thanks.

--
tejun
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/