On 05/28/2014 08:16 AM, Raghavendra K T wrote:
This patch looks very promising.
- My kernbench/ebizzy test on bare metal (32 CPU + HT Sandy Bridge) did not seem to
show the impact of the extra cmpxchg, though there should be some effect from it.
Canceled out by better NUMA locality?
- We could further add a dynamically changing batch_size implementation (inspired by a
hint from Paul McKenney) as necessary.
I could see a larger batch size being beneficial.
Currently the maximum wait time for a spinlock on a system
with N CPUs is N times the length of the largest critical
section.
Having the batch size set equal to the number of CPUs would only
double that, and better locality (CPUs local to the current
lock holder winning the spinlock operation) might speed things
up enough to cancel part of that out again...
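To make that concrete, here is a back-of-the-envelope sketch (my own
numbers and formula, not something from the patch), assuming a waiter
still advances one position per release but can additionally be
overtaken by up to TICKET_BATCH - 1 CPUs inside its own batch:

#include <stdio.h>

int main(void)
{
	unsigned int ncpus = 32;		/* e.g. the 32-CPU test box above */
	unsigned int crit_ns = 1000;		/* assumed largest critical section */
	unsigned int batches[] = { 4, 32 };	/* current TICKET_BATCH vs. batch == N */
	int i;

	for (i = 0; i < 2; i++) {
		/* worst case: N turns plus up to (batch - 1) in-batch overtakes */
		unsigned long worst = (unsigned long)(ncpus + batches[i] - 1) * crit_ns;
		printf("batch %2u: worst-case wait ~%lu ns\n", batches[i], worst);
	}
	return 0;
}

With batch == 32 on that box the worst case comes out around 63 critical
sections, i.e. roughly 2N, which is the factor of two I mean above.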
+#define TICKET_LOCK_INC_SHIFT 1
+#define __TICKET_LOCK_TAIL_INC (1<<TICKET_LOCK_INC_SHIFT)
+
#ifdef CONFIG_PARAVIRT_SPINLOCKS
-#define __TICKET_LOCK_INC 2
#define TICKET_SLOWPATH_FLAG ((__ticket_t)1)
#else
-#define __TICKET_LOCK_INC 1
#define TICKET_SLOWPATH_FLAG ((__ticket_t)0)
#endif
For the !CONFIG_PARAVIRT case, TICKET_LOCK_INC_SHIFT used to be 0;
now you are making it 1. Probably not an issue, since even people
who compile with 128 < CONFIG_NR_CPUS <= 256 will likely have their
spinlocks padded out to 32 or 64 bits anyway in most data structures.
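For reference, here is my reading of why that range matters, based on the
existing type-selection pattern in arch/x86/include/asm/spinlock_types.h
(illustrative only, and assuming the #if is keyed off the renamed tail
increment in your patch): with a tail increment of 2, an 8-bit __ticket_t
only distinguishes 128 tickets, so 128 < CONFIG_NR_CPUS <= 256 falls
through to the 16-bit ticket and a 32-bit spinlock.

#if (CONFIG_NR_CPUS < (256 / __TICKET_LOCK_TAIL_INC))
typedef u8  __ticket_t;
typedef u16 __ticketpair_t;
#else
typedef u16 __ticket_t;
typedef u32 __ticketpair_t;
#endif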
+#define TICKET_BATCH 0x4 /* 4 waiters can contend simultaneously */
+#define TICKET_LOCK_BATCH_MASK (~(TICKET_BATCH<<TICKET_LOCK_INC_SHIFT) + \
+ TICKET_LOCK_TAIL_INC - 1)
I do not see the value in having TICKET_BATCH declared with a
hexadecimal number, and it may be worth making sure the code
does not compile if someone tried a TICKET_BATCH value that
is not a power of 2.
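One way to get that guarantee at build time, as a sketch (a plain
preprocessor check; a BUILD_BUG_ON(!is_power_of_2(TICKET_BATCH)) in an
init path would do the job as well):

#define TICKET_BATCH	4	/* 4 waiters can contend simultaneously */

#if TICKET_BATCH & (TICKET_BATCH - 1)
#error "TICKET_BATCH must be a power of 2"
#endif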