I'm looking at adaptive spinning with futexes as a way to reduce the dependence on sched_yield() when implementing userspace spinlocks. Chris, I included you in the CC after reading your comments regarding sched_yield() at Kernel Summit, as I thought you might be interested.
I have an experimental patchset that implements FUTEX_LOCK and FUTEX_LOCK_ADAPTIVE in the kernel and uses something akin to mutex_spin_on_owner() for the first waiter to spin. What I'm finding is that adaptive spinning actually hurts my particular test case, so I was hoping to poll people for context on where the existing adaptive spinning implementations in the kernel show a benefit. Under which conditions does adaptive spinning help?
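For readers who want a concrete picture of spin-then-block, here is a minimal userspace sketch built only on the existing FUTEX_WAIT/FUTEX_WAKE operations. It is not the patchset's code: userspace cannot tell whether the lock owner is currently running, so this sketch spins a fixed number of times (SPIN_LIMIT, an arbitrary illustrative value) instead of spinning on the owner. The function names are hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <stdatomic.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Arbitrary illustrative spin budget; not a tuned value. */
#define SPIN_LIMIT 100

/* Thin wrapper around the raw futex syscall (no timeout, no second address). */
static long futex(atomic_int *uaddr, int op, int val)
{
    return syscall(SYS_futex, uaddr, op, val, NULL, NULL, 0);
}

/* Lock word states: 0 = unlocked, 1 = locked, 2 = locked, waiters possible. */
static void spin_then_wait_lock(atomic_int *l)
{
    /* Spin phase: retry the uncontended fast path a bounded number of times. */
    for (int i = 0; i < SPIN_LIMIT; i++) {
        int expected = 0;
        if (atomic_compare_exchange_strong(l, &expected, 1))
            return;
    }
    /* Block phase: mark the lock contended and sleep while it stays at 2. */
    while (atomic_exchange(l, 2) != 0)
        futex(l, FUTEX_WAIT_PRIVATE, 2);
}

static void spin_then_wait_unlock(atomic_int *l)
{
    int prev = atomic_exchange(l, 0);
    assert(prev != 0); /* unlocking an unlocked lock is a caller bug */
    /* Only enter the kernel if a waiter may be sleeping. */
    if (prev == 2)
        futex(l, FUTEX_WAKE_PRIVATE, 1);
}
```

A fixed spin count is the weakest form of the idea. Spinning only while the owner is on a CPU, as mutex_spin_on_owner() does, needs scheduler state that is only visible in the kernel.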
I presume locks with a short average hold time stand to gain the most: the longer the lock is held, the more likely it is that the spinner will exhaust its timeslice or that the scheduling gain becomes noise in the acquisition time. My test case simply calls "lock();unlock()" for a fixed number of iterations and reports the iterations per second at the end of the run. It can run with an arbitrary number of threads as well. I typically run with 256 threads for 10M iterations.
futex_lock: Result: 635 Kiter/s
futex_lock_adaptive: Result: 542 Kiter/s
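
The shape of the test loop described above can be sketched roughly as follows. This is not the actual test program: pthread_mutex stands in for the experimental FUTEX_LOCK operations, run_test() and worker() are hypothetical names, and treating the iteration count as per-thread is an assumption.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <time.h>

/* Stand-in lock; the real test would use the experimental futex ops. */
static pthread_mutex_t test_lock = PTHREAD_MUTEX_INITIALIZER;

/* Each thread runs lock();unlock() for the requested number of iterations. */
static void *worker(void *p)
{
    long iterations = *(long *)p;
    for (long i = 0; i < iterations; i++) {
        pthread_mutex_lock(&test_lock);
        pthread_mutex_unlock(&test_lock);
    }
    return NULL;
}

/*
 * Run nthreads workers and return the aggregate rate in Kiter/s,
 * or -1.0 on failure. Assumption: iterations is a per-thread count.
 * The timed region includes thread creation and join.
 */
static double run_test(int nthreads, long iterations)
{
    assert(nthreads > 0 && iterations > 0);

    pthread_t *tids = malloc(sizeof(*tids) * (size_t)nthreads);
    if (!tids)
        return -1.0;

    struct timespec start, end;
    clock_gettime(CLOCK_MONOTONIC, &start);

    int created = 0;
    for (; created < nthreads; created++)
        if (pthread_create(&tids[created], NULL, worker, &iterations) != 0)
            break;
    for (int i = 0; i < created; i++)
        pthread_join(tids[i], NULL);

    clock_gettime(CLOCK_MONOTONIC, &end);
    free(tids);
    if (created != nthreads)
        return -1.0;

    double secs = (double)(end.tv_sec - start.tv_sec) +
                  (double)(end.tv_nsec - start.tv_nsec) / 1e9;
    if (secs <= 0.0)
        return -1.0;
    return (double)nthreads * (double)iterations / secs / 1000.0;
}
```

Calling run_test(256, 10000000L) would mirror the run described above. With no work done inside or outside the critical section, the lock is held for an extremely short time and contended almost continuously.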