Re: [patch 0/9] mutex subsystem, -V4

From: Andrew Morton
Date: Mon Dec 26 2005 - 05:36:35 EST


Ingo Molnar <mingo@xxxxxxx> wrote:
>
>
> * Andrew Morton <akpm@xxxxxxxx> wrote:
>
> > One side point on semaphores and mutexes: the so-called "fast path" is
> > generally not performance-critical, because we just don't take them at
> > high frequencies. Any workload which involves taking a semaphore at
> > more than 50,000-100,000 times/second tends to have ghastly
> > overscheduling failure scenarios on SMP. So people hit those
> > scenarios and the code gets converted to a lockless algorithm or to
> > use spinlocking.
> >
> > For example, for a while ext3/JBD was doing 200,000 context-switches
> > per second due to taking lock_super() at high frequencies. When I
> > converted the whole fs to use spin locking throughout the performance
> > in some workloads went up by 1000%.
>
> actually, i'm 99.9% certain [ ;-) ] that all that ext3 spinlock
> conversion pain could have been avoided by converting ext3 to the mutex
> code. Mutexes definitely do not overschedule, even in very high
> frequency lock/unlock scenarios. They behave and perform quite close to
> spinlocks. (which property is obviously a must for the -rt kernel, where
> all spinlocks, rwlocks, seqlocks, rwsems and semaphores are mutexes -
> providing a big playground for locking constructs)

hm. 16 CPUs hitting the same semaphore at great arrival rates. The cost
of a short spin is much less than the cost of a sleep/wakeup. The machine
was doing 100,000 - 200,000 context switches per second.

> hm, can you see any easy way for me to test my bold assertion on ext3,
> by somehow moving/hacking it back to semaphores?

Not really. The problem was most apparent after the lock_kernel() removal
patches. The first thing a CPU hit when it entered the fs was previously
lock_kernel(). That became lock_super() and performance went down the
tubes. From memory, the bad kernel was tip-of-tree as of Memorial Weekend
2003 ;)

I guess you could re-add all the lock_super()s as per 2.5.x's ext3/jbd,
check that it sucks running SDET on 8-way then implement the lock_super()s
via a mutex.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/