Re: [RFC][PATCH 0/7] locking: qspinlock

From: Dave Chinner
Date: Wed Mar 12 2014 - 19:49:20 EST


On Wed, Mar 12, 2014 at 07:15:03AM +0100, Peter Zijlstra wrote:
> On Wed, Mar 12, 2014 at 01:31:53PM +1100, Dave Chinner wrote:
> > With the queuing spinlock, I expected to see somewhat better
> > results, but I didn't at first. Turns out if you have any sort of
> > lock debugging turned on, then the code doesn't ever go into the
> > lock slow path and hence does not ever enter the "lock failed" slow
> > path where all the contention fixes are supposed to be.
>
> Yeah; it's a 'feature' of the spinlock debugging to turn all spinlocks
> into test-and-set thingies.
>
> > Anyway, with all lock debugging turned off, the system hangs
> > the instant I start the multithreaded bulkstat workload. Even the
> console is unresponsive.
>
> Oops, I only briefly tested this series in userspace and that seemed to
> work. I'll go prod at it. Thanks for having a look though.
>
> Is that bstat test any easier/faster to set up/run than the aim7 crap?

Depends. I've got a VM set up with a sparse 100TB block device hosted
on SSDs where I can create 50M inodes using fsmark in about three and
a half minutes. I also have a hacked, multithreaded version of
xfstests::src/bstat.c that I then run, and it triggers the problem
straight away.
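
The fsmark run is along these lines - a sketch from memory rather
than the exact command line, so treat the parameters as illustrative:

$ fs_mark -D 10000 -S0 -n 100000 -s 0 -L 32 \
        -d /mnt/scratch/0 -d /mnt/scratch/1 \
        -d /mnt/scratch/2 -d /mnt/scratch/3 \
        -d /mnt/scratch/4 -d /mnt/scratch/5 \
        -d /mnt/scratch/6 -d /mnt/scratch/7 \
        -d /mnt/scratch/8 -d /mnt/scratch/9 \
        -d /mnt/scratch/10 -d /mnt/scratch/11 \
        -d /mnt/scratch/12 -d /mnt/scratch/13 \
        -d /mnt/scratch/14 -d /mnt/scratch/15

That's one fs_mark thread per -d directory, each creating 100,000
zero-length files per loop for 32 loops, i.e. a bit over 50M inodes
in total.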

Quite frankly, you don't need bulkstat to produce this lock
contention - you'll see it running this on a wide directory
structure on XFS and an SSD:

$ cat ~/tests/walk-scratch.sh
#!/bin/bash

echo Walking via find -ctime
echo 3 > /proc/sys/vm/drop_caches
time (
    for d in /mnt/scratch/[0-9]* ; do
        for i in $d/*; do
            (
                echo $i
                find $i -ctime 1 > /dev/null
            ) > /dev/null 2>&1
        done &
    done
    wait
)

echo Walking via ls -R
echo 3 > /proc/sys/vm/drop_caches
time (
    for d in /mnt/scratch/[0-9]* ; do
        for i in $d/*; do
            (
                echo $i
                ls -R $i
            ) > /dev/null 2>&1
        done &
    done
    wait
)
$

The directory structure I create has 16 top-level directories (0-15),
each with 30-40 subdirectories containing 100,000 files each.
There's a thread per top-level directory, and running it on a 16p VM
and an SSD that can do 30,000 IOPS will generate sufficient inode
cache pressure to trigger severe lock contention.

My usual test script for this workload runs mkfs, fsmark,
xfs_repair, bstat, walk-scratch, and finally a multi-threaded rm to
clean up. Typical inode counts are 50-100 million for zero-length
file workloads, 10-20 million for single-block (small) files, and
100,000 to 1 million for larger files. It's great for stressing VFS,
FS and IO level scalability...
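
In outline, the whole cycle looks something like the script below.
This is just a sketch of the sequence rather than the actual script -
the device, mount point, fs_mark parameters and the bstat-mt name are
placeholders:

#!/bin/bash
# Sketch of the stress cycle: mkfs, populate, repair, bulkstat, walk, rm.
# Device, mount point and parameters are placeholders.
dev=/dev/vdb
mnt=/mnt/scratch

mkfs.xfs -f $dev
mount $dev $mnt
mkdir -p $mnt/{0..15}

# populate: one fs_mark thread per top-level directory, zero-length files
fs_mark -D 10000 -S0 -n 100000 -s 0 -L 32 \
        $(for i in $(seq 0 15); do echo -d $mnt/$i; done)

# cold-cache metadata read pass over the whole filesystem
umount $mnt
xfs_repair $dev
mount $dev $mnt

# inode cache stress: bulkstat, then the directory walks shown earlier
./bstat-mt $mnt         # stand-in for the hacked multithreaded bstat.c
~/tests/walk-scratch.sh

# multi-threaded cleanup: one background rm per top-level directory
for d in $mnt/[0-9]*; do
    rm -rf $d &
done
wait
umount $mnt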

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx