[PATCH v2 0/2] tmpfs: Improve tmpfs scalability

From: Tim Chen
Date: Wed May 26 2010 - 15:35:14 EST


We created a token jar library that implements a per-cpu cache
of tokens, avoiding lock contention whenever we retrieve a token
from or return a token to a token jar. Using this library with
tmpfs, we find that Aim7 fserver throughput improved by 270%
on a 4-socket, 32-core NHM-EX system.
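
To illustrate the idea, here is a minimal userspace sketch of a
token jar with per-cpu caches. It is not the code from the patches:
the names (qtoken_sketch_*), the fixed cpu count, the batch size and
the use of a pthread mutex are all assumptions made for illustration.
It also assumes each slot is only ever touched by the thread that
owns it, which stands in for per-cpu data in the kernel.

```c
#include <assert.h>
#include <pthread.h>

#define SKETCH_NR_CPUS 4   /* hypothetical fixed cpu count */
#define SKETCH_BATCH   8   /* hypothetical refill/drain batch size */

struct qtoken_sketch {
	pthread_mutex_t lock;                 /* protects pool only */
	unsigned long pool;                   /* tokens in the shared jar */
	unsigned long cache[SKETCH_NR_CPUS];  /* tokens cached per cpu */
};

static void qtoken_sketch_init(struct qtoken_sketch *jar, unsigned long total)
{
	pthread_mutex_init(&jar->lock, NULL);
	jar->pool = total;
	for (int i = 0; i < SKETCH_NR_CPUS; i++)
		jar->cache[i] = 0;
}

/* Take one token for `cpu`. Returns 0 on success, non-zero on failure. */
static int qtoken_sketch_get(struct qtoken_sketch *jar, int cpu)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);

	if (jar->cache[cpu] == 0) {
		/* Slow path: refill the local cache from the shared pool. */
		pthread_mutex_lock(&jar->lock);
		unsigned long n = jar->pool < SKETCH_BATCH ? jar->pool : SKETCH_BATCH;
		jar->pool -= n;
		pthread_mutex_unlock(&jar->lock);

		/*
		 * Simplification: fail if the pool is empty, even though
		 * other cpus may still hold cached tokens.
		 */
		if (n == 0)
			return -1;
		jar->cache[cpu] = n;
	}

	/* Fast path: no shared lock taken. */
	jar->cache[cpu]--;
	return 0;
}

/* Return one token on `cpu`, draining a batch if the cache grows large. */
static void qtoken_sketch_put(struct qtoken_sketch *jar, int cpu)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);

	jar->cache[cpu]++;
	if (jar->cache[cpu] > 2 * SKETCH_BATCH) {
		pthread_mutex_lock(&jar->lock);
		jar->pool += SKETCH_BATCH;
		pthread_mutex_unlock(&jar->lock);
		jar->cache[cpu] -= SKETCH_BATCH;
	}
}

/* Total free tokens: shared pool plus every per-cpu cache. */
static unsigned long qtoken_sketch_total(struct qtoken_sketch *jar)
{
	unsigned long sum;

	pthread_mutex_lock(&jar->lock);
	sum = jar->pool;
	pthread_mutex_unlock(&jar->lock);
	for (int i = 0; i < SKETCH_NR_CPUS; i++)
		sum += jar->cache[i];
	return sum;
}
```

In this sketch the shared lock is taken only once per batch of
tokens rather than once per token, which is where the reduction in
contention comes from. A fuller implementation would also need to
reclaim tokens cached on other cpus when the shared pool runs dry.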

In the current implementation of tmpfs, stat_lock in
shmem_sb_info must be acquired whenever we get a new page.
This causes heavy lock contention when multiple threads
use tmpfs simultaneously, so systems with a large number
of cpus scale poorly. Almost 75% of cpu time was spent
contending on stat_lock when we ran the Aim7 fserver load
with 128 threads on a 4-socket, 32-core NHM-EX system.

The first patch in the series implements the quick token jar.
The second patch updates the shmem code of tmpfs to use this
library to improve tmpfs performance.

Version 2 Changes:
1. Make the qtoken library callable from interrupt context
2. Change token counters to unsigned long
3. Add detailed comments to the qtoken library and interface functions
4. Fix various style issues from version 1
5. Update code to be cpu-hotplug-aware
6. Change the error return code of interface functions to a non-zero value

Regards,
Tim Chen


