Re: -rt dbench scalability issue

From: Nick Piggin
Date: Sat Oct 17 2009 - 18:39:14 EST


On Fri, Oct 16, 2009 at 01:05:19PM -0700, john stultz wrote:
> 2.6.31.2-rt13-nick on ramfs:
> 46.51% dbench [kernel] [k] _atomic_spin_lock_irqsave
> |
> |--86.95%-- rt_spin_lock_slowlock
> | rt_spin_lock
> | |
> | |--50.08%-- dput
> | | |
> | | |--56.92%-- __link_path_walk
> | | |
> | | --43.08%-- path_put
> | |
> | |--49.12%-- path_get
> | | |
> | | |--63.22%-- path_walk
> | | |
> | | |--36.73%-- path_init
> |
> |--12.59%-- rt_spin_lock_slowunlock
> | rt_spin_unlock
> | |
> | |--49.86%-- path_get
> | | |
> | | |--58.15%-- path_init
> | | | |
> ...
>
>
> So the net of this is: Nick's patches helped some but not that much in
> ramfs filesystems, and hurt ext3 performance w/ -rt.
>
> Maybe I just mis-applied the patches? I'll admit I'm unfamiliar with the
> dcache code, and converting the patches to the -rt tree was not always
> straight forward.

The above are dentry->d_lock, and they are from path walking. It has
become more pronounced because I use d_lock to protect d_count rather
than an atomic_t (which saves on atomic ops).

But the patchset you have converted is missing the store-free path walk
patches, which will get rid of most of this. The next thing you hit is
glibc reading /proc/mounts to implement statvfs :( If you turn that call
into statfs you'll get a little further (but we need to improve statfs
support in glibc so it doesn't need those hacks).

And then you run into something else, I'd say d_lock again for creating
and unlinking things, but I didn't get a chance to profile it yet.

> Ingo, Nick, Thomas: Any thoughts or comments here? Am I reading perf's
> results incorrectly? Any idea why with Nick's patch the contention in
> dput() hurts ext3 so much worse than in the ramfs case?

ext3 may be doing more dentry refcounting, which is hitting the spin
lock. I _could_ be persuaded to turn it back into an atomic_t, however
I want to wait until other things like the path walking are more
mature, which should take a lot of pressure off it.

Also... buying dbench throughput at the cost of an extra atomic op at
dput time is... not a good trade. We would need some more important
workloads, I think (even a real samba serving netbench would be
preferable).


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/