Re: Lockless page cache test results

From: Nick Piggin
Date: Fri Apr 28 2006 - 01:06:10 EST

In reply to: Linus Torvalds: "Re: Lockless page cache test results"
Next in thread: Linus Torvalds: "Re: Lockless page cache test results"

Linus Torvalds wrote:

> On Thu, 27 Apr 2006, Nick Piggin wrote:
>
>>> Of course, with small files, the actual filename lookup is likely to be the
>>> real limiter.
>>
>> Although that's lockless so it scales. find_get_page will overtake it
>> at some point.
>
> filename lookup is only lockless for independent files. You end up getting
> the "dentry->d_lock" for a successful lookup in the lookup path, so if you
> have multiple threads looking up the same files (or - MUCH more commonly -
> directories), you're not going to be lockless.

Oh that's true, I forgot. So the many small files case will often have
as much d_lock activity as tree_lock.

> I don't know how we could improve it. I've several times thought that we
> _should_ be able to do the directory lookups under the rcu read lock and
> never touch their d_count or d_lock at all, but the locking against
> directory renaming depends very intimately on d_lock.
>
> It is _possible_ that we should be able to handle it purely with just
> memory ordering rather than depending on d_lock. That would be wonderful.
>
> Of course, we do actually scale pretty damn well already. I'm just saying
> that it's not perfect.
>
> See __d_lookup() for details.

Yes I see. Perhaps a seqlock could do the trick (hmm, there already is one),
however we still have to increment the refcount, so there'll always be a
shared cacheline.
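
To make the seqlock-plus-refcount point concrete, here is a rough userspace
sketch in C11 atomics. It is not kernel code: struct entry, entry_rename()
and entry_lookup() are made-up names, the integer parent/name fields only
stand in for what a real lookup compares, and writers are assumed to be
serialized by some outer lock. It also ignores the window between the
validated read and taking the reference. The reader compares the fields
without taking any lock, yet a successful lookup still has to write the
shared refcount.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a dentry; not the kernel's struct dentry. */
struct entry {
    atomic_uint seq;      /* even: stable, odd: writer in progress */
    atomic_int  parent;   /* illustrative field guarded by seq */
    atomic_int  name;     /* illustrative field guarded by seq */
    atomic_int  refcount; /* written by every successful lookup */
};

/* Writer side. Assumes writers are serialized by an outer lock. */
static void entry_rename(struct entry *e, int new_parent, int new_name)
{
    unsigned s = atomic_load_explicit(&e->seq, memory_order_relaxed);

    assert((s & 1u) == 0); /* no other writer may be in progress */
    atomic_store_explicit(&e->seq, s + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&e->parent, new_parent, memory_order_relaxed);
    atomic_store_explicit(&e->name, new_name, memory_order_relaxed);
    atomic_store_explicit(&e->seq, s + 2, memory_order_release);
}

/* Reader side: true (and one reference taken) if parent and name match. */
static bool entry_lookup(struct entry *e, int parent, int name)
{
    unsigned s0, s1;
    int p, n;

    do {
        s0 = atomic_load_explicit(&e->seq, memory_order_acquire);
        p = atomic_load_explicit(&e->parent, memory_order_relaxed);
        n = atomic_load_explicit(&e->name, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
        s1 = atomic_load_explicit(&e->seq, memory_order_relaxed);
    } while ((s0 & 1u) || s0 != s1); /* retry if a writer interfered */

    if (p != parent || n != name)
        return false;

    /* The comparison took no lock, but this still dirties a shared line. */
    atomic_fetch_add_explicit(&e->refcount, 1, memory_order_relaxed);
    return true;
}
```

The retry loop only writes nothing on the read side; the atomic increment at
the end is the part that keeps bouncing a cacheline between CPUs when many
threads look up the same entry.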

--
SUSE Labs, Novell Inc.
