Re: [PATCH RFC] x86: avoid atomic operation in test_and_set_bit_lock if possible

From: Jan Beulich
Date: Fri Mar 25 2011 - 12:46:32 EST


>>> On 25.03.11 at 17:29, Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Fri, Mar 25, 2011 at 3:06 AM, Jan Beulich <JBeulich@xxxxxxxxxx> wrote:
>>
>> The problem was observed with __lock_page() (in a variant not
>> upstream for reasons not known to me), and prefixing e.g.
>> trylock_page() with an extra PageLocked() check yielded the
>> below quoted improvements.
>
> Ok. __lock_page() _definitely_ should do the test_bit() thing first,
> because it's normally called from lock_page() that has already tested
> the bit.
>
> But it already seems to do that, so I'm wondering what your variant is.

lock_page() does, but here it's really about __lock_page().

We've got a patch from Nick Piggin in the tree that aims at
speeding up unlock_page, which turns __lock_page() (on
2.6.32) into

void __lock_page(struct page *page)
{
wait_queue_head_t *wq = page_waitqueue(page);
DEFINE_WAIT_BIT(wait, &page->flags, PG_locked);

do {
prepare_to_wait(wq, &wait.wait, TASK_UNINTERRUPTIBLE);
SetPageWaiters(page);
if (likely(PageLocked(page)))
sync_page(page);
} while (!trylock_page(page));
finish_wait(wq, &wait.wait);
}

(No, I do not know why the patch isn't upstream, but Nick is on Cc,
so maybe he can tell.)

> I'm also a bit surprised that lock_page() is that hot (unless your
> _lock_page() variant is simply too broken and ends up spinning?).
> Maybe we have some path that takes the page lock unnecessarily? What's
> the load?

So yes, there is spinning involved. A first improvement from Jack
was to make the SetPageWaiters() (introduced by that patch)
conditional upon the bit not already being set.

But that wasn't yielding the desired improvement, hence they
added a PageLocked() || ... in front of the trylock_page(), which
in turn raised the question why trylock_page() wouldn't do this
generally.

Jan

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/