Re: [PATCH 1/2] pid: tighten pidmap spinlock critical section by removing kfree()

From: André Goddard Rosa
Date: Mon Nov 23 2009 - 10:26:34 EST


Hi, Oleg!

On Mon, Nov 23, 2009 at 12:03 PM, Oleg Nesterov <oleg@xxxxxxxxxx> wrote:
> On 11/23, Pekka Enberg wrote:
>> (Adding some CC's.)
>>
>> On Sat, Nov 21, 2009 at 2:16 PM, André Goddard Rosa
>> <andre.goddard@xxxxxxxxx> wrote:
>> > Avoid calling kfree() under pidmap spinlock, calling it afterwards.
>> >
>> > Normally kfree() is very fast, but sometimes it can be slow, so avoid
>> > calling it under the spinlock if we can.
>
> kfree() is called when we race with another process which also
> finds map->page == NULL, allocs the new page and takes pidmap_lock
> before us. This is extremely unlikely case, right?

Right, more or less.

>> > @@ -141,11 +141,12 @@ static int alloc_pidmap(struct pid_namespace *pid_ns)
>> >                          * installing it:
>> >                          */
>> >                         spin_lock_irq(&pidmap_lock);
>> > -                       if (map->page)
>> > -                               kfree(page);
>> > -                       else
>> > +                       if (!map->page) {
>> >                                 map->page = page;
>> > +                               page = NULL;
>> > +                       }
>> >                         spin_unlock_irq(&pidmap_lock);
>> > +                       kfree(page);
>
> And this change pessimizes (a little bit) the likely case, when
> the race doesn't happen. And imho this change doesn't make the
> code more readable.
>
> But this is subjective, and technically the patch is correct
> afaics.

It does not affect the likely case which happens when the pidmap is
already allocated.

In the unlikely case where the pidmap page must be allocated, suppose
we have, say, 8 processes contending for that spinlock, and one of them
gets it first and installs its page. With kfree() outside the spinlock,
the other 7 processes can do useful work (releasing their pages)
sooner, because none of them has to spin while the others free their
allocated pages under the lock.
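
To make the pattern concrete, here is a minimal userspace sketch of the
same idea. It is not the kernel code: struct slot, slot_lock,
slot_ensure_page() and PAGE_BYTES are hypothetical names, a pthread
mutex stands in for pidmap_lock, and calloc()/free() stand in for
kzalloc()/kfree(). The unlocked fast-path check that the kernel does
first is left out to keep the sketch free of data races.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define PAGE_BYTES 4096	/* hypothetical stand-in for PAGE_SIZE */

/* Hypothetical userspace stand-in for struct pidmap. */
struct slot {
	void *page;
};

static pthread_mutex_t slot_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Make sure slot->page is populated.
 * Returns 0 if a page is installed (by us or by a racing thread),
 * -1 if our allocation failed and nobody else installed one.
 */
static int slot_ensure_page(struct slot *slot)
{
	void *page;
	int installed;

	assert(slot != NULL);

	/* Allocate before taking the lock; this may be slow. */
	page = calloc(1, PAGE_BYTES);

	pthread_mutex_lock(&slot_lock);
	if (!slot->page) {
		/* We won the race: hand ownership to the slot. */
		slot->page = page;
		page = NULL;
	}
	installed = (slot->page != NULL);
	pthread_mutex_unlock(&slot_lock);

	/*
	 * Losers free their unused page here, outside the critical
	 * section. For the winner page is NULL and free(NULL) is a no-op.
	 */
	free(page);

	return installed ? 0 : -1;
}
```

The lock now covers only the pointer test and the assignment, so a
thread that lost the race never holds it while releasing memory.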

>> >                         if (unlikely(!map->page))
>> >
>
> Hmm. Off-topic, but why alloc_pidmap() does not do this right
> after kzalloc() ?

Hmm... I would say that it's an optimistic best effort. We avoid
failing right away, hoping that another (racing) process succeeded in
allocating the page. That is unlikely! :)

Thank you,
André