Re: smp_call_function_single lockups

From: Linus Torvalds
Date: Wed Feb 11 2015 - 14:59:13 EST


On Wed, Feb 11, 2015 at 10:18 AM, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> I'll think about this all, but we couldn't figure anything out last
> time we looked at it, so without more clues, don't hold your breath.

So having looked at it once more, one thing struck me:

Look at smp_call_function_single_async(). The comment says

    * Like smp_call_function_single(), but the call is asynchronous and
    * can thus be done from contexts with disabled interrupts.

but that is *only* true if we don't have to wait for the csd lock. The
comments even clarify that:

    * The caller passes his own pre-allocated data structure
    * (ie: embedded in an object) and is responsible for synchronizing it
    * such that the IPIs performed on the @csd are strictly serialized.

but it's not at all clear that the caller *can* do that. Since the
"csd_unlock()" is done *after* the call to the callback function, any
serialization done by the caller is fundamentally not trustworthy,
since it cannot serialize with the csd lock - if it releases things in
the callback, the csd lock will still be set after releasing things.

So the caller has a really hard time guaranteeing that CSD_LOCK isn't
set. And if the call is done in interrupt context, for all we know it
is interrupting the code that is going to clear CSD_LOCK, so CSD_LOCK
will never be cleared at all, and csd_lock() will wait forever.
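
To make that window concrete, here is a rough single-threaded user-space
model, not kernel code. Only the CSD_LOCK name comes from the kernel; the
struct layouts, the flag value, and every helper name below are made up for
illustration. The callback releases the caller's own "pending" flag, and a
modeled interrupt then lands before the unlock:

```c
#include <assert.h>
#include <stdbool.h>

#define CSD_LOCK 0x01u /* name from the kernel; the value here is illustrative */

/* Simplified stand-in for the kernel's call_single_data. */
struct model_csd {
	unsigned int flags;
	void (*func)(void *info);
	void *info;
};

/* Hypothetical caller object that embeds the csd and serializes by itself. */
struct model_owner {
	struct model_csd csd;
	bool pending;    /* caller's view: "an IPI is in flight" */
	bool would_hang; /* set if the modeled interrupt would spin forever */
};

/* The caller releases its own serialization inside the callback. */
static void owner_callback(void *info)
{
	struct model_owner *o = info;

	o->pending = false;
}

/*
 * Models an interrupt that wants to reuse the csd. The caller's flag says
 * the csd is free, but CSD_LOCK is still set, and the code that would
 * clear it is the code this "interrupt" preempted.
 */
static void modeled_interrupt(struct model_owner *o)
{
	if (!o->pending && (o->csd.flags & CSD_LOCK))
		o->would_hang = true;
}

/* Callback first, unlock afterwards: the ordering discussed above. */
static bool unlock_after_callback_can_hang(void)
{
	struct model_owner o = { .pending = false, .would_hang = false };

	o.csd.func = owner_callback;
	o.csd.info = &o;

	/* Caller side: mark busy and take the csd lock. */
	o.pending = true;
	o.csd.flags |= CSD_LOCK;

	/* Target side: run the callback, then unlock. */
	o.csd.func(o.csd.info);
	modeled_interrupt(&o);     /* lands between callback and unlock */
	o.csd.flags &= ~CSD_LOCK;

	assert(!(o.csd.flags & CSD_LOCK));
	return o.would_hang;
}
```

In this model unlock_after_callback_can_hang() reports the hang condition,
because the caller's flag and CSD_LOCK disagree inside that window.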

So I actually think that for the async case, we really *should* unlock
before doing the callback (which is what Thomas' old patch did).
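
A sketch of that reordering, again as a user-space model with made-up
names rather than the actual patch: copy func and info out of the csd,
clear CSD_LOCK, and only then run the callback. The copy is my assumption
about what a safe version needs, since the owner may reuse the csd as soon
as it is unlocked.

```c
#include <assert.h>
#include <stdbool.h>

#define CSD_LOCK 0x01u /* name from the kernel; the value here is illustrative */

/* Simplified stand-in for the kernel's call_single_data. */
struct model_csd {
	unsigned int flags;
	void (*func)(void *info);
	void *info;
};

/* Hypothetical caller object that embeds the csd and serializes by itself. */
struct model_owner {
	struct model_csd csd;
	bool pending;    /* caller's view: "an IPI is in flight" */
	bool would_hang; /* set if the modeled interrupt would spin forever */
};

/* The caller releases its own serialization inside the callback. */
static void owner_callback(void *info)
{
	struct model_owner *o = info;

	o->pending = false;
}

/* Models an interrupt that wants to reuse the csd right after the callback. */
static void modeled_interrupt(struct model_owner *o)
{
	if (!o->pending && (o->csd.flags & CSD_LOCK))
		o->would_hang = true;
}

/* Unlock first, callback afterwards. */
static bool unlock_before_callback_can_hang(void)
{
	struct model_owner o = { .pending = false, .would_hang = false };
	void (*func)(void *info);
	void *info;

	o.csd.func = owner_callback;
	o.csd.info = &o;

	/* Caller side: mark busy and take the csd lock. */
	o.pending = true;
	o.csd.flags |= CSD_LOCK;

	/* Target side: snapshot the call, release the csd, then run it. */
	func = o.csd.func;
	info = o.csd.info;
	o.csd.flags &= ~CSD_LOCK;
	func(info);
	modeled_interrupt(&o);     /* same point in time as before */

	return o.would_hang;
}
```

Here the modeled interrupt finds CSD_LOCK already clear once the caller's
flag is released, so unlock_before_callback_can_hang() returns false.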

And we might well be better off doing something like

    WARN_ON_ONCE(csd->flags & CSD_LOCK);

in smp_call_function_single_async(), because that really is a hard requirement.
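
As a user-space approximation of what such a check could look like:
model_warn_on_once() only imitates the warn-once behaviour of the kernel
macro, and returning -EBUSY instead of spinning is my own assumption for
the sketch, not something proposed above. All model_* names are
hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

#define CSD_LOCK 0x01u /* name from the kernel; the value here is illustrative */

/* Simplified stand-in for the kernel's call_single_data. */
struct model_csd {
	unsigned int flags;
};

static int model_warn_count; /* how many warnings were actually printed */

/* Imitates warn-once semantics: report the first hit, return the condition. */
static int model_warn_on_once(int cond, const char *what)
{
	static int warned;

	if (cond && !warned) {
		warned = 1;
		model_warn_count++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/*
 * Modeled async entry point: refuse a csd that is still locked rather
 * than waiting for it (the -EBUSY return is an assumption of this sketch).
 */
static int model_call_single_async(struct model_csd *csd)
{
	if (model_warn_on_once(!!(csd->flags & CSD_LOCK),
			       "csd still locked in async call"))
		return -EBUSY;

	csd->flags |= CSD_LOCK;
	/* ... queue the csd and send the IPI here ... */
	return 0;
}
```

Calling model_call_single_async() twice on the same csd without an unlock
in between succeeds the first time and trips the check the second time.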

And it strikes me that hrtick_csd is one of these cases that do this
with interrupts disabled, and use the callback for serialization. So I
really wonder if this is part of the problem..

Thomas? Am I missing something?

Linus