Re: frequent lockups in 3.18rc4

From: Thomas Gleixner
Date: Mon Nov 17 2014 - 19:15:22 EST


On Mon, 17 Nov 2014, Linus Torvalds wrote:
> On Mon, Nov 17, 2014 at 2:43 PM, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> > So a combo of both (Jens and yours) might do the trick. Patch below.
>
> Yeah, I guess that would work. The important part is that *if*
> somebody really reuses the csd, we'd better have a release barrier
> (which csd_unlock() does, although badly - but this probably isn't
> that performance-critical) *before* we call the function, because
> otherwise there's no real serialization for the reuse.

Indeed.

> Of course, most of these things are presumably always per-cpu data
> structures, so the whole worry about "csd" being accessed from
> different CPU's probably doesn't even exist, and this all works fine
> as-is anyway, even in the presense of odd memory ordering issues.
>
> Judging from Jens' later email, it looks like we simply don't need
> this code at all any more, though, and we could just revert the
> commit.

Right. Reverting it is the proper solution for now. Though we should
really think about the async seperation later. It makes a lot of
sense.

> NOTE! I don't think this actually has anything to do with the actual
> problem that Dave saw. I just reacted to that WARN_ON() when I was
> looking at the code, and it made me go "that looks extremely
> suspicious".

One thing I was looking into today is the increased use of irq_work
which uses IPIs as well. Not sure whether that's related, but I it's
not from my radar yet.

But the possible compiler wreckage (or exposed kernel wreckage) is
frightening in several aspects ...

> Particularly on x86, with strong memory ordering, I don't think that
> any random accesses to 'csd' after the call to 'csd->func()' could
> actually matter. I just felt very nervous about the claim that
> somebody can reuse the csd immediately, that smelled bad to me from a
> *conceptual* standpoint, even if I suspect it works perfectly fine in
> practice.
>
> Anyway, I've found *another* race condition, which (again) doesn't
> actually seem to be an issue on x86.
>
> In particular, "csd_lock()" does things pretty well, in that it does a
> smp_mb() after setting the lock bit, so certainly nothing afterwards
> will leak out of that locked region.
>
> But look at csd_lock_wait(). It just does
>
> while (csd->flags & CSD_FLAG_LOCK)
> cpu_relax();
>
> and basically there's no memory barriers there. Now, on x86, this is a
> non-issue, since all reads act as an acquire, but at least in *theory*
> we have this completely unordered read going on. So any subsequent
> memory oeprations (ie after the return from generic_exec_single()
> could in theory see data from *before* the read.

True.

> So that whole kernel/smp.c locking looks rather dubious. The smp_mb()
> in csd_lock() is overkill (a "smp_store_release()" should be
> sufficient), and I think that the read of csd->flags in csd_unlock()
> should be a smp_load_acquire().
>
> Again, none of this has anything to do with Dave's problem. The memory
> ordering issues really cannot be an issue on x86, I'm just saying that
> there's code there that makes me a bit uncomfortable.

Right you are and we should fix it asap.

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/