Re: frequent lockups in 3.18rc4
From: Thomas Gleixner
Date: Mon Nov 17 2014 - 19:15:22 EST
On Mon, 17 Nov 2014, Linus Torvalds wrote:
> On Mon, Nov 17, 2014 at 2:43 PM, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> > So a combo of both (Jens and yours) might do the trick. Patch below.
> Yeah, I guess that would work. The important part is that *if*
> somebody really reuses the csd, we'd better have a release barrier
> (which csd_unlock() does, although badly - but this probably isn't
> that performance-critical) *before* we call the function, because
> otherwise there's no real serialization for the reuse.
> Of course, most of these things are presumably always per-cpu data
> structures, so the whole worry about "csd" being accessed from
> different CPU's probably doesn't even exist, and this all works fine
> as-is anyway, even in the presense of odd memory ordering issues.
> Judging from Jens' later email, it looks like we simply don't need
> this code at all any more, though, and we could just revert the
Right. Reverting it is the proper solution for now. Though we should
really think about the async seperation later. It makes a lot of
> NOTE! I don't think this actually has anything to do with the actual
> problem that Dave saw. I just reacted to that WARN_ON() when I was
> looking at the code, and it made me go "that looks extremely
One thing I was looking into today is the increased use of irq_work
which uses IPIs as well. Not sure whether that's related, but I it's
not from my radar yet.
But the possible compiler wreckage (or exposed kernel wreckage) is
frightening in several aspects ...
> Particularly on x86, with strong memory ordering, I don't think that
> any random accesses to 'csd' after the call to 'csd->func()' could
> actually matter. I just felt very nervous about the claim that
> somebody can reuse the csd immediately, that smelled bad to me from a
> *conceptual* standpoint, even if I suspect it works perfectly fine in
> Anyway, I've found *another* race condition, which (again) doesn't
> actually seem to be an issue on x86.
> In particular, "csd_lock()" does things pretty well, in that it does a
> smp_mb() after setting the lock bit, so certainly nothing afterwards
> will leak out of that locked region.
> But look at csd_lock_wait(). It just does
> while (csd->flags & CSD_FLAG_LOCK)
> and basically there's no memory barriers there. Now, on x86, this is a
> non-issue, since all reads act as an acquire, but at least in *theory*
> we have this completely unordered read going on. So any subsequent
> memory oeprations (ie after the return from generic_exec_single()
> could in theory see data from *before* the read.
> So that whole kernel/smp.c locking looks rather dubious. The smp_mb()
> in csd_lock() is overkill (a "smp_store_release()" should be
> sufficient), and I think that the read of csd->flags in csd_unlock()
> should be a smp_load_acquire().
> Again, none of this has anything to do with Dave's problem. The memory
> ordering issues really cannot be an issue on x86, I'm just saying that
> there's code there that makes me a bit uncomfortable.
Right you are and we should fix it asap.
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/