Re: [Bug 3.14.17] inconsistent lock state

From: Linus Torvalds
Date: Sun Aug 24 2014 - 13:50:21 EST


On Sun, Aug 24, 2014 at 4:28 AM, Knut Petersen
<Knut_Petersen@xxxxxxxxxxx> wrote:
>
> For months the postmaster has wantonly blocked all mail traffic from the
> biggest German ISP, t-online.de, to all vger.kernel.org mailing lists,
> so I could not cc lkml.

Hmm.

> Please forward the following bug report to lkml and to whomever it might
> be of interest:

Added the guilty parties to the cc. The problem seems to be that
/proc/acpi/event was first removed in commit 1696d9dc57e0 ("ACPI: Remove
the old /proc/acpi/event interface"), and then, because that caused
problems, a horribly broken netlink interface was added instead in
commit 0bf6368ee8f2 ("ACPI / button: Add ACPI Button event via netlink
routine").

And that commit really seems to be horribly horribly broken.

It calls the netlink routines from interrupt context, which doesn't
work. Thus lockdep warns about "netlink_poll()" using bh-safe locking:

spin_lock_bh(&sk->sk_receive_queue.lock);

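but then __netlink_sendskb() is using that same queue lock from
interrupt context. spin_lock_bh() only blocks softirqs, so a hard
interrupt can come in on the same CPU while netlink_poll() holds the
lock and then spin on it forever. This is not some "subtly wrong"
locking caught by lockdep, but a major bug.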
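
To make the inconsistency concrete, here is a toy userspace model of the
rule being violated: a lock that is ever taken from hardirq context must
never be taken with hard interrupts enabled. This is only a sketch. It
is not kernel code and not how lockdep is actually implemented, and the
names toy_lock, toy_acquire and toy_replay_report are made up for
illustration.

```c
#include <stdbool.h>
#include <stdio.h>

/* Toy usage history for one lock (hypothetical, loosely lockdep-like). */
struct toy_lock {
    bool used_in_hardirq;    /* ever acquired from hardirq context */
    bool taken_with_irqs_on; /* ever acquired with hard IRQs enabled */
};

/*
 * Record one acquisition. Returns true while the recorded usage is
 * still consistent, false once the lock has been seen both in hardirq
 * context and with hard IRQs enabled.
 */
static bool toy_acquire(struct toy_lock *l, bool in_hardirq, bool irqs_enabled)
{
    if (in_hardirq)
        l->used_in_hardirq = true;
    if (irqs_enabled)
        l->taken_with_irqs_on = true;
    return !(l->used_in_hardirq && l->taken_with_irqs_on);
}

/*
 * Replay the two acquisitions described above.
 * Returns 1 if the toy check flags them as inconsistent, 0 otherwise.
 */
int toy_replay_report(void)
{
    struct toy_lock queue_lock = { false, false };

    /* netlink_poll(): spin_lock_bh() in process context.
     * Softirqs are off, but hard IRQs are still enabled. */
    if (!toy_acquire(&queue_lock, false, true))
        return 1;

    /* __netlink_sendskb() reached from the interrupt handler:
     * hardirq context, hard IRQs disabled. */
    if (!toy_acquire(&queue_lock, true, false)) {
        printf("inconsistent lock state: hardirq use vs. irqs-on use\n");
        return 1;
    }
    return 0;
}
```

Calling toy_replay_report() from a main() flags the second acquisition.
In this model the check stays quiet only if the lock is always taken
with hard IRQs disabled, or is never taken from hardirq context at all.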

This seems to go back to 3.14-rc7, which surprises me a bit. It's been
around for a while now, but I don't find a lot of reports. And I don't
see any subtle fixes for this anywhere, so it still seems to be true
today.

Rafael? Lan Tianyu? This is not some minor locking bug. This is a
*major* mistake unless I misread something.

Linus