Re: [PATCH] tty: fix data race in flush_to_ldisc
From: Dmitry Vyukov
Date: Thu Sep 17 2015 - 07:42:39 EST
On Wed, Sep 16, 2015 at 4:54 AM, Peter Hurley <peter@xxxxxxxxxxxxxxxxxx> wrote:
> On 09/04/2015 03:28 PM, Dmitry Vyukov wrote:
>> On Thu, Sep 3, 2015 at 2:50 AM, Peter Hurley <peter@xxxxxxxxxxxxxxxxxx> wrote:
>>> On 09/02/2015 01:53 PM, Dmitry Vyukov wrote:
>>>> The data race is found with KernelThreadSanitizer (on rev 21bdb584af8c):
>>>>
>>>> ThreadSanitizer: data-race in release_tty
>>>> Write of size 8 by thread T325 (K2579):
>>>> release_tty+0xf3/0x1c0 drivers/tty/tty_io.c:1688
>>>> tty_release+0x698/0x7c0 drivers/tty/tty_io.c:1920
>>>> __fput+0x15f/0x310 fs/file_table.c:207
>>>> ____fput+0x1d/0x30 fs/file_table.c:243
>>>> task_work_run+0x115/0x130 kernel/task_work.c:123
>>>> do_notify_resume+0x73/0x80
>>>> tracehook_notify_resume include/linux/tracehook.h:190
>>>> do_notify_resume+0x73/0x80 arch/x86/kernel/signal.c:757
>>>> int_signal+0x12/0x17 arch/x86/entry/entry_64.S:326
>>>> Previous read of size 8 by thread T19 (K16):
>>>> flush_to_ldisc+0x29/0x300 drivers/tty/tty_buffer.c:472
>>>> process_one_work+0x47e/0x930 kernel/workqueue.c:2036
>>>> worker_thread+0xb0/0x900 kernel/workqueue.c:2170
>>>> kthread+0x150/0x170 kernel/kthread.c:207
>>>
>>> The stack traces are not really helpful in describing how the race
>>> occurs; I would leave it out of the changelog.
>>
>> ok
>
> Just to clarify; my comment is only wrt this particular patch.
> You may have other patches in the future with unclear execution paths
> that may require the call stacks. It's just that this particular
> race has invariant call stacks (for the relevant parts).
>
>
>>>> flush_to_ldisc reads port->itty and checks that it is not NULL,
>>>> concurrently release_tty sets port->itty to NULL. It is possible
>>>> that flush_to_ldisc loads port->itty once, ensures that it is
>>>> not NULL, but then reloads it again and uses. The second load
>>>> can already return NULL, which will cause a crash.
>>>>
>>>> Use READ_ONCE/WRITE_ONCE to read/update port->itty.
>>>
>>> See below.
>>>
>>>> Signed-off-by: Dmitry Vyukov <dvyukov@xxxxxxxxxx>
>>>> ---
>>>> drivers/tty/tty_buffer.c | 2 +-
>>>> drivers/tty/tty_io.c | 2 +-
>>>> 2 files changed, 2 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/tty/tty_buffer.c b/drivers/tty/tty_buffer.c
>>>> index 4cf263d..1f1031d 100644
>>>> --- a/drivers/tty/tty_buffer.c
>>>> +++ b/drivers/tty/tty_buffer.c
>>>> @@ -469,7 +469,7 @@ static void flush_to_ldisc(struct work_struct *work)
>>>> struct tty_struct *tty;
>>>> struct tty_ldisc *disc;
>>>>
>>>> - tty = port->itty;
>>>> + tty = READ_ONCE(port->itty);
>>>
>>> This is fine.
>>>
>>>> if (tty == NULL)
>>>> return;
>>>>
>>>> diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
>>>> index 57fc6ee..aad47df 100644
>>>> --- a/drivers/tty/tty_io.c
>>>> +++ b/drivers/tty/tty_io.c
>>>> @@ -1685,7 +1685,7 @@ static void release_tty(struct tty_struct *tty, int idx)
>>>> tty_driver_remove_tty(tty->driver, tty);
>>>> tty->port->itty = NULL;
>>>> if (tty->link)
>>>> - tty->link->port->itty = NULL;
>>>> + WRITE_ONCE(tty->link->port->itty, NULL);
>>>
>>> This isn't doing anything useful.
>>>
>>> 1. The compiler can't push the store past the cancel_work_sync() (because the
>>> compiler has no visibility into cancel_work_sync()), and,
>>> 2. There's no effect if the compiler hoists the store higher in the release_tty()
>>> because the line discipline has already been closed and killed (so the
>>> tty_ldisc_ref() in flush_to_ldisc() returns NULL anyway).
>>
>> OK, let me do one try at convincing you that WRITE_ONCE here is a good
>> idea. If you are not convinced then I will remove it.
>
> FWIW, I'm not the gatekeeper wrt tree-wide code/best-practices changes;
> iow, convincing me is not a useful goal. But I think this would be a
> good topic/
> presentation for either Kernel Summit or Linux Plumbers.
>
> That said, I'll make some observations below that you should expect to read
> from other contributors/maintainers regarding your point-of-view.
>
>> WRITE_ONCE/READ_ONCE for all shared memory accesses:
>> 1. Make the code more readable but highlighting important aspects.
>
> Most would argue the additional macros detract from readability.
>
>> 2. Required by relevant standards and relieve you, me and everybody
>> else reading this code from spending time on proving that it cannot
>> break (think of multi-file compilation mode, store tearing which
>> compilers indeed known to do in some contexts, and compiler
>> transformations that we don't know of).
>
> Multi-file compilation and the compiler memory model are inherently
> incompatible with kernel code. Instrumenting code rather than specializing
> the tool (compiler) is progress in the wrong direction, imho.
>
> Same with store tearing of primitive types.
>
> Compiler "transforms" are an occasional hazard of kernel development,
> as are straight-up compiler bugs; in my limited experience, each occurs
> with equal frequency.
>
>> 3. Allow tooling that finds undoubtedly harmful bugs, like this one,
>
> While the READ_ONCE() fix improves robustness, I wouldn't categorize
> this as a 'harmful' bug. I seriously doubt any compiler has generated
> a reload of port->itty in flush_to_ldisc(). As such, I would disagree with
> any submission to stable kernels for this fix.
>
>> or in fact, I've just mailed another fix for missed memory barriers in
>> this file (also found by KTSAN).
>
> I really appreciate you uncovering those. But note that READ_ONCE/
> WRITE_ONCE aren't necessary to _find_ those bugs, but rather to
> eliminate false positives (which I can sympathize with).
>
> And in fact, with every potential race, human analysis is required anyway.
>
>> I've described these aspects in more detail here:
>> https://github.com/google/ktsan/wiki/READ_ONCE-and-WRITE_ONCE
>>
>> I don't see any negative aspects to it. Do you see any? Because if you
>> see at least some value in at least on these points and don't see any
>> negative aspects, then it is worth doing.
>
> Consider the converse suggestion; everywhere I even smell race I add
> READ_ONCE/WRITE_ONCE. Kernel code would become unreadable,
> (and slower on hot paths). Consequently, each change will need to be
> justified, and thus, we've arrived where we started: analyzing each
> race on its own merits and fixing real races.
Thanks for the detailed explanation, Peter!
--
Dmitry Vyukov, Software Engineer, dvyukov@xxxxxxxxxx
Google Germany GmbH, DienerstraÃe 12, 80331, MÃnchen
GeschÃftsfÃhrer: Graham Law, Christine Elizabeth Flores
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
Diese E-Mail ist vertraulich. Wenn Sie nicht der richtige Adressat
sind, leiten Sie diese bitte nicht weiter, informieren Sie den
Absender und lÃschen Sie die E-Mail und alle AnhÃnge. Vielen Dank.
This e-mail is confidential. If you are not the right addressee please
do not forward it, please inform the sender, and please erase this
e-mail including any attachments. Thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/