Re: Potential data race in flush_to_ldisc

From: Dmitry Vyukov
Date: Fri Aug 28 2015 - 14:19:00 EST


On Fri, Aug 28, 2015 at 8:10 PM, Greg Kroah-Hartman
<gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> On Fri, Aug 28, 2015 at 06:57:17PM +0200, Dmitry Vyukov wrote:
>> Hello,
>>
>> We are working on a dynamic data race detector for the Linux kernel,
>> KernelThreadSanitizer (ktsan):
>> https://github.com/google/ktsan/wiki
>>
>> While booting the kernel (upstream revision 21bdb584af8c) we got a report:
>>
>> ThreadSanitizer: data-race in release_tty
>>
>> Write of size 8 by thread T325 (K2579):
>> [<ffffffff81655c43>] release_tty+0xf3/0x1c0 drivers/tty/tty_io.c:1688
>> [<ffffffff816563a8>] tty_release+0x698/0x7c0 drivers/tty/tty_io.c:1920
>> [<ffffffff8126154f>] __fput+0x15f/0x310 fs/file_table.c:207
>> [<ffffffff8126176d>] ____fput+0x1d/0x30 fs/file_table.c:243
>> [<ffffffff810b9485>] task_work_run+0x115/0x130 kernel/task_work.c:123 (discriminator 1)
>> [< inlined >] do_notify_resume+0x73/0x80 tracehook_notify_resume include/linux/tracehook.h:190
>> [<ffffffff81006da3>] do_notify_resume+0x73/0x80 arch/x86/kernel/signal.c:757
>> [<ffffffff81ee25fc>] int_signal+0x12/0x17 arch/x86/entry/entry_64.S:326
>>
>> Previous read of size 8 by thread T19 (K16):
>> [<ffffffff816624d9>] flush_to_ldisc+0x29/0x300 drivers/tty/tty_buffer.c:472
>> [<ffffffff810b1fce>] process_one_work+0x47e/0x930 kernel/workqueue.c:2036
>> [<ffffffff810b2530>] worker_thread+0xb0/0x900 kernel/workqueue.c:2170
>> [<ffffffff810bbbd0>] kthread+0x150/0x170 kernel/kthread.c:207
>> [<ffffffff81ee281f>] ret_from_fork+0x3f/0x70 arch/x86/entry/entry_64.S:526
>>
>>
>> flush_to_ldisc accesses port->itty:
>>
>> static void flush_to_ldisc(struct work_struct *work)
>> {
>> 	...
>> 	tty = port->itty;
>> 	if (tty == NULL)
>> 		return;
>> 	disc = tty_ldisc_ref(tty);
>>
>> while release_tty concurrently sets itty to NULL:
>>
>> static void release_tty(struct tty_struct *tty, int idx)
>> {
>> 	...
>> 	tty->port->itty = NULL;
>> 	if (tty->link)
>> 		tty->link->port->itty = NULL;
>> 	cancel_work_sync(&tty->port->buf.work);
>> 	tty_kref_put(tty->link);
>> 	tty_kref_put(tty);
>> }
>>
>> It seems that the read of port->itty needs to be at least a READ_ONCE,
>> because otherwise the compiler may let flush_to_ldisc check that itty
>> is not NULL, then re-read it and crash with a NULL dereference.
>> I don't know what the ownership and locking story is here. There could
>> be a larger issue: either a lock is missing, or itty can be deleted
>> under flush_to_ldisc's feet.
>>
>> Please confirm whether this is a real bug. If so, please fix it.
>
> Patches are always gladly accepted. Don't force us to try to determine
> if your tool is finding false-positives or not. That is your
> responsibility, not ours :)


Well, I did my homework: I eliminated all known false positives from
the tool and looked at the code to make sure that the report makes
sense. But I have very little experience with kernel code, so I cannot
be 100% sure that this is a real race. That is why I am asking the
maintainers to confirm.
Regarding a patch, should I just take tty_mutex in flush_to_ldisc? If
so, should it be taken before buf->lock or after?
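
To make the READ_ONCE point from the report concrete, here is a minimal
user-space sketch of the pattern. It is not kernel code and not a proposed
patch: READ_ONCE_SKETCH, struct fake_tty, struct fake_port and flush_sketch
are hypothetical stand-ins, and the macro only approximates the kernel's
READ_ONCE with a volatile access (it relies on the GCC/Clang __typeof__
extension).

```c
#include <assert.h>
#include <stddef.h>

/* User-space approximation of READ_ONCE(): the volatile access makes the
 * compiler emit exactly one load and forbids it from re-reading x later. */
#define READ_ONCE_SKETCH(x) (*(const volatile __typeof__(x) *)&(x))

/* Hypothetical, simplified stand-ins for tty_struct and tty_port. */
struct fake_tty {
	int index;
};

struct fake_port {
	struct fake_tty *itty;	/* may be set to NULL by another thread */
};

/* Mirrors the shape of the check in flush_to_ldisc(): load the pointer
 * once into a local, test the local, and only ever use the local.
 * Returns the tty index, or -1 if no tty is attached to the port. */
static int flush_sketch(struct fake_port *port)
{
	struct fake_tty *tty;

	assert(port != NULL);

	tty = READ_ONCE_SKETCH(port->itty);
	if (tty == NULL)
		return -1;

	return tty->index;
}
```

Note that this only prevents the compiler from reloading the pointer after
the NULL check. It says nothing about whether the object stays alive while
it is being used, so it would not settle the lifetime and locking question
above.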


Thank you